300 new human genes discovered
November 21st, 2007 - 1:13 pm ICT by admin
Washington, Nov 21 (ANI): Researchers at Cornell University have discovered around 300 previously unidentified human genes, and have also found extensions of several hundred already known genes, using an evolution-tracking method.
The research, led by Adam Siepel, Cornell assistant professor of biological statistics and computational biology, used supercomputers to compare portions of the human genome with those of other mammals.
The discovery is based on the idea that as organisms evolve, sections of genetic code that do something useful for the organism change in different ways from sections that do not.
More than 20,000 protein-coding genes have been identified.
The complete human genome was sequenced several years ago, but that simply means that the order of the 3 billion or so chemical units, called bases, that make up the genetic code is known. What remains is the identification of the exact location of all the short sections that code for proteins or perform regulatory or other functions.
Siepel said that biological methods are very effective at finding genes that are widely expressed but may miss those that are expressed only in certain tissues or at early stages of embryonic development.
“What’s exciting is using evolution to identify these genes. Evolution has been doing this experiment for millions of years. The computer is our microscope to observe the results,” Siepel said.
Four different bases, commonly referred to by the letters G, C, A and T, make up DNA. Three bases in a row can code for an amino acid, one of the building blocks of proteins, and a string of these three-letter codes can be a gene, coding for a string of amino acids that a cell can make into a protein.
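The three-letter coding described above can be sketched in a few lines of Python. This is only an illustration using the standard genetic code table; the function name `translate` and the example sequence are invented for the sketch and do not come from the research.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
# "*" marks a stop codon, which ends the protein.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read DNA three bases at a time and return one-letter amino acid codes.

    Assumes an upper- or lower-case string of G, C, A and T with no gaps.
    Translation stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: ATG GCC AAA TGA reads as M, A, K, then stop.
print(translate("ATGGCCAAATGA"))  # prints MAK
```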
In the research, the team started with alignments discovered by other workers, stretches up to several thousand bases long that are mostly alike across two or more species.
By using large-scale computer clusters, including an 850-node cluster, the researchers ran three different algorithms, or computing designs, to compare the alignments between human, mouse, rat and chicken in various combinations.
The computer looked for regions showing the patterns of change expected of genes by creating a mathematical model of how a gene might have changed, then looking for matches to this model.
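A toy version of this idea, which is not the researchers' actual algorithms (the article does not describe them in detail), might compare two aligned sequences codon by codon. The sketch assumes one well-known signature of protein-coding DNA: base changes that leave the amino acid unchanged tend to outnumber changes that alter it. The function names, the simple "more silent than altering changes" rule and the example sequences are all hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(seq_a, seq_b):
    """Count identical, synonymous and nonsynonymous codons in an alignment.

    Assumes two upper-case, gap-free sequences of equal length that are
    already aligned and start at a codon boundary.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    counts = {"identical": 0, "synonymous": 0, "nonsynonymous": 0}
    for i in range(0, len(seq_a) - 2, 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            counts["identical"] += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            counts["synonymous"] += 1      # bases differ, amino acid is the same
        else:
            counts["nonsynonymous"] += 1   # bases differ, amino acid changes
    return counts


def looks_gene_like(seq_a, seq_b):
    """Hypothetical rule: more silent changes than amino-acid-altering ones."""
    counts = classify_codon_changes(seq_a, seq_b)
    return counts["synonymous"] > counts["nonsynonymous"]


# Made-up aligned fragments standing in for two species.
species_a = "ATGGCCAAAGGT"
species_b = "ATGGCTAAGGGA"
print(classify_codon_changes(species_a, species_b))
print(looks_gene_like(species_a, species_b))
```

A real model would also have to handle gaps, unknown reading frames and more than two species at once, which this sketch leaves out.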
After eliminating predictions that matched already known genes, the researchers then tested the remainder in the laboratory, proving that many of the genes could in fact be found in samples of human tissue and could code for proteins.
They were sometimes able to identify the proteins by comparison with databases of known proteins.
The researchers said the discovery shows that there could still be many more genes that have been missed by current biological methods.
The research will be published in the journal Genome Research. (ANI)