1. See attached graphic. If there is no ortholog to human for a given mouse or 
yeast gene, how can there be links to a human gene in Gene Details and Gene 
Sorter? These links were correct, so for some reason the lookup failed to find 
the ortholog and browser location (both of which are, of course, given at the 
Gene Details and Gene Sorter links).

I think the problem here is New UCSC Genes vs. Old UCSC Genes: certain old uc 
numbers no longer correspond to anything. That may mean the underlying tables 
for "Orthologous Genes in Other Species" were never updated to New UCSC Genes. 
That is, Gene Sorter and this table aren't on the 'same page'.

Example:
Human Gene RPL11 (uc001bhk.3)
S. cerevisiae Gene RPL11A (YPR102C)




2. I am seeing hundreds of cases where a single human gene has multiple yeast 
'orthologs'. This contradicts our claim that we are using best reciprocal 
blastp, which would put orthologs in a 1-1 relationship. The real problem is 
that yeast has a lot of duplicated proteins that are nearly identical to each 
other. Humans also have a lot of duplicated proteins that are nearly identical 
to each other and homologous to the yeast set. It is very problematic to match 
these up. 

yeast    human
YGR214W  uc003cjr.2
YLR048W  uc003cjr.2

YHR216W  uc003vmx.2
YLR432W  uc003vmx.2
YML056C  uc003vmx.2

YBL068W  uc004cvb.2
YER099C  uc004cvb.2
YHL011C  uc004cvb.2
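
A quick way to flag such cases is to group the pairs by human gene and report 
any human ID that has more than one yeast gene. The sketch below is only an 
illustration: the pairs are the ones listed above, typed in by hand, and the 
function name many_to_one is made up for this example. It does not query 
sacCer3.hgBlastTab itself.

```python
from collections import defaultdict

# (yeast, human) pairs copied from the examples in this message.
pairs = [
    ("YGR214W", "uc003cjr.2"),
    ("YLR048W", "uc003cjr.2"),
    ("YHR216W", "uc003vmx.2"),
    ("YLR432W", "uc003vmx.2"),
    ("YML056C", "uc003vmx.2"),
    ("YBL068W", "uc004cvb.2"),
    ("YER099C", "uc004cvb.2"),
    ("YHL011C", "uc004cvb.2"),
]


def many_to_one(pairs):
    """Return {human_id: sorted yeast ids} for human genes hit by >1 yeast gene."""
    by_human = defaultdict(set)
    for yeast, human in pairs:
        by_human[human].add(yeast)
    return {h: sorted(ys) for h, ys in by_human.items() if len(ys) > 1}


# One line per offending human gene: ID, yeast gene count, yeast IDs.
for human, yeast_ids in sorted(many_to_one(pairs).items()):
    print(human, len(yeast_ids), ",".join(yeast_ids))
```

If the mapping really were a strict 1-1 reciprocal best hit, this would print 
nothing.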

sacCer3.sgdGene
sacCer3.hgBlastTab fields

3. A prose problem deep in the tables:
     
http://genome-test.cse.ucsc.edu/cgi-bin/hgGene?hgsid=3502800&hgg_do_otherProteinAli=on&hgg_otherPepTable=mm9.knownGenePep&hgg_otherId=uc009mke.1

"The single best exon  chains extending over more than 60% of the query protein 
were included. Exon chains that extended over 60% of the query and matched at 
least 60% of the protein's amino acids were also included."

That should read:
"The single best exon  chains extending over more than 60% of the query protein 
were included. Other exon chains that extended over 60% of the query and 
matched at least 60% of the protein's amino acids were also included."


4. More unclear prose:

Schema for Human Proteins - Human Proteins Mapped by Chained tBLASTn     

"ID (including gaps) 97.9%, coverage (of both) 100.0%,..."

I don't know what coverage of both could possibly mean. We have qStart, qEnd, 
tStart, tEnd, which make sense: the match starts at some position in the query 
and ends some way later, and that match corresponds to a position range in the 
target. I don't see how or why that should be shortened.

YAL012W 8       392     17      397
YAL061W 2       289     10      271
YAL060W 9       237     17      220
YAL058W 49      443     82      466
YAL054C 66      707     37      692
YAL048C 2       623     1       591
YAL046C 19      107     13      102
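
For comparison, the query and target span lengths implied by these rows can be 
computed directly. This is a sketch under two assumptions that the rows 
themselves do not state: that the columns are name, qStart, qEnd, tStart, tEnd 
in that order, and that the coordinates are half-open, so a span is simply end 
minus start. The helper name spans is made up for this example.

```python
# Rows copied from this message; the column order is assumed to be
# name, qStart, qEnd, tStart, tEnd.
rows = """\
YAL012W 8 392 17 397
YAL061W 2 289 10 271
YAL060W 9 237 17 220
YAL058W 49 443 82 466
YAL054C 66 707 37 692
YAL048C 2 623 1 591
YAL046C 19 107 13 102
"""


def spans(line):
    """Return (name, query span, target span), assuming half-open coordinates."""
    name, q_start, q_end, t_start, t_end = line.split()
    return name, int(q_end) - int(q_start), int(t_end) - int(t_start)


for line in rows.splitlines():
    print(*spans(line))
```

Neither span on its own gives a single "coverage (of both)" figure without 
also knowing the full query and target lengths, which is the unclear part.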



 

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
