Hi Vikram,

Thank you for bringing this to our attention. As it turns out, blat,
which is used to align the RefSeq transcripts to the genome, trims polyA
tails. In this case, however, blat was also trimming the ends of the
stop codons (TAA and TGA), so the stop codons were not aligning
correctly to the genome. We are looking into a fix for this bug.
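
For anyone who wants to screen a downloaded CDS file for the same symptom in the meantime, here is a minimal sketch in Python. It is only an illustration, not how our pipeline works: the function names (parse_fasta, check_cds, find_suspect_cds) are made up for this example, it assumes the file is plain FASTA with the record name as the first token of each header line, and the sequences in the usage lines are toy data rather than real accessions. It flags any record whose length is not a multiple of three or that does not end in a stop codon.

```python
# Hypothetical helper names; assumes plain FASTA input.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the record name is the first token of the header.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def check_cds(seq):
    """Return a list of problems found in one CDS sequence."""
    problems = []
    remainder = len(seq) % 3
    if remainder != 0:
        problems.append(
            f"length {len(seq)} is not a multiple of 3 (remainder {remainder})"
        )
    if seq[-3:] not in STOP_CODONS:
        problems.append(f"does not end in a stop codon (last bases: {seq[-3:]})")
    return problems


def find_suspect_cds(fasta_text):
    """Map each problematic record name to its list of problems."""
    suspects = {}
    for name, seq in parse_fasta(fasta_text).items():
        problems = check_cds(seq)
        if problems:
            suspects[name] = problems
    return suspects


# Toy data: the second record has lost the final base of its stop codon.
example = """>toy_complete
ATGGCCTAA
>toy_truncated
ATGGCCTA
"""
suspects = find_suspect_cds(example)
for name, problems in suspects.items():
    print(name, "; ".join(problems))
```

A record that fails both checks and ends in TA or TG is consistent with the trimming described above, though other causes (for example, a genuinely partial CDS) would need to be ruled out by hand.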

Best,
Mary
---------------------
Mary Goldman
UCSC Bioinformatics Group

On 8/5/10 11:00 PM, Vikram Katju wrote:
> I wish to draw your attention to discrepancies I find in the CDS file
> for human data (please see the attached screenshots for the settings I
> used to download this file). I find that the length of the coding region
> of the following accessions is not a multiple of three. In other words,
> it is incomplete. My manual check tells me that the terminal exons in the
> CDS file are missing one or two bases at the end. However, there could be
> other variations on the theme. My random checks of some of the entries in
> the corresponding NCBI file show no discrepancy. I am listing some of the
> entries from the X chromosome below, but it is likely that this problem
> exists for entries on other chromosomes as well.
>
> Also, I find that some entries have been annotated on both strands (e.g.,
> NM_001079538).
>
> Please have a look and do the needful.
>
>
>
> NM_001101357
> NM_001136234
> NM_138702
> NM_001004486
> NM_003868
> NM_005193
> NM_001007524
> NM_001013627
> NM_033380
> NM_001136273
> NM_001011719
> NM_001007523
> NM_001079538
>    
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>    
