Hi, I am a little confused by the data in knownGene in the Human section. As far as I know, each entry has the transcription start and end positions in columns 4 and 5, and the coding region start and end positions in columns 6 and 7.
Check out the following two entries. Why are the coding region start and end (columns 6 and 7) identical within each entry, even though uc002ybr has just one exon (it could be a non-coding UTR) and uc010gjw has 5 exons (column 8)?

[pme...@portal]$ grep "uc002ybr" knownGene.txt
uc002ybr.1 chr20 + 60513179 60515672 60513179 60513179 1 60513179, 60515672, uc002ybr.1

[pme...@portal]$ grep "uc010gjw" knownGene.txt
uc010gjw.1 chr20 + 58713536 59228976 58713536 58713536 5 58713536,58729462,58755892,58994077,59225766, 58713700,58729557,58755971,58994213,59228976, uc010gjw.1

I may be missing something. I would appreciate any help.

Thanks,
perdeep

Perdeep K. Mehta, PhD
Research Scientist, Bioinformatics
Research Informatics, Information Sciences Division
St. Jude Children's Research Hospital
262 Danny Thomas Place
Memphis, TN 38105-2794
Tel: 901-595 3774
http://www.hatwellcenter.org/

Email Disclaimer: www.stjude.org/emaildisclaimer

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
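For anyone wanting to check this across the whole file, here is a minimal sketch that parses the columns described above and flags rows whose coding start and end coincide. The field names (txStart, cdsStart, and so on) and the helper names parse_known_gene and has_empty_cds are labels chosen for illustration, and the parser assumes whitespace-separated rows and reads only the first ten columns. My understanding is that a zero-length coding region (column 6 equal to column 7) is how this kind of table marks a transcript with no annotated coding sequence, but treat that reading as an assumption rather than a confirmed answer.

```python
def parse_known_gene(line):
    """Parse the first ten whitespace-separated columns of a knownGene row.

    Field names are illustrative labels for the columns described in the post.
    """
    cols = line.split()
    # Exon start/end lists are comma-separated with a trailing comma.
    starts = [int(x) for x in cols[8].split(",") if x]
    ends = [int(x) for x in cols[9].split(",") if x]
    return {
        "name": cols[0],
        "chrom": cols[1],
        "strand": cols[2],
        "txStart": int(cols[3]),    # column 4
        "txEnd": int(cols[4]),      # column 5
        "cdsStart": int(cols[5]),   # column 6
        "cdsEnd": int(cols[6]),     # column 7
        "exonCount": int(cols[7]),  # column 8
        "exons": list(zip(starts, ends)),
    }


def has_empty_cds(rec):
    """True when the coding region has zero length (column 6 == column 7)."""
    return rec["cdsStart"] == rec["cdsEnd"]


# The two rows quoted in the post.
ROWS = [
    "uc002ybr.1 chr20 + 60513179 60515672 60513179 60513179 1 "
    "60513179, 60515672, uc002ybr.1",
    "uc010gjw.1 chr20 + 58713536 59228976 58713536 58713536 5 "
    "58713536,58729462,58755892,58994077,59225766, "
    "58713700,58729557,58755971,58994213,59228976, uc010gjw.1",
]

for row in ROWS:
    rec = parse_known_gene(row)
    print(rec["name"], "exons:", rec["exonCount"],
          "empty CDS:", has_empty_cds(rec))
```

Both quoted rows have a zero-length coding region regardless of exon count, which suggests the exon count and the coding region are recorded independently of each other.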
