Hello Andrew,

The gene area you are examining is complex.

Technically, UCSC has clustered these transcripts into two distinct 
genes (as noted by the different cluster IDs). The two clusters do not 
share exons, which is a requirement of the gene clustering algorithm 
used by the UCSC Genes processing (see the track description for all the 
details).
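
As a quick sanity check on the point above, here is a minimal sketch 
(not the UCSC clustering code) that tests whether the genomic spans of 
the two clusters overlap at all. The coordinates are copied from the 
knownCanonical rows in the quoted message; the `overlaps` helper and the 
`clusters` dictionary are hypothetical names used only for illustration. 
Two spans that do not overlap cannot share an exon.

```python
# Coordinates copied from the knownCanonical rows in the quoted message
# (hg19, chr8). UCSC tables store 0-based, half-open intervals.
clusters = {
    24861: ("chr8", 128748314, 128753678),
    24862: ("chr8", 128806778, 129113498),
}


def overlaps(a, b):
    """Return True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


# The two spans are disjoint, so the clusters cannot share any exon.
print(overlaps(clusters[24861], clusters[24862]))  # False
```

Note that the reverse does not hold: overlapping spans would not by 
themselves prove shared exons, so this only rules sharing out.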

Scientifically, the entire transcript set appears to be related, with 
the annotation noting that the upstream group is protein coding and 
regulatory (transcription factor) in function and the downstream group 
is non-coding with vaguely defined oncogene function noted. The two 
groups are not the same gene (in the classical sense) and they are 
clearly not paralogs. Given this data, merging these clusters together 
or using a single representative transcript would probably result in a 
loss of information.

So, why is MYC associated with both? This is likely a function of how the 
gene symbols are brought into the processing for the track. The genes are 
related. The kgXref table can bring in associations through many sources, 
and sometimes the gene symbols/labels should be interpreted to mean 
"associated with gene X" rather than "is gene X". It looks like the 
second, non-coding gene has stronger MYC annotation via RefSeq, but the 
best advice is to examine all of the evidence yourself (at UCSC and the 
external sources/literature) to flesh out the exact details.
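
If it helps when reviewing the evidence, here is a small sketch that 
groups the joined knownCanonical/kgXref rows by gene symbol and reports 
the RefSeq class of each entry. It relies on the RefSeq accession 
convention that NM_ denotes a protein-coding mRNA and NR_ a non-coding 
RNA. The row values are copied from your message; the function names 
(`refseq_class`, `clusters_by_symbol`) are hypothetical, and the script 
only lists the candidates rather than deciding which one is "canonical".

```python
from collections import defaultdict

# (clusterId, transcript, geneSymbol, refseq), copied from the quoted rows.
rows = [
    (24861, "uc003ysi.2", "MYC", "NM_002467"),
    (24862, "uc010mdq.2", "MYC", "NR_003367"),
]


def refseq_class(accession):
    """Classify a RefSeq accession by its prefix."""
    if accession.startswith("NM_"):
        return "protein-coding mRNA"
    if accession.startswith("NR_"):
        return "non-coding RNA"
    return "other/unknown"


def clusters_by_symbol(rows):
    """Group rows by gene symbol so multi-cluster symbols stand out."""
    by_symbol = defaultdict(list)
    for cluster_id, transcript, symbol, refseq in rows:
        by_symbol[symbol].append(
            (cluster_id, transcript, refseq, refseq_class(refseq))
        )
    return dict(by_symbol)


for symbol, entries in clusters_by_symbol(rows).items():
    if len(entries) > 1:
        print(f"{symbol} labels {len(entries)} clusters:")
        for cluster_id, transcript, refseq, kind in entries:
            print(f"  cluster {cluster_id}: {transcript} {refseq} ({kind})")
```

A symbol that turns up under more than one cluster is a cue to check the 
underlying annotation, not evidence that one entry is wrong.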

Hopefully this helps,
Jennifer


---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/15/10 2:31 PM, Andrew Yee wrote:
> When I was using the knownCanonical table to find the canonical transcript
> for MYC, I find that there are two entries.  See below.  I also included
> some fields from hg19.kgXref fields.  Is there an accepted method to
> determine which one is the most "canonical" transcript?  Perhaps use the
> transcript where there is a "NM" as the prefix in refseq?
>
> Thanks,
> Andrew
>
> #hg19.knownCanonical.chrom	hg19.knownCanonical.chromStart	hg19.knownCanonical.chromEnd	hg19.knownCanonical.clusterId	hg19.knownCanonical.transcript	hg19.knownCanonical.protein	hg19.kgXref.kgID	hg19.kgXref.mRNA	hg19.kgXref.spID	hg19.kgXref.spDisplayID	hg19.kgXref.geneSymbol	hg19.kgXref.refseq
>
> chr8	128748314	128753678	24861	uc003ysi.2	uc003ysi.2	uc003ysi.2	NM_002467	A0N2G3	A0N2G3_HUMAN	MYC	NM_002467
> chr8	128806778	129113498	24862	uc010mdq.2	uc010mdq.2	uc010mdq.2	NR_003367			MYC	NR_003367
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome