Hi Michael,

Sorry for the delay in getting back to you.  The RefSeq Genes track is 
curated.  See their page for more information:
http://www.ncbi.nlm.nih.gov/RefSeq/

You might want to look at the CCDS or the Gencode Genes tracks (also in 
the "Genes and Gene Prediction" track group) instead.  Click on the blue 
track names to read more about each gene set.
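
For what it's worth, the filtering rule quoted from the track description
in my earlier message (below) can be sketched roughly as follows.  This is
only an illustration, not the actual pipeline code: the Alignment record,
the filter_alignments function, and the identity/coverage values are all
made up for the example, and I am assuming "an alignment of less than 15%"
refers to the fraction of the RNA that aligned.  Only the three NR_039732
coordinates come from your message.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """One BLAT hit for a RefSeq RNA (hypothetical record layout)."""
    name: str        # RefSeq accession
    chrom: str
    strand: str
    start: int
    end: int
    coverage: float  # percent of the RNA aligned (assumed meaning of "alignment")
    identity: float  # percent base identity with the genomic sequence


def filter_alignments(alignments, min_coverage=15.0, near_best=0.1,
                      min_identity=96.0):
    """Keep near-best hits per RNA, following the quoted track description."""
    by_name = defaultdict(list)
    for aln in alignments:
        # Step 1: discard hits that cover too little of the RNA.
        if aln.coverage >= min_coverage:
            by_name[aln.name].append(aln)

    kept = []
    for hits in by_name.values():
        # Step 2: find the best base identity for this RNA.
        best = max(h.identity for h in hits)
        # Step 3: keep every hit close to the best and above the floor.
        for h in hits:
            if h.identity >= min_identity and best - h.identity <= near_best:
                kept.append(h)
    return kept


# Coordinates for the first three hits are from the MIR4509-1 example in
# this thread; coverage and identity values are invented, and the last two
# hits are entirely hypothetical.
hits = [
    Alignment("NR_039732", "chr15", "-", 22675147, 22675241, 100.0, 100.0),
    Alignment("NR_039732", "chr15", "+", 28671636, 28671730, 100.0, 100.0),
    Alignment("NR_039732", "chr15", "-", 28735897, 28735991, 100.0, 99.95),
    Alignment("NR_039732", "chrN", "+", 1000, 1094, 100.0, 97.0),  # not near best
    Alignment("NR_039732", "chrN", "+", 5000, 5010, 10.0, 100.0),  # low coverage
]
kept = filter_alignments(hits)
for h in kept:
    print(h.name, h.chrom, h.strand, h.start, h.end)
```

The point of the sketch is that nothing in the rule limits an RNA to one
location: any hit that is nearly as good as the best one is kept, so
near-identical duplicated regions each get a record.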

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 10/7/11 6:31 PM, Rusch, Michael wrote:
> Is there a better option, then?  Something curated?
>
> Michael
>
> -----Original Message-----
> From: Brooke Rhead [mailto:[email protected]]
> Sent: Friday, October 07, 2011 5:52 PM
> To: Rusch, Michael
> Cc: '[email protected]'
> Subject: Re: [Genome] genes with disparate loci in refFlat
>
> Hi Michael,
>
> The RefSeq Genes track is made by aligning RefSeq sequences to the
> genome using BLAT.  You can click on the blue "RefSeq Genes" link on the
> main Genome Browser page to read the track description.  In part, it says:
>
> "RefSeq RNAs were aligned against the human genome using blat; those
> with an alignment of less than 15% were discarded. When a single RNA
> aligned in multiple places, the alignment having the highest base
> identity was identified. Only alignments having a base identity level
> within 0.1% of the best and at least 96% base identity with the genomic
> sequence were kept."
>
> So, it is expected that some sequences will align very well in multiple
> locations.  One explanation for what you are seeing is duplication
> events in the genome.  You might try turning on the "Segmental Dups"
> track (in the Variation and Repeats track group).  Both of your example
> regions show activity in that track.
>
> If you have further questions, please contact us again at
> [email protected].
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
>
> On 10/6/11 7:27 AM, Rusch, Michael wrote:
>> I've found some things in refFlat that I don't understand. Perhaps
>> somebody can help shed some light on this.
>>
>> Intuitively it seemed to me that in most circumstances, all of the
>> records with the same geneName should be in about the same place, and
>> certainly in the same orientation on the same chromosome. However, I
>> have found several situations where this is not the case. Some of these
>> make sense to me, for example, genes in the PARs have records on both
>> chrX and chrY. Also, there are several that have some records on the
>> "hap" sequences. These I can understand. Others truly puzzle me. Maybe
>> somebody can help me interpret.
>>
>> First example is MAGEA2. This gene has two locations on chrX:
>> MAGEA2  chrX    -      151918388       151922364       3
>> MAGEA2  chrX    +       151883119       151887095       3
>>
>> I don't understand how the same gene could be in two different places.
>>
>> In some cases they are even on different chromosomes.
>>
>> In many cases, there seem to be duplicates with different
>> geneName/names. For example:
>>
>> MIR4509-1       NR_039732       chr15   -       22675147        22675241
>> MIR4509-2       NR_039733       chr15   -       22675147        22675241
>> MIR4509-3       NR_039734       chr15   -       22675147        22675241
>> MIR4509-1       NR_039732       chr15   +       28671636        28671730
>> MIR4509-2       NR_039733       chr15   +       28671636        28671730
>> MIR4509-3       NR_039734       chr15   +       28671636        28671730
>> MIR4509-1       NR_039732       chr15   -       28735897        28735991
>> MIR4509-2       NR_039733       chr15   -       28735897        28735991
>> MIR4509-3       NR_039734       chr15   -       28735897        28735991
>>
>> In this case, there are three geneName/name combinations, and three
>> loci, and each geneName/name has a record in each locus.
>>
>> There are hundreds of these that I've found.
>>
>> I get the impression that I'm not using this data correctly, and
>> perhaps there would be a better table to be using for the purpose of
>> locating genes and annotated transcripts on the genome. Can anybody
>> explain this to me?
>>
>> Michael
>>
>> ________________________________
>> Email Disclaimer: www.stjude.org/emaildisclaimer
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
>
