Thanks, Max.

They are different, but either could be useful, depending on what you 
are doing.  The knownGeneMrna table contains all of the sequences used 
in the UCSC Genes track, and it is the same sequence you get when you 
click on a UCSC Gene and then the "mRNA (may differ from genome)" link.

The refMrna.fa.gz file consists of all of the RefSeq Gene mRNAs that we 
get from GenBank, prior to aligning them to the genome.  So, it will 
contain mRNAs that are not in the RefSeq Genes track.
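
If you go the knownGeneMrna route and want real fasta, something like 
the Python sketch below might do it.  It assumes each row of 
knownGeneMrna.txt.gz is a transcript name and its sequence separated by 
a tab (please check the accompanying .sql file to confirm the columns). 
The function names and the output file name are made up for 
illustration.

```python
import gzip


def table_to_fasta(lines, width=60):
    """Yield one FASTA record per 'name<TAB>sequence' row.

    Assumption: two tab-separated columns (name, sequence) per row.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        name, seq = line.split("\t", 1)
        # Wrap the sequence at a fixed width, as most FASTA files do.
        chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
        yield ">" + name + "\n" + "\n".join(chunks) + "\n"


def convert_file(in_path, out_path, width=60):
    """Read a gzipped table dump and write a plain-text FASTA file."""
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        dst.writelines(table_to_fasta(src, width))


# Example call (hypothetical output name):
# convert_file("knownGeneMrna.txt.gz", "knownGeneMrna.fa")
```

The conversion reads one row at a time, so the whole file never has to 
fit in memory.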

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Maximilian Haussler wrote on 3/4/10 1:14 AM:
> Hm, was just listening, and thought that this file could be useful
> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/refMrna.fa.gz
> ...but I guess from your conversation, that "known mRNAs" is different
> from "all mRNAs in Genbank"... ?
> 
> cheers
> Max
> 
> On Thu, Mar 4, 2010 at 12:41 AM, Brooke Rhead <[email protected]> wrote:
>> Hi Shobha,
>>
>> There are a couple of ways to do this.  The first is to use the Table
>> Browser (the "Tables" link in the blue bar at the top of the page).
>> Select the UCSC Genes track and region "genome", then "output format:
>> sequence".  Enter a name for the output file.  You will likely want to
>> select the "gzip compressed" option, as this will generate a somewhat
>> large file (170M as plain text, 33M compressed).  On the next page,
>> select "mRNA", then hit submit.
>>
>> Another option is to download the table 'knownGeneMrna', which contains
>> the same sequence data (it is rather odd for sequence in the Genome
>> Browser to be stored in a table, but this one is an exception).  It
>> won't be in fasta format exactly, but it will be close.  The download is
>> available from the "Annotation Database" link under hg18 on our
>> downloads page (http://hgdownload.cse.ucsc.edu).  Here is a direct link:
>>
>> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/knownGeneMrna.txt.gz
>>
>> or you can get it with ftp from the same location:
>>
>> ftp://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/knownGeneMrna.txt.gz
>>
>> I hope this is helpful.
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>> On 03/03/10 05:24, Shobha Potluri wrote:
>>> Hi,
>>>           I am interested in obtaining a fasta file containing all
>>> cdna sequences from the hg18 annotation. One way to do this would be
>>> to extract the transcript starts and ends from the knownGene table and
>>> extract the corresponding sequences.
>>>
>>> Is there an easier way to obtain this information?
>>>
>>> Kindly let me know.
>>>
>>> Thanks,
>>> Shobha.
>>> _______________________________________________
>>> Genome maillist  -  [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
