Hi Carsten,
Unfortunately, the table browser can't retrieve just the coding exons
for mRNA. However, one of our engineers suggested a way for you to
determine the corresponding exon positions in the mRNA using a MySQL
query on our public server, genome-mysql. This link explains how to
access genome-mysql:
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download29
If you want just the mRNA accession and the CDS start and end offsets,
you can run this query:
select acc,cds.name
from gbCdnaInfo,cds
where cds.id = gbCdnaInfo.cds
and acc = "AM992877";
The query fetches the CDS offsets associated with a single accession,
AM992877 (used as an example so it doesn't produce huge output). The
'and acc = ...' part can be dropped to get those fields for all mRNAs.
In the output of this query, the CDS offsets are in the 'name' field.
The CDS start and end offsets are listed in this format: start..end.
For example, if you run the query above for AM992877, the output is:
+----------+----------+
| acc      | name     |
+----------+----------+
| AM992877 | 365..502 |
+----------+----------+
The coding region starts on the 365th base of the mRNA and ends on the
502nd base of the mRNA.
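
If you post-process the query output with a script, the 'name' value can be split into numeric offsets. The following is only a minimal Python sketch: parse_cds is a hypothetical helper name, and it assumes the simple start..end form shown above, optionally with the '<' and '>' partial markers used in GenBank locations. Anything more complex (for example a join(...) expression) is returned as None rather than guessed at.

```python
import re

def parse_cds(name):
    """Parse a simple 'start..end' CDS string into 1-based inclusive ints.

    Returns None when the string is not in the simple form, so the caller
    can decide how to handle more complex locations.
    """
    m = re.fullmatch(r"<?(\d+)\.\.>?(\d+)", name.strip())
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

start, end = parse_cds("365..502")
# Both offsets are inclusive, so the CDS length is end - start + 1.
print(start, end, end - start + 1)  # 365 502 138
```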
If you want the exon coordinates and not just the CDS, you will need
some additional all_mrna fields, and this query will be useful:
select
tName,tStart,tEnd,strand,qName,qSize,qStart,qEnd,blockCount,blockSizes,qStarts,cds.name
from all_mrna,gbCdnaInfo,cds
where cds.id = gbCdnaInfo.cds and gbCdnaInfo.acc = all_mrna.qName
and acc = "AM992877";
Here is the first line of the sample output (again, you can remove the
'and acc = "AM992877"' part to get this data for all mRNAs):
+-------+--------+-------+--------+----------+-------+--------+------+------------+---------------+------------+----------+
| tName | tStart | tEnd  | strand | qName    | qSize | qStart | qEnd | blockCount | blockSizes    | qStarts    | name     |
+-------+--------+-------+--------+----------+-------+--------+------+------------+---------------+------------+----------+
| chr1  | 11873  | 14361 | +      | AM992877 | 1604  | 0      | 1604 | 3          | 354,109,1141, | 0,354,463, | 365..502 |
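
To find which part of each aligned block (exon) is coding, the blocks can be clipped to the CDS range. The Python sketch below is an illustration under stated assumptions, not an official UCSC tool: cds_blocks and parse_list are hypothetical helper names, the CDS offsets are treated as 1-based and inclusive (as in "365th base" to "502nd base" above), qStarts are treated as 0-based, and the row is assumed to be on the '+' strand.

```python
def parse_list(field):
    """Turn a comma-terminated list such as '354,109,1141,' into ints."""
    return [int(x) for x in field.split(",") if x]

def cds_blocks(block_sizes, q_starts, cds_start, cds_end):
    """Clip each block to the CDS, all in mRNA coordinates.

    cds_start and cds_end are 1-based inclusive; the returned (start, end)
    pairs are 0-based, half-open. Blocks entirely in a UTR are dropped.
    """
    lo, hi = cds_start - 1, cds_end  # convert to 0-based, half-open
    out = []
    for size, q in zip(parse_list(block_sizes), parse_list(q_starts)):
        s, e = max(q, lo), min(q + size, hi)
        if s < e:
            out.append((s, e))
    return out

# Values taken from the AM992877 row above.
print(cds_blocks("354,109,1141,", "0,354,463,", 365, 502))
# [(364, 463), (463, 502)]
```

For this row the first block is entirely 5' UTR, the second block is coding from offset 364 onward, and the third block is coding up to offset 502; the two coding pieces add up to 138 bases.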
Please be aware that when strand is '-', both blockSizes and qStarts
need to be reversed, and each qStarts element plus its block size needs
to be subtracted from qSize to get the start coordinates on the forward
strand of the mRNA.
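
The strand handling can be sketched in Python as follows. forward_blocks is a hypothetical helper name, and the sketch assumes the PSL convention that, for '-' strand rows, qStarts are given on the reverse-complemented mRNA. Under that convention, qSize minus a qStarts element is the forward-strand end of the block, and subtracting the block size as well gives its forward-strand start.

```python
def forward_blocks(q_size, block_sizes, q_starts, strand):
    """Return blocks as (start, end) on the forward strand of the mRNA.

    Coordinates are 0-based, half-open. For '-' strand rows the block
    order is reversed and each block is flipped around q_size.
    """
    sizes = [int(x) for x in block_sizes.split(",") if x]
    starts = [int(x) for x in q_starts.split(",") if x]
    blocks = [(s, s + n) for s, n in zip(starts, sizes)]
    if strand == "-":
        blocks = [(q_size - e, q_size - s) for s, e in reversed(blocks)]
    return blocks

# The '+' strand row from the sample output is returned unchanged.
print(forward_blocks(1604, "354,109,1141,", "0,354,463,", "+"))
# [(0, 354), (354, 463), (463, 1604)]

# The same numbers treated as a '-' strand row (illustration only).
print(forward_blocks(1604, "354,109,1141,", "0,354,463,", "-"))
# [(0, 1141), (1141, 1250), (1250, 1604)]
```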
I hope this information is helpful. Please contact the mailing list
([email protected]) again if you have any further questions.
Katrina Learned
UCSC Genome Bioinformatics Group
On 8/5/11 10:40 AM, Carsten Raabe wrote:
> Hi Carsten,
>
> Thank you for the clarification. So that all our users can benefit from
> the clarification of your question, and so that my colleagues can help
> provide ideas on the answer, please resend your email to our mailing list:
> [email protected].
>
> Thank you,
>
> Katrina Learned
> UCSC Genome Bioinformatics Group
>
> On 8/5/11 8:59 AM, Carsten Raabe wrote:
>
> Dear Katrina,
>
> thanks a lot for your fast reply; however this only helps to download
> *all* exons that are part of the all_mRNA database.
>
> I am interested to download the *CDS exons* that are contained within
> this database separately. Therefore is there any table that would define
> which exon belongs to the UTR and which belongs to the CDS.
>
>
> Thankx in advance ,
>
> C.
>
> Dear Madame, dear Sir,
>
> my name is Carsten and I am working at the institute of experimental
> Pathology at Muenster University in Germany. I wonder whether it would
> be possible to download all CDS exons contained within the human all
> mRNA database separately . I tried to manage the download via the table
> browser using the BED format, however there is no option like "CDS exon"
> or "5'UTR exon" provided as it is in the UCSC gene annotation.
>
> In fact I would be happy to know the corresponding exon positions in the
> mRNA, or even the positions of the translational start and stop codons
> would do equally well.
>
> Thanks a lot in advance,
>
> C.
>
>
> _______________________________________________
> Genome maillist [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome