Hello,

Unfortunately, not all of your questions can be answered using the Table 
browser alone. In the default case with the RefSeq track, the sequence 
output options interpret "upstream" as "before the 5' UTR" and 
"downstream" as "after the 3' UTR", which is not what you are looking 
for. You want to specify regions upstream/downstream relative to the CDS.

Query #1: 2000 bases upstream of TSS (the start of the CDS).
Query #2: 2000 bases downstream of the TSS (the start of the CDS).
Query #3: 2000 bases downstream of the Stop codon (the end of the CDS).
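
To make the strand handling in these three queries concrete, here is a 
minimal Python sketch of the window arithmetic. It assumes 0-based, 
half-open coordinates (as used in the UCSC database tables) and that the 
CDS end includes the stop codon. The function name flank_windows is 
hypothetical and is not part of any Genome Browser tool.

```python
def flank_windows(cds_start, cds_end, strand, size=2000):
    """Return the three query windows as (start, end) tuples.

    Assumes 0-based, half-open genomic coordinates with
    cds_start < cds_end regardless of strand.
    """
    if strand == "+":
        q1 = (max(0, cds_start - size), cds_start)   # upstream of CDS start
        q2 = (cds_start, cds_start + size)           # downstream, from first coding base
        q3 = (cds_end, cds_end + size)               # downstream of stop codon
    elif strand == "-":
        # On the minus strand the CDS begins at cds_end, so "upstream"
        # means higher genomic coordinates.
        q1 = (cds_end, cds_end + size)
        q2 = (max(0, cds_end - size), cds_end)
        q3 = (max(0, cds_start - size), cds_start)
    else:
        raise ValueError("strand must be '+' or '-'")
    return {"query1": q1, "query2": q2, "query3": q3}
```

Windows are clipped at 0 but not at the chromosome end, so a real 
script would also need the chromosome sizes.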

For #1, there are files on the downloads server that will help:

http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/

upstream1000.fa.gz - Sequences 1000 bases upstream of annotated
     transcription starts of RefSeq genes with annotated 5' UTRs.
     This file is updated weekly so it might be slightly out of sync with
     the RefSeq data which is updated daily for most assemblies.

upstream2000.fa.gz - Same as upstream1000, but 2000 bases.

upstream5000.fa.gz - Same as upstream1000, but 5000 bases.

There are no tools that directly create custom tracks representing only 
the TSS or the Stop codon, one per RefSeq. So for #2 & #3, you will need 
to obtain the coordinates from the RefSeq track, download them, format 
them, and upload them again. In the Table browser, use output = selected 
fields from primary and related tables to create one file for the TSS 
and one file for the Stop codon. Then format both datasets (a BED file 
is recommended) using your own tools and upload them as two custom 
tracks. Back in the Table browser, use those custom tracks as the 
primary tables and choose output = sequence (where the 
upstream/downstream bases can be specified).
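
The formatting step could be sketched in Python as below. This is only 
an illustration under several assumptions: the downloaded file is 
tab-separated with a "#"-prefixed header row containing the fields name, 
chrom, strand, cdsStart and cdsEnd; coordinates are 0-based and 
half-open; cdsEnd includes the stop codon; and non-coding entries have 
cdsStart equal to cdsEnd. The function name table_to_bed is 
hypothetical.

```python
import csv
import io

def table_to_bed(table_text):
    """Turn 'selected fields' output into two lists of BED6 lines:
    one for the CDS start base, one for the stop codon."""
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    # Header cells may be "#name" or "db.table.name"; keep the last part.
    cols = [h.lstrip("#").split(".")[-1] for h in header]
    start_bed, stop_bed = [], []
    for row in reader:
        if not row:
            continue
        rec = dict(zip(cols, row))
        cds_start, cds_end = int(rec["cdsStart"]), int(rec["cdsEnd"])
        if cds_end - cds_start < 3:
            continue  # assumed non-coding: no CDS to anchor on
        if rec["strand"] == "+":
            start = (cds_start, cds_start + 1)
            stop = (cds_end - 3, cds_end)
        else:
            # Minus strand: the CDS begins at the higher coordinate.
            start = (cds_end - 1, cds_end)
            stop = (cds_start, cds_start + 3)
        for (s, e), out in ((start, start_bed), (stop, stop_bed)):
            out.append("\t".join(
                [rec["chrom"], str(s), str(e), rec["name"], "0", rec["strand"]]))
    return start_bed, stop_bed
```

Writing each returned list to its own file gives the two BED files to 
upload as custom tracks. Keeping the strand column lets the sequence 
output apply upstream/downstream in the right direction. Note that this 
sketch treats the stop codon as three contiguous genomic bases, which 
will be wrong for the occasional stop codon split across two exons.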

Table browser help:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#TableBrowser

Custom track help:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks

I hope this information is helpful.  Please feel free to contact the
help mailing list again if you require further assistance.

Best regards,
Jen

UCSC Genome Browser Support
http://genome.ucsc.edu/contacts.html
[email protected]  [email protected]

On 6/18/10 5:16 PM, 云飞 李 wrote:
>
> Hello Assif,
>
> I need to download, for each RefSeq, a sequence of 4000 bases: 2000 upstream 
> of the TSS and 2000 downstream of the TSS, for all mouse genes. By studying 
> previous mails I got a general idea of how to do it:
> 1) Start with group=Genes and Gene Prediction Tracks, track=RefSeq Genes and 
> region = genome
> 2) Select output as "Custom track" and "get output".
> 3) Choose upstream region 2000 bases and output in Table browser and submit
> 4) group=Custom Tracks
> 5) Select output as "Sequence". Name to download file (gzip recommended) and 
> "get output".
> 6) Now set downstream region as 2000 bases and submit. The zipped fasta file 
> should download.
>
> Is my understanding correct?
>
> Besides, I would also like to get a file containing, for each RefSeq, the 
> sequence from 0nt upstream of the stop codon to 1000nt downstream of the 
> STOP CODON. Would you please give me some instructions on how to generate it?
>
> Thanks,
>
> Yunfei Li
> --------------------------------------------------------------------------------------
> Research Assistant
> Department of Statistics &
> School of Molecular Biosciences
> Biotechnology Life Sciences Building 427
> Washington State University
> Pullman, WA 99164-7520
> Phone: 509-339-5096
> http://www.wsu.edu/~ye_lab/people.html
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome