Hello,

Unfortunately, not all of your questions can be answered using the Table Browser alone. In the default case using the RefSeq track, the sequence output options translate to "upstream = before the 5' UTR" and "downstream = after the 3' UTR". That is not what you are looking for: you want to specify upstream/downstream regions based on the CDS.
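To make "based on the CDS" concrete, here is a minimal sketch of how the two anchor positions could be derived per RefSeq. It assumes a tab-separated table with a header line naming the refGene fields name, chrom, strand, cdsStart and cdsEnd, and 0-based, half-open coordinates. The function name cds_anchors and the exact header layout are assumptions for illustration, not part of any UCSC tool.

```python
import csv
import io


def cds_anchors(table_text):
    """Return (start_bed, stop_bed): two lists of BED6 lines.

    Assumption: table_text is tab-separated with a header line naming
    name, chrom, strand, cdsStart, cdsEnd (a leading '#' is tolerated),
    and coordinates are 0-based, half-open.
    """
    starts, stops = [], []
    reader = csv.DictReader(io.StringIO(table_text.lstrip("#")), delimiter="\t")
    for row in reader:
        cds_start, cds_end = int(row["cdsStart"]), int(row["cdsEnd"])
        if cds_start == cds_end:
            continue  # no annotated CDS (non-coding transcript): skip
        low = (cds_start, cds_start + 1)   # lowest-coordinate CDS base
        high = (cds_end - 1, cds_end)      # highest-coordinate CDS base
        # On the minus strand the CDS begins at the high-coordinate end.
        begin, end = (low, high) if row["strand"] == "+" else (high, low)
        for out, (s, e) in ((starts, begin), (stops, end)):
            out.append("\t".join(
                [row["chrom"], str(s), str(e), row["name"], "0", row["strand"]]
            ))
    return starts, stops
```

Each returned list holds one single-base BED line per coding RefSeq, so writing the two lists to two files would give one file for the start of the CDS and one for its end. Keeping the strand column matters because "upstream" and "downstream" flip for minus-strand genes.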
Query #1: 2000 bases upstream of the TSS (the start of the CDS).
Query #2: 2000 bases downstream of the TSS (the start of the CDS).
Query #3: 2000 bases downstream of the stop codon (the end of the CDS).

For #1, there are files on the downloads server that will help:
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/

  upstream1000.fa.gz - Sequences 1000 bases upstream of annotated
  transcription starts of RefSeq genes with annotated 5' UTRs. This file
  is updated weekly, so it might be slightly out of sync with the RefSeq
  data, which is updated daily for most assemblies.

  upstream2000.fa.gz - Same as upstream1000, but 2000 bases.

  upstream5000.fa.gz - Same as upstream1000, but 5000 bases.

For #2 and #3, there are no tools that directly create custom tracks representing only the TSS or the stop codon, one per RefSeq. You will need to obtain the coordinates from the RefSeq track, download them, format them, and upload them again:

1) In the Table Browser, use output = "selected fields from primary and related tables" to download the coordinates. Make one file for the TSS and one file for the stop codon.
2) Format both datasets using your own tools (a BED file is recommended) and upload them as two custom tracks.
3) Back in the Table Browser, use those custom tracks as the primary tables and choose output = "sequence", where the upstream/downstream bases can be specified.

Table Browser help:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#TableBrowser

Custom track help:
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks

I hope this information is helpful. Please feel free to contact the help mailing list again if you require further assistance.

Best regards,
Jen
UCSC Genome Browser Support
http://genome.ucsc.edu/contacts.html
[email protected]
[email protected]

On 6/18/10 5:16 PM, 云飞 李 wrote:
>
> Hello Assif,
>
> I need to download for each RefSeq, a sequence of 4000 bases; 2000 upstream
> the TSS, 2000 downstream from the TSS of all mouse gene.
> By studying previous mails I get a general idea on how to make it:
> 1) Start with group=Genes and Gene Prediction Tracks, track=RefSeq Genes and
>    region = genome
> 2) Select output as "Custom track" and "get output".
> 3) Choose upstream region 2000 bases and output in Table browser and submit
> 4) group=Custom Tracks
> 5) Select output as "Sequence". Name to download file (gzip recommended) and
>    "get output".
> 6) Now set downstream region as 2000 bases and submit. The zipped fasta file
>    should download.
>
> Is my understanding correct?
>
> Besides, I would also like to get a file contain sequence of 0nt upstream of
> stop codon, 1000nt downstream for each RefSeq from STOP CODON. Would please
> give me some instructions on how to generate it?
>
> Thanks,
>
> Yunfei Li
> --------------------------------------------------------------------------------------
> Research Assistant
> Department of Statistics &
> School of Molecular Biosciences
> Biotechnology Life Sciences Building 427
> Washington State University
> Pullman, WA 99164-7520
> Phone: 509-339-5096
> http://www.wsu.edu/~ye_lab/people.html
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
