Hello Assif,

I am assuming that you are using the Table Browser with the RefSeq Genes 
track and the output set to sequence.

In this case, Downstream means the genomic sequence past the 3' end 
of the transcript alignment (past the 3' UTR), not downstream of the TSS. 
If you select both Upstream and Downstream sequence and combine them into 
a single fasta record, the output contains only the genomic sequence 
flanking the gene. A Blat alignment of such a sequence would have two 
alignment blocks that span the gene boundaries, with a single gap where 
the transcript aligned. That is not what you want.

Follow these steps to generate the correct 1200 bases using the Table 
Browser:
1) Start with the RefGene track and region = genomic.
2) Select output as "Custom track" and click "get output".
3) Choose an upstream region of 1000 bases, send the output back to the 
Table Browser, and submit.
4) In the group pull-down menu, the top line will hold your custom 
track. Select this.
5) Select output as "Sequence". Name the download file (gzip recommended) 
and click "get output".
6) Now set the downstream region to 200 bases and submit. The zipped fasta 
file should download.
Note: Downstream still means 3' from the input region, but since the 
custom track only contains the 1000 bases upstream per gene, Downstream 
will extract contiguous genomic sequence.
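
If you would rather check the coordinates yourself, the arithmetic behind 
these steps can be sketched in a few lines of Python. This is only an 
illustration, not part of the Table Browser: the function names are made up, 
and it assumes txStart/txEnd follow the refGene table convention (0-based 
start, exclusive end). The tx_end value is inferred from the fasta header 
further down in this message; tx_start is a placeholder, not the real value.

```python
def tss_window(tx_start, tx_end, strand, upstream=1000, downstream=200):
    """Return the (start, end) window around the TSS, 0-based half-open.

    tx_start and tx_end are assumed to follow the refGene table convention
    (0-based start, exclusive end).
    """
    if strand == "+":
        # TSS is the first base of the transcript: index tx_start.
        start, end = tx_start - upstream, tx_start + downstream
    elif strand == "-":
        # TSS is the last genomic base of the transcript: index tx_end - 1.
        start, end = tx_end - downstream, tx_end + upstream
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    return max(start, 0), end  # clamp at the chromosome start only


def as_position(chrom, start, end):
    """Format a 0-based half-open interval as a 1-based browser position."""
    return f"{chrom}:{start + 1}-{end}"


# tx_end is inferred from the fasta header in this message; tx_start is a
# placeholder (it is not used for a minus-strand gene), not the real value.
start, end = tss_window(tx_start=151370000, tx_end=151370487, strand="-")
print(as_position("chrX", start, end))  # chrX:151370288-151371487
```

Note that for a minus-strand gene the window is anchored on txEnd, which is 
why the upstream 1000 bases have higher genomic coordinates than the 
downstream 200.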

The resulting sequence will be the 1000 genomic bases upstream of the 
RefSeq's transcription start site (start of the 5' UTR alignment) plus 
the 200 bases downstream of it. Try a single sample sequence first to 
verify that the steps are working correctly. Simply take the final 
sequence output, run a Blat, and view the result in the Assembly browser 
to confirm.

My test sequence was NM_000808, and the sequence output from this method 
is below. A quick Blat of this sequence against the human genome will 
show that it covers the region you describe (turn off all tracks except 
the Blat result and RefGene to view it clearly).
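
Before running Blat, a quick length check on the downloaded file can catch 
mistakes in the steps. The sketch below is a rough illustration only: the 
function names are made up, and it assumes the fasta header carries a 
range=chrom:start-end field with 1-based inclusive coordinates, as in the 
record at the end of this message.

```python
import re


def range_length(header):
    """Length implied by a 'range=chrom:start-end' field (1-based, inclusive)."""
    match = re.search(r"range=(\w+):(\d+)-(\d+)", header)
    if match is None:
        raise ValueError("no range= field found in header")
    start, end = int(match.group(2)), int(match.group(3))
    return end - start + 1


def sequence_length(lines):
    """Total number of bases in the non-header lines of one fasta record."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


header = "range=chrX:151370288-151371487 5'pad=0 3'pad=200 strand=- repeatMasking=none"
print(range_length(header))  # 1200
```

For a correct run, range_length on the header and sequence_length on the 
record's lines should both come out to 1200.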

Good luck and please let us know if you need more help,
Jennifer Jackson
UCSC Genome Bioinformatics Group

>hg18_ct_tbrefGene_NM_000808_up_1000_chrX_151370488_r range=chrX:151370288-151371487 5'pad=0 3'pad=200 strand=- repeatMasking=none
TATTTCCATTACCATTCATTTGGTCACTGAGGATATTGAGGGCCAGATTG
AAGAACTCGTTGAAAGCAACACAGCAATTTTCAGAGGCAGAACTCAAACT
CAAGTTCCTTTATTTTATTTTTAGCCTAGGTCTTTTTCCACTCTATGCTG
CATTGTTACTGTGCTTTGCCTAGGTTGCTTTTTTTCTTTTGACCCAATGG
GAATATAGTTCTTTTATACTTTGAGTCAATAATTATCCTTCTCCCCACCT
TAGTCCTTGTGCAACTCATGTATAGGCTCACACAATTACATACATCTATA
TTCCCTTTGGAATTTTTGAAAAGAAAGCATTGTGGTTCTGCTGCTTGTCT
GTCTTACCTACAGGGAAGCTAGAACTGATAGTTTTGAATTGGGTCTGCAC
TTCTACTTGGTTGAGGGTGGGGGTGGAAGTTAGGTCAACAGGGGATCAAT
GTGGCGGAACTCTTTCTTCTTAGTGGAGTATGAATGACAACCCCTTTGAT
GTACATTTTTCTAGCTCATTATTGTCCCCAGAGATGGAATCTGAACAGAC
AGGATTGTCAAACCTTTATTTGCAGCCTGGGTTAGAAGGCACGCCCATAA
GGATATAAGGGAGACAGAGGGAAAAAGGAGGAAGCTCACTCTTCAGTCTC
GGAGCAGCAAGTAAGCATCACACATCAGCTACAGTAACCCATACTTCCTT
TATCCTCCTATTCGCTCATACCCTTTTGCTCTCTTTCTCTTCCCTTTGCT
GCCTCTGCTCCTCCCTTTCTCTCTGGGCTTCCCTCTATTTCTCTGTTGGA
TGCCTTAAGGAATAAAATACTACTTTAGTATTTCGTAAAACAACCACATC
TAAAGCCTTGCAGTACTGTATTCAGAGCCCCAGCACAGCCGCGTGCTAAG
CACCAGTGGTGAACCTAAAGCCAGCAAAGGGGGAGGGGAGAAGGGGAGGG
GAGGAGGGGAGGGAGAAACTGACAGAGAGGGAGGGAGGCAGATTGAGAGA
GAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGCGAGAGAGCGTGAGCGC
GCGCAAGCTAGCGAGCAAACCAGAGAGACAGACCGAGAGAGGGACCAGGA
GAGAGACCCAGAGAGAGAAGAAGAAGCCAGAAGCCGAGCTCTGTCAGGGC
TCAACCTCCAACTTGTTTCAGTTCATTCATCCTTCTCTCCTTTCCGCTCA





Assif Yitzhaky wrote:
> Dear Supporter,
>  
> In the Sequence Retrieval Region Options:
>  
> What does it mean "Downstream by 200 bases"? downstream from the TSS? If not, 
> from what?
>  
> I would like to download for each RefSeq, a sequence of 1200 bases; 1000 
> upstream the TSS and 200 downstream from the TSS. I prefer not to download 
> the entire gene and truncate. 
>  
> As I remember, some time ago it was possible. How can I do that?
>  
> Thank you,
>  
> Assif.
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>   
