Hi Yuval,

You can use our Table Browser to obtain these sequences. Click on "Tables" in the blue bar at the top of the page. Choose your preferred cow assembly (it will default to the most recent assembly: bosTau4, Oct. 2007) and then make the following additional selections:

group: Genes and Prediction Tracks
track: RefSeq Genes (Note: MGC Genes is equally useful & if you choose to use it instead, select mgcGenes as the table rather than refGene)
table: refGene
region: position; type "chr1" in the field
output format: sequence
output file: enter the name of the file that will be created
file type returned: plain text

Click "get output"

Select "genomic" & click "submit" (this step will be skipped if you use the mgcGenes table)

Select all of the following:
  Promoter/Upstream by <type in '1000'> bases
  5' UTR Exons
  CDS Exons
  3' UTR Exons
  Introns
  One FASTA record per gene.
  Exons in upper case, everything else in lower case.

Click "get sequence"

Repeat for each chromosome (there is too much data to do the entire genome at once).

In the output files, each gene will have a brief header that starts with ">". The header line will be followed by the 1000 bases which are upstream from the TSS and then the bases that make up the gene will follow. The genes will be in order according to position on the chromosome. You will then need to parse the data to truncate the sequence 200 bases downstream of the TSS. It may be helpful to note that the first uppercase base will be the actual start of the gene.

Please don't hesitate to contact the mail list again if you require further assistance.

Katrina Learned
UCSC Genome Bioinformatics Group

Yuval Tabach wrote:
> Dear all
>
> I want to consult on you how to get the sequence flanking all the
> Transcription start sites (TSS) of the Cow genes. I would like to get 1000
> upstream and 200bp downstream from the TSS. How can I get this sequence?
>
> Thanks
>
> Yuval Tabach, Ph.D.
> Ruvkun Laboratory
> Department of Molecular Biology
> Massachusetts General Hospital
> Department of Genetics
> Harvard Medical School
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
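
The parsing step mentioned in the reply (truncating each record 200 bases downstream of the TSS) could be done with a short script. The following is only a minimal sketch, not a UCSC tool. It assumes the Table Browser output is ordinary FASTA, that the first uppercase base in each record is the TSS as noted in the reply, and that each record is already written in the direction of transcription. The function names read_fasta and tss_window are made up for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def tss_window(seq, upstream=1000, downstream=200):
    """Return up to `upstream` bases before the first uppercase base and
    up to `downstream` bases starting at it, or None if no base is uppercase.

    Assumption: the first uppercase base marks the TSS.
    """
    tss = next((i for i, base in enumerate(seq) if base.isupper()), None)
    if tss is None:
        return None
    start = max(0, tss - upstream)  # fewer bases if the record is short
    return seq[start:tss + downstream]


def write_windows(lines_in, handle_out, upstream=1000, downstream=200):
    """Write one truncated FASTA record per input record that has a TSS."""
    for header, seq in read_fasta(lines_in):
        window = tss_window(seq, upstream, downstream)
        if window is not None:
            handle_out.write(f">{header}\n{window}\n")
```

To use it, open one of the per-chromosome output files and an output file, then call write_windows(infile, outfile). The 200 downstream bases are counted along the genomic sequence, so they may include intron bases (lower case) when the first exon is shorter than 200 bases, and a gene shorter than 200 bases will give a shorter window.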
