Hello Lipika,

An explanation of the coordinate system used by UCSC may help. Our 
tables use a 0-based start position and a 1-based end position. This can 
get tricky, especially when working with the (-) strand, since we also 
store all coordinates smallest->largest along the chromosome, so a (-) 
strand sequence has to be reverse-complemented after extraction.

Help is located in this wiki:
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
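
As a rough illustration of the 0-based start convention, here is a small 
Python sketch. The function names are made up for this example and are 
not part of any UCSC utility; it assumes the usual table convention of a 
0-based start with a 1-based end.

```python
# Hypothetical helpers illustrating a 0-based start, 1-based end interval.

def ucsc_to_one_based(start, end):
    """Convert a table interval to fully 1-based, inclusive coordinates."""
    return start + 1, end

def one_based_to_ucsc(start, end):
    """Convert 1-based, inclusive coordinates back to the table convention."""
    return start - 1, end

def interval_length(start, end):
    """With a 0-based start and 1-based end, length is simply end - start."""
    return end - start

# First coding block of NM_003820 (cdsStart to the end of the first exon).
print(ucsc_to_one_based(2479705, 2479831))
print(interval_length(2479705, 2479831))
```

With a sequence held as a Python string (index 0 = first base of the 
chromosome), the table values can be used directly as slice bounds: 
seq[start:end].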

All database tables/files will be formatted this way unless specifically 
noted in the data format FAQ:
http://genome.ucsc.edu/FAQ/FAQformat.html
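
To show how a refFlat record in this format could be turned into a coding 
sequence, here is a minimal Python sketch using the NM_003820 values from 
your message. The helper names are hypothetical (they are not UCSC 
utilities), and it assumes the chromosome sequence has already been loaded 
into a string whose index 0 is the first base of the chromosome.

```python
# Sketch only: clip exons to the CDS, splice them, and reverse-complement
# for (-) strand genes. Helper names are invented for this example.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def parse_list(field):
    """Parse a comma-separated field such as exonStarts (trailing comma ok)."""
    return [int(x) for x in field.split(",") if x]

def cds_blocks(exon_starts, exon_ends, cds_start, cds_end):
    """Return the coding part of each exon as (start, end), 0-based start."""
    blocks = []
    for s, e in zip(exon_starts, exon_ends):
        s, e = max(s, cds_start), min(e, cds_end)
        if s < e:  # skip exons that are entirely UTR
            blocks.append((s, e))
    return blocks

def coding_sequence(chrom_seq, blocks, strand):
    """Join the blocks in chromosome order, then flip for the (-) strand."""
    seq = "".join(chrom_seq[s:e] for s, e in blocks)
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq

exon_starts = parse_list(
    "2479150,2480082,2481163,2482264,2483000,2484510,2485144,2486245,")
exon_ends = parse_list(
    "2479831,2480114,2481306,2482355,2483156,2484636,2485253,2486613,")
blocks = cds_blocks(exon_starts, exon_ends, 2479705, 2486314)
cds_length = sum(e - s for s, e in blocks)
print(blocks)
print(cds_length, cds_length % 3)
```

The total block length is a multiple of three, which is a quick sanity 
check that the blocks were taken with the right convention. Calling 
coding_sequence(chrom_seq, blocks, "-") on the chr1 sequence would then 
give the transcript-oriented coding sequence.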

There are utilities readily available that work with our coordinate 
system. Some work stand-alone and others require a database. If you do 
not run your own mirror, the public MySQL database can be used when a 
database is required.

A list of utilities is here:
http://hgwdev.cse.ucsc.edu/~larrym/utilities.html

Many can be downloaded pre-compiled from here (for certain platforms):
http://hgdownload.cse.ucsc.edu/admin/exe/

Otherwise, obtain the source and compile locally:
http://hgdownload.cse.ucsc.edu/downloads.html#source_downloads

Public MySQL access instructions:
http://genome.ucsc.edu/FAQ/FAQdownloads.html#download29

Please feel free to contact the mailing list support team again if you 
would like more assistance.

Warm regards,

Jen
UCSC Genome Browser Support

On 9/8/10 11:35 AM, Lipika Ray wrote:
> Hello UCSC group,
>
> I like to get the coding sequence of gene from refseq mrna ids (like,
> NM_003820) from hg18 version - big list of such ids.
>
> So I am getting information of exonstarts , exonends, cdsStart, cdsend from
> refFlat table under hg18.
>
> So for NM_003820, the record looks like this:
>
> geneName: TNFRSF14
>        name: NM_003820
>       chrom: chr1
>      strand: -
>     txStart: 2479150
>       txEnd: 2486613
>    cdsStart: 2479705
>      cdsEnd: 2486314
>   exonCount: 8
> exonStarts: 2479150,2480082,2481163,2482264,2483000,2484510,2485144,2486245,
>    exonEnds: 2479831,2480114,2481306,2482355,2483156,2484636,2485253,2486613,
>
> To get the dna sequence corresponding to the coding regions, I am extracting
> sequences from chr1.fa.gz file under chromosomes in hg18 version and then
> extracting the dna sequence corresponding to the region:
>
> 2479705-2479831, 2480082-2480114, 2481163-2481306, 2482264-2482355,
> 2483000-2483156, 2484510-2484636, 2485144-2485253, 2486245-2486314
>
> The corresponding sequence is not matching if I cross check with the
> sequence from web. Can you please guide me whether I can extract sequence in
> this way, or you already have sequences corresponding to genes stored
> separately in your database.
>
> Thanks for your help.
>
> Lipika
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome