Re: [Genome] multi-search in genome browser

Jennifer Jackson Thu, 01 Apr 2010 13:29:55 -0700

Hello Jose,

There is a streamlined method to input coordinates and obtain gene track 
information using the Table browser.

Basic path:
1) select clade, genome, assembly
2) select gene track to extract data from (UCSC Genes, RefSeq Genes, etc.)
3) select region=genome, then click on the "define regions" button. 
Paste or upload your coordinates in the format specified and submit.
4) select the output format, name the file for download, and click on 
"get output" to extract the data.
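
For step 3, a long list of single positions can be turned into region lines with a short script before pasting. This is only a sketch: it assumes your positions are 1-based (as displayed in the browser) and that the "define regions" box accepts lines of chrom, start, end, name in BED-style 0-based, half-open coordinates. Please check the format stated on that page. The function name, the `flank` parameter and the "regionN" labels are made up for illustration.

```python
def positions_to_bed(positions, flank=0):
    """Convert (chrom, 1-based position) pairs into BED-style lines.

    BED uses 0-based, half-open intervals, so a single 1-based
    position p becomes the interval [p - 1, p).
    """
    lines = []
    for i, (chrom, pos) in enumerate(positions, start=1):
        start = max(0, pos - 1 - flank)  # never go below the chromosome start
        end = pos + flank
        lines.append(f"{chrom}\t{start}\t{end}\tregion{i}")
    return lines


# Illustrative input only; replace with your own coordinate list.
bed_lines = positions_to_bed([("chr1", 1000), ("chrX", 5)])
for line in bed_lines:
    print(line)
```

The region name in the fourth column is optional, but it makes it easier to trace results back to your input later.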

Note that the output contains every row in the track that overlaps any 
of the defined regions, so a single region can produce more than one 
line of output. The output also does not include the original region, 
which may not be what you want.
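
If you only need to know which input region each returned gene row belongs to, the pairing can also be rebuilt locally after download. A minimal sketch, assuming both the regions and the gene rows have already been parsed into (chrom, start, end, name) tuples in the same coordinate system; the function names are hypothetical and the nested loop is only reasonable for modest list sizes:

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def annotate(regions, genes):
    """Return (region_name, gene_name) for every overlapping pair.

    Both inputs are lists of (chrom, start, end, name) tuples.
    This is a simple all-against-all comparison, not an indexed search.
    """
    pairs = []
    for r_chrom, r_start, r_end, r_name in regions:
        for g_chrom, g_start, g_end, g_name in genes:
            if r_chrom == g_chrom and overlaps(r_start, r_end, g_start, g_end):
                pairs.append((r_name, g_name))
    return pairs


# Illustrative data only.
regions = [("chr1", 999, 1000, "region1"), ("chr1", 5000, 5001, "region2")]
genes = [
    ("chr1", 900, 2000, "geneA"),
    ("chr1", 950, 1500, "geneB"),
    ("chr2", 900, 2000, "geneC"),
]
pairs = annotate(regions, genes)
```

Here region1 pairs with both geneA and geneB, which is the "more than one line per region" case, and region2 pairs with nothing.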

Other options:

1) To do a full merge (data from each input in each output row):
create a custom track of your coordinates (BED) and load it, then send 
both your custom track and the gene track to Galaxy and run an "interval 
intersection" there. Use the Table browser to send each data set over 
(check the "Galaxy" box before submitting).

2) To do a merge using a custom track and the intersection function:
create a custom track of your regions (BED), then follow the basic path 
above, but replace step #3: use the "intersection" function to filter 
the output by the custom track regions instead. The output format is the 
same as in the basic path (only data from the primary gene track table).

Complications:
1) Your coordinate ranges are small and some fall into the intron 
regions of genes (no exon overlap == no gene returned from queries).
Solution: Create another custom track, this time from the gene track, 
that ignores the exon blocks, so that each transcript is represented by 
a single genomic "footprint" region. To do this, follow the basic path 
above, leaving region = genome and not using a filter or intersection, 
and choose the output format "selected fields from primary and related 
tables". Name the file. After clicking on "get output", select the 
fields needed to create a simple BED file - the first 6 columns 
(excluding the initial index "bin" field, if present) - and download. 
Reload this file as a custom track and use it as the primary gene track 
in place of the original. Then send it to Galaxy, or use it as the basis 
for a region filter or custom track intersection query.
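
If the downloaded fields are not already in BED column order, a few lines of script can reorder them before reloading. This is a sketch under assumptions: it expects a tab-separated download whose first line is a header beginning with "#", and it expects the gene table to carry the fields chrom, txStart, txEnd, name and strand (check the field names of the table you selected). The function name and the sample row are made up, and the score column is simply filled with 0.

```python
def footprint_bed(tsv_text):
    """Turn a tab-separated gene table download into 6-column BED lines.

    Each transcript becomes one interval from txStart to txEnd, so
    positions inside introns still overlap the transcript.
    """
    rows = [line for line in tsv_text.splitlines() if line.strip()]
    # Strip the leading "#" and any "db.table." prefix from header names.
    header = [h.split(".")[-1] for h in rows[0].lstrip("#").split("\t")]
    bed = []
    for line in rows[1:]:
        rec = dict(zip(header, line.split("\t")))
        bed.append("\t".join([
            rec["chrom"], rec["txStart"], rec["txEnd"],
            rec["name"], "0", rec["strand"],
        ]))
    return bed


# Illustrative input only, not real track data.
sample = "#name\tchrom\tstrand\ttxStart\ttxEnd\nexampleTx1\tchr1\t+\t1000\t5000\n"
footprint = footprint_bed(sample)
```

The resulting lines can be saved to a file and loaded as the custom track described above.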

Help links:
http://genome.ucsc.edu/cgi-bin/hgTables
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks
http://genome.ucsc.edu/FAQ/FAQformat.html#format1 (BED)

If you have any questions or need more guidance, please let us know,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/1/10 2:09 AM, Jose Manuel Castro Tubio wrote:
> Dear Sir or Madam,
> every day we use the genome browser to identify genes from a list of 
> nucleotide positions on human chromosomes (coordinates). However, our 
> current list of coordinates is very long, so searching them one by one in 
> the genome browser would take longer than we can afford. We would like to 
> know whether it is possible to perform a search by entering all our 
> coordinates at once, to obtain a list of the genes in which they are located.
> Waiting for a reply,
> Jose
> _____________
> Jose Tubio, PhD
> Center for Genomic Regulation (CRG)
> Plaça Charles Darwin s/n, PRBB Building, Room 521
> E-08003 Barcelona, Catalunya, Spain
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome