Hello Jose,

There is a streamlined method to input coordinates and obtain gene track information using the Table browser.

Basic path:
1) select clade, genome, assembly
2) select the gene track to extract data from (UCSC Genes, RefSeq Genes, etc.)
3) select region = genome, then click on the "define regions" button. Paste or upload your coordinates in the format specified and submit.
4) select the output format, name the file for download, and click on "get output" to extract the data.

Note that the output will be all rows in the track that have any overlap with the defined regions. There could be more than one line of output per region. Also, the original region will not be in the output. This is sometimes not the desired output format.

Other options:
1) To do a full merge (data from each input in each output row): create a custom track of your coordinates (BED) and load it, then send both your custom track and the gene track to Galaxy and run the "interval intersection" function there. Use the Table browser to send the data over (check the "Galaxy" box before submitting).
2) To do a merge using a custom track and the intersection function: create a custom track of your regions (BED), then follow the basic path above, except change step #3 - instead use the "intersection" function to filter the output based on the custom track regions. The output format is the same as for the original query (only data from the primary gene track table).

Complications:
1) Your coordinate ranges are small and some fall into the intron regions of genes (no exon overlap == no gene returned from queries).

Solution: Create another custom track, this time of the gene track, that excludes exon blocks - to create a global genome "footprint" region for each transcript. To do this, follow the basic steps above, leaving region = genome, do not use a filter/intersection, and use output "selected fields from primary and related tables". Name the file. After clicking on "get output", select the fields to create a simple BED file - the first 6 columns (excluding the initial index "bin" field if present) - and download.
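
As an optional local cross-check of the downloaded footprint file, the overlap logic described above can be sketched in a few lines of Python. This is only an illustration, not what the Table browser or Galaxy actually runs: the function names, the "(no gene)" placeholder, and the sample rows are all made up for the example. It assumes your coordinates are written in the 1-based "chrN:start-end" style and that the gene file is a plain BED (0-based start, end not included) with the name in the fourth column.

```python
from collections import defaultdict

def position_to_bed(position):
    """Convert a 1-based 'chr1:1,000-2,000' position to a 0-based, half-open BED tuple."""
    chrom, span = position.replace(",", "").split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start - 1, end

def load_footprints(bed_lines):
    """Index BED rows (chrom, chromStart, chromEnd, name, ...) by chromosome."""
    by_chrom = defaultdict(list)
    for line in bed_lines:
        # Skip blank lines, '#' header lines, and custom track header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        by_chrom[fields[0]].append((int(fields[1]), int(fields[2]), fields[3]))
    return by_chrom

def genes_for_regions(positions, footprints):
    """Yield (position, gene_name) for every footprint overlapping each position.

    Unlike the basic Table browser query, the original region is kept in each
    output row, and regions without any hit are reported with a placeholder.
    """
    for position in positions:
        chrom, start, end = position_to_bed(position)
        hits = [name for g_start, g_end, name in footprints.get(chrom, [])
                if g_start < end and start < g_end]
        for name in hits or ["(no gene)"]:
            yield position, name

# Hypothetical sample data: three footprint rows and three query positions.
gene_bed = [
    "chr1\t1000\t5000\tgeneA\t0\t+",
    "chr1\t4000\t9000\tgeneB\t0\t-",
    "chr2\t100\t200\tgeneC\t0\t+",
]
positions = ["chr1:4,500-4,600", "chr2:150-150", "chr3:10-20"]

results = list(genes_for_regions(positions, load_footprints(gene_bed)))
for position, name in results:
    print(position, name, sep="\t")
```

Because each footprint runs from transcript start to transcript end, a small region that sits entirely inside an intron is still matched to its gene, and a region overlapping two transcripts produces two output rows.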

Reload as a custom track, and use this custom track as the primary gene track instead. Send it to Galaxy or use it as the basis for a region filter or custom track intersection query.

Help links:
http://genome.ucsc.edu/cgi-bin/hgTables
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks
http://genome.ucsc.edu/FAQ/FAQformat.html#format1 (BED)

If you have any questions or need more guidance, please let us know,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/1/10 2:09 AM, Jose Manuel Castro Tubio wrote:
> Dear Sir or Madam,
> every day we use the genome browser to identify genes from a list of
> nucleotide positions on human chromosomes (coordinates). However, currently
> our list of coordinates is very long, so the search one by one in the genome
> browser will result in a very long process we can not assume. We are asking
> if it would be possible to perform a search by entering all our coordinates
> at once, to obtain a list with the genes where are located our coordinates.
> Waiting for a reply,
> Jose
> _____________
> Jose Tubio, PhD
> Center for Genomic Regulation (CRG)
> Plaça Charles Darwin s/n, PRBB Building, Room 521
> E-08003 Barcelona, Catalunya, Spain
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
