Hello Hernando,

For #1 -

The FAQ for this data format explains the data columns (ignore the bin column, it is for indexing only):
http://genome.ucsc.edu/goldenPath/help/net.html

To extract genes for certain regions, use the Table browser, open the track of interest, and use the region filter by entering coordinates (or pasting/uploading a batch of coordinates using the "define regions" button).
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

For #2 -

There are three options:

a) in mySQL, type "desc <table_name>;" This describes a single table but gives no linkage information to other tables.

b) in the Table browser, select the track/table of interest and use the "describe table schema" button. The next page has that table defined and is followed by linked tables (including what the linking keys are). Clicking on any of these linked tables will bring it up to the top, where the table is defined.

c) in the Downloads area, files in the Annotation database directory have a .txt.gz file with the data and a .sql file with the mySQL table definitions.

Our apologies for the late reply,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/19/10 7:39 AM, Ernando Faddeev wrote:
> Hello Jennifer. Thank you for fast response. I have been looking at the
> options you gave me and I feel I am a bit further, though not quite
> there yet.
>
> 1) Table browser option. The human net track seems to align all the
> human genome with lab mouse homologous nucleic sequences. After
> selecting that table I got 279 MB tab separated file with these headers:
>
> #bin score tName tSize tStart tEnd qName qSize qStrand qStart qEnd id
>
> I can guess what are some of the columns, though is there any
> description for those names so for me to be sure I got it right?
>
> Having the nucleic genome aligned, how can I extract only protein coding
> regions that correspond to the genes that interest me?
> (at the end I only need aligned mouse and human protein sequences of the
> genes that correspond to DNA repair, let's say those from gene ontology
> database that show up after the search on terms "DNA"+"repair" on human
> genome).
>
> 2) MySQL option. I got connected to the public server and I found a lot
> of databases and tables, though I didn't find any ER diagrams and field
> description on the site. What databases should I look at and where to
> find field description?
>
> Greetings,
>
> Hernando Sanchez
>
> On Fri, Apr 16, 2010 at 7:20 PM, Jennifer Jackson <[email protected]> wrote:
>
> > Hello Hernando,
> >
> > There are a few options for you:
> >
> > 1) Use the Table browser for the batch query, possibly in
> > combination with Galaxy to perform full intersections. The
> > intermediate mapping tables could be the Conservation track or
> > the Chain/Net tracks between Human and Mouse.
> >
> > 2) Download the text files representing the tables in the
> > database for the datasets in #1, then create scripts to process
> > your query. Tools from the UCSC utility set may be helpful.
> >
> > 3) Use the public mySQL server to gain direct access to the
> > database and use mySQL, utilities from our source tree, your own
> > tools, etc. to process the query.
> >
> > 4) Create a local mirror of the Browser and do the same as #3,
> > but locally in your own instance.
> >
> > Do you have a preference? #4 would be the most private option,
> > if that is a concern for you, but would require the most
> > up-front work and may not be necessary.
> >
> > Please write back and let us know your preference and we can
> > send full details about suggested tables & utilities, file
> > download/ftp help, and mirroring assistance.
> >
> > Thank you,
> > Jennifer
> >
> > ---------------------------------
> > Jennifer Jackson
> > UCSC Genome Informatics Group
> > http://genome.ucsc.edu/
> >
> > On 4/16/10 8:10 AM, Ernando Faddeev wrote:
> >
> > I want to compile a list of conserved protein coding transcripts(/genes)
> > between mouse (mm9 <http://genome.ucsc.edu/cgi-bin/hgGateway?db=mm9>) and
> > human (hg19) that are involved with DNA repair. Basically I want the table
> > with names next to aligned sequences of the transcripts. I have the names
> > and accession numbers of around 300 human genes that I am interested in, and
> > now I want to find the sequences of their mouse homologs. GBrowser appears
> > to do so to some extent, though only gene by gene and graphically.
> > I have the computational skills required to install the database and
> > manage scripts, though I do not know where to start and what tools to use,
> > so therefore the question: what tools can I use to build this DB?
> >
> > Greetings,
> > Hernando Sanchez
> > _______________________________________________
> > Genome maillist - [email protected]
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
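
As a rough illustration of the region-filtering step discussed in this thread, the Python sketch below reads a tab-separated Table browser dump with the header quoted in the thread (#bin score tName ...) and keeps the rows whose target interval overlaps a region of interest. The function name rows_in_region, the sample rows, and the coordinates are invented for illustration and are not part of any UCSC tool; the sketch also assumes 0-based, half-open start/end coordinates, as used in UCSC database tables.

```python
import csv
import io

# Column names as quoted in the thread (the leading '#' is stripped).
COLUMNS = ["bin", "score", "tName", "tSize", "tStart", "tEnd",
           "qName", "qSize", "qStrand", "qStart", "qEnd", "id"]


def rows_in_region(handle, chrom, start, end):
    """Yield rows (as dicts) whose target interval overlaps chrom:start-end.

    Hypothetical helper; assumes 0-based, half-open coordinates.
    """
    reader = csv.reader(handle, delimiter="\t")
    for fields in reader:
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and the header line
        row = dict(zip(COLUMNS, fields))
        t_start, t_end = int(row["tStart"]), int(row["tEnd"])
        # Two half-open intervals overlap when each starts before the other ends.
        if row["tName"] == chrom and t_start < end and t_end > start:
            yield row


# Made-up sample data, only to show the call; these are not real alignments.
SAMPLE = (
    "#bin\tscore\ttName\ttSize\ttStart\ttEnd\tqName\tqSize\tqStrand\tqStart\tqEnd\tid\n"
    "585\t1000\tchr1\t5000\t100\t200\tchr4\t6000\t+\t300\t400\t1\n"
    "585\t2000\tchr1\t5000\t900\t950\tchr4\t6000\t-\t10\t60\t2\n"
    "585\t3000\tchr2\t7000\t100\t200\tchr5\t8000\t+\t300\t400\t3\n"
)

hits = list(rows_in_region(io.StringIO(SAMPLE), "chr1", 150, 920))
print([row["id"] for row in hits])
```

For a real multi-hundred-megabyte download, the same function can be given an open file handle instead of the in-memory sample, so the rows are streamed rather than loaded at once.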
