Hello Dr. Florian Wagner,

As you have discovered, the accession numbers are stored in the primary database tables without a version extension.
In order to view the version contained in the RefSeq track, create a join to the table gbCdnaInfo. This table contains ancillary information for GenBank sequences. The GenBank data for tracks is incrementally updated nightly, so the most current version of any sequence will be included in the RefSeq Genes track.

Method to join data using the Table browser:

1) set clade, genome, assembly
2) set group = Gene and Gene Prediction tracks
3) set track = RefSeq Genes
4) leave primary table assigned as default (refGene)
5) set region = genome and then paste in the three identifiers (accession number only, no version)
6) set output = selected fields from primary and related tables
7) name file for export and click on "get output"
8) a new form will appear that will allow you to choose fields from the primary table (refGene) and add in associated linked tables. Check the box for the table gbCdnaInfo, and click on "Allow selection from Checked Tables". The table gbCdnaInfo should then appear under the refGene table, where you can select fields for output. The field gbCdnaInfo.version contains the information you want.
9) when complete, click on "get output" (the button is under the first table, refGene)
10) output file should download

Below is an example of output from this procedure using your three RefSeq transcripts. For these three, the version you wanted to extract seems to be the same as the version in the RefSeq Genes track.

#hg19.refGene.name  hg19.refGene.chrom  hg19.refGene.strand  hg19.refGene.txStart  hg19.refGene.txEnd  hg19.gbCdnaInfo.acc  hg19.gbCdnaInfo.version  hg19.gbCdnaInfo.moddate  hg19.gbCdnaInfo.type
NM_000651           chr1                +                    207669472             207815109           NM_000651            4                        2010-02-12               mRNA
NM_139343           chr2                -                    127805608             127864864           NM_139343            1                        2010-02-12               mRNA
NM_007166           chr11               -                    85668485              85780108            NM_007166            2                        2010-02-04               mRNA

It is important to note that many of the GenBank tables are quite large. Some are too large to query with the Table browser.
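For readers who prefer to script this, the join described above can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the SQL assumes that gbCdnaInfo sits in the same database as refGene (as the "hg19." prefixes in the example output suggest) and that the join key is refGene.name = gbCdnaInfo.acc, and the script only builds the query text and compares versions; it does not connect to any server.

```python
import re

# Hypothetical helper names; only the table and field names are taken from
# the Table browser output shown above.
ACCESSION_RE = re.compile(r"^[A-Z]{2}_\d+$")


def split_accession(identifier):
    """Split 'NM_000651.4' into ('NM_000651', 4); version is None if absent."""
    acc, dot, version = identifier.strip().partition(".")
    return acc, (int(version) if dot and version.isdigit() else None)


def build_join_query(accessions, db="hg19"):
    """Build SQL joining refGene to gbCdnaInfo on refGene.name = gbCdnaInfo.acc.

    Assumption: both tables are in the same database (db).
    """
    for acc in accessions:
        # Reject anything that is not a bare accession (no version, no quotes).
        if not ACCESSION_RE.match(acc):
            raise ValueError("unexpected accession format: %r" % acc)
    names = ", ".join("'%s'" % acc for acc in accessions)
    return (
        "SELECT r.name, r.chrom, r.strand, r.txStart, r.txEnd, "
        "g.acc, g.version, g.moddate, g.type "
        "FROM %s.refGene r JOIN %s.gbCdnaInfo g ON r.name = g.acc "
        "WHERE r.name IN (%s)" % (db, db, names)
    )


def compare_versions(identifiers, track_versions):
    """Map accession -> (requested version, track version, match flag).

    An identifier given without a version counts as matching.
    """
    report = {}
    for identifier in identifiers:
        acc, wanted = split_accession(identifier)
        found = track_versions.get(acc)
        report[acc] = (wanted, found, wanted is None or wanted == found)
    return report


# The three identifiers from the original question.
requested = ["NM_000651.4", "NM_007166.2", "NM_139343"]
# Versions copied by hand from the example output above.
track_versions = {"NM_000651": 4, "NM_139343": 1, "NM_007166": 2}

print(build_join_query([split_accession(i)[0] for i in requested]))
print(compare_versions(requested, track_versions))
```

Stripping the version before the lookup mirrors step 5 of the procedure, and comparing it afterwards against gbCdnaInfo.version confirms whether the track holds the version that was originally requested.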
If you find that a join query is stalling out or that it reports incomplete data, then an alternate data access method will be required: either ftp the text files from the Downloads server and perform the filter/join with your own tools, or use the public MySQL server with your own query.

Help links:
Table browser: http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html
Downloads server: http://genome.ucsc.edu/FAQ/FAQdownloads.html#download1
Public MySQL server: http://genome.ucsc.edu/FAQ/FAQdownloads.html#download29

Our apologies for not replying sooner. Your message was bounced by our new mail filters. Writing back for an update was the appropriate way to help us locate and answer your question.

Thank you,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

Date: Thu, 22 Apr 2010 17:02:35 +0200
From: Florian Wagner <[email protected]>
To: [email protected]
Subject: question about upload of RefSeq accession numbers in table browser

Dear Sir/Madam,

I recently tried to upload three RefSeq accession numbers, namely:

NM_000651.4
NM_007166.2
NM_139343

in the Table browser as a txt file, but got the error message

"2 of the 3 given identifiers (e.g. NM_007166.2) have no match in table refGene, field name or in alias table refLink, field name"

I think this is related to the version extension, and when I deleted these extensions everything worked fine. However, I am not sure whether I get exactly what I want when I use "NM_000651" instead of "NM_000651.4". Maybe you can help me with this issue.

Thank you very much and kind regards,
Florian Wagner

--
Dr. Florian Wagner
Head of Microarray Service Unit
T +49 (0) 30/3 19 89 66-41 | F -19
[email protected]
ATLAS Biolabs GmbH
Friedrichstraße 147 | 10117 Berlin
www.atlas-biolabs.de
Registered office: Köln
District court (Amtsgericht) Köln | HRB 59119
Managing Director: Prof. Dr. Peter Nürnberg

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
