Hi Jason,

I think I see what is happening. There are three different ways that kgXref and refLink are related in the Table Browser:

  hg19.refLink.name (via kgXref.geneSymbol)
  hg19.refLink.mrnaAcc (via kgXref.refseq)
  hg19.refLink.protAcc (via kgXref.protAcc)

(I'm seeing these relations by hitting the "describe table schema" button and then scrolling down to the "Connected Tables and Joining Fields" section.)

I'm not sure which of these relationships is used when you select "fields from primary and related tables," but when I count the matches in MySQL, I get a different number of results depending on which relationship I use for the query.

I have two solutions for you. One option is to run your own MySQL queries on this data using our public MySQL server:

http://genome.ucsc.edu/FAQ/FAQdownloads.html#download29

Another, probably easier, option is to use the knownToLocusLink table in the Table Browser. The knownToLocusLink table is part of the UCSC Genes set of tables. I don't know the exact details of how that table was made, but it has Entrez identifiers for 71,350 of the 80,922 genes in the knownGene table, so it covers most of the UCSC Genes.

If you have further questions, please contact us again at [email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group

On 6/7/12 12:36 PM, Jason Lu wrote:
> Hi,
>
> I used table browser for attempting to link the knownGene ids to the entrez ids. I was able to submit the query (by search for suggestions on this board). However I got all 'n/a' in the field of hg19.refLink.locusLinkId (see example below).
> Could anyone point what could go wrong here?
>
>
> ==============
> #hg19.knownGene.name hg19.knownGene.chrom
> hg19.knownGene.strand hg19.knownGene.txStart
> hg19.knownGene.txEnd hg19.knownGene.cdsStart
> hg19.knownGene.cdsEnd hg19.knownGene.exonCount
> hg19.knownGene.exonStarts hg19.knownGene.exonEnds
> hg19.knownGene.proteinID hg19.knownGene.alignID
> hg19.kgXref.geneSymbol hg19.kgXref.refseq
> hg19.refLink.locusLinkId
> uc001aaa.3 chr1 + 11873 14409 11873 11873
> 3 11873,12612,13220, 12227,12721,14409,
> uc001aaa.3 DDX11L1 n/a
> uc010nxr.1 chr1 + 11873 14409 11873 11873
> 3 11873,12645,13220, 12227,12697,14409,
> uc010nxr.1 DDX11L1 n/a
> uc010nxq.1 chr1 + 11873 14409 12189 13639
> 3 11873,12594,13402, 12227,12721,14409,
> B7ZGX9 uc010nxq.1 DDX11L9 n/a
> uc009vis.3 chr1 - 14361 16765 14361 14361
> 4 14361,14969,15795,16606, 14829,15038,15942,16765,
> uc009vis.3 WASH7P n/a
> uc009vjc.1 chr1 - 16857 17751 16857 16857
> 2 16857,17232, 17055,17751,
> uc009vjc.1 WASH7P n/a
>
>
> Thanks,
>
> Jason
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
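
P.S. To illustrate the point in the reply about getting different match counts depending on which kgXref/refLink relationship is used, here is a minimal sketch in Python. It does not connect to the public MySQL server; it builds two tiny stand-in tables in an in-memory SQLite database. Every row is invented for illustration, the kgID column is assumed as the kgXref key, and skipping empty accessions is a choice made for this sketch, not necessarily what the Table Browser does.

```python
import sqlite3

# Toy stand-ins for hg19.kgXref and hg19.refLink.
# All rows are invented; only the joining column names come from the
# "Connected Tables and Joining Fields" relations listed in the reply.
# kgID is assumed here as the kgXref key column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kgXref (kgID TEXT, geneSymbol TEXT, refseq TEXT, protAcc TEXT);
CREATE TABLE refLink (name TEXT, mrnaAcc TEXT, protAcc TEXT, locusLinkId INTEGER);
INSERT INTO kgXref VALUES
  ('uc_toy1.1', 'GENEA', 'NM_toy1', 'NP_toy1'),
  ('uc_toy2.1', 'GENEA', '',        ''),
  ('uc_toy3.1', 'GENEB', 'NR_toy3', '');
INSERT INTO refLink VALUES
  ('GENEA', 'NM_toy1', 'NP_toy1', 101),
  ('GENEB', 'NR_toy3', '',        102);
""")

# The three relationships, written as join conditions.
# Empty accessions are skipped so that blank fields do not match each other.
JOINS = {
    "geneSymbol": "k.geneSymbol = r.name",
    "refseq": "k.refseq = r.mrnaAcc AND k.refseq != ''",
    "protAcc": "k.protAcc = r.protAcc AND k.protAcc != ''",
}

def count_matches(condition):
    """Count distinct kgXref ids that find a refLink row under one join condition."""
    sql = (
        "SELECT COUNT(DISTINCT k.kgID) "
        "FROM kgXref k JOIN refLink r ON " + condition
    )
    return conn.execute(sql).fetchone()[0]

counts = {label: count_matches(cond) for label, cond in JOINS.items()}
print(counts)
```

With these toy rows the three joins match 3, 2, and 1 identifiers respectively, which is the same kind of discrepancy described in the reply. The SELECT statements could be adapted for the real tables using the connection details on the FAQ page linked in the reply.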
