Dear Brooke, UCSC team, thank you very much for this explanation! I could get the table this way.
I would like to ask a more general question about the UCSC Table Browser. I often want to build tables that connect fields from one table with fields from other tables, just as in the question you just answered.

1.) How can I find out by myself how to link a field from one table to a field in another table? For example, in the previous question I wanted to link RefSeq to description. But how does one get the crucial information that "name" is in "rn4.description", and that "rn4.description" only appears after "rn4.gbCdnaInfo" is selected? As it is now, I always need to ask on the UCSC mailing list, but I would really like to figure this out by myself. I often try to find out how the tables are linked by selecting tables with promising names, but there are simply too many tables, and the field descriptions in the Table Browser are somewhat vague (for example "rn4.refLink: Link together a refseq mRNA and other stuff"). Is there something like an overview "map" of how the tables are linked to each other, i.e. how does one find out how to get from one specific field to another specific field? Being able to figure this out by myself would help me greatly in my research.

2.) Related to the above question: when clicking on a (RGD) gene, I often get a very rich description of the gene on the "Description and Page Index". This information is often what I want to link. How can I find out which field, in which table, this information comes from? For example, there is the box "Microarray Expression Data", which contains "Affymetrix All Exon Microarrays". How can I find out which table in the Table Browser contains this information? Or let's say I want to know all chemicals that interact with a gene, as shown on the "Description and Page Index". How would one figure out which fields/tables contain this info? (This is just an example.
I am actually not interested in microarrays or toxicogenomics, but I am asking about the general mechanism for finding the connection to the fields). Maybe it is possible to put this information right into the "Description and Page Index" itself? Or is there maybe another table which in turn contains this information?

Thanks.
Anton

P.S.: From what I understand, the Table Browser is just a web interface to the MySQL database at genome-mysql.cse.ucsc.edu. I have some basic knowledge of MySQL, so if this is not possible with the Table Browser interface but is possible with the MySQL interface, that would be helpful too.

On Sat, Jun 30, 2012 at 9:09 AM, Brooke Rhead <[email protected]> wrote:

> Hi Anton,
>
> RGD replaced UCSC Genes, not RefSeq Genes, so you are right that you do
> not need the RGD tables at all. The RefSeq ID and symbol are contained in
> the refGene table. The description is in a table called "description,"
> which can be connected to refGene via the gbCdnaInfo table. To do this in
> the Table Browser, select the rn4 RefSeq Genes track, then:
>
> table: refGene
> region: genome (or whatever you want to limit the output to)
>
> output format: selected fields from primary and related tables
>
> Hit "get output" and scroll down to the Linked Tables section. Select the
> gbCdnaInfo table and hit "allow selection from checked tables." Once you
> have done that, you should be able to scroll down again and select the
> "description" table.
>
> Now select the name and name2 fields from refGene, and the name field from
> description and hit "get output". You should get the output you are
> looking for.
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
> On 6/28/12 8:20 PM, Anton Kratz wrote:
>
>> Dear UCSC team,
>>
>> regarding my original email from Thu Apr 26 01:17:46 PDT 2012, subject:
>> [Genome] kgXref, rn4.
>>
>> Could I please have a comment on how to connect rn4 RefSeq to Gene Symbol
>> and description? I mean RefSeq, *not* RGD!
>>
>> I want to make a tab-separated table with entries like this:
>> # RefSeq id Symbol Description
>> NM_001107116.1 Pcp2 Rattus norvegicus Purkinje cell protein 2 (Pcp2), mRNA.
>>
>> I got an answer from Luvina on Apr 27. However, that answer was with respect
>> to RGD, but I need to connect RefSeq. We then got sidetracked with some
>> issues regarding the RGD tables, so herewith I just wanted to come back to
>> the original question and would be happy if you could help me with this.
>> Thank you.
>>
>> Anton
>>
>> On Wed, May 30, 2012 at 5:52 AM, Steve Heitner <[email protected]> wrote:
>>
>>> Hello, Anton.
>>>
>>> The purpose of the rgdGene2Xref table is to combine the contents of several
>>> of the supporting tables, but you are correct that there appears to be some
>>> disagreement between the various fields of rgdGene2Xref. Thank you for
>>> bringing this to our attention. We will certainly look into this issue.
>>>
>>> The best solution here would be to link to the individual tables that
>>> contain the information you are interested in. Based on your description of
>>> your desired output, follow Luvina's previous instructions, but in the step
>>> where she instructed you to link to rgdGene2Xref, you will link to
>>> rgdGene2ToDescription, rgdGene2ToRefSeq and rgdGene2ToSymbol.
>>>
>>> For your output, you will want to select rn4.rgdGene2.name,
>>> rn4.rgdGene2ToDescription.value, rn4.rgdGene2ToRefSeq.value and
>>> rn4.rgdGene2ToSymbol.geneSymbol. You also mentioned that you would like
>>> coordinates included in your output. There are several coordinate fields in
>>> the rn4.rgdGene2 table. Select the fields appropriate to your needs.
>>>
>>> Please contact us again at [email protected] if you have any further
>>> questions.
>>>
>>> ---
>>> Steve Heitner
>>> UCSC Genome Bioinformatics Group
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:genome-bounces@soe.ucsc.edu] On Behalf Of Anton Kratz
>>> Sent: Monday, May 28, 2012 10:00 PM
>>> To: UCSC Genome Browser Mailing List
>>> Subject: [Genome] [info and infoType don't match] Re: kgXref, rn4
>>>
>>> Dear UCSC team,
>>>
>>> thank you for your answer; I have a follow-up question.
>>>
>>> I followed your instructions to generate a table with these three fields:
>>> #rn4.rgdGene2.name rn4.rgdGene2Xref.infoType rn4.rgdGene2Xref.info
>>>
>>> How is one supposed to make the correct relation between the comma-separated
>>> elements in rn4.rgdGene2Xref.infoType and rn4.rgdGene2Xref.info?
>>>
>>> I believe the idea is that the entries for the infoType and info fields are
>>> themselves comma-separated fields, and are in the same order from left to
>>> right?! But this is not actually the case (examples see below).
>>>
>>> I want to parse this data with my own Perl scripts.
>>>
>>> I do not understand how one can (in a Perl or Python script) figure out from
>>> this data which of the entries in rn4.rgdGene2Xref.infoType belongs to which
>>> of the entries in rn4.rgdGene2Xref.info.
>>>
>>> For example, the first line:
>>>
>>> RGD:1565877 GeneID,ID,Name,gene, 499014,RGD1565877,Zdhhc14,
>>>
>>> The field "infoType" has four entries, "info" has three entries.
>>>
>>> Or this entry:
>>>
>>> RGD:1303141 GeneID,ID,Name,Note,gene,
>>> 308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that bind
>>> protein phosphatase 1 and cytoplasmic actin%3B may play a role in regulation
>>> of the actin cytoskeleton,Phactr2,
>>>
>>> In this case the comma-separated fields seem to be mixed up:
>>>
>>> GeneID 308291
>>> ID Phactr2
>>> Name RGD1303141
>>> Note Phactr2
>>> gene member of a family of proteins that bind protein [...]
>>> ??? Phactr2
>>>
>>> info and infoType do not match, and I get the actual Gene Symbol three
>>> times...
>>>
>>> Could you please explain how to parse a table, which has been generated as
>>> described in the previous emails (please see below)?
>>>
>>> What I want to generate is a table with RGD-Gene ID and coordinates, Protein
>>> encoded, Description in TAB-separated fields so that I can parse this from a
>>> script.
>>>
>>> best regards,
>>> Anton
>>>
>>> On Fri, Apr 27, 2012 at 3:00 AM, Luvina Guruvadoo <[email protected]> wrote:
>>>
>>>> Hi Anton,
>>>>
>>>> The rgdGene2Xref table should provide the information you are looking for.
>>>> In the Table Browser, make the following selections:
>>>>
>>>> track: RGD Genes
>>>> table: rgdGene2
>>>> output format: selected fields from primary and related tables
>>>>
>>>> Click 'get output'. Scroll down and select rgdGene2Xref from the
>>>> Linked Tables section and click 'Allow Selection From Checked Tables'.
>>>> From here, you can select fields such as infoType and info from
>>>> rn4.rgdGene2Xref.
>>>>
>>>> I hope this helps. Please contact us again at [email protected] if
>>>> you have any further questions.
>>>>
>>>> ---
>>>> Luvina Guruvadoo
>>>> UCSC Genome Bioinformatics Group
>>>>
>>>> On 4/26/2012 1:17 AM, Anton Kratz wrote:
>>>>
>>>>> Dear UCSC team,
>>>>>
>>>>> I am trying to make a table that contains RefSeq IDs and various
>>>>> other information.
>>>>>
>>>>> I can not find the table kgXref anymore, when I select RefSeq in the
>>>>> table browser.
>>>>>
>>>>> I assume that RefSeq has been superseded by RGD, but there is a lot
>>>>> of guessing involved regarding the way the tables are linked.
>>>>>
>>>>> Could you please let me know about the relevant changes?
>>>>>
>>>>> I'd like to make a table with RefSeq pointing to gene symbol,
>>>>> description and maybe other type of info.
>>>>>
>>>>> thanks,
>>>>> Anton
>>>>> _______________________________________________
>>>>> Genome maillist - [email protected]
>>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
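
Regarding the P.S. about the MySQL interface: the Table Browser steps Brooke describes (refGene, then gbCdnaInfo, then description) appear to correspond to a two-step SQL join. The following is only a minimal sketch of that idea, not UCSC's own query. It uses an in-memory SQLite database with toy tables so that it runs without a network connection. The join keys (refGene.name = gbCdnaInfo.acc, and gbCdnaInfo.description = description.id) are assumptions that should be checked against the schema page of each table, and the function names and the "NM_TOY_UNLINKED" row are hypothetical. The one realistic row reuses the Pcp2 example from this thread, without the version suffix.

```python
import sqlite3

# ASSUMPTION: join keys to be verified on each table's schema page:
#   refGene.name           = gbCdnaInfo.acc
#   gbCdnaInfo.description = description.id
JOIN_SQL = """
SELECT r.name, r.name2, d.name
FROM refGene AS r
JOIN gbCdnaInfo AS g ON g.acc = r.name
JOIN description AS d ON d.id = g.description
ORDER BY r.name
"""


def build_toy_db():
    """Create an in-memory stand-in holding only the columns used in the join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE refGene (name TEXT, name2 TEXT);
        CREATE TABLE gbCdnaInfo (acc TEXT, description INTEGER);
        CREATE TABLE description (id INTEGER, name TEXT);
        """
    )
    # Row modelled on the example in this thread.
    conn.execute("INSERT INTO refGene VALUES (?, ?)", ("NM_001107116", "Pcp2"))
    conn.execute("INSERT INTO gbCdnaInfo VALUES (?, ?)", ("NM_001107116", 1))
    conn.execute(
        "INSERT INTO description VALUES (?, ?)",
        (1, "Rattus norvegicus Purkinje cell protein 2 (Pcp2), mRNA."),
    )
    # Hypothetical row with no gbCdnaInfo entry; the inner join drops it.
    conn.execute("INSERT INTO refGene VALUES (?, ?)", ("NM_TOY_UNLINKED", "ToyGene"))
    return conn


def refseq_table(conn):
    """Return tab-separated lines: RefSeq id, symbol, description."""
    return ["\t".join(row) for row in conn.execute(JOIN_SQL)]


if __name__ == "__main__":
    for line in refseq_table(build_toy_db()):
        print(line)
```

If the assumed keys hold, the same SELECT statement could presumably be run unchanged against the rn4 database on the public MySQL server mentioned in the P.S. Because the sketch uses inner joins, any refGene row without a matching gbCdnaInfo or description entry is silently omitted; a LEFT JOIN would keep such rows with an empty description.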
