Dear UCSC team, regarding my original email from Thu Apr 26 01:17:46 PDT 2012, subject: [Genome] kgXref, rn4.

Could I please have a comment on how to connect rn4 RefSeq to Gene Symbol
and description? I mean RefSeq, *not* RGD!

I want to make a tab-separated table with entries like this:

# RefSeq id      Symbol  Description
NM_001107116.1   Pcp2    Rattus norvegicus Purkinje cell protein 2 (Pcp2), mRNA.

I got an answer from Luvina on Apr 27. However, that answer was with respect
to RGD, but I need to connect RefSeq. We then got sidetracked by some issues
regarding the RGD tables, so herewith I just wanted to come back to the
original question and would be happy if you could help me with this.

Thank you.
Anton

On Wed, May 30, 2012 at 5:52 AM, Steve Heitner <[email protected]> wrote:
> Hello, Anton.
>
> The purpose of the rgdGene2Xref table is to combine the contents of several
> of the supporting tables, but you are correct that there appears to be some
> disagreement between the various fields of rgdGene2Xref. Thank you for
> bringing this to our attention. We will certainly look into this issue.
>
> The best solution here would be to link to the individual tables that
> contain the information you are interested in. Based on your description of
> your desired output, follow Luvina's previous instructions, but in the step
> where she instructed you to link to rgdGene2Xref, you will link to
> rgdGene2ToDescription, rgdGene2ToRefSeq and rgdGene2ToSymbol.
>
> For your output, you will want to select rn4.rgdGene2.name,
> rn4.rgdGene2ToDescription.value, rn4.rgdGene2ToRefSeq.value and
> rn4.rgdGene2ToSymbol.geneSymbol. You also mentioned that you would like
> coordinates included in your output. There are several coordinate fields in
> the rn4.rgdGene2 table. Select the fields appropriate to your needs.
>
> Please contact us again at [email protected] if you have any further
> questions.
>
> ---
> Steve Heitner
> UCSC Genome Bioinformatics Group
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Anton Kratz
> Sent: Monday, May 28, 2012 10:00 PM
> To: UCSC Genome Browser Mailing List
> Subject: [Genome] [info and infoType don't match] Re: kgXref, rn4
>
> Dear UCSC team,
>
> thank you for your answer; I have a follow-up question.
>
> I followed your instructions to generate a table with these three fields:
> #rn4.rgdGene2.name    rn4.rgdGene2Xref.infoType    rn4.rgdGene2Xref.info
>
> How is one supposed to make the correct relation between the comma-separated
> elements in rn4.rgdGene2Xref.infoType and rn4.rgdGene2Xref.info?
>
> I believe the idea is that the entries for the infoType and info fields are
> themselves comma-separated fields, and are in the same order from left to
> right?! But this is not actually the case (examples see below).
>
> I want to parse this data with my own Perl scripts.
>
> I do not understand how one can (in a Perl or Python script) figure out from
> this data which of the entries in rn4.rgdGene2Xref.infoType belongs to which
> of the entries in rn4.rgdGene2Xref.info.
>
> For example, the first line:
>
> RGD:1565877    GeneID,ID,Name,gene,    499014,RGD1565877,Zdhhc14,
>
> The field "infoType" has four entries, "info" has three entries.
>
> Or this entry:
>
> RGD:1303141    GeneID,ID,Name,Note,gene,    308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that bind protein phosphatase 1 and cytoplasmic actin%3B may play a role in regulation of the actin cytoskeleton,Phactr2,
>
> In this case the comma-separated fields seem to be mixed up:
>
> GeneID    308291
> ID        Phactr2
> Name      RGD1303141
> Note      Phactr2
> gene      member of a family of proteins that bind protein [...]
> ???       Phactr2
>
> info and infoType do not match, and I get the actual Gene Symbol three
> times...
>
> Could you please explain how to parse a table, which has been generated as
> described in the previous emails (please see below)?
>
> What I want to generate is a table with RGD-Gene ID and coordinates, Protein
> encoded, Description in TAB-separated fields so that I can parse this from a
> script.
>
> best regards,
> Anton
>
>
> On Fri, Apr 27, 2012 at 3:00 AM, Luvina Guruvadoo <[email protected]> wrote:
>
> > Hi Anton,
> >
> > The rgdGene2Xref table should provide the information you are looking for.
> > In the Table Browser, make the following selections:
> >
> > track: RGD Genes
> > table: rgdGene2
> > output format: selected fields from primary and related tables
> >
> > Click 'get output'. Scroll down and select rgdGene2Xref from the
> > Linked Tables section and click 'Allow Selection From Checked Tables'.
> > From here, you can select fields such as infoType and info from
> > rn4.rgdGene2Xref.
> >
> > I hope this helps. Please contact us again at [email protected] if
> > you have any further questions.
> >
> > ---
> > Luvina Guruvadoo
> > UCSC Genome Bioinformatics Group
> >
> >
> >
> > On 4/26/2012 1:17 AM, Anton Kratz wrote:
> >
> >> Dear UCSC team,
> >>
> >> I am trying to make a table that contains RefSeq IDs and various
> >> other information.
> >>
> >> I can not find the table kgXref anymore, when I select RefSeq in the
> >> table browser.
> >>
> >> I assume that RefSeq has been superseded by RGD, but there is a lot
> >> of guessing involved regarding the way the tables are linked.
> >>
> >> Could you please let me know about the relevant changes?
> >>
> >> I'd like to make a table with RefSeq pointing to gene symbol,
> >> description and maybe other type of info.
> >>
> >> thanks,
> >> Anton
> >> _______________________________________________
> >> Genome maillist - [email protected]
> >> https://lists.soe.ucsc.edu/mailman/listinfo/genome
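
The linking step described in the reply above (rgdGene2ToRefSeq, rgdGene2ToSymbol and rgdGene2ToDescription joined on the RGD gene id) could be reproduced offline along these lines. This is only a minimal Python sketch under assumptions that are not confirmed in the thread: each table has been dumped to a two-column tab-separated file whose first column is the RGD id, and the function names `read_pairs` and `build_refseq_table` as well as the id `RGD:EXAMPLE1` are made up for illustration. The RefSeq accession, symbol and description are taken from the example row in the question.

```python
import io


def read_pairs(handle):
    """Yield (key, value) from a two-column tab-separated dump.

    Assumption: column 1 is the RGD gene id, column 2 the mapped value.
    Lines starting with '#' (Table Browser headers) are skipped.
    """
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) >= 2:
            yield fields[0], fields[1]


def build_refseq_table(refseq_pairs, symbol_pairs, description_pairs):
    """Return (refseq_id, symbol, description) rows joined on the RGD id.

    One RGD id may map to several RefSeq ids, so the RefSeq pairs are
    iterated rather than collapsed into a dict. Missing values become "".
    """
    symbols = dict(symbol_pairs)
    descriptions = dict(description_pairs)
    rows = []
    for rgd_id, refseq_id in refseq_pairs:
        rows.append(
            (refseq_id, symbols.get(rgd_id, ""), descriptions.get(rgd_id, ""))
        )
    return rows


# Illustrative dumps; "RGD:EXAMPLE1" is a placeholder id, not a real RGD id.
REFSEQ_DUMP = "#name\tvalue\nRGD:EXAMPLE1\tNM_001107116.1\n"
SYMBOL_DUMP = "#name\tgeneSymbol\nRGD:EXAMPLE1\tPcp2\n"
DESCRIPTION_DUMP = (
    "#name\tvalue\n"
    "RGD:EXAMPLE1\tRattus norvegicus Purkinje cell protein 2 (Pcp2), mRNA.\n"
)

rows = build_refseq_table(
    read_pairs(io.StringIO(REFSEQ_DUMP)),
    read_pairs(io.StringIO(SYMBOL_DUMP)),
    read_pairs(io.StringIO(DESCRIPTION_DUMP)),
)
for row in rows:
    print("\t".join(row))
```

Keying the output on the RefSeq id rather than the RGD id gives the table layout asked for at the top of the thread; whether the description stored in rgdGene2ToDescription matches the RefSeq description text is not something the thread settles.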

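For the infoType/info question raised earlier in the thread, a script can at least refuse to pair the two fields positionally when their entry counts differ. The following is a minimal Python sketch, not a fix for the table itself: the helper names `split_list_field` and `pair_info` are hypothetical, and the trailing comma is stripped only because both examples in the thread end with one. Equal counts do not prove the order is right, and a comma inside a Note value would still break the split, so linking the individual rgdGene2To* tables remains the safer route.

```python
def split_list_field(field):
    """Split a comma-separated field, ignoring the trailing comma
    that appears in the examples quoted in the thread."""
    trimmed = field.rstrip(",")
    return trimmed.split(",") if trimmed else []


def pair_info(info_type, info):
    """Pair infoType entries with info entries by position.

    Returns a list of (type, value) tuples, or None when the two fields
    have different entry counts and positional pairing cannot be trusted.
    """
    types = split_list_field(info_type)
    values = split_list_field(info)
    if len(types) != len(values):
        return None
    return list(zip(types, values))


# First example row from the thread: four infoType entries, three info entries.
first = pair_info("GeneID,ID,Name,gene,", "499014,RGD1565877,Zdhhc14,")

# Second example row from the thread: five infoType entries, six info entries.
second = pair_info(
    "GeneID,ID,Name,Note,gene,",
    "308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that "
    "bind protein phosphatase 1 and cytoplasmic actin%3B may play a role in "
    "regulation of the actin cytoskeleton,Phactr2,",
)

print(first is None, second is None)
```

Both rows quoted in the thread are rejected by this check, which is consistent with the mismatch described there.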