Dear UCSC team, thank you for your answer; I have a follow-up question.
I followed your instructions to generate a table with these three fields:

#rn4.rgdGene2.name    rn4.rgdGene2Xref.infoType    rn4.rgdGene2Xref.info

How is one supposed to relate the comma-separated elements in rn4.rgdGene2Xref.infoType to those in rn4.rgdGene2Xref.info? I assumed that the infoType and info fields are both comma-separated lists whose entries appear in the same order from left to right. But this is not actually the case (see the examples below).

I want to parse this data with my own Perl scripts. I do not understand how one can (in a Perl or Python script) figure out from this data which entry in rn4.rgdGene2Xref.infoType belongs to which entry in rn4.rgdGene2Xref.info.

For example, the first line:

RGD:1565877    GeneID,ID,Name,gene,    499014,RGD1565877,Zdhhc14,

The field "infoType" has four entries, but "info" has only three.

Or this entry:

RGD:1303141    GeneID,ID,Name,Note,gene,    308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that bind protein phosphatase 1 and cytoplasmic actin%3B may play a role in regulation of the actin cytoskeleton,Phactr2,

In this case the comma-separated fields seem to be mixed up:

GeneID    308291
ID        Phactr2
Name      RGD1303141
Note      Phactr2
gene      member of a family of proteins that bind protein [...]
???       Phactr2

The info and infoType entries do not match, and I get the actual gene symbol three times.

Could you please explain how to parse a table that has been generated as described in the previous emails (please see below)?

What I want to generate is a table with the RGD gene ID and coordinates, the protein encoded, and the description in TAB-separated fields, so that I can parse it from a script.

Best regards,
Anton

On Fri, Apr 27, 2012 at 3:00 AM, Luvina Guruvadoo <[email protected]> wrote:

> Hi Anton,
>
> The rgdGene2Xref table should provide the information you are looking for.
> In the Table Browser, make the following selections:
>
> track: RGD Genes
> table: rgdGene2
> output format: selected fields from primary and related tables
>
> Click 'get output'. Scroll down and select rgdGene2Xref from the Linked
> Tables section and click 'Allow Selection From Checked Tables'. From here,
> you can select fields such as infoType and info from rn4.rgdGene2Xref.
>
> I hope this helps. Please contact us again at [email protected] if you
> have any further questions.
>
> ---
> Luvina Guruvadoo
> UCSC Genome Bioinformatics Group
>
> On 4/26/2012 1:17 AM, Anton Kratz wrote:
>
>> Dear UCSC team,
>>
>> I am trying to make a table that contains RefSeq IDs and various other
>> information.
>>
>> I cannot find the table kgXref anymore when I select RefSeq in the table
>> browser.
>>
>> I assume that RefSeq has been superseded by RGD, but there is a lot of
>> guessing involved regarding the way the tables are linked.
>>
>> Could you please let me know about the relevant changes?
>>
>> I'd like to make a table with RefSeq pointing to gene symbol, description
>> and maybe other types of info.
>>
>> thanks,
>> Anton
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
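
For reference, the mismatch described in the question can be checked mechanically. The following is only a minimal Python sketch, not a solution to the pairing problem: it assumes the Table Browser output is TAB-separated with exactly three columns, that each list ends with a trailing comma, and that no entry contains a literal comma (the `%3B` in the Note text suggests special characters are escaped, but that is a guess). The helper names `split_list` and `check_row` are made up for this illustration.

```python
def split_list(field):
    """Split a comma-separated field, dropping the empty entry
    produced by the trailing comma."""
    items = field.split(",")
    if items and items[-1] == "":
        items.pop()
    return items


def check_row(line):
    """Return (name, infoType entries, info entries, counts_match)
    for one line. Assumes three TAB-separated columns."""
    name, info_type, info = line.rstrip("\n").split("\t")
    types = split_list(info_type)
    values = split_list(info)
    return name, types, values, len(types) == len(values)


# The two example rows quoted in the question (TABs assumed between columns).
rows = [
    "RGD:1565877\tGeneID,ID,Name,gene,\t499014,RGD1565877,Zdhhc14,",
    "RGD:1303141\tGeneID,ID,Name,Note,gene,\t308291,Phactr2,RGD1303141,"
    "Phactr2,member of a family of proteins that bind protein phosphatase 1 "
    "and cytoplasmic actin%3B may play a role in regulation of the actin "
    "cytoskeleton,Phactr2,",
]

for row in rows:
    name, types, values, ok = check_row(row)
    status = "counts match" if ok else "counts differ"
    print(f"{name}: {len(types)} infoType entries, "
          f"{len(values)} info entries ({status})")
```

A check like this only flags rows where the two lists cannot be paired by position; it does not reveal which info value belongs to which infoType.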
