Hello, Anton.

The purpose of the rgdGene2Xref table is to combine the contents of several
of the supporting tables, but you are correct that there appears to be some
disagreement between the various fields of rgdGene2Xref.  Thank you for
bringing this to our attention.  We will certainly look into this issue.

The best solution here would be to link to the individual tables that
contain the information you are interested in.  Based on your description of
your desired output, follow Luvina's previous instructions, but in the step
where she instructed you to link to rgdGene2Xref, link instead to
rgdGene2ToDescription, rgdGene2ToRefSeq and rgdGene2ToSymbol.

For your output, you will want to select rn4.rgdGene2.name,
rn4.rgdGene2ToDescription.value, rn4.rgdGene2ToRefSeq.value and
rn4.rgdGene2ToSymbol.geneSymbol.  You also mentioned that you would like
coordinates included in your output.  There are several coordinate fields in
the rn4.rgdGene2 table.  Select the fields appropriate to your needs.
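
Once the output is saved, it can be read with a short script.  The following
is only a minimal Python sketch.  It assumes the output is tab-separated text
whose first line is a '#'-prefixed header of fully qualified field names (as
in the header quoted in your message below).  The function name is
illustrative, and the description and RefSeq values in the sample row are
placeholders, not real data:

```python
import io


def read_table_browser_output(handle):
    """Yield one dict per data row, keyed by the names in the '#' header line."""
    header = None
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if header is None:
            # Assumption: the first non-blank line is the header.
            # Drop its leading '#' so the first key is a clean field name.
            fields[0] = fields[0].lstrip("#")
            header = fields
            continue
        yield dict(zip(header, fields))


# Placeholder input standing in for a saved Table Browser output file.
sample = (
    "#rn4.rgdGene2.name\trn4.rgdGene2ToDescription.value\t"
    "rn4.rgdGene2ToRefSeq.value\trn4.rgdGene2ToSymbol.geneSymbol\n"
    "RGD:1303141\tplaceholder description\tNM_PLACEHOLDER\tPhactr2\n"
)

rows = list(read_table_browser_output(io.StringIO(sample)))
for row in rows:
    print(row["rn4.rgdGene2.name"], row["rn4.rgdGene2ToSymbol.geneSymbol"])
```

To read a real file, pass an open file handle instead of the io.StringIO
object.  Any coordinate fields you select from rn4.rgdGene2 would appear as
additional keys in each row.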

Please contact us again at [email protected] if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Anton Kratz
Sent: Monday, May 28, 2012 10:00 PM
To: UCSC Genome Browser Mailing List
Subject: [Genome] [info and infoType don't match] Re: kgXref, rn4

Dear UCSC team,

thank you for your answer; I have a follow-up question.

I followed your instructions to generate a table with these three fields:
#rn4.rgdGene2.name    rn4.rgdGene2Xref.infoType    rn4.rgdGene2Xref.info

How is one supposed to make the correct relation between the comma-separated
elements in rn4.rgdGene2Xref.infoType and rn4.rgdGene2Xref.info?

I believe the idea is that the infoType and info fields each hold a
comma-separated list, with the entries in the same order from left to
right. But this is not actually the case (see the examples below).

I want to parse this data with my own Perl scripts.

I do not understand how one can (in a Perl or Python script) figure out from
this data which of the entries in rn4.rgdGene2Xref.infoType belongs to which
of the entries in rn4.rgdGene2Xref.info.

For example, the first line:

RGD:1565877    GeneID,ID,Name,gene,    499014,RGD1565877,Zdhhc14,

The field "infoType" has four entries, "info" has three entries.

Or this entry:

RGD:1303141    GeneID,ID,Name,Note,gene,
308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that bind
protein phosphatase 1 and cytoplasmic actin%3B may play a role in regulation
of the actin cytoskeleton,Phactr2,

In this case the comma-separated fields seem to be mixed up:

GeneID 308291
ID     Phactr2
Name   RGD1303141
Note   Phactr2
gene   member of a family of proteins that bind protein [...]
???    Phactr2

info and infoType do not match, and I get the actual Gene Symbol three
times...
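
The mismatch can be checked mechanically. Below is a minimal Python sketch
using the two rows quoted above. It assumes each field is a comma-separated
list ending in a trailing comma, and that "%3B" is a URL-encoded semicolon.
The function names are illustrative only. Rows whose two lists differ in
length are reported rather than paired, since position alone cannot be
trusted for them:

```python
from urllib.parse import unquote


def split_list_field(field):
    """Split a comma-separated field, dropping the empty item left by a trailing comma."""
    items = field.split(",")
    if items and items[-1] == "":
        items.pop()
    return items


def pair_if_consistent(info_type, info):
    """Return (type, value) pairs, or None when the two lists differ in length."""
    types = split_list_field(info_type)
    values = [unquote(v) for v in split_list_field(info)]  # "%3B" becomes ";"
    if len(types) != len(values):
        return None
    return list(zip(types, values))


# The two rows quoted in this message: RGD id -> (infoType, info).
examples = {
    "RGD:1565877": (
        "GeneID,ID,Name,gene,",
        "499014,RGD1565877,Zdhhc14,",
    ),
    "RGD:1303141": (
        "GeneID,ID,Name,Note,gene,",
        "308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins "
        "that bind protein phosphatase 1 and cytoplasmic actin%3B may play "
        "a role in regulation of the actin cytoskeleton,Phactr2,",
    ),
}

for rgd_id, (info_type, info) in examples.items():
    n_types = len(split_list_field(info_type))
    n_values = len(split_list_field(info))
    status = "ok" if pair_if_consistent(info_type, info) else "MISMATCH"
    print(rgd_id, n_types, n_values, status)
```

For the two rows above, this counts 4 types against 3 values and 5 types
against 6 values, so neither row can be paired by position.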

Could you please explain how to parse a table, which has been generated as
described in the previous emails (please see below)?

What I want to generate is a table with RGD-Gene ID and coordinates, Protein
encoded, Description in TAB-separated fields so that I can parse this from a
script.

best regards,
Anton


On Fri, Apr 27, 2012 at 3:00 AM, Luvina Guruvadoo
<[email protected]> wrote:

> Hi Anton,
>
> The rgdGene2Xref table should provide the information you are looking for.
> In the Table Browser, make the following selections:
>
> track: RGD Genes
> table: rgdGene2
> output format: selected fields from primary and related tables
>
> Click 'get output'. Scroll down and select rgdGene2Xref from the
> Linked Tables section and click 'Allow Selection From Checked Tables'.
> From here, you can select fields such as infoType and info from
> rn4.rgdGene2Xref.
>
> I hope this helps. Please contact us again at [email protected] if 
> you have any further questions.
>
> ---
> Luvina Guruvadoo
> UCSC Genome Bioinformatics Group
>
>
>
> On 4/26/2012 1:17 AM, Anton Kratz wrote:
>
>> Dear UCSC team,
>>
>> I am trying to make a table that contains RefSeq IDs and various 
>> other information.
>>
>> I cannot find the table kgXref anymore when I select RefSeq in the
>> table browser.
>>
>> I assume that RefSeq has been superseded by RGD, but there is a lot 
>> of guessing involved regarding the way the tables are linked.
>>
>> Could you please let me know about the relevant changes?
>>
>> I'd like to make a table with RefSeq IDs pointing to gene symbol,
>> description and maybe other types of info.
>>
>> thanks,
>> Anton
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>
>
>
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
