Dear UCSC team,

Thank you for your answer; I have a follow-up question.

I followed your instructions to generate a table with these three fields:
#rn4.rgdGene2.name    rn4.rgdGene2Xref.infoType    rn4.rgdGene2Xref.info

How is one supposed to match the comma-separated elements in
rn4.rgdGene2Xref.infoType with the corresponding elements in
rn4.rgdGene2Xref.info?

I believe the idea is that the infoType and info fields are themselves
comma-separated lists whose entries are in the same order from left to
right. But this is not actually the case (see the examples below).

I want to parse this data with my own Perl scripts.

I do not understand how one can (in a Perl or Python script) figure out
from this data which of the entries in rn4.rgdGene2Xref.infoType belongs to
which of the entries in rn4.rgdGene2Xref.info.

For example, the first line:

RGD:1565877    GeneID,ID,Name,gene,    499014,RGD1565877,Zdhhc14,

The field "infoType" has four entries, "info" has three entries.
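
A minimal Python sketch of this count, assuming each field is a
comma-separated list that ends with one trailing comma. The helper name
split_list_field is purely illustrative and not part of any UCSC tool:

```python
def split_list_field(field):
    """Split a comma-separated list field, ignoring one trailing comma."""
    items = field.split(",")
    # A trailing comma leaves one empty string at the end; drop it.
    if items and items[-1] == "":
        items.pop()
    return items


# The two list fields of the RGD:1565877 line quoted above.
info_types = split_list_field("GeneID,ID,Name,gene,")
infos = split_list_field("499014,RGD1565877,Zdhhc14,")

print(len(info_types), info_types)
print(len(infos), infos)
```

The two lengths differ, so pairing the entries by position cannot assign a
value to every infoType entry of this record.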

Or this entry:

RGD:1303141    GeneID,ID,Name,Note,gene,
308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that bind
protein phosphatase 1 and cytoplasmic actin%3B may play a role in
regulation of the actin cytoskeleton,Phactr2,

In this case the comma-separated fields seem to be mixed up:

GeneID 308291
ID     Phactr2
Name   RGD1303141
Note   Phactr2
gene   member of a family of proteins that bind protein [...]
???    Phactr2

info and infoType do not match, and I get the actual Gene Symbol three
times...
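
The pairing above can be reproduced with a short Python sketch. It assumes
the naive rule "pair the two lists by position"; pair_by_position and
split_list_field are illustrative names, not anything provided by UCSC.
zip_longest pads the shorter list with None so the leftover value stays
visible, and unquote decodes the percent-encoded "%3B" to ";":

```python
from itertools import zip_longest
from urllib.parse import unquote


def split_list_field(field):
    """Split a comma-separated list field, ignoring one trailing comma."""
    items = field.split(",")
    if items and items[-1] == "":
        items.pop()
    return items


def pair_by_position(info_type_field, info_field):
    """Naively pair infoType entries with info entries by list position."""
    types = split_list_field(info_type_field)
    # Decode percent-escapes such as %3B (";") in each value.
    values = [unquote(v) for v in split_list_field(info_field)]
    return list(zip_longest(types, values))


# The two list fields of the RGD:1303141 record quoted above.
info_type_field = "GeneID,ID,Name,Note,gene,"
info_field = (
    "308291,Phactr2,RGD1303141,Phactr2,member of a family of proteins that "
    "bind protein phosphatase 1 and cytoplasmic actin%3B may play a role in "
    "regulation of the actin cytoskeleton,Phactr2,"
)

pairs = pair_by_position(info_type_field, info_field)
for key, value in pairs:
    print(key, value, sep="\t")
```

This only demonstrates the problem: the sketch does not recover the correct
mapping, it shows that position-based pairing yields the shifted result
listed above, with one value left without an infoType.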

Could you please explain how to parse a table that has been generated as
described in the previous emails (please see below)?

What I want to generate is a table with RGD gene ID and coordinates,
protein encoded, and description in TAB-separated fields, so that I can
parse it from a script.

best regards,
Anton


On Fri, Apr 27, 2012 at 3:00 AM, Luvina Guruvadoo <[email protected]> wrote:

> Hi Anton,
>
> The rgdGene2Xref table should provide the information you are looking for.
> In the Table Browser, make the following selections:
>
> track: RGD Genes
> table: rgdGene2
> output format: selected fields from primary and related tables
>
> Click 'get output'. Scroll down and select rgdGene2Xref from the Linked
> Tables section and click 'Allow Selection From Checked Tables'. From here,
> you can select fields such as infoType and info from rn4.rgdGene2Xref.
>
> I hope this helps. Please contact us again at [email protected] if you
> have any further questions.
>
> ---
> Luvina Guruvadoo
> UCSC Genome Bioinformatics Group
>
>
>
> On 4/26/2012 1:17 AM, Anton Kratz wrote:
>
>> Dear UCSC team,
>>
>> I am trying to make a table that contains RefSeq IDs and various other
>> information.
>>
>> I cannot find the kgXref table anymore when I select RefSeq in the Table
>> Browser.
>>
>> I assume that RefSeq has been superseded by RGD, but there is a lot of
>> guessing involved regarding the way the tables are linked.
>>
>> Could you please let me know about the relevant changes?
>>
>> I'd like to make a table with RefSeq pointing to gene symbol, description
>> and maybe other type of info.
>>
>> thanks,
>> Anton
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>
>
>
