Hi Tar,

The examples of kgXref containing suboptimal UniProt identifiers are 
very likely due to a bug that was recently found and fixed in the UCSC 
Genes pipeline.  We fixed it during a new build of the human (hg19) set 
of UCSC Genes, which will be released to the public website soon; 
unfortunately, you won't see the fix in the mm9 UCSC Genes set in the 
near future.  (Eventually the mm9 gene set will be updated using the 
improved pipeline, but I have no estimate at this time of when that 
might happen.)

In the meantime, if it would be useful for you to use the hg19 human 
kgXref table, it is available now on our test server: 
http://genome-test.cse.ucsc.edu.  Be aware that there is a lot of 
experimental and untested data on the test server.  If you want to be 
alerted when the new human UCSC Genes track is officially released, you 
can subscribe to our low-volume announcement mailing list, 
[email protected].

Regarding your question about the spID column of kgXref: some of our 
databases/tables/fields still use "swissProt" or "sp" when referring to 
the more general UniProt proteins, since we originally were using 
Swiss-Prot proteins but then switched to UniProt after Swiss-Prot became 
part of UniProt (see: 
http://web.expasy.org/docs/userman.html#what_is_uniprot).  I see that 
the table description still says that the spID column is the "SWISS-PROT 
protein Accession number." I will talk to our engineers about changing 
the description to say "UniProt protein accession number" instead.
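
In case it helps to see how the spID column fits into the lookup Pauline 
described (quoted below), here is a minimal sketch of the same three steps 
as a single join: UniProt accession to human UCSC ID via kgXref, human to 
mouse UCSC ID via mmBlastTab, then back to a UniProt accession via the mouse 
kgXref. This is only an illustration, not our pipeline or the Table Browser 
itself. It assumes the tables have been loaded into one local SQLite 
database; the table names human_kgXref and mouse_kgXref, the helper 
functions, and every ID in the toy rows are invented for the example.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database holding invented rows (not real UCSC data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Stand-ins for the human and mouse kgXref tables (kgID and spID only).
        CREATE TABLE human_kgXref (kgID TEXT, spID TEXT);
        CREATE TABLE mouse_kgXref (kgID TEXT, spID TEXT);
        -- Stand-in for mmBlastTab: query = human UCSC ID, target = mouse UCSC ID.
        CREATE TABLE mmBlastTab ("query" TEXT, "target" TEXT);

        INSERT INTO human_kgXref VALUES ('ucH001.1', 'HUMAN_ACC_1');
        INSERT INTO human_kgXref VALUES ('ucH002.1', 'HUMAN_ACC_2');
        INSERT INTO mmBlastTab VALUES ('ucH001.1', 'ucM001.1');
        INSERT INTO mmBlastTab VALUES ('ucH002.1', 'ucM002.1');
        INSERT INTO mouse_kgXref VALUES ('ucM001.1', 'MOUSE_ACC_1');
        INSERT INTO mouse_kgXref VALUES ('ucM002.1', 'MOUSE_ACC_2');
        """
    )
    return conn


def map_human_to_mouse(conn, human_uniprot_ids):
    """Return (human spID, human kgID, mouse kgID, mouse spID) tuples."""
    ids = list(human_uniprot_ids)
    if not ids:
        # An empty IN () list is a syntax error in SQLite, so return early.
        return []
    placeholders = ",".join("?" for _ in ids)
    sql = f"""
        SELECT h.spID, h.kgID, b."target", m.spID
        FROM human_kgXref AS h
        JOIN mmBlastTab AS b ON b."query" = h.kgID
        JOIN mouse_kgXref AS m ON m.kgID = b."target"
        WHERE h.spID IN ({placeholders})
        ORDER BY h.spID, b."target"
    """
    return conn.execute(sql, ids).fetchall()


conn = build_toy_db()
for row in map_human_to_mouse(conn, ["HUMAN_ACC_1", "HUMAN_ACC_2"]):
    print(row)
```

Note that the last column is whatever accession the mouse kgXref holds in 
spID, so with the current mm9 table it can still be one of the suboptimal 
identifiers you reported.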

If you have further questions, please contact us again at 
[email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 1/14/12 5:11 PM, Tar Viturawong wrote:
> Hello Pauline,
>
> I discovered that the kgXref table often maps the UCSC identifier to a
> suboptimal Uniprot identifier. Many mapped Uniprot IDs are
> "uncharacterised" whereas an alternative, more correctly annotated Uniprot
> identifier exists for each of these. To give you a couple of examples:
>
> 1) Six1 homeobox (ID Q62231) is mapped to Q3V2C3 which is still an
> "uncharacterized protein" on the kgXref table.
>
> 2) Satb1 (ID Q60611) is mapped to Q91XB1, which once put into Uniprot web
> interface, simply redirects you to Q60611.
>
> 3) Pou2f3 (ID P31362) is mapped to Q3U5D1, which is also still an
> "uncharacterized protein".
>
> I also noticed that the mapped Uniprot IDs (from the kgXref table) are
> displayed at the Uniprot web interface as "UniprotKB/TrEMBL" whereas the
> ones in my dataset are displayed as "UniprotKB/Swiss-Prot", which I found
> confusing as I understood that the spID column in the kgXref table was
> supposed to return Swiss-Prot ID.
>
> Many thanks and kind regards,
>
> Tar
>
> On Sat, Jan 14, 2012 at 1:43 AM, Pauline Fujita <[email protected]> wrote:
>
>> Hello Tar,
>>
>> Please see this previously answered mailing list question:
>>
>> https://lists.soe.ucsc.edu/pipermail/genome/2011-November/027780.html
>>
>> Basically you will want to use our blastTab tables (in this case the
>> mmBlastTab table), which map UCSC Genes identifiers from one assembly to
>> another. You will also need to use a linked table (kgXref) to convert your
>> Uniprot identifiers to UCSC Genes identifiers.
>>
>> Note that Uniprot identifiers are equivalent to "SWISS-PROT protein
>> Accession numbers" in the kgXref table, so begin by choosing that table (as
>> outlined in the link above), then you will be able to paste in your Uniprot
>> identifiers as a list.
>>
>> Then choose "selected fields from primary and related tables", click "get
>> output", and in the subsequent menu be sure to check the "mmBlastTab" table
>> and "allow selection from checked tables". This will allow you to include
>> fields from the mmBlastTab table in your output; you will want to
>> include the query (human id) and target (mouse id) fields.
>>
>> To convert the resulting list of mouse UCSC ids back to Uniprot, you would
>> need to use the kgXref table a second time, pasting in the list of UCSC ids
>> and then selecting the "SWISS-PROT protein Accession numbers" field for your
>> output.
>>
>> Best regards,
>>
>> Pauline Fujita
>> UCSC Genome Bioinformatics Group
>> http://genome.ucsc.edu
>>
>>
>>
>>
>> On 1/11/12 8:56 AM, Tar Viturawong wrote:
>>
>> Hello,
>>
>> I'm trying to compare interaction data between two organisms: mouse and
>> human and working with Uniprot identifiers, and I'm wondering whether there
>> is a way to map a set of Uniprot identifiers as homologs of another set?
>> For example, is there perhaps a way to download some sort of table that
>> contains information linking pairs of gene identifiers, one from mouse and
>> one from human, as homologs? I have a feeling it might be a far-fetched
>> request but I thought I'd give it a try. If the Genome Browser isn't the
>> right place for this, could anyone suggest where I could continue looking?
>>
>> Many thanks!
>> Tar
>>
>>
>>
>
>
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome