Re: blog: semantic dissonance in uniprot

Kei Cheung Thu, 26 Mar 2009 07:08:25 -0700

In addition to Uniprot, in light of Matthias' earlier email, what about http://en.wikipedia.org/wiki/Protein, http://dbpedia.org/page/Protein, and the protein-related ontologies listed in OBO (http://www.obofoundry.org/)?


-Kei


Michel_Dumontier wrote:

Pursuant to my email, and in light of several other comments, if our
goal is to now rectify what Uniprot:Protein _actually_ means in our
domain, and how it can be semantically mapped to other bio-ontologies,
then I might also suggest that instances of Uniprot:Protein are
aggregates of proteins (err... :ProteinAggregate anyone?), possibly
separated by both space and time, having a similar (base sequence +
mutations / ptms) composition, sharing certain characteristics (e.g.
functionality, domains) and observed to participate in biological
processes. Clearly not a type of protein of the single molecule form,
but again, certainly not a Record.

-=Michel=-

 If however, what we've been talking about is that identifiers like
        http://purl.uniprot.org/uniprot/Q16665

are actually database records, and not molecular entities, then we can
settle this quickly:

Uniprot RDF file: http://www.uniprot.org/uniprot/Q16665.rdf
(is this what people were referring to as a Record???)

Contains:

<rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
 <rdf:type rdf:resource="http://purl.uniprot.org/core/Protein" />


It's clear that the entity denoted by :Q16665 is rdf:type :Protein and
is the subject of statements that are biological in nature, such as
being located in sub-cellular compartments or being involved in
biochemical reactions. It is clearly not a Record. This is generally
the case for nearly all entries in biomolecular databases.
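
The check described above can be reproduced mechanically. The following is a minimal sketch in Python, using only the standard library, that lists the rdf:type values declared for each subject in an RDF/XML document. The fragment is inlined and closed by hand for illustration (the real file contains many more statements), and the helper names `qname` and `declared_types` are made up for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hand-closed copy of the fragment quoted above (an assumption for
# illustration; the real Q16665.rdf file has many more statements).
FRAGMENT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://purl.uniprot.org/uniprot/Q16665">
    <rdf:type rdf:resource="http://purl.uniprot.org/core/Protein" />
  </rdf:Description>
</rdf:RDF>
"""


def qname(local_name):
    """Build the {namespace}name form that ElementTree uses for RDF terms."""
    return "{" + RDF_NS + "}" + local_name


def declared_types(rdf_xml):
    """Map each rdf:about subject to the rdf:type URIs declared for it."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for description in root.iter(qname("Description")):
        subject = description.get(qname("about"))
        types = [t.get(qname("resource"))
                 for t in description.findall(qname("type"))]
        result.setdefault(subject, []).extend(types)
    return result


found = declared_types(FRAGMENT)
for subject, types in found.items():
    print(subject, "is declared as", types)
```

This only reads what the data itself asserts: the subject is typed as a Protein, and nothing in the fragment types it as a record.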

Cheers,

-=Michel=-

Anxiously waiting to see if this clears things up or generates
controversy... it's hard to predict!

If nobody ever wants to use the same property to talk about the
database record as was used to talk about the molecule, and nobody ever
makes an assertion that implies that the class of database records is
disjoint from the class of molecules, then I don't see any harm in
using the same URI to ambiguously denote both.  But if one is trying to
design data to be reusable by others in unforeseen ways, there clearly
*is* a risk that someone will want to make such assertions in
conjunction with the data, and if that happens there is a clear harm.
This risk is easy to avoid by using separate URIs.
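
The risk described above can be made concrete with a small sketch in Python. It assumes a toy model in which type assertions are (subject, class) pairs and a disjointness axiom is a pair of classes; the record class URI and the separate record URI under example.org are hypothetical and chosen only for illustration, as is the function name.

```python
PROTEIN = "http://purl.uniprot.org/core/Protein"
RECORD = "http://example.org/DatabaseRecord"          # hypothetical class
Q16665 = "http://purl.uniprot.org/uniprot/Q16665"
Q16665_RECORD = "http://example.org/record/Q16665"    # hypothetical URI


def disjointness_violations(type_assertions, disjoint_pairs):
    """Return subjects typed with both members of some disjoint class pair."""
    classes_by_subject = {}
    for subject, cls in type_assertions:
        classes_by_subject.setdefault(subject, set()).add(cls)
    violations = []
    for subject, classes in classes_by_subject.items():
        if any(a in classes and b in classes for a, b in disjoint_pairs):
            violations.append(subject)
    return sorted(violations)


# Someone downstream asserts that records and proteins are disjoint.
disjoint = [(PROTEIN, RECORD)]

# One URI denoting both the molecule and the record.
one_uri = [(Q16665, PROTEIN), (Q16665, RECORD)]
# Separate URIs for the molecule and the record.
two_uris = [(Q16665, PROTEIN), (Q16665_RECORD, RECORD)]

print(disjointness_violations(one_uri, disjoint))
print(disjointness_violations(two_uris, disjoint))
```

With a single URI the disjointness assertion makes the data inconsistent for that subject; with two URIs the same assertion is harmless.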

There *are* trade-offs.  Minting two URIs instead of one *does* add
some complexity, though as I pointed out that additional complexity can
be mitigated to the point that it is a *very* low cost.  Still,
different people will weigh these trade-offs differently, and what's
best for one situation may not be best for another, as I indicated in
my original post.

Furthermore, even if one does use the same URI to ambiguously denote
both a database record and a molecule, that is not the end of the world
either.  It is possible (though more difficult) to later separate out
and relate the different senses of an ambiguous URI, as I have
described:
http://dbooth.org/2007/splitting/
Ambiguity is inescapable, and ambiguity between a thing and a page that
describes that thing is not fundamentally different from other kinds of
ambiguity (except perhaps that we are aware of it in advance and it can
be easily avoided), as explained here:
http://dbooth.org/2007/splitting/#httpRange-14

Finally, although it is flattering that you have named this suggestion
after me, I cannot take credit.  As I pointed out in my original post,
the suggestion to differentiate between a molecule and the database
record that describes that molecule originates with the Architecture of
the World Wide Web:
http://www.w3.org/TR/webarch/#URI-collision
and best practices for implementing this distinction are described in
Cool URIs for the Semantic Web:
http://www.w3.org/TR/cooluris

David Booth
