Eric Jain wrote:
> M. Scott Marshall wrote:
>> It should be possible for people to make statements specifically about
>> the DNA, mRNA, amino acid sequence (in organism human, mouse, ...),
>> NMR, MS (mass spec), etc. that is associated with a protein in addition
>> to saying something general about the protein itself e.g. "P53 plays
>> an important role in apoptosis". Although I understand that there are
>> ways to refer to such info in UniProt (kudos for that!), wouldn't it
>> be better to use URIs that point explicitly into an ontology than to
>> use/create a URI system that we will eventually want to (re)map to
>> such an ontology anyway?
> There are databases for nucleotide sequences (EMBL/GenBank) and for mass
> spec data, if you really wanted to associate any statements with such
> data. (Note that EMBL/GenBank sequences are referenced from UniProt.)
I think that we are talking about two different things. I'm not
objecting to using UniProt or EMBL for any particular type of protein
data record. I'm suggesting that URIs to OWL classes would be both
unambiguous and well-defined identifiers for biothings like proteins. Of
course, such URIs could eventually be used to access data records at
UniProt.
I think that we should separate class (and 'records') from instance. So,
if I want to refer to the P53 molecule, there's a specific URI to an OWL
class that I can use as my identifier. As another example, if I want to
talk about a type of motif in the P53 sequence or an MS/MS signature, I
have OWL URIs for those (presumably inherited from the Protein class or
the Molecule class).
I like this form of ontology versioning (date in the URL):
http://www.co-ode.org/ontologies/amino-acid/2006/05/18/amino-acid.owl
Notice that you can easily adjust for a different version by changing
the declared namespace.
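
As a rough sketch of what that adjustment could look like (Python, standard
library only): the helper names, the class name "Lysine", and the second
date are made up for illustration. Only the 2006/05/18 URL above is real,
and I'm assuming class identifiers take the usual namespace + "#" + name
form.

```python
import re

# Ontology URL quoted above; the /YYYY/MM/DD/ path segment acts as the version.
ONTOLOGY_URL = "http://www.co-ode.org/ontologies/amino-acid/2006/05/18/amino-acid.owl"

DATE_SEGMENT = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")


def class_uri(namespace: str, local_name: str) -> str:
    """Build a class identifier as namespace + '#' + local name (assumed form)."""
    return f"{namespace}#{local_name}"


def with_version(namespace: str, year: int, month: int, day: int) -> str:
    """Return the same namespace with its date segment replaced."""
    new_segment = f"/{year:04d}/{month:02d}/{day:02d}/"
    result, count = DATE_SEGMENT.subn(new_segment, namespace, count=1)
    if count == 0:
        raise ValueError("namespace has no /YYYY/MM/DD/ segment")
    return result


# "Lysine" is a hypothetical class name; 2007-01-15 is a made-up later version.
old = class_uri(ONTOLOGY_URL, "Lysine")
new = class_uri(with_version(ONTOLOGY_URL, 2007, 1, 15), "Lysine")
print(old)
print(new)
```

The local class name stays the same and only the namespace changes, so
statements written against one version can be pointed at another by
editing a single declaration.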
The problem remains: if we are to use URIs to OWL as identifiers in a
controlled vocabulary, where will the ontology come from?
-scott
--
M. Scott Marshall
http://staff.science.uva.nl/~marshall
http://adaptivedisclosure.org