On Mar 25, 2009, at 5:27 PM, eric neumann wrote:
Several different issues here.
On Wed, Mar 25, 2009 at 5:47 PM, Bijan Parsia <bpar...@cs.manchester.ac.uk> wrote:
Eric,
Thanks for the use case!
On 25 Mar 2009, at 21:31, eric neumann wrote:
<snip>
This is the kind of "similar" used in most internal genomic/compound
systems...
<http://myOrg.com/sw/mxid/PHLP0005> :isIdentifiedwith <http://www.uniprot.org/uniprot/P16233>
Can you explicate this a bit more for me? I.e., could you present
what you expect this to do or not do?
Certainly... I want to look up what myOrg knows about a uniprot
protein, but since they do their own internal data-keeping on things
like "druggability" which aren't included (yet) in uniprot, I need
to make sure my extra data is mapped to the public protein object.
Does this help you?
It doesn't help me. We need to have a 'semantic' answer. What kinds of
thing are being talked about here? What do the URIs refer to? (Records
or chemicals?) Because the use of sameAs depends on the answer to this
question very crucially.
(Of course, in a SW world this could have all been done with
internal triples added to the uniprot URI locally...)
It really isn't probabilistic anymore since the scientists have all
agreed and defined their entry based on some of the info from the
public entity; for most situations it is an 'exact mapping' to the
referred molecules.
Is it that most, but not all of the time, you can treat it as sameAs
but sometimes you don't want to?
Well, the question we ask of experts like you is: should we or
should we not use owl:sameAs for exact mappings to entities with
different records?
If your URIs are referring to the entities, then use sameAs when you
are sure you are talking about the same entity, no matter what your
records say about it. If they are referring to the records, then I
would guess that sameAs would be true only when two URIs resolve to
the same resource using GET.
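
A minimal sketch of why the entity/record distinction matters for sameAs, in plain Python rather than any real RDF library. The "ent:" and "rec:" prefixes, the property names, and all values are hypothetical, chosen only for illustration; the merge is a simplification that ignores chains of sameAs statements.

```python
# Hypothetical naming: "ent:" URIs denote the protein itself,
# "rec:" URIs denote a database record about that protein.
triples = {
    ("ent:myOrg/PHLP0005", "druggability", "example-score"),
    ("ent:uniprot/P16233", "exampleAnnotation", "example-value"),
    ("rec:myOrg/PHLP0005", "lastModified", "2009-03-01"),
    ("rec:uniprot/P16233", "lastModified", "2009-02-10"),
}


def merge_same_as(triples, same_as_pairs):
    """Rewrite subjects so each sameAs pair shares one canonical name.

    Simplification: each pair (a, b) maps a to b directly; chains are
    not followed.
    """
    canonical = {a: b for a, b in same_as_pairs}
    return {(canonical.get(s, s), p, o) for s, p, o in triples}


# Case 1: sameAs between the entity URIs. The two names denote one
# protein, so the internal and public facts combine cleanly.
merged_entities = merge_same_as(
    triples, [("ent:myOrg/PHLP0005", "ent:uniprot/P16233")]
)
protein_facts = {
    (p, o) for s, p, o in merged_entities if s == "ent:uniprot/P16233"
}

# Case 2: sameAs between the record URIs. The records are different
# documents, so one "record" ends up with two modification dates.
merged_records = merge_same_as(
    triples, [("rec:myOrg/PHLP0005", "rec:uniprot/P16233")]
)
record_dates = {
    o
    for s, p, o in merged_records
    if s == "rec:uniprot/P16233" and p == "lastModified"
}
```

In the first case the protein simply accumulates both sets of facts. In the second, a single record carries two conflicting dates, which is the kind of consequence that asserting sameAs between distinct records would license.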
I agree owl:sameAs was not intended for this kind of relation, but
it is extremely common, and a specialized relation for this would be
very much desired. : )
We need to make me understand the relation :)
There are other "identity" or "similar" relations
Braaagh! Semantic alarm! Identity is NOT similarity. Identity really
does mean being EXACTLY the same thing. If A similarTo B, then we are
talking about two things which are similar. If A sameAs B, then we are
talking about ONE THING which happens to have two names.
in mol biology:
- homolog (symmetric) ; similar function in different species
- paralog (symmetric, sub-property of homolog ) ; similar origin
duplication in same species
- ortholog (symmetric; sub-property of homolog) ; similar function
in different species
None of these are identity.
(also Ohnology and Xenology, see
http://en.wikipedia.org/wiki/Homology_(biology))
- variant of (a non-subsumptive form of specialization within genes)
- modified form of (a non-subsumptive form of specialization for
protein gene products), includes splice variants (see http://www.affymetrix.com/community/publications/affymetrix/tmsplice/index.affx)
- similar chem structures (symmetric for compounds)
None of these are identity.
One way to use identity here is to try to map the original things to a
'sort' or 'similarity class' or 'similarity type' or <choose your own
buzzword>, and then use identity reasoning on these 'types'. So [ A
similar-to B] is glossed as [(similarity-type A) sameAs (similarity-
type B)] but this only takes you so far: you still get transitivity,
for example, so notions like 'very close' don't work this way. Still,
it might be one way to approach the issue.
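
The gloss above can be sketched with a small union-find structure, where asserting similarity merges the two items' similarity types and "similar" then means "has the same type". This is only an illustration under that reading; the class name and the items A, B, C, D are hypothetical.

```python
class SimilarityTypes:
    """Treats 'A similar-to B' as identity of the items' similarity types."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        """Return the representative of the item's similarity type."""
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving keeps later lookups short.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def assert_similar(self, a, b):
        """Record 'a similar-to b' by merging the two similarity types."""
        self.parent[self.find(a)] = self.find(b)

    def similar(self, a, b):
        """Two items are similar exactly when their types are identical."""
        return self.find(a) == self.find(b)


types = SimilarityTypes()
types.assert_similar("A", "B")
types.assert_similar("B", "C")

# Nobody asserted that A is similar to C, but identity of types is
# transitive, so the conclusion follows anyway.
chained = types.similar("A", "C")
unrelated = types.similar("A", "D")
```

Here `chained` comes out true although A and C were never compared directly, which is the transitivity the paragraph warns about: a relation like 'very close', where A is close to B and B to C without A being close to C, cannot be captured this way.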
... I'm sure there are dozens more.
Remember also, even though these URIs may be of instances in terms
of records,
instances of what?
For a "collective grouping" of similar instances of (physical)
molecules... d-glucose is 'a' specific molecular structure, but
there are over 10^25 of glucose molecules in a teaspoon of dextrose
sweetener.... Not the usual OWL concept of "instance of class
Molecule" is it?
This is just a basic ontology issue. You need to distinguish a
particular molecule from a molecular 'pattern' from a class of
isomers, etc. But you can't expect OWL to do all this kind of work
for you automatically.
Defining 'glucose' as a Class just pushes the definition of Molecule
up to become more akin to a meta-Class...
Right, exactly. Classes weren't meant to carry this kind of conceptual
load. You will just have to do some real ontologizing, my friend :-)
the molecule referenced is not really "a specific single molecule"
found in nature (conceptually possible, but never thought of this
way in my experience). In fact, this is almost always the case in
molecular biology (genes, genomes, SNPs, proteins, etc), while when
dealing with macro-humans, we can refer to an exact instance in the
real world.
We cannot?
No one in pharma is interested in mapping URIs to an individual
exact, physical molecule; IP is always around the chemical structure
(which IS unique) rather than the molecule.
Good: you have a clear ontology and a clear identity criterion for
sameAs. You are talking about chemical structures. I'd suggest, if you
really want to talk about molecules, having properties
has_chemical_structure (domain: molecule; range: chemstruct) and
is_a_molecule_of as its inverse. Don't use the class structure for
Avogadro.
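
One way to picture this suggestion is a rough Python sketch in which molecules and chemical structures are separate kinds of individuals linked by a property, instead of 'glucose' being a class with 10^25 instances. The type names, the labels, and the helper `molecules_with_structure` (standing in for the inverse direction) are assumptions made for illustration, not part of any OWL vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChemStruct:
    """A chemical structure: the thing that is unique and that IP attaches to."""
    name: str


@dataclass(frozen=True)
class Molecule:
    """A particular physical molecule (domain of has_chemical_structure)."""
    label: str
    has_chemical_structure: ChemStruct  # range: ChemStruct


def molecules_with_structure(molecules, structure):
    """Inverse lookup: the molecules that have the given structure."""
    return [m for m in molecules if m.has_chemical_structure == structure]


d_glucose = ChemStruct("d-glucose")
other = ChemStruct("hypothetical-other-structure")

sample = [
    Molecule("molecule-1", d_glucose),
    Molecule("molecule-2", d_glucose),
    Molecule("molecule-3", other),
]

glucose_molecules = molecules_with_structure(sample, d_glucose)
```

The two glucose molecules remain distinct individuals while sharing one structure, so an identity statement such as sameAs would apply to the `ChemStruct` individuals and not to the molecules.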
Pat Hayes
Perhaps we really need a set of basic relations (and meta classing?)
for this scale of scientific phenomena to keep it distinct from
organism examples in clinical studies and experiments...
I suspect there's more weight on "exemplar" than I know how to give
at the moment :)
Well, try keeping a URI tracking a single molecule-- there's no
business value in that! ; )
Eric
Cheers,
Bijan.