On Mar 26, 2009, at 11:28 AM, eric neumann wrote:
Pat,
Basically I'm in agreement with all of your points, but need to
correct some mis-interpretations you made of my comments...
Sure, thanks for clarifying.
On Thu, Mar 26, 2009 at 3:13 AM, Pat Hayes <pha...@ihmc.us> wrote:
On Mar 25, 2009, at 5:27 PM, eric neumann wrote:
Several different issues here.
On Wed, Mar 25, 2009 at 5:47 PM, Bijan Parsia <bpar...@cs.manchester.ac.uk> wrote:
Eric,
Thanks for the use case!
On 25 Mar 2009, at 21:31, eric neumann wrote:
<snip>
This is the kind of "similar" used in most internal genomic/
compound systems...
<http://myOrg.com/sw/mxid/PHLP0005> :isIdentifiedwith <http://www.uniprot.org/uniprot/P16233>
Can you explicate this a bit more for me? I.e., could you present
what you expect this to do or not do?
Certainly... I want to look up what myOrg knows about a uniprot
protein, but since they do their own internal data-keeping on
things like "druggability" which aren't included (yet) in uniprot,
I need to make sure my extra data is mapped to the public protein
object.
Does this help you?
It doesn't help me. We need to have a 'semantic' answer. What kinds
of thing are being talked about here? What do the URIs refer to?
(Records or chemicals?) Because the use of sameAs depends on the
answer to this question very crucially.
In my company I have a ProteinDictionary table populated with all
'known human proteins' (this is the conceptual part that is easy for
all biologists, but is causing some confusion in the thread); each
entry is identified (sameAs?) with a protein in Uniprot (as well as
a protein in NCBI-Entrez)
In ProteinDictionary I include a lot of additional data (not found
in Uniprot) on which antibodies exist for that protein (structure).
Therefore, the records "refer" to the same protein, but do not have
identical properties
The _records_ don't have identical properties, sure. But (if I follow
you), the names in the table refer to the proteins, not to the
records. Therefore, no problem. It is fine for you to have more
information about the same thing that someone else is talking about.
Well, maybe a practical problem. After reading Michel's post, it's not
at all obvious that Uniprot and NCBI-Entrez are actually talking about
the same kinds of thing, which is the practical reason why using
sameAs might be problematic. I'd suggest (inventing and) using
something like sameProteinAs, which is reflexive and symmetric and
probably transitive but not substitutive. Think of it as a topic-
specific version of seeAlso.
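
A minimal Python sketch of what such a relation could look like in practice, assuming a made-up `ProteinLinks` helper, a made-up Entrez URI, and placeholder property values (none of these come from Uniprot or from any real vocabulary). Names linked by the hypothetical sameProteinAs fall into one group (reflexive, symmetric, transitive), but statements stay attached to the name they were made with (not substitutive):

```python
# Hypothetical identifiers and values, for illustration only.
MYORG = "http://myOrg.com/sw/mxid/PHLP0005"
UNIPROT = "http://www.uniprot.org/uniprot/P16233"
ENTREZ = "http://example.org/entrez/some-protein"  # made-up URI


class ProteinLinks:
    """Union-find over URIs linked by a hypothetical sameProteinAs."""

    def __init__(self):
        self.parent = {}

    def find(self, uri):
        # Every URI starts out as its own group, which makes the
        # relation reflexive.
        self.parent.setdefault(uri, uri)
        while self.parent[uri] != uri:
            self.parent[uri] = self.parent[self.parent[uri]]  # path halving
            uri = self.parent[uri]
        return uri

    def link(self, a, b):
        # Merging groups makes the relation symmetric and transitive.
        self.parent[self.find(a)] = self.find(b)

    def same_protein(self, a, b):
        return self.find(a) == self.find(b)


links = ProteinLinks()
links.link(MYORG, UNIPROT)
links.link(UNIPROT, ENTREZ)

# Statements are stored per name and are never copied across a link.
facts = {
    MYORG: {"druggability": "placeholder-value"},
    UNIPROT: {"curated_by": "placeholder-value"},
}


def known_about(uri):
    """Statements made with this exact name; nothing is inherited."""
    return facts.get(uri, {})


def see_also(uri):
    """Other names linked to the same protein, for a follow-up lookup."""
    return sorted(
        other
        for other in list(links.parent)
        if other != uri and links.same_protein(other, uri)
    )
```

Here `see_also(MYORG)` returns the other two URIs, while `known_about(UNIPROT)` does not pick up the internal druggability entry. That is the seeAlso-like behaviour: the link tells you where else to look, and nothing more.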
My company has more knowledge about the protein, but it is not
common to everyone; case of Open World assumptions...
Exactly. But that is not a problem (unless you identify proteins with
table entries... nah, bad idea.)
Is that clearer?
(Of course, in a SW world this could have all been done with
internal triples added to the uniprot URI locally...)
It really isn't probabilistic anymore since the scientists have all
agreed and defined their entry based on some of the info from the
public entity; for most situations it is an 'exact mapping' to the
referred molecules.
Is it that most, but not all of the time, you can treat it as
sameAs but sometimes you don't want to?
Well, the question we ask of experts like you is: should we or
should we not use owl:sameAs for exact mappings to entities with
different records?
If your URIs are referring to the entities, then use sameAs when you
are sure you are talking about the same entity, no matter what your
records say about it. If they are referring to the records, then I
would guess that sameAs would be true only when two URIs resolve to
the same resource using GET.
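
To make the entity/record distinction concrete, here is a rough Python sketch, assuming an illustrative `apply_same_as` function, made-up record URIs, and placeholder property names and values. It mimics the substitutivity of sameAs by copying every statement onto every co-referring name:

```python
# Hypothetical identifiers, properties and values, for illustration only.
MYORG = "http://myOrg.com/sw/mxid/PHLP0005"
UNIPROT = "http://www.uniprot.org/uniprot/P16233"
MYORG_RECORD = "http://example.org/myorg/record/1"      # made-up URI
UNIPROT_RECORD = "http://example.org/uniprot/record/1"  # made-up URI


def apply_same_as(triples, same_as_pairs):
    """Return the triples with co-referring names substituted everywhere."""
    groups = {}
    for a, b in same_as_pairs:
        merged = groups.get(a, {a}) | groups.get(b, {b})
        for name in merged:
            groups[name] = merged
    result = set()
    for s, p, o in triples:
        # Literal values have no group, so they pass through unchanged.
        for s2 in groups.get(s, {s}):
            for o2 in groups.get(o, {o}):
                result.add((s2, p, o2))
    return result


# Case 1: the URIs name the protein. Extra local knowledge simply
# becomes true of the same thing under its other name.
entity_triples = {(MYORG, "hasAntibody", "antibody-x")}
merged_entities = apply_same_as(entity_triples, [(MYORG, UNIPROT)])

# Case 2: the URIs name the records. Asserting sameAs now claims the
# two records are one record, which then has two modification dates.
record_triples = {
    (MYORG_RECORD, "lastModified", "date-a"),
    (UNIPROT_RECORD, "lastModified", "date-b"),
}
merged_records = apply_same_as(record_triples, [(MYORG_RECORD, UNIPROT_RECORD)])
record_dates = {
    o for s, p, o in merged_records
    if s == MYORG_RECORD and p == "lastModified"
}
```

In the first case the merged statements are harmless: more is known about one protein. In the second, `record_dates` ends up with two values for what was meant to be a single record, which is the sort of conclusion you get when sameAs is asserted between things that are not actually one thing.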
We all agree we are referring to the protein in question, but the
Uniprot and Entrez URIs may have different (hopefully consistent up
to open-world assumptions) information.
Of course, but that's not an ontological problem. In fact, it is to be
expected.
I agree owl:sameAs was not intended for this kind of relation, but
it is extremely common, and a specialized relation for this would
be very much desired. : )
We need to make me understand the relation :)
There are other "identity" or "similar" relations
Braaagh! Semantic alarm! Identity is NOT similarity. Identity
really does mean being EXACTLY the same thing. If A similarTo B,
then we are talking about two things which are similar. If A sameAs
B, then we are talking about ONE THING which happens to have two
names.
I did not intend to equate "identity" and "similar"
Sorry, I have a hair trigger on that issue. Think of me as an annoying
car alarm.
<snip>
Certainly, but how best should we apply OWL so that this can be well
represented?
Good question. I need to know more than I do about protein chemistry,
but from reading the stuff on this thread, seems to me that being
explicit about proteins being a kind of substance or material would be
a good start, and then asking what kinds of mass-term relationships
one might need (mixture, L-isomeric form of, whatever) between protein-
stuffs.
Dare we promote meta-classing at this point?
It's in OWL 2, and I think many tools allow it already or will very
soon. So yes, dare :-) But be very clear that it really does what you
want. I'm not yet convinced, myself.
I'd rather use OWL to accurately represent "a Molecule Class means
this..., and an instance means that...", whether it's structure
patterns, property groupings, or mind-conceptual objects ("I can
create a specific and novel chemical with this structure and these
properties")
...If this discussion is beginning to settle onto a commonly agreed
set of principles, I'd like to suggest we capture it and circulate
for comment, perhaps through HCLS.
Even if it's only a draft for discussion, sounds like a good idea.
cheers,
-Eric
Pat