Re: Performance issues with OWL Reasoners => subclass vs instance-of

William Bug Fri, 15 Sep 2006 06:04:55 -0700

Hi All,

Just as a clarification for the less informed - myself included - we're discussing the subtle and extremely difficult aspects of creating knowledge maps/annotation repositories/KBs/KR repositories (what have you) ultimately capable of supporting reasoning (simple classification through more complex reasoning) for both UNIVERSALS and INSTANCES.

Some DEFINITIONS:

CLASSes represent UNIVERSALs or TYPEs. The TBox is the set of CLASSes and the ASSERTIONs associated with CLASSes.

INSTANCEs represent EXISTENTIALs or INDIVIDUALs instantiating a CLASS in the real world. The ABox is the set of INSTANCEs and the ASSERTIONs associated with those INSTANCEs.

Properly specified CLASSes are defined in the context of the INSTANCEs whose PROPERTIES and RELATIONs they formally represent.

Properly specified INSTANCEs are defined via their reference to an appropriate set of CLASSes.
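
To make the TBox/ABox split above concrete, here is a toy sketch in plain Python. It is not how any of the reasoners discussed here work internally. The class and individual names are hypothetical, and the only "reasoning" it does is following subclass links. It shows three things: the TBox holding assertions about CLASSes, the ABox holding assertions about INSTANCEs, and a query that needs both.

```python
# Toy illustration only: a TBox as subclass axioms, an ABox as type assertions.
# All class and individual names below are hypothetical examples.

# TBox: each CLASS mapped to its directly asserted superclasses.
TBOX = {
    "GFAP": {"IntermediateFilamentProtein"},
    "IntermediateFilamentProtein": {"Protein"},
    "Protein": set(),
}

# ABox: each INSTANCE mapped to the classes it is asserted to instantiate.
ABOX = {
    "gfap_molecule_1": {"GFAP"},
}


def superclasses(cls, tbox):
    """TBox-only question: which classes transitively subsume cls?"""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in tbox.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def types_of(individual, abox, tbox):
    """TBox + ABox question: asserted types plus everything they imply."""
    inferred = set()
    for cls in abox.get(individual, ()):
        inferred.add(cls)
        inferred |= superclasses(cls, tbox)
    return inferred
```

Calling superclasses("GFAP", TBOX) touches only the TBox, while types_of("gfap_molecule_1", ABOX, TBOX) has to cross from the ABox into the TBox, which is the border discussed in the rest of this message.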

Reasoners (RacerPro, Pellet, FaCT++) generally have optimizations specific to either reasoning on the TBox or reasoning on the ABox, but it's difficult to optimize for reasoning on the TBox, on the ABox AND - most importantly - on TBox + ABox together (across these sets). Experts such as Phil and others can cite no existing examples that do so.

All of us trying to apply ontology-based formalisms to create machine-parsable representations of real world biomedical continuants and occurrents have banged our heads bloody against this UNIVERSAL-EXISTENTIAL border. Even determining which of the many biomedical informatics resources to employ when you seek to reference relevant UNIVERSALs can be a very difficult task. We're in the midst of an extended debate within the BIRN Ontology Task Force on how best to do this for proteins relevant to cross-species representation of neurodegenerative disease, such as Glial Fibrillary Acidic Protein (GFAP).

I strongly encourage the experts to please clarify, embellish, or correct the above definitions as they see fit for the edification of all us disciples. :-)

Cheers,

Bill

On Sep 15, 2006, at 8:30 AM, Phillip Lord wrote:

"KV" == Kashyap, Vipul <[EMAIL PROTECTED]> writes:

KV> Obviously, if mapping into instances gives better performance
KV> for a given set of inferences, that might be the basis of
KV> choosing the instance-of relationship. Towards this end I have
KV> the following questions for Phil:

KV> 1. What are the set of Abox inferences implemented in the GO
KV> example?

In that example, there aren't any. At that stage, the instance store
was not doing ABox reasoning at all, just TBox, made to look like
ABox.

The system is richer now, and you can express some relationships
between individuals in the ABox (as well as any expressivity you like
in the TBox). But I don't have details, I am afraid.

KV> 2. What would be the corresponding set of TBox inferences
KV> implemented if the
KV> design choice proposed by Chris was adopted, i.e., p53 is a
KV> subclass of Gene (assuming a general "Gene" class)

I am presuming by "set of inferences" you mean, what can you express?
The TBox supports OWL-DL in full. Actually, as the InstanceStore punts
much of the work to the reasoner without imposing limits of its own,
this is constrained by the reasoner, not the InstanceStore per se. So it
does whatever your reasoner does.

KV> 3. What are the performance and scalability implications of (1)
KV> vs (2)

ABox reasoning is harder than TBox. As is the way with DL, exactly
what the implications are, depends on exactly what you express and I
am not really an expert.

KV> 4. What are the expressiveness implications of (1) vs (2), i.e.,
KV> can we express
KV> some statements using subclass-of based modeling which are not
KV> possible using instance-of modeling; or vice versa....

KV> Look forward to a good use case illustrating the above and
KV> discussing its possible consequences.

The limitation is that if your entities are in the ABox in this
case, there are a very limited number of things that you can say about
their relationships to other entities in the ABox, although you have
the full expressivity of OWL to relate them to the TBox. The flip side
is that if you put everything into the TBox, then you get nothing from
the relational backend of the InstanceStore. In the GO example, for
instance, you could put all the associations into a reasoner, modelled
as OWL classes, but the reasoner will probably not scale to 6 million
instances.

Separating entities into ABox and TBox depending on how many of them
there are is, of course, unsatisfying from an ontological perspective,
but as you are asking about scalability of computational reasoning I
don't think you have any choice but to be pragmatic.
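
As a rough illustration of that pragmatic split, the sketch below keeps a small class hierarchy in memory and the (potentially very numerous) individuals in a relational table. It is only a sketch under stated assumptions, not the InstanceStore's actual design: the table layout, the function names, and the class and record names are all hypothetical, and the "reasoning" is just a transitive subclass lookup done in Python rather than a call to a DL reasoner.

```python
import sqlite3

# Hypothetical subclass axioms (child -> direct parents), small enough to hold in memory.
SUBCLASS_OF = {
    "TumorSuppressorGene": {"Gene"},
    "Gene": set(),
}


def subclasses_of(target, subclass_of):
    """Return target plus every class that is transitively a subclass of it."""
    result = {target}
    changed = True
    while changed:
        changed = False
        for child, parents in subclass_of.items():
            if child not in result and parents & result:
                result.add(child)
                changed = True
    return result


def build_store(assertions):
    """Load (individual, class) assertions into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assertion (individual TEXT, class TEXT)")
    conn.executemany("INSERT INTO assertion VALUES (?, ?)", assertions)
    return conn


def instances_of(conn, target, subclass_of):
    """Resolve the class side in memory, then let the database find the individuals."""
    classes = sorted(subclasses_of(target, subclass_of))
    placeholders = ",".join("?" for _ in classes)
    rows = conn.execute(
        "SELECT DISTINCT individual FROM assertion "
        f"WHERE class IN ({placeholders}) ORDER BY individual",
        classes,
    )
    return [row[0] for row in rows]
```

With a store built from [("record_1", "TumorSuppressorGene"), ("record_2", "Gene")], asking for instances of "Gene" goes through the class hierarchy first and then the table. The cost of the class-level step does not grow with the number of stored individuals, which is the property the split is meant to buy.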

Phil

Bill Bug

Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging & Anatomical Informatics

www.neuroterrain.org

Department of Neurobiology & Anatomy

Drexel University College of Medicine

2900 Queen Lane

Philadelphia, PA 19129

215 991 8430 (ph)

610 457 0443 (mobile)

215 843 9367 (fax)

Please Note: I now have a new email - [EMAIL PROTECTED]


