Also, what would be great is to get a concrete real world example
which illustrates the above. The example given by Kavitha, I believe,
has a SQL translation. Getting such examples is crucial to showing
the value of the web.
Vipul, just to clarify, which example are you referring to when you
say you can do it in SQL? Also, about precomputing the closure of
SNOMED: if you mean that storing the results of the classification
hierarchy of SNOMED can eliminate the reasoning step in query
answering, we know for sure that this won't work for most of the
queries we looked at. The reason is that most queries require (at
the very least) that sub-parts of the query actually be classified on
the fly. Let me give you a concrete example from our work. Let's take
the query that is looking for patients on medications with an active
ingredient of steroids. In the actual instance data you have:
Patient X onMedication Vendor-specific DrugX
The first subpart of the query is patients onMedication, and that can
be answered by looking at the actual instance data. The second
subpart of the query is drugs that have an active ingredient of
steroids. This part does not map directly to any existing concept in
SNOMED, so a precomputed classification will not help.
As an example, here are all the TBox assertions that are needed for
us to find drugs with active ingredients of steroids (I am using just
one drug case to illustrate):
1. We know from our mappings that Vendor-specific DrugX is a
subclass of the SNOMED concept Hydrocortisone preparation.
2. We also know from SNOMED that a Hydrocortisone preparation is a
drug which has an active ingredient of Hydrocortisone (the substance)
AND has a dose form of oral dosage. In OWL this would mean
Hydrocortisone preparation is defined as equivalent to an
intersection of 2 existentials: exists.ActiveIngredient
(Hydrocortisone-substance) and exists.dosageForm(OralDosageForm).
3. Hydrocortisone (substance) has superclasses Oxycorticosteroid
(substance) and Hydroxycorticosteroid (substance), each of which
ultimately ends up with a superclass of steroids.
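
The three assertions above can be chained mechanically. Here is a
minimal Python sketch of that on-the-fly classification, assuming a
toy TBox holding only the axioms listed. The identifiers
(VendorSpecificDrugX, Steroid, and so on) are shortened stand-ins for
the SNOMED names, PatientY and VendorSpecificDrugZ are made up for
contrast, and this is an illustration, not a real EL++ reasoner:

```python
# Toy TBox: named subclass axioms (assertions 1 and 3 above).
# All identifiers are illustrative abbreviations, not SNOMED codes.
SUBCLASS = {
    "VendorSpecificDrugX": {"HydrocortisonePreparation"},
    "HydrocortisoneSubstance": {"Oxycorticosteroid", "Hydroxycorticosteroid"},
    "Oxycorticosteroid": {"Steroid"},
    "Hydroxycorticosteroid": {"Steroid"},
}

# Definitions as conjunctions of existentials (assertion 2 above):
# class -> set of (role, filler) pairs.
DEFINITION = {
    "HydrocortisonePreparation": {
        ("activeIngredient", "HydrocortisoneSubstance"),
        ("dosageForm", "OralDosageForm"),
    },
}

# Instance data; PatientY / VendorSpecificDrugZ are hypothetical.
ABOX = [
    ("PatientX", "onMedication", "VendorSpecificDrugX"),
    ("PatientY", "onMedication", "VendorSpecificDrugZ"),
]


def ancestors(cls):
    """Return cls together with all of its named superclasses."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def satisfies_existential(cls, role, filler):
    """True if cls is subsumed by (exists role.filler) in the toy TBox."""
    for named in ancestors(cls):
        for r, f in DEFINITION.get(named, ()):
            if r == role and filler in ancestors(f):
                return True
    return False


def patients_on_steroid_drugs():
    """Patients whose medication has an active ingredient that is a Steroid."""
    return sorted(
        subject
        for subject, prop, obj in ABOX
        if prop == "onMedication"
        and satisfies_existential(obj, "activeIngredient", "Steroid")
    )


print(patients_on_steroid_drugs())
```

Note that "exists activeIngredient.Steroid" never appears as a named
class in the toy TBox; the match is derived at query time from the
definition of the preparation plus the substance hierarchy.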
On Sep 13, 2007, at 8:40 AM, Kashyap, Vipul wrote:
The data complexity of EL++ suggests strongly that a sensible
reduction to SQL is unlikely (i.e., you'll need datalogesque rules as
well).
[VK] The interesting question in my mind then is what additional
functionality is achieved by these datalogesque rules that is not
present in SQL? The reason I ask is because today the major RDBMS
vendors support transitive closure operations and I was wondering if
there is any other functionality that is missing in SQL.
Even logspace data complex logics can be tricky. The DL-Lite family
is the paramount example and they can have an exponential blowup in
the size of the query (since they need to intern parts of the tbox in
the query, so each conjunct might expand, and then the permutations
of the expansions must be added to the union of queries...er...as I
recall :))
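
The blowup described above can be illustrated with a small Python
sketch. It assumes a heavily simplified rewriting in which each class
atom of a conjunctive query is replaced by any class the TBox says
implies it; the SUBSUMEES table and the class names are hypothetical,
and actual DL-Lite rewriting also has to handle roles and
existentials:

```python
from itertools import product

# Hypothetical TBox summary: for each class, the classes that imply it
# (the class itself included).
SUBSUMEES = {
    "Patient": ["Patient", "Inpatient", "Outpatient"],
    "Drug": ["Drug", "Preparation", "Tablet"],
    "Steroid": ["Steroid", "Oxycorticosteroid", "Hydroxycorticosteroid"],
}


def rewrite(conjuncts):
    """Expand a conjunctive query into a union of conjunctive queries.

    Each conjunct expands independently, and every combination of
    expansions becomes one member of the union.
    """
    alternatives = [SUBSUMEES.get(c, [c]) for c in conjuncts]
    return [tuple(choice) for choice in product(*alternatives)]


union = rewrite(["Patient", "Drug", "Steroid"])
print(len(union))  # 3 * 3 * 3 = 27 conjunctive queries
```

With k alternatives per conjunct and n conjuncts, the union has k**n
members, which is where the exponential growth in query size comes
from.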
[VK] From a pragmatic point of view, in the context of a given
application, this just needs to be done once. There are well-defined
RDBMS approaches to create views, materialize them, develop indexing
structures to achieve scalability.
For instance, I know that a common approach to using SNOMED is to
precompute the "closure" and store it in an RDBMS.
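
A minimal sketch of that precomputation, using Python's built-in
sqlite3 module and a recursive common table expression. The table
names (subclass_of, closure) and the handful of edges are assumptions
for illustration, not an actual SNOMED schema:

```python
import sqlite3

# Illustrative subclass edges (child, parent); names are abbreviations.
EDGES = [
    ("VendorSpecificDrugX", "HydrocortisonePreparation"),
    ("HydrocortisoneSubstance", "Oxycorticosteroid"),
    ("HydrocortisoneSubstance", "Hydroxycorticosteroid"),
    ("Oxycorticosteroid", "Steroid"),
    ("Hydroxycorticosteroid", "Steroid"),
]

# Transitive closure of subclass_of; UNION removes duplicate pairs.
CLOSURE_QUERY = """
WITH RECURSIVE reach(child, ancestor) AS (
    SELECT child, parent FROM subclass_of
    UNION
    SELECT reach.child, subclass_of.parent
    FROM reach JOIN subclass_of ON reach.ancestor = subclass_of.child
)
SELECT child, ancestor FROM reach
"""


def build_closure(edges):
    """Load the edges, compute the closure once, and materialize it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subclass_of (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO subclass_of VALUES (?, ?)", edges)
    rows = conn.execute(CLOSURE_QUERY).fetchall()
    conn.execute("CREATE TABLE closure (child TEXT, ancestor TEXT)")
    conn.executemany("INSERT INTO closure VALUES (?, ?)", rows)
    return conn


def ancestors_of(conn, cls):
    """Look up all precomputed named ancestors of cls."""
    rows = conn.execute(
        "SELECT ancestor FROM closure WHERE child = ?", (cls,)
    )
    return sorted(row[0] for row in rows)


conn = build_closure(EDGES)
print(ancestors_of(conn, "HydrocortisoneSubstance"))
```

A lookup in the closure table answers subsumption between named
classes only; it records nothing about query subparts that do not
correspond to a named concept.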
So, the real world has figured out ways of dealing with these
situations and I am yet to see examples of how using semantic web
technologies will give them the scalability and make their life
easier.
So, basically, large queries with large, connected TBoxes will be
challenging, requiring clever optimization of the rewriting. This
isn't something you'll do by hand ;)
[VK] Can I have some real world examples which illustrate this?
---Vipul