[VK] It would be great if you could share specific examples of
criteria that were not expressible in SQL. We can then incorporate
those into the use case and help make a case for SW technologies. On
the other hand, taking a quick look at the SHER project at IBM, it
looks like you are using a polynomial time reasoner (CEL) for the
matching. I may be mistaken, but my initial sense is that any CEL
expression can likely be expressed in SQL/Relational Algebra, or
vice versa.
Just a quick correction -- the SHER reasoner is different from the
CEL reasoner, because it is built on
the standard tableau algorithm (internally SHER uses Pellet). It
supports the SHIN subset of DL
(in OWL DL terms, no nominals).
So, for instance, SHER handles cardinality constraints, which can
change the nature of the graph that is stored in the relational DB.
E.g., a R b (a has an R relationship to b)
and a R c, with a maximum cardinality of 1 on R.
Let's say b has a P edge to d. The reasoner will merge b and c into
the same node in the graph.
Let's say you now want to know if c has a P edge to something. A
simple SQL query will not be able to find this edge, because it is a
consequence of the merger that happened in the process of reasoning.
That's just a general example.
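
To make the merge example concrete, here is a minimal Python sketch.
It is not how SHER or Pellet implement it; the helper names are made
up for illustration, it does a single pass rather than reasoning to a
fixpoint, and it assumes b and c are not declared distinct.

```python
# Hypothetical toy graph; node and edge names follow the example above.
edges = {("a", "R", "b"), ("a", "R", "c"), ("b", "P", "d")}
max_cardinality = {"R": 1}  # assumed: at most one R-successor per node


def find(parent, node):
    """Return the representative of the merged group containing node."""
    while parent.get(node, node) != node:
        node = parent[node]
    return node


def merge_by_cardinality(edges, max_cardinality):
    """Merge the R-successors of a node when R has maximum cardinality 1."""
    parent = {}
    successors = {}
    for subj, prop, obj in sorted(edges):
        if max_cardinality.get(prop) != 1:
            continue
        key = (find(parent, subj), prop)
        obj_root = find(parent, obj)
        if key in successors and find(parent, successors[key]) != obj_root:
            # A second distinct successor: identify it with the first one.
            parent[obj_root] = find(parent, successors[key])
        else:
            successors.setdefault(key, obj_root)
    return parent


def p_targets_asserted(edges, node):
    """What a plain lookup over the stored rows sees."""
    return {o for s, p, o in edges if s == node and p == "P"}


def p_targets_inferred(edges, parent, node):
    """The same lookup, but comparing merged representatives."""
    root = find(parent, node)
    return {find(parent, o) for s, p, o in edges
            if p == "P" and find(parent, s) == root}


parent = merge_by_cardinality(edges, max_cardinality)
print(p_targets_asserted(edges, "c"))          # set()
print(p_targets_inferred(edges, parent, "c"))  # {'d'}
```

The lookup over asserted rows returns nothing for c, while the lookup
that takes the merge into account finds the P edge to d.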
In the clinical trials data, we model negations in the lab data
(e.g., lab results ruled out the presence of an organism, A) as
saying that for this particular lab event, any causative agent it
might have cannot be A. In DL terms, this is a universal restriction
that propagates a concept (not A) along the causativeAgent edge. If
you now want to find a lab event which indicated the presence of some
agent (X) and not A, you will again miss things using SQL, because
all you will have in the actual database is that a lab event has a
causative agent X, and that the lab event is a member of the
universal restriction forAll.causativeAgent.not(A). One might argue
that you can do syntactic checks on it etc., but it gets hairy quite
fast when you consider that the negation may be on a concept that is
itself a complex concept (e.g., a radiological report ruled out the
presence of a colon neoplasm).
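
A rough Python sketch of that propagation, for the atomic case only.
All identifiers (lab1, x, the dictionaries, the helper names) are
hypothetical. Negation of a complex concept would need a real
reasoner and is not covered here.

```python
# Hypothetical stored facts; identifiers are illustrative only.
causative_agent = [("lab1", "x")]   # lab1 has causative agent x
universal_not = {"lab1": {"A"}}     # lab1 is in forAll.causativeAgent.not(A)
asserted_not = set()                # no (agent, concept) negative type is stored


def propagate(causative_agent, universal_not):
    """Push each excluded concept from a lab event to its causative agents."""
    inferred = set()
    for event, agent in causative_agent:
        for concept in universal_not.get(event, ()):
            inferred.add((agent, concept))
    return inferred


def events_with_agent_not(concept, negative_types):
    """Lab events with some causative agent known not to be `concept`."""
    return {event for event, agent in causative_agent
            if (agent, concept) in negative_types}


inferred_not = asserted_not | propagate(causative_agent, universal_not)
print(events_with_agent_not("A", asserted_not))  # set()
print(events_with_agent_not("A", inferred_not))  # {'lab1'}
```

Querying only the asserted facts misses lab1, because the fact that x
is not an A exists only after the restriction has been propagated.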
Hope this helps?
Kavitha
On Sep 12, 2007, at 11:30 AM, Kashyap, Vipul wrote:
However, if someone is not explicitly asserted to be on
some prescription drug, it is fair to assume that they are not taking
the drug (closed world assumption).
[VK] The key issue is how well this assumption is likely to work in
practice.
Guess we need some experimentation to get at this.
2. I tend to think this comes from an understanding of the domain
(unfortunately), and what you are modeling rather than the data
characteristics per se.
[VK] I agree that whether you need to use OWA/CWA comes from an
understanding of the domain. However, sometimes it could also be an
artifact of the data representation scheme. For instance, in
Chintan's example above, one could have negative assertions for
drugs, i.e., patient not on drug X, in which case one would use OWA
instead of CWA.
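
As a small illustration of the difference, here is a Python sketch
with made-up patients and a made-up drug name. The three-valued
encoding (True, False, None for unknown) is an assumption of this
sketch, not something taken from the actual data.

```python
# Hypothetical assertions: True = on drug, False = explicitly not on drug.
assertions = {("p1", "drugX"): True, ("p2", "drugX"): False}


def on_drug_cwa(patient, drug):
    """Closed world: anything not asserted is treated as False."""
    return assertions.get((patient, drug), False)


def on_drug_owa(patient, drug):
    """Open world: a missing assertion stays unknown (None)."""
    return assertions.get((patient, drug))


print(on_drug_cwa("p3", "drugX"))  # False
print(on_drug_owa("p3", "drugX"))  # None
print(on_drug_owa("p2", "drugX"))  # False
```

With explicit negative assertions in the data, the open-world reading
still answers "not on drug" for p2, and only p3 stays unknown.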
In terms of whether you can do this using SQL querying
alone, based on our experience, it's unlikely. The problem is that
the types of clinical exclusion and inclusion criteria we saw on
clinicalTrials.gov cannot be easily reduced to SQL querying (at least
with the structured medical records we got from Columbia). From
discussions with other institutions, we know this isn't unique to
Columbia (i.e., there is a substantial "semantic gap" between what's
in the structured record and what is being queried by investigators
for clinical trials).
this information.
---Vipul