On Mar 14, 2007, at 4:44 PM, Kashyap, Vipul wrote:
Alan,
You have proposed some modeling suggestions and of course alignment
with the OBO relations ontology.
Other than expressing the semantics of these classes precisely, it
will be great if you and someone in this group could identify the
potential impact of these modeling choices on:
- Enabling different types of integration that were not feasible before
I think at the moment I am more concerned about data integration than
novel inferences, although I do expect a number of inference
demonstrations. I view the comments I'm providing as a way to deal
with some integration problems before they arise, but I think it will
be better shown once we start looking at specific queries.
The semantics, however, are somewhat more important, particularly
such things as clearly defining classes, distinguishing part of, is
a, and derives from, etc. Whenever they are mixed up we will get
some wrong answers when we ask questions using these relations.
Put another way, the goal might be stated as wanting to get both
*all* available answers to our questions, and *only* correct answers
to our questions, and both the above contribute to achieving that goal.
Regarding this sort of integration not being feasible before, I'd
stay away from that argument. I do hope to show that, as a matter of
fact, this sort of integration is rarely done, that it is possible to
do better with an acceptable level of effort, and that both the
semantic web tools and ethos help make it easier and more fruitful.
A small example of this was illustrated yesterday in the discussion
about dart grid. We were looking at mapping a column that recorded
gender as a text field with either the character "M" or "F". Now
typically, this is a distinction we wish to make in our ontologies,
and we would generally have a class (ideally the same class across
ontologies) to capture this distinction. In a standard object-
relational model, one could instead make M and F "objects" by having a
second table, and a foreign key to that table to record the gender.
But no one does that because it seems like "overkill" - the queries are
more painful, the computational overhead is higher, etc. But in RDF or
OWL this kind of thing is (or should be) common practice, we incur no
penalty, and having it in this form makes it more straightforward to
integrate across independently constructed ontologies - owl:sameAs,
rdfs:subClassOf, and owl:equivalentClass all provide standard ways of making the
connection. Compare this to the effort to merge two relational
schemas, where gender columns are used in various tables, named
differently, and where one database uses "M" and "F" and the other
uses "Male" and "Female".
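To make the gender example concrete, here is a minimal sketch in plain Python (no RDF library) of turning the text codes from two sources into typed resources. The namespace, the class names ex:Male and ex:Female, and the table and column names are all hypothetical, chosen only for illustration; they are not taken from dart grid or any actual ontology.

```python
# Hypothetical namespace and class names, for illustration only.
EX = "http://example.org/onto#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each source spells gender differently; both map onto the same two classes.
GENDER_CLASS = {
    "M": EX + "Male",
    "F": EX + "Female",
    "Male": EX + "Male",
    "Female": EX + "Female",
}

def gender_triples(rows, id_key, gender_key, base):
    """Return (subject, predicate, object) triples typing each record."""
    triples = []
    for row in rows:
        subject = base + str(row[id_key])
        triples.append((subject, RDF_TYPE, GENDER_CLASS[row[gender_key]]))
    return triples

# Two independently built sources with different column names and codes.
db_a = [{"pid": 1, "sex": "M"}, {"pid": 2, "sex": "F"}]
db_b = [{"patient": 7, "gender": "Female"}]

merged = (gender_triples(db_a, "pid", "sex", "http://example.org/a/")
          + gender_triples(db_b, "patient", "gender", "http://example.org/b/"))

# One query now covers both sources, whatever code each originally used.
females = [s for (s, p, o) in merged if p == RDF_TYPE and o == EX + "Female"]
```

Once both sources point at the same class, the question "which records are female" no longer depends on how each database happened to spell it.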
- Enabling different types of inferences which would enable further
integration not possible before.
I don't think I have said, or want to say, that integration before
was not possible. However, I note that in fact it has not been
done in a usable way for many of the resources we realistically would
want to use to ask questions about our scientific use case. There are
a number of reasons for this, some of which our use of semantic web
technologies speaks to. For example, that there is a shared standard
and working tools based on it means that efforts to integrate can be
built on by others, which offers more bang for your buck, so to
speak, an important consideration when deciding to devote the not
insubstantial effort necessary to put resources in a form that makes
it possible to effectively integrate them. Technically, the fact that
there is less pain involved with schema extension and evolution when
using OWL/RDF than when using traditional RDBMS table-oriented schemas
reduces the effort to integrate a large number of sources.
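As a rough illustration of the schema-extension point, the sketch below reduces a triple store to a Python set of (subject, predicate, object) tuples. The identifiers and predicate names are hypothetical. It only shows that a second source can contribute a predicate the first never anticipated without any change to an existing schema; it says nothing about how a real store is implemented.

```python
# A triple store reduced to a set of (subject, predicate, object) tuples.
store = set()

# Source 1 only records a symbol for each gene (hypothetical identifiers).
store.add(("ex:gene1", "ex:symbol", "XYZ"))

# Source 2 brings a predicate that source 1 never anticipated.
# There is no table to alter; the new facts are simply added.
store.add(("ex:gene1", "ex:expressedIn", "ex:hippocampus"))

def describe(subject):
    """Collect every predicate/object pair known for a subject."""
    return {p: o for (s, p, o) in store if s == subject}

description = describe("ex:gene1")
```

In a table-oriented schema the second source would typically require a new column or a new join table before its data could be loaded at all.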
Alternatively, for the purpose of the demo, one could just do a
shallow alignment so that different data sets can be integrated.
We will do what's necessary. But at this point, since people have
volunteered to own the translation of certain data sources, and since
one of our goals is to explore and learn, I've been trying to get us
further than we would be with this approach. There have been previous
demonstrations of this sort of shallow alignment, and from the point
of view of showing something novel, it would be nice to go beyond
that. Given what's been done so far, and the responses I've seen to
the analysis and suggestions people have been offering, I'm feeling
optimistic.
Best,
Alan