On May 18, 2007, at 3:06 AM, Eric Jain wrote:

Alan Ruttenberg wrote:
If you want to say that the protein is found in some tissue, that's what should be said. However, in your email you wrote that the protein is expressed in the tissue.

Sorry about that, should run a consistency checker on my outgoing mail :-)

This is not a matter of consistency, it is a matter of saying what is meant :)

If it is known to be found in the tissue, I would define the subclass as that subclass of the protein each instance of which is located in some instance of the tissue. No processes involved at all.

You would use different representations depending on how well it is known?

All entities are involved in processes all the time. That we don't know the specifics doesn't mean they are not there, nor does it mean that the representation is different when we state the specifics.

By a reasonable definition of process (following, e.g., the BFO papers), if a process happens in a location, then each participant is located in some part of that location. So if it turns out that the protein expression process did happen in the tissue, and we had the relations appropriately encoded in our computational system, then the location of the protein (the fact that was stated) could be inferred. We would have extra information, but the information we already have would stay true.
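
As a rough illustration of that inference, here is a minimal Python sketch. It is not taken from BFO or any OBO ontology: the identifiers (expression_1, protein_1, liver_tissue_1) and the function name are hypothetical, and the rule is simplified so that "located in some part of the location" is collapsed to "located in the location".

```python
# Hypothetical toy facts; none of these identifiers come from BFO or OBO.
occurs_in = {"expression_1": "liver_tissue_1"}      # process -> location
has_participant = {"expression_1": ["protein_1"]}   # process -> participants


def infer_located_in(occurs_in, has_participant):
    """Derive (participant, location) pairs from process facts.

    Simplified rule (an assumption of this sketch): if a process occurs in
    a location, each participant of the process is located in that location.
    """
    inferred = set()
    for process, location in occurs_in.items():
        for participant in has_participant.get(process, []):
            inferred.add((participant, location))
    return inferred


# The fact that was originally stated, without mentioning any process.
stated = {("protein_1", "liver_tissue_1")}

inferred = infer_located_in(occurs_in, has_participant)

# Adding the process information does not contradict the stated fact;
# the stated fact becomes derivable from the richer description.
stated_is_derivable = stated <= inferred
```

The point of the sketch is only that the location statement stays true once the process is described; it can be recovered from the more detailed facts.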

In fact, very few such axioms are currently encoded in the BFO and OBO ontologies, a problem which many people want to and will address and which some, including myself, are working on. For example, I recently encoded a bunch of axioms representing constraints on part_of (e.g. a 3-d spatial region can't be part of a 2-d spatial region) in OWL and expect them to be added to a version of the relation ontology some time in the near future. Thomas Bittner is working on a FOL encoding of the BFO at http://www.ifomis.uni-saarland.de/bfo/fol which is substantially more detailed than any of the current OWL representations.
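
To make the kind of constraint concrete, here is a small Python sketch of a check on part_of assertions between spatial regions. It is not the OWL encoding mentioned above. The region names and the function are hypothetical, and the rule is generalized, as an assumption, from the 3-d/2-d example to "a region cannot be part of a region of lower dimension".

```python
# Hypothetical region table: region name -> spatial dimension.
dimension = {"volume_a": 3, "surface_b": 2, "line_c": 1}


def violates_dimension_constraint(part, whole, dimension):
    """Return True if `part` has a higher dimension than `whole`.

    Assumed generalization of the example constraint: a 3-d region
    cannot be part of a 2-d region, and likewise for other dimensions.
    """
    return dimension[part] > dimension[whole]


# Candidate part_of assertions as (part, whole) pairs.
assertions = [("volume_a", "surface_b"), ("line_c", "surface_b")]

# Keep only the assertions that break the constraint.
violations = [
    (part, whole)
    for part, whole in assertions
    if violates_dimension_constraint(part, whole, dimension)
]
```

A reasoner working from the OWL axioms would flag such an assertion as an inconsistency; the list comprehension here just filters for it directly.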

There are other computational systems that are candidates for doing such inferences. I'm particularly interested in OWL because it has the widest adoption and hence work in it has a higher chance, IMO, of being used by people.

I don't think we can make do with core RDF features

Neither do I; just not enthusiastic about reimplementing core features...

I think there is a lot of mileage we can get out of OWL, which extends RDF. Use of OWL has the dual benefit of saving us work, and helping the OWL people advance the state of their tools because they have realistic use cases. Use of OWL is not without problems - the reasoning techniques don't scale to anything near the size of the database we've created. OTOH there is ongoing research to address this and now they have a target (and they are very interested in tackling it). One area I am watching is the DL-Lite work, which offers some level of reasoning in a way that can be implemented in relational databases. I'm also aware of (but haven't yet tried) the upcoming Oracle RDF store that implements a subset of OWL, the OWLIM system, and interest at Openlink in adding further inference techniques. I'm sure there are others that I'm not aware of.

In the meantime, the approach we took for the demo was to add some ability to query over inferred knowledge by precomputing specific pieces of information which we knew would be useful for some of the queries we wanted to do, for example the part_of relations in the GO.
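
One plausible reading of "precomputing" here is materializing the transitive closure of part_of so that queries can look up inferred ancestors directly. The Python sketch below shows that idea only; it is not the code used for the demo, the function name is hypothetical, and the term names are placeholders rather than data loaded from the GO.

```python
from collections import defaultdict


def part_of_closure(direct_edges):
    """Map each term to every term it is transitively part_of.

    `direct_edges` is an iterable of (part, whole) pairs giving the
    directly asserted part_of relations.
    """
    parents = defaultdict(set)
    for part, whole in direct_edges:
        parents[part].add(whole)

    closure = {}
    for term in list(parents):
        seen = set()
        stack = list(parents[term])
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # already visited; also guards against cycles
            seen.add(node)
            stack.extend(parents.get(node, ()))
        closure[term] = seen
    return closure


# Placeholder edges for illustration only.
edges = [("nucleolus", "nucleus"), ("nucleus", "cell")]
closure = part_of_closure(edges)
```

With the closure stored alongside the asserted triples, a query for "everything that is part_of cell" becomes a plain lookup and needs no reasoner at query time, at the cost of extra storage and recomputation when the ontology changes.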

-Alan



