On Apr 18, 2007, at 1:38 AM, Chris Mungall wrote:

On Apr 17, 2007, at 10:49 AM, William Bug wrote:

I'm with Bijan on this issue.

However complex the current OWL representation may appear, it's considerably more terse than the expression of this same info in a relational model.

I'm not sure if this is necessarily the case.

To be clear about what I said: I am not fond of using triple-based syntaxes for representing class expressions (and axioms involving class expressions, and queries involving class expressions). I also dislike long names for operators (e.g., intersectionOf) and having to name (some) pieces of syntax (e.g., owl:Restrictions). Some non-triple/RDF representations retain the latter features (e.g., OWL 1.1 functional and XML syntax). I still tend to find them superior to the triple-based ones, so these are separate issues.

For complex class expressions, I prefer an operator syntax such as standard DL or FOL. Standard variable-free DL syntax has considerable concision and composition advantages, in my experience. (Even in ad hoc textual variants, it's nice to be able to do something like (some.P C) instead of (some(y)(Pxy & Cy)). Nesting quantifiers is really nice in DL syntax.)
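To make the nesting point concrete, here is a minimal sketch, not anyone's actual tooling: the Atom and Some classes and both printers are hypothetical names invented for illustration. It renders the same nested existential in the ad hoc variable-free form and in the FOL-style form, where every level of nesting needs a fresh variable.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A named class, e.g. Cell (hypothetical helper type)."""
    name: str


@dataclass(frozen=True)
class Some:
    """An existential restriction: some.prop filler (hypothetical helper type)."""
    prop: str
    filler: "ClassExpr"


ClassExpr = Union[Atom, Some]


def to_dl(expr):
    """Render in the ad hoc variable-free style, e.g. (some.P C)."""
    if isinstance(expr, Atom):
        return expr.name
    return f"(some.{expr.prop} {to_dl(expr.filler)})"


def to_fol(expr, var="x", fresh=None):
    """Render in the FOL-ish style; each nesting level consumes a new variable."""
    fresh = fresh if fresh is not None else count(1)
    if isinstance(expr, Atom):
        return f"{expr.name}({var})"
    new_var = f"y{next(fresh)}"
    body = to_fol(expr.filler, new_var, fresh)
    return f"(some({new_var})({expr.prop}({var},{new_var}) & {body}))"


nested = Some("part_of", Some("part_of", Atom("Cell")))
print(to_dl(nested))
print(to_fol(nested))
```

The DL printer never has to invent a name, which is the composition advantage in miniature; the FOL printer has to thread a variable supply through every recursive call.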

If we are talking specifically about the representation of OWL *in RDF triples* and the corresponding SPARQL queries, then we are essentially talking about a 3-ary relational model anyway,

I hesitate to follow moves about syntax through "essentially"s to models. Lisp lists are "essentially" chains of cons cells, but '(1 2 3) doesn't wear that on its sleeve (and could resolve to an array-based internal form).

modulo the usual concerns re open vs closed world and the like.

Such talk *really* worries me when we are talking *syntax*.

And n-ary relations are surely either as terse or more terse than 3-ary relations.

Since OWL is restricted in the number of distinct variables (and the combinations thereof), you get some of the advantages of variable freedom even in the hairier syntaxes.

Compare facts in an imaginary relational model for OWL [1]:

        existential_restriction(part_of, CellNucleus, Cell)

This is not far off from current OWL 1.1 functional syntax, see:
        http://webont.org/owl/1.1/owl_specification.html#4

But I'd want it to be compositional, i.e., "Cell" to be replaceable with a complex class expression, e.g., another existential_restriction. If we are going to talk "relational model" then we've added function terms, at the very least.

With [2]:

        subClassOf(CellNucleus,_r1)
        restriction(_r1)
        onProperty(_r1,part_of)
        someValuesFrom(_r1,Cell)

Yes, this is exactly the "trouble with triples". ewww. hate that bnode too.
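As a rough illustration of the gap between [1] and [2], the sketch below expands one n-ary fact into the four triple-style facts, minting a fresh blank-node name along the way. The function name and the counter argument are assumptions made up for this example; only the predicate names come from the listings above.

```python
from itertools import count


def expand_existential_restriction(prop, sub_class, filler, bnode_ids):
    """Expand one n-ary fact from [1] into the four facts of [2]."""
    bnode = f"_r{next(bnode_ids)}"  # fresh blank node standing for the restriction
    return [
        ("subClassOf", sub_class, bnode),
        ("restriction", bnode),
        ("onProperty", bnode, prop),
        ("someValuesFrom", bnode, filler),
    ]


facts = expand_existential_restriction("part_of", "CellNucleus", "Cell", count(1))
for fact in facts:
    print(f"{fact[0]}({','.join(fact[1:])})")
```

One input fact becomes four, and a query has to join all four through the bnode to recover it.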

[snip]
And of course SQL and most implementations of the relational model give you little or no deductive facilities; but then, this is true for most SPARQL implementations too. Even with RDFS entailment, you don't have enough for basic class-level (TBox) transitivity.

Pellet and KAON2 support SPARQL syntax for *Abox* queries (to some degree, it varies) and racer has a similar language.


Anyway, I think I'm being pedantic and straying from the point. The issue is that queries expressed in SPARQL over class-level relations (eg part_of in a TBox)

And relatively new. I.e., most conjunctive querying in DL land is purely over *ABoxes*. Querying *TBoxes* is done with special functions a la DIG. Thus, the triple syntax of SPARQL is a bit misleading.

However, in at least one version of Pellet we had experimental support for mixed TBox/Abox queries and we've written this up:
        http://clarkparsia.com/files/pdf/sparqldl.pdf
with an eye to getting a spec together at OWLED. Intuitively, you compile out the TBox query parts and turn them into DIG calls, then perform query expansion on the class or property variables in the abox atoms.

represented using owl restrictions are verbose, contrasted to representations that use a single predicate for the class-level relation. The issue here is not the syntax per se, rather the additional triples and bNode created when layering the OWL on the RDF model. I don't know if it's such a huge problem - I have learned to live with it - but I know that people used to n-ary relational queries balk at doing a multi-triple-with-bnode query for simple TBox queries such as the above.

Hence my preference to plug in a better syntax for SPARQL/DL. The triple syntax is also misleading as it leads users to expect some queries to be legal (and useful) which just aren't.

This is something which we'll spend some considerable time at OWLED on.

One solution here is Alan's alternate non-OWL layering of class level relations in the RDF model, possibly controversial. Another is an additional layer on top of SPARQL - eg some macro language that provides constructs such as a single predicate for class level relations - and compiles down to SPARQL - this appears to be what is suggested below? Manchester syntax is mentioned - a QL based on Manchester syntax would be nice. For our ABox query we could say "? X part_of some Cell". I imagine this could trivially compile down to SPARQL - or it could be an OWL QL that has its own model.
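A minimal sketch of the "macro layer" idea, under loud assumptions: the one-pattern mini-syntax accepted here, the function name, and the ex: namespace are all hypothetical, and this is not Manchester syntax proper or any proposed spec. It turns "? X part_of some Cell" into the multi-triple-with-bnode SPARQL pattern over the standard OWL/RDF restriction encoding.

```python
import re

# The rdfs: and owl: namespaces are the standard ones; ex: is a placeholder.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
    "PREFIX ex: <http://example.org/>\n"
)

# Accepts exactly one shape: "?VAR PROPERTY some CLASS" (hypothetical mini-syntax).
_PATTERN = re.compile(r"^\?\s*(\w+)\s+(\w+)\s+some\s+(\w+)$")


def compile_some_query(text):
    """Compile '?X p some C' to SPARQL text over the restriction triples."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported query: {text!r}")
    var, prop, filler = match.groups()
    return (
        PREFIXES
        + f"SELECT ?{var} WHERE {{\n"
        + f"  ?{var} rdfs:subClassOf _:r .\n"
        + "  _:r a owl:Restriction .\n"
        + f"  _:r owl:onProperty ex:{prop} .\n"
        + f"  _:r owl:someValuesFrom ex:{filler} .\n"
        + "}\n"
    )


query = compile_some_query("? X part_of some Cell")
print(query)
```

The generated query only matches asserted restriction triples; whether it returns a complete answer still depends on what entailment the SPARQL endpoint offers.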

As I said, Kendall Clark and I made progress on an XML syntax for SPARQL. You could then leave the algebra parts constant and plug in OWL 1.1's XML syntax as either a compilation target or source.

This is related to but different from the issue of entailment - many RDF systems, including most SPARQL implementations, give you little or no entailment - eg RDFS. This isn't enough to give you a complete answer for [2] (assuming part_of is transitive). Alan's transformation does, I believe, give you a correct answer for when you have RDFS entailment.
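To illustrate what "not a complete answer" means here, the sketch below takes class-level part_of facts written with a single predicate and computes their transitive closure by naive fixpoint iteration. The Nucleolus fact is an illustrative assumption added for the example, and this stands in for what a reasoner would derive; it is not how any particular SPARQL engine or Alan's transformation works.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (part, whole) pairs."""
    closure = set(pairs)
    changed = True
    while changed:  # repeat until no new pair can be derived
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted class-level facts; the first one is a made-up example.
asserted = {("Nucleolus", "CellNucleus"), ("CellNucleus", "Cell")}
entailed = transitive_closure(asserted)

# A plain pattern match over the asserted facts misses the derived pair.
print(sorted(part for (part, whole) in asserted if whole == "Cell"))
print(sorted(part for (part, whole) in entailed if whole == "Cell"))
```

The first listing is what a query with no entailment sees; the second is what you would want back if part_of is declared transitive.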

You can write some very effective SPARQL queries against it, after playing with it a bit to get a more complete understanding of what the ontology is trying to express.

I've certainly been having pretty good luck creating SPARQL queries - even by hand (i.e., without fancy end-user oriented tools) - against some of the similarly modeled data in the NeuroCommons repository.

SPARQL seems adequate in many respects for data oriented queries (typically, but not always, ABox) - the verbosity manifests in TBox queries, and possibly other scenarios that dictate the standard n-ary pattern transform.

And in arbitrary DL systems, you are most likely to only *get* ABox queries, since traditionally you used an API for your TBox queries. The paper referenced above is trying to change that.

(Cerebra had a sorta mixed TBox/ABox query language based on XQuery, but it just made the DIGgish calls more or less explicit.)

Cheers,
Bijan.
