Re: Demo SPARQL notes

Bijan Parsia Tue, 17 Apr 2007 18:30:08 -0700


On Apr 18, 2007, at 1:38 AM, Chris Mungall wrote:

On Apr 17, 2007, at 10:49 AM, William Bug wrote:
I'm with Bijan on this issue.
However complex the current OWL representation may appear, it's considerably more terse than the expression of this same info in a relational model.
I'm not sure if this is necessarily the case.

To be clear about what I said: I am not fond of using triple-based syntaxes for representing class expressions (and axioms involving class expressions, and queries involving class expressions). I also dislike long names for operators (e.g., intersectionOf) and having to name (some) pieces of syntax (e.g., owl:Restrictions). Some non-triple/RDF representations retain the latter features (e.g., OWL 1.1 functional and XML syntax). I still tend to find them superior to the triple-based ones, so these are separate issues.

For complex class expressions, I prefer an operator syntax such as standard DL or FOL. Standard variable-free DL syntax has considerable concision and composition advantages, in my experience. (Even in ad hoc textual variants, it's nice to be able to do something like (some.P C) instead of (some(y)(Pxy & Cy)). Nesting quantifiers is really nice in DL syntax.)
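
To make the comparison concrete, here is a minimal Python sketch that renders the same nested existential in a variable-free DL style and in a FOL style. The names Atom, Some, to_dl and to_fol are invented for this illustration and do not come from any OWL tool; the point is only that the FOL rendering needs a fresh variable per quantifier while the DL rendering nests by plain composition.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union


@dataclass(frozen=True)
class Atom:
    """A named class, e.g. Cell."""
    name: str


@dataclass(frozen=True)
class Some:
    """Existential restriction over a property (illustrative only)."""
    prop: str
    filler: "ClassExpr"


ClassExpr = Union[Atom, Some]


def to_dl(expr: ClassExpr) -> str:
    """Variable-free rendering; nesting is plain composition."""
    if isinstance(expr, Atom):
        return expr.name
    return f"(some.{expr.prop} {to_dl(expr.filler)})"


def to_fol(expr: ClassExpr, var: str = "x", fresh=None) -> str:
    """FOL-style rendering; each quantifier introduces a fresh variable."""
    fresh = fresh if fresh is not None else count(1)
    if isinstance(expr, Atom):
        return f"{expr.name}({var})"
    y = f"y{next(fresh)}"
    inner = to_fol(expr.filler, y, fresh)
    return f"(some({y})({expr.prop}({var},{y}) & {inner}))"


# "part of something that is part of an Organism"
nested = Some("part_of", Some("part_of", Atom("Organism")))
print(to_dl(nested))
print(to_fol(nested))
```

The DL form stays the same shape at every depth, whereas the FOL form has to thread variable names through each level.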

If we are talking specifically about the representation of OWL *in RDF triples* and the corresponding SPARQL queries, then we are essentially talking about a 3-ary relational model anyway,

I hesitate to follow moves about syntax through "essentially"s to models. Lisp lists are "essentially" chains of cons cells, but '(1 2 3) doesn't wear that on its sleeve (and could resolve to an array-based internal form).

modulo the usual concerns re open vs closed world and the like.


Such talk *really* worries me when we are talking *syntax*.

And n-ary relations are surely either as terse or more terse than 3-ary relations.

Since OWL is restricted in the number of distinct variables (and the combinations thereof), you get some of the advantages of variable freedom even in the hairier syntaxes.

Compare facts in an imaginary relational model for OWL [1]:

        existential_restriction(part_of, CellNucleus, Cell)


This is not far off from current OWL 1.1 functional syntax, see:
        http://webont.org/owl/1.1/owl_specification.html#4

But I'd want it to be compositional, i.e., "Cell" to be replaceable with a complex class expression, e.g., another existential_restriction. If we are going to talk "relational model" then we've added function terms, at the very least.

With [2]:

        subClassOf(CellNucleus,_r1)
        restriction(_r1)
        onProperty(_r1,part_of)
        someValuesFrom(_r1,Cell)

Yes, this is exactly the "trouble with triples". Ewww. Hate that bnode too.
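
As a rough illustration of where the extra facts come from, the following Python sketch expands one fact in the style of [1] into the four facts of [2]. The function name expand_existential and the _rN labelling scheme are assumptions made for this example; the predicate names are copied from the listing above, not taken from any RDF library.

```python
from itertools import count

# Counter used to mint fresh bnode-style labels (_r1, _r2, ...).
_bnode_ids = count(1)


def expand_existential(prop, sub, filler):
    """Expand existential_restriction(prop, sub, filler) into the
    layered facts shown in [2], introducing one intermediate node."""
    r = f"_r{next(_bnode_ids)}"
    return [
        ("subClassOf", sub, r),
        ("restriction", r),
        ("onProperty", r, prop),
        ("someValuesFrom", r, filler),
    ]


facts = expand_existential("part_of", "CellNucleus", "Cell")
for fact in facts:
    print(f"{fact[0]}({','.join(fact[1:])})")
```

One fact becomes four, and any query over the expanded form has to join on the intermediate node.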


[snip]

And of course SQL and most implementations of the relational model give you little or no deductive facilities; but then, this is also true for most SPARQL implementations. Even with RDFS entailment, you don't have enough for basic class-level (TBox) transitivity.

Pellet and KAON2 support SPARQL syntax for *ABox* queries (to some degree, it varies) and Racer has a similar language.

Anyway, I think I'm being pedantic and straying from the point. The issue is that queries expressed in SPARQL over class-level relations (e.g. part_of in a TBox)

And relatively new. I.e., most conjunctive querying in DL land is purely over *ABoxes*. Querying *TBoxes* is done with special functions a la DIG. Thus, the triple syntax of SPARQL is a bit misleading.

However, in at least one version of Pellet we had experimental support for mixed TBox/ABox queries and we've written this up:

        http://clarkparsia.com/files/pdf/sparqldl.pdf

with an eye to getting a spec together at OWLED. Intuitively, you compile out the TBox query parts and turn them into DIG calls, then perform query expansion on the class or property variables in the ABox atoms.
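
A toy Python sketch of that two-step idea follows. It is not Pellet's implementation: the data, the helper subclasses_of (standing in for a DIG-style TBox call) and the function answer are all made up for illustration, and the ABox step only looks at asserted types.

```python
# Toy data; every name here is hypothetical.
SUBCLASS_OF = {            # direct TBox subclass edges: sub -> super
    "Neuron": "Cell",
    "PurkinjeCell": "Neuron",
    "Cell": "Thing",
}
TYPES = {                  # asserted ABox types: individual -> class
    "n1": "PurkinjeCell",
    "c7": "Cell",
    "o3": "Thing",
}


def subclasses_of(cls):
    """Stand-in for a TBox call: cls itself plus all its descendants."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF.items():
            if sup in found and sub not in found:
                found.add(sub)
                changed = True
    return found


def answer(cls):
    """Answer the mixed query { ?C subClassOf cls . ?x type ?C }."""
    # Step 1: compile out the TBox atom into bindings for ?C.
    class_bindings = sorted(subclasses_of(cls))
    # Step 2: expand the ABox atom once per binding of ?C.
    results = []
    for c in class_bindings:
        for individual, asserted in TYPES.items():
            if asserted == c:
                results.append((individual, c))
    return results


print(answer("Cell"))
```

The TBox part never touches the instance data, and the ABox part becomes an ordinary instance query per class binding.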

represented using OWL restrictions are verbose, contrasted to representations that use a single predicate for the class-level relation. The issue here is not the syntax per se, rather the additional triples and bNode created when layering the OWL on the RDF model. I don't know if it's such a huge problem - I have learned to live with it - but I know that people used to n-ary relational queries balk at doing a multi-triple-with-bnode query for simple TBox queries such as the above.

Hence my preference to plug in a better syntax for SPARQL/DL. The triple syntax is also misleading as it leads users to expect some queries to be legal (and useful) which just aren't.


This is something which we'll spend some considerable time at OWLED on.

One solution here is Alan's alternate non-OWL layering of class-level relations in the RDF model, possibly controversial. Another is an additional layer on top of SPARQL - e.g. some macro language that provides constructs such as a single predicate for class-level relations - and compiles down to SPARQL - this appears to be what is suggested below? Manchester syntax is mentioned - a QL based on Manchester syntax would be nice. For our ABox query we could say "?X part_of some Cell". I imagine this could trivially compile down to SPARQL - or it could be an OWL QL that has its own model.
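
A minimal Python sketch of such a macro layer, handling only the single shape "?X part_of some Cell", might look like this. The function compile_macro and the ex: namespace are assumptions for the example; the rdfs: and owl: terms are the standard vocabulary, and the generated query uses SPARQL's bracketed blank-node syntax for the restriction.

```python
import re

PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
    "PREFIX ex: <http://example.org/>\n"  # hypothetical namespace
)

# Recognises only the one shape used in the text: "?Var property some Class".
MACRO = re.compile(r"^\?(\w+)\s+(\w+)\s+some\s+(\w+)$")


def compile_macro(query):
    """Compile '?Var property some Class' into a multi-triple SPARQL query."""
    m = MACRO.match(query.strip())
    if m is None:
        raise ValueError(f"unsupported macro: {query!r}")
    var, prop, cls = m.groups()
    return (
        f"{PREFIXES}"
        f"SELECT ?{var} WHERE {{\n"
        f"  ?{var} rdfs:subClassOf [\n"
        f"    a owl:Restriction ;\n"
        f"    owl:onProperty ex:{prop} ;\n"
        f"    owl:someValuesFrom ex:{cls}\n"
        f"  ] .\n"
        f"}}"
    )


print(compile_macro("?X part_of some Cell"))
```

This only matches asserted restrictions syntactically; it says nothing about what a reasoner would entail.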

As I said, Kendall Clark and I made progress on an XML syntax for SPARQL. You could then leave the algebra parts constant and plug in OWL 1.1's XML syntax as either a compilation target or source.

This is related to but different from the issue of entailment - many RDF systems, including most SPARQL implementations - give you little or no entailment - e.g. RDFS. This isn't enough to give you a complete answer for [2] (assuming part_of is transitive). Alan's transformation does, I believe, give you a correct answer for when you have RDFS entailment.
You can write some very effective SPARQL queries against it, after playing with it a bit to get a more complete understanding of what the ontology is trying to express.
I've certainly been having pretty good luck creating SPARQL queries - even by hand (i.e., without fancy end-user oriented tools) - against some of the similarly modeled data in the NeuroCommons repository.
SPARQL seems adequate in many respects for data oriented queries (typically, but not always, ABox) - the verbosity manifests in TBox queries, and possibly other scenarios that dictate the standard n-ary pattern transform.

And in arbitrary DL systems, you are most likely to only *get* ABox queries, since traditionally you used an API for your TBox queries. The above referenced paper is trying to change that.

(Cerebra had a sorta mixed TBox/ABox query language based on XQuery, but it just made the DIGgish calls more or less explicit.)


Cheers,
Bijan.
