I just checked in the most recent version of what had been an
experimental, generated (see:
http://copia.ogbuji.net/blog/2005-04-27/Of_BisonGe) parser for the
full SPARQL syntax, which I had been working on to hook up with
sparql-p.  It parses a SPARQL query into a set of Python objects
representing the components of the grammar:

http://svn.rdflib.net/trunk/rdflib/sparql/bison/

The parser itself is a Python/C extension, so setup.py had to be
modified in order to compile it into a Python module.

I also checked in a test harness that's meant to work with the DAWG test cases:

http://svn.rdflib.net/trunk/test/BisonSPARQLParser
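
For anyone who hasn't looked at the harness yet, a parse-only check of
this kind can be sketched roughly as below.  This is not the code in
the checked-in harness: run_parse_only and toy_parse are hypothetical
names made up for illustration, and toy_parse is only a stand-in for
the real parser so that the sketch runs on its own.

```python
def run_parse_only(parse, queries):
    """Try to parse each query; report which ones parsed and which did not.

    `parse` is any callable that raises an exception on a syntax error
    (an assumption about how the parser signals failure).
    `queries` maps a test name to the query text.
    """
    passed, failed = [], []
    for name, text in sorted(queries.items()):
        try:
            parse(text)
        except Exception as exc:
            failed.append((name, str(exc)))
        else:
            passed.append(name)
    return passed, failed


def toy_parse(text):
    # Stand-in parser for illustration only: accepts anything that
    # starts with SELECT and rejects everything else.
    if not text.strip().upper().startswith("SELECT"):
        raise ValueError("expected SELECT")
    return text


passed, failed = run_parse_only(toy_parse, {
    "ok": "SELECT ?x WHERE { ?x ?p ?o }",
    "bad": "SELEKT ?x",
})
```

Passing the parser in as a callable keeps the loop independent of any
particular parser, and tests known to be bad can simply be left out of
the dictionary.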

I'm currently stuck on this test case, but working through it:

http://www.w3.org/2001/sw/DataAccess/tests/#optional-outer-filter-with-bound

The test harness only checks for parsing; it doesn't evaluate the
parsed query against the corresponding set of test data, but it can
easily be extended to do so.  I'm not sure about the state of those
test cases: some have been 'accepted' and some haven't.  I came across
a couple that were illegal according to the most recent SPARQL grammar
(the bad tests are noted in the test harness).  Currently the parser
is stand-alone; it doesn't invoke sparql-p, for a few reasons:

1) I wanted to get it through parsing the queries in the test cases first
2) Our integrated version of sparql-p is outdated, as there is a more
recent version that Ivan has been working on, with some improvements
we should consider integrating
3) Some of the more complex combinations of Graph Patterns don't seem
solvable without re-working / extending the expansion tree solver.  I
have some ideas about how this could be done (to handle things like
nested UNIONs and OPTIONALs) but wanted to get a working parser in
first

Using the parser is simple:

from rdflib.sparql.bison import Parse
p = Parse(query,DEBUG)
print p

p is an instance of rdflib.sparql.bison.Query.Query

Most of the parsed objects implement a __repr__ function which prints
a representation of the parsed objects.  The functions recurse down
into the lower level objects, so tracing how each __repr__ method is
implemented is a good way to determine how to deconstruct the parsed
SPARQL query object.
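
To illustrate what "recursing down" means here, this is a minimal
sketch of the pattern.  The classes below (Triple, GraphPattern,
SelectQuery) are hypothetical stand-ins invented for the example, not
the actual rdflib.sparql.bison classes, whose names and attributes may
well differ.

```python
# Hypothetical stand-ins, NOT the real rdflib.sparql.bison classes.
class Triple(object):
    def __init__(self, subject, predicate, obj):
        self.subject = subject
        self.predicate = predicate
        self.obj = obj

    def __repr__(self):
        return "Triple(%s %s %s)" % (self.subject, self.predicate, self.obj)


class GraphPattern(object):
    def __init__(self, triples):
        self.triples = triples

    def __repr__(self):
        # repr() of a list calls repr() on each element, so every
        # Triple's __repr__ is invoked from here.
        return "GraphPattern(%r)" % (self.triples,)


class SelectQuery(object):
    def __init__(self, variables, where):
        self.variables = variables
        self.where = where

    def __repr__(self):
        # %r on self.where hands off to GraphPattern.__repr__.
        return "SelectQuery(%s WHERE %r)" % (" ".join(self.variables),
                                             self.where)


q = SelectQuery(["?name"],
                GraphPattern([Triple("?x", "foaf:name", "?name")]))
text = repr(q)
# text == "SelectQuery(?name WHERE GraphPattern([Triple(?x foaf:name ?name)]))"
```

Reading the top-level __repr__ and following each %r downwards gives
the attribute names needed to walk the object tree by hand.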

These __repr__ methods could probably be re-written to echo the SPARQL
query right back as a way to

1) Test round-tripping of SPARQL queries
2) Create SPARQL queries by instantiating the rdflib.sparql.bison.*
objects and converting them to strings
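
A rough sketch of the second idea, under stated assumptions: the
classes TriplePattern and Select and the helpers normalise and
same_query_text are hypothetical and written only for this example
(they are not part of rdflib.sparql.bison), and the sketch uses
__str__ instead of __repr__ just to keep the two roles apart.  It only
covers the serialising half; a real round-trip test would feed the
original text through the parser first.

```python
# Hypothetical classes for illustration, NOT rdflib.sparql.bison.
class TriplePattern(object):
    def __init__(self, s, p, o):
        self.s, self.p, self.o = s, p, o

    def __str__(self):
        return "%s %s %s ." % (self.s, self.p, self.o)


class Select(object):
    def __init__(self, variables, patterns):
        self.variables = variables
        self.patterns = patterns

    def __str__(self):
        # Emit SPARQL text instead of a debugging representation.
        body = " ".join(str(t) for t in self.patterns)
        return "SELECT %s WHERE { %s }" % (" ".join(self.variables), body)


def normalise(query_text):
    # Collapse runs of whitespace so layout differences are ignored.
    return " ".join(query_text.split())


def same_query_text(a, b):
    return normalise(a) == normalise(b)


original = """
SELECT ?name
WHERE { ?x <http://xmlns.com/foaf/0.1/name> ?name . }
"""
rebuilt = str(Select(
    ["?name"],
    [TriplePattern("?x", "<http://xmlns.com/foaf/0.1/name>", "?name")]))
round_trips = same_query_text(original, rebuilt)
```

Comparing whitespace-normalised text is a crude check; it would not
treat two equivalent queries as equal if they differ in, say, prefix
usage or the order of patterns.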

It's still a work in progress, but I think it's far enough through the
test cases that it can handle most of the more common syntax.

Chimezie

