Hi Mike, I agree that a limited Gremlin-to-SPARQL compiler would be a great thing. While graph DB vendors with (other) entrenched data models and query languages are unlikely to implement SPARQL, TP3 would pull in a world of RDF triple stores with this move. I know that Franz (AllegroGraph) in particular has been interested in Gremlin support. However, one of the nice things about SPARQL is that it comes with the protocol, so we could interface with any compliant triple store at all.
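
As a rough sketch of what a "limited" compiler plus the protocol could look like, the snippet below turns a fixed-length chain of out() steps (roughly g.V(x).out('knows').out('created') in Gremlin) into a single SPARQL SELECT, and then wraps it in a SPARQL 1.1 Protocol query-via-GET request. Everything named here is an assumption for illustration only: the endpoint URL, the edge-label namespace, the one-triple-per-edge mapping, and the helper names compile_out_steps and build_protocol_request are hypothetical and do not come from TinkerPop, Blazegraph, or any existing compiler.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: neither the endpoint nor the namespace refers to a real service.
ENDPOINT = "http://localhost:9999/sparql"
EDGE_NS = "http://example.org/edge/"


def compile_out_steps(start_iri, labels):
    """Compile a chain of out(label) steps into one SPARQL SELECT query.

    Assumes each property-graph edge is stored as a single RDF triple whose
    predicate is EDGE_NS + label. Each step becomes one triple pattern;
    intermediate vertices become ?v1, ?v2, ... and the last one is projected.
    """
    if not labels:
        raise ValueError("need at least one out() step")
    patterns = []
    subject = f"<{start_iri}>"
    for i, label in enumerate(labels, start=1):
        obj = f"?v{i}"
        patterns.append(f"  {subject} <{EDGE_NS}{label}> {obj} .")
        subject = obj  # the next step starts from the vertex just reached
    body = "\n".join(patterns)
    return f"SELECT {subject} WHERE {{\n{body}\n}}"


def build_protocol_request(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol query-via-GET request."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Usage: two hops from a hypothetical start vertex.
query = compile_out_steps("http://example.org/v/1", ["knows", "created"])
request = build_protocol_request(ENDPOINT, query)
print(query)
print(request.full_url)
```

Because the request is plain HTTP with a query parameter, the same code should in principle work against any store that implements the protocol; only the endpoint changes. A fixed-length chain is of course the easy case.
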

To what extent various language features can be carried over efficiently is another question. I would imagine that your property graph DSL is a rough approximation of that which can be done in Gremlin. Things are sure to get interesting around recursive features. For example, certain Gremlin queries may be mapped to a single SPARQL query each, whereas others will require the generation and evaluation of successive queries.

Josh

On Tue, Oct 27, 2015 at 7:59 AM, Mike Personick <[email protected]> wrote:

> I am just reading this older thread about OpenCypher / SPARQL on the
> archives. Very interesting.
>
> My 2 cents - Tinkerpop is a great API that makes graph application
> development much easier, but the lack of a declarative query language is a
> barrier to making those applications scale. I strongly prefer to develop
> application code using Tinkerpop over raw RDF or Sesame, but once the data
> is there I prefer to access and update it via SPARQL.
>
> I am of course biased but I think trying to bring SPARQL more directly into
> the fold would be a good thing. I did it with our TP2 integration and I
> plan to do it again with TP3. New users drawn into the RDF world through
> the TP API end up replacing a lot of custom code and stored procedures with
> SPARQL queries, with which you can do a lot of very powerful things.
>
> I'd love to find some time to write a compiler to compile gremlin
> traversals into SPARQL operators directly, instead of re-formulating
> traversals by hand into SPARQL. Once in SPARQL form a query optimizer /
> vectored query engine can decide on an optimal execution order based on the
> cardinalities it finds in the graph, instead of a fixed execution order
> specified by the user. This type of re-write is of course fundamental to
> scaling.
>
> I've spent a fair amount of time working with Cypher at this point and my
> (again, biased) conclusion is that Neo4j is re-inventing the wheel and
> Cypher is still many years behind SPARQL 1.1 in terms of its capabilities
> and its scalability implications for implementers. There is already a
> W3C standard query language with a wide user-base, why not use it?
>
> It is also possible to develop a property-graph DSL on top of SPARQL that
> inter-operates with the Tinkerpop API and data model. I have done this in
> a proprietary setting already and my goal is to eventually bring it into
> open-source.
>
> Thanks,
> Mike
> Blazegraph Core Development Team
>
