Hello all,

I would like to take some concrete steps toward the TinkerPop 4
interoperability goals I've stated a few times (e.g. see TinkerPop 2020
<https://www.slideshare.net/joshsh/tinkerpop-2020> from last year). At a
meetup <https://www.meetup.com/Category-Theory/events/277331504/> a couple
of months ago, I demonstrated an approach for generating TinkerPop APIs
consistently into different languages. I have started to check in some of
that generated code in a branch (see my commits here
<https://github.com/apache/tinkerpop/commits/TINKERPOP-2563-language/gremlin-language>)
and add bits and pieces for RDF support, as well.

The Apache Software Foundation asks us to discuss any significant changes
to the code base on the dev list. Since these steps toward TP4 will be
major changes if and when they are merged into the master branch, I will
start discussing them here. Expect occasional emails from me about the
various things I will be doing in the branch. I absolutely invite comments,
feedback, and actual discussion on these design proposals, but even if it's
just me issuing self-affirming statements into the void like the King of
Pointland, I will just carry on, because that's how this process works.

A brief summary of the changes so far:


   - *Abstract specification of Gremlin traversals*. I have turned
   Stephen's Gremlin.g4
   
<https://github.com/apache/tinkerpop/blob/TINKERPOP-2563-language/gremlin-language/src/main/antlr4/Gremlin.g4>
   ANTLR grammar into an abstract specification of Gremlin traversal syntax
   using the Dragon (YAML-based) format. Unfortunately, it is looking very
   unlikely that Dragon will become available as open-source software, so you
   can expect this YAML format to change just slightly once we have a new
   Dragon-like tool for schema and data transformations. More on that later.
   Right now, the syntax specification can be found here
   
<https://github.com/apache/tinkerpop/tree/TINKERPOP-2563-language/gremlin-language/src/main/yaml/org/apache/tinkerpop/gremlin/language/model>,
   although the file path might change in the future.


   - *Traversal DTOs*. Based on the abstract specification, I have
   generated Java classes for building and working with traversals. The
   generated files can currently be found here
   
<https://github.com/apache/tinkerpop/tree/TINKERPOP-2563-language/gremlin-language/src/gen/java/org/apache/tinkerpop/gremlin/language/model>.
   These are essentially POJOs or DTO classes, with special boilerplate
   methods for equality, pattern matching over alternative constructors, and
   modification by copying (since the instances are immutable). These classes
   allow you to build traversals in a declarative way, while all of the logic
   for evaluating traversals goes elsewhere. Support for serialization and
   deserialization for traversals is to be added in the future -- and the same
   goes for all other classes generated in this way.


   - *RDF 1.1 concepts model*. RDF support was part of TinkerPop from the
   beginning, but it was de-emphasized for TinkerPop 3 due to other priorities
   such as OLAP. For years, developers have been asking us for better
   interoperability with RDF. While we do have some query-level support for
   RDF these days in sparql-gremlin, we no longer have any data-level support,
   such as loading RDF data into a property graph and getting it back out, or
   evaluating Gremlin traversals over RDF datasets. These things are
   not especially hard to do, in certain limited ways, but our old approach of
   writing adapters like GraphSail
   <https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation>,
   SailGraph
   <https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation>, and
   PropertyGraphSail
   
<https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation>
   in Java, with no support for other languages, does not seem appropriate for
   TinkerPop 4. Also, those early mappings were extremely underspecified in a
   formal sense -- good enough for some practical applications, but not good
   enough for anything requiring inference, optimization, or composition with
   other mappings. To that end, I am starting to add abstract specifications
   for RDF along the lines of the Gremlin specifications I described above.
   The first of these, a specification of RDF 1.1 Concepts, can currently be
   found here
   
<https://github.com/apache/tinkerpop/blob/TINKERPOP-2563-language/gremlin-language/src/main/yaml/org/apache/tinkerpop/rdf/rdf11concepts.yaml>,
   with generated Java classes here
   
<https://github.com/apache/tinkerpop/tree/TINKERPOP-2563-language/gremlin-language/src/gen/java/org/apache/tinkerpop/rdf/rdf11concepts>.
   This gives us a way of working with RDF data in a language-neutral and
   framework-neutral way (whereas we were previously tied to Java and to the
   RDF4J, née Sesame, API). Mappings into and out of RDF will be defined with
   respect to these abstract types, which can easily be adapted to native RDF
   APIs in whatever language you happen to be working in.
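
To make the shape of these DTO classes a little more concrete, here is a
small hand-written sketch. It is not the generated code, and every name in it
(Term, Iri, Literal, Visitor, withLexicalForm, TermPrinter) is made up for
illustration. It only assumes what is described above: immutable instances,
value-based equality, pattern matching over alternative constructors, and
modification by copying.

```java
import java.util.Objects;

// Hand-written illustration only; these are NOT the generated classes.
// Term, Iri, Literal, Visitor, and withLexicalForm are hypothetical names.
abstract class Term {
    private Term() {}

    /** Pattern matching over the alternative constructors. */
    interface Visitor<R> {
        R visit(Iri iri);
        R visit(Literal literal);
    }

    abstract <R> R accept(Visitor<R> visitor);

    static final class Iri extends Term {
        final String value;

        Iri(String value) {
            this.value = Objects.requireNonNull(value);
        }

        @Override
        <R> R accept(Visitor<R> visitor) {
            return visitor.visit(this);
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Iri && value.equals(((Iri) other).value);
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }
    }

    static final class Literal extends Term {
        final String lexicalForm;
        final Iri datatype;

        Literal(String lexicalForm, Iri datatype) {
            this.lexicalForm = Objects.requireNonNull(lexicalForm);
            this.datatype = Objects.requireNonNull(datatype);
        }

        /** Modification by copying: returns a new instance, leaves this one untouched. */
        Literal withLexicalForm(String newLexicalForm) {
            return new Literal(newLexicalForm, datatype);
        }

        @Override
        <R> R accept(Visitor<R> visitor) {
            return visitor.visit(this);
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Literal)) return false;
            Literal that = (Literal) other;
            return lexicalForm.equals(that.lexicalForm) && datatype.equals(that.datatype);
        }

        @Override
        public int hashCode() {
            return Objects.hash(lexicalForm, datatype);
        }
    }
}

// Logic lives outside the data classes, here as a visitor that renders a term.
class TermPrinter implements Term.Visitor<String> {
    @Override
    public String visit(Term.Iri iri) {
        return "<" + iri.value + ">";
    }

    @Override
    public String visit(Term.Literal literal) {
        return "\"" + literal.lexicalForm + "\"^^" + visit(literal.datatype);
    }
}
```

The point of the sketch is the division of labor: the data classes only
carry structure, and anything that interprets them (printing here, but
equally evaluation or a mapping to a native RDF API) is written separately
against the visitor interface.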


I will write more about the above topics as time goes by and I continue
adding code to the branch. Happy to answer any questions or discuss any
feedback in the meantime.

Josh
