Hi Pieter,

You give some good motivation for a formal schema language. My proposal for
an abstract data model for TinkerPop was, and is, Algebraic Property Graphs
(paper <https://arxiv.org/abs/1909.04881>), of which Dragon's data model is
an extension. APG is broader than typical property graphs (e.g. allowing
hyperelements, nested data, and other features which are uncommon or
unknown in connection with TinkerPop), so the best answer to your question
is probably "a variant of APG with restrictions".

Given a formal specification of TinkerPop's data model, we can be very
flexible with respect to concrete syntaxes. Dragon has its YAML syntax, and
the new framework will probably support a slightly different YAML syntax,
but you can specify graph schemas in a variety of languages (the current
tooling will read schemas expressed in YAML, JSON, Thrift, or Protobuf),
and you can express graph data in a variety of languages. What the formal
specification of the data model, and the mappings, give you is the ability
to map schemas and data transparently between the formats, so you can use
whatever is most appropriate to your application.
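
As a rough illustration of that idea (a sketch only: this is not Dragon's
format or API, and names such as VertexType, to_json, and to_yaml are
hypothetical), a single abstract schema type can be rendered into more than
one concrete syntax and read back without loss:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyType:
    """Hypothetical abstract description of one property key."""
    name: str
    datatype: str  # e.g. "string" or "integer"


@dataclass(frozen=True)
class VertexType:
    """Hypothetical abstract description of one vertex label."""
    label: str
    properties: tuple  # tuple of PropertyType


def to_json(vt: VertexType) -> str:
    """Render the abstract schema as JSON."""
    return json.dumps(
        {
            "label": vt.label,
            "properties": [
                {"name": p.name, "datatype": p.datatype} for p in vt.properties
            ],
        },
        sort_keys=True,
    )


def from_json(text: str) -> VertexType:
    """Read the JSON rendering back into the abstract schema."""
    raw = json.loads(text)
    props = tuple(PropertyType(p["name"], p["datatype"]) for p in raw["properties"])
    return VertexType(raw["label"], props)


def to_yaml(vt: VertexType) -> str:
    """Render the same schema as simple YAML text (hand-written emitter)."""
    lines = [f"label: {vt.label}", "properties:"]
    for p in vt.properties:
        lines.append(f"  - name: {p.name}")
        lines.append(f"    datatype: {p.datatype}")
    return "\n".join(lines) + "\n"


person = VertexType(
    "Person",
    (PropertyType("name", "string"), PropertyType("age", "integer")),
)
```

Because both renderings are derived from the same abstract value, converting
between them amounts to parsing one syntax and emitting the other.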

Btw, at some point you'll see a schema for property graph features appear
in the branch -- a kind of TP4 successor to Graph.Features
<https://tinkerpop.apache.org/javadocs/current/full/org/apache/tinkerpop/gremlin/structure/Graph.Features.html>.
This will be a small language for declaring the specific refinement of APG
/ the TinkerPop data model which is supported by a given property graph
implementation. That will help you understand not only a single graph, but
also the characteristic class of graphs for a given vendor, adapter, etc.
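
To give a feel for what such a declaration could enable (purely a sketch:
the feature language does not exist yet, and the Features class and its
field names below are assumptions made for illustration), one could compare
what a schema requires against what an implementation declares:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Features:
    """Hypothetical set of data model features; field names are illustrative."""
    multi_properties: bool = False
    meta_properties: bool = False
    hyperedges: bool = False


def unsupported(required: Features, supported: Features) -> list:
    """Return the names of features that are required but not supported."""
    return [
        f.name
        for f in fields(Features)
        if getattr(required, f.name) and not getattr(supported, f.name)
    ]


# A hypothetical vendor declaration and the needs of a hypothetical schema.
vendor = Features(multi_properties=True, meta_properties=True)
schema_needs = Features(multi_properties=True, hyperedges=True)

missing = unsupported(schema_needs, vendor)
```

Here `missing` lists the features the schema relies on that the declared
implementation does not provide.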

Josh




On Thu, Jun 3, 2021 at 11:43 AM pieter gmail <pieter.mar...@gmail.com>
wrote:

> Hi,
>
> I kinda lost track of what we discussed previously.
> Did we come to a decision regarding what language we are going to use to
> describe the structure of the graph?
>
> YAML, XSD, UML, YANG, or some category-theory-based language?
>
> From my understanding this would be the biggest change in tp4. A TinkerPop
> graph will no longer be a tangle of endless vertices and edges but instead
> can, optionally, be well defined and constrained. This way an engineer can,
> long after the original creators of a graph have left, immediately
> understand the graph, without needing to write a single query.
>
> Thanks
> Pieter
>
>
>
>
> On Thu, 2021-06-03 at 09:59 -0700, Joshua Shinavier wrote:
>
> Hi Pieter,
>
>
> On Thu, Jun 3, 2021 at 9:40 AM pieter gmail <pieter.mar...@gmail.com>
> wrote:
>
> Hi,
>
> Just to understand a bit better what's going on.
>
> Did you hand write the Dragon YAML with the ANTLR grammar as input?
>
>
>
> Yes, the YAML was written by hand, and based pretty closely on Gremlin.g4.
> You can see Stephen's ANTLR definitions inline with the YAML as comments. I
> also took some direction from the Java API.
>
>
>
>
> Did you generate the Java classes from the YAML using Dragon or
> something else?
>
>
>
> Yes, the Java classes are currently generated using Dragon. I'm limiting
> the generated code to Java for now (other possible targets being Scala and
> Haskell) just to keep diffs to a reasonable size, and because a new,
> open-source solution is needed to replace Dragon. My current thinking is
> that the new transformation framework will be separate from TinkerPop, as
> it will serve non-graph as well as graph use cases. For now, you can think
> of the code generation as a bootstrapping strategy.
>
> Josh
>
>
>
>
>
> Thanks
> Pieter
>
> On Thu, 2021-06-03 at 07:48 -0700, Joshua Shinavier wrote:
> > Hello all,
> >
> > I would like to take some concrete steps toward the TinkerPop 4
> > interoperability goals I've stated a few times (e.g. see TinkerPop 2020
> > <https://www.slideshare.net/joshsh/tinkerpop-2020> from last year). At a
> > meetup <https://www.meetup.com/Category-Theory/events/277331504/> a couple
> > of months ago, I demonstrated an approach for generating TinkerPop APIs
> > consistently into different languages. I have started to check in some of
> > that generated code in a branch (see my commits here
> > <https://github.com/apache/tinkerpop/commits/TINKERPOP-2563-language/gremlin-language>)
> > and add bits and pieces for RDF support, as well.
> >
> > The Apache Software Foundation asks us to discuss any significant changes
> > to the code base on the dev list. Since these steps toward TP4 will be
> > major changes if and when they are merged into the master branch, I will
> > start discussing them here. Expect occasional emails from me about the
> > various things I will be doing in the branch. I absolutely invite comments,
> > feedback, and actual discussion on these design proposals, but even if it's
> > just me issuing self-affirming statements into the void like the King of
> > Pointland, I will just carry on, because that's how this process works.
> >
> > A brief summary of the changes so far:
> >
> >
> >    - *Abstract specification of Gremlin traversals*. I have turned
> >    Stephen's Gremlin.g4
> >    <https://github.com/apache/tinkerpop/blob/TINKERPOP-2563-language/gremlin-language/src/main/antlr4/Gremlin.g4>
> >    ANTLR grammar into an abstract specification of Gremlin traversal syntax
> >    using the Dragon (YAML-based) format. Unfortunately, it is looking very
> >    unlikely that Dragon will become available as open-source software, so you
> >    can expect this YAML format to change just slightly once we have a new
> >    Dragon-like tool for schema and data transformations. More on that later.
> >    Right now, the syntax specification can be found here
> >    <https://github.com/apache/tinkerpop/tree/TINKERPOP-2563-language/gremlin-language/src/main/yaml/org/apache/tinkerpop/gremlin/language/model>,
> >    although the file path might change in the future.
> >
> >
> >    - *Traversal DTOs*. Based on the abstract specification, I have
> >    generated Java classes for building and working with traversals. The
> >    generated files can currently be found here
> >    <https://github.com/apache/tinkerpop/tree/TINKERPOP-2563-language/gremlin-language/src/gen/java/org/apache/tinkerpop/gremlin/language/model>.
> >    These are essentially POJOs or DTO classes, with special boilerplate
> >    methods for equality, pattern matching over alternative constructors, and
> >    modification by copying (since the instances are immutable). These classes
> >    allow you to build traversals in a declarative way, while all of the logic
> >    for evaluating traversals goes elsewhere. Support for serialization and
> >    deserialization for traversals is to be added in the future -- and the same
> >    goes for all other classes generated in this way.
> >
> >
> >    - *RDF 1.1 concepts model*. RDF support was part of TinkerPop from the
> >    beginning, but it was de-emphasized for TinkerPop 3 due to other priorities
> >    such as OLAP. For years, developers have been asking us for better
> >    interoperability with RDF. While we do have some query-level support for
> >    RDF these days in sparql-gremlin, we no longer have any data-level support,
> >    e.g. supporting loading RDF data into a property graph and getting it back
> >    out, evaluating Gremlin traversals over RDF datasets, etc. These things are
> >    not especially hard to do, in certain limited ways, but our old approach of
> >    writing adapters like GraphSail
> >    <https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation>,
> >    SailGraph
> >    <https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation>, and
> >    PropertyGraphSail
> >    <https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation>
> >    in Java, with no support for other languages, does not seem appropriate for
> >    TinkerPop 4. Also, those early mappings were extremely underspecified in a
> >    formal sense -- good enough for some practical applications, but not good
> >    enough for anything requiring inference, optimization, or composition with
> >    other mappings. To that end, I am starting to add abstract specifications
> >    for RDF along the lines of the Gremlin specifications I described above.
> >    The first of these, a specification of RDF 1.1 Concepts, can currently be
> >    found here
> >    <https://github.com/apache/tinkerpop/blob/TINKERPOP-2563-language/gremlin-language/src/main/yaml/org/apache/tinkerpop/rdf/rdf11concepts.yaml>,
> >    with generated Java classes here
> >    <https://github.com/apache/tinkerpop/tree/TINKERPOP-2563-language/gremlin-language/src/gen/java/org/apache/tinkerpop/rdf/rdf11concepts>.
> >    This gives us a way of working with RDF data in a language-neutral and
> >    framework-neutral way (whereas we were previously tied to Java and to the
> >    RDF4j, nee Sesame, API). Mappings into and out of RDF will be defined with
> >    respect to these abstract types, which can easily be adapted to native RDF
> >    APIs in whatever language you happen to be working in.
> >
> >
> > I will write more about the above topics as time goes by and I continue
> > adding code to the branch. Happy to answer any questions or discuss any
> > feedback in the meantime.
> >
> > Josh
>
>
>
>
