Re: [Discussed] Integrating SPARQL-Gremlin 0.2 Plugin with the TinkerPop codebase

Stephen Mallette Tue, 06 Feb 2018 13:59:21 -0800

Thanks for your review of my concern with the transpiling for GROUP.  So
there are two aspects to your reply that I'd like to discuss. First, the
specific issue with GROUP that I'm seeing is that it's simply choosing the
last variable given in the GROUP


https://github.com/apache/tinkerpop/blob/74b568a8babb8b52b790767e7bb05f462dc5c5f0/sparql-gremlin/src/main/java/org/apache/tinkerpop/gremlin/sparql/SparqlToGremlinTranspiler.java#L139-L141

so when you do this (which i think is legitimate SPARQL - i'm still
learning):

SELECT ?age ?name (COUNT(?name) AS ?name_count)
WHERE {
  ?a e:created ?b .
  ?a v:name ?name .
  ?a v:age ?age .
}
GROUP BY ?age ?name

you will get back a grouping on "name" and if you transpose the variables
in that last line to:

GROUP BY ?name ?age

then you get a grouping on "age".  That doesn't seem right to me. Now, that
leads me to the second issue I want to bring up. Perhaps GROUP with
multiple variables is something we don't support yet. Perhaps there's a long
list of other SPARQL capabilities that we aren't quite ready to provide full
transpiling for. Moreover, perhaps there are certain SPARQL statements that
just aren't possible to translate to Gremlin at all. I think we need to do
something smart in those cases. We don't want situations like the one I
presented with GROUP, where the query transpiles to Gremlin but doesn't
actually do what the SPARQL query intended. I feel like we need to do two
things with respect to this:

1. If we can't transpile the SPARQL, we throw an
UnsupportedOperationException with a nice error message that says why the
user's SPARQL didn't transpile (i.e. what don't we support that they tried
to pass through)
2. We document the boundaries of what we do support and what our
limitations are.
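
For (1), here is a minimal sketch of the kind of guard I have in mind, in
plain Java. The class and method names (GroupByGuard, singleGroupKey) are
hypothetical and not part of sparql-gremlin; the real check would live
wherever the transpiler reads the GROUP BY variables. It assumes we reject a
multi-variable GROUP BY outright rather than silently keeping the last
variable:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of sparql-gremlin.
final class GroupByGuard {

    private GroupByGuard() {}

    // Returns the single GROUP BY variable, or fails with a message that
    // names the unsupported construct instead of dropping variables.
    static String singleGroupKey(final List<String> groupVars) {
        if (groupVars.isEmpty())
            throw new IllegalArgumentException("GROUP BY requires at least one variable");
        if (groupVars.size() > 1)
            throw new UnsupportedOperationException(
                    "GROUP BY with multiple variables is not supported: " + groupVars);
        return groupVars.get(0);
    }

    public static void main(final String[] args) {
        // A single variable passes through unchanged.
        System.out.println(singleGroupKey(Arrays.asList("age")));
        try {
            // Two variables: fail loudly rather than group on the last one.
            singleGroupKey(Arrays.asList("age", "name"));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point is only that the user sees which part of their SPARQL we don't
handle, rather than getting back a result for a different query.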

Any thoughts on all that?

> Dharmen and I did check your corrections and comments in the code. We
> found them appropriate.

That's good to know. Thanks.




On Tue, Feb 6, 2018 at 3:36 PM, Harsh Thakkar <[email protected]> wrote:

> Hi Stephen,
>
> Apologies for being quiet for some time. I have been down with severe flu
> and just recovered. I looked into the order by issue, and the reason for
> having only an aggregation variable in the select clause is SPARQL itself.
> SPARQL does not support projecting any variable other than the one being
> used in group by. One could write such a SPARQL query; however, it would
> be incorrect and could not be parsed by any SPARQL query processor.
>
> For instance,
>
>  select ?unitOnOrder
>   where {
>                   ?a v:label "product" .
>                   ?a v:name ?name .
>                   ?a v:unitsOnOrder ?unitOnOrder .
>   } GROUP BY (?unitOnOrder)
>
> the above query will be valid and return an appropriate response, whereas:
>
>  select ?name
>   where {
>                   ?a v:label "product" .
>                   ?a v:name ?name .
>                   ?a v:unitsOnOrder ?unitOnOrder .
>   } GROUP BY (?unitOnOrder)
>
> OR
>
>  select ?unitOnOrder ?name
>   where {
>                   ?a v:label "product" .
>                   ?a v:name ?name .
>                   ?a v:unitsOnOrder ?unitOnOrder .
>   } GROUP BY (?unitOnOrder)
>
>
> would throw the following exception:
> org.apache.jena.query.QueryParseException: Non-group key variable in SELECT
> This is specific to Jena, but any other SPARQL processor would throw a
> similar exception, since such a query is beyond the formal definition of
> the SPARQL query language.
> Thus, I believe we will have to live with it.
>
> What else is happening on your side? Please let me know where I can get in
> and be of help. Dharmen and I did check your corrections and comments in
> the code. We found them appropriate.
> We will continue to go through and add more comments if need be, especially
> me. I haven't been able to work much on this due to ill health. But I am
> back now.
>
>
> On 2018/01/29 16:09:50, Stephen Mallette <[email protected]> wrote:
> > > SPARQL 1.1 test suite could be used
> >
> > thanks josh - will need to look into that further
> >
> > Harsh, the plugin is pushed at this point. After building we can now do:
> >
> > gremlin> :install org.apache.tinkerpop sparql-gremlin 3.3.2-SNAPSHOT
> > ==>Loaded: [org.apache.tinkerpop, sparql-gremlin, 3.3.2-SNAPSHOT]
> > gremlin> :plugin use tinkerpop.sparql
> > ==>tinkerpop.sparql activated
> >
> > so that's good. i also added a bit of asciidoc for sparql-gremlin. you can
> > append whatever documentation you write to that. i will probably step away
> > from this branch for a bit to work on other things and give you a chance to
> > get familiar with the changes i've pushed so far.
> >
> > On Sun, Jan 28, 2018 at 11:21 AM, Joshua Shinavier <[email protected]>
> > wrote:
> >
> > > For testing, perhaps the SPARQL 1.1 test suite could be used:
> > >
> > >     https://www.w3.org/2009/sparql/docs/tests
> > >
> > > This would provide a strong guarantee of coverage and correctness of
> > > supported features. The metadata about required features for individual
> > > tests is limited, so an appropriate subset of the test cases would need to
> > > be hand-picked according to that coverage.
> > >
> > >
> > >
> > > On Sun, Jan 28, 2018 at 7:20 AM, Stephen Mallette <[email protected]>
> > > wrote:
> > >
> > > > Harsh, on Friday, I pushed a great many changes to the TINKERPOP-1878
> > > > branch. I got quite familiar with the code and even fixed up a bug in ORDER
> > > > where it wasn't properly handling multiple fields passed to it. I believe
> > > > that there are similar problems in GROUP. At this point, I've got most of
> > > > what Marko suggested in place, but I'm not especially happy with it - the
> > > > java generics are giving me a hard time...kinda ugly. I've added a
> > > > reasonable level of javadoc and inline comments, but you may want to review
> > > > and include more if I've missed something or mis-stated some intent. The
> > > > test suite is still a bit of a mystery to me. I'm not quite sure how we
> > > > should go about dealing with that. At this point, I'm basically using code
> > > > coverage as a guide to drive the tests written, but I'm not sure if that's
> > > > effective in the big picture. I don't have a plugin working at this point.
> > > > I only got all that stuff i tweeted to work in the console because I
> > > > installed it all by hand. That still needs to be done.
> > > >
> > > > Here's my suggestion for how we proceed forward:
> > > >
> > > > 1. You start by pulling my latest changes to your fork for review. I
> > > > changed a lot of things - renaming, refactoring, removing dead code, etc.
> > > > You should get familiar with what's there and let me know if I did anything
> > > > dumb.
> > > > 2. Perhaps you look at the issue I think that I see with GROUP (which is
> > > > basically identical to ORDER in that it only accepts the last field as a
> > > > GROUPing...i don't think that's right).
> > > > 3. Perhaps you could also think about writing some documentation that
> > > > explains the support TinkerPop has for SPARQL - describe the aspects of
> > > > SPARQL that we support and any limitations that we have in that support.
> > > > 4. I will work on the plugin and get that working early this coming
> > > > week.
> > > > 5. I will also keep thinking about testing - i still don't think that the
> > > > approach I have is sufficient. If you have ideas about that, please let me
> > > > know.
> > > >
> > > > How does that sound?
> > > >
> > > > btw, note that i had to do a bit of trickery to get the sparql-gremlin
> > > > stuff to work in the console for that screenshot i posted on twitter.
> > > > obviously, without the plugin things don't work too easily. i had to
> > > > manually install all the dependencies to the console to get all that to
> > > > work. again, that should be resolved early this coming week and then it can
> > > > be easily imported to the console and server.
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, Jan 25, 2018 at 4:58 PM, Stephen Mallette <[email protected]>
> > > > wrote:
> > > >
> > > > > Marko had a nice idea with:
> > > > >
> > > > > gremlin> sparql = graph.traversal(SPARQLTraversalStrategy.class).withRemote("127.0.0.2")
> > > > > gremlin> sparql.query("SELECT ?x ?y WHERE {…}").toList()
> > > > > ==>{?x:marko, ?y:29}
> > > > > ==>{?x:josh, ?y:32}
> > > > >
> > > > > The problem i'm seeing is that it requires that the TraversalSource on the
> > > > > server be a SparqlTraversalSource because when it gets to the server it
> > > > > ends up trying to deserialize the bytecode into a GraphTraversalSource.
> > > > > Now, that's exactly how a DSL would work, but a DSL would start with an
> > > > > existing start step such as V() or E(), but not constant() which is what
> > > > > SparqlTraversalSource is sending with the sparql query in it. I might be
> > > > > not thinking of something right in how he expected to implement it, but I
> > > > > came up with a reasonably simple workaround - I added an empty inject()
> > > > > step before the constant() so that the GraphTraversalSource will be used.
> > > > > Both of these steps will be wholly replaced by the transpiled traversal
> > > > > when the SparqlStrategy executes and we thus get:
> > > > >
> > > > > gremlin> graph = EmptyGraph.instance()
> > > > > ==>emptygraph[empty]
> > > > > gremlin> cluster = Cluster.open()
> > > > > ==>localhost/127.0.0.1:8182
> > > > > gremlin> g = graph.traversal(SparqlTraversalSource.class).
> > > > > ......1>                 withStrategies(SparqlStrategy.instance()).
> > > > > ......2>                 withRemote(DriverRemoteConnection.using(cluster))
> > > > > ==>sparqltraversalsource[emptygraph[empty], standard]
> > > > > gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }")
> > > > > ==>[name:marko,age:29]
> > > > > ==>[name:vadas,age:27]
> > > > > ==>[name:josh,age:32]
> > > > > ==>[name:peter,age:35]
> > > > >
> > > > > Treating sparql-gremlin as a DSL really seems like the best way to get
> > > > > this all working - especially since it already is! :)  To get the same
> > > > > pattern going with GLVs we would only need to make use of the DSL patterns
> > > > > which already exist. Anyway, it's nice to have these basic premises nailed
> > > > > down in code to ensure the ideas were sound. Please let me know if you have
> > > > > any thoughts....
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jan 25, 2018 at 2:37 PM, Stephen Mallette <[email protected]>
> > > > > wrote:
> > > > >
> > > > >> Check this out:
> > > > >>
> > > > >> gremlin> graph = TinkerFactory.createModern()
> > > > >> ==>tinkergraph[vertices:6 edges:6]
> > > > >> gremlin> g = graph.traversal(SparqlTraversalSource.class).
> > > > >> ......1>           withStrategies(SparqlStrategy.instance())
> > > > >> ==>sparqltraversalsource[tinkergraph[vertices:6 edges:6], standard]
> > > > >> gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }")
> > > > >> ==>[name:marko,age:29]
> > > > >> ==>[name:vadas,age:27]
> > > > >> ==>[name:josh,age:32]
> > > > >> ==>[name:peter,age:35]
> > > > >>
> > > > >> The work is horribly hacked together at the moment and I've not pushed it
> > > > >> to the development branch yet, but that's the general idea for how
> > > > >> sparql-gremlin will be used based on what we talked about earlier in this
> > > > >> thread. Pretty neat?
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Wed, Jan 24, 2018 at 3:38 PM, Stephen Mallette <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >>> I just wanted to quickly note that sparql-gremlin is now building
> > > > >>> properly on the TINKERPOP-1878 branch (i just pushed some changes to clean
> > > > >>> up some pom.xml/dependency conflict issues). As we discussed in this
> > > > >>> thread, the branch currently contains a fairly bare bones model and it will
> > > > >>> need some work to get it complete enough for it to be considered for merge
> > > > >>> to a release branch. In a way that's good, because it will give the
> > > > >>> community a chance to shape exactly how sparql-gremlin will work.
> > > > >>>
> > > > >>> On Tue, Jan 9, 2018 at 10:47 AM, Harsh Thakkar <[email protected]>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Hi Stephen,
> > > > >>>>
> > > > >>>> It does make sense to me. The work is going on slow but steady. Let's
> > > > >>>> wait and see how other devs feel about this, as you said.
> > > > >>>>
> > > > >>>> Cheers,
> > > > >>>> Harsh
> > > > >>>> On 2018-01-09 16:31, Stephen Mallette <[email protected]> wrote:
> > > > >>>> > I've had some thoughts on this thread since December. Since sparql-gremlin
> > > > >>>> > has a pretty long to-do list and there is likely a lot of discussion
> > > > >>>> > required on this list prior to it being ready for merge to a release
> > > > >>>> > branch, it seems like we might treat this as a normal feature under
> > > > >>>> > development. I think we should just merge it to a development branch in the
> > > > >>>> > TinkerPop repository and then collaborate on it from there. We've taken
> > > > >>>> > similar approaches with other "long term" pull requests which has allowed
> > > > >>>> > the code to develop as it typically would. I'm thinking that's a
> > > > >>>> > better approach than a "big-bang" pull request.
> > > > >>>> >
> > > > >>>> > Harsh, if that's ok with you, feel free to issue your PR against master and
> > > > >>>> > I'll get it setup against a development branch on our end (no rush, please
> > > > >>>> > give it a few days to see if everyone is ok with that approach).
> > > > >>>> >
> > > > >>>> > On Mon, Dec 18, 2017 at 5:16 PM, Stephen Mallette <[email protected]>
> > > > >>>> > wrote:
> > > > >>>> >
> > > > >>>> > > > Should I also remove the northwind file?
> > > > >>>> > >
> > > > >>>> > > I think I'd prefer to see all of our sparql examples use the existing toy
> > > > >>>> > > graphs - better not to add more options - so I'd remove it as well. If
> > > > >>>> > > anyone disagrees, I don't really feel too strongly about not including it,
> > > > >>>> > > but it would be good to hear some reasoning as to why the existing datasets
> > > > >>>> > > that we already package are insufficient for users to learn with.
> > > > >>>> > >
> > > > >>>> > > > I will need some help (quite possibly) with getting things right as far
> > > > >>>> > > as the DSL pattern for the gremlin language variants is concerned.
> > > > >>>> > >
> > > > >>>> > > We can help point you in the right direction when you get stuck or need to
> > > > >>>> > > clarify things. If you get really stuck, we can move to step 2 and have you
> > > > >>>> > > issue a PR sooner than later and we'll just merge what you have to a
> > > > >>>> > > development branch so others can collaborate with you on it more easily.
> > > > >>>> > > Let's see how things develop.
> > > > >>>> > >
> > > > >>>> > > > Also, since you are very well versed in the test suite, I would also
> > > > >>>> > > request some assistance for the same when we are there :) as it is our
> > > > >>>> > > first time pushing a work to the production level. So bear with us :)
> > > > >>>> > >
> > > > >>>> > > no worries. i will need to think on the testing approach. my thinking will
> > > > >>>> > > be focused on what i would call integration tests i.e. tests that evaluate
> > > > >>>> > > sparql-gremlin across the entire stack. I don't imagine that you need my
> > > > >>>> > > input to write some unit tests to validate the workings of your current
> > > > >>>> > > code though.
> > > > >>>> > >
> > > > >>>> > > > One question, though there is not a strict deadline, when is the 3.3.2
> > > > >>>> > > release planned?
> > > > >>>> > >
> > > > >>>> > > We have no timeline on 3.3.2 at this point (we are just in the process of
> > > > >>>> > > releasing 3.3.1 so it will be a while before we see 3.3.2). I think the
> > > > >>>> > > merging of gremlin-javascript will likely trigger that release, i would
> > > > >>>> > > guess no earlier than February 2018 if all goes right with that. I also
> > > > >>>> > > don't mean to make it sound like sparql-gremlin needs to be part of that
> > > > >>>> > > release, so if it's not ready then, it's not ready and it releases with
> > > > >>>> > > 3.3.3. You'll find that with TinkerPop, we tend to release when software is
> > > > >>>> > > "ready" and not by setting long range time deadlines for ourselves. So,
> > > > >>>> > > don't worry about when we release sparql-gremlin too much. Let's stay
> > > > >>>> > > focused on just getting the code right.
> > > > >>>> > >
> > > > >>>> > > Thanks for your understanding.
> > > > >>>> > >
> > > > >>>> > >
> > > > >>>> > >
> > > > >>>> > >
> > > > >>>> > > On Mon, Dec 18, 2017 at 5:01 PM, Harsh Thakkar <[email protected]> wrote:
> > > > >>>> > >
> > > > >>>> > >> Hello Stephen,
> > > > >>>> > >>
> > > > >>>> > >> Alright, I will remove the bsbm file from the repository and refer to
> > > > >>>> > >> it in the docs (with some examples), sharing a link to download it from the
> > > > >>>> > >> website if that is acceptable. No worries.
> > > > >>>> > >> Should I also remove the northwind file?
> > > > >>>> > >>
> > > > >>>> > >>
> > > > >>>> > >> Your expectations are reasonable, it was just that I wasn't very clear
> > > > >>>> > >> about what needs to be done. Now it is pretty much clear. It will take some
> > > > >>>> > >> time for me to wrap my head around the specifics of the tinkerpop codebase
> > > > >>>> > >> in order to satisfy the 3 requirements. I will need some help (quite
> > > > >>>> > >> possibly) with getting things right as far as the DSL pattern for the
> > > > >>>> > >> gremlin language variants is concerned. I am already reading the dev-docs
> > > > >>>> > >> on this, from here:
> > > > >>>> > >> http://tinkerpop.apache.org/docs/current/reference/#dsl
> > > > >>>> > >>
> > > > >>>> > >> Also, since you are very well versed in the test suite, I would also
> > > > >>>> > >> request some assistance for the same when we are there :) as it is our
> > > > >>>> > >> first time pushing a work to the production level. So bear with us :)
> > > > >>>> > >>
> > > > >>>> > >> I agree with you on not having any API shifts, this does certainly not
> > > > >>>> > >> give a good impression, also it's a lot of effort down the drain. Quality
> > > > >>>> > >> must be ensured.
> > > > >>>> > >>
> > > > >>>> > >> One question, though there is not a strict deadline, when is the 3.3.2
> > > > >>>> > >> release planned?
> > > > >>>> > >>
> > > > >>>> > >> Cheers,
> > > > >>>> > >> Harsh
> > > > >>>> > >>
> > > > >>>> > >>
> > > > >>>> > >> On 2017-12-18 20:48, Stephen Mallette <[email protected]> wrote:
> > > > >>>> > >> > A quick note about (4) - Having some sample data for user convenience is
> > > > >>>> > >> > good. Files like that though should not be "resources", but should be added
> > > > >>>> > >> > here:
> > > > >>>> > >> >
> > > > >>>> > >> > https://github.com/harsh9t/tinkerpop/tree/master/data
> > > > >>>> > >> >
> > > > >>>> > >> > Placing those files there will allow them to be included in the .zip
> > > > >>>> > >> > distribution files we produce for Gremlin Console and Gremlin Server. Now,
> > > > >>>> > >> > that BSBM file is a bit much. It's 90M in size and 22M compressed to zip.
> > > > >>>> > >> > Either way, that's going to push our already large zip distributions bigger
> > > > >>>> > >> > than they should be. I don't think the value of this file is worth
> > > > >>>> > >> > that. We can definitely make it available as a separate download from the
> > > > >>>> > >> > web site if everyone thinks it's that important and then provide links to
> > > > >>>> > >> > it, but I don't think it should be in the source repository as it is now.
> > > > >>>> > >> >
> > > > >>>> > >> > Aside from (4) I just wanted to make some general points about my
> > > > >>>> > >> > expectations for sparql-gremlin being part of a TinkerPop release branch.
> > > > >>>> > >> > Apologies if this wasn't clear from when we started. I think we need to see
> > > > >>>> > >> > sparql-gremlin as close to a final form as possible before we look to merge
> > > > >>>> > >> > it. By "final" I mean:
> > > > >>>> > >> >
> > > > >>>> > >> > 1. sparql-gremlin has a full test suite - that means good unit test
> > > > >>>> > >> > coverage at a minimum and integration tests as necessary (and I sense they
> > > > >>>> > >> > will be necessary). I agree with marko, that we also have to consider the
> > > > >>>> > >> > testing pattern carefully, so that we set the stage properly for future
> > > > >>>> > >> > languages.
> > > > >>>> > >> > 2. sparql-gremlin has a clear and easy method of usage that is consistent
> > > > >>>> > >> > with how TinkerPop works - this is crucial prior to merge because TinkerPop
> > > > >>>> > >> > has high profile production usage. once merged sparql-gremlin will
> > > > >>>> > >> > immediately be consumed by users and we will not want to shift that API
> > > > >>>> > >> > once it is available. we will break the code of too many people if we do
> > > > >>>> > >> > that. we need to strive to get this right from the start.
> > > > >>>> > >> > 3. sparql-gremlin has a good body of user documentation.
> > > > >>>> > >> >
> > > > >>>> > >> > I don't think any of this is insurmountable, but it does mean there is a
> > > > >>>> > >> > fair bit of work to do and it won't happen overnight. We held
> > > > >>>> > >> > gremlin-dotnet to the same rigorous level before merging and even
> > > > >>>> > >> > gremlin-javascript all these months later is still not merged for basically
> > > > >>>> > >> > the same reasons, so this is just the process that we tend to go through.
> > > > >>>> > >> > If we follow what we did for the GLVs, we will likely follow this basic
> > > > >>>> > >> > process:
> > > > >>>> > >> >
> > > > >>>> > >> > 1. You get sparql-gremlin "pretty close" to final in your fork
> > > > >>>> > >> > 2. Once we all agree that you are "pretty close", you offer the pull request
> > > > >>>> > >> > 3. We merge it into a TinkerPop branch for further evaluation (this will be
> > > > >>>> > >> > a development branch and not a release branch)
> > > > >>>> > >> > 4. We work together to get the development branch "final"
> > > > >>>> > >> > 5. We issue a pull request from that development branch
> > > > >>>> > >> > 6. The pull request goes through the standard review/vote process and we
> > > > >>>> > >> > merge to a release branch.
> > > > >>>> > >> > 7. sparql-gremlin will likely be part of 3.3.2 release
> > > > >>>> > >> >
> > > > >>>> > >> > I hope that makes sense.
> > > > >>>> > >> >
> > > > >>>> > >> >
> > > > >>>> > >> > On Mon, Dec 18, 2017 at 12:26 PM, Marko Rodriguez <[email protected]>
> > > > >>>> > >> > wrote:
> > > > >>>> > >> >
> > > > >>>> > >> > > Actually, my (3) is bad. Given that query() would always return a
> > > > >>>> > >> > > Traversal<S, Map<String,E>>, it would be necessary to have that linearized
> > > > >>>> > >> > > to Traversal<Vertex,Vertex> for the test suite to validate it. That would
> > > > >>>> > >> > > mean making SPARQLTraversal support extended Traversal methods like
> > > > >>>> > >> > > flatMap(), blah, blah… That seems excessive, though convenient.
> > > > >>>> > >> > >
> > > > >>>> > >> > > Hm…… Thoughts?,
> > > > >>>> > >> > > Marko.
> > > > >>>> > >> > >
> > > > >>>> > >> > > http://markorodriguez.com
> > > > >>>> > >> > >
> > > > >>>> > >> > >
> > > > >>>> > >> > >
> > > > >>>> > >> > > > On Dec 18, 2017, at 9:21 AM, Marko Rodriguez <[email protected]> wrote:
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > Hello,
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > A couple of items worth considering.
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > Regarding (7), that should be done prior to master/ merge. It is
> > > > >>>> > >> > > > necessary to follow the patterns that are established in TinkerPop
> > > > >>>> > >> > > > regarding language interoperability. The DSL pattern developed for Gremlin
> > > > >>>> > >> > > > language variants seems to be the best pattern for distinct languages as
> > > > >>>> > >> > > > well. In essence, if your language is not a fluent language, and instead,
> > > > >>>> > >> > > > uses a String, then it should be wrapped as such in a fluent interface
> > > > >>>> > >> > > > using all the Strategy, Step, and Traversal methods that make sense so it
> > > > >>>> > >> > > > works within the larger infrastructure of TinkerPop (e.g. testing! - see
> > > > >>>> > >> > > > below). What I proposed in my previous email seems the easiest and cleanest
> > > > >>>> > >> > > > way to do things.
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > Regarding (3), testing is crucial. Given that this would be TinkerPop’s
> > > > >>>> > >> > > > first distinct language, we don’t have a pattern set forth for testing.
> > > > >>>> > >> > > > However, this doesn’t mean we can’t improvise on our current model. Off the
> > > > >>>> > >> > > > top of my head, perhaps the best way would be to follow the
> > > > >>>> > >> > > > ProcessTestSuite and do the SPARQL variants of those. For instance:
> > > > >>>> > >> > > >
> > > > >>>> > >> > > >       https://github.com/apache/tinkerpop/blob/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/VertexTest.java#L62
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > The SPARQL test version would be:
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > @Override
> > > > >>>> > >> > > > public Traversal<Vertex, Vertex> get_g_VX1X_out(final Object v1Id) {
> > > > >>>> > >> > > >   return sparql.query("SELECT ?x WHERE {" + toURI(v1Id) + " ?a ?x }");
> > > > >>>> > >> > > > }
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > In this way, sparql is your SPARQLTraversalSource for each test and
> > > > >>>> > >> > > > query() will return a Traversal typed accordingly (query() will have to have
> > > > >>>> > >> > > > solid generic support). From there, you would implement each and every test
> > > > >>>> > >> > > > that is semantically possible with SPARQL (where SPARQL won’t be able to
> > > > >>>> > >> > > > semantically cover all Gremlin tests).
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > Stephen has done a lot of recent work to generalize our test suite out
> > > > >>>> > >> > > > of Java so it is in a language agnostic form. I haven’t been following that
> > > > >>>> > >> > > > work so I’m not sure what I am saying above is exactly as it should be
> > > > >>>> > >> > > > done, but it is a start.
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > HTH,
> > > > >>>> > >> > > > Marko.
> > > > >>>> > >> > > >
> > > > >>>> > >> > > > http://markorodriguez.com
> > > > >>>> > >> > > >
> > > > >>>> > >> > > >
> > > > >>>> > >> > > >
> > > > >>>> > >> > > >> On Dec 18, 2017, at 7:43 AM, Harsh Thakkar <[email protected]> wrote:
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> Hi Stephen and All,
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> Thanks for going through the code. I address your questions below (in
> > > > >>>> > >> > > >> the same order):
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 1. Yes, this file can be removed. It was just to test the traversal
> > > > >>>> > >> > > >> method.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 2. Yes, I have commented out the block of tests at this moment since we
> > > > >>>> > >> > > >> do not need to run tests at mvn clean install time. However, I kept it (in
> > > > >>>> > >> > > >> commented out form) in case a need arose in future for the same. It can
> > > > >>>> > >> > > >> surely be removed if you think it won't be necessary.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 3. There were two testing units (we continued them
> from
> > > > >>>> Daniel's
> > > > >>>> > >> > > version), one to check whether the prefixes are being
> > > encoded
> > > > >>>> > >> correctly,
> > > > >>>> > >> > > the second one is to test whether the generated
> traversal
> > > is
> > > > >>>> correct
> > > > >>>> > >> (in
> > > > >>>> > >> > > short the compiler is functioning as it should).
> Since, we
> > > > >>>> extended
> > > > >>>> > >> > > previous work supporting a variety of SPARQL operators,
> > > more
> > > > >>>> test
> > > > >>>> > >> cases can
> > > > >>>> > >> > > be added to validate that each of these is functioning
> as
> > > > >>>> expected.
> > > > >>>> > >> > > However, as I mentioned in point #2. we need not do it
> > > > >>>> explicitly as
> > > > >>>> > >> we
> > > > >>>> > >> > > (Dharmen and I) have already tested them on 3-4
> different
> > > > >>>> datasets and
> > > > >>>> > >> > > query-sets. Now, since we did not know if that was
> going to
> > > > be
> > > > >>>> > >> formally
> > > > >>>> > >> > > required in the future or not, we left them as they are,
> just
> > > > >>>> commented
> > > > >>>> > >> out.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 4. These resources are the graphml files that we
> wish to
> > > > >>>> provide
> > > > >>>> > >> the
> > > > >>>> > >> > > users, for (i) loading and querying famous datasets -
> the
> > > > >>>> Berlin
> > > > >>>> > >> SPARQL
> > > > >>>> > >> > > Benchmark (BSBM)  (famous in the Semantic Web-RDF
> > > community)
> > > > >>>> so that
> > > > >>>> > >> they
> > > > >>>> > >> > > do not have to look elsewhere for the same. (ii) Also,
> it
> > > > >>>> provides a
> > > > >>>> > >> strong
> > > > >>>> > >> > > use-case for demonstrating the applicability of
> > > > sparql-gremlin
> > > > >>>> > >> (creates
> > > > >>>> > >> > > trust in the SW community users) and (iii) to keep the
> > > > plug-in
> > > > >>>> pretty
> > > > >>>> > >> much
> > > > >>>> > >> > > self-dependent.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 5 & 6  YES, damn it. The IDE did this. I will revert
> > > these
> > > > >>>> changes.
> > > > >>>> > >> > > It's like when you are not looking, the IDE does
> things on
> > > its
> > > > >>>> own :-/
> > > > >>>> > >> > > apologies!
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 7. Regarding, Marko's thoughts -- Yes, I was
> waiting for
> > > > >>>> you to
> > > > >>>> > >> reply
> > > > >>>> > >> > > to the thread. I do have some thoughts on this. But
> first,
> > > I
> > > > >>>> was
> > > > >>>> > >> wondering
> > > > >>>> > >> > > if this (what Marko suggested) is supposed to be
> entirely
> > > > >>>> implemented
> > > > >>>> > >> in
> > > > >>>> > >> > > the current version of sparql-gremlin 0.2, i.e.
> including
> > > the
> > > > >>>> > >> > > withStrategies(), withoutStrategies() and remote()
> > > features,
> > > > >>>> or it is
> > > > >>>> > >> to be
> > > > >>>> > >> > > supported eventually (after the sparql-gremlin 0.2.0
> > > plugin
> > > > is
> > > > >>>> > >> rolled out).
> > > > >>>> > >> > > Also, I am not entirely sure I got what Marko was
> exactly
> > > > >>>> suggesting.
> > > > >>>> > >> I
> > > > >>>> > >> > > bring this to light in the in-line style reply to
> Marko's
> > > > >>>> comment
> > > > >>>> > >> later
> > > > >>>> > >> > > here.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> The current implementation is more of a typical
> > > compiler;
> > > > >>>> > >> users,
> > > > >>>> > >> > > however, can use it by specifying the query file and
> the
> > > > >>>> dataset
> > > > >>>> > >> against
> > > > >>>> > >> > > which it is to be executed via the command (once in the
> > > > gremlin
> > > > >>>> > >> shell):
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> gremlin> graph = TinkerGraph.open(..)
> > > > >>>> > >> > > >> gremlin> SparqlToGremlinCompiler.convertToGremlinTraversal(graph, "SELECT ?a WHERE {....} ")
> > > > >>>> > >> > > >> ==>{?x:marko, ?y:29}
> > > > >>>> > >> > > >> ==>{?x:josh, ?y:32}
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> i.e. load a graph using pre-defined tinkerpop
> methods (
> > > > >>>> > >> > > graph.io(IoCore.gryo()).readGraph(graphName),
> > > > >>>> > >> > > TinkerGraph.open(), etc.), then execute the traversal
> as
> > > > >>>> above with
> > > > >>>> > >> > > arguments -- (graph, queryString), where queryString =
> > > > "SPARQL
> > > > >>>> query".
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> Now let me quote Marko's comment and reply in-line
> to
> > > > bring
> > > > >>>> more
> > > > >>>> > >> > > clarity:
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 1. There should be a SPARQLTraversalSource which
> > > supports
> > > > >>>> one spawn
> > > > >>>> > >> > > method -- query(String).
> > > > >>>> > >> > > >>      This is already happening inside the code.
> > > Therefore,
> > > > >>>> we do
> > > > >>>> > >> not
> > > > >>>> > >> > > need to mention it explicitly. Please correct me if I
> got
> > > it
> > > > >>>> wrong
> > > > >>>> > >> here.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 2. SPARQLTraversal is spawned and it only supports
> > > > the
> > > > >>>> > >> Traversal
> > > > >>>> > >> > > methods -- next(), toList(), iterate(), etc.
> > > > >>>> > >> > > >>      All traversal methods that are supported,
> available
> > > > to
> > > > >>>> a
> > > > >>>> > >> regular
> > > > >>>> > >> > > gremlin traversal, can be used by the sparql-gremlin
> > > compiler
> > > > >>>> > >> generated
> > > > >>>> > >> > > traversal as well.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 3. query(String) adds a ConstantStep(String).
> > > > >>>> > >> > > >>             This is happening internally (as shown
> in
> > > the
> > > > >>>> example
> > > > >>>> > >> > > above), but we can also make it explicit, i.e. let the user
> only
> > > > >>>> provide the
> > > > >>>> > >> > > queryString instead of the whole
> "SparqlToGremlinCompiler.
> > > > >>>> > >> > > convertToGremlinTraversal(graph, "SELECT ?a WHERE
> {....}
> > > ")"
> > > > >>>> command.
> > > > >>>> > >> > > Does this make sense? Or am I missing something here?
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 4. SPARQLTraversalSource has a registered
> > > SPARQLStrategy.
> > > > >>>> > >> > > >>      At this moment, we leave it to the default
> setting
> > > > for
> > > > >>>> this
> > > > >>>> > >> > > strategy selection.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 5. SPARQLTraversalSource should also support
> > > > >>>> withStrategies(),
> > > > >>>> > >> > > withoutStrategies(), withRemote(), etc.
> > > > >>>> > >> > > >>      Once the traversal is generated, it can
> support all
> > > > >>>> strategies
> > > > >>>> > >> > > like any other gremlin traversal. Does this make sense
> to
> > > > you?
> > > > >>>> > >> > > >>
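For readers following points 1-5 above, here is a minimal sketch of the API shape under discussion: a source object with a single spawn method, query(String), whose result exposes only terminal methods such as next() and toList(). Everything in it is an assumption made for illustration. SparqlSourceSketch, SparqlQuerySketch and the Function standing in for "transpile and run" are hypothetical names, not TinkerPop or sparql-gremlin classes, and rows are plain strings rather than traversers.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical names throughout; nothing here is TinkerPop API.
final class SparqlSourceSketch {
    // Stand-in for "transpile the SPARQL and run the resulting traversal".
    private final Function<String, List<String>> engine;

    SparqlSourceSketch(Function<String, List<String>> engine) {
        this.engine = engine;
    }

    // The single spawn method: it only records the query text.
    SparqlQuerySketch query(String sparql) {
        return new SparqlQuerySketch(sparql, engine);
    }
}

final class SparqlQuerySketch {
    private final String sparql;
    private final Function<String, List<String>> engine;
    private Iterator<String> rows; // created on the first terminal call

    SparqlQuerySketch(String sparql, Function<String, List<String>> engine) {
        this.sparql = sparql;
        this.engine = engine;
    }

    private Iterator<String> rows() {
        if (rows == null) {
            rows = engine.apply(sparql).iterator();
        }
        return rows;
    }

    boolean hasNext() { return rows().hasNext(); }

    String next() { return rows().next(); }

    // Drains whatever rows have not been consumed yet.
    List<String> toList() {
        List<String> out = new ArrayList<>();
        while (hasNext()) {
            out.add(next());
        }
        return out;
    }
}
```

The point of the shape is that the user never touches the compiler class directly: the query string is captured by query(String) and nothing is evaluated until a terminal method is called. How that deferral would map onto a ConstantStep plus a registered strategy is left open here.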
> > > > >>>> > >> > > >> In a nutshell,
> > > > >>>> > >> > > >> What is happening is that we are converting the
> SPARQL
> > > > >>>> queryString
> > > > >>>> > >> into
> > > > >>>> > >> > > a gremlin traversal and leaving it up to the tinkerpop
> > > compiler
> > > > to
> > > > >>>> > >> choose what
> > > > >>>> > >> > > is best for it.
> > > > >>>> > >> > > >> We only map a SPARQL query to its corresponding
> pattern
> > > > >>>> matching
> > > > >>>> > >> > > gremlin traversal (i.e. using the .match() clause).
> Since
> > > > the
> > > > >>>> > >> > > expressibility of SPARQL is less than that of Gremlin
> (i.e.
> > > > >>>> SPARQL 1.0
> > > > >>>> > >> > > doesn't support/allow  performing looping and
> traversing
> > > > >>>> operations),
> > > > >>>> > >> we
> > > > >>>> > >> > > can only map what is in the scope of SPARQL language to
> > > > >>>> Gremlin. Once
> > > > >>>> > >> the
> > > > >>>> > >> > > traversal is generated, it is left to the tinkerpop
> > > compiler
> > > > to
> > > > >>>> > >> select and
> > > > >>>> > >> > > execute a wide range of strategies ( various levels of
> > > > >>>> optimizations,
> > > > >>>> > >> etc.).
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> NOTE - Also, right now the sparql-gremlin compiler
> > > returns
> > > > >>>> the
> > > > >>>> > >> > > traversal (string) and not Bytecode. Returning the
> Bytecode
> > > > is
> > > > >>>> also
> > > > >>>> > >> > > completely possible, if you want so. We can just
> perform
> > > > >>>> > >> > > traversal.asAdmin().getBytecode() for this and it is
> done.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> Since we extended Daniel's work, we have not
> changed
> > > the
> > > > >>>> names of
> > > > >>>> > >> > > classes, methods and variables which were used. This,
> > > however,
> > > > >>>> can be
> > > > >>>> > >> > > changed, if you suggest so.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> 8. Yes, working in the academia doesn't groom you
> much
> > > on
> > > > >>>> the
> > > > >>>> > >> > > importance of commenting in the code by default, or for
> > > that
> > > > >>>> matter
> > > > >>>> > >> any
> > > > >>>> > >> > > "good-practices". I will add an appropriate comment block
> in
> > > > each
> > > > >>>> class
> > > > >>>> > >> for
> > > > >>>> > >> > > the javadocs.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> I hope the above reply addresses your questions to
> quite
> > > > some
> > > > >>>> extent.
> > > > >>>> > >> > > Most of the issues are already handled internally (as I
> > > > stated
> > > > >>>> > >> above). We
> > > > >>>> > >> > > can also leave some advanced features such as
> remote(), for
> > > > >>>> the 0.2.1
> > > > >>>> > >> > > release (though this is just an option) :D
> > > > >>>> > >> > > >> Having said that, of course, we are in no hurry for
> the
> > > > pull
> > > > >>>> > >> request. I
> > > > >>>> > >> > > also believe it makes complete sense to get things
> right.
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> Cheers!
> > > > >>>> > >> > > >>
> > > > >>>> > >> > > >> On 2017-12-18 14:11, Stephen Mallette <
> > > > [email protected]
> > > > >>>> > >> <mailto:
> > > > >>>> > >> > > [email protected]>> wrote:
> > > > >>>> > >> > > >>> Harsh, I looked at the code in a bit more detail
> than I
> > > > >>>> have.
> > > > >>>> > >> Here's
> > > > >>>> > >> > > some
> > > > >>>> > >> > > >>> thoughts/questions I had as I was going through
> things:
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 1. Can this file be removed - it doesn't appear to
> have
> > > > >>>> any usage
> > > > >>>> > >> that
> > > > >>>> > >> > > I
> > > > >>>> > >> > > >>> can see:
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> https://github.com/harsh9t/tinkerpop/blob/master/sparql-gremlin/src/main/java/org/apache/tinkerpop/gremlin/sparql/Runable.java
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 2. I note that this entire block of tests is
> commented
> > > > out
> > > > >>>> -
> > > > >>>> > >> should
> > > > >>>> > >> > > that be
> > > > >>>> > >> > > >>> removed?:
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> https://github.com/harsh9t/tinkerpop/blob/master/sparql-gremlin/src/test/java/org/apache/tinkerpop/gremlin/sparql/SparqlToGremlinCompilerTest.java
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 3. I could be wrong, but even if you didn't remove
> the
> > > > >>>> tests
> > > > >>>> > >> above, it
> > > > >>>> > >> > > >>> seems like unit testing is rather thin at this
> point.
> > > Am
> > > > I
> > > > >>>> missing
> > > > >>>> > >> > > >>> something? Is there more work to do there?
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 4. I don't understand the nature of these
> resources:
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> https://github.com/harsh9t/tinkerpop/tree/master/sparql-gremlin/src/main/resources
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> Is there any need to package those with the jar?
> Should
> > > > >>>> those be
> > > > >>>> > >> "test"
> > > > >>>> > >> > > >>> resources instead? Do we need the really large
> > > > >>>> data/bsbm1m.graphml
> > > > >>>> > >> > > file for
> > > > >>>> > >> > > >>> any specific reason?
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 5. What are these changes to these poms?
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> https://github.com/harsh9t/tinkerpop/commit/cb3b6512ea3536f556108e5a257c4586aa4d157a
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> I assume that your IDE did that accidentally and
> it was
> > > > not
> > > > >>>> > >> intended.
> > > > >>>> > >> > > >>> Please revert that change.
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 6. This looks odd too - gremlin-shaded repeated
> again
> > > and
> > > > >>>> again
> > > > >>>> > >> and
> > > > >>>> > >> > > again:
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> https://github.com/harsh9t/tinkerpop/commit/143d16f20dcaa9c915b96cdd4adf7b1504db5d36#diff-9e90009f097eabeb25c28159571fc6a2R118
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 7. Did you have any thoughts in reference to
> Marko's
> > > > >>>> earlier
> > > > >>>> > >> reply that
> > > > >>>> > >> > > >>> described how sparql-gremlin should be used? Right
> now,
> > > > it
> > > > >>>> seems
> > > > >>>> > >> like
> > > > >>>> > >> > > the
> > > > >>>> > >> > > >>> code you have there is just the "engine" but lacks
> the
> > > > >>>> piece that
> > > > >>>> > >> > > connects
> > > > >>>> > >> > > >>> it into the rest of the stack. From my
> perspective, I
> > > > >>>> think we
> > > > >>>> > >> need to
> > > > >>>> > >> > > be
> > > > >>>> > >> > > >>> sure that users have an easy, clear and consistent
> way
> > > to
> > > > >>>> use
> > > > >>>> > >> > > >>> sparql-gremlin before we can merge this work.
> > > Obviously,
> > > > >>>> having
> > > > >>>> > >> that
> > > > >>>> > >> > > aspect
> > > > >>>> > >> > > >>> of the code thought through will impact the
> > > documentation
> > > > >>>> that you
> > > > >>>> > >> > > write as
> > > > >>>> > >> > > >>> well, so I think you need to go down this path a
> bit
> > > > >>>> further
> > > > >>>> > >> before we
> > > > >>>> > >> > > get
> > > > >>>> > >> > > >>> to the pull request stage.
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> 8. We aren't big javadoc sticklers here, but we
> try to
> > > at
> > > > >>>> least
> > > > >>>> > >> get
> > > > >>>> > >> > > class
> > > > >>>> > >> > > >>> level javadoc in place for most classes. I don't
> see
> > > much
> > > > >>>> javadoc
> > > > >>>> > >> or
> > > > >>>> > >> > > >>> comments in the code right now. I think I'd like to
> > > see a
> > > > >>>> modicum
> > > > >>>> > >> of
> > > > >>>> > >> > > >>> javadoc/comments present as part of this work.
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> So, that's my broad level feedback at this point.
> It
> > > > seems
> > > > >>>> as
> > > > >>>> > >> though
> > > > >>>> > >> > > there
> > > > >>>> > >> > > >>> are some reasonably large issues there to contend
> with
> > > > >>>> before a
> > > > >>>> > >> pull
> > > > >>>> > >> > > >>> request is worth issuing. That's not a problem, of
> > > > >>>> course....we
> > > > >>>> > >> will
> > > > >>>> > >> > > just
> > > > >>>> > >> > > >>> keep iterating toward the goal. I'm not aware of
> > > anything
> > > > >>>> that is
> > > > >>>> > >> > > pushing
> > > > >>>> > >> > > >>> us to rush to a pull request - I'm of the opinion
> that
> > > we
> > > > >>>> can
> > > > >>>> > >> take the
> > > > >>>> > >> > > time
> > > > >>>> > >> > > >>> to get this right.
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> Thanks,
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> Stephen
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>> On Fri, Dec 15, 2017 at 1:46 PM, Joshua Shinavier <
> > > > >>>> > >> [email protected]>
> > > > >>>> > >> > > wrote:
> > > > >>>> > >> > > >>>
> > > > >>>> > >> > > >>>> Hi Marko,
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>> I think we're more or less on the same page here;
> it's
> > > > >>>> clear
> > > > >>>> > >> that TP3
> > > > >>>> > >> > > has a
> > > > >>>> > >> > > >>>> different API than TP2. If you look at the guts
> of TP3
> > > > >>>> GraphSail
> > > > >>>> > >> [1],
> > > > >>>> > >> > > it
> > > > >>>> > >> > > >>>> uses the modern APIs, and yet does adapt them to
> the
> > > > Sail
> > > > >>>> > >> interface.
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>> Something like PropertyGraphSail (or an equivalent
> > > Jena
> > > > >>>> thing)
> > > > >>>> > >> still
> > > > >>>> > >> > > makes
> > > > >>>> > >> > > >>>> sense in TP3, as well. One interesting detail
> here is
> > > > >>>> that in
> > > > >>>> > >> TP3,
> > > > >>>> > >> > > vertices
> > > > >>>> > >> > > >>>> can have labels, which can be turned into rdf:type
> > > > >>>> statements
> > > > >>>> > >> (that,
> > > > >>>> > >> > > in
> > > > >>>> > >> > > >>>> turn, can be used to enable subclass/superclass
> > > > >>>> inheritance if
> > > > >>>> > >> the
> > > > >>>> > >> > > graph is
> > > > >>>> > >> > > >>>> combined with an RDF schema).
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>> A TP3 equivalent of SailGraph would indeed be
> quite
> > > > >>>> different in
> > > > >>>> > >> > > >>>> implementation -- strategies, not wrapper graph --
> > > than
> > > > >>>> what we
> > > > >>>> > >> had
> > > > >>>> > >> > > for
> > > > >>>> > >> > > >>>> Blueprints, and yet would serve the same purpose.
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>> Josh
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>> [1]
> > > > >>>> > >> > > >>>> https://github.com/joshsh/graphsail/tree/master/src/main/java/net/fortytwo/tpop/sail
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>> On Fri, Dec 15, 2017 at 10:22 AM, Marko Rodriguez
> <
> > > > >>>> > >> > > [email protected]>
> > > > >>>> > >> > > >>>> wrote:
> > > > >>>> > >> > > >>>>
> > > > >>>> > >> > > >>>>> Hello,
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> The model proposed below is in-line with
> > > TinkerPop2's
> > > > >>>> way of
> > > > >>>> > >> > > thinking.
> > > > >>>> > >> > > >>>>> Unfortunately, in TinkerPop3 and more so in
> TinkerPop4,
> > > > >>>> the Graph
> > > > >>>> > >> > > >>>> "structure"
> > > > >>>> > >> > > >>>>> API will become deprecated. This means that the
> > > notion
> > > > of
> > > > >>>> > >> > > "wrapping the
> > > > >>>> > >> > > >>>>> Graph API" has gone away for TP3 and will be
> > > > >>>> completely gone
> > > > >>>> > >> in
> > > > >>>> > >> > > TP4. In
> > > > >>>> > >> > > >>>>> TP4, there will not even be a Graph API -- no
> more
> > > > >>>> Vertex,
> > > > >>>> > >> Edge,
> > > > >>>> > >> > > Property,
> > > > >>>> > >> > > >>>>> etc. Only the concept of a Graph with only
> methods
> > > like
> > > > >>>> > >> > > >>>> Graph.traversal(),
> > > > >>>> > >> > > >>>>> Graph.partitions(), etc.
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> Why was this route taken? In TinkerPop3, there
> was a
> > > > >>>> need to
> > > > >>>> > >> support
> > > > >>>> > >> > > any
> > > > >>>> > >> > > >>>>> language besides Java. This was why Gremlin
> bytecode
> > > > and
> > > > >>>> the
> > > > >>>> > >> concept
> > > > >>>> > >> > > of
> > > > >>>> > >> > > >>>> the
> > > > >>>> > >> > > >>>>> Gremlin traversal machine was introduced. A
> provider
> > > > >>>> simply gets
> > > > >>>> > >> > > Gremlin
> > > > >>>> > >> > > >>>>> bytecode and has to do something with it. For the
> > > > >>>> Java-based
> > > > >>>> > >> Gremlin
> > > > >>>> > >> > > >>>>> traversal machine, this is why providers
> implement
> > > > their
> > > > >>>> own
> > > > >>>> > >> > > GraphStep,
> > > > >>>> > >> > > >>>>> VertexStep, etc. For a Python-based Gremlin
> traversal
> > > > >>>> machine,
> > > > >>>> > >> > > likewise...
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> This means that SailGraph, GraphSail,
> > > PropertyGraphSail
> > > > >>>> as
> > > > >>>> > >> stated
> > > > >>>> > >> > > below
> > > > >>>> > >> > > >>>>> don't make sense in the current and future
> > > > >>>> architectures.
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> The next question becomes, "well how would you
> turn
> > > an
> > > > >>>> RDF store
> > > > >>>> > >> > > into a
> > > > >>>> > >> > > >>>>> PropertyGraph?" Easy -- implement your own
> custom
> > > > >>>> GraphStep,
> > > > >>>> > >> > > VertexStep,
> > > > >>>> > >> > > >>>>> etc. and respective ProviderStrategies that will
> > > handle
> > > > >>>> the
> > > > >>>> > >> bytecode
> > > > >>>> > >> > > >>>>> compilation accordingly.
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> The next question becomes, "well how would a
> > > > >>>> PropertyGraph
> > > > >>>> > >> support
> > > > >>>> > >> > > >>>>> reasoning?" Easy -- implement your own custom
> > > > >>>> > >> DecorationStrategy
> > > > >>>> > >> > > that will
> > > > >>>> > >> > > >>>>> insert reasoning into the traversal given the
> RDFS
> > > > >>>> schema. For
> > > > >>>> > >> > > instance:
> > > > >>>> > >> > > >>>>>        g.V().out("likes")
> > > > >>>> > >> > > >>>>>                ==>
> > > > >>>> > >> > > >>>>>        g.V().out("knows","likes")
> > > > >>>> > >> > > >>>>>                iff "likes" is a sub-property
> of
> > > > >>>> "knows"
> > > > >>>> > >> > > >>>>>
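The rewrite in the example above boils down to replacing the labels passed to out() with a larger set looked up from the schema. Below is a minimal sketch of just that lookup, written as plain Java with no TinkerPop dependency. LabelExpansionSketch, expand() and the "related" table are hypothetical names, and the table entry used in the usage note simply mirrors the likes/knows example from the message; which direction a real table should encode depends on the RDFS semantics intended. A real DecorationStrategy would apply this to the steps of a traversal, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of TinkerPop.
final class LabelExpansionSketch {

    // For each requested label, emit the labels the table associates with it,
    // followed by the label itself. Duplicates are dropped and order is kept.
    static List<String> expand(List<String> requested,
                               Map<String, List<String>> related) {
        Set<String> out = new LinkedHashSet<>();
        for (String label : requested) {
            out.addAll(related.getOrDefault(label, Collections.<String>emptyList()));
            out.add(label);
        }
        return new ArrayList<>(out);
    }
}
```

With a table containing the single entry likes -> [knows], expand(["likes"]) gives [knows, likes], matching the out("knows","likes") form in the example, while a label with no entry is passed through unchanged.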
> > > > >>>> > >> > > >>>>> In essence, it is possible to do this
> integration of
> > > > RDF
> > > > >>>> and
> > > > >>>> > >> > > TinkerPop,
> > > > >>>> > >> > > >>>> it
> > > > >>>> > >> > > >>>>> just needs to be done at the correct level of
> > > > >>>> abstraction so
> > > > >>>> > >> that it
> > > > >>>> > >> > > >>>> stays
> > > > >>>> > >> > > >>>>> in line with how TinkerPop is evolving, not how
> it
> > > was
> > > > >>>> back in
> > > > >>>> > >> 2012.
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> Take care,
> > > > >>>> > >> > > >>>>> Marko.
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> http://markorodriguez.com
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> On 2017-12-13 07:46, Joshua Shinavier <
> > > > [email protected]
> > > > >>>> >
> > > > >>>> > >> wrote:
> > > > >>>> > >> > > >>>>>> Hi Harsh,
> > > > >>>> > >> > > >>>>>>
> > > > >>>> > >> > > >>>>>> Glad you are taking Daniel's work forward. In
> > > porting
> > > > >>>> the code
> > > > >>>> > >> to
> > > > >>>> > >> > > the
> > > > >>>> > >> > > >>>>>> TinkerPop code base, might I suggest we allow
> for
> > > not
> > > > >>>> only
> > > > >>>> > >> > > >>>>> SPARQL-Gremlin,
> > > > >>>> > >> > > >>>>>> but a whole suite of RDF tools as in TP2.
> Perhaps
> > > call
> > > > >>>> the
> > > > >>>> > >> module
> > > > >>>> > >> > > >>>>>> rdf-gremlin. Then we could have all of:
> > > > >>>> > >> > > >>>>>>
> > > > >>>> > >> > > >>>>>> * SPARQL-Gremlin: executes standard SPARQL
> queries
> > > > over
> > > > >>>> a
> > > > >>>> > >> Property
> > > > >>>> > >> > > >>>> Graph
> > > > >>>> > >> > > >>>>>> database
> > > > >>>> > >> > > >>>>>> * GraphSail [1,2]: stores RDF quads in the
> database,
> > > > >>>> > >> explicitly,
> > > > >>>> > >> > > and
> > > > >>>> > >> > > >>>>>> enables SPARQL and triple pattern queries over
> the
> > > > >>>> quads
> > > > >>>> > >> > > >>>>>> * PropertyGraphSail [3]: exposes a Property
> Graph
> > > with
> > > > >>>> of two
> > > > >>>> > >> > > mappings
> > > > >>>> > >> > > >>>>> to
> > > > >>>> > >> > > >>>>>> the RDF data model
> > > > >>>> > >> > > >>>>>> * SailGraph [4]: takes an RDF triple store (not
> > > > natively
> > > > >>>> > >> supporting
> > > > >>>> > >> > > >>>>>> Gremlin) and enables Gremlin queries
> > > > >>>> > >> > > >>>>>> * others? I have often thought that a continuous
> > > > SPARQL
> > > > >>>> > >> > > implementation
> > > > >>>> > >> > > >>>>>> built on Gremlin would be powerful
> > > > >>>> > >> > > >>>>>>
> > > > >>>> > >> > > >>>>>> The biggest mismatch between the TP2 suite and
> what
> > > > >>>> might be
> > > > >>>> > >> built
> > > > >>>> > >> > > for
> > > > >>>> > >> > > >>>>>> Apache TinkerPop is that the previous suite was
> > > > >>>> implemented
> > > > >>>> > >> using
> > > > >>>> > >> > > >>>>> (Eclipse)
> > > > >>>> > >> > > >>>>>> RDF4j, whereas things seem to be leaning towards
> > > > >>>> (Apache) Jena
> > > > >>>> > >> now.
> > > > >>>> > >> > > >>>>>> However, the same principles could be applied.
> > > > >>>> > >> > > >>>>>>
> > > > >>>> > >> > > >>>>>> Josh
> > > > >>>> > >> > > >>>>>>
> > > > >>>> > >> > > >>>>>>
> > > > >>>> > >> > > >>>>>> [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
> > > > >>>> > >> > > >>>>>> [2] https://github.com/joshsh/graphsail
> > > > >>>> > >> > > >>>>>> [3] https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
> > > > >>>> > >> > > >>>>>> [4] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
> > > > >>>> > >> > > >>>>>
> > > > >>>> > >> > > >>>>> http://markorodriguez.com
> > > > >>>> > >> > > >
> > > > >>>> > >> > >
> > > > >>>> > >> > >
> > > > >>>> > >> >
> > > > >>>> > >>
> > > > >>>> > >
> > > > >>>> > >
> > > > >>>> >
> > > > >>>>
> > > > >>>
> > > > >>>
> > > > >>
> > > > >
> > > >
> > >
> >
>
