Hi Mark,
I think that any host language embedding should use its native idioms while, at
the same time, staying as true as possible to Gremlin-Java (not Gremlin-Groovy
-- though they are nearly identical). I would argue that Gremlin-Java is the
"true representation" of the language. So what do I mean by native idioms?
in_V vs inV // if camel case isn't a thing in the native language
$g vs. g // of course, if that's how variables are referenced
…huh, can't think of anything else :). But I hope you get the point.
I notice in Gremlin-Py you do g.v(2) vs g.V(2). Why is that?
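
A minimal sketch of the naming idiom above, in Python. It assumes a hypothetical string-building Traversal class (not Gremlin-Py's actual API) and an assumed convention that a trailing underscore escapes Python keywords such as "in". Snake-case names are mapped back to the Gremlin-Java camel-case names when the query string is built:

```python
import json


def to_gremlin_name(name):
    """Map a Python-style step name (out_e) to its Gremlin-Java name (outE)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


class Traversal:
    """Hypothetical string-building traversal, for illustration only."""

    def __init__(self, script="g"):
        self.script = script

    def __getattr__(self, name):
        # Only reached for attributes that are not defined on the object.
        if name.startswith("_"):
            raise AttributeError(name)
        step = to_gremlin_name(name)

        def add_step(*args):
            # json.dumps wraps strings in double quotes and leaves numbers bare.
            rendered = ", ".join(json.dumps(arg) for arg in args)
            return Traversal(f"{self.script}.{step}({rendered})")

        return add_step


print(Traversal().V(2).out_e("knows").in_v().script)
```

The user writes out_e and in_v, while the generated string still reads as Gremlin-Java.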
*** Would you be interested in working on a tutorial (with me?) about the 3
ways to create a Gremlin language variant? Given your expertise in Python and
the existence of Gremlin-Py, I think we can both (1) make a good tutorial to
teach others down the line and (2) spruce up Gremlin-Py's documentation and
appearance (e.g. you need a Gremlin logo! -- Gremlin with a Snake around his
neck? -- want me to make you one?). ***
Please see: https://issues.apache.org/jira/browse/TINKERPOP-1232
Thanks Mark,
Marko.
http://markorodriguez.com
On Apr 14, 2016, at 8:28 AM, Mark Henderson <[email protected]> wrote:
> I think writing "Gremlin/Groovy" in a host language is pretty awesome as long
> as it isn't too far off from writing actual Gremlin. I can revive my PHP
> project if it would be helpful to the community. A JavaScript version would
> probably be one that would get the most attention from developers today, but
> JS, even with es6, doesn't have the flexibility (maybe with Proxies) with its
> objects where you wouldn't have to write a full-on 1-to-1 api equivalent of
> Gremlin (let alone mimicking Groovy). It seems like a Ruby version would be
> doable by implementing `method_missing`
>
> Thanks for adding Gremlinpy to the new site (I need to clean up the code a
> bit *shame*)
>
> On Thursday, April 14, 2016 at 9:34:40 AM UTC-4, Marko A. Rodriguez wrote:
> Hi Mark,
>
> Exactly. I never saw Gremlin-Py until now and just noticed it on the Apache
> TinkerPop homepage. That is good stuff. Moreover, as you say, there is a
> distinction between:
>
> 1. Writing Gremlin in a host language.
> 2. Communicating to a GremlinServer-compliant server in a host language.
>
> (1) is about query syntax and (2) is about protocol stuff.
>
> Lots of the libraries either conflate the two or just do (2) with (1) simply
> being a Groovy String (cheesy).
>
> I would like to see a lot more of (1) in the community libraries as I think
> this is one of the big selling points of Gremlin -- write in your native
> language.
>
> BTW, I added Gremlin-Py to the description in the "host language embedding"
> section here: http://www.planettinkerpop.org/#gremlin (2 scrolls down).
>
> Thanks for your thoughts,
> Marko.
>
> http://markorodriguez.com
>
> On Apr 14, 2016, at 7:06 AM, Mark Henderson <[email protected]> wrote:
>
> I've written "native object to Gremlin" libs in both PHP and Python and it
> isn't too bad/not too far from Groovy. The biggest issues were around indices
> [..] (when it had that format) and closures "{x -> ...}", but otherwise both
> langs allowed for easy query building.
>
> It basically looked like this in PHP:
>
> $g = Gremlin();
> $g->V()->has('"name"', 'mark');
> echo (string)$g; // g.V().has("name",SOME_BOUND_VAR_1)
>
> Works pretty much the same with the Python lib that I've been building
> (https://github.com/emehrkay/gremlinpy).
>
> If we wanted to actually execute the query on every step, that wouldn't be
> too difficult to implement with Gremlinpy. Gremlinpy is a simple linked list,
> it looks at g.V().has('"name"', 'mark') as three token objects with a shared
> pool of bound parameters. It creates the string query and parameters
> dictionary when you cast the list to a string. The only change needed would
> be to bind in a library like Gremlinclient
> (https://github.com/davebshow/gremlinclient), build the query with every
> step, and send it to the server.
>
> res = g.V() # sends request
> res2 = g.V().has('"name"', 'mark') # second request
> ...
>
> The remaining difficulty would be deciding what gets bound. Maybe you can
> pass in a key val pair for what you want bound
>
> res = g.V().has('"name"',{'NAME':'mark'}) # g.V().has("name",NAME)
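
A minimal sketch of the token-list idea described above, assuming hypothetical names (Query, Param, BOUND_n) rather than Gremlinpy's real classes. A plain Python list stands in for the linked list, and the query string and the parameters dictionary are only produced when render() is called:

```python
import json


class Param:
    """Marks a value that should travel as a bound parameter (hypothetical)."""

    def __init__(self, value, name=None):
        self.value = value
        self.name = name


class Query:
    """Collects step tokens and renders them to a script plus bindings."""

    def __init__(self, source="g"):
        self.source = source
        self.tokens = []  # ordered (step_name, args) pairs

    def step(self, name, *args):
        self.tokens.append((name, args))
        return self

    def V(self, *args):
        return self.step("V", *args)

    def has(self, *args):
        return self.step("has", *args)

    def render(self):
        bindings = {}  # shared pool of bound parameters
        parts = [self.source]
        for name, args in self.tokens:
            rendered = []
            for arg in args:
                if isinstance(arg, Param):
                    # Use the caller's name if given, else generate one.
                    key = arg.name or f"BOUND_{len(bindings) + 1}"
                    bindings[key] = arg.value
                    rendered.append(key)
                else:
                    rendered.append(json.dumps(arg))
            parts.append(f"{name}({', '.join(rendered)})")
        return ".".join(parts), bindings


script, bindings = Query().V().has("name", Param("mark", "NAME")).render()
print(script, bindings)
```

Executing on every step would then amount to calling render() inside step() and handing the pair to whatever client library is in use.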
>
>
>
> On Tuesday, April 12, 2016 at 10:54:08 PM UTC-4, Dmill wrote:
> Yes a lot of the points you bring up are valid.
>
> One of the main problems with stringifying everything is that it does not
> allow for some of the stuff I mentioned in my PS, namely "smart merges". This
> query building behavior that makes use of scopes is unfortunately the
> standard for frameworks in the industry.
> This is mostly due to the SQL heritage and its declarative nature; ordering
> of "steps" doesn't matter so it allows for easy "after the fact" client side
> filtering. It's not uncommon to have a base query that gets altered by some
> filtering data. In some cases it's a simple has() that needs to be injected
> somewhere, in other cases it's a repeat() that needs to be completely altered.
> Use cases can get a little complicated here but in its simplest form imagine
> having to add/remove entries to/from a match(). Of course that scenario works
> well with a toString approach but for other steps, not so well. Our
> experience has been that the builder needs to be aware of the steps'
> signatures to resolve merges.
>
> So sure this is another problem entirely, in the end users can't really do
> this with string queries either. But for widespread adoption it would be best
> if the query builder could handle these scenarios.
>
> Also to bounce off of some of your comments :
>
> > $id -> "~id"
> > $label -> "~label"
> > g.V().out("%%x")
> > $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)
>
> All of the above are absolutely possible. But it's a lot to keep in mind for
> users that are already trying to figure out how Gremlin works. Now they also
> need to translate gremlin-groovy into gremlin-php.
> One of the advantages of going the hard route and keeping track of all step
> signatures instead of a toString approach is that you can significantly
> reduce the above cases. The builder can resolve quite a few of these
> automatically and when conflicts arise it can do its best to resolve them and
> throw/log a warning telling the user how to make the query explicit.
>
> >For your Date example, you would have to have a special "toString()" for PHP
> >dates to Java dates (or whichever backend ScriptEngine is being used).
>
> There are no PHP Dates [insert desperate crying emoji here]. PHP sucks with
> typing. It's got its good points but this kind of stuff is not one of them.
> Basically PHP Dates come in various forms, from Integer timestamps to Strings,
> and only the user really knows what he wants. We can provide this
> functionality like you did with long() but it's another thing to keep in mind.
>
> One point we haven't gone over is lambdas. We can't really toString
> these. I guess this is where customStep() or script() come into play.
>
> To wrap it up, a toString query builder is absolutely an option and could
> cover a lot of the API. In fact in PHP we could magically make any API method
> available, $g->something("~label", "lolo") would stringify to
> g.something(label, "lolo") regardless of whether or not the step exists. But
> this involves quite a few language specific alterations and doesn't provide
> much (if any) functional benefit.
> It would be so much easier for people to just write a gremlin-groovy string
> as it's well documented and doesn't need any extra knowledge.
> If on the other hand the query builder has features like mentioned in the PS
> or earlier in this post, it's well worth the effort. I believe most people
> who build their own query builders do so to support some form of extra
> feature they wouldn't have by using gremlin-groovy string queries.
> But such a query builder enters the realm of non-trivial (although not
> unachievable). A first step in helping people make these builders would be to
> provide an easily parseable list of signatures for the most desirable
> classes. Maybe something along the lines of a yaml file.
>
> Anyways I'm just thinking out loud at this point.
>
>
>
>
>
> On Tue, Apr 12, 2016 at 9:42 PM, Marko Rodriguez <[email protected]> wrote:
> Hi Dylan,
>
> Your email is excellent. Thank you for breaking things down for me. Here are
> some responses.
>
> 1. Method overloading :
>
> abstract class Query {
> public function has(PropertyKey $key); //1
> public function has(PropertyKey $key, Object $value); //2
> public function has(Label $label, String $value); //3
> public function has(VertexId $id, Long $value); //4
> public function has(VertexId $id, Int $value); //5
> public function has(VertexId $id, Predicate $p); //6
> }
>
> The above is illegal in languages like PHP (or javascript?). Instead we're
> stuck with :
>
> abstract class Query {
> public function has(Array $args);
> }
>
> We're then left to figure out what is what in the array and sort out how we
> need to stringify the output.
>
> I was thinking, why would you need to introspect into the array? Just
> toString() each element in the array with a comma (,) in between. For
> instance:
>
> * has("age",32) ==> has(["age",32]) ==> has("age",32)
>   // all String array elements need " " wrappers.
> * has("age") ==> has(["age"]) ==> has("age")
> * has("person","name","marko") ==> has(["person","name","marko"]) ==>
> has("person","name","marko")
>
> Thus, Gremlin-PHP has one has()-method and that method just iterates the
> arguments and toString()'s each one accordingly with comma delimiters.
>
> If the user does $g->V()->has("label", "user") do we add quotes to the first
> argument or is it a label/id? What about the second argument, is it a
> predicate? etc. This gets complex very quickly.
>
> The universal rule --- if it's a String add quotes. If it's not, don't.
>
> $id -> "~id"
> $label -> "~label"
>
> $g->V()->has($label,"user")
>
> And what if I had $g->V()->has("id", 36) . PHP only supports Int so one of
> the two signatures (4 or 5) needs to give as we have a major conflict. This
> example is fictional for has() but I've run into this on a couple of other
> methods, just can't remember which.
>
> Yea, that sucks. Well, you could do this:
>
> $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)
>
> This would, of course, bind you to Gremlin-Groovy as the ultimate
> ScriptEngine.
>
> Another example would be g.V().has(id, neq(m)) . We could imagine the
> following PHP equivalent $g->V()->has(new Id(), Predicate::neq("m")) where
> Id() is a class that helps us recognize this type, and neq() a static method
> of Predicate. However "m" has to be passed as string and we have no clue what
> m is... is this a string or a binding or a server side variable? More on this
> in point 2.
>
> Well, this is the same problem in Gremlin-Java. where() is ALWAYS bindings
> and has() is ALWAYS objects. Thus:
>
> $g->V()->where("a",Predicate::neq("m")) ==> g.V().where("a",neq("m")) //
> again strings always get " "-wrappers.
>
> To close things off here there's also the case of signatures like
> out(String... edgeLabels) that need their own logic.
>
> Again, just toString() each object in the array and insert commas between.
>
> $g->V()->out(["created","knows"]) ==> g.V().out("created","knows")
>
>
> Conclusion: There's a lot of manual work that needs to go into separating the
> logic between signatures and handling special cases. Part of this can be
> automated if your language supports magic getters and setters by parsing the
> javadocs for example. But not only is that an if, the rest will still be
> manual. This step is maintenance heavy.
>
> I see the biggest pains being:
>
> 1. Having to implement each method.
> 2. Having to have helper classes for P, T, Order, Column, etc.
>
> This is simply a matter of fat fingering stuff in and not anything
> implementation-wise that is problematic.
>
> 2. Conflicts
>
> Because we're manipulating strings it's really hard to tell a few items
> apart (binding vs server variable vs string; there's a reason why I separate
> binding and variable).
>
> For instance in the example above of gremlin : g.V().has(id, neq(m)) vs PHP:
> $g->V()->has(new Id(), Predicate::neq("m")) we don't know what to make of m.
> Is this a binding or a string or even a variable that was previously set in
> the session? There is no clean way of working around this.
>
> Firstly because bindings tend to be handled on a different layer than the
> query builder.
> Secondly because methods that will help in avoiding the conflicts will also
> lose typing data.
> For example : $g->V()->has(new Id(), Predicate::neq(Query::variable("m")))
> could generate the proper query by outputting m without quotes but we don't
> know what type m is so in some cases it might be tricky to select the proper
> signature.
>
> Conclusion: there are a number of ways around this point. We use prefixes B_m
> or V_m and a hack to ignore signatures altogether when in this scenario. It's
> not that these aren't solvable, they just aren't trivial.
>
> Hm. Yea, I'm not too smart about server variables. Out of my butt you could
> create a "crazy String" for those and then do replaceAll-style updates.
>
> g.V().out("%%x")
>
> replaceAll("%%x",x)
>
> ?
>
>
> 3. API
>
> Why we would need traversal, graph, vertex and edge APIs are quite self
> explanatory for everyday work with Gremlin. I'm just going to expose why we
> would also require some Java classes as well.
>
> Because JSON is lossy by nature we often have to cast variables to certain
> types. For example by submitting these kind of scripts :
> g.V(1).property("date", new Date(B_m)); with B_m = timestamp. This is just
> another case that is difficult to cover.
>
> This adds onto the other points in making a gremlin language variant
> non-trivial.
>
> All of the above can be worked around by using an injection method that just
> appends a string to the query : $g->customStep("V().has(id, neq(m))") but
> that's beside the point.
>
>
> Ah. Classy. Note that in 3.2.1 we might support script()-step.
>
> g.V().script("out().map{ it.name }")
>
> …to enable lambdas in remote'd traversals (Server or OLAP).
>
> For your Date example, you would have to have a special "toString()" for PHP
> dates to Java dates (or whichever backend ScriptEngine is being used).
>
> $g->V()->property("date", $phpDate)
>
> Your Array-string-ifier would not just call toString() blindly on the objects
> of the array arguments, but would do stuff like:
>
> if(object instanceof String)
>   return "\"" + object.toString() + "\"";
> else if(object instanceof Date)
>   return "new Date(…)";
> else
>   return object.toString();
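
A minimal sketch of such an argument stringifier in Python. It assumes Gremlin-Groovy as the target ScriptEngine, and the helper names to_groovy and step are hypothetical. Strings get quotes (with backslash, quote and $ escaped, since Groovy interpolates $ inside double-quoted strings), dates become a Java Date built from epoch milliseconds, lists are flattened into varargs, and everything else falls through to str():

```python
from datetime import datetime, timezone


def to_groovy(arg):
    """Render one Python argument as Gremlin-Groovy source text."""
    if isinstance(arg, bool):
        return "true" if arg else "false"
    if isinstance(arg, str):
        escaped = (arg.replace("\\", "\\\\")
                      .replace('"', '\\"')
                      .replace("$", "\\$"))
        return f'"{escaped}"'
    if isinstance(arg, datetime):
        # java.util.Date accepts milliseconds since the epoch.
        millis = int(arg.timestamp() * 1000)
        return f"new Date({millis}L)"
    if isinstance(arg, (list, tuple)):
        # Lists stand in for varargs such as out(String... edgeLabels).
        return ", ".join(to_groovy(item) for item in arg)
    return str(arg)


def step(name, *args):
    """Render a single step call from its name and arguments."""
    return f"{name}({to_groovy(args)})"


print(step("has", "age", 32))
print(step("out", ["created", "knows"]))
print(step("property", "date", datetime(1970, 1, 1, tzinfo=timezone.utc)))
```

This keeps the "if it's a String add quotes" rule in one place, so the per-step methods never need to inspect their arguments.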
>
>
> Final Conclusion: It's not a trivial task. Of course the examples above are
> very verbose and achieving something closer to gremlin in style is possible
> but there are always going to be "gotchas" users will need to keep in mind.
> A while back in TP2 I released a PHP library for this (the one we currently
> use in our projects). I removed it because it was too much maintenance to get
> it to work across use cases, so I decided to concentrate on our own one (some
> choices made in 2. wouldn't have worked for other cases).
> I'm convinced there's got to be a way of reconciling everything and getting
> this to work flawlessly but it's going to require a lot of thought/work
>
>
> PS: I mentioned some other points like managing multiple versions of gremlin
> (for two lines of releases) which is a real headache.
> For performance it may be good to allow the builder to handle multiple lines,
> which comes with its load of complications as well.
> And then there's the ability to "block" queries and either inject them into
> each other or merge them together which simplifies unit testing and extends
> functionality :
>
> $query = $g->V()->out("likes")->flag("flagname")->has("age", 20);
> // Some logic here accesses new information and realizes the query needs
> // altering
> $query->getFlag("flagname")->out("hates", true); // true for merge
> $query->toString(); // g.V().out('likes', 'hates').has('age', 20)
>
> But this point alone could warrant its own email as it is relatively
> complex. Though TP3 has simplified some cases thanks to union() and some
> other steps.
>
> Our builder supports all of the above so if you have any questions feel free
> to ask me.
>
> Phew that was long. I'll add this to the ticket in a bit.
>
>
> Yes, maintenance seems the biggest pain. Every new method to Gremlin-Java
> requires updates to Gremlin-PHP ---- perhaps there is a programmatic way to
> introspect the Java source file (or JavaDoc) and generate the code
> automagically?
>
> public GraphTraversal out(final String… edgeLabels)
> ==auto-write==>
> out(Array… edgeLabels) {
> $string -> $string + ".out(" + StringHelper::toString(edgeLabels) + ")";
> }
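
A rough sketch of that auto-write idea in Python. The SIGNATURES list holds simplified sample lines in the style of the one above (an assumption, not copied from the Gremlin-Java source), and the Traversal class and make_step helper are hypothetical. Each method name is pulled out with a regular expression and a string-appending method is attached to the class:

```python
import json
import re

# Simplified sample signatures, assumed for illustration only.
SIGNATURES = [
    "public GraphTraversal out(final String... edgeLabels)",
    "public GraphTraversal outE(final String... edgeLabels)",
    "public GraphTraversal has(final String propertyKey, final Object value)",
]

# The method name is the identifier directly before the opening parenthesis.
METHOD_NAME = re.compile(r"\b(\w+)\s*\(")


class Traversal:
    """Hypothetical string-building traversal; steps are attached below."""

    def __init__(self, script="g"):
        self.script = script


def make_step(step_name):
    def step(self, *args):
        rendered = ", ".join(json.dumps(arg) for arg in args)
        return Traversal(f"{self.script}.{step_name}({rendered})")

    step.__name__ = step_name
    return step


for signature in SIGNATURES:
    match = METHOD_NAME.search(signature)
    if match:
        setattr(Traversal, match.group(1), make_step(match.group(1)))

print(Traversal().out("created", "knows").script)
```

With this shape, a new Gremlin-Java method only means a new entry in the signature list, and the argument conversion stays the single piece of hand-written logic.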
>
>
> If you could do that, then the only code you actually have to write/maintain
> (besides the introspector above) is StringHelper which does all the fancy
> String conversion of arguments.
>
> Thanks Dylan for your time,
> Marko.
>
> http://markorodriguez.com
>
>
> On Tue, Apr 12, 2016 at 4:37 PM, Marko Rodriguez <[email protected]> wrote:
> Hello everyone,
>
> Please see the section entitled "Host Language Embedding" here:
> http://www.planettinkerpop.org/#gremlin (3 sections down)
>
> When I was writing up this section, I noticed that most of the language
> drivers that are advertised on our homepage
> (http://tinkerpop.incubator.apache.org/#graph-libraries) know how to talk to
> Gremlin Server via web sockets, REST, etc., but rely on the user to create a
> String of their graph traversal and submit it. For instance, here is a
> snippet from the Gremlin-PHP documentation:
>
> $db = new Connection([
> 'host' => 'localhost',
> 'graph' => 'graph',
> 'username' => 'pomme',
> 'password' => 'hardToCrack'
> ]);
> //you can set $db->timeout = 0.5; if you wish
> $db->open();
> $db->send('g.V(2)');
> //do something with result
> $db->close();
>
> $db->send(String) is great, but it would be better if the user didn't have to
> leave PHP.
>
> Please see this ticket:
> https://issues.apache.org/jira/browse/TINKERPOP-1232
>
> I think for non-JVM languages, it would be nice if these drivers (PHP,
> JavaScript, Python, etc.) didn't require the user to explicitly create
> Gremlin-XXX Strings, but instead either used JINI or model-3 in the ticket
> above. Let's look at model-3 as I think it's the easiest and most general.
>
> For instance, they would have a class in their native language that would
> mirror the GraphTraversal API. *** I don't know any other languages well
> enough, so I'm just going to do this in Groovy :), hopefully you get the
> generalized point. ***
>
> public class Test {
>
> String s;
>
> public Test(final String source) {
> s = source;
> }
>
> public Test() {
> s = "";
> }
>
> public Test V() {
> s = s + ".V()";
> return this;
> }
>
> public Test outE(final String label) {
> s = s + ".outE(\"${label}\")";
> return this;
> }
>
> public Test repeat(final Test test) {
> s = s + ".repeat(${test.toString()})";
> return this;
> }
>
> public String toString() {
> return s;
> }
> }
>
> Then, via fluency (function composition) and nesting, you could generate a
> Gremlin-Groovy (or whichever ScriptEngine language) traversal String in the
> backend.
>
> gremlin> g = new Test("g");
> ==>g
> gremlin> g.V().outE("knows")
> ==>g.V().outE("knows")
> gremlin>
> gremlin> g = new Test("g");
> ==>g
> gremlin> g.V().repeat(new Test().outE("knows"))
> ==>g.V().repeat(.outE("knows"))
> gremlin>
>
> From there, that String is then submitted as you normally do with your
> driver. For instance, with Gremlin-PHP, via $db->send(String).
>
> Of course, if your driver is already on a JVM language, there is no reason to
> do this (e.g. Gremlin-Scala), but if you are not on the JVM, this gives the
> user host language embedding and a more natural "look and feel." Moreover, if
> your language doesn't use "dot notation," you would use the natural idioms of
> your language.
>
> $g->V->outE("knows")
>
> If anyone is interested in updating their non-JVM language driver to use this
> model, I would like to write a blog post about it. Or perhaps, a tutorial for
> language designers.
>
> Thoughts?,
> Marko.
>
> http://markorodriguez.com
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Gremlin-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/gremlin-users/27e000ac-39c1-415d-bd3c-48c40febc97d%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.