Yes, I expected to return results first and then stream the side-effects.

On Fri, Jul 22, 2016 at 5:05 PM, Dylan Millikin <dylan.milli...@gmail.com> wrote:
> > Perhaps nicer than doing all that trickery with transactions would be to self-detach the vertex ahead of time
>
> This was the original idea. I never dove too deep into it as the sideEffects were applied mid-traversal and extra filtering/SEs still had to occur. I wasn't sure it was actually possible and the transaction hack allowed me to move on.
>
> As for the GLV limitations, it's mostly going to be network overhead. Unfortunately one round trip with the server is costly and I know that we've ended up having to be creative in order to limit the round trips by concatenating scripts for each query. A GLV approach would need some careful planning and probably a multiline byteCode feature. But I digress, that's not what this thread is about.
>
> In the spirit of GLVs returning side effects, how would your original proposition stream over the network? Would you get all data first and then SE? I'm guessing you would want to stream the SEs as well.
>
> On Fri, Jul 22, 2016 at 4:42 PM, Stephen Mallette <spmalle...@gmail.com> wrote:
>
> > > You can take the case of a group count as a really simple example.
> >
> > So you want the side-effect in the Vertex itself so you can use it with the ORM. Interesting. Perhaps nicer than doing all that trickery with transactions would be to self-detach the vertex ahead of time (i.e. create a DetachedVertex) and add the property you want. As indirect as that sounds, that seems more direct to me than the "fake" transaction. Not sure that what I'm doing here will help you with that problem.
> >
> > > I'll add that I'm looking at this from a non-GLV perspective so I'm disregarding object mapping done through GraphSONv2.0 typing in favor of a format-guaranteed result set (say that either only contains vertices, edges, or a combination of both).
> >
> > Also interesting.
> > Not sure that kind of serialization has a place in TinkerPop where we encourage folks to return everything under the sun by using Gremlin to return data in a form that suits their required end result. If this is the outcome you want, I think that my suggestion with self-detaching is probably on the right track. Maybe consider a custom serializer that coerces all results to graph elements. That would take care of all the embedded objects and the whole lot.
> >
> > > The reason for this is that GLV is too inefficient for larger projects so a more traditional script->result approach is required.
> >
> > I'm hijacking my own thread by going too deep down this path, but I think we should strive toward a solution for GLVs to be robust enough for developers to be successful with TinkerPop in the language of their choice. Just like we'll never get rid of all lambdas in Gremlin, we will probably never quite get rid of script->result for all use cases (but, again, like lambdas the goal will be to get quite close). I find it quite interesting that we might be able to figure out how a python dev could write Gremlin in python that would remotely execute on the server seamlessly, however it's also interesting that that same GLV code could be treated as server-side to be accessed from a python client. In that way, heavy complex logic (the type you are talking about) could be written in python and then accessed from python on the client. In short, I think that it would be better to prefer to think of the work around GLVs as "how to make Gremlin good in other languages" rather than the more narrow view of just "remoting traversals". If we go wider, we might come up with some good ideas to really broaden access to TinkerPop and graphs in a very big way.
> >
> > We already have a really big improvement with "remoting" as compared to good ol' RexsterGraph - so that's something - haha ;)
> >
> > On Fri, Jul 22, 2016 at 3:17 PM, Dylan Millikin <dylan.milli...@gmail.com> wrote:
> >
> > > Yeah sorry I left out an important part. This is especially an issue when you're dealing with an ORM layer that's expecting results of a specific type (for example vertices).
> > > You can take the case of a group count as a really simple example. Your result set could be:
> > >
> > > [{count:5, vertex:v[1]}, {count:3, vertex:v[2]}, {count:1, vertex:v[3]}]
> > > and this is easy enough to do with gremlin. But unless this is built into the ORM itself chances are you'll need to implement the object mapping yourself.
> > >
> > > The alternative is to add "count" as a property of vertex and then you can leverage all available features from your ORM such as filtering, ordering, etc... Actually, the way we did it above we can also do those directly in gremlin as well.
> > >
> > > This is a simple case, but once it gets more complicated with hierarchical data, the option of implementing the object mapping yourself is just a headache and oftentimes less efficient than just rolling back a transaction.
> > >
> > > Dunno if that was clear enough this time around.
> > >
> > > I'll add that I'm looking at this from a non-GLV perspective so I'm disregarding object mapping done through GraphSONv2.0 typing in favor of a format-guaranteed result set (say that either only contains vertices, edges, or a combination of both). The reason for this is that GLV is too inefficient for larger projects so a more traditional script->result approach is required.
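The group count case above can also be handled on the client without touching the graph. The following is a minimal Python sketch, assuming each row arrives as a plain dictionary holding a `count` and a vertex-shaped `vertex` entry. The dictionary layout and the function name `fold_counts_into_vertices` are hypothetical and are not part of TinkerPop or of any particular ORM.

```python
from typing import Any, Dict, List


def fold_counts_into_vertices(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Copy each vertex record and attach its count as an ordinary property."""
    merged = []
    for row in rows:
        # Copy the record and its property map so the input rows stay untouched.
        vertex = dict(row["vertex"])
        properties = dict(vertex.get("properties", {}))
        properties["count"] = row["count"]
        vertex["properties"] = properties
        merged.append(vertex)
    return merged


# Hypothetical rows mirroring [{count:5, vertex:v[1]}, {count:3, vertex:v[2]}, ...]
rows = [
    {"count": 5, "vertex": {"id": 1, "properties": {}}},
    {"count": 3, "vertex": {"id": 2, "properties": {}}},
    {"count": 1, "vertex": {"id": 3, "properties": {}}},
]

vertices = fold_counts_into_vertices(rows)

# Once the count is a property, generic ordering and filtering apply to it.
by_count = sorted(vertices, key=lambda v: v["properties"]["count"])
```

Because the records are copies, nothing is persisted and no transaction has to be rolled back. The trade-off is that the merge happens after the full result set has been received, so it does not help with streaming.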
> > >
> > > On Fri, Jul 22, 2016 at 2:09 PM, Stephen Mallette <spmalle...@gmail.com> wrote:
> > >
> > > > hi dylan, could you please provide a more concrete example of the problem you're facing?
> > > >
> > > > On Fri, Jul 22, 2016 at 1:24 PM, Dylan Millikin <dylan.milli...@gmail.com> wrote:
> > > >
> > > > > I'm going to confirm that this is actually a common issue.
> > > > > One thing to keep in mind is that oftentimes the sideEffects are directly linked to returned elements on a 1 --> n basis, which neither of the above really helps with. That is to say that if you're streaming your results you'll need the sideEffects that relate to the streamed element.
> > > > >
> > > > > There is no easy way of handling this currently. Especially if you order your results and get unordered sideEffect results.
> > > > > One way we've found to work around this is very hacky, not efficient and only works for non-mutating queries:
> > > > >
> > > > > - we start a transaction
> > > > > - we append the sideEffect data to the elements we're emitting (say as properties of a vertex)
> > > > > - get the full result set with sideEffects as properties of the result elements.
> > > > > - rollback transaction so properties are not persisted to the graph.
> > > > >
> > > > > A truly wicked succession of events born from absolute desperation.
> > > > > I enquired a while back about the ability to treat elements as detached from the graph in order to do the above without the transaction handling. But I never followed up.
> > > > >
> > > > > I figured I would put this out there as another case where non-Java languages struggle.
> > > > >
> > > > > On Thu, Jul 21, 2016 at 1:19 PM, Stephen Mallette <spmalle...@gmail.com> wrote:
> > > > >
> > > > > > Your way made me think that if you wrote your traversal like that, you would return the side-effects twice - once in your traversal as part of the standard result and then again as a side-effect. Not sure what that means - just a thought.
> > > > > >
> > > > > > While I'm thinking thoughts that may or may not be obvious, it also occurs to me that the downside for a GLV retrieving data that way is that the result of the traversal won't be streamed back. It will aggregate the result (and the side-effects naturally) in memory and then return that all as a whole.
> > > > > >
> > > > > > On Thu, Jul 21, 2016 at 11:24 AM, Daniel Kuppitz <m...@gremlin.guru> wrote:
> > > > > >
> > > > > > > If you really want to have your result and your side-effects returned by a single request, you could do something like this:
> > > > > > >
> > > > > > > gremlin> g.V(1,2,4).aggregate("names").by("name").aggregate("ages").by("age").fold().as("data").select("data", "names", "ages")
> > > > > > > ==>[data:[v[1], v[2], v[4]], names:[marko, vadas, josh], ages:[29, 27, 32]]
> > > > > > > gremlin> g.V(1,2,4).aggregate("names").by("name").aggregate("ages").by("age").fold().project("data", "se").by().by(cap("names","ages"))
> > > > > > > ==>[data:[v[1], v[2], v[4]], se:[names:[marko, vadas, josh], ages:[29, 27, 32]]]
> > > > > > > gremlin> g.V(1,2,4).aggregate("names").by("name").fold().project("data", "se").by().by(cap("names"))
> > > > > > > ==>[data:[v[1], v[2], v[4]], se:[marko, vadas, josh]]
> > > > > > >
> > > > > > > I'm not saying it would be bad to have Gremlin Server handle that for you, just wanted to show that it's actually pretty easy to get the data and the side-effects without using the traversal admin methods (hence it should work for all GLVs).
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Daniel
> > > > > > >
> > > > > > > On Thu, Jul 21, 2016 at 4:51 PM, Stephen Mallette <spmalle...@gmail.com> wrote:
> > > > > > >
> > > > > > > > As we look to build out GLVs and expand Gremlin into other programming languages, one of the important aspects of doing this should be to consider consistency across GLVs. We should try to prevent capabilities of Java from being lost in Python, JS, etc.
> > > > > > > >
> > > > > > > > As we look at both RemoteGraph in Java and gremlin-python we find that there is no way to get traversal side-effects. If you write a Traversal and want side-effects from it, you have to write your traversal to return them so that it comes back as part of the result set.
> > > > > > > > Since RemoteGraph and gremlin-python don't really allow you to directly "submit a script" it's not as though you can execute a traversal once for both the result and the side-effect and package them together in a single request as you might do with a simple script request:
> > > > > > > >
> > > > > > > > $ curl -X POST -d "{\"gremlin\":\"t=g.V(1).values('name').aggregate('x');[v:t.toList(),se:t.getSideEffects().get('x')]\"}" http://localhost:8182
> > > > > > > >
> > > > > > > > {"requestId":"3d3258b2-e421-459a-bf53-ea1e58ece4aa","status":{"message":"","code":200,"attributes":{}},"result":{"data":[{"v":["marko"]},{"se":["marko"]}],"meta":{}}}
> > > > > > > >
> > > > > > > > I'm thinking that we could alter things in a non-breaking way to allow optional return of side-effect data so that there is a way to have this all streamed back without the need for the little workaround I just demonstrated. For REST I think we could just include a sideEffect request parameter that allowed for a list of side-effect keys to return. Perhaps a "*" could indicate that all should be returned. The side-effects could be serialized into a key sibling to "data" called "sideEffect".
> > > > > > > >
> > > > > > > > I think a similar approach could be used for websockets and NIO where we could amend the protocol to accept that sideEffect parameter.
> > > > > > > > We would first stream results (marked with meta data to specify a "result") and then stream side effects (again marked with meta data as such).
> > > > > > > >
> > > > > > > > I considered caching the Traversal instances so that a future request could get the side effects, but for a variety of reasons I abandoned that (the cache meant more heap and trying to get the right balance, new transactions would have to be opened if the side-effect contained graph elements, etc.)
> > > > > > > >
> > > > > > > > I like the approach of just maintaining our single request-response model with the changes I proposed above. It seems to provide the least impact with no new dependencies, is backward compatible and could be completely optional to RemoteConnections.
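
The ordering described in the original proposal (results first, then side-effects, each marked with meta data) can be illustrated with a small Python sketch. The frame layout, the `type` and `key` meta fields, the batch size and the function name `frame_response` are all assumptions made for illustration; none of this is the actual Gremlin Server protocol.

```python
from typing import Any, Dict, Iterable, Iterator, List


def frame_response(results: Iterable[Any],
                   side_effects: Dict[str, Any],
                   requested_keys: List[str],
                   batch_size: int = 2) -> Iterator[Dict[str, Any]]:
    """Yield batched result frames first, then one frame per requested side-effect."""
    batch: List[Any] = []
    for item in results:
        batch.append(item)
        if len(batch) == batch_size:
            yield {"meta": {"type": "result"}, "data": batch}
            batch = []  # start a fresh list so the emitted frame is not mutated
    if batch:
        yield {"meta": {"type": "result"}, "data": batch}

    # "*" stands for "return every side-effect", as floated in the proposal.
    keys = list(side_effects) if "*" in requested_keys else requested_keys
    for key in keys:
        if key in side_effects:
            yield {"meta": {"type": "sideEffect", "key": key},
                   "data": side_effects[key]}


# Hypothetical traversal output and side-effects, reusing names from the thread.
names = ["marko", "vadas", "josh"]
collected = {"x": ["marko", "vadas", "josh"], "y": [29, 27, 32]}

frames = list(frame_response(names, collected, ["x"]))
all_frames = list(frame_response(names, collected, ["*"]))
```

A client reading these frames can hand results to the caller as they arrive and only assemble the side-effects once the trailing frames show up. A request that names no side-effect keys produces result frames only, which matches the stated goal of keeping the feature optional and backward compatible.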