Re: [DISCUSS] Returning Side Effects

Dylan Millikin Fri, 22 Jul 2016 10:25:17 -0700

I can confirm that this is actually a common issue.
One thing to keep in mind is that the sideEffects are often directly
linked to the returned elements on a 1 --> n basis, which neither of the
approaches above really helps with. That is to say, if you're streaming
your results you'll need the sideEffects that relate to each streamed
element.
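
To make the 1 --> n relationship concrete, here is a minimal Python sketch.
It is not TinkerPop or gremlin-python API: the function name
pair_with_side_effects, the (element id, value) shape of the side effects
and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict


def pair_with_side_effects(results, side_effects):
    """Yield (element_id, [side-effect values]) in result order.

    results:      iterable of element ids, already ordered by the query
    side_effects: iterable of (element_id, value) pairs in arbitrary order
    Both shapes are assumptions made for this illustration.
    """
    # Group the unordered side effects by the element they belong to.
    by_element = defaultdict(list)
    for element_id, value in side_effects:
        by_element[element_id].append(value)

    # Emit each result together with its 0..n side-effect values.
    for element_id in results:
        yield element_id, by_element.get(element_id, [])


# Hypothetical data: ordered results, side effects arriving unordered.
ordered_results = [4, 1, 2]
unordered_side_effects = [
    (1, "created lop"),
    (4, "created ripple"),
    (4, "created lop"),
]
paired = list(pair_with_side_effects(ordered_results, unordered_side_effects))
```

Note that the side effects have to be buffered in full before the first
result can be emitted, which is what makes this awkward for streaming.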


There is currently no easy way of handling this, especially if you order
your results and get unordered sideEffect results.
One workaround we've found is very hacky, inefficient, and only works for
non-mutating queries:

- we start a transaction
- we append the sideEffect data to the elements we're emitting (say as
properties of a vertex)
- we get the full result set with the sideEffects as properties of the
result elements
- we roll back the transaction so the properties are not persisted to the
graph
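
The steps above can be modelled with a toy, in-memory Python sketch. It
does not use any TinkerPop API: ToyGraph, its begin/rollback methods, the
"se" property name and results_with_side_effects are hypothetical stand-ins
for a real transactional graph.

```python
import copy


class ToyGraph:
    """In-memory stand-in for a transactional graph (illustration only)."""

    def __init__(self, vertices):
        self.vertices = vertices  # {vertex id: {property: value}}
        self._snapshot = None

    def begin(self):
        # Remember the current state so it can be restored later.
        self._snapshot = copy.deepcopy(self.vertices)

    def rollback(self):
        # Discard every change made since begin().
        self.vertices = self._snapshot
        self._snapshot = None


def results_with_side_effects(graph, ids, side_effects):
    graph.begin()  # start a transaction
    try:
        # Attach the side-effect data to each emitted element as a property.
        for vid in ids:
            graph.vertices[vid]["se"] = side_effects.get(vid, [])
        # Read the full result set; copy it so it survives the rollback.
        return [(vid, copy.deepcopy(graph.vertices[vid])) for vid in ids]
    finally:
        graph.rollback()  # the temporary properties are never persisted


graph = ToyGraph({1: {"name": "marko"}, 2: {"name": "vadas"}})
out = results_with_side_effects(graph, [2, 1], {1: ["x"]})
```

After the call, out carries the side effects alongside each element while
graph.vertices is back to its original state.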

A truly wicked succession of events born from absolute desperation.
I enquired a while back about the ability to treat elements as detached
from the graph in order to do the above without the transaction handling.
But I never followed up.

I figured I would put this out there as another case where non-Java
languages struggle.

On Thu, Jul 21, 2016 at 1:19 PM, Stephen Mallette <[email protected]>
wrote:

> Your way made me think that if you wrote your traversal like that, you
> would return the side-effects twice - once in your traversal as part of the
> standard result and then again as a side-effect.  Not sure what that means
> - just a thought.
>
> While I'm thinking thoughts that may or may not be obvious, it also occurs
> to me that the downside for a GLV retrieving data that way is that the
> result of the traversal won't be streamed back. It will aggregate the
> result (and the side-effects naturally) in memory and then return that all
> as a whole.
>
> On Thu, Jul 21, 2016 at 11:24 AM, Daniel Kuppitz <[email protected]> wrote:
>
> > If you really want to have your result and your side-effects returned
> > by a single request, you could do something like this:
> >
> > gremlin> g.V(1,2,4).aggregate("names").by("name").aggregate("ages").by("age").fold().as("data").select("data", "names", "ages")
> > ==>[data:[v[1], v[2], v[4]], names:[marko, vadas, josh], ages:[29, 27, 32]]
> > gremlin> g.V(1,2,4).aggregate("names").by("name").aggregate("ages").by("age").fold().project("data", "se").by().by(cap("names","ages"))
> > ==>[data:[v[1], v[2], v[4]], se:[names:[marko, vadas, josh], ages:[29, 27, 32]]]
> > gremlin> g.V(1,2,4).aggregate("names").by("name").fold().project("data", "se").by().by(cap("names"))
> > ==>[data:[v[1], v[2], v[4]], se:[marko, vadas, josh]]
> >
> > I'm not saying it would be bad to have Gremlin Server handle that for
> > you, just wanted to show that it's actually pretty easy to get the data
> > and the side-effects without using the traversal admin methods (hence it
> > should work for all GLVs).
> >
> > Cheers,
> > Daniel
> >
> >
> > On Thu, Jul 21, 2016 at 4:51 PM, Stephen Mallette <[email protected]>
> > wrote:
> >
> > > As we look to build out GLVs and expand Gremlin into other programming
> > > languages, one of the important aspects of doing this should be to
> > > consider consistency across GLVs. We should try to prevent capabilities
> > > of Java from being lost in Python, JS, etc.
> > >
> > > As we look at both RemoteGraph in Java and gremlin-python we find that
> > > there is no way to get traversal side-effects. If you write a Traversal
> > > and want side-effects from it, you have to write your traversal to
> > > return them so that it comes back as part of the result set. Since
> > > RemoteGraph and gremlin-python don't really allow you to directly
> > > "submit a script" it's not as though you can execute a traversal once
> > > for both the result and the side-effect and package them together in a
> > > single request as you might do with a simple script request:
> > >
> > > $ curl -X POST -d "{\"gremlin\":\"t=g.V(1).values('name').aggregate('x');[v:t.toList(),se:t.getSideEffects().get('x')]\"}" http://localhost:8182
> > >
> > > {"requestId":"3d3258b2-e421-459a-bf53-ea1e58ece4aa","status":{"message":"","code":200,"attributes":{}},"result":{"data":[{"v":["marko"]},{"se":["marko"]}],"meta":{}}}
> > >
> > > I'm thinking that we could alter things in a non-breaking way to allow
> > > optional return of side-effect data so that there is a way to have this
> > > all streamed back without the need for the little workaround I just
> > > demonstrated. For REST I think we could just include a sideEffect
> > > request parameter that allowed for a list of side-effect keys to
> > > return. Perhaps a "*" could indicate that all should be returned. The
> > > side-effects could be serialized into a key sibling to "data" called
> > > "sideEffect".
> > >
> > > I think a similar approach could be used for websockets and NIO where
> > > we could amend the protocol to accept that sideEffect parameter. We
> > > would first stream results (marked with meta data to specify a
> > > "result") and then stream side effects (again marked with meta data as
> > > such).
> > >
> > > I considered caching the Traversal instances so that a future request
> > > could get the side effects, but for a variety of reasons I abandoned
> > > that (the cache meant more heap and trying to get the right balance,
> > > new transactions would have to be opened if the side-effect contained
> > > graph elements, etc.)
> > >
> > > I like the approach of just maintaining our single request-response
> > > model with the changes I proposed above. It seems to provide the least
> > > impact with no new dependencies, is backward compatible and could be
> > > completely optional to RemoteConnections.
> > >
> >
>
