Yeah, that's what I meant. The steps inside are replaced with some
JanusGraph stuff.

Cheers,
Keith


On Tue, Apr 24, 2018 at 1:52 PM pieter gmail <pieter.mar...@gmail.com>
wrote:

> Nah, that looks to me like the RepeatStep survived. Just the nested
> VertexStep that got replaced with JanusgraphVertexStep.
> Good for them, first prize is not replacing anything.
>
> Cheers
> Pieter
>
> On 24/04/2018 19:50, Keith Lohnes wrote:
> > It looks like it,
> > `g.V().has("foo", "bar").repeat(out()).emit().explain()`
> >
> > yields
> >
> > `[JanusGraphStep([],[foo.eq(bar)]),
> > RepeatStep([JanusGraphVertexStep(OUT,vertex),
> > RepeatEndStep],until(false),emit(true))]`
> >
> >
> >
> > On Tue, Apr 24, 2018 at 12:12 PM pieter gmail <pieter.mar...@gmail.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Sqlg completely replaces TinkerPop's RepeatStep. The idea being that
> >> with g.V().repeat(out()).times(x) only x round trips to the db is needed
> >> regardless of the size of the graph. Each time it will go to the db with
> >> the full set of the previous step's incoming starts.
> >>
> >> But yeah TinkerPop's implementation is always the starting point so I'll
> >> definitely have a look at how you have implemented DFS.
> >>
> >> BTW, does Janus graph use TinkerPop's default RepeatStep as is with no
> >> optimization strategies?
> >>
> >> Cheers
> >> Pieter
> >>
> >> On 24/04/2018 16:33, Keith Lohnes wrote:
> >>> Pieter,
> >>>
> >>> If you take a look at https://github.com/apache/tinkerpop/pull/838 DFS
> >> is
> >>> implemented as a modification to BFS. It's taking the starts that come
> in
> >>> from a BFS and stashing them to be processed later. I haven't seen a
> big
> >>> performance difference on JanusGraph; At least for the queries that
> I've
> >>> been running with it. I'm not terribly familiar with Sqlg, but I wonder
> >> if
> >>> in the context of how DFS is implemented there, it may be less of a
> >>> concern.
> >>>
> >>> Cheers,
> >>> Keith
> >>>
> >>> On Thu, Apr 19, 2018 at 12:46 PM pieter gmail <pieter.mar...@gmail.com
> >
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Not really sure either what 'global' means technically with respect to
> >>>> TinkerPop's current configuration support.
> >>>> Perhaps it can be the start of a global configuration registry that
> can
> >>>> be overridden per traversal.
> >>>>
> >>>> I get that DFS is the preferred default but for Sqlg the performance
> >>>> impact is so great that I'd rather, if possible have BFS as its
> default.
> >>>>
> >>>> I am not sure about this but I reckon that any graph where the
> TinkerPop
> >>>> vm is not running in the same process space as the actual graph/data
> >>>> that latency is a big issue. BFS alleviates the latency issue
> >>>> significantly.
> >>>>
> >>>> Cheers
> >>>> Pieter
> >>>>
> >>>>
> >>>>
> >>>> On 19/04/2018 14:49, Keith Lohnes wrote:
> >>>>>> whether this will affect more than just
> >>>>> repeat()?
> >>>>>
> >>>>> For the PR that I start and the intent of this thread was to only
> >> affect
> >>>>> repeat.
> >>>>>
> >>>>>> I prefer that the semantics of the traversal be specified in the
> >>>> traversal
> >>>>> as a first class citizen.
> >>>>>
> >>>>> +1
> >>>>>
> >>>>>>     I am fine with any default but am wondering whether it would be
> >>>>> worthwhile for the default to be overridden at a global level at not
> >> just
> >>>>> per traversal
> >>>>>
> >>>>> It might be nice, but I guess I'm not sure what `global` means in
> this
> >>>>> context. Configured on the graph object?
> >>>>>
> >>>>> On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz <m...@gremlin.guru>
> >> wrote:
> >>>>>> TinkerPop makes no guarantees about the order of elements unless you
> >>>>>> specify an explicit order. This also goes back to the fact that
> >> certain
> >>>>>> strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy)
> >> add
> >>>>>> NoOpBarrierSteps to your traversal, which ultimately turns it into a
> >>>>>> DFS/BFS mix. Check the .explain() output of your traversal to see
> >> which
> >>>>>> strategy adds which steps.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Daniel
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier <
> >>>>>> mich...@michaelpollmeier.com> wrote:
> >>>>>>
> >>>>>>>> Also it seems to me that DFS only really applies to repeat() with
> an
> >>>>>>>> emit().
> >>>>>>>> g.V().hasLabel("A").repeat().times(2) gets rewritten as
> >>>>>>>> g.V().hasLabel("A").out().out(). Are their subtleties that I am
> not
> >>>>>>>> aware of or does DFV vs BFS not matter in this case?
> >>>>>>> When I read this I thought: clearly `.out().out()` is DFS for OLTP,
> >>>>>>> that's also what the documentation says, e.g. in this nice
> >> infographic
> >>>>>>> http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png
> >>>>>>>
> >>>>>>> However, looks like that's not the case. Has my life been a lie?
> >>>> Setting
> >>>>>>> up a simple flat graph to make things more obvious:
> >>>>>>> v3 <- v1 <- v0 -> v2 -> v4
> >>>>>>>
> >>>>>>> ```
> >>>>>>> graph = TinkerGraph.open()
> >>>>>>> v0 = graph.addVertex("l0")
> >>>>>>> v1 = graph.addVertex("l1")
> >>>>>>> v2 = graph.addVertex("l1")
> >>>>>>> v3 = graph.addVertex("l2")
> >>>>>>> v4 = graph.addVertex("l2")
> >>>>>>> v0.addEdge("e", v2)
> >>>>>>> v2.addEdge("e", v4)
> >>>>>>> v0.addEdge("e", v1)
> >>>>>>> v1.addEdge("e", v3)
> >>>>>>> g = graph.traversal()
> >>>>>>> g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)}
> >>>>>>> ```
> >>>>>>>
> >>>>>>> Prints:
> >>>>>>> v[2]
> >>>>>>> v[1]
> >>>>>>> v[4]
> >>>>>>> ==>v[4]
> >>>>>>> v[3]
> >>>>>>> ==>v[3]
> >>>>>>>
> >>>>>>> If this was OLTP the output would be:
> >>>>>>> v[2]
> >>>>>>> v[4]
> >>>>>>> ==>v[4]
> >>>>>>> v[1]
> >>>>>>> v[3]
> >>>>>>> ==>v[3]
> >>>>>>>
> >>>>>>> Cheers
> >>>>>>> Michael
> >>>>>>>
> >>>>>>> On 18/04/18 02:58, pieter gmail wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I agree with the question about whether this will affect more than
> >>>> just
> >>>>>>>> repeat()?
> >>>>>>>>
> >>>>>>>> I prefer that the semantics of the traversal be specified in the
> >>>>>>>> traversal as a first class citizen. i.e. with order(SearchAlgo).
> >>>>>>>> Strategies are to my mind internal to an implementation. In
> Robert's
> >>>>>>>> example LazyBarrierStrategy may be replaced/removed by an
> >>>>>> implementation
> >>>>>>>> for whatever internal reason they have.
> >>>>>>>>
> >>>>>>>> Regarding the default, I am fine with any default but am wondering
> >>>>>>>> whether it would be worthwhile for the default to be overridden
> at a
> >>>>>>>> global level at not just per traversal? That way the impact can
> also
> >>>> be
> >>>>>>>> alleviated when folks upgrade.
> >>>>>>>>
> >>>>>>>> Also it seems to me that DFS only really applies to repeat() with
> an
> >>>>>>>> emit().
> >>>>>>>> g.V().hasLabel("A").repeat().times(2) gets rewritten as
> >>>>>>>> g.V().hasLabel("A").out().out(). Are their subtleties that I am
> not
> >>>>>>>> aware of or does DFV vs BFS not matter in this case?
> >>>>>>>>
> >>>>>>>> Cheers
> >>>>>>>> Pieter
> >>>>>>>>
> >>>>>>>> On 17/04/2018 14:58, Robert Dale wrote:
> >>>>>>>>> +1 on DFS
> >>>>>>>>> -1 on order(SearchAlgo)
> >>>>>>>>>
> >>>>>>>>> It seems like a strategy may be more appropriate.  It could
> affect
> >>>>>> more
> >>>>>>>>> than just repeat().  And how would this interact with
> >>>>>>>>> LazyBarrierStrategy?
> >>>>>>>>>
> >>>>>>>>> Maybe the default should be DFS with LazyBarrierStrategy. Then
> >>>>>>>>> LazyBarrierStrategy
> >>>>>>>>> can be removed with 'withoutStrategies()' and then it works just
> >> like
> >>>>>>>>> everything else.  I'd prefer consistency with the way everything
> >> else
> >>>>>>>>> works.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Robert Dale
> >>>>>>>>>
> >>>>>>>>> On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette <
> >>>>>> spmalle...@gmail.com
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks for the summary. Just a quick note - I'd not worry about
> >> the
> >>>>>> GLV
> >>>>>>>>>> tests for now. That part should be easy to sort out. Let's first
> >>>> make
> >>>>>>>>>> sure
> >>>>>>>>>> that we get clear on the other items first before digging too
> >> deeply
> >>>>>>>>>> there.
> >>>>>>>>>>
> >>>>>>>>>> On an administrative front, I think that this change should just
> >> go
> >>>>>> to
> >>>>>>>>>> 3.4.0/master (so it's currently targeted to the right branch
> >>>>>>>>>> actually) as
> >>>>>>>>>> it sounds like we want DFS to be the default and that could be a
> >>>>>>>>>> breaking
> >>>>>>>>>> change as the semantics of the traversal shift a bit. So that's
> >> one
> >>>>>>>>>> issue
> >>>>>>>>>> to make sure we're all happy with.
> >>>>>>>>>>
> >>>>>>>>>> A next issue would be the API from the user perspective - thanks
> >> for
> >>>>>>> the
> >>>>>>>>>> link to that JIRA btw - I see I was involved in that
> conversation
> >> a
> >>>>>>>>>> little
> >>>>>>>>>> bit but then dropped off. You'd commented that you preferred a
> per
> >>>>>> loop
> >>>>>>>>>> configuration for search approach which is what you stuck with
> for
> >>>>>> the
> >>>>>>>>>> implementation. I guess I can see that there could be some value
> >> in
> >>>>>>>>>> having
> >>>>>>>>>> that ability. That said, I'm not sure that I like the overload
> of
> >>>>>>>>>> order()
> >>>>>>>>>> as the method for doing that:
> >>>>>>>>>>
> >>>>>>>>>> g.V().has("name", "marko").
> >>>>>>>>>>        repeat(__.outE().order().by("weight", decr).inV()).
> >>>>>>>>>>          emit().
> >>>>>>>>>>          order(SearchAlgo.DFS)
> >>>>>>>>>>
> >>>>>>>>>> Still thinking about what else we might do there, but perhaps
> that
> >>>>>>>>>> can wait
> >>>>>>>>>> a bit as I still need to study your changes in detail.
> Regarding:
> >>>>>>>>>>
> >>>>>>>>>>>      The barrier step that Daniel described doesn’t currently
> work
> >>>>>> since
> >>>>>>>>>> there’s basically booleans in the RepeatStep on whether or not
> to
> >>>>>>>>>> stash the
> >>>>>>>>>> starts to make the RepeatStep depth first.
> >>>>>>>>>>
> >>>>>>>>>> I assume you are referring to these booleans:
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/apache/tinkerpop/blob/
> >>>>>>> fe8ee98f6967c7b0f0ee7f2cf2d105
> >>>>>>>>>> 31b68fab8b/gremlin-core/src/main/java/org/apache/
> >>>>>>>>>> tinkerpop/gremlin/process/traversal/step/branch/
> >>>>>>> RepeatStep.java#L46-L47
> >>>>>>>>>> ?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes <
> lohn...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>>> Stephen, That’s a fair summary. I had an immediate need for it,
> >> so
> >>>> I
> >>>>>>>>>>> developed something based on Michel Pollmeier’s work and a
> >>>>>>> modification
> >>>>>>>>>> to
> >>>>>>>>>>> the syntax Pieter Martin suggested in the Jira
> >>>>>>>>>>> <https://issues.apache.org/jira/browse/TINKERPOP-1822?
> >>>>>>>>>>> focusedCommentId=16233681&page=com.atlassian.jira.
> >>>>>>>>>>>
> plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16233681>
> >>>>>>>>>>> with DFS being explicit.
> >>>>>>>>>>>
> >>>>>>>>>>> The semantics I used were g.V().repeat(out()).order(DFS),
> putting
> >>>>>> the
> >>>>>>>>>>> order
> >>>>>>>>>>> clause outside of the repeat. It made it simpler for me to
> >> develop
> >>>>>> and
> >>>>>>>>>>> seemed nice to make that DFS vs BFS explicit. The barrier step
> >> that
> >>>>>>>>>> Daniel
> >>>>>>>>>>> described doesn’t currently work since there’s basically
> booleans
> >>>> in
> >>>>>>>>>>> the
> >>>>>>>>>>> RepeatStep on whether or not to stash the starts to make the
> >>>>>>> RepeatStep
> >>>>>>>>>>> depth first.
> >>>>>>>>>>>
> >>>>>>>>>>> I added a test for the DFS, though I’ve had some issues getting
> >>>>>>>>>>> things to
> >>>>>>>>>>> run re: .net tests and some other tests timing out. I have some
> >>>> more
> >>>>>>>>>> tests
> >>>>>>>>>>> I'd like to write based off issues that I ran into testing this
> >> in
> >>>>>> the
> >>>>>>>>>>> console (k-ary trees, until/emit before the repeat vs after),
> >> but I
> >>>>>>>>>> really
> >>>>>>>>>>> wanted to get this out there for people to take a look at and
> see
> >>>> if
> >>>>>>> it
> >>>>>>>>>>> could work out.
> >>>>>>>>>>> ​
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette <
> >>>>>>> spmalle...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>      Keith, I have to admit that I didn't really follow this
> >> general
> >>>>>>> body
> >>>>>>>>>> of
> >>>>>>>>>>>> work closely, which include this pull request, the earlier one
> >>>> from
> >>>>>>>>>>> Michael
> >>>>>>>>>>>> Pollmeier, the JIRA and I think some discussion somewhere on
> one
> >>>> of
> >>>>>>>>>>>> the
> >>>>>>>>>>>> mailing lists. As best I have gathered from what I did
> follow, I
> >>>>>> feel
> >>>>>>>>>>> like
> >>>>>>>>>>>> the focus of this work so far was mostly related to "can
> someone
> >>>>>>>>>>>> get it
> >>>>>>>>>>> to
> >>>>>>>>>>>> actually work", but not a lot about other aspects like
> testing,
> >>>>>> api,
> >>>>>>>>>>>> release branch to apply it to, etc. Is that a fair depiction
> of
> >>>> how
> >>>>>>>>>> this
> >>>>>>>>>>>> work has developed so far? if so, let's use this thread to
> make
> >>>>>> sure
> >>>>>>>>>>> we're
> >>>>>>>>>>>> all on the same page as to how this change will get in on all
> >>>> those
> >>>>>>>>>> sorts
> >>>>>>>>>>>> of issues.
> >>>>>>>>>>>>
> >>>>>>>>>>>> btw, thanks to you and michael for sticking at this to come to
> >>>>>>>>>> something
> >>>>>>>>>>>> that seems to work. looking forward to the continued
> discussion
> >> on
> >>>>>>>>>>>> this
> >>>>>>>>>>>> thread.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier <
> >>>>>>>>>>>> mich...@michaelpollmeier.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My
> >>>>>> feeling
> >>>>>>>>>> is
> >>>>>>>>>>>>> that most users currently expect it to be DFS since that's
> what
> >>>>>> the
> >>>>>>>>>>> docs
> >>>>>>>>>>>>> say.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> And yes, it's easy to verify the default in the test suite,
> >> once
> >>>>>> we
> >>>>>>>>>>>>> agreed on what the default should be.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>> Michael
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 17/04/18 04:40, pieter gmail wrote:
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I have not properly followed the previous thread. However I
> >>>>>> thought
> >>>>>>>>>>> is
> >>>>>>>>>>>>>> going to be a way to set the default behavior as apposed to
> >>>>>> needing
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> use barrier()
> >>>>>>>>>>>>>> Is this the case or not?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For Sqlg at least it is possible to optimize BFS much more
> >>>>>>>>>>> effectively
> >>>>>>>>>>>>>> than DFS so it will be nice to have a way to set the
> strategy
> >>>>>>>>>> rather
> >>>>>>>>>>>>>> than having to manually inject barriers.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is the test suite going to enforce the BFS vs DFS?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>> Pieter
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 16/04/2018 16:56, Daniel Kuppitz wrote:
> >>>>>>>>>>>>>>> +1 for DFS. If the query relies on BFS, you can always do
> >>>>>>>>>>>>>>> .repeat(....barrier())...
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ^ This holds true as long as there's no significant
> >> difference
> >>>>>> in
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> cpu+memory consumption and overall performance of the two
> >>>>>>>>>>> approaches.
> >>>>>>>>>>>>> BFS
> >>>>>>>>>>>>>>> has its advantages when it comes to bulking; an arbitrary
> >>>> number
> >>>>>>>>>> of
> >>>>>>>>>>>>>>> traversers on the same element is processed at the same
> pace
> >> as
> >>>>>> a
> >>>>>>>>>>>> single
> >>>>>>>>>>>>>>> traverser. I don't think we can benefit from bulking in
> DFS.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>> Daniel
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes <
> >>>>>> lohn...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> As part of #838 <
> >> https://github.com/apache/tinkerpop/pull/838
> >>>>>>>>>>>> there’s
> >>>>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>> discussion around whether or not to make DFS the default
> for
> >>>>>> the
> >>>>>>>>>>>> repeat
> >>>>>>>>>>>>>>>> step. On the one hand, everything else in OLTP is depth
> >> first.
> >>>>>> On
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>> other
> >>>>>>>>>>>>>>>> hand, there’s likely existing traversals that depend on
> the
> >>>>>>>>>> breadth
> >>>>>>>>>>>>>>>> first
> >>>>>>>>>>>>>>>> nature of repeat. My general preference is to make DFS
> >>>> optional
> >>>>>>>>>> at
> >>>>>>>>>>>>>>>> first,
> >>>>>>>>>>>>>>>> and at some later date, change the default and have that
> be
> >> a
> >>>>>>>>>>>> separate
> >>>>>>>>>>>>>>>> change from implementing DFS for repeat
> >>>>>>>>>>>>>>>> ​
> >>>>>>>>>>>>>>>>
> >>
>
>

Reply via email to