> whether this will affect more than just repeat()? For the PR that I start and the intent of this thread was to only affect repeat.
> I prefer that the semantics of the traversal be specified in the traversal as a first class citizen. +1 > I am fine with any default but am wondering whether it would be worthwhile for the default to be overridden at a global level at not just per traversal It might be nice, but I guess I'm not sure what `global` means in this context. Configured on the graph object? On Tue, Apr 17, 2018 at 9:28 PM Daniel Kuppitz <m...@gremlin.guru> wrote: > TinkerPop makes no guarantees about the order of elements unless you > specify an explicit order. This also goes back to the fact that certain > strategies (LazyBarrier-, RepeatUnroll- and PathRetractionStrategy) add > NoOpBarrierSteps to your traversal, which ultimately turns it into a > DFS/BFS mix. Check the .explain() output of your traversal to see which > strategy adds which steps. > > Cheers, > Daniel > > > On Tue, Apr 17, 2018 at 4:45 PM, Michael Pollmeier < > mich...@michaelpollmeier.com> wrote: > > > > Also it seems to me that DFS only really applies to repeat() with an > > > emit(). > > > g.V().hasLabel("A").repeat().times(2) gets rewritten as > > > g.V().hasLabel("A").out().out(). Are their subtleties that I am not > > > aware of or does DFV vs BFS not matter in this case? > > > > When I read this I thought: clearly `.out().out()` is DFS for OLTP, > > that's also what the documentation says, e.g. in this nice infographic > > http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png > > > > However, looks like that's not the case. Has my life been a lie? Setting > > up a simple flat graph to make things more obvious: > > v3 <- v1 <- v0 -> v2 -> v4 > > > > ``` > > graph = TinkerGraph.open() > > v0 = graph.addVertex("l0") > > v1 = graph.addVertex("l1") > > v2 = graph.addVertex("l1") > > v3 = graph.addVertex("l2") > > v4 = graph.addVertex("l2") > > v0.addEdge("e", v2) > > v2.addEdge("e", v4) > > v0.addEdge("e", v1) > > v1.addEdge("e", v3) > > g = graph.traversal() > > g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)} > > ``` > > > > Prints: > > v[2] > > v[1] > > v[4] > > ==>v[4] > > v[3] > > ==>v[3] > > > > If this was OLTP the output would be: > > v[2] > > v[4] > > ==>v[4] > > v[1] > > v[3] > > ==>v[3] > > > > Cheers > > Michael > > > > On 18/04/18 02:58, pieter gmail wrote: > > > Hi, > > > > > > I agree with the question about whether this will affect more than just > > > repeat()? > > > > > > I prefer that the semantics of the traversal be specified in the > > > traversal as a first class citizen. i.e. with order(SearchAlgo). > > > Strategies are to my mind internal to an implementation. In Robert's > > > example LazyBarrierStrategy may be replaced/removed by an > implementation > > > for whatever internal reason they have. > > > > > > Regarding the default, I am fine with any default but am wondering > > > whether it would be worthwhile for the default to be overridden at a > > > global level at not just per traversal? That way the impact can also be > > > alleviated when folks upgrade. > > > > > > Also it seems to me that DFS only really applies to repeat() with an > > > emit(). > > > g.V().hasLabel("A").repeat().times(2) gets rewritten as > > > g.V().hasLabel("A").out().out(). Are their subtleties that I am not > > > aware of or does DFV vs BFS not matter in this case? > > > > > > Cheers > > > Pieter > > > > > > On 17/04/2018 14:58, Robert Dale wrote: > > >> +1 on DFS > > >> -1 on order(SearchAlgo) > > >> > > >> It seems like a strategy may be more appropriate. It could affect > more > > >> than just repeat(). And how would this interact with > > >> LazyBarrierStrategy? > > >> > > >> Maybe the default should be DFS with LazyBarrierStrategy. Then > > >> LazyBarrierStrategy > > >> can be removed with 'withoutStrategies()' and then it works just like > > >> everything else. I'd prefer consistency with the way everything else > > >> works. > > >> > > >> > > >> > > >> Robert Dale > > >> > > >> On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette < > spmalle...@gmail.com > > > > > >> wrote: > > >> > > >>> Thanks for the summary. Just a quick note - I'd not worry about the > GLV > > >>> tests for now. That part should be easy to sort out. Let's first make > > >>> sure > > >>> that we get clear on the other items first before digging too deeply > > >>> there. > > >>> > > >>> On an administrative front, I think that this change should just go > to > > >>> 3.4.0/master (so it's currently targeted to the right branch > > >>> actually) as > > >>> it sounds like we want DFS to be the default and that could be a > > >>> breaking > > >>> change as the semantics of the traversal shift a bit. So that's one > > >>> issue > > >>> to make sure we're all happy with. > > >>> > > >>> A next issue would be the API from the user perspective - thanks for > > the > > >>> link to that JIRA btw - I see I was involved in that conversation a > > >>> little > > >>> bit but then dropped off. You'd commented that you preferred a per > loop > > >>> configuration for search approach which is what you stuck with for > the > > >>> implementation. I guess I can see that there could be some value in > > >>> having > > >>> that ability. That said, I'm not sure that I like the overload of > > >>> order() > > >>> as the method for doing that: > > >>> > > >>> g.V().has("name", "marko"). > > >>> repeat(__.outE().order().by("weight", decr).inV()). > > >>> emit(). > > >>> order(SearchAlgo.DFS) > > >>> > > >>> Still thinking about what else we might do there, but perhaps that > > >>> can wait > > >>> a bit as I still need to study your changes in detail. Regarding: > > >>> > > >>>> The barrier step that Daniel described doesn’t currently work > since > > >>> there’s basically booleans in the RepeatStep on whether or not to > > >>> stash the > > >>> starts to make the RepeatStep depth first. > > >>> > > >>> I assume you are referring to these booleans: > > >>> > > >>> https://github.com/apache/tinkerpop/blob/ > > fe8ee98f6967c7b0f0ee7f2cf2d105 > > >>> 31b68fab8b/gremlin-core/src/main/java/org/apache/ > > >>> tinkerpop/gremlin/process/traversal/step/branch/ > > RepeatStep.java#L46-L47 > > >>> > > >>> ? > > >>> > > >>> > > >>> > > >>> On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes <lohn...@gmail.com> > > wrote: > > >>> > > >>>> Stephen, That’s a fair summary. I had an immediate need for it, so I > > >>>> developed something based on Michel Pollmeier’s work and a > > modification > > >>> to > > >>>> the syntax Pieter Martin suggested in the Jira > > >>>> <https://issues.apache.org/jira/browse/TINKERPOP-1822? > > >>>> focusedCommentId=16233681&page=com.atlassian.jira. > > >>>> plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16233681> > > >>>> with DFS being explicit. > > >>>> > > >>>> The semantics I used were g.V().repeat(out()).order(DFS), putting > the > > >>>> order > > >>>> clause outside of the repeat. It made it simpler for me to develop > and > > >>>> seemed nice to make that DFS vs BFS explicit. The barrier step that > > >>> Daniel > > >>>> described doesn’t currently work since there’s basically booleans in > > >>>> the > > >>>> RepeatStep on whether or not to stash the starts to make the > > RepeatStep > > >>>> depth first. > > >>>> > > >>>> I added a test for the DFS, though I’ve had some issues getting > > >>>> things to > > >>>> run re: .net tests and some other tests timing out. I have some more > > >>> tests > > >>>> I'd like to write based off issues that I ran into testing this in > the > > >>>> console (k-ary trees, until/emit before the repeat vs after), but I > > >>> really > > >>>> wanted to get this out there for people to take a look at and see if > > it > > >>>> could work out. > > >>>> > > >>>> > > >>>> On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette < > > spmalle...@gmail.com> > > >>>> wrote: > > >>>> > > >>>>> Keith, I have to admit that I didn't really follow this general > > body > > >>> of > > >>>>> work closely, which include this pull request, the earlier one from > > >>>> Michael > > >>>>> Pollmeier, the JIRA and I think some discussion somewhere on one of > > >>>>> the > > >>>>> mailing lists. As best I have gathered from what I did follow, I > feel > > >>>> like > > >>>>> the focus of this work so far was mostly related to "can someone > > >>>>> get it > > >>>> to > > >>>>> actually work", but not a lot about other aspects like testing, > api, > > >>>>> release branch to apply it to, etc. Is that a fair depiction of how > > >>> this > > >>>>> work has developed so far? if so, let's use this thread to make > sure > > >>>> we're > > >>>>> all on the same page as to how this change will get in on all those > > >>> sorts > > >>>>> of issues. > > >>>>> > > >>>>> btw, thanks to you and michael for sticking at this to come to > > >>> something > > >>>>> that seems to work. looking forward to the continued discussion on > > >>>>> this > > >>>>> thread. > > >>>>> > > >>>>> > > >>>>> On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier < > > >>>>> mich...@michaelpollmeier.com> wrote: > > >>>>> > > >>>>>> Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My > feeling > > >>> is > > >>>>>> that most users currently expect it to be DFS since that's what > the > > >>>> docs > > >>>>>> say. > > >>>>>> > > >>>>>> And yes, it's easy to verify the default in the test suite, once > we > > >>>>>> agreed on what the default should be. > > >>>>>> > > >>>>>> Cheers > > >>>>>> Michael > > >>>>>> > > >>>>>> On 17/04/18 04:40, pieter gmail wrote: > > >>>>>>> Hi, > > >>>>>>> > > >>>>>>> I have not properly followed the previous thread. However I > thought > > >>>> is > > >>>>>>> going to be a way to set the default behavior as apposed to > needing > > >>>> to > > >>>>>>> use barrier() > > >>>>>>> Is this the case or not? > > >>>>>>> > > >>>>>>> For Sqlg at least it is possible to optimize BFS much more > > >>>> effectively > > >>>>>>> than DFS so it will be nice to have a way to set the strategy > > >>> rather > > >>>>>>> than having to manually inject barriers. > > >>>>>>> > > >>>>>>> Is the test suite going to enforce the BFS vs DFS? > > >>>>>>> > > >>>>>>> Thanks > > >>>>>>> Pieter > > >>>>>>> > > >>>>>>> On 16/04/2018 16:56, Daniel Kuppitz wrote: > > >>>>>>>> +1 for DFS. If the query relies on BFS, you can always do > > >>>>>>>> .repeat(....barrier())... > > >>>>>>>> > > >>>>>>>> ^ This holds true as long as there's no significant difference > in > > >>>> the > > >>>>>>>> cpu+memory consumption and overall performance of the two > > >>>> approaches. > > >>>>>> BFS > > >>>>>>>> has its advantages when it comes to bulking; an arbitrary number > > >>> of > > >>>>>>>> traversers on the same element is processed at the same pace as > a > > >>>>> single > > >>>>>>>> traverser. I don't think we can benefit from bulking in DFS. > > >>>>>>>> > > >>>>>>>> Cheers, > > >>>>>>>> Daniel > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes < > lohn...@gmail.com> > > >>>>>> wrote: > > >>>>>>>>> As part of #838 <https://github.com/apache/tinkerpop/pull/838> > > >>>>> there’s > > >>>>>>>>> some > > >>>>>>>>> discussion around whether or not to make DFS the default for > the > > >>>>> repeat > > >>>>>>>>> step. On the one hand, everything else in OLTP is depth first. > On > > >>>> the > > >>>>>>>>> other > > >>>>>>>>> hand, there’s likely existing traversals that depend on the > > >>> breadth > > >>>>>>>>> first > > >>>>>>>>> nature of repeat. My general preference is to make DFS optional > > >>> at > > >>>>>>>>> first, > > >>>>>>>>> and at some later date, change the default and have that be a > > >>>>> separate > > >>>>>>>>> change from implementing DFS for repeat > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>> > > >>>>>> > > > > > > > > > > >