> Also it seems to me that DFS only really applies to repeat() with an > emit(). > g.V().hasLabel("A").repeat().times(2) gets rewritten as > g.V().hasLabel("A").out().out(). Are their subtleties that I am not > aware of or does DFV vs BFS not matter in this case?
When I read this I thought: clearly `.out().out()` is DFS for OLTP, that's also what the documentation says, e.g. in this nice infographic http://tinkerpop.apache.org/docs/current/images/oltp-vs-olap.png However, looks like that's not the case. Has my life been a lie? Setting up a simple flat graph to make things more obvious: v3 <- v1 <- v0 -> v2 -> v4 ``` graph = TinkerGraph.open() v0 = graph.addVertex("l0") v1 = graph.addVertex("l1") v2 = graph.addVertex("l1") v3 = graph.addVertex("l2") v4 = graph.addVertex("l2") v0.addEdge("e", v2) v2.addEdge("e", v4) v0.addEdge("e", v1) v1.addEdge("e", v3) g = graph.traversal() g.V(v0).out().sideEffect{println(it)}.out().sideEffect{println(it)} ``` Prints: v[2] v[1] v[4] ==>v[4] v[3] ==>v[3] If this was OLTP the output would be: v[2] v[4] ==>v[4] v[1] v[3] ==>v[3] Cheers Michael On 18/04/18 02:58, pieter gmail wrote: > Hi, > > I agree with the question about whether this will affect more than just > repeat()? > > I prefer that the semantics of the traversal be specified in the > traversal as a first class citizen. i.e. with order(SearchAlgo). > Strategies are to my mind internal to an implementation. In Robert's > example LazyBarrierStrategy may be replaced/removed by an implementation > for whatever internal reason they have. > > Regarding the default, I am fine with any default but am wondering > whether it would be worthwhile for the default to be overridden at a > global level at not just per traversal? That way the impact can also be > alleviated when folks upgrade. > > Also it seems to me that DFS only really applies to repeat() with an > emit(). > g.V().hasLabel("A").repeat().times(2) gets rewritten as > g.V().hasLabel("A").out().out(). Are their subtleties that I am not > aware of or does DFV vs BFS not matter in this case? > > Cheers > Pieter > > On 17/04/2018 14:58, Robert Dale wrote: >> +1 on DFS >> -1 on order(SearchAlgo) >> >> It seems like a strategy may be more appropriate. It could affect more >> than just repeat(). And how would this interact with >> LazyBarrierStrategy? >> >> Maybe the default should be DFS with LazyBarrierStrategy. Then >> LazyBarrierStrategy >> can be removed with 'withoutStrategies()' and then it works just like >> everything else. I'd prefer consistency with the way everything else >> works. >> >> >> >> Robert Dale >> >> On Tue, Apr 17, 2018 at 8:11 AM, Stephen Mallette <spmalle...@gmail.com> >> wrote: >> >>> Thanks for the summary. Just a quick note - I'd not worry about the GLV >>> tests for now. That part should be easy to sort out. Let's first make >>> sure >>> that we get clear on the other items first before digging too deeply >>> there. >>> >>> On an administrative front, I think that this change should just go to >>> 3.4.0/master (so it's currently targeted to the right branch >>> actually) as >>> it sounds like we want DFS to be the default and that could be a >>> breaking >>> change as the semantics of the traversal shift a bit. So that's one >>> issue >>> to make sure we're all happy with. >>> >>> A next issue would be the API from the user perspective - thanks for the >>> link to that JIRA btw - I see I was involved in that conversation a >>> little >>> bit but then dropped off. You'd commented that you preferred a per loop >>> configuration for search approach which is what you stuck with for the >>> implementation. I guess I can see that there could be some value in >>> having >>> that ability. That said, I'm not sure that I like the overload of >>> order() >>> as the method for doing that: >>> >>> g.V().has("name", "marko"). >>> repeat(__.outE().order().by("weight", decr).inV()). >>> emit(). >>> order(SearchAlgo.DFS) >>> >>> Still thinking about what else we might do there, but perhaps that >>> can wait >>> a bit as I still need to study your changes in detail. Regarding: >>> >>>> The barrier step that Daniel described doesn’t currently work since >>> there’s basically booleans in the RepeatStep on whether or not to >>> stash the >>> starts to make the RepeatStep depth first. >>> >>> I assume you are referring to these booleans: >>> >>> https://github.com/apache/tinkerpop/blob/fe8ee98f6967c7b0f0ee7f2cf2d105 >>> 31b68fab8b/gremlin-core/src/main/java/org/apache/ >>> tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java#L46-L47 >>> >>> ? >>> >>> >>> >>> On Tue, Apr 17, 2018 at 7:37 AM, Keith Lohnes <lohn...@gmail.com> wrote: >>> >>>> Stephen, That’s a fair summary. I had an immediate need for it, so I >>>> developed something based on Michel Pollmeier’s work and a modification >>> to >>>> the syntax Pieter Martin suggested in the Jira >>>> <https://issues.apache.org/jira/browse/TINKERPOP-1822? >>>> focusedCommentId=16233681&page=com.atlassian.jira. >>>> plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16233681> >>>> with DFS being explicit. >>>> >>>> The semantics I used were g.V().repeat(out()).order(DFS), putting the >>>> order >>>> clause outside of the repeat. It made it simpler for me to develop and >>>> seemed nice to make that DFS vs BFS explicit. The barrier step that >>> Daniel >>>> described doesn’t currently work since there’s basically booleans in >>>> the >>>> RepeatStep on whether or not to stash the starts to make the RepeatStep >>>> depth first. >>>> >>>> I added a test for the DFS, though I’ve had some issues getting >>>> things to >>>> run re: .net tests and some other tests timing out. I have some more >>> tests >>>> I'd like to write based off issues that I ran into testing this in the >>>> console (k-ary trees, until/emit before the repeat vs after), but I >>> really >>>> wanted to get this out there for people to take a look at and see if it >>>> could work out. >>>> >>>> >>>> On Mon, Apr 16, 2018 at 7:29 PM Stephen Mallette <spmalle...@gmail.com> >>>> wrote: >>>> >>>>> Keith, I have to admit that I didn't really follow this general body >>> of >>>>> work closely, which include this pull request, the earlier one from >>>> Michael >>>>> Pollmeier, the JIRA and I think some discussion somewhere on one of >>>>> the >>>>> mailing lists. As best I have gathered from what I did follow, I feel >>>> like >>>>> the focus of this work so far was mostly related to "can someone >>>>> get it >>>> to >>>>> actually work", but not a lot about other aspects like testing, api, >>>>> release branch to apply it to, etc. Is that a fair depiction of how >>> this >>>>> work has developed so far? if so, let's use this thread to make sure >>>> we're >>>>> all on the same page as to how this change will get in on all those >>> sorts >>>>> of issues. >>>>> >>>>> btw, thanks to you and michael for sticking at this to come to >>> something >>>>> that seems to work. looking forward to the continued discussion on >>>>> this >>>>> thread. >>>>> >>>>> >>>>> On Mon, Apr 16, 2018 at 6:54 PM, Michael Pollmeier < >>>>> mich...@michaelpollmeier.com> wrote: >>>>> >>>>>> Unsurprisingly I'm also +1 for defaulting to DFS in OLTP. My feeling >>> is >>>>>> that most users currently expect it to be DFS since that's what the >>>> docs >>>>>> say. >>>>>> >>>>>> And yes, it's easy to verify the default in the test suite, once we >>>>>> agreed on what the default should be. >>>>>> >>>>>> Cheers >>>>>> Michael >>>>>> >>>>>> On 17/04/18 04:40, pieter gmail wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I have not properly followed the previous thread. However I thought >>>> is >>>>>>> going to be a way to set the default behavior as apposed to needing >>>> to >>>>>>> use barrier() >>>>>>> Is this the case or not? >>>>>>> >>>>>>> For Sqlg at least it is possible to optimize BFS much more >>>> effectively >>>>>>> than DFS so it will be nice to have a way to set the strategy >>> rather >>>>>>> than having to manually inject barriers. >>>>>>> >>>>>>> Is the test suite going to enforce the BFS vs DFS? >>>>>>> >>>>>>> Thanks >>>>>>> Pieter >>>>>>> >>>>>>> On 16/04/2018 16:56, Daniel Kuppitz wrote: >>>>>>>> +1 for DFS. If the query relies on BFS, you can always do >>>>>>>> .repeat(....barrier())... >>>>>>>> >>>>>>>> ^ This holds true as long as there's no significant difference in >>>> the >>>>>>>> cpu+memory consumption and overall performance of the two >>>> approaches. >>>>>> BFS >>>>>>>> has its advantages when it comes to bulking; an arbitrary number >>> of >>>>>>>> traversers on the same element is processed at the same pace as a >>>>> single >>>>>>>> traverser. I don't think we can benefit from bulking in DFS. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Daniel >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Apr 16, 2018 at 5:44 AM, Keith Lohnes <lohn...@gmail.com> >>>>>> wrote: >>>>>>>>> As part of #838 <https://github.com/apache/tinkerpop/pull/838> >>>>> there’s >>>>>>>>> some >>>>>>>>> discussion around whether or not to make DFS the default for the >>>>> repeat >>>>>>>>> step. On the one hand, everything else in OLTP is depth first. On >>>> the >>>>>>>>> other >>>>>>>>> hand, there’s likely existing traversals that depend on the >>> breadth >>>>>>>>> first >>>>>>>>> nature of repeat. My general preference is to make DFS optional >>> at >>>>>>>>> first, >>>>>>>>> and at some later date, change the default and have that be a >>>>> separate >>>>>>>>> change from implementing DFS for repeat >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>> > >
signature.asc
Description: OpenPGP digital signature