What branch are you in? Perhaps give me URLs (to GitHub) to the files touched? (if its not too many)
Marko. http://markorodriguez.com > On Nov 1, 2016, at 7:19 AM, pieter-gmail <[email protected]> wrote: > > Hi, > > Yes I am but afraid I do not have time at present to concentrate on it. > I just noticed your ImmutablePath ticket which will overlap with some of > what I have done. > > I'd suggest to pull my branch and look at what I did there. It was very > little, but dangerous code, which is why I was reluctant to submit a PR > at first. If you don't continue with it, I should in 2 or 3 weeks be > able to look at it again. > > Thanks > Pieter > > > On 01/11/2016 15:11, Marko Rodriguez wrote: >> Hi Pieter, >> >> I’m still really interested in your work in this area. Are you still doing >> this? >> >> Marko. >> >> http://markorodriguez.com >> >> >> >>> On Aug 7, 2016, at 9:12 AM, Pieter Martin <[email protected]> wrote: >>> >>> To avoid the collection logic alltogether. For most steps there is no need >>> to >>> check the labels as it is known that they are added, immutable and correct. >>> >>> Also with the current strategy the `ImmutablePath.currentLabels` is exactly >>> the >>> same collection as that of the step. It is not a copy. >>> >>> `Traverser.addLabels()` is only called for the steps previously mentioned. >>> They >>> too can be optimized as there is no need to create a new `ImmutablePath` >>> just to >>> set the labels. There could be a `Traverer.split()` method that creates the >>> initial `ImmutablePath` with the correct labels to start with. As things >>> stand >>> now the `ImmutablePath` is created just to be replaced 1 or 2 millis later. >>> I >>> have however not benchmarked any of those queries so am not touching that at >>> the moment. >>> >>> In some ways its a matter of style. The label logic in `AbstractStep` is not >>> relevant to all of its subclasses and should be higher in the inheritance >>> hierarchy. >>> >>> Cheers >>> Pieter >>> >>> >>> Excerpts from Marko Rodriguez's message of August 7, 2016 3:54 : >>>> Why not just check to see if the labels to be added already exist, if they >>>> do, don’t addLabels() and thus, don’t create a new collection. >>>> Marko. >>>> http://markorodriguez.com >>>>> On Aug 7, 2016, at 6:07 AM, Pieter Martin <[email protected]> wrote: >>>>> Here is what I have come up with so far. >>>>> The idea is that `Traverser.split(r, step)` already copies the labels to >>>>> the >>>>> traverser so there is no need to call `Traverser.addLabels(labels)` again. >>>>> I removed the `Traverser.addLabels(labels)` call from `AbstractStep`. >>>>> For the traversers that do not call `Traverer.split(r, step)` I manually >>>>> added >>>>> the `traverser.addLabels(labels)` call in `processNextStart()`. This was >>>>> done >>>>> by fixing test failures rather than searching and investigating all calls >>>>> to >>>>> `Traverser.split()`. >>>>> The following steps needed the labels to be added manually. >>>>> `AggregateStep` >>>>> `CollectingBarrierStep` >>>>> `FilterStep` >>>>> `SideEffectStep` >>>>> `StartStep` >>>>> `NoOpBarrierStep` >>>>> Further seeing as `Step.getLabels()` already returns a unmodifiable >>>>> collection and >>>>> `ImmutablePath` is well immutable there is no need for it to have its own >>>>> copy >>>>> of the labels. I set the labels directly on the path as oppose to making >>>>> a copy. >>>>> `TinkerGraphProcessComputerTest` passes. >>>>> `TinkerGraphProcessStandardTest` passes. >>>>> `TinkerGraphStructureStandardTest` has one failure. >>>>> `SerializationTest$GryoTest.shouldSerializePathAsDetached` fails with >>>>> `java.lang.IllegalArgumentException: Class is not registered: >>>>> java.util.Collections$UnmodifiableSet` >>>>> Let me know what you think. >>>>> Thanks >>>>> Pieter >>>>> Excerpts from Ted Wilmes's message of August 7, 2016 12:16 : >>>>>> Neat find Pieter. With regards to the update to ImmutablePath, I think >>>>>> that defensive copying of inbound collections is generally a good idea >>>>>> but >>>>>> if we can target specific areas where we can reap big gains from not >>>>>> creating new collections it may be worth it to relax that constraint, >>>>>> especially if we're only talking about internal APIs. If we do relax >>>>>> that >>>>>> constraint though, we'll need to careful during code review and unit >>>>>> testing because those bugs can be difficult to track down. The other >>>>>> nice >>>>>> thing that the defensive copy gets us, particularly in the case of the >>>>>> ImmutablePath, is that it isolates ImmutablePath from whatever the >>>>>> subclass >>>>>> of set was that the caller passed in. I think that's what is causing the >>>>>> serialization test failure in this case since the caller passed in an >>>>>> unmodifiable set. >>>>>> --Ted >>>>>> On Fri, Aug 5, 2016, 2:31 PM Marko Rodriguez <[email protected]> >>>>>> wrote: >>>>>>> Hello, >>>>>>> This is cool. Check out also ImmutablePath.extend(labels) as that is >>>>>>> ultimately what Traverser.addLabels() calls. We have a lot of set >>>>>>> copying >>>>>>> and I don’t know if its needed (as you seem to be demonstrating). What I >>>>>>> don’t like about your solution is the explicit reference to the >>>>>>> B_L_P…Traverser in AbstractStep. See if you can work your solution >>>>>>> without >>>>>>> it. >>>>>>> Good luck, >>>>>>> Marko. >>>>>>> http://markorodriguez.com >>>>>>>> On Aug 5, 2016, at 12:44 PM, pieter-gmail <[email protected]> >>>>>>> wrote: >>>>>>>> Sorry forgot to add a rather important part. >>>>>>>> >>>>>>>> I changed ImmutablePath's constructor to >>>>>>>> >>>>>>>> private ImmutablePath(final ImmutablePathImpl previousPath, final >>>>>>>> Object currentObject, final Set<String> currentLabels) { >>>>>>>> this.previousPath = previousPath; >>>>>>>> this.currentObject = currentObject; >>>>>>>> this.currentLabels = currentLabels; >>>>>>>> // this.currentLabels.addAll(currentLabels); >>>>>>>> } >>>>>>>> >>>>>>>> Setting the collection directly as oppose to `addAll` >>>>>>>> >>>>>>>> Thanks >>>>>>>> Pieter >>>>>>>> >>>>>>>> >>>>>>>> On 05/08/2016 20:40, pieter-gmail wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I have been optimizing Sqlg of late and eventually arrived at >>>>>>>>> TinkerPop >>>>>>>>> code. >>>>>>>>> >>>>>>>>> The gremlin in particular that I am interested is path queries. >>>>>>>>> >>>>>>>>> Here is the test that I am running in jmh. >>>>>>>>> >>>>>>>>> //@Setup >>>>>>>>> Vertex a = graph.addVertex(T.label, "A", "name", "a1"); >>>>>>>>> for (int i = 1; i < 1_000_001; i++) { >>>>>>>>> Vertex b = graph.addVertex(T.label, "B", "name", "name_" + >>>>>>> i); >>>>>>>>> a.addEdge("outB", b); >>>>>>>>> for (int j = 0; j < 1; j++) { >>>>>>>>> Vertex c = graph.addVertex(T.label, "C", "name", "name_" >>>>>>>>> + i + " " + j); >>>>>>>>> b.addEdge("outC", c); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> And the query being benchmarked is >>>>>>>>> >>>>>>>>> GraphTraversal<Vertex, Path> traversal = >>>>>>>>> g.V(a).as("a").out().as("b").out().as("c").path(); >>>>>>>>> while (traversal.hasNext()) { >>>>>>>>> Path path = traversal.next(); >>>>>>>>> } >>>>>>>>> >>>>>>>>> Before the optimization, (as things are now) >>>>>>>>> >>>>>>>>> Benchmark Mode Cnt Score Error >>>>>>> Units >>>>>>>>> GremlinPathBenchmark.g_path avgt 100 1.086 ± 0.020 s/op >>>>>>>>> >>>>>>>>> The optimization I did is in AbstractStep.prepareTraversalForNextStep, >>>>>>>>> to not call addLabels() for path gremlins as the labels are known by >>>>>>>>> the >>>>>>>>> step and do not change again so there is not need to keep adding them. >>>>>>>>> >>>>>>>>> private final Traverser.Admin<E> prepareTraversalForNextStep(final >>>>>>>>> Traverser.Admin<E> traverser) { >>>>>>>>> if (!this.traverserStepIdAndLabelsSetByChild) { >>>>>>>>> traverser.setStepId(this.nextStep.getId()); >>>>>>>>> if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) { >>>>>>>>> } else { >>>>>>>>> traverser.addLabels(this.labels); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> return traverser; >>>>>>>>> } >>>>>>>>> >>>>>>>>> After optimization, >>>>>>>>> >>>>>>>>> Benchmark Mode Cnt Score Error >>>>>>> Units >>>>>>>>> GremlinPathBenchmark.g_path avgt 100 0.680 ± 0.004 s/op >>>>>>>>> >>>>>>>>> 1.086 vs 0.689 seconds for the traversal. >>>>>>>>> >>>>>>>>> I ran the Structured and Process test suites. 2 tests are failing with >>>>>>>>> this optimization. >>>>>>>>> >>>>>>>>> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails >>>>>>> with >>>>>>>>> "java.lang.IllegalArgumentException: The step with label a does not >>>>>>> exist" >>>>>>>>> and >>>>>>>>> >>>>>>>>> SerializationTest.shouldSerializePathAsDetached fails with >>>>>>>>> >>>>>>>>> "Caused by: java.lang.IllegalArgumentException: Class is not >>>>>>> registered: >>>>>>>>> java.util.Collections$UnmodifiableSet" >>>>>>>>> >>>>>>>>> Before investigating the failures is this optimization worth pursuing? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> Pieter >>>>>>>>> >
