To avoid the collection logic alltogether. For most steps there is no need to
check the labels as it is known that they are added, immutable and correct.

Also with the current strategy the `ImmutablePath.currentLabels` is exactly the
same collection as that of the step. It is not a copy.

`Traverser.addLabels()` is only called for the steps previously mentioned. They
too can be optimized as there is no need to create a new `ImmutablePath` just to
set the labels. There could be a `Traverer.split()` method that creates the
initial `ImmutablePath` with the correct labels to start with. As things stand
now the `ImmutablePath` is created just to be replaced 1 or 2 millis later. I
have however not benchmarked any of those queries so am not touching that at
the moment.

In some ways its a matter of style. The label logic in `AbstractStep` is not
relevant to all of its subclasses and should be higher in the inheritance
hierarchy.

Cheers
Pieter


Excerpts from Marko Rodriguez's message of August 7, 2016 3:54 :
Why not just check to see if the labels to be added already exist, if they do, 
don’t addLabels() and thus, don’t create a new collection.

Marko.

http://markorodriguez.com



On Aug 7, 2016, at 6:07 AM, Pieter Martin <[email protected]> wrote:

Here is what I have come up with so far.

The idea is that `Traverser.split(r, step)` already copies the labels to the
traverser so there is no need to call `Traverser.addLabels(labels)` again.

I removed the `Traverser.addLabels(labels)` call from `AbstractStep`.
For the traversers that do not call `Traverer.split(r, step)` I manually added
the `traverser.addLabels(labels)` call in `processNextStart()`. This was done
by fixing test failures rather than searching and investigating all calls to
`Traverser.split()`.

The following steps needed the labels to be added manually.

`AggregateStep`
`CollectingBarrierStep`
`FilterStep`
`SideEffectStep`
`StartStep`
`NoOpBarrierStep`

Further seeing as `Step.getLabels()` already returns a unmodifiable collection 
and
`ImmutablePath` is well immutable there is no need for it to have its own copy
of the labels. I set the labels directly on the path as oppose to making a copy.

`TinkerGraphProcessComputerTest` passes.
`TinkerGraphProcessStandardTest` passes.
`TinkerGraphStructureStandardTest` has one failure.
`SerializationTest$GryoTest.shouldSerializePathAsDetached` fails with
`java.lang.IllegalArgumentException: Class is not registered:
java.util.Collections$UnmodifiableSet`

Let me know what you think.

Thanks
Pieter


Excerpts from Ted Wilmes's message of August 7, 2016 12:16 :
Neat find Pieter.  With regards to the update to ImmutablePath, I think
that defensive copying of inbound collections is generally a good idea but
if we can target specific areas where we can reap big gains from not
creating new collections it may be worth it to relax that constraint,
especially if we're only talking about internal APIs.  If we do relax that
constraint though, we'll need to careful during code review and unit
testing because those bugs can be difficult to track down.  The other nice
thing that the defensive copy gets us, particularly in the case of the
ImmutablePath, is that it isolates ImmutablePath from whatever the subclass
of set was that the caller passed in.  I think that's what is causing the
serialization test failure in this case since the caller passed in an
unmodifiable set.
--Ted
On Fri, Aug 5, 2016, 2:31 PM Marko Rodriguez <[email protected]> wrote:
Hello,

This is cool. Check out also ImmutablePath.extend(labels) as that is
ultimately what Traverser.addLabels() calls. We have a lot of set copying
and I don’t know if its needed (as you seem to be demonstrating). What I
don’t like about your solution is the explicit reference to the
B_L_P…Traverser in AbstractStep. See if you can work your solution without
it.

Good luck,
Marko.

http://markorodriguez.com



> On Aug 5, 2016, at 12:44 PM, pieter-gmail <[email protected]>
wrote:
>
> Sorry forgot to add a rather important part.
>
> I changed ImmutablePath's constructor to
>
>    private ImmutablePath(final ImmutablePathImpl previousPath, final
> Object currentObject, final Set<String> currentLabels) {
>        this.previousPath = previousPath;
>        this.currentObject = currentObject;
>        this.currentLabels = currentLabels;
> //        this.currentLabels.addAll(currentLabels);
>    }
>
> Setting the collection directly as oppose to `addAll`
>
> Thanks
> Pieter
>
>
> On 05/08/2016 20:40, pieter-gmail wrote:
>> Hi,
>>
>> I have been optimizing Sqlg of late and eventually arrived at TinkerPop
>> code.
>>
>> The gremlin in particular that I am interested is path queries.
>>
>> Here is the test that I am running in jmh.
>>
>>        //@Setup
>>        Vertex a = graph.addVertex(T.label, "A", "name", "a1");
>>        for (int i = 1; i < 1_000_001; i++) {
>>            Vertex b = graph.addVertex(T.label, "B", "name", "name_" +
i);
>>            a.addEdge("outB", b);
>>            for (int j = 0; j < 1; j++) {
>>                Vertex c = graph.addVertex(T.label, "C", "name", "name_"
>> + i + " " + j);
>>                b.addEdge("outC", c);
>>            }
>>        }
>>
>> And the query being benchmarked is
>>
>>        GraphTraversal<Vertex, Path> traversal =
>> g.V(a).as("a").out().as("b").out().as("c").path();
>>        while (traversal.hasNext()) {
>>            Path path = traversal.next();
>>        }
>>
>> Before the optimization, (as things are now)
>>
>> Benchmark                                 Mode Cnt  Score Error
Units
>> GremlinPathBenchmark.g_path  avgt  100  1.086 ± 0.020   s/op
>>
>> The optimization I did is in AbstractStep.prepareTraversalForNextStep,
>> to not call addLabels() for path gremlins as the labels are known by the
>> step and do not change again so there is not need to keep adding them.
>>
>>    private final Traverser.Admin<E> prepareTraversalForNextStep(final
>> Traverser.Admin<E> traverser) {
>>        if (!this.traverserStepIdAndLabelsSetByChild) {
>>            traverser.setStepId(this.nextStep.getId());
>>            if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) {
>>            } else {
>>                traverser.addLabels(this.labels);
>>            }
>>        }
>>        return traverser;
>>    }
>>
>> After optimization,
>>
>> Benchmark                                 Mode Cnt  Score Error
Units
>> GremlinPathBenchmark.g_path  avgt  100  0.680 ± 0.004   s/op
>>
>> 1.086 vs 0.689 seconds for the traversal.
>>
>> I ran the Structured and Process test suites. 2 tests are failing with
>> this optimization.
>>
>> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails
with
>>
>> "java.lang.IllegalArgumentException: The step with label a does not
exist"
>>
>> and
>>
>> SerializationTest.shouldSerializePathAsDetached fails with
>>
>> "Caused by: java.lang.IllegalArgumentException: Class is not
registered:
>> java.util.Collections$UnmodifiableSet"
>>
>> Before investigating the failures is this optimization worth pursuing?
>>
>> Thanks
>> Pieter
>>
>




Reply via email to