Here is what I have come up with so far.

The idea is that `Traverser.split(r, step)` already copies the labels to the
traverser so there is no need to call `Traverser.addLabels(labels)` again.

I removed the `Traverser.addLabels(labels)` call from `AbstractStep`.
For the traversers that do not call `Traverer.split(r, step)` I manually added
the `traverser.addLabels(labels)` call in `processNextStart()`. This was done
by fixing test failures rather than searching and investigating all calls to
`Traverser.split()`.

The following steps needed the labels to be added manually.

`AggregateStep`
`CollectingBarrierStep`
`FilterStep`
`SideEffectStep`
`StartStep`
`NoOpBarrierStep`

Further seeing as `Step.getLabels()` already returns a unmodifiable collection 
and
`ImmutablePath` is well immutable there is no need for it to have its own copy
of the labels. I set the labels directly on the path as oppose to making a copy.

`TinkerGraphProcessComputerTest` passes.
`TinkerGraphProcessStandardTest` passes.
`TinkerGraphStructureStandardTest` has one failure.
`SerializationTest$GryoTest.shouldSerializePathAsDetached` fails with
`java.lang.IllegalArgumentException: Class is not registered:
java.util.Collections$UnmodifiableSet`

Let me know what you think.

Thanks
Pieter


Excerpts from Ted Wilmes's message of August 7, 2016 12:16 :
Neat find Pieter.  With regards to the update to ImmutablePath, I think
that defensive copying of inbound collections is generally a good idea but
if we can target specific areas where we can reap big gains from not
creating new collections it may be worth it to relax that constraint,
especially if we're only talking about internal APIs.  If we do relax that
constraint though, we'll need to careful during code review and unit
testing because those bugs can be difficult to track down.  The other nice
thing that the defensive copy gets us, particularly in the case of the
ImmutablePath, is that it isolates ImmutablePath from whatever the subclass
of set was that the caller passed in.  I think that's what is causing the
serialization test failure in this case since the caller passed in an
unmodifiable set.

--Ted

On Fri, Aug 5, 2016, 2:31 PM Marko Rodriguez <[email protected]> wrote:

Hello,

This is cool. Check out also ImmutablePath.extend(labels) as that is
ultimately what Traverser.addLabels() calls. We have a lot of set copying
and I don’t know if its needed (as you seem to be demonstrating). What I
don’t like about your solution is the explicit reference to the
B_L_P…Traverser in AbstractStep. See if you can work your solution without
it.

Good luck,
Marko.

http://markorodriguez.com



> On Aug 5, 2016, at 12:44 PM, pieter-gmail <[email protected]>
wrote:
>
> Sorry forgot to add a rather important part.
>
> I changed ImmutablePath's constructor to
>
>    private ImmutablePath(final ImmutablePathImpl previousPath, final
> Object currentObject, final Set<String> currentLabels) {
>        this.previousPath = previousPath;
>        this.currentObject = currentObject;
>        this.currentLabels = currentLabels;
> //        this.currentLabels.addAll(currentLabels);
>    }
>
> Setting the collection directly as oppose to `addAll`
>
> Thanks
> Pieter
>
>
> On 05/08/2016 20:40, pieter-gmail wrote:
>> Hi,
>>
>> I have been optimizing Sqlg of late and eventually arrived at TinkerPop
>> code.
>>
>> The gremlin in particular that I am interested is path queries.
>>
>> Here is the test that I am running in jmh.
>>
>>        //@Setup
>>        Vertex a = graph.addVertex(T.label, "A", "name", "a1");
>>        for (int i = 1; i < 1_000_001; i++) {
>>            Vertex b = graph.addVertex(T.label, "B", "name", "name_" +
i);
>>            a.addEdge("outB", b);
>>            for (int j = 0; j < 1; j++) {
>>                Vertex c = graph.addVertex(T.label, "C", "name", "name_"
>> + i + " " + j);
>>                b.addEdge("outC", c);
>>            }
>>        }
>>
>> And the query being benchmarked is
>>
>>        GraphTraversal<Vertex, Path> traversal =
>> g.V(a).as("a").out().as("b").out().as("c").path();
>>        while (traversal.hasNext()) {
>>            Path path = traversal.next();
>>        }
>>
>> Before the optimization, (as things are now)
>>
>> Benchmark                                 Mode Cnt  Score Error
 Units
>> GremlinPathBenchmark.g_path  avgt  100  1.086 ± 0.020   s/op
>>
>> The optimization I did is in AbstractStep.prepareTraversalForNextStep,
>> to not call addLabels() for path gremlins as the labels are known by the
>> step and do not change again so there is not need to keep adding them.
>>
>>    private final Traverser.Admin<E> prepareTraversalForNextStep(final
>> Traverser.Admin<E> traverser) {
>>        if (!this.traverserStepIdAndLabelsSetByChild) {
>>            traverser.setStepId(this.nextStep.getId());
>>            if (traverser instanceof B_LP_O_P_S_SE_SL_Traverser) {
>>            } else {
>>                traverser.addLabels(this.labels);
>>            }
>>        }
>>        return traverser;
>>    }
>>
>> After optimization,
>>
>> Benchmark                                 Mode Cnt  Score Error
 Units
>> GremlinPathBenchmark.g_path  avgt  100  0.680 ± 0.004   s/op
>>
>> 1.086 vs 0.689 seconds for the traversal.
>>
>> I ran the Structured and Process test suites. 2 tests are failing with
>> this optimization.
>>
>> InjectTest.g_VX1X_out_name_injectXdanielX_asXaX_mapXlengthX_path fails
with
>>
>> "java.lang.IllegalArgumentException: The step with label a does not
exist"
>>
>> and
>>
>> SerializationTest.shouldSerializePathAsDetached fails with
>>
>> "Caused by: java.lang.IllegalArgumentException: Class is not
registered:
>> java.util.Collections$UnmodifiableSet"
>>
>> Before investigating the failures is this optimization worth pursuing?
>>
>> Thanks
>> Pieter
>>
>



Reply via email to