Norio Akagi created TINKERPOP-2396:
--------------------------------------

             Summary: TraverserSet should be extendable for GraphDB provider
                 Key: TINKERPOP-2396
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2396
             Project: TinkerPop
          Issue Type: Improvement
          Components: process
            Reporter: Norio Akagi


Currently in many TinkerPop steps, TraverserSet is internally created and used. 
However GraphDB provider may want to use their own TraverserSet to change the 
logic  how it populates an inner Map.

Just let me give one example

Say internally we have a unique internal Id for each element and to get a user 
facing Id we need to retrieve it from our Storage using the internal Id. 
For example Element#id returns the user facing Id so one Storage API call is 
required on our end. It is ok.

What is unexpected for us was that when TraverserSet adds a Traverser<Element> 
, the element's id is used for hashCode.
So whenever some step with TraverserSet handles solution, as many Storage API 
calls are required as a number of Traverser<Element>. As you may see, some 
query like .count() at the end doesn't require the user facing value so we want 
to avoid that if we can.


We can simply achieve this by using internal id as a hashCode, but we can't 
overwrite ElementHelper's behavior so first we overwrote Element#hashCode but 
after that we saw that this test effectively fails.
https://github.com/apache/tinkerpop/blob/cff4c161615f2b50bda27b6ba523c7f52b833532/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin/structure/util/detached/DetachedVertexTest.java#L75
Because v here is our own custom vertex instance which uses internal id for 
hashCode but DetachedVertex just uses a user facing id copied from v, so they 
are not considered as equal when used as hash key.

We may skip this test but it's better to keep all Tinkerpop tests passed as 
GraphDB provider I think.

So instead, I will propose to change so that we can add a capability to swap 
TraverserSet as needed.
The change should be simple, 
1. ExpandableIterator should have constructor like ExpandableIterator(final 
Step<S, ?> hostStep, final TraverserSet<S> traverserSet) and use the 
traverserSet as its own traverserSet value.
2. AbstractStep should have construct like AbstractStep(final Traversal.Admin 
traversal, final Supplier<TraverserSet<S>> traverserSetSupplier) and use 
traverserSetSuppler.get() for all places that currently we instantiate new 
TraverserSet.
 Also this supplier.get() may be passed when AbstractStep instantiates 
ExpandableIterator.

So still GraphDB provider needs to extends all Steps that deals with 
TraverserSet and overwrite their constructor to have their own 
traverserSetSupplier, but after that they can freely use their own 
TraverserSet, so the logic how the key of TraverserSet is determined is up to 
GraphDB provider.
Or if we add some helper method like getTraverserSetSupplier under, for 
instance, Graph class, then instead of extending all steps we can just override 
that method under our own Graph class. But I am not sure which is better in 
this case, but either way the goal is the same.

Any thoughts or objections on this idea ?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to