Marko A. Rodriguez created TINKERPOP-1254:
---------------------------------------------

             Summary: Support dropping traverser path information when it is no 
longer needed.
                 Key: TINKERPOP-1254
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1254
             Project: TinkerPop
          Issue Type: Improvement
          Components: process
    Affects Versions: 3.1.1-incubating
            Reporter: Marko A. Rodriguez


The most expensive traversals (especially in OLAP) are those that can not be 
"bulked." There are various reasons why two traversers at the same object can 
not be bulked, but the primary reason is {{PATH}} or {{LABELED_PATH}}. That is, 
when the history of the traverser is required, the probability of two 
traversers having the same history is low.

A key to making traversals more efficient is to do as a much as possible to 
remove historic information from a traverser so it can get bulked. How does one 
do this? 

{code}
g.V.as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
{code}

The {{LABELED_PATH}} of "a" and "b" are required up to the {{where()}} and at 
which point, at {{both()}}, they are no longer required. It would be smart to 
support:

{code}
traverser.dropLabels(Set<String>)
traverser.dropPath()
{code}

We would then, via a {{TraversalOptimizationStrategy}} insert a step between 
{{where()}} and {{both()}} called {{PathPruneStep}} which would be a 
{{SideEffectStep}}. The strategy would know which labels were no longer needed 
(via forward lookahead) and then do:

{code}
public class PathPruneStep {
  final Set<String> dropLabels = ...
  final boolean dropPath = ...

  public void sideEffect(final Traverser<S> traverser) {
    final Traverser<S> start = this.starts.next();
    if(this.dropPath) start.dropPath();
    else start.dropLabels(labels); 
  }
}
{code}

Again, the more we can prune historic path data no longer needed, the higher 
the probability of bulking. Think about this in terms of {{match()}}.

{code}
g.V().match(
  a.out.b,
  b.out.c,
  c.neq.a,
  c.out.b,
).select("a")
{code}

All we need is "a" at the end. Thus, once a pattern has been passed and no 
future patterns require that label, drop it! 

This idea is related to TINKERPOP-331, but I don't think we should deal with 
manipulating the species. Thus, I think 331 is too "low level."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to