[
https://issues.apache.org/jira/browse/TINKERPOP-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369889#comment-15369889
]
ASF GitHub Bot commented on TINKERPOP-1254:
-------------------------------------------
Github user okram commented on the issue:
https://github.com/apache/tinkerpop/pull/358
Verified the following simply traversal on Grateful Dead graph. For OLTP
(`MatchStep.standardAlgorithm()`), I added a lazy barrier internal to
`MatchStep` which is similar in behavior to the lazy barrier OLTP optimization
added in `PrunePathStrategy`.
```
source.V().match(
as("a").out().as("b"),
as("b").out().as("c")).
select("c").count().profile()
```
*With and then without PrunePathStrategy OLAP*
```
Traversal Metrics
Step Count
Traversers Time (ms) % Dur
=============================================================================================================
GraphStep(vertex,[]) 808
808 40.418 43.47
MatchStep(AND,[[MatchStartStep(a), ProfileStep,... 327370
452 28.015 30.13
MatchStartStep(a) 808
808 61.751
VertexStep(OUT,vertex) 8049
7957 37.971
MatchEndStep(b) (profiling ignored)
0.000
MatchStartStep(b) 8049
562 14.453
VertexStep(OUT,vertex) 327370
7510 16.124
MatchEndStep(c) (profiling ignored)
0.000
SelectOneStep(c) 327370
452 24.531 26.39
CountGlobalStep 1
1 0.004 0.00
>TOTAL -
- 92.969 -
Traversal Metrics
Step Count
Traversers Time (ms) % Dur
=============================================================================================================
GraphStep(vertex,[]) 808
808 6.844 0.35
MatchStep(AND,[[MatchStartStep(a), ProfileStep,... 327370
326983 1257.050 64.22
MatchStartStep(a) 808
808 11.806
VertexStep(OUT,vertex) 8049
7957 30.286
MatchEndStep(b) (profiling ignored)
0.000
MatchStartStep(b) 8049
7957 14.627
VertexStep(OUT,vertex) 327370
326983 774.934
MatchEndStep(c) (profiling ignored)
0.000
SelectOneStep(c) 327370
326983 693.564 35.43
CountGlobalStep 1
1 0.001 0.00
>TOTAL -
- 1957.460 -
```
*With and then without PrunePathStrategy OLTP*
```
Traversal Metrics
Step Count
Traversers Time (ms) % Dur
=============================================================================================================
TinkerGraphStep(vertex,[]) 808
808 2.050 0.07
MatchStep(AND,[[MatchStartStep(a), ProfileStep,... 327370
56814 2794.288 91.47
MatchStartStep(a) 808
808 103.524
VertexStep(OUT,vertex) 8049
8049 107.345
MatchEndStep(b) 8049
8049 159.505
MatchStartStep(b) 8049
7957 105.127
VertexStep(OUT,vertex) 327370
327370 351.935
MatchEndStep(c) 327370
327370 1331.478
NoOpBarrierStep(1000) 327370
452 254.309 8.32
SelectOneStep(c) 327370
452 3.731 0.12
CountGlobalStep 1
1 0.465 0.02
>TOTAL -
- 3054.845 -
Traversal Metrics
Step Count
Traversers Time (ms) % Dur
=============================================================================================================
TinkerGraphStep(vertex,[]) 808
808 2023.710 36.87
MatchStep(AND,[[MatchStartStep(a), ProfileStep,... 327370
326983 2995.219 54.57
MatchStartStep(a) 808
808 221.105
VertexStep(OUT,vertex) 8049
8049 185.946
MatchEndStep(b) 8049
8049 186.814
MatchStartStep(b) 8049
7957 161.012
VertexStep(OUT,vertex) 327370
327370 405.787
MatchEndStep(c) 327370
327370 423.213
SelectOneStep(c) 327370
326983 324.356 5.91
CountGlobalStep 1
1 145.952 2.66
>TOTAL -
- 5489.238 -
```
> Support dropping traverser path information when it is no longer needed.
> ------------------------------------------------------------------------
>
> Key: TINKERPOP-1254
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1254
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.1.1-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Ted Wilmes
>
> The most expensive traversals (especially in OLAP) are those that can not be
> "bulked." There are various reasons why two traversers at the same object can
> not be bulked, but the primary reason is {{PATH}} or {{LABELED_PATH}}. That
> is, when the history of the traverser is required, the probability of two
> traversers having the same history is low.
> A key to making traversals more efficient is to do as a much as possible to
> remove historic information from a traverser so it can get bulked. How does
> one do this?
> {code}
> g.V.as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
> {code}
> The {{LABELED_PATH}} of "a" and "b" are required up to the {{where()}} and at
> which point, at {{both()}}, they are no longer required. It would be smart to
> support:
> {code}
> traverser.dropLabels(Set<String>)
> traverser.dropPath()
> {code}
> We would then, via a {{TraversalOptimizationStrategy}} insert a step between
> {{where()}} and {{both()}} called {{PathPruneStep}} which would be a
> {{SideEffectStep}}. The strategy would know which labels were no longer
> needed (via forward lookahead) and then do:
> {code}
> public class PathPruneStep {
> final Set<String> dropLabels = ...
> final boolean dropPath = ...
> public void sideEffect(final Traverser<S> traverser) {
> final Traverser<S> start = this.starts.next();
> if(this.dropPath) start.dropPath();
> else start.dropLabels(labels);
> }
> }
> {code}
> Again, the more we can prune historic path data no longer needed, the higher
> the probability of bulking. Think about this in terms of {{match()}}.
> {code}
> g.V().match(
> a.out.b,
> b.out.c,
> c.neq.a,
> c.out.b,
> ).select("a")
> {code}
> All we need is "a" at the end. Thus, once a pattern has been passed and no
> future patterns require that label, drop it!
> This idea is related to TINKERPOP-331, but I don't think we should deal with
> manipulating the species. Thus, I think 331 is too "low level."
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)