Robert Dale created TINKERPOP-1904:
--------------------------------------
Summary: union performance is O(g^2)
Key: TINKERPOP-1904
URL: https://issues.apache.org/jira/browse/TINKERPOP-1904
Project: TinkerPop
Issue Type: Improvement
Components: process
Affects Versions: 3.3.1, 3.2.7
Reporter: Robert Dale
Where 'g' is the number of traversals within union(), the space analyzed by
union() grows O(g^2)
Comparing addV()s in-lined Vs unioned. That is, g.addV().addV()... Vs.
union(addV(),addV(),...)
*count: inline, union*
*20*: 12ms, 20ms - ~2x slower
*200*: 30ms, 350ms - ~10x slower
*2000*: 150ms, 55s - ~366x slower
For 2000 count, .profile() claims only 1.3s of the 55s.
It seems that most of the time is spent in
TraversalHelper.reIdSteps()
.. BranchStep.getGlobalChildren() <- called 4 million times (2000x2000 or
g^2)
Looks like it's iterating all traversals for each traversal. I'm not sure
what the purpose or the mechanics of this are but I wonder if it can be
improved.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)