Artem Aliev created TINKERPOP-1801:
--------------------------------------

             Summary: OLAP profile() step returns incorrect timing
                 Key: TINKERPOP-1801
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
             Project: TinkerPop
          Issue Type: Bug
    Affects Versions: 3.2.6, 3.3.0
            Reporter: Artem Aliev


ProfileStep measures the time spent inside its next()/hasNext() calls, on the
assumption that those calls recurse through the steps.
GraphComputer, however, uses message passing (RDD joins on Spark).
So next() does not recursively call the next steps; it only generates messages,
and most of the time is spent in message passing (the RDD join).
On a graph computer, therefore, the time between ProfileSteps should be
measured, not the time inside them.

Another approach is to collect Spark statistics with a SparkListener and add the
Spark stage timings to the profiler metrics. That would work only for Spark, but
would give a better representation of step costs.
The simple fix is to measure the time between OLAP iterations and add it to the
profiler step.
This does not account for computer setup time, but it should be precise enough
for long-running queries.
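
A minimal sketch of why the two measurements differ, written in plain Java with
a simulated clock rather than real TinkerPop or Spark code. The class
ProfileTimingSketch, its methods, and the cost values are hypothetical and
chosen only for illustration: stepNext() stands in for the cheap work done
inside next(), and messagePass() for the message passing (RDD join) that
happens between iterations.

```java
public class ProfileTimingSketch {
    // Simulated clock in milliseconds, so the example is deterministic.
    private static long clockMs = 0;

    // Hypothetical stand-in for the work done inside next(): on OLAP it is
    // cheap, because it only emits messages.
    static void stepNext(long stepCostMs) {
        clockMs += stepCostMs;
    }

    // Hypothetical stand-in for message passing (e.g. an RDD join) that
    // happens between iterations, outside of any next() call.
    static void messagePass(long messageCostMs) {
        clockMs += messageCostMs;
    }

    /** Returns {time measured inside next(), time measured per whole iteration}. */
    public static long[] profile(int iterations, long stepCostMs, long messageCostMs) {
        clockMs = 0;
        long insideStep = 0;
        long acrossIterations = 0;
        for (int i = 0; i < iterations; i++) {
            long iterationStart = clockMs;

            long stepStart = clockMs;
            stepNext(stepCostMs);
            insideStep += clockMs - stepStart;            // timing inside next()

            messagePass(messageCostMs);                   // not seen by the timer above
            acrossIterations += clockMs - iterationStart; // timing between iterations
        }
        return new long[] { insideStep, acrossIterations };
    }

    public static void main(String[] args) {
        long[] t = profile(3, 4, 1000);
        System.out.println("inside next():      " + t[0] + " ms");
        System.out.println("between iterations: " + t[1] + " ms");
    }
}
```

With these made-up costs the timer inside next() accounts for 12 ms, while the
per-iteration timer accounts for 3012 ms. This is the same kind of gap as the
one between profile() and clock() in the reproduction below.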

To reproduce:
TinkerPop 3.2.6, Gremlin Console:

{code}
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
GraphStep(vertex,[])                                                 808         808           2.025    18.35
VertexStep(OUT,vertex)                                              8049         562           4.430    40.14
VertexStep(OUT,edge)                                              327370        7551           4.581    41.50
CountGlobalStep                                                        1           1           0.001     0.01
                                            >TOTAL                     -           -          11.038        -
gremlin> clock(1){g.V().out().out().count().next() }
==>3421.92758
gremlin>
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
