Venki Korukanti created SPARK-35799:
---------------------------------------

             Summary: Fix the allUpdatesTimeMs metric measuring in 
FlatMapGroupsWithStateExec
                 Key: SPARK-35799
                 URL: https://issues.apache.org/jira/browse/SPARK-35799
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 3.1.2
            Reporter: Venki Korukanti


The `allUpdatesTimeMs` metric is meant to capture the start-to-end wall-clock
time of the `FlatMapGroupsWithStateExec` operator, but currently it only
[captures|https://github.com/apache/spark/blob/79362c4efcb6bd4b575438330a14a6191cca5e4b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala#L121]
the time taken to create the iterator.

Fix it to measure the same way other stateful operators do. One example is
[here|https://github.com/apache/spark/blob/79362c4efcb6bd4b575438330a14a6191cca5e4b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala#L406].
This measurement is not perfect: because the output iterator is lazy, it also
includes the time the downstream consumer operator spends processing the
current operator's output. It should still give a good signal when comparing
the metric in one microbatch to the metric in another microbatch.
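
To illustrate the difference, here is a minimal language-agnostic sketch in
Python. It is not the Spark code. `FakeClock`, `process_groups`,
`with_completion` and `run_batch` are hypothetical names made up for this
example, and the 10 ms and 5 ms costs are arbitrary. It shows why timing only
the call that builds a lazy iterator reports roughly zero, while recording the
elapsed time once the iterator is drained captures the work, consumer time
included.

```python
class FakeClock:
    """Deterministic clock so the example does not depend on real timing."""

    def __init__(self):
        self.now_ms = 0

    def advance(self, ms):
        self.now_ms += ms


def process_groups(groups, clock):
    """Lazy stand-in for an operator's output iterator.

    No work happens until the caller starts consuming the generator.
    """
    for group in groups:
        clock.advance(10)  # assumed cost of updating state for one group
        yield group.upper()


def with_completion(it, on_complete):
    """Yield everything from `it`, then call `on_complete` once it is exhausted."""
    yield from it
    on_complete()


def run_batch(groups):
    clock = FakeClock()
    metrics = {}
    start = clock.now_ms

    # Approach 1: time only the call that builds the iterator.
    out = process_groups(groups, clock)
    metrics["creation_only_ms"] = clock.now_ms - start

    # Approach 2: record the elapsed time when the iterator is fully drained.
    def record():
        metrics["until_drained_ms"] = clock.now_ms - start

    results = []
    for row in with_completion(out, record):
        clock.advance(5)  # downstream consumer work, also counted by approach 2
        results.append(row)
    return results, metrics


results, metrics = run_batch(["a", "b", "c"])
print(metrics)
```

With three groups, the second measurement covers 30 ms of operator work plus
15 ms of consumer work, which is the imperfection described above, whereas the
first measurement sees none of it.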


