HeartSaVioR commented on a change in pull request #28040: [SPARK-31278][SS] Fix 
StreamingQuery output rows metric
URL: https://github.com/apache/spark/pull/28040#discussion_r399092630
 
 

 ##########
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala
 ##########
 @@ -189,7 +188,7 @@ trait ProgressReporter extends Logging {
       sink = sinkProgress,
       observedMetrics = new java.util.HashMap(observedMetrics.asJava))
 
-    if (hasNewData) {
+    if (hasExecuted) {
 
 Review comment:
   Nice finding. We didn't notice the bug because lastNoDataProgressEventTime 
is set to Long.MinValue, which makes the next micro-batch with no new data 
update the progress immediately and so hides the bug. (If that's intentional, 
it's too tricky and we should have left a comment here.)
   
   Maybe we should also rename lastNoDataProgressEventTime, since the fix 
changes its semantics?
   
   And we may want to revisit whether we really intend to update progress 
immediately whenever a batch does not run right after a batch that did run.
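   
   To illustrate the point about Long.MinValue, here is a minimal sketch of 
the throttling as I understand it. It is a simplified model, not the actual 
ProgressReporter code: the class ProgressThrottle, the method onTrigger and 
the parameter noDataIntervalMs are hypothetical names made up for this 
example; only lastNoDataProgressEventTime and hasExecuted come from the code 
under review.

```scala
// Hypothetical, simplified model of the progress throttling discussed above.
// Not the real ProgressReporter; names other than lastNoDataProgressEventTime
// and hasExecuted are assumptions made for illustration.
final class ProgressThrottle(noDataIntervalMs: Long) {
  // Long.MinValue means "no throttled progress event has been reported yet".
  private var lastNoDataProgressEventTime: Long = Long.MinValue

  /** Returns true if a progress event would be reported for this trigger. */
  def onTrigger(hasExecuted: Boolean, nowMs: Long): Boolean = {
    if (hasExecuted) {
      // A batch ran: always report, and reset the throttle timestamp.
      lastNoDataProgressEventTime = Long.MinValue
      true
    } else if (nowMs - noDataIntervalMs >= lastNoDataProgressEventTime) {
      // No batch ran: report only if the interval has elapsed. Right after a
      // reset the right-hand side is Long.MinValue, so this always passes.
      lastNoDataProgressEventTime = nowMs
      true
    } else {
      false
    }
  }
}

object ProgressThrottleDemo {
  def main(args: Array[String]): Unit = {
    val throttle = new ProgressThrottle(noDataIntervalMs = 10000L)
    println(throttle.onTrigger(hasExecuted = true, nowMs = 0L))      // batch ran
    println(throttle.onTrigger(hasExecuted = false, nowMs = 1L))     // first idle trigger
    println(throttle.onTrigger(hasExecuted = false, nowMs = 2L))     // throttled
    println(throttle.onTrigger(hasExecuted = false, nowMs = 10001L)) // interval elapsed
  }
}
```

   In this model the first idle trigger after a reset is always reported, 
because any timestamp minus the interval is still >= Long.MinValue. That is 
the behavior that would mask the bug, and the one the last paragraph above 
suggests revisiting.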

