Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721

If the batch query also leveraged AccumulatorV2 for metrics, IMHO we might not need to redesign the metrics API from scratch. For batch and micro-batch, the metrics API works without any concerns (though it does get requests for improvement), whereas in continuous mode the metrics simply don't work because tasks never finish. Changes to metrics affect both the query status and the SQL tab in the UI. I haven't thought too deeply about metrics in continuous mode, so I'm not sure about the current state of the UI or its ideal shape; I'll spend some time playing with it.

My 2 cents: once the existing metrics work well, we could look for ways to make them work with continuous mode as well, without breaking other things.

One question I'd like to ask ourselves: would we treat the epoch id as a batch id? For checkpointing we already do, and some streaming frameworks represent the `stream between epochs` as a `logical batch`, which makes sense to me. If we deal with watermarks, we are likely to update the watermark per epoch, as well as when dealing with state; if my understanding is correct, the epoch id looks like just an alias of the batch id.
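To make the "task never finishes" point concrete, here is a hedged toy model (not Spark's actual code; `RowCountAcc` and its methods are invented for illustration): accumulator-style metrics in the AccumulatorV2 pattern are merged into driver-side state when a task completes, so a continuous-mode task that runs forever never surfaces its updates, while a micro-batch task does.

```java
// Toy sketch of why task-completion-based metric merging breaks in
// continuous mode. RowCountAcc is a hypothetical stand-in for an
// AccumulatorV2-like counter, NOT Spark's real class.
public class MetricsSketch {
    static class RowCountAcc {
        private long count = 0;
        void add(long n) { count += n; }          // executor-side update
        long value() { return count; }
        // Driver-side merge, mirroring AccumulatorV2-style merge semantics.
        void merge(RowCountAcc other) { count += other.value(); }
    }

    public static void main(String[] args) {
        RowCountAcc driver = new RowCountAcc();

        // Micro-batch task: processes a bounded batch, then completes, so
        // its accumulator is merged back to the driver on task completion.
        RowCountAcc microBatchTask = new RowCountAcc();
        microBatchTask.add(100);
        driver.merge(microBatchTask);

        // Continuous task: runs indefinitely, so the completion hook that
        // would merge its accumulator never fires; the driver sees nothing
        // of these updates.
        RowCountAcc continuousTask = new RowCountAcc();
        continuousTask.add(100); // updated on the executor, never merged

        System.out.println("driver sees " + driver.value() + " rows");
    }
}
```

Under this model, per-epoch reporting (treating each epoch like a batch id, as suggested above) would give continuous tasks a natural point to flush accumulated metrics without requiring the task itself to finish.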