Github user HeartSaVioR commented on the issue:

    https://github.com/apache/spark/pull/21721
  
    If the batch query also leverages AccumulatorV2 for metrics, IMHO we might not 
need to redesign the metrics API from scratch. For batch and micro-batch the 
metrics API works without any concerns (though it does get requests for 
improvement), and for continuous mode the metrics just don't work because the 
task never finishes (see the sketch below).
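
    A minimal sketch of the mechanism I'm referring to (names here are 
illustrative, not from this PR): an AccumulatorV2-backed metric is updated on 
the executors while a task runs, but the driver only sees the merged value 
once tasks complete, which is exactly what a never-ending continuous task 
breaks.

```scala
import org.apache.spark.sql.SparkSession

object AccumulatorMetricSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("acc-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // LongAccumulator is a built-in AccumulatorV2[java.lang.Long, java.lang.Long]
    val rowsSeen = sc.longAccumulator("rowsSeen")

    sc.parallelize(1 to 1000, 4).foreach { _ =>
      rowsSeen.add(1) // updated on executors while the task runs
    }

    // The merged value is visible on the driver only after the tasks finish,
    // which is why a continuous-mode task that never finishes never reports.
    println(s"rowsSeen = ${rowsSeen.value}")
    spark.stop()
  }
}
```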
    
    The change in metrics affects both the query status and the SQL tab in the 
UI. I haven't looked too deeply into metrics in continuous mode, so I'm not sure 
about the current state of the UI or its ideal shape; I'll spend some time 
playing with it. My 2 cents: once the existing metrics work well, we could find 
ways to make them work with continuous mode as well, without breaking other 
things.
    
    One thing I would like to ask ourselves: would we treat the epoch id as a 
batch id? For checkpointing we already do, and some streaming frameworks 
represent the `stream between epochs` as a `logical batch`, which makes sense 
to me. When dealing with watermarks we are likely to update the watermark per 
epoch, and likewise for state, so if my understanding is correct the epoch id 
looks like just an alias of the batch id (see the listener sketch below).
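
    A hedged illustration of what that equivalence would mean for users (this is 
my assumption, not something the PR implements): if the epoch id is reported as 
the batch id, the same StreamingQueryListener hooks that surface per-batch 
progress in micro-batch mode would surface per-epoch progress in continuous 
mode, with no API change.

```scala
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

class EpochAwareListener extends StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    // In micro-batch mode batchId is the batch id; the question above is whether
    // continuous mode could simply report the epoch id in the same field.
    println(s"query=${event.progress.id} batchId/epoch=${event.progress.batchId}")
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
}

// Hypothetical usage: spark.streams.addListener(new EpochAwareListener())
```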

