Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20940#discussion_r179796239
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -93,6 +94,9 @@ private[spark] class EventLoggingListener(
       // Visible for tests only.
       private[scheduler] val logPath = getLogPath(logBaseDir, appId, appAttemptId, compressionCodecName)
     
    +  // Peak metric values for each executor
    +  private var peakExecutorMetrics = new mutable.HashMap[String, PeakExecutorMetrics]()
    --- End diff --
    
    You need to handle overlapping stages. I think you actually need to key on both executor and stage, and on stage end, only clear the metrics for that stage.
    
    EDIT: ok, after I went through everything, I think I see how this works -- since you log on every new peak, you'll also get a logged message for the earlier update. But as I mention below, this strategy seems like it'll result in a lot of extra logging. Maybe I'm wrong, though; it would be great to see how much the logs grow this way.
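
    To illustrate the suggestion above, here is a minimal sketch of keying peak metrics on both stage and executor so that concurrent stages don't clobber each other, and clearing only one stage's entries on stage end. All names here (`recordPeak`, `clearStage`, `peakFor`, the single `Long` peak value) are illustrative assumptions, not the actual Spark `EventLoggingListener` API:

    ```scala
    import scala.collection.mutable

    // Hypothetical sketch: peaks keyed by (stageId, executorId) instead of
    // executorId alone, so overlapping stages keep independent peak state.
    object PeakPerStageSketch {
      // For simplicity, track a single peak value per (stage, executor) pair.
      private val peaks = mutable.HashMap.empty[(Int, String), Long]

      // Update the peak; returns true if a new peak was recorded
      // (i.e. the case where an event would be logged).
      def recordPeak(stageId: Int, executorId: String, value: Long): Boolean = {
        val key = (stageId, executorId)
        val isNewPeak = value > peaks.getOrElse(key, Long.MinValue)
        if (isNewPeak) peaks(key) = value
        isNewPeak
      }

      // On stage end, drop only that stage's entries; entries belonging to
      // still-running stages are left intact.
      def clearStage(stageId: Int): Unit = {
        val stale = peaks.keys.filter { case (sid, _) => sid == stageId }.toSeq
        stale.foreach(peaks.remove)
      }

      def peakFor(stageId: Int, executorId: String): Option[Long] =
        peaks.get((stageId, executorId))
    }
    ```

    With this shape, two overlapping stages on the same executor each track their own peak, and ending one stage leaves the other's metrics untouched.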

