Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20940#discussion_r180426692
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -234,8 +244,22 @@ private[spark] class EventLoggingListener(
         }
       }
     
     -  // No-op because logging every update would be overkill
     -  override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = { }
     +  /**
     +   * Log if there is a new peak value for one of the memory metrics for the given executor.
     +   * Metrics are cleared out when a new stage is started in onStageSubmitted, so this will
     +   * log new peak memory metric values per executor per stage.
     +   */
     +  override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = {
    --- End diff --
    
    yeah, logging an event per executor at stage end seems good to me. It would also be great to see how much that version affects log size, if you can get those measurements.
    
    Also, these tradeoffs should go into the design doc; it's harder to find comments from a PR after the feature has been merged. For now, it would also be nice if you could post a version that everyone can comment on, e.g. a Google Doc.
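    The per-stage peak-tracking idea being discussed could be sketched roughly as below. This is only an illustration of the approach, not Spark's actual implementation; the names (`PeakTracker`, `update`, `reset`) are hypothetical.
    
    ```scala
    import scala.collection.mutable
    
    // Sketch: keep the highest metric value seen per executor, report only
    // when a new peak appears, and clear the map when a new stage starts,
    // so peaks are tracked (and logged) per executor per stage.
    class PeakTracker {
      private val peaks = mutable.HashMap.empty[String, Long]
    
      /** Returns true if `value` is a new peak for `executorId`, i.e. worth logging. */
      def update(executorId: String, value: Long): Boolean = {
        val prev = peaks.getOrElse(executorId, Long.MinValue)
        if (value > prev) {
          peaks(executorId) = value
          true
        } else {
          false
        }
      }
    
      /** Called when a new stage is submitted, mirroring the clearing in onStageSubmitted. */
      def reset(): Unit = peaks.clear()
    }
    ```
    
    With this shape, only updates returning true would produce a log event, which is what bounds the log-size growth relative to logging every metrics update.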


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org