Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20940#discussion_r181896603

--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -234,8 +244,22 @@ private[spark] class EventLoggingListener(
     }
   }

-  // No-op because logging every update would be overkill
-  override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = { }
+  /**
+   * Log if there is a new peak value for one of the memory metrics for the given executor.
+   * Metrics are cleared out when a new stage is started in onStageSubmitted, so this will
+   * log new peak memory metric values per executor per stage.
+   */
+  override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = {
--- End diff --

Is your concern the size of the message from the executors to the driver? That is certainly valid, but I wonder if we should think a bit harder about this if it is going to be a common concern, since I think we'll want to add more metrics. One possibility is for the executor to do the peak calculation itself, and then only send an update for the metrics with a new peak. That would also let us send just the peaks on task end events. I'm just brainstorming at the moment, not saying it should be changed one way or the other ... need to think about it more.
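The executor-side peak calculation suggested above could look roughly like the following sketch. This is only an illustration of the idea under discussion, not code from the patch: `PeakMetricTracker` and the metric name used are hypothetical, and the real change would hook into the executor heartbeat and task-end paths.

```scala
import scala.collection.mutable

// Hypothetical sketch: the executor keeps the running peak for each metric
// and only reports metrics whose peak changed, instead of every raw sample.
class PeakMetricTracker {
  private val peaks = mutable.Map.empty[String, Long]

  /** Record a sample; return true if this is a new peak for the metric. */
  def update(metric: String, value: Long): Boolean = {
    val isNewPeak = peaks.get(metric).forall(value > _)
    if (isNewPeak) peaks(metric) = value
    isNewPeak
  }

  /** Metrics whose peaks exceed the last values reported to the driver;
    * only these would need to be sent (e.g. piggybacked on task end). */
  def changedSince(lastReported: Map[String, Long]): Map[String, Long] =
    peaks.toMap.filter { case (m, v) => lastReported.get(m).forall(v > _) }
}
```

Under a scheme like this the driver would only ever see monotonically increasing peak values per metric, which shrinks executor-to-driver traffic and would make it natural to send the final peaks once on task end rather than on every metrics update.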