Github user edwinalu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21221#discussion_r195957438

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -169,6 +182,31 @@ private[spark] class EventLoggingListener(

       // Events that trigger a flush
       override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    +    if (shouldLogExecutorMetricsUpdates) {
    +      // clear out any previous attempts, that did not have a stage completed event
    +      val prevAttemptId = event.stageInfo.attemptNumber() - 1
    +      for (attemptId <- 0 to prevAttemptId) {
    +        liveStageExecutorMetrics.remove((event.stageInfo.stageId, attemptId))
    +      }
    +
    +      // log the peak executor metrics for the stage, for each live executor,
    +      // whether or not the executor is running tasks for the stage
    +      val accumUpdates = new ArrayBuffer[(Long, Int, Int, Seq[AccumulableInfo])]()
    +      val executorMap = liveStageExecutorMetrics.remove(
    +        (event.stageInfo.stageId, event.stageInfo.attemptNumber()))
    +      executorMap.foreach {
    +        executorEntry => {
    +          for ((executorId, peakExecutorMetrics) <- executorEntry) {
    +            val executorMetrics = new ExecutorMetrics(-1, peakExecutorMetrics.metrics)
    --- End diff --

    The last timestamp seems like it wouldn't have enough information, since peaks for different metrics could occur at different times, and with different combinations of stages running. Only -1 would be logged.

    Right now it's writing out SparkListenerExecutorMetricsUpdate events, which contain ExecutorMetrics, which has a timestamp. Do you think the timestamp should be removed from ExecutorMetrics? It seems good to have the timestamp for when the metrics were gathered, but it's not being exposed at this point.

    For both the history server and the live UI, the goal is to show the peak value for each metric for each executor. For the executors tab, this is the peak value of each metric over the lifetime of the executor.
    For the stages tab, this is the peak value for each metric for that executor while the stage is running. The executor could be processing tasks for other stages as well, if there are concurrent stages, or no tasks for this stage if it isn't assigned any, but these are the peak values between the time the stage starts and ends. Can you describe how the stage-level metrics would work with the last timestamp for any peak metric? Would there be a check to see if the event is being read from the history log?
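    To illustrate the timestamp concern above, here is a minimal, hypothetical sketch (not the actual Spark `ExecutorMetrics` API; the class and method names are made up for this example) of peak tracking that records a per-metric peak time. Because each metric can peak on a different update, no single "last timestamp" describes when the peaks occurred:

    ```scala
    // Hypothetical peak tracker: element-wise max over metric updates,
    // remembering when each individual metric last peaked.
    class PeakMetricsSketch(numMetrics: Int) {
      val metrics: Array[Long] = Array.fill(numMetrics)(Long.MinValue)
      val peakTimes: Array[Long] = Array.fill(numMetrics)(-1L)

      // Fold in one metrics update taken at `timestamp`.
      // Returns true if any metric reached a new peak.
      def compareAndUpdate(update: Array[Long], timestamp: Long): Boolean = {
        var updated = false
        for (i <- update.indices) {
          if (update(i) > metrics(i)) {
            metrics(i) = update(i)
            peakTimes(i) = timestamp
            updated = true
          }
        }
        updated
      }
    }

    val p = new PeakMetricsSketch(3)
    p.compareAndUpdate(Array(1L, 5L, 2L), timestamp = 100L)  // all three peak at t=100
    p.compareAndUpdate(Array(7L, 4L, 1L), timestamp = 300L)  // only metric 0 peaks at t=300
    // peakTimes is now Array(300, 100, 100): the peaks have no single common timestamp.
    ```

    With three different peak times for three metrics, logging one timestamp (or -1) per event loses that information, which is why it matters whether the stage-level view needs per-peak times at all.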