GitHub user edwinalu opened a pull request: https://github.com/apache/spark/pull/21221
[SPARK-23429][CORE] Add executor memory metrics to heartbeat and expose in executors REST API

The original PR #20940 is messed up, and its diff shows changes not related to SPARK-23429. This is a cleaned-up version of that pull request.

Add new executor-level memory metrics (JVM used memory, on/off heap execution memory, on/off heap storage memory, on/off heap unified memory, direct memory, and mapped memory), and expose them via the executors REST API. This information will help provide insight into how executor and driver JVM memory is used across the different memory regions. It can be used to help determine good values for spark.executor.memory, spark.driver.memory, spark.memory.fraction, and spark.memory.storageFraction.

## What changes were proposed in this pull request?

An ExecutorMetrics class is added, with jvmUsedHeapMemory, jvmUsedNonHeapMemory, onHeapExecutionMemory, offHeapExecutionMemory, onHeapStorageMemory, offHeapStorageMemory, onHeapUnifiedMemory, offHeapUnifiedMemory, directMemory, and mappedMemory.

The new ExecutorMetrics is sent by executors to the driver as part of the Heartbeat. A heartbeat is added for the driver as well, to collect these metrics for the driver.

The EventLoggingListener stores information about the peak values for each metric, per active stage and executor. When a StageCompleted event is seen, an ExecutorMetricsUpdate event will be logged for each executor, with the peak values for the stage. Only the ExecutorMetrics will be logged, and not the TaskMetrics, to minimize additional logging.

The AppStatusListener records the peak values for each memory metric.

The new memory metrics are added to the executors REST API.

## How was this patch tested?

New unit tests have been added. This was also tested on our cluster.
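For readers who want a concrete picture of the peak tracking described above, here is a rough sketch in plain Java. It is not the code in this PR: `PeakMetricsSketch`, `sampleJvm`, `onHeartbeat`, and `onStageCompleted` are hypothetical names chosen for illustration, and the map-of-maps layout is an assumption. The only real APIs used are the standard `java.lang.management` beans, which can report heap, non-heap, direct, and mapped buffer usage. The sketch also takes an explicit stage ID per heartbeat, whereas the description above applies each update to every active stage.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: NOT Spark's ExecutorMetrics or EventLoggingListener.
class PeakMetricsSketch {
    // stageId -> executorId -> metric name -> peak value seen so far
    private final Map<Integer, Map<String, Map<String, Long>>> peaks = new HashMap<>();

    // Sample a few JVM-level values with the standard java.lang.management beans.
    static Map<String, Long> sampleJvm() {
        Map<String, Long> m = new LinkedHashMap<>();
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        m.put("jvmUsedHeapMemory", mem.getHeapMemoryUsage().getUsed());
        m.put("jvmUsedNonHeapMemory", mem.getNonHeapMemoryUsage().getUsed());
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            // Assumption: "direct" and "mapped" are the pool names the JVM reports.
            if ("direct".equals(pool.getName())) m.put("directMemory", pool.getMemoryUsed());
            if ("mapped".equals(pool.getName())) m.put("mappedMemory", pool.getMemoryUsed());
        }
        return m;
    }

    // Called for each heartbeat: keep the per-metric maximum for this stage and executor.
    void onHeartbeat(int stageId, String executorId, Map<String, Long> metrics) {
        Map<String, Long> peak = peaks
            .computeIfAbsent(stageId, k -> new HashMap<>())
            .computeIfAbsent(executorId, k -> new HashMap<>());
        for (Map.Entry<String, Long> e : metrics.entrySet()) {
            peak.merge(e.getKey(), e.getValue(), Long::max);
        }
    }

    // Called when a stage completes: return the peaks per executor and forget the stage.
    Map<String, Map<String, Long>> onStageCompleted(int stageId) {
        Map<String, Map<String, Long>> done = peaks.remove(stageId);
        return done != null ? done : new HashMap<>();
    }

    public static void main(String[] args) {
        PeakMetricsSketch tracker = new PeakMetricsSketch();
        tracker.onHeartbeat(0, "driver", sampleJvm());
        tracker.onHeartbeat(0, "driver", sampleJvm());
        // One entry per executor, holding the largest value seen for each metric.
        System.out.println(tracker.onStageCompleted(0));
    }
}
```

Emitting only the per-stage peaks at stage completion, rather than every heartbeat sample, is what keeps the extra logging small in a scheme like this.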
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/edwinalu/spark SPARK-23429.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21221.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21221

----
commit c8e8abedbdfec6e92b0c63e90f3c2c5755fd8978
Author: Edwina Lu <edlu@...>
Date:   2018-03-09T23:39:36Z

    SPARK-23429: Add executor memory metrics to heartbeat and expose in executors REST API

    Add new executor level memory metrics (JVM used memory, on/off heap execution memory, on/off heap storage memory), and expose via the executors REST API. This information will help provide insight into how executor and driver JVM memory is used, and for the different memory regions. It can be used to help determine good values for spark.executor.memory, spark.driver.memory, spark.memory.fraction, and spark.memory.storageFraction.

    Add an ExecutorMetrics class, with jvmUsedMemory, onHeapExecutionMemory, offHeapExecutionMemory, onHeapStorageMemory, and offHeapStorageMemory. The new ExecutorMetrics will be sent by executors to the driver as part of Heartbeat. A heartbeat will be added for the driver as well, to collect these metrics for the driver.

    Modify the EventLoggingListener to log ExecutorMetricsUpdate events if there is a new peak value for any of the memory metrics for an executor and stage. Only the ExecutorMetrics will be logged, and not the TaskMetrics, to minimize additional logging.

    Modify the AppStatusListener to record the peak values for each memory metric.

    Add the new memory metrics to the executors REST API.
commit 5d6ae1c34bf6618754e4b8b2e756a9a7b4bad987
Author: Edwina Lu <edlu@...>
Date:   2018-04-02T02:13:41Z

    modify MimaExcludes.scala to filter changes to SparkListenerExecutorMetricsUpdate

commit ad10d2814bbfbaf8c21fcbb1abe83ef7a8e9ffe7
Author: Edwina Lu <edlu@...>
Date:   2018-04-22T00:02:57Z

    Address code review comments, change event logging to stage end.

----