[ https://issues.apache.org/jira/browse/SPARK-33906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-33906:
-------------------------------------

    Assignee: Baohe Zhang

> SPARK UI Executors page stuck when ExecutorSummary.peakMemoryMetrics is unset
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-33906
>                 URL: https://issues.apache.org/jira/browse/SPARK-33906
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 3.2.0
>            Reporter: Baohe Zhang
>            Assignee: Baohe Zhang
>            Priority: Blocker
>         Attachments: executor-page.png
>
>
> How to reproduce it?
> On macOS in standalone mode, open a spark-shell by running
> $SPARK_HOME/bin/spark-shell --master spark://localhost:7077
> and then run
> {code:scala}
> val x = sc.makeRDD(1 to 100000, 5)
> x.count()
> {code}
> Then open the app UI in the browser and click the Executors page. The page gets stuck:
> !executor-page.png!
> Also, the JSON returned by the REST API endpoint
> http://localhost:4040/api/v1/applications/app-20201224134418-0003/executors
> is missing "peakMemoryMetrics" for some executors.
> {noformat}
> [ {
>   "id" : "driver",
>   "hostPort" : "192.168.1.241:50042",
>   "isActive" : true,
>   "rddBlocks" : 0,
>   "memoryUsed" : 0,
>   "diskUsed" : 0,
>   "totalCores" : 0,
>   "maxTasks" : 0,
>   "activeTasks" : 0,
>   "failedTasks" : 0,
>   "completedTasks" : 0,
>   "totalTasks" : 0,
>   "totalDuration" : 0,
>   "totalGCTime" : 0,
>   "totalInputBytes" : 0,
>   "totalShuffleRead" : 0,
>   "totalShuffleWrite" : 0,
>   "isBlacklisted" : false,
>   "maxMemory" : 455501414,
>   "addTime" : "2020-12-24T19:44:18.033GMT",
>   "executorLogs" : { },
>   "memoryMetrics" : {
>     "usedOnHeapStorageMemory" : 0,
>     "usedOffHeapStorageMemory" : 0,
>     "totalOnHeapStorageMemory" : 455501414,
>     "totalOffHeapStorageMemory" : 0
>   },
>   "blacklistedInStages" : [ ],
>   "peakMemoryMetrics" : {
>     "JVMHeapMemory" : 135021152,
>     "JVMOffHeapMemory" : 149558576,
>     "OnHeapExecutionMemory" : 0,
>     "OffHeapExecutionMemory" : 0,
>     "OnHeapStorageMemory" : 3301,
>     "OffHeapStorageMemory" : 0,
>     "OnHeapUnifiedMemory" : 3301,
>     "OffHeapUnifiedMemory" : 0,
>     "DirectPoolMemory" : 67963178,
>     "MappedPoolMemory" : 0,
>     "ProcessTreeJVMVMemory" : 0,
>     "ProcessTreeJVMRSSMemory" : 0,
>     "ProcessTreePythonVMemory" : 0,
>     "ProcessTreePythonRSSMemory" : 0,
>     "ProcessTreeOtherVMemory" : 0,
>     "ProcessTreeOtherRSSMemory" : 0,
>     "MinorGCCount" : 15,
>     "MinorGCTime" : 101,
>     "MajorGCCount" : 0,
>     "MajorGCTime" : 0
>   },
>   "attributes" : { },
>   "resources" : { },
>   "resourceProfileId" : 0,
>   "isExcluded" : false,
>   "excludedInStages" : [ ]
> }, {
>   "id" : "0",
>   "hostPort" : "192.168.1.241:50054",
>   "isActive" : true,
>   "rddBlocks" : 0,
>   "memoryUsed" : 0,
>   "diskUsed" : 0,
>   "totalCores" : 12,
>   "maxTasks" : 12,
>   "activeTasks" : 0,
>   "failedTasks" : 0,
>   "completedTasks" : 5,
>   "totalTasks" : 5,
>   "totalDuration" : 2107,
>   "totalGCTime" : 25,
>   "totalInputBytes" : 0,
>   "totalShuffleRead" : 0,
>   "totalShuffleWrite" : 0,
>   "isBlacklisted" : false,
>   "maxMemory" : 455501414,
>   "addTime" : "2020-12-24T19:44:20.335GMT",
>   "executorLogs" : {
>     "stdout" : "http://192.168.1.241:8081/logPage/?appId=app-20201224134418-0003&executorId=0&logType=stdout",
>     "stderr" : "http://192.168.1.241:8081/logPage/?appId=app-20201224134418-0003&executorId=0&logType=stderr"
>   },
>   "memoryMetrics" : {
>     "usedOnHeapStorageMemory" : 0,
>     "usedOffHeapStorageMemory" : 0,
>     "totalOnHeapStorageMemory" : 455501414,
>     "totalOffHeapStorageMemory" : 0
>   },
>   "blacklistedInStages" : [ ],
>   "attributes" : { },
>   "resources" : { },
>   "resourceProfileId" : 0,
>   "isExcluded" : false,
>   "excludedInStages" : [ ]
> } ]
> {noformat}
> I debugged it and observed that ExecutorMetricsPoller.getExecutorUpdates
> returns an empty map, which causes peakExecutorMetrics to be None in
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/LiveEntity.scala#L345.
> A possible reason for the empty map is that the stage completes in less time
> than the heartbeat interval, so the stage entry in stageTCMP has already been
> removed before reportHeartbeat is called.
> How to fix it?
> Check whether peakMemoryMetrics is undefined in executorspage.js.
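
The proposed check could look roughly like the sketch below. It is only an illustration, not the actual patch to executorspage.js: the function names peakMetric, formatPeakJvmMemory and executorsMissingPeakMetrics are hypothetical, and falling back to 0 for a missing value is an assumption about what the page should display. The field names come from the REST response quoted in the issue.

```javascript
// Hypothetical helpers, assuming each executor object has the shape of one
// entry in the /api/v1/applications/<app-id>/executors response.

// Return a single peak metric, or 0 when peakMemoryMetrics is absent
// (as it is for executor "0" in the response above).
function peakMetric(executor, name) {
  var metrics = executor.peakMemoryMetrics;
  if (metrics === undefined || metrics === null) {
    return 0;
  }
  var value = metrics[name];
  return typeof value === "number" ? value : 0;
}

// Build an "on-heap / off-heap" cell value without dereferencing
// an undefined peakMemoryMetrics object.
function formatPeakJvmMemory(executor) {
  return peakMetric(executor, "JVMHeapMemory") + " / " +
    peakMetric(executor, "JVMOffHeapMemory");
}

// List the ids of executors whose summary has no peakMemoryMetrics,
// which is handy when inspecting the REST output by hand.
function executorsMissingPeakMetrics(executors) {
  return executors
    .filter(function (e) { return e.peakMemoryMetrics == null; })
    .map(function (e) { return e.id; });
}
```

With the two executors from the response above, formatPeakJvmMemory would read the driver's values directly and fall back to zeros for executor "0" instead of throwing on the missing object.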