[ https://issues.apache.org/jira/browse/SPARK-17582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-17582.
-------------------------------
    Resolution: Not A Problem

> Dead executors shouldn't show in the SparkUI
> --------------------------------------------
>
>                 Key: SPARK-17582
>                 URL: https://issues.apache.org/jira/browse/SPARK-17582
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: xukun
>            Priority: Minor
>
> When an executor is lost, the SparkUI ExecutorsTab still shows its executor info.
> In HeartbeatReceiver.scala:
> {code:borderStyle=solid}
> override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
>   // Messages sent and received locally
>   case ExecutorRegistered(executorId) =>
>     executorLastSeen(executorId) = clock.getTimeMillis()
>     context.reply(true)
>   case ExecutorRemoved(executorId) =>
>     executorLastSeen.remove(executorId)
>     context.reply(true)
>   case TaskSchedulerIsSet =>
>     scheduler = sc.taskScheduler
>     context.reply(true)
>   case ExpireDeadHosts =>
>     expireDeadHosts()
>     context.reply(true)
>   // Messages received from executors
>   case heartbeat @ Heartbeat(executorId, taskMetrics, blockManagerId) =>
>     if (scheduler != null) {
>       if (executorLastSeen.contains(executorId)) {
>         executorLastSeen(executorId) = clock.getTimeMillis()
>         eventLoopThread.submit(new Runnable {
>           override def run(): Unit = Utils.tryLogNonFatalError {
>             val unknownExecutor = !scheduler.executorHeartbeatReceived(
>               executorId, taskMetrics, blockManagerId)
>             val response = HeartbeatResponse(reregisterBlockManager = unknownExecutor)
>             context.reply(response)
>           }
>         })
>       } else {
>         // This may happen if we get an executor's in-flight heartbeat immediately
>         // after we just removed it. It's not really an error condition so we should
>         // not log warning here. Otherwise there may be a lot of noise especially if
>         // we explicitly remove executors (SPARK-4134).
>         logDebug(s"Received heartbeat from unknown executor $executorId")
>         context.reply(HeartbeatResponse(reregisterBlockManager = false))
>       }
>     } else {
>       // Because Executor will sleep several seconds before sending the first "Heartbeat", this
>       // case rarely happens. However, if it really happens, log it and ask the executor to
>       // register itself again.
>       logWarning(s"Dropping $heartbeat because TaskScheduler is not ready yet")
>       context.reply(HeartbeatResponse(reregisterBlockManager = true))
>     }
> }
> {code}
> If events occur in the following order:
> 1. A Heartbeat is being processed and the task submitted to eventLoopThread has not yet returned a result.
> 2. The executor is lost.
> then the variable unknownExecutor will be true, which leads to reregisterBlockManager being set to true.
> The result is that dead executors are still shown in the SparkUI.
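The ordering described in the report can be sketched with a small, self-contained model. This is only an illustration under stated assumptions: FakeScheduler, FakeHeartbeatReceiver, and HeartbeatRaceSketch are hypothetical stand-ins invented here, not Spark classes, and the deferred function returned by heartbeat() merely imitates the Runnable submitted to eventLoopThread. The model also assumes, as the reporter does, that the scheduler no longer knows the executor once it is lost.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the reply type named in the report; not Spark's class.
final case class HeartbeatResponse(reregisterBlockManager: Boolean)

// Hypothetical scheduler model: returns false when the executor is unknown to it.
final class FakeScheduler {
  private val known = mutable.Set.empty[String]
  def register(id: String): Unit = known += id
  def remove(id: String): Unit = known -= id
  def executorHeartbeatReceived(id: String): Boolean = known.contains(id)
}

// Hypothetical receiver model mirroring the branches of receiveAndReply.
final class FakeHeartbeatReceiver(scheduler: FakeScheduler) {
  private val executorLastSeen = mutable.Map.empty[String, Long]

  def executorRegistered(id: String): Unit =
    executorLastSeen(id) = System.currentTimeMillis()

  def executorRemoved(id: String): Unit =
    executorLastSeen.remove(id)

  // Returns the reply as a deferred function. For a known executor the
  // scheduler is consulted only when the function runs, imitating the
  // asynchronous task in the quoted code.
  def heartbeat(id: String): () => HeartbeatResponse =
    if (executorLastSeen.contains(id)) {
      executorLastSeen(id) = System.currentTimeMillis()
      () => HeartbeatResponse(reregisterBlockManager = !scheduler.executorHeartbeatReceived(id))
    } else {
      // Heartbeat from an executor that was already removed: no re-registration.
      val reply = HeartbeatResponse(reregisterBlockManager = false)
      () => reply
    }
}

object HeartbeatRaceSketch {
  def inFlightThenRemoved(): HeartbeatResponse = {
    val scheduler = new FakeScheduler
    val receiver = new FakeHeartbeatReceiver(scheduler)
    scheduler.register("exec-1")
    receiver.executorRegistered("exec-1")

    val deferred = receiver.heartbeat("exec-1") // step 1: accepted, reply still pending
    scheduler.remove("exec-1")                  // step 2: executor is lost
    receiver.executorRemoved("exec-1")
    deferred()                                  // pending task now sees an unknown executor
  }

  def main(args: Array[String]): Unit =
    println(inFlightThenRemoved())
}
```

In this model the deferred reply carries reregisterBlockManager = true, which is the outcome the reporter describes, whereas a heartbeat that arrives after the removal takes the other branch and gets false. Whether this ordering matters in a real cluster is a separate question; the issue was resolved as Not A Problem.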