[ https://issues.apache.org/jira/browse/SPARK-17582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15503461#comment-15503461 ]

Thomas Graves commented on SPARK-17582:
---------------------------------------

Yes, they are meant to be there; their status is DEAD.  This allows you to still 
see all the stats and get to the logs for each executor.

You will notice the summary table specifically has a Dead row.

If you have issues with the way they are displayed, perhaps propose something to 
make it easier.

> Dead executors shouldn't show in the SparkUI
> --------------------------------------------
>
>                 Key: SPARK-17582
>                 URL: https://issues.apache.org/jira/browse/SPARK-17582
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: xukun
>            Priority: Minor
>
> When an executor is lost, the SparkUI ExecutorsTab still shows its executor info.
> From HeartbeatReceiver.scala:
> {code:borderStyle=solid}
>   override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
>     // Messages sent and received locally
>     case ExecutorRegistered(executorId) =>
>       executorLastSeen(executorId) = clock.getTimeMillis()
>       context.reply(true)
>     case ExecutorRemoved(executorId) =>
>       executorLastSeen.remove(executorId)
>       context.reply(true)
>     case TaskSchedulerIsSet =>
>       scheduler = sc.taskScheduler
>       context.reply(true)
>     case ExpireDeadHosts =>
>       expireDeadHosts()
>       context.reply(true)
>     // Messages received from executors
>     case heartbeat @ Heartbeat(executorId, taskMetrics, blockManagerId) =>
>       if (scheduler != null) {
>         if (executorLastSeen.contains(executorId)) {
>           executorLastSeen(executorId) = clock.getTimeMillis()
>           eventLoopThread.submit(new Runnable {
>             override def run(): Unit = Utils.tryLogNonFatalError {
>               val unknownExecutor = !scheduler.executorHeartbeatReceived(
>                 executorId, taskMetrics, blockManagerId)
>               val response = HeartbeatResponse(reregisterBlockManager = unknownExecutor)
>               context.reply(response)
>             }
>           })
>         } else {
>           // This may happen if we get an executor's in-flight heartbeat immediately
>           // after we just removed it. It's not really an error condition so we should
>           // not log warning here. Otherwise there may be a lot of noise especially if
>           // we explicitly remove executors (SPARK-4134).
>           logDebug(s"Received heartbeat from unknown executor $executorId")
>           context.reply(HeartbeatResponse(reregisterBlockManager = false))
>         }
>       } else {
>         // Because Executor will sleep several seconds before sending the first "Heartbeat", this
>         // case rarely happens. However, if it really happens, log it and ask the executor to
>         // register itself again.
>         logWarning(s"Dropping $heartbeat because TaskScheduler is not ready yet")
>         context.reply(HeartbeatResponse(reregisterBlockManager = true))
>       }
>   }
> {code}
> If events occur in the following order:
> 1. a Heartbeat is being processed, and eventLoopThread has not yet returned a result
> 2. the executor is lost
> then the variable unknownExecutor will be true, which leads to 
> reregisterBlockManager being set in the response.
> The result is that dead executors are still shown in the SparkUI.
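The race described above is a check-then-act problem: the membership check in receiveAndReply and the scheduler lookup on eventLoopThread are two separate steps, so an executor can be removed in between. A minimal standalone sketch (HeartbeatRace, heartbeatCheck, and heartbeatProcess are illustrative names, not Spark's actual API):

```scala
import scala.collection.mutable

object HeartbeatRace {
  // Stands in for HeartbeatReceiver's executorLastSeen map.
  val executorLastSeen = mutable.Map[String, Long]()
  // Stands in for the TaskScheduler's knowledge of live executors.
  var schedulerKnows = Set[String]()

  def register(id: String): Unit = {
    executorLastSeen(id) = System.currentTimeMillis()
    schedulerKnows += id
  }

  def executorLost(id: String): Unit = {
    executorLastSeen.remove(id)
    schedulerKnows -= id
  }

  // Step 1: the synchronous check done in receiveAndReply.
  def heartbeatCheck(id: String): Boolean = executorLastSeen.contains(id)

  // Step 2: what runs later on eventLoopThread; returns the
  // reregisterBlockManager flag that would go into the reply.
  def heartbeatProcess(id: String): Boolean = {
    val unknownExecutor = !schedulerKnows.contains(id)
    unknownExecutor
  }

  def main(args: Array[String]): Unit = {
    register("exec-1")
    assert(heartbeatCheck("exec-1"))   // step 1 passes while the executor is alive
    executorLost("exec-1")             // executor is lost before step 2 runs
    val reregister = heartbeatProcess("exec-1")
    assert(reregister)                 // the dead executor is asked to re-register
  }
}
```

Because step 2 sees the executor as unknown, the reply asks a dead executor to re-register its block manager, matching the behavior reported above.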



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
