[ 
https://issues.apache.org/jira/browse/SPARK-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-4991:
-----------------------------------

    Assignee: Apache Spark

> Worker should reconnect to Master when Master actor restart
> -----------------------------------------------------------
>
>                 Key: SPARK-4991
>                 URL: https://issues.apache.org/jira/browse/SPARK-4991
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy, Spark Core
>    Affects Versions: 1.0.0, 1.1.0, 1.2.0
>            Reporter: Zhang, Liye
>            Assignee: Apache Spark
>
> This is a follow-up JIRA to 
> [SPARK-4989|https://issues.apache.org/jira/browse/SPARK-4989]. When the Master 
> Akka actor encounters an exception, the Master restarts (an Akka actor 
> restart, not a JVM restart), and all of its previous state is cleared 
> (including workers, applications, etc.). However, the workers are not aware of 
> this at all. The resulting cluster state is that the Master and all 
> workers are up, but the Master is not aware that the workers exist and 
> ignores every worker's heartbeat because none of the workers is registered. As 
> a result, the whole cluster is unavailable.
> For further information on this area, please refer to 
> [SPARK-3736|https://issues.apache.org/jira/browse/SPARK-3736] and 
> [SPARK-4592|https://issues.apache.org/jira/browse/SPARK-4592].
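
The failure mode described above, and one possible way for a worker to recover from it, can be sketched as a small self-contained simulation. This is only an illustration under stated assumptions, not Spark's actual code: the `Master` and `Worker` classes, their method names, and the idea of the Master returning a "not registered" signal in response to a heartbeat are all hypothetical stand-ins, and no Akka actors are involved.

```python
class Master:
    """Toy stand-in for the Master actor; registration state lives in memory."""

    def __init__(self):
        self.workers = {}  # worker_id -> number of heartbeats accepted

    def restart(self):
        # Models an actor restart (not a JVM restart): in-memory state is lost.
        self.workers = {}

    def register(self, worker_id):
        self.workers[worker_id] = 0

    def heartbeat(self, worker_id):
        """Return True if the heartbeat was accepted, False if the worker is unknown."""
        if worker_id not in self.workers:
            # Assumption: rather than silently ignoring the heartbeat,
            # the master tells the worker that it is not registered.
            return False
        self.workers[worker_id] += 1
        return True


class Worker:
    """Toy worker that re-registers when the master no longer knows it."""

    def __init__(self, worker_id, master):
        self.worker_id = worker_id
        self.master = master
        self.registrations = 0
        self.register()

    def register(self):
        self.master.register(self.worker_id)
        self.registrations += 1

    def send_heartbeat(self):
        # Hypothetical recovery path: a rejected heartbeat triggers re-registration.
        if not self.master.heartbeat(self.worker_id):
            self.register()


master = Master()
worker = Worker("worker-1", master)
worker.send_heartbeat()   # accepted: the worker is registered
master.restart()          # the master forgets every worker
worker.send_heartbeat()   # rejected, so the worker registers again
```

In this sketch the worker learns about the restart only through the rejected heartbeat, so recovery is delayed by at most one heartbeat interval; whether the real fix should use this signal or some other mechanism is left open here.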




