spark git commit: [SPARK-13447][YARN][CORE] Clean the stale states for AM failure and restart situation

andrewor14 Mon, 28 Mar 2016 17:04:11 -0700

Repository: spark
Updated Branches:
  refs/heads/master ad9e3d50f -> 2bc7c96d6



[SPARK-13447][YARN][CORE] Clean the stale states for AM failure and restart situation

## What changes were proposed in this pull request?

This is a follow-up fix to #9963. In #9963 the stale-state clean-up was handled 
only when dynamic allocation is enabled. This change also cleans the states in 
`CoarseGrainedSchedulerBackend` when dynamic allocation is disabled.
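
The behavioural difference can be illustrated with a minimal, self-contained sketch. It is not Spark's actual code: `ToyBackend`, `resetOld`, `resetNew`, and the private `removeExecutor` helper are hypothetical stand-ins, and the real implementation sends a `RemoveExecutor` message to the driver endpoint rather than mutating the map directly.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the scheduler backend (not Spark's class).
class ToyBackend(dynamicAllocationEnabled: Boolean) {
  var numPendingExecutors: Int = 0
  val executorsPendingToRemove = mutable.HashSet.empty[String]
  val executorDataMap = mutable.HashMap.empty[String, String]

  // Hypothetical helper; stands in for asking the driver endpoint to remove an executor.
  private def removeExecutor(eid: String, reason: String): Unit = {
    executorDataMap.remove(eid)
  }

  // Before this change: state is cleaned only when dynamic allocation is enabled.
  def resetOld(): Unit = synchronized {
    if (dynamicAllocationEnabled) resetNew()
  }

  // After this change: state is cleaned unconditionally.
  def resetNew(): Unit = synchronized {
    numPendingExecutors = 0
    executorsPendingToRemove.clear()
    // Iterate over a snapshot of the keys, because removeExecutor mutates the map.
    executorDataMap.keys.toList.foreach { eid =>
      removeExecutor(eid, "Stale executor after cluster manager re-registered.")
    }
  }
}

object ResetSketch {
  def main(args: Array[String]): Unit = {
    val backend = new ToyBackend(dynamicAllocationEnabled = false)
    backend.executorDataMap("1") = "host-a"
    backend.numPendingExecutors = 2

    backend.resetOld() // dynamic allocation disabled: stale entries survive
    println(backend.executorDataMap.size)

    backend.resetNew() // stale entries are removed regardless of the setting
    println(backend.executorDataMap.size)
  }
}
```

With dynamic allocation disabled, the old path leaves the stale executor entry and pending count in place, while the new path clears both.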

Please review. CC andrewor14, lianhuiwang. Thanks a lot.

## How was this patch tested?

Ran the unit tests locally, and also tested manually with an integration test.

Author: jerryshao <ss...@hortonworks.com>

Closes #11366 from jerryshao/SPARK-13447.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2bc7c96d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2bc7c96d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2bc7c96d

Branch: refs/heads/master
Commit: 2bc7c96d61a51bd458ba04e9d318640ddada559d
Parents: ad9e3d5
Author: jerryshao <ss...@hortonworks.com>
Authored: Mon Mar 28 17:03:21 2016 -0700
Committer: Andrew Or <and...@databricks.com>
Committed: Mon Mar 28 17:03:21 2016 -0700

----------------------------------------------------------------------
 .../cluster/CoarseGrainedSchedulerBackend.scala | 21 +++++++++-----------
 1 file changed, 9 insertions(+), 12 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/2bc7c96d/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
index b7919ef..eb4f533 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -356,20 +356,17 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
 
   /**
    * Reset the state of CoarseGrainedSchedulerBackend to the initial state. Currently it will only
-   * be called in the yarn-client mode when AM re-registers after a failure, also dynamic
-   * allocation is enabled.
+   * be called in the yarn-client mode when AM re-registers after a failure.
    * */
   protected def reset(): Unit = synchronized {
-    if (Utils.isDynamicAllocationEnabled(conf)) {
-      numPendingExecutors = 0
-      executorsPendingToRemove.clear()
-
-      // Remove all the lingering executors that should be removed but not yet. The reason might be
-      // because (1) disconnected event is not yet received; (2) executors die silently.
-      executorDataMap.toMap.foreach { case (eid, _) =>
-        driverEndpoint.askWithRetry[Boolean](
-          RemoveExecutor(eid, SlaveLost("Stale executor after cluster manager re-registered.")))
-      }
+    numPendingExecutors = 0
+    executorsPendingToRemove.clear()
+
+    // Remove all the lingering executors that should be removed but not yet. The reason might be
+    // because (1) disconnected event is not yet received; (2) executors die silently.
+    executorDataMap.toMap.foreach { case (eid, _) =>
+      driverEndpoint.askWithRetry[Boolean](
+        RemoveExecutor(eid, SlaveLost("Stale executor after cluster manager re-registered.")))
     }
   }
 


