Repository: spark Updated Branches: refs/heads/master ad9e3d50f -> 2bc7c96d6
[SPARK-13447][YARN][CORE] Clean the stale states for AM failure and restart situation ## What changes were proposed in this pull request? This is a follow-up fix of #9963, in #9963 we handle this stale states clean-up work only for dynamic allocation enabled scenario. Here we should also clean the states in `CoarseGrainedSchedulerBackend` for dynamic allocation disabled scenario. Please review, CC andrewor14 lianhuiwang , thanks a lot. ## How was this patch tested? Run the unit test locally, also with integration test manually. Author: jerryshao <ss...@hortonworks.com> Closes #11366 from jerryshao/SPARK-13447. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2bc7c96d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2bc7c96d Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2bc7c96d Branch: refs/heads/master Commit: 2bc7c96d61a51bd458ba04e9d318640ddada559d Parents: ad9e3d5 Author: jerryshao <ss...@hortonworks.com> Authored: Mon Mar 28 17:03:21 2016 -0700 Committer: Andrew Or <and...@databricks.com> Committed: Mon Mar 28 17:03:21 2016 -0700 ---------------------------------------------------------------------- .../cluster/CoarseGrainedSchedulerBackend.scala | 21 +++++++++----------- 1 file changed, 9 insertions(+), 12 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/2bc7c96d/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---------------------------------------------------------------------- diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala index b7919ef..eb4f533 100644 --- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala +++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala @@ -356,20 +356,17 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp /** * Reset the state of CoarseGrainedSchedulerBackend to the initial state. Currently it will only - * be called in the yarn-client mode when AM re-registers after a failure, also dynamic - * allocation is enabled. + * be called in the yarn-client mode when AM re-registers after a failure. * */ protected def reset(): Unit = synchronized { - if (Utils.isDynamicAllocationEnabled(conf)) { - numPendingExecutors = 0 - executorsPendingToRemove.clear() - - // Remove all the lingering executors that should be removed but not yet. The reason might be - // because (1) disconnected event is not yet received; (2) executors die silently. - executorDataMap.toMap.foreach { case (eid, _) => - driverEndpoint.askWithRetry[Boolean]( - RemoveExecutor(eid, SlaveLost("Stale executor after cluster manager re-registered."))) - } + numPendingExecutors = 0 + executorsPendingToRemove.clear() + + // Remove all the lingering executors that should be removed but not yet. The reason might be + // because (1) disconnected event is not yet received; (2) executors die silently. + executorDataMap.toMap.foreach { case (eid, _) => + driverEndpoint.askWithRetry[Boolean]( + RemoveExecutor(eid, SlaveLost("Stale executor after cluster manager re-registered."))) } } --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org