Repository: spark
Updated Branches:
  refs/heads/master c77aa42f5 -> ffe256ce1


[SPARK-25730][K8S] Delete executor pods from kubernetes after figuring out why they died

## What changes were proposed in this pull request?

`removeExecutorFromSpark` tries to fetch the reason the executor exited from
Kubernetes, which may be useful if the pod was OOMKilled. However, the code
previously deleted the pod from Kubernetes first, which made retrieving this
status impossible. This change fixes the ordering so that the exit reason is
read before the pod is deleted.
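
To illustrate why the order matters, here is a minimal, self-contained sketch. `FakeCluster`, `deleteThenInspect`, and `inspectThenDelete` are hypothetical names invented for this illustration; they are not part of Spark or the Kubernetes client, and the in-memory map only stands in for the pod status that the real code reads from Kubernetes.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the Kubernetes API: pod name -> termination reason.
final class FakeCluster {
  private val pods = mutable.Map[String, String]()

  def addPod(name: String, reason: String): Unit = pods(name) = reason

  // Returns the recorded exit reason, or None once the pod no longer exists.
  def exitReason(name: String): Option[String] = pods.get(name)

  def deletePod(name: String): Unit = pods.remove(name)
}

object RemovalOrderSketch {
  // Old ordering: the pod is deleted first, so the lookup finds nothing.
  def deleteThenInspect(cluster: FakeCluster, pod: String): Option[String] = {
    cluster.deletePod(pod)
    cluster.exitReason(pod)
  }

  // New ordering: read the exit reason while the pod still exists, then delete.
  def inspectThenDelete(cluster: FakeCluster, pod: String): Option[String] = {
    val reason = cluster.exitReason(pod)
    cluster.deletePod(pod)
    reason
  }

  def main(args: Array[String]): Unit = {
    val before = new FakeCluster
    before.addPod("exec-1", "OOMKilled")
    println(deleteThenInspect(before, "exec-1")) // prints None

    val after = new FakeCluster
    after.addPod("exec-1", "OOMKilled")
    println(inspectThenDelete(after, "exec-1")) // prints Some(OOMKilled)
  }
}
```

In both orderings the pod ends up deleted; only the second still has the exit reason available to report.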

On a separate but related note, it would be nice to wait some time before
removing the pod, to let the operator examine logs and such.

## How was this patch tested?

Running on my local cluster.

Author: Mike Kaplinskiy <mike.kaplins...@gmail.com>

Closes #22720 from mikekap/patch-1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ffe256ce
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ffe256ce
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ffe256ce

Branch: refs/heads/master
Commit: ffe256ce161884f0a1304b4925d51d39a9bfa5df
Parents: c77aa42
Author: Mike Kaplinskiy <mike.kaplins...@gmail.com>
Authored: Sun Oct 21 11:32:33 2018 -0700
Committer: Felix Cheung <felixche...@apache.org>
Committed: Sun Oct 21 11:32:33 2018 -0700

----------------------------------------------------------------------
 .../spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ffe256ce/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
----------------------------------------------------------------------
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
index cc254b8..1a75ae0 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
@@ -112,8 +112,8 @@ private[spark] class ExecutorPodsLifecycleManager(
       execId: Long,
       schedulerBackend: KubernetesClusterSchedulerBackend,
       execIdsRemovedInRound: mutable.Set[Long]): Unit = {
-    removeExecutorFromK8s(podState.pod)
     removeExecutorFromSpark(schedulerBackend, podState, execId)
+    removeExecutorFromK8s(podState.pod)
     execIdsRemovedInRound += execId
   }
 


