Repository: spark
Updated Branches:
  refs/heads/master c77aa42f5 -> ffe256ce1

[SPARK-25730][K8S] Delete executor pods from kubernetes after figuring out why they died

## What changes were proposed in this pull request?

`removeExecutorFromSpark` tries to fetch the reason the executor exited from Kubernetes, which may be useful if the pod was OOMKilled. However, the code previously deleted the pod from Kubernetes first which made retrieving this status impossible. This fixes the ordering.

On a separate but related note, it would be nice to wait some time before removing the pod - to let the operator examine logs and such.

## How was this patch tested?

Running on my local cluster.

Author: Mike Kaplinskiy <mike.kaplins...@gmail.com>

Closes #22720 from mikekap/patch-1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ffe256ce
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ffe256ce
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ffe256ce

Branch: refs/heads/master
Commit: ffe256ce161884f0a1304b4925d51d39a9bfa5df
Parents: c77aa42
Author: Mike Kaplinskiy <mike.kaplins...@gmail.com>
Authored: Sun Oct 21 11:32:33 2018 -0700
Committer: Felix Cheung <felixche...@apache.org>
Committed: Sun Oct 21 11:32:33 2018 -0700

----------------------------------------------------------------------
 .../spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ffe256ce/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
----------------------------------------------------------------------
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
index cc254b8..1a75ae0 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsLifecycleManager.scala
@@ -112,8 +112,8 @@ private[spark] class ExecutorPodsLifecycleManager(
       execId: Long,
       schedulerBackend: KubernetesClusterSchedulerBackend,
       execIdsRemovedInRound: mutable.Set[Long]): Unit = {
-    removeExecutorFromK8s(podState.pod)
     removeExecutorFromSpark(schedulerBackend, podState, execId)
+    removeExecutorFromK8s(podState.pod)
     execIdsRemovedInRound += execId
   }
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
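
The ordering problem described in the commit message can be illustrated with a small, self-contained Scala sketch. Everything here is a hypothetical stand-in: `FakeCluster`, `deleteThenLookup`, and `lookupThenDelete` are invented for illustration and are not part of Spark or any Kubernetes client. The sketch only assumes that a pod's termination reason can no longer be read once the pod is deleted.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the Kubernetes API server:
// maps a pod name to its last recorded termination reason.
final class FakeCluster {
  private val pods = mutable.Map[String, String]()

  def record(name: String, terminationReason: String): Unit =
    pods(name) = terminationReason

  // Returns None once the pod has been deleted.
  def terminationReason(name: String): Option[String] = pods.get(name)

  def delete(name: String): Unit = pods.remove(name)
}

object OrderingSketch {
  // Ordering before the patch: the pod is gone before its status is read.
  def deleteThenLookup(cluster: FakeCluster, pod: String): String = {
    cluster.delete(pod)
    cluster.terminationReason(pod).getOrElse("unknown")
  }

  // Ordering after the patch: read the status first, then delete the pod.
  def lookupThenDelete(cluster: FakeCluster, pod: String): String = {
    val reason = cluster.terminationReason(pod).getOrElse("unknown")
    cluster.delete(pod)
    reason
  }

  def main(args: Array[String]): Unit = {
    val before = new FakeCluster
    before.record("exec-1", "OOMKilled")
    println(deleteThenLookup(before, "exec-1")) // prints "unknown"

    val after = new FakeCluster
    after.record("exec-1", "OOMKilled")
    println(lookupThenDelete(after, "exec-1")) // prints "OOMKilled"
  }
}
```

In both orderings the pod ends up deleted; only the second one still has the exit reason available to report.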