[ https://issues.apache.org/jira/browse/SPARK-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471280#comment-16471280 ]

Matt Cheah commented on SPARK-24248:
------------------------------------

I thought about it a bit more, and believe that we can do most if not all of 
our actions in the Watcher directly.

In other words, can we drive the entire lifecycle of all executors solely from 
the perspective of watch events? This would be a pretty big rewrite, but the 
payoff is significant: it removes the need for synchronization and local state 
management entirely. {{KubernetesClusterSchedulerBackend}} becomes effectively 
stateless, apart from the parent class's fields.
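
To make the idea concrete, here is a minimal sketch of a watch-driven lifecycle decision. The event, phase, and decision types below are simplified stand-ins for illustration only, not the actual fabric8 {{Watcher}} callback or Spark's RPC messages; the point is that each decision is derived purely from the event itself, with no locally cached pod state:

```scala
// Simplified stand-ins for Kubernetes watch actions; the real backend
// would receive these through the fabric8 Watcher[Pod] callback.
sealed trait WatchAction
case object Added extends WatchAction
case object Modified extends WatchAction
case object Deleted extends WatchAction
case object Error extends WatchAction

// What the scheduler backend should do in response to a single event.
sealed trait Decision
case object Ignore extends Decision
case class RemoveExecutor(execId: String, exitReason: String) extends Decision

object WatchDrivenLifecycle {
  // Derive the lifecycle action from the watch event and the pod phase it
  // reports. No other state is consulted: the event carries everything.
  def decide(action: WatchAction, execId: String, podPhase: String): Decision =
    (action, podPhase) match {
      case (Deleted, _)            => RemoveExecutor(execId, s"Pod $execId was deleted")
      case (Error, _)              => RemoveExecutor(execId, s"Watch error for pod $execId")
      case (Modified, "Failed")    => RemoveExecutor(execId, s"Pod $execId failed")
      case (Modified, "Succeeded") => RemoveExecutor(execId, s"Pod $execId exited")
      case _                       => Ignore // Pending/Running pods need no action
    }
}
```

Because the function is pure, it needs no locks, and replaying the same event stream always yields the same decisions.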

This would imply at least the following changes, though I'm sure I'm missing 
some:
 * When the watcher receives a modified or error event, check the status of the 
executor, construct the exit reason, and call {{RemoveExecutor}} directly
 * The watcher keeps a running count of active executors and itself triggers 
rounds of creating new executors (instead of the periodic polling)
 * {{KubernetesDriverEndpoint::onDisconnected}} is a tricky one. What I'm 
thinking is that we can just disable the executor but not remove it, counting 
on the Watch to receive an event that would actually trigger removing the 
executor. The idea here is that the status of the pods as reported by the Watch 
should be fully reliable - e.g. whenever any error occurs in the executor such 
that it becomes unusable, the Kubernetes API should report such state. We could 
perhaps make the API's representation of the world more accurate by attaching 
liveness probes to the executor pod.
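
On that last point, attaching a liveness probe to the executor container might look roughly like the fragment below. This is only a sketch: the port, delays, and thresholds are assumed example values, not anything Spark configures today. If the probe fails repeatedly, the kubelet restarts or fails the container, and that state change surfaces through the Watch as a modified event:

```yaml
# Hypothetical liveness probe on the executor container: if the block
# manager port stops accepting connections, the kubelet acts on it, and
# the resulting pod status change reaches the driver via the Watch.
livenessProbe:
  tcpSocket:
    port: 7079          # example executor block manager port
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
```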

 

> [K8S] Use the Kubernetes cluster as the backing store for the state of pods
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-24248
>                 URL: https://issues.apache.org/jira/browse/SPARK-24248
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.3.0
>            Reporter: Matt Cheah
>            Priority: Major
>
> We have a number of places in KubernetesClusterSchedulerBackend right now 
> that maintain the state of pods in memory. However, the Kubernetes API can 
> always give us the most up-to-date and correct view of what our executors are 
> doing. We should consider moving away from in-memory state as much as we can 
> in favor of using the Kubernetes cluster as the source of truth for pod status. 
> Maintaining less state in memory makes it so that there's a lower chance that 
> we accidentally miss updating one of these data structures and breaking the 
> lifecycle of executors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
