[ 
https://issues.apache.org/jira/browse/SPARK-24105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454847#comment-16454847
 ] 

Anirudh Ramanathan commented on SPARK-24105:
--------------------------------------------

> To avoid this deadlock, it is necessary to support node selector (and, in the 
> future, affinity/anti-affinity) configuration per driver & executor.

Would inter-pod anti-affinity be a better fit for this use case?
In the extreme case, this is a gang scheduling issue IMO, where we don't want 
to schedule drivers if there are no executors that can be scheduled.
There's some work on gang scheduling ongoing in 
https://github.com/kubernetes/kubernetes/issues/61012 under sig-scheduling.
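
For illustration only, here is a minimal sketch of what an inter-pod 
anti-affinity rule of the kind suggested above could look like as a pod-spec 
fragment. It assumes driver pods carry a `spark-role=driver` label and that 
spreading is done per node via the `kubernetes.io/hostname` topology key; the 
function name `driver_anti_affinity` is hypothetical, and nothing in this 
thread indicates that Spark 2.3.0 exposes a setting to apply such a fragment.

```python
import json


def driver_anti_affinity(role_label="spark-role", role_value="driver",
                         topology_key="kubernetes.io/hostname"):
    """Build a pod-spec `affinity` fragment (hypothetical helper).

    The rule asks the scheduler not to place a pod in a topology domain
    (here: a node) that already runs a pod labelled role_label=role_value.
    The label name and value are assumptions, not taken from this thread.
    """
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        "matchLabels": {role_label: role_value}
                    },
                    "topologyKey": topology_key,
                }
            ]
        }
    }


if __name__ == "__main__":
    # Print the fragment as JSON so it can be compared with a pod manifest.
    print(json.dumps(driver_anti_affinity(), indent=2))
```

Attached to driver pods, a rule like this would cap drivers at one per node, 
which leaves room for executors but does not by itself give the 
all-or-nothing behaviour of gang scheduling.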

> Spark 2.3.0 on kubernetes
> -------------------------
>
>                 Key: SPARK-24105
>                 URL: https://issues.apache.org/jira/browse/SPARK-24105
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.3.0
>            Reporter: Lenin
>            Priority: Major
>
> Right now it is only possible to define node selector configurations 
> through spark.kubernetes.node.selector.[labelKey]. This setting is used for 
> both driver & executor pods. Without the capability to isolate driver & 
> executor pods, the cluster can run into a livelock scenario: if there are a 
> lot of Spark submissions, the driver pods can fill up the cluster capacity, 
> leaving no room for executor pods to do any work.
>  
> To avoid this deadlock, it is necessary to support node selector (and, in the 
> future, affinity/anti-affinity) configuration per driver & executor.
>  



