[ https://issues.apache.org/jira/browse/SPARK-24105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454847#comment-16454847 ]
Anirudh Ramanathan commented on SPARK-24105:
--------------------------------------------

> To avoid this deadlock, it's required to support node selector (in the future,
> affinity/anti-affinity) configuration per driver & executor.

Would inter-pod anti-affinity be a better bet here for this use case? In the extreme case, this is a gang scheduling issue IMO, where we don't want to schedule drivers if there are no executors that can be scheduled. There's some ongoing work on gang scheduling in https://github.com/kubernetes/kubernetes/issues/61012 under sig-scheduling.

> Spark 2.3.0 on kubernetes
> -------------------------
>
>                 Key: SPARK-24105
>                 URL: https://issues.apache.org/jira/browse/SPARK-24105
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.3.0
>            Reporter: Lenin
>            Priority: Major
>
> Right now it's only possible to define node selector configurations through
> spark.kubernetes.node.selector.[labelKey]. This gets used for both driver
> and executor pods. Without the capability to isolate driver and executor pods,
> the cluster can run into a livelock scenario: if there are a lot of Spark
> submits, the driver pods can fill up the cluster capacity, leaving no room
> for executor pods to do any work.
>
> To avoid this deadlock, it's required to support node selector (in the future,
> affinity/anti-affinity) configuration per driver & executor.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
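As a sketch of the limitation being discussed: in Spark 2.3, the single `spark.kubernetes.node.selector.[labelKey]` property applies to driver and executor pods alike, so there is no supported way to pin the two roles to separate node pools. The per-role property names below are illustrative only (they do not exist in Spark 2.3; they show the shape of what this issue is asking for), and the node-pool labels are assumptions:

```shell
# Spark 2.3: one node selector shared by BOTH driver and executor pods.
# Under heavy submission load, drivers alone can consume every matching
# node, leaving executors unschedulable (the livelock described above).
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.node.selector.spark-node=true \
  ...

# HYPOTHETICAL per-role variants proposed by this issue (NOT valid in
# Spark 2.3; names and label values are placeholders), which would let
# drivers and executors target disjoint node pools:
#   --conf spark.kubernetes.driver.node.selector.node-pool=drivers
#   --conf spark.kubernetes.executor.node.selector.node-pool=executors
```

Inter-pod anti-affinity, as suggested in the comment, would instead express the constraint on the Kubernetes side (e.g. "don't co-schedule too many driver pods on one node") rather than requiring new Spark configuration, but it does not by itself guarantee that executor capacity remains available.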