Seth Horrigan created SPARK-38794:
-------------------------------------

             Summary: When ConfigMap creation fails, Spark driver starts but fails to start executors
                 Key: SPARK-38794
                 URL: https://issues.apache.org/jira/browse/SPARK-38794
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1
            Reporter: Seth Horrigan


When running Spark in Kubernetes client mode, every executor pod assumes that a ConfigMap whose name exactly matches `KubernetesClientUtils.configMapNameExecutor` already exists (see [https://github.com/apache/spark/blob/02a055a42de5597cd42c1c0d4470f0e769571dc3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L98]).
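
For reference, the executor pod mounts that ConfigMap roughly as in the following sketch, built with Fabric8's model builders (the ConfigMap name shown is illustrative, taken from the error message below; the real name comes from `KubernetesClientUtils.configMapNameExecutor`):

{code:scala}
import io.fabric8.kubernetes.api.model.{Volume, VolumeBuilder, VolumeMount, VolumeMountBuilder}

// Illustrative name only; in Spark it is computed by
// KubernetesClientUtils.configMapNameExecutor.
val configMapName = "spark-exec-...-conf-map"

// The pod spec references the ConfigMap purely by name, so kubelet can
// never satisfy the mount if the ConfigMap was never created.
val confVolume: Volume = new VolumeBuilder()
  .withName("spark-conf-volume-exec")
  .withNewConfigMap()
    .withName(configMapName)
  .endConfigMap()
  .build()

val confMount: VolumeMount = new VolumeMountBuilder()
  .withName("spark-conf-volume-exec")
  .withMountPath("/opt/spark/conf")
  .build()
{code}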

If the ConfigMap creation fails (see [https://github.com/apache/spark/blob/02a055a42de5597cd42c1c0d4470f0e769571dc3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala#L80]), for example because the Kubernetes control plane is temporarily unavailable or because the service account lacks permission to create ConfigMaps, the driver starts fully and then waits for executors that will forever fail to start with:

MountVolume.SetUp failed for volume "spark-conf-volume-exec" : configmap "spark-exec-...-conf-map" not found

Either driver start-up should fail with an error, or the driver should retry the ConfigMap creation; a sketch of the retry-then-fail option follows.
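
A minimal sketch of that option, assuming Fabric8's KubernetesClient (the helper name and the retry parameters are hypothetical, not part of Spark):

{code:scala}
import io.fabric8.kubernetes.api.model.ConfigMap
import io.fabric8.kubernetes.client.{KubernetesClient, KubernetesClientException}

// Hypothetical helper: retry ConfigMap creation a few times with backoff,
// then fail the driver loudly instead of starting without the ConfigMap
// that every executor pod depends on.
def createConfigMapWithRetry(
    client: KubernetesClient,
    namespace: String,
    configMap: ConfigMap,
    maxAttempts: Int = 3,
    backoffMillis: Long = 1000L): Unit = {
  var attempt = 1
  var created = false
  while (!created) {
    try {
      client.configMaps().inNamespace(namespace).create(configMap)
      created = true
    } catch {
      case _: KubernetesClientException if attempt < maxAttempts =>
        // Transient control-plane outages are worth retrying; persistent
        // RBAC denials will keep failing and surface below instead.
        Thread.sleep(backoffMillis * attempt)
        attempt += 1
      case e: KubernetesClientException =>
        // Fail driver start-up rather than leaving executors permanently
        // unschedulable on a missing ConfigMap volume.
        throw new RuntimeException(
          s"Failed to create executor ConfigMap after $maxAttempts attempts", e)
    }
  }
}
{code}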


