[ 
https://issues.apache.org/jira/browse/SPARK-48327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Praneet Sharma updated SPARK-48327:
-----------------------------------
    Component/s: Kubernetes
                     (was: Spark Core)

> Concurrent Spark jobs execution on K8s cluster intermittently throws 
> 'configmaps already exists' error
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-48327
>                 URL: https://issues.apache.org/jira/browse/SPARK-48327
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.0
>            Reporter: Praneet Sharma
>            Priority: Major
>
> We run multiple iterations, each submitting 120 concurrent Spark jobs to a 
> Kubernetes cluster (version 1.20). In one such iteration, 2 Spark jobs failed 
> with a 'configmaps "spark-exec-2cf3698dc8c8226d-conf-map" already exists' 
> error:
>  
> {code:java}
> 2024-02-20 23:09:43Z ERROR SparkContext - Error initializing SparkContext.
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
> POST at: https://kubernetes.default.svc/api/v1/namespaces/default/configmaps. 
> Message: configmaps "spark-exec-2cf3698dc8c8226d-conf-map" already exists. 
> Received status: Status(apiVersion=v1, code=409, 
> details=StatusDetails(causes=[], group=null, kind=configmaps, 
> name=spark-exec-2cf3698dc8c8226d-conf-map, retryAfterSeconds=null, uid=null, 
> additionalProperties={}), kind=Status, message=configmaps 
> "spark-exec-2cf3698dc8c8226d-conf-map" already exists, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=AlreadyExists, status=Failure, additionalProperties={}).
>     at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:682)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:661)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:555)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:518)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:305)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61)
>  ~[kubernetes-client-5.12.2.jar:?]
>     at 
> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.setUpExecutorConfigMap(KubernetesClusterSchedulerBackend.scala:110)
>  ~[spark-kubernetes_2.12-3.3.1.jar:3.3.1]
>     at 
> org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.start(KubernetesClusterSchedulerBackend.scala:139)
>  ~[spark-kubernetes_2.12-3.3.1.jar:3.3.1]
>     at 
> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:222)
>  ~[spark-core_2.12-3.3.1.jar:3.3.1]
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:585) 
> ~[spark-core_2.12-3.3.1.jar:3.3.1]
>     at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704) 
> ~[spark-core_2.12-3.3.1.jar:3.3.1]
>     at 
> org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
>  ~[spark-sql_2.12-3.3.1.jar:3.3.1]
>     at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
>     at 
> org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947) 
> ~[spark-sql_2.12-3.3.1.jar:3.3.1] {code}
> I found 2 somewhat similar issues, SPARK-41006 and SPARK-39115, where a 
> similar error was reported for the Spark driver configmap.
>  

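The 409 above is the usual outcome of two creators racing to POST the same configmap name: the API server lets exactly one create succeed and rejects the rest with AlreadyExists. A minimal self-contained sketch of that race, where a ConcurrentHashMap stands in for the API server's configmap store (ConfigMapRaceSketch, create, and createOrAdopt are illustrative names, not Spark or fabric8 code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Illustrative sketch of the create race behind a 409 AlreadyExists.
// A ConcurrentHashMap stands in for the Kubernetes API server; this is
// NOT the real fabric8 client or Spark's scheduler backend code.
public class ConfigMapRaceSketch {
    static final Map<String, String> apiServer = new ConcurrentHashMap<>();

    // putIfAbsent models POST /configmaps: exactly one creator wins,
    // every other concurrent creator of the same name gets "already
    // exists" (HTTP 409), which is what the stack trace shows.
    static boolean create(String name, String data) {
        return apiServer.putIfAbsent(name, data) == null; // false => 409
    }

    // One possible client-side mitigation (not Spark's actual fix):
    // treat AlreadyExists as success if the object is already present.
    static boolean createOrAdopt(String name, String data) {
        return create(name, data) || apiServer.containsKey(name);
    }

    public static void main(String[] args) throws InterruptedException {
        String name = "spark-exec-2cf3698dc8c8226d-conf-map";
        CountDownLatch start = new CountDownLatch(1);
        boolean[] results = new boolean[2];
        Thread[] jobs = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            jobs[i] = new Thread(() -> {
                try {
                    start.await(); // line both jobs up on the same POST
                } catch (InterruptedException ignored) {
                }
                results[id] = create(name, "conf-" + id);
            });
            jobs[i].start();
        }
        start.countDown();
        for (Thread t : jobs) {
            t.join();
        }
        int winners = (results[0] ? 1 : 0) + (results[1] ? 1 : 0);
        System.out.println("winners=" + winners);           // always 1
        System.out.println("adopt=" + createOrAdopt(name, "retry"));
    }
}
```

In the sketch, exactly one of the two "jobs" creates the configmap and the other gets the AlreadyExists outcome; a create-or-adopt retry tolerates the 409. Whether Spark should retry, adopt, or guarantee unique names in setUpExecutorConfigMap is exactly what this issue is asking.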


--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
