[jira] [Commented] (SPARK-39006) Show a directional error message for PVC Dynamic Allocation Failure

2023-05-06 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720266#comment-17720266
 ] 

Dongjoon Hyun commented on SPARK-39006:
---

This is reverted via 
[https://github.com/apache/spark/commit/3ba1fa3678a4fcc0aaba8abb0d4312e8fb42efba]

> Show a directional error message for PVC Dynamic Allocation Failure
> ---
>
> Key: SPARK-39006
> URL: https://issues.apache.org/jira/browse/SPARK-39006
> Project: Spark
> Issue Type: Sub-task
> Components: Kubernetes
> Affects Versions: 3.4.0
> Reporter: Qian Sun
> Assignee: Qian Sun
> Priority: Minor
> Fix For: 3.4.0
>
>
> When a Spark application requests multiple executors and the PVC claimName 
> uses neither OnDemand nor SPARK_EXECUTOR_ID, the driver keeps trying to 
> create executor pods and the attempts fail, because a PVC with that name has 
> already been created by the first executor pod.
> {noformat}
> 22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying snapshot subscriber.
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims. Message: persistentvolumeclaims "test-1" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=persistentvolumeclaims "test-1" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
>         at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91) ~[kubernetes-client-5.10.1.jar:?]
>         at io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[kubernetes-client-5.10.1.jar:?]
>         at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at scala.collection.immutable.List.foreach(List.scala:431) ~[scala-library-2.12.15.jar:?]
>         at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) ~[scala-library-2.12.15.jar:?]
>         at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) ~[scala-library-2.12.15.jar:?]
>         at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) ~[scala-library-2.12.15.jar:?]
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) ~[scala-library-2.12.15.jar:?]
>         at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         
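
The condition in the issue description can be illustrated with a small sketch. It assumes the executor volume property key documented for Spark on Kubernetes (spark.kubernetes.executor.volumes.persistentVolumeClaim.[VolumeName].options.claimName) and the two placeholder strings named in the issue. The class name, method names, volume name "data" and mount path "/data" are hypothetical, and the check is a simplified stand-in, not the validation Spark itself performs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Spark.
class PvcClaimNameSketch {
    // Placeholder strings named in the issue description.
    static final String ON_DEMAND = "OnDemand";
    static final String EXECUTOR_ID = "SPARK_EXECUTOR_ID";

    // True when the claim name contains a placeholder, so each executor
    // can end up with its own PVC name instead of one shared fixed name.
    static boolean resolvesPerExecutor(String claimName) {
        return claimName.contains(ON_DEMAND) || claimName.contains(EXECUTOR_ID);
    }

    // Builds the executor volume properties for one PVC-backed volume.
    static Map<String, String> executorPvcConf(String volumeName,
                                               String claimName,
                                               String mountPath) {
        String prefix = "spark.kubernetes.executor.volumes.persistentVolumeClaim."
                + volumeName + ".";
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(prefix + "options.claimName", claimName);
        conf.put(prefix + "mount.path", mountPath);
        return conf;
    }

    public static void main(String[] args) {
        // A fixed name such as "test-1" (as in the log above) is shared by all executors.
        for (String name : new String[] {"test-1", "OnDemand", "data-SPARK_EXECUTOR_ID"}) {
            System.out.println(name + " -> per-executor: " + resolvesPerExecutor(name));
        }
        // Print the properties as --conf arguments for spark-submit.
        executorPvcConf("data", "OnDemand", "/data")
                .forEach((k, v) -> System.out.println("--conf " + k + "=" + v));
    }
}
```

With a fixed claim name and more than one executor, the second PVC creation request would be expected to hit the 409 AlreadyExists response shown in the log.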

[jira] [Commented] (SPARK-39006) Show a directional error message for PVC Dynamic Allocation Failure

2022-04-27 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528683#comment-17528683
 ] 

Apache Spark commented on SPARK-39006:
--

User 'dcoliversun' has created a pull request for this issue:
https://github.com/apache/spark/pull/36374

> Show a directional error message for PVC Dynamic Allocation Failure
> ---
>
> Key: SPARK-39006
> URL: https://issues.apache.org/jira/browse/SPARK-39006
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.1.0
> Reporter: qian
> Priority: Major
>
> When a Spark application requests multiple executors and the PVC claimName 
> uses neither OnDemand nor SPARK_EXECUTOR_ID, the driver keeps trying to 
> create executor pods and the attempts fail, because a PVC with that name has 
> already been created by the first executor pod.