[ https://issues.apache.org/jira/browse/SPARK-40298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17600631#comment-17600631 ]
Dongjoon Hyun commented on SPARK-40298:
---------------------------------------

Thank you for trying this Apache Spark feature, [~todd5167], but as [~hyukjin.kwon] mentioned, this is more like a question. First, could you provide a reproducible test case? I want to help you.

Second, I assume that you verified the KubernetesLocalDiskShuffleExecutorComponents logs correctly. However, the following could be a partial observation:

{quote}It can be confirmed that the PVC has been reused by other pods, and the index and data files are present.{quote}

SPARK-35593 was designed to help the recovery and to improve the stability with a best-effort approach, without any regressions. This means SPARK-35593 doesn't aim to block existing Spark features like re-computation or executor allocation with a new PVC. More specifically, there exist two cases where Spark's processing is faster than the recovery.

Case 1. When the Spark executor termination is a little slow and the PVC is not cleanly available to the Spark driver from the K8s control plane for some reason, the Spark driver creates a new executor with a new PVC (driver-owned, of course). In this case, you can have more PVCs than executors. You can confirm this case with the `kubectl` command, as in the first sketch below.

Case 2. When the Spark processing is faster than Spark's and K8s's executor allocation (pod creation + PVC assignment + Docker image downloading + ...), Spark recomputes the lineage with the running executors without waiting for the new executor allocation (or the recovery from it). It's Spark's original design, and it can always happen.
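For Case 1, a minimal sketch of the `kubectl` check, assuming a driver-owned on-demand PVC setup; the namespace is a placeholder and the exact PVC names depend on your deployment:

{code:bash}
# List the application's PVCs (adjust the namespace to your deployment)
kubectl get pvc -n <your-namespace>

# List the running executor pods; Spark labels them with spark-role=executor
kubectl get pods -n <your-namespace> -l spark-role=executor
{code}

If the PVC count exceeds the executor count, you are likely observing Case 1.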
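Separately, one hedged way to double-check the KubernetesLocalDiskShuffleExecutorComponents logs mentioned above is to grep an executor pod's log for that class name; the pod name below is a placeholder:

{code:bash}
# Search an executor pod's log for shuffle-recovery activity
kubectl logs <executor-pod-name> -n <your-namespace> | grep KubernetesLocalDiskShuffleExecutorComponents
{code}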
> shuffle data recovery on the reused PVCs has no effect
> -------------------------------------------------------
>
>                 Key: SPARK-40298
>                 URL: https://issues.apache.org/jira/browse/SPARK-40298
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.2.2
>            Reporter: todd
>            Priority: Major
>         Attachments: 1662002808396.jpg, 1662002822097.jpg
>
> I used Spark 3.2.2 to test the [Support shuffle data recovery on the reused PVCs (SPARK-35593)] feature. I found that when a shuffle read fails, the data is still read from the source.
> It can be confirmed that the PVC has been reused by other pods, and the index and data files are present.
> *This is my Spark configuration information:*
> --conf spark.driver.memory=5G
> --conf spark.executor.memory=15G
> --conf spark.executor.cores=1
> --conf spark.executor.instances=50
> --conf spark.sql.shuffle.partitions=50
> --conf spark.dynamicAllocation.enabled=false
> --conf spark.kubernetes.driver.reusePersistentVolumeClaim=true
> --conf spark.kubernetes.driver.ownPersistentVolumeClaim=true
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass=gp2
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit=100Gi
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/tmp/data
> --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=false
> --conf spark.executorEnv.SPARK_EXECUTOR_DIRS=/tmp/data
> --conf spark.shuffle.sort.io.plugin.class=org.apache.spark.shuffle.KubernetesLocalDiskShuffleDataIO
> --conf spark.kubernetes.executor.missingPodDetectDelta=10s
> --conf spark.kubernetes.executor.apiPollingInterval=10s
> --conf spark.shuffle.io.retryWait=60s
> --conf spark.shuffle.io.maxRetries=5