[ https://issues.apache.org/jira/browse/SPARK-31666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151363#comment-17151363 ]
Dongjoon Hyun commented on SPARK-31666:
---------------------------------------

First of all, the following has been Spark 2.4 behavior since 2.4.0. It's not a bug.
{quote}In Spark 2.4, the `LocalDirsFeatureStep` iterates through the list of paths in `spark.local.dir`. For each one, it creates a Kubernetes volume of mount type `emptyDir` with the name `spark-local-dir-${index}`.
{quote}
The following is an unsupported use case because the Spark 2.4.x features were not designed for it.
{quote}The issue is that I need my Spark job to use paths from my host machine that are on a mount point that isn't part of the directory which Kubernetes uses to allocate space for `emptyDir` volumes. Therefore, I mount these paths as type `hostPath` and ask Spark to use them as local directory space.
{quote}
Please note that the error message comes from K8s, not from Spark: because `spark.local.dir=/tmp1` already produces an `emptyDir` mount at `/tmp1`, your additional `hostPath` volume at the same path leaves the pod with two volume mounts at one mount path, which the API server rejects (see the pod-spec sketch below). Apache Spark supports your use case starting with Apache Spark 3.0, via new features (see the example below).

I guess you are confused about issue types in open-source projects. Not only Apache Spark but all Apache projects distinguish `Improvement` and `New Feature` from `Bug`. Many improvements and new features add exactly this kind of capability that older versions lack. We cannot backport everything into old branches, and all committers and developers are already moving on to Apache Spark 3.1.0.

You misunderstand the following as well.
{quote}I feel giving folks 6 months to migrate from one Spark release to the next is fair, especially now considering how mature Spark is as a project. What are your thoughts on this?
{quote}
We didn't kill 2.4.x the way we killed 1.6.x. Historically, 1.6 ended at 1.6.3. For 2.4.x, you can still use Apache Spark 2.4.6 and the releases that follow it. That's why the Apache Spark community declared 2.4 an LTS (long-term support) line: [https://spark.apache.org/versioning-policy.html]. We will maintain it with critical bug fixes and security fixes, as described at [https://spark.apache.org/security.html]. However, 2.4.7 (or 2.4.8) will be the same as 2.4.0~2.4.6 in terms of features. That's the community policy.
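To make the mount-path collision above concrete, the executor pod that the 2.4 submission client requests in this scenario looks roughly like the fragment below. This is an illustrative sketch, not the full generated spec: the first volume name follows the `spark-local-dir-${index}` convention quoted above (the exact index is illustrative), and the second volume is the `hostPath` volume from the spark-submit command in this issue.
{code:yaml}
# Illustrative sketch of the generated executor pod spec (simplified).
volumes:
  - name: spark-local-dir-1    # added by LocalDirsFeatureStep for spark.local.dir=/tmp1
    emptyDir: {}
  - name: spark-local-dir-2    # the user-supplied hostPath volume from spark-submit
    hostPath:
      path: /tmp1
containers:
  - volumeMounts:
      - name: spark-local-dir-1
        mountPath: /tmp1       # first mount at /tmp1
      - name: spark-local-dir-2
        mountPath: /tmp1       # second mount at /tmp1 -> rejected: "must be unique"
{code}
This is exactly the `spec.containers[0].volumeMounts[1].mountPath: Invalid value: "/tmp1": must be unique` failure in the report below: the index `[1]` points at the second mount of the same path.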
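In Spark 3.0, the supported way to get this behavior is to give the volume a name starting with `spark-local-dir-`, so Spark uses it directly as local (scratch) storage, and not to point `spark.local.dir` at the same path. A minimal sketch, keeping the cluster, namespace, and host path from the reproduction below; the image name `my-spark-3.0-image:my-tag` is a placeholder for a Spark 3.0 image. See the "Local Storage" section of the Spark 3.0 running-on-kubernetes documentation.
{code:bash}
# Sketch only: in 3.0, a volume whose name starts with "spark-local-dir-" is
# used for executor scratch space. Do NOT also set spark.local.dir to the same
# path, or the duplicate-mountPath conflict comes back.
bin/spark-submit \
  --master k8s://https://my-k8s-server:443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=my-spark-3.0-image:my-tag \
  --conf spark.kubernetes.namespace=my-spark-ns \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.mount.path=/tmp1 \
  --conf spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.options.path=/tmp1 \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar 20000
{code}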
> Cannot map hostPath volumes to container
> ----------------------------------------
>
>                 Key: SPARK-31666
>                 URL: https://issues.apache.org/jira/browse/SPARK-31666
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 2.4.5
>            Reporter: Stephen Hopper
>            Priority: Major
>
> I'm trying to mount additional hostPath directories, as shown in a couple of places:
> [https://aws.amazon.com/blogs/containers/optimizing-spark-performance-on-kubernetes/]
> [https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/user-guide.md#using-volume-for-scratch-space]
> [https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes]
>
> However, whenever I try to submit my job, I run into this error:
> {code:java}
> Uncaught exception in thread kubernetes-executor-snapshots-subscribers-1
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing:
> POST at: https://kubernetes.default.svc/api/v1/namespaces/my-spark-ns/pods.
> Message: Pod "spark-pi-1588970477877-exec-1" is invalid:
> spec.containers[0].volumeMounts[1].mountPath: Invalid value: "/tmp1": must be unique.
> Received status: Status(apiVersion=v1, code=422,
> details=StatusDetails(causes=[StatusCause(field=spec.containers[0].volumeMounts[1].mountPath,
> message=Invalid value: "/tmp1": must be unique, reason=FieldValueInvalid,
> additionalProperties={})], group=null, kind=Pod, name=spark-pi-1588970477877-exec-1,
> retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status,
> message=Pod "spark-pi-1588970477877-exec-1" is invalid:
> spec.containers[0].volumeMounts[1].mountPath: Invalid value: "/tmp1": must be unique,
> metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null,
> selfLink=null, additionalProperties={}), reason=Invalid, status=Failure,
> additionalProperties={}).
> {code}
>
> This is my spark-submit command (note: I've used my own build of Spark for
> Kubernetes as well as a few other images I've seen floating around, such as
> seedjeffwan/spark:v2.4.5, and they all have this same issue):
> {code:java}
> bin/spark-submit \
>   --master k8s://https://my-k8s-server:443 \
>   --deploy-mode cluster \
>   --name spark-pi \
>   --class org.apache.spark.examples.SparkPi \
>   --conf spark.executor.instances=2 \
>   --conf spark.kubernetes.container.image=my-spark-image:my-tag \
>   --conf spark.kubernetes.driver.pod.name=sparkpi-test-driver \
>   --conf spark.kubernetes.namespace=my-spark-ns \
>   --conf spark.kubernetes.executor.volumes.hostPath.spark-local-dir-2.mount.path=/tmp1 \
>   --conf spark.kubernetes.executor.volumes.hostPath.spark-local-dir-2.options.path=/tmp1 \
>   --conf spark.local.dir="/tmp1" \
>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>   local:///opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar 20000
> {code}
> Any ideas on what's causing this?