[ https://issues.apache.org/jira/browse/SPARK-27499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860519#comment-16860519 ]
Junjie Chen commented on SPARK-27499:
-------------------------------------

Hi [~dongjoon],

I know SPARK_LOCAL_DIRS can be mounted as an emptyDir. However, an emptyDir is just one directory on the node. I opened this Jira to track a feature for setting multiple directories, so that spilling can fully utilize the bandwidth of the node's disks. I think this currently cannot be achieved by setting spark.local.dir: even if I set it to multiple dirs, they still map to one directory on the node.

This Jira is intended to use hostPath volume mounts as spark.local.dir, for example:

spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.mount.path=/data/mnt-x

> Support mapping spark.local.dir to hostPath volume
> --------------------------------------------------
>
>                 Key: SPARK-27499
>                 URL: https://issues.apache.org/jira/browse/SPARK-27499
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.0.0
>            Reporter: Junjie Chen
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> Currently, the k8s executor builder mounts spark.local.dir as emptyDir or
> memory. This should satisfy small workloads, but with heavy workloads
> like TPCDS both can cause problems: pods are evicted due to disk pressure
> when using emptyDir, and run out of memory (OOM) when using tmpfs.
> In cloud environments in particular, users may allocate a cluster with a
> minimum configuration and add cloud storage when running a workload. In this
> case, we can specify multiple elastic storage volumes as spark.local.dir to
> accelerate spilling.
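
The mapping proposed in the comment above, one hostPath volume per local directory, could be sketched as follows. This is only an illustration, not the implementation tracked by this Jira: the helper names `local_dir_confs` and `to_submit_args` and the second path `/data/mnt-y` are hypothetical, and the use of `options.path` to name the directory on the node is an assumption based on the general form of Spark's Kubernetes volume properties.

```python
# Sketch: generate one hostPath volume per local directory, following the
# property naming used in the comment (spark-local-dir-1, spark-local-dir-2, ...).

def local_dir_confs(host_paths, role="executor"):
    """Return Spark conf entries mapping each node directory to a hostPath volume.

    Assumption: the directory is mounted at the same path inside the pod.
    """
    confs = {}
    for i, path in enumerate(host_paths, start=1):
        prefix = f"spark.kubernetes.{role}.volumes.hostPath.spark-local-dir-{i}"
        confs[f"{prefix}.mount.path"] = path    # path inside the pod
        confs[f"{prefix}.options.path"] = path  # path on the node (assumed option)
    return confs


def to_submit_args(confs):
    """Flatten conf entries into spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args


# /data/mnt-x comes from the comment; /data/mnt-y is a hypothetical second disk.
confs = local_dir_confs(["/data/mnt-x", "/data/mnt-y"])
submit_args = to_submit_args(confs)
```

Each directory gets its own volume name, so directories on separate disks stay separate mounts rather than collapsing into a single emptyDir.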