[ https://issues.apache.org/jira/browse/SPARK-27499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860519#comment-16860519 ]

Junjie Chen commented on SPARK-27499:
-------------------------------------

Hi [~dongjoon], I know SPARK_LOCAL_DIRS can be mounted as an emptyDir. However, an 
emptyDir is just one directory on the node. I opened this Jira to track a feature for 
setting multiple directories, to fully utilize the node's disk bandwidth for 
spilling, which I think currently cannot be achieved through spark.local.dir: even 
if I set it to multiple directories, they still map to one directory on the node.

This Jira is intended to use hostPath volume mounts as spark.local.dir, for 
example:

spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1.mount.path=/data/mnt-x

> Support mapping spark.local.dir to hostPath volume
> --------------------------------------------------
>
>                 Key: SPARK-27499
>                 URL: https://issues.apache.org/jira/browse/SPARK-27499
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.0.0
>            Reporter: Junjie Chen
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> Currently, the k8s executor builder mounts spark.local.dir as an emptyDir or 
> in memory (tmpfs). That satisfies small workloads, but under heavy workloads 
> such as TPC-DS both can run into problems: pods are evicted due to disk 
> pressure when using emptyDir, and executors hit OOM when using tmpfs.
> In particular, in cloud environments users may allocate a cluster with a 
> minimal configuration and attach cloud storage when running a workload. In 
> this case, we could specify multiple elastic storage volumes as 
> spark.local.dir to accelerate spilling.


