I think you'll want to change the value of spark.local.dir to point to where your PVC is mounted. Can you give that a try and let us know if that moves the spills as expected?
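As a sketch of what that could look like (the PVC name `spark-pvc` and mount path `/spark-tmp` are placeholders, assuming Spark 2.4+'s Kubernetes volume properties; substitute your own claim name and path):

```shell
# Hypothetical example: mount an existing PVC (claim "spark-pvc") at /spark-tmp
# on both the driver and executor pods, then point spark.local.dir at that path
# so shuffle/spill data lands on the PVC instead of the default emptyDir.
spark-submit \
  --master k8s://https://<k8s-apiserver>:<port> \
  --deploy-mode cluster \
  --conf spark.local.dir=/spark-tmp \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-pvc.mount.path=/spark-tmp \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-pvc.options.claimName=spark-pvc \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-pvc.mount.path=/spark-tmp \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-pvc.options.claimName=spark-pvc \
  ...
```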
-Matt Cheah

From: Tomasz Krol <patric...@gmail.com>
Date: Wednesday, February 27, 2019 at 3:41 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Spark on k8s - map persistentStorage for data spilling

Hey Guys,

I hope someone will be able to help me, as I've been stuck on this for a while :) Basically I am running some jobs on Kubernetes as per the documentation: https://spark.apache.org/docs/latest/running-on-kubernetes.html [spark.apache.org]

All works fine; however, if I run queries on bigger data volumes, the jobs fail because there is not enough space in the /var/data/spark-1xxx directory. The reason for this is that the mounted emptyDir doesn't have enough space. I also mounted a PVC to the driver and executor pods, which I can see during the runtime.

I am wondering if someone knows how to make data spill to a different directory (i.e. my persistent storage directory) instead of the emptyDir with its limited space. Or can I somehow mount the emptyDir on my PVC? Basically, at the moment I can't run any jobs as they fail due to insufficient space in that /var/data directory.

Thanks

--
Tomasz Krol
patric...@gmail.com