I think we want to change the value of spark.local.dir to point to where your 
PVC is mounted. Can you give that a try and let us know if that moves the 
spills as expected?
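
For reference, here is a minimal sketch of what that could look like, written
as a small Python helper that builds the --conf flags for spark-submit. The
volume name, claim name, and mount path are hypothetical placeholders, not
values from this thread, and the per-volume keys assume the
spark.kubernetes.*.volumes.persistentVolumeClaim.* settings available in
Spark 2.4 and later. Whether this actually relocates the spills is exactly
what still needs to be confirmed.

```python
# Sketch only: build spark-submit --conf flags so that spark.local.dir points
# at the directory where the PVC is mounted. All three arguments below are
# hypothetical placeholders; substitute the values from your own cluster.

def local_dir_conf(volume_name, claim_name, mount_path):
    """Return --conf flags that mount a PVC and use it as spark.local.dir."""
    conf = {"spark.local.dir": mount_path}
    # Mount the same claim at the same path on both the driver and executors.
    for role in ("driver", "executor"):
        prefix = (
            f"spark.kubernetes.{role}.volumes.persistentVolumeClaim.{volume_name}"
        )
        conf[f"{prefix}.mount.path"] = mount_path
        conf[f"{prefix}.options.claimName"] = claim_name
    flags = []
    for key, value in conf.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


flags = local_dir_conf("spill-volume", "my-spark-pvc", "/mnt/spark-spill")
print(" ".join(flags))
```

The printed flags can be appended to an existing spark-submit command. If the
PVC is already mounted through other means, only the spark.local.dir entry
should be needed.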


-Matt Cheah


From: Tomasz Krol <patric...@gmail.com>
Date: Wednesday, February 27, 2019 at 3:41 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Spark on k8s - map persistentStorage for data spilling


Hey Guys,


I hope someone will be able to help me, as I've been stuck on this for a
while :) Basically, I am running some jobs on Kubernetes as per the
documentation.




Everything works fine; however, when I run queries on larger data volumes, the
jobs fail because there is not enough space in the /var/data/spark-1xxx
directory.


Obviously, the reason is that the mounted emptyDir doesn't have enough space.


I also mounted a PVC to the driver and executor pods, which I can see at
runtime. I am wondering if someone knows how to configure Spark so that data
is spilled to a different directory (i.e. my persistent storage directory)
instead of the emptyDir with its limited space, or whether I can somehow mount
the emptyDir on my PVC. Basically, at the moment I can't run any jobs, as they
fail due to insufficient space in that /var/data directory.




Tomasz Krol
