Hello,
I've been trying to run an iterative Spark job that spills 1+ GB to disk
per iteration on a system with limited disk space. I believe there would
be enough space if Spark cleaned up unused data from previous iterations,
but as it stands the number of iterations I can run is limited by the
available disk space.
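For context, the job looks roughly like this (heavily simplified; the
dataset and the update step here are placeholders, but the shape is the
same):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._ // pair-RDD implicits on older Spark

    object IterativeJob {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("iterative-job"))

        // Placeholder state; the real dataset is much larger.
        var state = sc.parallelize(1 to 1000000).map(x => (x % 1000, x.toLong))

        for (i <- 1 to 100) {
          // Each pass shuffles, and the shuffle files written by earlier
          // iterations stay on disk even though they're never read again.
          state = state.map { case (k, v) => (k, v + 1) }.reduceByKey(_ + _)
          state.count() // force materialization of this iteration
        }
        sc.stop()
      }
    }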
I found a thread discussing spark.cleaner.ttl on the old Spark Users
Google group here:
https://groups.google.com/forum/#!topic/spark-users/9ebKcNCDih4
I think this setting may be what I'm looking for; however, the cleaner
seems to delete data that's still in use. The effect is that I get
bizarre exceptions from Spark complaining about missing broadcast data,
or ArrayIndexOutOfBoundsException errors. When is spark.cleaner.ttl safe
to use? Is it supposed to delete in-use data, or is this a
bug/shortcoming?
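For reference, this is roughly how I'm enabling it (assuming Spark 0.9+
where SparkConf exists; the 3600-second TTL is just a value I picked,
not a recommendation):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("iterative-job")
      .set("spark.cleaner.ttl", "3600") // TTL in seconds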
Cheers,
Michael