change default storage level

2015-07-09 Thread Michal Čizmazia
Is there a way to change the default storage level? If not, how can I properly change the storage level wherever necessary, if my input and intermediate results do not fit into memory? In this example: context.wholeTextFiles(...).flatMap(s => ...).flatMap(s => ...) Does persist()...
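A minimal sketch of the pipeline in the question, assuming an existing SparkContext named context; the input path and transformation bodies are placeholders. As far as I know, persist() without arguments defaults to MEMORY_ONLY, and the level is chosen per call rather than through a global setting:

    import org.apache.spark.storage.StorageLevel

    // Placeholder path and transformations. Passing an explicit
    // StorageLevel to persist() is how the level is changed per RDD.
    val files = context.wholeTextFiles("hdfs:///input")
    val words = files
      .flatMap { case (_, text) => text.split("\n") }
      .flatMap(line => line.split(" "))
      .persist(StorageLevel.MEMORY_AND_DISK) // spill to disk when memory is full

    words.count() // the first action materializes and caches the RDD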

Re: change default storage level

2015-07-09 Thread Shixiong Zhu
Spark won't store RDDs in memory unless you use a memory StorageLevel. By default, your input and intermediate results won't be put into memory. You can call persist() if you want to avoid duplicate computation or reading. E.g., val r1 = context.wholeTextFiles(...) val r2 = r1.flatMap(s => ...) val...
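A runnable sketch of the pattern Shixiong describes, with a placeholder path and transformations; the point is that persist() keeps a second action from recomputing r2 from the input:

    import org.apache.spark.storage.StorageLevel

    val r1 = context.wholeTextFiles("hdfs:///input")            // placeholder path
    val r2 = r1.flatMap { case (_, text) => text.split("\n") }
    r2.persist(StorageLevel.MEMORY_AND_DISK)  // allow spilling instead of memory-only

    val n = r2.count()  // computes r2 and stores its partitions
    val h = r2.first()  // served from the stored partitions, no recomputation
    r2.unpersist()      // release the storage when no longer needed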

Re: change default storage level

2015-07-09 Thread Michal Čizmazia
Thanks Shixiong! Your response helped me to understand the role of persist(). Indeed, no persist() calls were required. I solved my problem by setting spark.local.dir to allow more space for Spark's temporary folder. Spilling to it happens automatically. I am seeing logs like this: Not enough space to cache...
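For reference, one way to set spark.local.dir when building the context (the directory path is a placeholder; note that on cluster managers such as YARN or standalone, the manager's own local-dir setting takes precedence over this property):

    import org.apache.spark.{SparkConf, SparkContext}

    // Point Spark's scratch space (shuffle files and blocks spilled
    // from memory) at a volume with enough free space. Placeholder path.
    val conf = new SparkConf()
      .setAppName("example")
      .set("spark.local.dir", "/mnt/large-volume/spark-tmp")
    val context = new SparkContext(conf)

The same property can also be passed on the command line, e.g. spark-submit --conf spark.local.dir=/mnt/large-volume/spark-tmp ...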