rdd cache priority

2016-02-04 Thread charles li
Say I have two RDDs, RDD1 and RDD2, both 20 GB in memory, and I cache both of them using RDD1.cache() and RDD2.cache(). In the later steps of my app I never use RDD1, but I use RDD2 many times. Here is my question: if there is only 40 GB of memory in my cluster, and here
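
To make the scenario concrete, here is a minimal sketch (assuming a local SparkContext and toy data standing in for the 20 GB RDDs; the names rdd1 and rdd2 mirror the question). When the storage memory fills up, Spark evicts cached blocks in roughly LRU order, so the never-reused rdd1 is the likely eviction victim:

    import org.apache.spark.{SparkConf, SparkContext}

    object CachePriorityExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("cache-priority").setMaster("local[*]"))

        // Two RDDs, both marked for in-memory caching.
        val rdd1 = sc.parallelize(1 to 1000000).map(_ * 2)
        val rdd2 = sc.parallelize(1 to 1000000).map(_ + 1)
        rdd1.cache()
        rdd2.cache()

        // rdd1 is materialized once and never touched again...
        rdd1.count()

        // ...while rdd2 is reused many times. Under memory pressure,
        // cached blocks are evicted LRU-style, so rdd2's recently used
        // blocks tend to survive over rdd1's.
        (1 to 10).foreach(_ => rdd2.count())

        sc.stop()
      }
    }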

Re: rdd cache priority

2016-02-04 Thread Takeshi Yamamuro
Hi, you're right; rdd3 is not fully cached, so it is re-computed every time. With MEMORY_AND_DISK, rdd3 is spilled to disk instead. Also, the current Spark does not automatically unpersist RDDs based on their frequency of use. On Fri, Feb 5, 2016 at 12:15 PM, charles li wrote: >
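
A minimal sketch of the two remedies named in the reply, assuming a local SparkContext and hypothetical RDD names (cold and hot): persist with StorageLevel.MEMORY_AND_DISK so evicted blocks are read back from disk rather than re-computed from the lineage, and call unpersist() yourself, since Spark will not drop cached RDDs for you based on how often they are used:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object StorageLevelExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("storage-level").setMaster("local[*]"))

        // Memory-only cache: evicted blocks must be re-computed.
        val cold = sc.parallelize(1 to 1000000).cache()

        // Memory with disk spillover: evicted blocks are written to
        // disk and read back, avoiding re-computation.
        val hot = sc.parallelize(1 to 1000000)
          .persist(StorageLevel.MEMORY_AND_DISK)

        cold.count()
        (1 to 10).foreach(_ => hot.count())

        // Spark never unpersists an RDD automatically by usage frequency;
        // reclaim memory for hot data by dropping cold data explicitly.
        cold.unpersist()

        sc.stop()
      }
    }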