Re: Understanding Spark's caching

2015-04-28 Thread ayan guha
Hi, I replied to you on SO. If option A had an action call, then it should suffice too. On 28 Apr 2015 05:30, Eran Medan eran.me...@gmail.com wrote: Hi Everyone! I'm trying to understand how Spark's cache works. Here is my naive understanding, please let me know if I'm missing something: val

Re: Understanding Spark's caching

2015-04-28 Thread Akhil Das
Option B would be fine. As the SO answer itself says: since RDD transformations merely build DAG descriptions without executing them, in Option A, by the time you call unpersist, you still have only job descriptions and not a running execution. Also note that in Option A you are not specifying any

Understanding Spark's caching

2015-04-27 Thread Eran Medan
Hi Everyone! I'm trying to understand how Spark's cache works. Here is my naive understanding, please let me know if I'm missing something:
    val rdd1 = sc.textFile("some data")
    rdd1.cache() // marks rdd1 as cached
    val rdd2 = rdd1.filter(...)
    val rdd3 = rdd1.map(...)
    rdd2.saveAsTextFile(...)
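To make the point from the replies concrete, here is a minimal sketch of the ordering they recommend: mark the RDDs for caching, run the actions, and only then unpersist the parent. The full Option A and Option B listings are not shown in this excerpt, so this is only my reading of them. Several details are assumptions added so the sketch is self-contained: the local-mode `SparkContext`, the object name `CacheOrderSketch`, `sc.parallelize` standing in for `sc.textFile(...)`, the concrete filter and map functions, and `count()` standing in for `saveAsTextFile(...)`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical object name, used only for this illustration.
object CacheOrderSketch {
  def main(args: Array[String]): Unit = {
    // Local-mode context: an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("cache-order-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Stand-in for sc.textFile(...) from the question.
      val rdd1 = sc.parallelize(1 to 1000).map(_.toString)
      rdd1.cache() // only marks rdd1 for caching; nothing is computed yet

      val rdd2 = rdd1.filter(_.endsWith("0")) // lazy transformation
      val rdd3 = rdd1.map(_.length)           // lazy transformation
      rdd2.cache()
      rdd3.cache()

      // Actions trigger execution. The first one computes rdd1 and stores
      // its partitions; the second one can reuse them if they fit in memory.
      val n2 = rdd2.count()
      val n3 = rdd3.count()

      // Unpersist the parent only after the actions have run. Calling this
      // before any action would drop the cache mark before it was ever used.
      rdd1.unpersist()

      println(s"rdd2 count = $n2, rdd3 count = $n3")
      println(s"rdd1 still marked as cached: ${rdd1.getStorageLevel != StorageLevel.NONE}")
    } finally {
      sc.stop()
    }
  }
}
```

If `rdd1.unpersist()` were moved above the two `count()` calls, each action would presumably recompute `rdd1` from its source, which is the problem the replies describe for Option A without an action.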