Split RDD into multiple RDDs using filter-transformation

2015-11-02 Thread Sushrut Ikhar
Hi, I need to split an RDD into 3 different RDDs using the filter transformation. I have cached the original RDD before applying the filters. The input is lopsided, leaving some executors heavily loaded and others lightly loaded, so I have repartitioned it. *DAG lineage I expected:* I/P RDD --> MAP RDD -->

Re: Split RDD into multiple RDDs using filter-transformation

2015-11-02 Thread Deng Ching-Mallete
Hi, you need to perform an action (e.g. count, take, saveAs*, etc.) for your RDDs to actually be cached, since cache/persist are lazy. You might also want to use coalesce instead of repartition to avoid shuffling. Thanks, Deng On Mon, Nov 2, 2015 at 5:53 PM, Sushrut Ikhar
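
The advice above can be sketched roughly as follows. This is a minimal illustration, not the original poster's job: the input data, the key function (`x % 3`), the partition counts, and all variable names are assumptions made up for the example. It marks the repartitioned RDD for caching, runs one action to materialize the cache, and then derives three RDDs from it with filter.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SplitByFilter {
  def main(args: Array[String]): Unit = {
    // Local master is used only so the sketch can run on one machine.
    val conf = new SparkConf().setAppName("split-by-filter").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical input: the integers 1..100 spread over 4 partitions.
      val input = sc.parallelize(1 to 100, 4)

      // Hypothetical map step: tag each record with the bucket it belongs to.
      val mapped = input.map(x => (x % 3, x))

      // repartition() always shuffles; cache() only marks the RDD for caching.
      val balanced = mapped.repartition(8).cache()

      // cache() is lazy, so run an action to compute and store the partitions.
      val total = balanced.count()

      // Each filter now reads the cached partitions rather than recomputing
      // the lineage above, provided the cached blocks fit in memory.
      val bucket0 = balanced.filter { case (k, _) => k == 0 }
      val bucket1 = balanced.filter { case (k, _) => k == 1 }
      val bucket2 = balanced.filter { case (k, _) => k == 2 }

      println(s"total=$total sizes=${bucket0.count()}, ${bucket1.count()}, ${bucket2.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

On the coalesce suggestion: with its default `shuffle = false`, `coalesce(n)` merges existing partitions and can only lower the partition count, so whether it evens out a lopsided input depends on how the data is already distributed.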