Re: spark persistence doubt

2016-09-29 Thread Bedrytski Aliaksandr
Hi, the 4th step should contain "transformrdd2", right? Considering that transformations are lined up and executed only when there is an action (also known as lazy evaluation), I would say that adding persist() to step 1 would not do any good (and may even be harmful, as you may lose the optimi

spark persistence doubt

2016-09-28 Thread Shushant Arora
Hi,

I have a flow like the one below:

1. rdd1 = some source.transform();
2. tranformedrdd1 = rdd1.transform(..);
3. transformrdd2 = rdd1.transform(..);
4. tranformedrdd1.action();

Do I need to persist rdd1 to optimise steps 2 and 3? Or, since there is no lineage breakage, will it work without persist? Thank
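
For readers of this thread, here is a rough, dependency-free Python sketch of the behaviour under discussion. It is a toy model, not Spark: the `LazyDataset` class, the `run_flow` helper, and the names `transformed1` and `transformed2` are all hypothetical, and the lambdas stand in for the unspecified transform(..) calls. It only counts how often the shared parent (rdd1) is recomputed, assuming a lineage is replayed from the source on every action unless a node was persisted. Real Spark has further effects that this sketch ignores.

```python
class LazyDataset:
    """Toy stand-in for an RDD: builds a lineage and computes only on an action."""

    def __init__(self, compute):
        self._compute = compute    # zero-argument function that produces a list
        self._persisted = False
        self._cache = None
        self.evaluations = 0       # how many times this node was actually computed

    def transform(self, fn):
        # Lazy: nothing runs here, the new node only remembers its parent.
        return LazyDataset(lambda: [fn(x) for x in self._materialize()])

    def persist(self):
        # Mark the node so that its first computed result is kept for reuse.
        self._persisted = True
        return self

    def _materialize(self):
        if self._persisted and self._cache is not None:
            return self._cache
        self.evaluations += 1
        data = self._compute()
        if self._persisted:
            self._cache = data
        return data

    def action(self):
        # An action is the only thing that triggers computation of the lineage.
        return list(self._materialize())


def run_flow(persist_rdd1, act_on_both):
    """Replay the flow from the question; return how often rdd1 was computed."""
    source = LazyDataset(lambda: [1, 2, 3])
    rdd1 = source.transform(lambda x: x * 10)        # step 1
    if persist_rdd1:
        rdd1.persist()
    transformed1 = rdd1.transform(lambda x: x + 1)   # step 2
    transformed2 = rdd1.transform(lambda x: x + 2)   # step 3
    transformed1.action()                            # step 4
    if act_on_both:
        transformed2.action()                        # hypothetical second action
    return rdd1.evaluations
```

In this toy model, with a single action rdd1 is computed once whether or not it is persisted, which matches the point made in the reply. Persisting only changes the count when a second action runs over the other branch: `run_flow(False, True)` recomputes rdd1, while `run_flow(True, True)` reuses the stored result.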