Hi,
The 4th step should contain "transformrdd2", right?
Considering that transformations are only queued up and executed when
there is an action (known as lazy evaluation), I would say that
adding persist() to step 1 would not do any good (and may even be
harmful as you may lose the optimi
Hi
I have a flow like the one below:
1. rdd1 = some source.transform();
2. transformedrdd1 = rdd1.transform(..);
3. transformrdd2 = rdd1.transform(..);
4. transformedrdd1.action();
Do I need to persist rdd1 to optimise steps 2 and 3? Or, since the
lineage is not broken, will it work without persist?
Thanks
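
To make the lazy-evaluation point from the reply concrete, here is a rough sketch in plain Python. It is only a toy model and not Spark code: the class LazyDataset, its evaluations counter and the helper run_flow are made up for illustration, and the real Spark scheduler does much more than this. The sketch assumes that a non-persisted dataset is recomputed from its lineage on every action, and that a persisted one is computed once and then reused.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not the Spark API)."""

    def __init__(self, compute):
        self._compute = compute    # zero-argument function that produces a list
        self._persisted = False
        self._cache = None
        self.evaluations = 0       # how many times this dataset was really computed

    def transform(self, fn):
        # Lazy: only records how to derive the new dataset; nothing runs yet.
        return LazyDataset(lambda: [fn(x) for x in self._materialize()])

    def persist(self):
        # Mark the dataset so its data is kept after the first computation.
        self._persisted = True
        return self

    def _materialize(self):
        if self._persisted and self._cache is not None:
            return self._cache     # reuse the stored data, no recomputation
        self.evaluations += 1
        data = self._compute()
        if self._persisted:
            self._cache = data
        return data

    def action(self):
        # An action is what finally triggers the computation.
        return sum(self._materialize())


def run_flow(persist_rdd1, actions):
    """Build the flow from the question and run the named actions in order."""
    rdd1 = LazyDataset(lambda: [1, 2, 3])              # step 1
    if persist_rdd1:
        rdd1.persist()
    transformed = {
        "t1": rdd1.transform(lambda x: x * 2),         # step 2
        "t2": rdd1.transform(lambda x: x + 10),        # step 3
    }
    results = [transformed[name].action() for name in actions]  # step 4
    return rdd1.evaluations, results


if __name__ == "__main__":
    for persist in (False, True):
        for actions in (["t1"], ["t1", "t2"]):
            evaluations, _ = run_flow(persist, actions)
            print(f"persist={persist} actions={actions} rdd1 computed {evaluations}x")
```

In this toy model, with a single action (as in step 4 of the question) rdd1 is computed once whether or not it is persisted, so persisting gains nothing. Only when a second action runs on the other branch is rdd1 computed twice without persist and once with it.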