3. Also, can mapPartitions run out of memory if I return an ArrayList
holding the whole partition after processing it? What is the
alternative if this can fail?
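To make the question concrete, here is a minimal sketch of the alternative I have in mind: rather than collecting every result into an ArrayList, wrap the partition's input iterator so each output element is computed only when it is requested. This is plain Java with no Spark dependency. The class name LazyPartitionMap and the helper lazyMap are hypothetical names used only for illustration. As far as I understand, in Spark 2.x the Java FlatMapFunction passed to mapPartitions returns an Iterator, so an iterator like this could be returned from it directly; I have not verified this on older versions.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper class, not part of Spark.
class LazyPartitionMap {

    // Wraps the input iterator; each output element is produced on demand,
    // so the transformed partition is never buffered in a list.
    static <T, U> Iterator<U> lazyMap(final Iterator<T> input, final Function<T, U> fn) {
        return new Iterator<U>() {
            @Override
            public boolean hasNext() {
                return input.hasNext();
            }

            @Override
            public U next() {
                // Transform exactly one element per call.
                return fn.apply(input.next());
            }
        };
    }

    public static void main(String[] args) {
        // A small list stands in for one partition's records.
        List<Integer> partition = Arrays.asList(1, 2, 3);
        Iterator<Integer> out = lazyMap(partition.iterator(), x -> x * 10);
        while (out.hasNext()) {
            System.out.println(out.next());
        }
    }
}
```

This only helps if the per-partition logic can emit results one record at a time; if the processing needs to see the whole partition before producing any output, some buffering is unavoidable.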
On Fri, Jan 27, 2017 at 9:32 AM, Shushant Arora
wrote:
Hi
I have two transformations in series.
rdd1 = sourcerdd.map(new Function(...)); //step1
rdd2 = rdd1.mapPartitions(new FlatMapFunction(...)); //step2
1. Are map and mapPartitions narrow dependencies? Does Spark optimise the DAG
and execute step 1 and step 2 in a single stage, or will there be two stages?