Hi

I have two transformations in series.

rdd1 = sourceRdd.map(new Function(...));                 // step 1
rdd2 = rdd1.mapPartitions(new FlatMapFunction(...));     // step 2

1. Are map and mapPartitions narrow dependencies? Does Spark optimise the DAG
and execute step 1 and step 2 in a single stage, or will there be two stages?

Basically, I have a requirement to use a complex object in step 2 that I
don't want to instantiate for each record, so I have used mapPartitions at
step 2, as sketched below.
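
To make the intent concrete, step 2 roughly looks like this (HeavyParser and
the String element types are just placeholders for my actual classes, and
rdd1 is the output of step 1):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.FlatMapFunction;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// HeavyParser stands in for my complex object.
JavaRDD<String> rdd2 = rdd1.mapPartitions(
    (FlatMapFunction<Iterator<String>, String>) records -> {
        HeavyParser parser = new HeavyParser();  // built once per partition, not per record
        List<String> out = new ArrayList<>();
        while (records.hasNext()) {
            out.add(parser.process(records.next()));
        }
        return out.iterator();                   // FlatMapFunction returns an Iterator in Spark 2.x
    });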

2. If I also need to share a complex object across all tasks on the same
executor node, is making the object a singleton fine there? Since Java
discourages singletons, will it be fine to use one here, or is there a
better alternative? What I have in mind is sketched below.
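
By "singleton" I mean something along these lines, lazily initialised so
every task running in the same executor JVM reuses one instance
(ExpensiveResource is a placeholder for my actual class):

// Placeholder sketch: one ExpensiveResource per executor JVM, shared by all tasks.
public final class ExpensiveResourceHolder {
    private static volatile ExpensiveResource instance;

    private ExpensiveResourceHolder() {}

    public static ExpensiveResource get() {
        if (instance == null) {
            synchronized (ExpensiveResourceHolder.class) {
                if (instance == null) {
                    instance = new ExpensiveResource();  // constructed once per executor JVM
                }
            }
        }
        return instance;
    }
}

Tasks would then call ExpensiveResourceHolder.get() inside map/mapPartitions
instead of constructing the object themselves.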

Thanks
