Hi, I have two transformations in series:

```java
rdd1 = sourcerdd.map(new Function(...));              // step 1
rdd2 = rdd1.mapPartitions(new FlatMapFunction(...));  // step 2
```

1. Are `map` and `mapPartitions` both narrow dependencies? Will Spark optimise the DAG and execute step 1 and step 2 in a single stage, or will there be two stages? Basically, I need to use a complex, expensive-to-construct object in step 2, and I don't want to instantiate it for every record, so I used `mapPartitions` in step 2 (rough sketch at the end of this post).

2. If I also have a requirement to instantiate such an object only once across all tasks on the same executor node, is making the object a singleton fine? Since singletons are discouraged in Java, is it acceptable here, or is there a better alternative? (See the second sketch below for what I have in mind.)

Thanks
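
For context, here is roughly what my step 2 looks like. This is a minimal sketch, assuming the Spark 2.x Java API (where `FlatMapFunction.call` returns an `Iterator`); `Record`, `Result`, and `MyHeavyParser` are hypothetical stand-ins for my actual types:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.FlatMapFunction;

JavaRDD<Result> rdd2 = rdd1.mapPartitions(
    new FlatMapFunction<Iterator<Record>, Result>() {
        @Override
        public Iterator<Result> call(Iterator<Record> records) {
            // Constructed once per partition, not once per record.
            MyHeavyParser parser = new MyHeavyParser();
            List<Result> out = new ArrayList<>();
            while (records.hasNext()) {
                out.add(parser.parse(records.next()));
            }
            return out.iterator();
        }
    });
```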
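
And for question 2, this is the kind of singleton I had in mind: a holder with a lazily initialised static field, so each executor JVM builds the object once and all task threads on that executor share it. `MyHeavyParser` is again a hypothetical placeholder, and the `synchronized` guard is there because one executor can run several task threads concurrently:

```java
// One instance per JVM: static fields are not serialized with the task
// closure, so each executor lazily builds its own copy on first use.
public final class ParserHolder {
    private static MyHeavyParser instance;

    private ParserHolder() {} // no instances of the holder itself

    public static synchronized MyHeavyParser get() {
        if (instance == null) {
            instance = new MyHeavyParser();
        }
        return instance;
    }
}
```

Inside a task I would then call `ParserHolder.get()` instead of constructing the object directly. Is that approach sound?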