Optimizations

Marius Danciu Fri, 03 Jul 2015 00:13:55 -0700

Hi all,

If I have something like:


rdd.join(...).mapPartitionsToPair(...)

It looks like mapPartitionsToPair runs in a different stage than join. Is
there a way to piggyback this computation inside the join stage, such that
each result partition after the join is passed to the mapPartitionsToPair
function, all running in the same stage without any additional cost?

Best,
Marius
