You may need to persist r1 after the partitionBy call. The second join will
then be more efficient.
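
A minimal sketch of that idea, assuming local mode and made-up sample data. The object name, `r3`, and the RDD contents are hypothetical and not taken from the original question; reusing the persisted RDD in a later join is only one reading of "second join". It partitions r1 once, persists the result, and uses it in two joins with the same partitioner.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PersistAfterPartitionBy {
  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf()
      .setAppName("persist-after-partitionBy")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample data standing in for r1 and r2 from the question.
      val r1: RDD[(String, String)] = sc.parallelize(Seq("a" -> "1", "b" -> "2"))
      val r2: RDD[(String, String)] = sc.parallelize(Seq("a" -> "x", "b" -> "y"))
      // r3 is invented here only to show a later join reusing r1.
      val r3: RDD[(String, String)] = sc.parallelize(Seq("a" -> "p", "c" -> "q"))

      val partNum = 80
      val partitioner = new HashPartitioner(partNum)

      // Partition r1 by key and mark the partitioned RDD for caching.
      val r1p = r1.partitionBy(partitioner).persist()

      // Both joins use the same partitioner that r1p already carries.
      val res1 = r1p.join(r2, partitioner)
      val res2 = r1p.join(r3, partitioner)

      // Actions trigger the computation; the first one also fills the cache.
      println(res1.count())
      println(res2.count())
    } finally {
      sc.stop()
    }
  }
}
```

Since r1p already carries the partitioner, the joins should not need to reshuffle the r1 side. Whether persisting pays off in practice depends on data size and available memory.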
On Mon, Nov 16, 2015 at 2:48 PM, Rishi Mishra wrote:
AFAIK, and from what I can see in the code, both of them should behave the same.
On Sat, Nov 14, 2015 at 2:10 AM, Alexander Pivovarov
wrote:
Hi Everyone
Is there any difference in performance between the following two joins?
val r1: RDD[(String, String)] = ???
val r2: RDD[(String, String)] = ???
val partNum = 80
val partitioner = new HashPartitioner(partNum)
// Join 1
val res1 =