Re: Join and HashPartitioner question

2015-11-16 Thread Erwan ALLAIN
You may need to persist r1 after the partitionBy call; the second join will then be more efficient.
On Mon, Nov 16, 2015 at 2:48 PM, Rishi Mishra wrote:
> AFAIK and can see in the code both of them should behave same.
>
> On Sat, Nov 14, 2015 at 2:10 AM, Alexander Pivovarov
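A minimal sketch of the persist-after-partitionBy suggestion, in case it helps. It assumes the partitioned r1 is reused in more than one join. The sample data, the third RDD `r3`, the object name, and the local master setting are all hypothetical and only there to make the snippet self-contained; the two joins shown are not the ones from the original question.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PersistAfterPartitionBy {
  def main(args: Array[String]): Unit = {
    // Local master is an assumption for illustration only.
    val conf = new SparkConf().setAppName("persist-after-partitionBy").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample data standing in for the real RDDs.
      val r1: RDD[(String, String)] = sc.parallelize(Seq("a" -> "1", "b" -> "2"))
      val r2: RDD[(String, String)] = sc.parallelize(Seq("a" -> "x", "b" -> "y"))
      val r3: RDD[(String, String)] = sc.parallelize(Seq("a" -> "p", "c" -> "q"))

      val partNum = 80
      val partitioner = new HashPartitioner(partNum)

      // Partition r1 once and keep the partitioned result cached
      // (persist() with no arguments uses the default storage level).
      val r1Partitioned = r1.partitionBy(partitioner).persist()

      // Both joins use the same partitioner, so r1Partitioned is already
      // laid out the way the join needs it.
      val res1 = r1Partitioned.join(r2, partitioner)
      val res2 = r1Partitioned.join(r3, partitioner)

      // Actions trigger the actual computation.
      println(res1.count())
      println(res2.count())

      r1Partitioned.unpersist()
    } finally {
      sc.stop()
    }
  }
}
```

Whether persisting pays off depends on how often the partitioned RDD is reused and how much memory is available, so it is worth measuring on the real data.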

Re: Join and HashPartitioner question

2015-11-16 Thread Rishi Mishra
As far as I know, and from what I can see in the code, both of them should behave the same.
On Sat, Nov 14, 2015 at 2:10 AM, Alexander Pivovarov wrote:
> Hi Everyone
>
> Is there any difference in performance btw the following two joins?
>
>
> val r1: RDD[(String, String)] = ???
> val r2: RDD[(String,

Join and HashPartitioner question

2015-11-13 Thread Alexander Pivovarov
Hi everyone,
Is there any difference in performance between the following two joins?
val r1: RDD[(String, String)] = ???
val r2: RDD[(String, String)] = ???
val partNum = 80
val partitioner = new HashPartitioner(partNum)
// Join 1
val res1 =