Re: Map-Side Join in Spark

๏̯͡๏ Mon, 20 Apr 2015 21:19:49 -0700

The reason I am asking is that I do not understand how to skip a
record.


1) Broadcast small table-1 as a map.
2) Just do .map() on large table-2.
       With .map() you must map each element to exactly one new element.
 However, with a map-side join, I look up each key in the broadcast map,
and if the key is not found in the map I want to skip that input
altogether. (This is what happens when you do .join; it skips
automatically.) With a map-side join you have to do the skip yourself. I am
assuming you do it with mapPartitions & yield.

Working code would help me understand it better.
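
A minimal sketch of the skip described above, assuming Spark 1.3 or later
(so the pair-RDD implicits need no extra import) and a small table that fits
in memory on the driver and on each executor. The object name, the
mapSideJoin helper, and the sample data are hypothetical, chosen only for
illustration. It uses flatMap instead of map, so a key that is missing from
the broadcast map produces no output at all.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical names throughout; this is a sketch, not a reference implementation.
object MapSideJoinSketch {

  // Inner join of `large` against `small` without shuffling `large`.
  def mapSideJoin(
      sc: SparkContext,
      small: RDD[(Int, String)],
      large: RDD[(Int, Double)]): RDD[(Int, (Double, String))] = {
    // 1) Collect the small table to the driver and broadcast it as a map.
    val lookup = sc.broadcast(small.collectAsMap())

    // 2) flatMap may emit zero or one record per input record.
    //    get(key) returns None for an unmatched key, so that input is skipped.
    large.flatMap { case (key, value) =>
      lookup.value.get(key).map(name => (key, (value, name)))
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("map-side-join-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val small = sc.parallelize(Seq(1 -> "a", 2 -> "b"))
      val large = sc.parallelize(Seq(1 -> 10.0, 2 -> 20.0, 3 -> 30.0, 1 -> 40.0))
      // Key 3 has no match in the small table, so it does not appear in the result.
      mapSideJoin(sc, small, large).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The same skip can be written with mapPartitions and a for/yield over the
partition iterator; flatMap is just the shorter form of that idea.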

On Tue, Apr 21, 2015 at 9:40 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> Can someone share their working code for a map-side join in Spark + Scala?
> (No Spark-SQL)
>
> The only resource I could find was this (open in Chrome with the Chinese to
> English translator):
>
> http://dongxicheng.org/framework-on-yarn/apache-spark-join-two-tables/
>
>
>
> --
> Deepak
>
>


-- 
Deepak
