The reason I am asking is that I do not understand how to skip a record.
1) Broadcast small table-1 as a map.
2) Run .map() on large table-2.

With .map() you must map each element to a new element. With a map-side join, however, I look up each key in the broadcast map, and if the key is not found in the map I want to skip that input record altogether. (.join does this automatically; with a map-side join you have to do it yourself.)

I am assuming you do it with mapPartitions and yield. Working code would help me understand it better.

On Tue, Apr 21, 2015 at 9:40 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> Can someone share their working code of a map-side join in Spark + Scala?
> (No Spark-SQL.)
>
> The only resource I could find was this (open in Chrome with the Chinese to
> English translator):
>
> http://dongxicheng.org/framework-on-yarn/apache-spark-join-two-tables/
>
> --
> Deepak

--
Deepak
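
For reference, here is a minimal sketch of the two steps described in the message above (broadcast the small table, then probe it from the large one), assuming both tables are RDDs of (Int, String) pairs and that the small table has unique keys and fits in memory. The object name, method names, and sample data are hypothetical and chosen only for illustration. The skip is done with flatMap over an Option inside mapPartitions: a key that is missing from the broadcast map yields None, so that record produces no output.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD

// Hypothetical names throughout; a sketch, not a tuned implementation.
object MapSideJoinSketch {

  // Joins one partition of the large table against the in-memory small table.
  // small.get(key) returns None for a missing key, and flatMap drops it,
  // which is how unmatched records are skipped.
  def joinPartition(
      rows: Iterator[(Int, String)],
      small: Map[Int, String]): Iterator[(Int, (String, String))] =
    rows.flatMap { case (key, largeValue) =>
      small.get(key).map(smallValue => (key, (largeValue, smallValue)))
    }

  def mapSideJoin(
      sc: SparkContext,
      small: RDD[(Int, String)],
      large: RDD[(Int, String)]): RDD[(Int, (String, String))] = {
    // 1) Collect the small table to the driver and broadcast it as a Map.
    //    Assumes unique keys: toMap keeps only one value per key.
    val smallBc: Broadcast[Map[Int, String]] =
      sc.broadcast(small.collect().toMap)

    // 2) Probe the broadcast map partition by partition; no shuffle is needed.
    large.mapPartitions(
      rows => joinPartition(rows, smallBc.value),
      preservesPartitioning = true) // keys are unchanged
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("map-side-join-sketch")
      .setMaster("local[2]") // local mode, for trying it out only
    val sc = new SparkContext(conf)
    try {
      val small = sc.parallelize(Seq(1 -> "apple", 2 -> "banana"))
      val large = sc.parallelize(Seq(1 -> "x", 3 -> "y", 2 -> "z", 4 -> "w"))
      // Keys 3 and 4 are absent from the small table, so they are skipped.
      mapSideJoin(sc, small, large).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Calling large.flatMap directly with the same lookup should give the same result. mapPartitions is used here only because the message mentions it, and it reads the broadcast value once per partition rather than once per record.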