Hi,

How about using a broadcast join? If the smaller table fits in each executor's memory, Spark ships it to every node and does the join map-side, so the large table never gets shuffled:

import org.apache.spark.sql.functions.broadcast

largeDf.join(broadcast(smallDf), "joinKey")
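
If a fuller example helps, here is a minimal sketch against the 1.6-era API; the parquet paths, app name, and join key are made up for illustration:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.broadcast

val sc = new SparkContext(new SparkConf().setAppName("broadcast-join"))
val sqlContext = new SQLContext(sc)

// Hypothetical inputs: a large fact table and a small lookup table.
val largeDf = sqlContext.read.parquet("/data/large_table")
val smallDf = sqlContext.read.parquet("/data/small_table")  // must fit in executor memory

// broadcast() hints the planner to copy smallDf to every executor,
// so the join runs map-side and largeDf is never shuffled over the network.
val joined = largeDf.join(broadcast(smallDf), "joinKey")
joined.show()

Note that for joins planned through SQL, Spark can also broadcast without the hint: tables below spark.sql.autoBroadcastJoinThreshold (10MB by default) are broadcast automatically.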

On Sat, Feb 6, 2016 at 2:25 AM, Rex X <dnsr...@gmail.com> wrote:

> Dear all,
>
> The new DataFrame API of Spark is extremely fast. But our cluster has
> limited RAM (~500GB).
>
> What is the best way to do such a big table join?
>
> Any sample code is most welcome!
>
>
> Best,
> Rex
>
>


-- 
---
Takeshi Yamamuro
