Hi Reynold,
Can you please elaborate on this. I thought RDD also opens only an
iterator. Does it get materialized for joins?

Rishi

On Saturday, September 19, 2015, Reynold Xin <r...@databricks.com> wrote:

> Yes for RDD -- both are materialized. No for DataFrame/SQL - one side
> streams.
>
>
> On Thu, Sep 17, 2015 at 11:21 AM, Koert Kuipers <ko...@tresata.com
> <javascript:_e(%7B%7D,'cvml','ko...@tresata.com');>> wrote:
>
>> in scalding we join with the smaller side on the left, since the smaller
>> side will get buffered while the bigger side streams through the join.
>>
>> looking at CoGroupedRDD i do not get the impression such a distiction is
>> made. it seems both sided are put into a map that can spill to disk. is
>> this correct?
>>
>> thanks
>>
>
>

Reply via email to