will shuffle, and the following join COULD cause another shuffle.
>> So I am not sure that it is a smart approach.
>>
>> Yong
>>
>> --
>> *From:* shyla deshpande <deshpandesh...@gmail.com>
>> *Sent:* Wednesday, March 29, 2017 12:33 PM
> *To:* user
> *Subject:* Re: Spark SQL, dataframe join questions.
>
>
>
On Tue, Mar 28, 2017 at 2:57 PM, shyla deshpande
wrote:
> Following are my questions. Thank you.
>
> 1. When joining DataFrames, is it a good idea to repartition on the key column
> that is used in the join, or is the optimizer smart enough that I can
> skip it?
>
> 2. In RDD