Hi.
I have 2 DataFrames with 1 and 12 partitions respectively. When I do an inner
join between these DataFrames, the result contains 200 partitions. *Why?*
df1.join(df2, df1("id") === df2("id"), "inner") returns 200 partitions
Thanks!!!
--
Regards.
Miguel Ángel
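That’s due to the config setting spark.sql.shuffle.partitions, which defaults to
200. The join shuffles both inputs, and the shuffled result gets that many
partitions regardless of how many partitions the inputs had.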
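
If you want the join result to have fewer partitions, you can lower that setting before running the join. Below is a minimal sketch, not a definitive recipe. It assumes Spark 2.0 or later (SparkSession); on Spark 1.x the equivalent call would be sqlContext.setConf("spark.sql.shuffle.partitions", "12"). The object name, the app name, the value 12 and the sample data built with spark.range are made up for illustration. It also turns off broadcast joins and adaptive execution so that the join really goes through a shuffle on newer versions.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; names and values are illustrative only.
object ShufflePartitionsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffle-partitions-sketch")
      .master("local[*]")
      .getOrCreate()

    // Number of partitions used when shuffling data for joins and aggregations.
    spark.conf.set("spark.sql.shuffle.partitions", "12")

    // Assumption: disable broadcast joins and adaptive execution so the
    // small inputs below are joined through a shuffle, as in the question.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
    spark.conf.set("spark.sql.adaptive.enabled", "false")

    // Stand-ins for the two DataFrames: 1 partition and 12 partitions.
    val df1 = spark.range(0, 100, 1, 1).toDF("id")
    val df2 = spark.range(0, 100, 1, 12).toDF("id")

    val joined = df1.join(df2, df1("id") === df2("id"), "inner")

    // Print the partition count of the joined result.
    println(joined.rdd.partitions.length)

    spark.stop()
  }
}
```

Under those assumptions the printed count should follow the configured value rather than the default of 200. Calling repartition or coalesce on the joined DataFrame is another way to change the partition count after the join.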
From: Masf
Date: Thursday, May 28, 2015 at 10:02 AM
To: user@spark.apache.org
Subject: Dataframe Partitioning
Hi.
I have 2 dataframe with 1 and 12 partitions respectively. When I do