You may find the spark.sql.shuffle.partitions property useful. The default
value is 200.
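
A minimal sketch of what Mohammed is suggesting, using the Spark 1.x-era API current at the time of this thread (the SparkContext `sc` and the DataFrame `df` are assumed to already exist; the value 50 is just an illustration):

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)  // sc: an existing SparkContext

// Lower the shuffle partition count from the default of 200
sqlContext.setConf("spark.sql.shuffle.partitions", "50")

// Any DataFrame operation that shuffles (groupBy, join, ...)
// will now produce 50 partitions in its result
val counts = df.groupBy("key").count()
```

Note that this is a session-wide setting rather than a per-operation argument, so it affects every subsequent shuffle until changed again.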
Mohammed
From: Alex Nastetsky [mailto:alex.nastet...@vervemobile.com]
Sent: Wednesday, October 14, 2015 8:14 PM
To: user
Subject: dataframes and numPartitions
A lot of RDD methods take a numPartitions parameter that lets you specify
the number of partitions in the result. For example, groupByKey.
The DataFrame counterparts don't have a numPartitions parameter, e.g.
groupBy only takes a bunch of Columns as params.
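
To illustrate the asymmetry being described (a sketch; `rdd`, `df`, and the partition count 10 are hypothetical):

```scala
// RDD API: the partition count is an explicit per-operation argument
val grouped = rdd.groupByKey(numPartitions = 10)

// DataFrame API: groupBy takes only Columns, so the shuffle width
// comes from spark.sql.shuffle.partitions; repartitioning afterwards
// is one workaround, at the cost of an extra shuffle
val agg = df.groupBy(df("key")).count().repartition(10)
```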
I understand that the DataFrame API is