Hi,
How about trying adaptive query execution in Spark?
https://issues.apache.org/jira/browse/SPARK-9850
This feature is turned off by default because it is still experimental.
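As a sketch (assuming the Spark 2.0-era property names for this experimental feature), it can be enabled in spark-defaults.conf or via --conf:

```
# spark-defaults.conf -- experimental, off by default
spark.sql.adaptive.enabled                             true
# optional: target bytes per post-shuffle partition (64 MB here is illustrative)
spark.sql.adaptive.shuffle.targetPostShuffleInputSize  67108864
```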
// maropu
On Wed, Jul 27, 2016 at 3:26 PM, Brandon White wrote:
Hello,
My platform runs hundreds of Spark jobs every day, each with its own
data size, ranging from 20 MB to 20 TB. This means that we need to set
resources dynamically. One major pain point for doing this is
spark.sql.shuffle.partitions, the number of partitions to use when
shuffling data for joins or aggregations.
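Since one static value cannot suit both 20 MB and 20 TB inputs, a common workaround is to derive the setting from input size before submitting each job. A minimal sketch of such a hypothetical sizing helper (the 128 MB-per-partition target and the clamping bounds are assumptions, not Spark defaults):

```python
def shuffle_partitions(input_bytes: int,
                       target_bytes: int = 128 * 1024 * 1024,
                       floor: int = 200,
                       cap: int = 20000) -> int:
    """Pick a value for spark.sql.shuffle.partitions from input size.

    Illustrative heuristic only: roughly one shuffle partition per
    `target_bytes` of input, clamped to [floor, cap].
    """
    n = input_bytes // target_bytes + 1
    return max(floor, min(cap, n))

# The result can then be passed per job, e.g.:
#   spark-submit --conf spark.sql.shuffle.partitions=<n> ...
```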