subject:"Setting spark.sql.shuffle.partitions Dynamically"

Re: Setting spark.sql.shuffle.partitions Dynamically

2016-07-27 Thread Takeshi Yamamuro

Hi,

How about trying adaptive execution in Spark?
https://issues.apache.org/jira/browse/SPARK-9850
This feature is turned off by default because it seems experimental.

// maropu

On Wed, Jul 27, 2016 at 3:26 PM, Brandon White wrote:
> Hello,
>
> My platform runs
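For reference, a minimal sketch of the settings involved in turning adaptive execution on. The property names are the ones I recall from the Spark 1.6/2.0 line and should be checked against the documentation for your version; the helper name adaptive_execution_conf and the 64 MB target are assumptions for illustration only, not something from the JIRA.

```python
def adaptive_execution_conf(target_bytes=64 * 1024 * 1024):
    """Build Spark SQL settings that enable adaptive execution.

    target_bytes is the desired post-shuffle partition input size.
    The 64 MB default here is an assumption; tune it for your jobs.
    """
    return {
        # Off by default, so it has to be enabled explicitly.
        "spark.sql.adaptive.enabled": "true",
        # Spark merges small post-shuffle partitions toward this size.
        "spark.sql.adaptive.shuffle.targetPostShuffleInputSize": str(target_bytes),
    }


if __name__ == "__main__":
    # Print the settings in the form accepted by spark-submit.
    for key, value in sorted(adaptive_execution_conf().items()):
        print("--conf {}={}".format(key, value))
```

Each pair can be passed to spark-submit with --conf, or set on the session before the query runs.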

Setting spark.sql.shuffle.partitions Dynamically

2016-07-27 Thread Brandon White

Hello,

My platform runs hundreds of Spark jobs every day, each with its own data size, from 20 MB to 20 TB. This means that we need to set resources dynamically. One major pain point for doing this is spark.sql.shuffle.partitions, the number of partitions to use when shuffling data for joins or
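To make the problem concrete, here is a rough sketch of one way a per-job value could be derived from the input size. This is not a method from the thread: the function name shuffle_partitions, the 128 MB per-partition target, and the upper bound are all hypothetical choices for illustration.

```python
def shuffle_partitions(input_bytes,
                       target_partition_bytes=128 * 1024 * 1024,
                       min_partitions=1,
                       max_partitions=200000):
    """Estimate a shuffle partition count from the job's input size.

    Assumes shuffled data is roughly as large as the input, which will
    not hold for every job (filters shrink it, joins may grow it).
    """
    # Integer ceiling division, so no floating-point rounding is involved.
    wanted = -(-input_bytes // target_partition_bytes)
    # Clamp to the configured bounds.
    return max(min_partitions, min(wanted, max_partitions))


if __name__ == "__main__":
    small = shuffle_partitions(20 * 1024 ** 2)   # a 20 MB job
    large = shuffle_partitions(20 * 1024 ** 4)   # a 20 TB job
    print(small, large)
    # With a Spark 2.x session, the value could then be applied with:
    #   spark.conf.set("spark.sql.shuffle.partitions", str(large))
```

The result would have to be set before the shuffle is planned, for example at job submission time.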