Why change the number of partitions of RDDs? Especially since you generally can't do that without a shuffle. If you just mean to scale resource usage up and down, dynamic allocation (of executors) already does that.
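To make those two points concrete, here is a minimal sketch using the standard RDD API. The dynamic-allocation settings assume a cluster running the external shuffle service (they have no effect in local mode), and the app name and counts are illustrative only:

    import org.apache.spark.{SparkConf, SparkContext}

    object PartitionDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("partition-demo")
          // Dynamic allocation scales the number of executors with load,
          // instead of changing partition counts; it requires the external
          // shuffle service so shuffle files outlive removed executors.
          .set("spark.dynamicAllocation.enabled", "true")
          .set("spark.shuffle.service.enabled", "true")
        val sc = new SparkContext(conf)

        val rdd = sc.parallelize(1 to 1000000, numSlices = 8)
        // repartition() always performs a full shuffle to redistribute data.
        val wider = rdd.repartition(32)
        // coalesce() can only shrink the partition count without a shuffle,
        // by merging existing partitions locally.
        val narrower = rdd.coalesce(2)
        println(s"${wider.partitions.length}, ${narrower.partitions.length}")
        sc.stop()
      }
    }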
On Wed, Sep 30, 2015 at 10:49 PM, Muhammed Uluyol <ulu...@umich.edu> wrote:
> Hello,
>
> How feasible would it be to have Spark speculatively increase the number
> of partitions when there is spare capacity in the system? We want to do
> this to decrease application runtime. Initially, we will assume that
> function calls of the same type have the same runtime (e.g. all maps take
> equal time) and that runtime scales linearly with the number of workers.
> If a numPartitions value is specified, we may increase beyond it, but if
> a Partitioner is specified, we would not change the number of partitions.
>
> Some initial questions we had:
> * Does Spark already do this?
> * Is there interest in supporting this functionality?
> * Are there any potential issues that we should be aware of?
> * What changes would need to be made for such a project?
>
> Thanks,
> Muhammed
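One reason the Partitioner caveat in the proposal matters: once a pair RDD has an explicit Partitioner, the partition count is part of the key-placement contract, so changing it speculatively would break co-partitioned operations. A minimal sketch of that constraint, assuming an existing SparkContext `sc` (the data is illustrative only):

    import org.apache.spark.HashPartitioner

    // Both sides use the same Partitioner, so rows with equal keys
    // land in the same partition on both RDDs.
    val left  = sc.parallelize(Seq(("a", 1), ("b", 2))).partitionBy(new HashPartitioner(4))
    val right = sc.parallelize(Seq(("a", 9), ("c", 3))).partitionBy(new HashPartitioner(4))

    // Co-partitioned join: no shuffle needed. Silently changing the
    // partition count on one side would invalidate this guarantee.
    val joined = left.join(right)
    println(joined.collect().toSeq)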