GitHub user JonathanTaws opened a pull request: https://github.com/apache/spark/pull/15405
[SPARK-15917][CORE] Added support for number of executors in Standalone [WIP]

## What changes were proposed in this pull request?

Currently, in standalone mode it is not possible to set the number of executors via the `--num-executors` option or the `spark.executor.instances` property. Instead, as many executors as possible are spawned based on the available resources and the properties set. This patch adds support for the number-of-executors property. The new behavior is:

- If the `executor.cores` property isn't set, we try to spawn one executor on each worker, taking all of its available cores (the default behavior), while the number of workers used is less than the number of executors requested. If the specified number of executors can't be launched, a warning is logged.
- If the `executor.cores` property is set (the same logic applies to `executor.memory`):
  - and `executor.instances` * `executor.cores` <= `cores.max`, then `executor.instances` executors are spawned;
  - and `executor.instances` * `executor.cores` > `cores.max`, then as many executors as possible are spawned (essentially the previous behavior when only `executor.cores` was set), and a warning is logged saying the requested number of executors couldn't be launched.

In the case where `executor.memory` is also set, all constraints are taken into account based on the number of cores and the amount of memory assigned per worker (the same logic as with the cores).

## How was this patch tested?

I tested this patch by running a simple Spark app in standalone mode, specifying `--num-executors` or the `spark.executor.instances` property, and checking that the number of executors launched was coherent with the available resources and the requested number of executors. I plan to extend the testing by adding tests to `MasterSuite` and running the usual `./dev/run-tests`.
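The capping rules described above can be sketched as follows. This is an illustrative model of the proposed behavior, not Spark's actual `Master` scheduling code; the object and method names here are hypothetical:

```scala
// Illustrative sketch of the executor-count rules described in the PR.
// NOT the actual Master scheduling code; all names are hypothetical.
object ExecutorCountSketch {
  // When executor.cores is set, the number of executors that can be
  // launched is capped by cores.max / executor.cores.
  def executorsToLaunch(requested: Int, executorCores: Int, coresMax: Int): Int =
    math.min(requested, coresMax / executorCores)

  // True when the requested count cannot be satisfied; per the PR,
  // a warning is logged in this case rather than failing the app.
  def shouldWarn(requested: Int, executorCores: Int, coresMax: Int): Boolean =
    requested * executorCores > coresMax

  def main(args: Array[String]): Unit = {
    // 4 executors * 2 cores = 8 <= cores.max of 8: all 4 are launched.
    println(executorsToLaunch(4, 2, 8))
    // 6 executors * 2 cores = 12 > 8: capped at 4, with a warning.
    println(executorsToLaunch(6, 2, 8))
    println(shouldWarn(6, 2, 8))
  }
}
```

The same min/cap shape would apply to the memory constraint when `executor.memory` is set, with the per-worker memory in place of `cores.max`.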
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JonathanTaws/spark SPARK-15917

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15405.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15405

----

commit f45a6732e30c7ae374089d5a63146a03b7a40671
Author: Jonathan Taws <jonathan.t...@gmail.com>
Date: 2016-06-24T10:23:33Z

    [SPARK-15917] Added support for number of executors for Standalone mode

commit d0b1a71cc1106413fdedcc1c658aa4830b1122f0
Author: JonathanTaws <jonathan.t...@gmail.com>
Date: 2016-10-04T13:32:03Z

    [SPARK-15917] Added warning message if requested number of executors can't be satisfied

commit 0af7b10c42c73d8ee9a0e49e9b652946274d1bae
Author: JonathanTaws <jonathan.t...@gmail.com>
Date: 2016-10-09T10:43:29Z

    Added check on number of workers to avoid displaying the same message multiple times

commit eed3ecd91e3c84c0e17c513e4b48b92f6b1532f0
Author: JonathanTaws <jonathan.t...@gmail.com>
Date: 2016-10-09T12:30:57Z

    Improved check on num executors warning message