[ https://issues.apache.org/jira/browse/SPARK-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Or updated SPARK-4751:
-----------------------------
    Description:
This is equivalent to SPARK-3822 but for standalone mode.

This is actually a very tricky issue because the scheduling mechanism in the standalone Master uses different semantics. In standalone mode we allocate resources based on cores. By default, an application will grab all the cores in the cluster unless "spark.cores.max" is specified. Unfortunately, this means an application could get executors of different sizes (in terms of cores) if:

1) App 1 kills an executor
2) App 2, with "spark.cores.max" set, grabs a subset of cores on a worker
3) App 1 requests an executor

In this case, App 1 will get back an executor of half the number of cores.

Further, standalone mode is subject to the constraint that only one executor can be allocated on each worker per application. As a result, it is rather meaningless to request new executors if the existing ones are already spread out across all nodes.

  was:
This is equivalent to SPARK-3822 but for standalone mode.

This is actually a very tricky issue the scheduling mechanism in the standalone Master uses different semantics. In standalone mode we allocate resources based on cores. By default, an application will grab all the cores in the cluster unless "spark.cores.max" is specified. Unfortunately, this means an application could get executors of different sizes (in terms of cores) if:

1) App 1 kills an executor
2) App 2, with "spark.cores.max" set, grabs a subset of cores on a worker
3) App 1 requests an executor

In this case, App 1 will get back an executor of half the number of cores.

Further, standalone mode is subject to the constraint that only one executor can be allocated on each worker per application. As a result, it is rather meaningless to request new executors if the existing ones are already spread out across all nodes.

> Support dynamic allocation for standalone mode
> ----------------------------------------------
>
>                 Key: SPARK-4751
>                 URL: https://issues.apache.org/jira/browse/SPARK-4751
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Andrew Or
>            Assignee: Andrew Or
>            Priority: Blocker
>
> This is equivalent to SPARK-3822 but for standalone mode.
> This is actually a very tricky issue because the scheduling mechanism in the
> standalone Master uses different semantics. In standalone mode we allocate
> resources based on cores. By default, an application will grab all the cores
> in the cluster unless "spark.cores.max" is specified. Unfortunately, this
> means an application could get executors of different sizes (in terms of
> cores) if:
> 1) App 1 kills an executor
> 2) App 2, with "spark.cores.max" set, grabs a subset of cores on a worker
> 3) App 1 requests an executor
> In this case, App 1 will get back an executor of half the number of cores.
> Further, standalone mode is subject to the constraint that only one executor
> can be allocated on each worker per application. As a result, it is rather
> meaningless to request new executors if the existing ones are already spread
> out across all nodes.
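
The scenario in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Master's actual scheduling code: the Worker class, request_executor, kill_executor, the two 8-core workers, and the cores_max value of 4 are all hypothetical and chosen so that the numbers match the "half the number of cores" case. The only behaviour taken from the issue is that allocation is by cores, that an application without "spark.cores.max" takes every free core, and that each worker hosts at most one executor per application.

```python
# Toy model of core-based allocation in standalone mode.
# All names here are hypothetical; this is not Spark code.

class Worker:
    def __init__(self, name, cores):
        self.name = name
        self.free_cores = cores
        self.executors = {}  # app name -> cores held by that app's executor


def request_executor(app, workers, cores_max=None):
    """Grant `app` one executor on the first eligible worker.

    A worker is eligible if it has free cores and does not already host
    an executor for this app. Returns the cores granted, or 0 if none.
    """
    held = sum(w.executors.get(app, 0) for w in workers)
    for w in workers:
        if app in w.executors or w.free_cores == 0:
            continue  # one executor per worker per application
        grant = w.free_cores  # default: grab everything that is free
        if cores_max is not None:
            grant = min(grant, cores_max - held)  # cap, like spark.cores.max
        if grant <= 0:
            continue
        w.free_cores -= grant
        w.executors[app] = grant
        return grant
    return 0


def kill_executor(app, worker):
    """Release the cores held by `app`'s executor on `worker`."""
    worker.free_cores += worker.executors.pop(app)


workers = [Worker("worker-1", 8), Worker("worker-2", 8)]

# App 1 sets no core limit, so it takes every core on both workers.
first = [request_executor("app1", workers), request_executor("app1", workers)]

# Every worker already hosts an app1 executor, so another request gets nothing.
extra = request_executor("app1", workers)

# 1) App 1 kills an executor.
kill_executor("app1", workers[0])
# 2) App 2, with a core limit of 4, grabs a subset of cores on that worker.
app2 = request_executor("app2", workers, cores_max=4)
# 3) App 1 requests an executor and receives only what is left.
regained = request_executor("app1", workers)

print(first, extra, app2, regained)
```

In this model App 1 ends up with one 8-core executor and one 4-core executor, and the request made while both workers already hosted an App 1 executor returns nothing, which is the second problem the description raises.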