[ https://issues.apache.org/jira/browse/SPARK-21833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sital Kedia updated SPARK-21833:
--------------------------------
    Description: 
We have seen this issue in the coarse-grained scheduler: when dynamic executor allocation is turned on, the scheduler asks for more executors than needed. Consider the situation where the executor allocation manager is ramping down the number of executors. It lowers the executor target number by calling the requestTotalExecutors API. Later, when the allocation manager finds some executors to be idle, it calls the killExecutor API. The coarse-grained scheduler, in the killExecutor function, resets the total executors needed to current + pending, which overrides the earlier target set by the allocation manager. This results in the scheduler spawning more executors than actually needed.

  was:
We have seen this issue in the coarse-grained scheduler: when dynamic executor allocation is turned on, the scheduler asks for more executors than needed. Consider the situation where the executor allocation manager is ramping down the number of executors. It lowers the executor target number by calling the requestTotalExecutors API (see https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L326). Later, when the allocation manager finds some executors to be idle, it calls the killExecutor API (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L447). The coarse-grained scheduler, in the killExecutor function, resets the total executors needed to current + pending, which overrides the earlier target set by the allocation manager (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L523). This results in the scheduler spawning more executors than actually needed.
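The override described above can be sketched as a small, self-contained Scala model. All names and numbers here are illustrative, not the actual Spark internals; it only models the sequence the report describes: the allocation manager lowers the target, then the kill path recomputes the target as current + pending and clobbers it.

```scala
// Minimal sketch of the target-override described above.
// Hypothetical names and values; this is NOT actual Spark code.
object ExecutorLeakSketch {

  def simulate(): Int = {
    // Simplified scheduler-backend state.
    var requestedTotal    = 10 // total executors asked of the cluster manager
    val currentRegistered = 10 // executors currently registered
    val pendingToRegister = 0  // requested but not yet registered

    // 1. Allocation manager ramps down via requestTotalExecutors(5).
    requestedTotal = 5

    // 2. Allocation manager kills idle executors. The buggy killExecutor
    //    path recomputes the target as current + pending, clobbering the
    //    lower target set in step 1.
    requestedTotal = currentRegistered + pendingToRegister

    requestedTotal // back to 10, not the 5 the allocation manager asked for
  }

  def main(args: Array[String]): Unit =
    println(s"final executor target: ${simulate()}")
}
```

Running the sketch shows the final target is the pre-ramp-down value, which is why the scheduler keeps (or re-spawns) more executors than the allocation manager intended.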
> CoarseGrainedSchedulerBackend leaks executors in case of dynamic allocation
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-21833
>                 URL: https://issues.apache.org/jira/browse/SPARK-21833
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.2.0
>            Reporter: Sital Kedia
>
> We have seen this issue in the coarse-grained scheduler: when dynamic
> executor allocation is turned on, the scheduler asks for more executors than
> needed. Consider the situation where the executor allocation manager is
> ramping down the number of executors. It lowers the executor target number
> by calling the requestTotalExecutors API.
> Later, when the allocation manager finds some executors to be idle, it calls
> the killExecutor API. The coarse-grained scheduler, in the killExecutor
> function, resets the total executors needed to current + pending, which
> overrides the earlier target set by the allocation manager.
> This results in the scheduler spawning more executors than actually needed.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org