Sital Kedia created SPARK-21833:
-----------------------------------

             Summary: CoarseGrainedSchedulerBackend leaks executors in case of dynamic allocation
                 Key: SPARK-21833
                 URL: https://issues.apache.org/jira/browse/SPARK-21833
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 2.2.0
            Reporter: Sital Kedia
We have seen an issue in the coarse-grained scheduler where, with dynamic executor allocation turned on, the scheduler asks for more executors than needed. Consider the situation where the executor allocation manager is ramping down the number of executors. It lowers the executor target number by calling the requestTotalExecutors API (see https://github.com/sitalkedia/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L321). Later, when the allocation manager finds some executors to be idle, it calls the killExecutor API (https://github.com/sitalkedia/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L421). The coarse-grained scheduler, in its killExecutors function, resets the total number of executors needed to current + pending, which overrides the lower target set earlier by the allocation manager (https://github.com/sitalkedia/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L523). This results in the scheduler spawning more executors than actually needed.
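To illustrate the interaction, here is a minimal, self-contained Scala sketch of the two code paths; the state variables and method bodies are simplified paraphrases of the linked code for illustration, not the actual Spark implementation:

{code:scala}
// Simplified model of the interaction described above; names are
// paraphrased from the linked sources, not copied verbatim.
object ExecutorLeakSketch {

  // State tracked by the coarse-grained scheduler backend (simplified).
  var numExistingExecutors = 10
  var numPendingExecutors = 0
  var executorsPendingToRemove = Set.empty[String]
  var requestedTotal = 10

  // ExecutorAllocationManager ramping down: sets the target explicitly.
  def requestTotalExecutors(total: Int): Unit = {
    requestedTotal = total
  }

  // killExecutors (simplified): recomputes the target as current + pending,
  // clobbering the lower target set by requestTotalExecutors above.
  def killExecutors(ids: Seq[String]): Unit = {
    executorsPendingToRemove ++= ids
    requestedTotal =
      numExistingExecutors + numPendingExecutors - executorsPendingToRemove.size
  }

  def main(args: Array[String]): Unit = {
    requestTotalExecutors(2)       // allocation manager ramps down to 2
    killExecutors(Seq("1", "2"))   // allocation manager kills two idle executors
    // requestedTotal is now 10 + 0 - 2 = 8, not the 2 the allocation manager
    // asked for, so the scheduler keeps (or re-spawns) 6 executors too many.
    println(s"requestedTotal = $requestedTotal") // prints 8
  }
}
{code}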