Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7888#discussion_r44492100

    --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
    @@ -509,6 +511,13 @@ private[spark] class ExecutorAllocationManager(
       private def onExecutorBusy(executorId: String): Unit = synchronized {
         logDebug(s"Clearing idle timer for $executorId because it is now running a task")
         removeTimes.remove(executorId)
    +
    +    // Executor is added to remove by misjudgment due to async listener making it as idle).
    +    // see SPARK-9552
    +    if (executorsPendingToRemove.contains(executorId)) {
    --- End diff --

    So, backtracking a step because this code can be confusing.

    `executorsPendingToRemove` is only updated by `removeExecutor` in this class. `removeExecutor` only adds an executor to that list if it calls `ExecutorAllocationClient.killExecutor` and the result is `true`, meaning a request was made to kill the executor.

    So if you do what I suggested above (change the return value of `ExecutorAllocationClient.killExecutor` so that it returns `false` when the executor is busy, and thus a request to kill it was not sent), the executor would not be added to this list in the first place, and you wouldn't need to fix it.

    Does that make sense? Why can't that be done?
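To make the suggestion concrete, here is a minimal sketch of the control flow being described. It is not the actual Spark code: `KillClient`, `BusyAwareClient`, and `AllocationSketch` are hypothetical stand-ins for `ExecutorAllocationClient` and `ExecutorAllocationManager`, and the "busy" check is modelled as a plain set lookup, which is an assumption about how the client would know an executor is running a task.

```scala
import scala.collection.mutable

// Hypothetical stand-in for ExecutorAllocationClient (not Spark's real trait).
trait KillClient {
  /** Returns true only if a kill request was actually sent. */
  def killExecutor(executorId: String): Boolean
}

// Hypothetical client that declines to kill an executor that is running a task.
// Assumption: the set of busy executors is known to the client.
class BusyAwareClient(busy: mutable.Set[String]) extends KillClient {
  override def killExecutor(executorId: String): Boolean =
    !busy.contains(executorId) // busy => no request sent => false
}

// Simplified model of the manager's bookkeeping around removeExecutor.
class AllocationSketch(client: KillClient) {
  val executorsPendingToRemove = mutable.HashSet.empty[String]

  def removeExecutor(executorId: String): Boolean = synchronized {
    val requested = client.killExecutor(executorId)
    // Only track the executor when a kill request really went out.
    if (requested) executorsPendingToRemove += executorId
    requested
  }
}

object KillSketchDemo {
  def main(args: Array[String]): Unit = {
    val manager = new AllocationSketch(new BusyAwareClient(mutable.Set("exec-1")))
    println(manager.removeExecutor("exec-1")) // busy executor: no kill request sent
    println(manager.removeExecutor("exec-2")) // idle executor: kill request sent
    println(manager.executorsPendingToRemove)
  }
}
```

Under these assumptions a busy executor never enters `executorsPendingToRemove`, so `onExecutorBusy` would have nothing to clean up afterwards.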