GitHub user tgravescs commented on the issue:

https://github.com/apache/spark/pull/18874

I suggest you go understand the code. I've already explained this multiple times. You get to 0 executors when there are delays during which an executor doesn't have a task scheduled. Say you have a stage with 10 tasks, and say 1 executor can run 1 task. It finishes, and the driver GCs for 60 seconds before it can put another task on it, so the dynamic allocation manager idle-times-out that executor. We never ask for more executors.

This patch does address 0 executors, but again that is more the edge case. The real problem is that it goes down a few and never gets any back. A job that should take tens of minutes ends up taking hours because of this.

Increasing the idle timeout is just a workaround for the problem; it's not a solution. As I've said multiple times, increasing the idle timeout has other consequences, and a user should not have to increase the idle timeout just to get their job to run in a reasonable time. They should increase or decrease it to optimize things between stages or jobs.

The definition of dynamic allocation is to automatically get executors when they are needed. We are not doing this! They may not be needed at this moment, but if we would either keep them or reacquire them, they would be used and are needed.
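
To put rough numbers on the scenario described above, here is a toy model in Python. It is not Spark code, only a sketch under stated assumptions: each executor runs one task at a time, an executor whose idle gap reaches the timeout is released, and released executors are never re-requested. The function names, the task counts, and the idle gaps are all hypothetical and chosen purely for illustration.

```python
import math


def stage_runtime_secs(num_tasks, task_secs, executors):
    """Seconds to run num_tasks when each executor runs one task at a time."""
    if executors == 0:
        return math.inf  # no executors left: the stage never finishes
    return math.ceil(num_tasks / executors) * task_secs


def executors_after_stall(idle_gaps_secs, idle_timeout_secs):
    """Count executors whose idle gap stayed under the idle timeout.

    Assumption: an executor idle for at least the timeout is released
    and is never re-requested, as the comment describes.
    """
    return sum(1 for gap in idle_gaps_secs if gap < idle_timeout_secs)


# Hypothetical stall: 8 of 10 executors wait 60s for their next task
# (e.g. during a driver GC pause), while 2 wait only 5s.
gaps = [60] * 8 + [5] * 2
left = executors_after_stall(gaps, idle_timeout_secs=60)

healthy = stage_runtime_secs(num_tasks=1000, task_secs=60, executors=10)
degraded = stage_runtime_secs(num_tasks=1000, task_secs=60, executors=left)
print(left, healthy, degraded)
```

With these made-up inputs, 2 executors survive the stall, and the same 1000 tasks go from 6000 seconds (100 minutes) on 10 executors to 30000 seconds (over 8 hours) on 2, which is the "tens of minutes becomes hours" pattern when lost executors are not reacquired.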