Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/18874
  
    I suggest you go understand the code. 
    
    I've already explained this multiple times.  You get 0 executors when there 
are delays and an executor doesn't have a task scheduled.  Say you have a 
stage with 10 tasks.  Say 1 executor can run 1 task; it finishes, the driver 
GCs for 60 seconds before it can put another task on it, and the dynamic 
allocation manager idle-times-out that executor.  We never ask for more 
executors.  This patch does address 0 executors, but again that is more the 
edge case; the real problem is that it goes down a few and never gets any 
back.  A job that should take 10s of minutes takes hours because of this. 
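    
    To make the scenario above concrete, here is a toy simulation.  It is only 
a sketch and not Spark's actual allocation manager: the function `simulate`, 
its parameters, and all the numbers are made up for illustration, and it 
assumes one task slot per executor, whole-second time steps, and that idle 
executors are dropped even while the driver is paused.

```python
from typing import List, Optional, Tuple


def simulate(task_secs: List[int], num_executors: int, idle_timeout: int,
             pauses: List[Tuple[int, int]],
             reacquire: bool) -> Tuple[Optional[int], int]:
    """Toy model (hypothetical, not Spark code).

    Returns (seconds to finish the stage, executors left at the end),
    or (None, 0) if every executor was removed and none is ever requested.
    """
    pending = list(task_secs)  # task durations still waiting for a slot
    executors = [{"busy": 0, "idle": 0} for _ in range(num_executors)]
    now = 0
    while pending or any(e["busy"] > 0 for e in executors):
        if not executors and not reacquire:
            return None, 0  # 0 executors and nobody asks for more
        # The driver cannot schedule tasks while it is paused (e.g. in GC).
        paused = any(start <= now < end for start, end in pauses)
        if not paused:
            if reacquire and pending:
                # Assumed fix: ask for executors again while work is pending.
                while len(executors) < num_executors:
                    executors.append({"busy": 0, "idle": 0})
            for e in executors:
                if e["busy"] == 0 and pending:
                    e["busy"] = pending.pop(0)
                    e["idle"] = 0
        # Advance the clock by one second.
        for e in executors:
            if e["busy"] > 0:
                e["busy"] -= 1
            else:
                e["idle"] += 1
        # Idle timeout: drop executors that sat idle for too long.
        executors = [e for e in executors
                     if e["busy"] > 0 or e["idle"] < idle_timeout]
        now += 1
    return now, len(executors)


if __name__ == "__main__":
    tasks = [5, 5, 20, 20] + [10] * 8  # 12 tasks, durations in seconds
    print("no pause:         ", simulate(tasks, 4, 8, [], False))
    print("pause, no request:", simulate(tasks, 4, 8, [(5, 15)], False))
    print("pause, reacquire: ", simulate(tasks, 4, 8, [(5, 15)], True))
```

    With these made-up numbers the stage finishes in 35 seconds with no pause. 
With a 10 second driver pause and no new executor requests it loses 2 of its 
4 executors and takes 60 seconds.  With the same pause but executors 
reacquired once the driver resumes, it takes 40 seconds.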
    
    Increasing the idle timeout is just a workaround for the problem; it's not 
a solution.  As I've said multiple times, increasing the idle timeout has 
other consequences, and a user should not have to increase the idle timeout 
just to get their job to run in a reasonable time.  They should increase or 
decrease it to optimize things between stages or jobs.  The definition of 
dynamic allocation is to automatically get executors when they are needed.  We 
are not doing this!  They may not be needed at this moment, but if we would 
either keep them or reacquire them, they would be used and are needed.
    


