Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18874
  
    This seems not unreasonable to me given the current problem statement. It 
does solve the possible problem of ending up with 0 executors, and then some. 
    
    The possible impact on a normal app looks like this: it runs a bunch of 
short-lived stages (think iterative ML), so the target executor count stays 
high. But the tasks get scheduled on just a subset of executors, because tasks 
finish quickly there and the remaining tasks wait for a data-local slot and 
then run on those same executors too. In this scenario, the extra executors 
can't be released, even though they will always be idle, because they have to 
stay to maintain the target count. Right now, they'd be released. This scenario 
is not unrealistic in my experience, but it's the only problem scenario I can 
think of.
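
    To make the trade-off concrete, here is a toy model of that scenario. It is 
only a sketch, not Spark code: the function name, its parameters, and the 
numbers are all made up for illustration, and it ignores the minimum-executor 
check for brevity.

```python
def removable_idle_executors(idle, total, target, keep_target_floor):
    """Return how many idle executors an idle-timeout pass may release.

    Hypothetical helper for illustration only; not part of Spark.
    """
    if keep_target_floor:
        # Proposed behavior: never let the live count drop below the target.
        return max(0, min(idle, total - target))
    # Behavior before the change, as described above: idle executors go away.
    return idle


# 10 executors, the target stays at 10, but short tasks keep landing on 4.
total, target, busy = 10, 10, 4
idle = total - busy

released_now = removable_idle_executors(idle, total, target, keep_target_floor=False)
released_after = removable_idle_executors(idle, total, target, keep_target_floor=True)
print(released_now, released_after)  # prints "6 0"
```

    With the floor in place, the 6 idle executors stay allocated for as long as 
the target remains at 10.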
    
    (Am I right that the check vs minimum executor count here is now redundant? 
The target can't go under the minimum count, and executors can't go under the 
target count now on removal.)
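
    The redundancy argument can be checked by brute force on small numbers. 
This is a sketch under the stated assumption that target >= minimum always 
holds; `may_remove` and its parameters are hypothetical names, not the 
actual method in the PR.

```python
def may_remove(live, target, minimum, check_minimum):
    """Decide whether one executor may be removed (hypothetical model)."""
    below_target = live - 1 < target
    below_minimum = live - 1 < minimum
    if check_minimum:
        return not (below_target or below_minimum)
    return not below_target


# Compare both variants for every small combination with target >= minimum.
redundant = all(
    may_remove(live, target, minimum, True)
    == may_remove(live, target, minimum, False)
    for minimum in range(0, 6)
    for target in range(minimum, 9)
    for live in range(0, 12)
)
print(redundant)  # prints "True"
```

    The two variants only diverge if the target could fall below the minimum, 
for example `may_remove(3, 1, 3, ...)`, which the premise above rules out.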
    
    I guess I'm still sort of unclear how, in the stuck-driver scenario, 
`onExecutorBusy` isn't firing to mark executors as not idle while the 
idle-timeout `schedule()` loop is still running fine. But it's imaginable. Yes, 
this change fixes that scenario, and it sounds like it has been observed, 
though it may be chalked up to dire driver states that are going to fall over 
anyhow. It _does_ sound logical to not let the idle-timeout loop take the 
executor count below target, even though I had assumed the current behavior 
existed because of the scenario above, maybe?
    
    Those are the things we're weighing, and there's clear support for maybe 
inconveniencing the first scenario to both help the second and fix the 
0-executor risk, so I have no issue with that.

