GitHub user squito commented on the issue: https://github.com/apache/spark/pull/17854

It looks to me like this is actually making two behavior changes:

1) throttling the requests for new containers, as you describe in the PR description
2) dropping newly received containers if they are over the limit (not in the description)

Is that correct? Did you find that (2) was necessary as well?

I understand the problem you are describing, but I'm surprised this really helps the driver scale up to more executors. Maybe this will let the executors start, but won't it just lead to the driver getting swamped when you've got 2500 executors sending heartbeats and task updates? I'm not saying it's bad to make this improvement, just trying to understand. I'd feel better about just doing (1); if you found (2) is necessary, I would want to think through the implications a bit more.
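For clarity, here is a minimal sketch of the distinction between (1) and (2). This is not the actual YarnAllocator code from the PR; all names (containersToRequest, maxPending, etc.) are hypothetical, and the logic only illustrates what each behavior change would do in an allocation cycle.

```scala
// Hypothetical sketch of the two behavior changes; not Spark's real API.
object AllocatorSketch {

  // (1) Throttle: per allocation cycle, ask YARN for at most a bounded
  // batch of containers instead of everything still missing.
  def containersToRequest(target: Int, running: Int, pending: Int,
                          maxPending: Int): Int = {
    val missing = math.max(target - running - pending, 0)
    math.min(missing, math.max(maxPending - pending, 0))
  }

  // (2) Drop: if YARN grants more containers than the current target,
  // keep only what fits and release the rest back to YARN.
  def partitionAllocated[C](allocated: Seq[C], target: Int,
                            running: Int): (Seq[C], Seq[C]) = {
    val room = math.max(target - running, 0)
    allocated.splitAt(room) // (containers to use, containers to release)
  }
}
```

For example, with target = 2500, running = 100, pending = 400, and maxPending = 500, containersToRequest returns 100: the allocator requests only up to the pending cap rather than all 2000 missing executors, which is the throttling in (1); (2) is the separate decision to give back over-limit grants.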