Github user squito commented on the issue: https://github.com/apache/spark/pull/21068

I totally understand your motivation for wanting the limit. But I'm trying to balance that against behavior which might not really achieve the desired effect and could be even more confusing in some cases. It won't achieve the desired effect if your cluster has more nodes but they're all tied up in other applications. And it'll be confusing to users if they see notifications about blacklisting in the logs and UI, but then still see Spark trying to use those nodes anyway. I also wonder if putting this in now will make it harder to change the behavior later.

All that said, I don't have a great alternative right now, other than removing the limit entirely for the moment and adding a notification to the driver. We could have a more general starvation detector, which wouldn't only look at node count, but would also look at delays in acquiring containers and in finding places to schedule tasks (related to SPARK-15815 & SPARK-22148), but I don't want to tackle all of that here.
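Just to make the idea concrete, here is a rough sketch of the kind of checks such a detector might run. Everything here is hypothetical illustration, not an actual Spark API, and the thresholds are made up:

```scala
// Hypothetical sketch of a general starvation detector, as floated above.
// None of these names exist in Spark; they only illustrate the signals
// (blacklisted-node count, container-allocation delay, task-launch delay)
// such a detector could combine.
object StarvationDetectorSketch {

  /** Hypothetical snapshot of scheduler state the detector would inspect. */
  case class SchedulerSnapshot(
      totalNodes: Int,
      blacklistedNodes: Int,
      millisSinceLastContainerAllocated: Long,
      millisSinceLastTaskLaunched: Long)

  /** Returns a warning for the driver log/UI if the app looks starved. */
  def checkStarvation(
      s: SchedulerSnapshot,
      allocationTimeoutMs: Long = 120000L,   // made-up threshold
      scheduleTimeoutMs: Long = 120000L      // made-up threshold
  ): Option[String] = {
    if (s.totalNodes > 0 && s.blacklistedNodes >= s.totalNodes) {
      Some(s"All ${s.totalNodes} known nodes are blacklisted; " +
        "no tasks can be scheduled anywhere.")
    } else if (s.millisSinceLastContainerAllocated > allocationTimeoutMs) {
      Some(s"No containers allocated for " +
        s"${s.millisSinceLastContainerAllocated} ms; the cluster may be " +
        "tied up in other applications.")
    } else if (s.millisSinceLastTaskLaunched > scheduleTimeoutMs) {
      Some(s"No tasks launched for ${s.millisSinceLastTaskLaunched} ms " +
        "despite live executors.")
    } else {
      None
    }
  }

  def main(args: Array[String]): Unit = {
    // Example: every node is blacklisted, so the detector fires.
    val snapshot = SchedulerSnapshot(
      totalNodes = 10,
      blacklistedNodes = 10,
      millisSinceLastContainerAllocated = 0L,
      millisSinceLastTaskLaunched = 0L)
    checkStarvation(snapshot).foreach(msg => println(s"WARN: $msg"))
  }
}
```

The point is just that node count alone is a weak signal; the delay-based checks would catch the "cluster has nodes but they're all busy" case that a blacklist limit can't.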