Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21758#discussion_r205652317

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -359,20 +366,55 @@ private[spark] class TaskSchedulerImpl(
         // of locality levels so that it gets a chance to launch local tasks on all of them.
         // NOTE: the preferredLocality order: PROCESS_LOCAL, NODE_LOCAL, NO_PREF, RACK_LOCAL, ANY
         for (taskSet <- sortedTaskSets) {
    -      var launchedAnyTask = false
    -      var launchedTaskAtCurrentMaxLocality = false
    -      for (currentMaxLocality <- taskSet.myLocalityLevels) {
    -        do {
    -          launchedTaskAtCurrentMaxLocality = resourceOfferSingleTaskSet(
    -            taskSet, currentMaxLocality, shuffledOffers, availableCpus, tasks)
    -          launchedAnyTask |= launchedTaskAtCurrentMaxLocality
    -        } while (launchedTaskAtCurrentMaxLocality)
    -      }
    -      if (!launchedAnyTask) {
    -        taskSet.abortIfCompletelyBlacklisted(hostToExecutors)
    +      // Skip the barrier taskSet if the available slots are less than the number of pending tasks.
    +      if (taskSet.isBarrier && availableSlots < taskSet.numTasks) {
    --- End diff --

You'll request the slots, but I think there are a lot more complications. The whole point of using dynamic allocation is a multi-tenant cluster, so resources will come and go. If there aren't enough resources available on the cluster no matter what, then you'll see executors get acquired, hit their idle timeout, get released, and then get acquired again. This will be really confusing to the user: with the constant logging about executors getting acquired and released, it might look like there is some progress, though really the job would just wait indefinitely.

Or you might get a deadlock between two concurrent applications. Even if each could fit on the cluster by itself, they might both acquire some resources, which would prevent either of them from getting enough.
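To make the concern concrete, here is a toy model (not Spark code; the object name, the slot-by-slot acquisition order, and the immediate idle-timeout release are all simplifying assumptions) of two applications each needing more barrier slots than they can hold simultaneously. Each round both apps acquire free slots one at a time, a barrier stage only launches once an app holds all the slots it needs, and executors that could not launch a full stage are released at the idle timeout:

```scala
// Toy simulation of the livelock described above: two apps trade slots
// via dynamic allocation and neither barrier stage ever launches.
object BarrierLivelockSketch {
  // Returns (stages run by app A, stages run by app B) after `rounds`.
  def simulate(totalSlots: Int, needed: Int, rounds: Int): (Int, Int) = {
    var free = totalSlots
    var heldA, heldB, progressA, progressB = 0
    for (_ <- 1 to rounds) {
      // Both apps acquire free slots, interleaved one at a time.
      var acquiring = true
      while (acquiring && free > 0) {
        acquiring = false
        if (heldA < needed && free > 0) { heldA += 1; free -= 1; acquiring = true }
        if (heldB < needed && free > 0) { heldB += 1; free -= 1; acquiring = true }
      }
      // A barrier stage launches only with ALL of its slots.
      if (heldA >= needed) progressA += 1
      if (heldB >= needed) progressB += 1
      // Executors that couldn't launch a stage hit the idle timeout
      // and are released before the next round.
      if (heldA < needed) { free += heldA; heldA = 0 }
      if (heldB < needed) { free += heldB; heldB = 0 }
    }
    (progressA, progressB)
  }
}
```

With 6 slots and both apps needing 4, each round ends 3/3 and both release at the idle timeout, so neither ever advances, whereas with 10 slots both run every round. Real allocation timing is far messier, which is why an app might only advance by luck.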
Again, they'd both go through the same loop of acquiring some resources, having them hit the idle timeout, and releasing them, then acquiring resources again -- but they might just continually trade resources between each other. They'd only advance by luck. You have similar problems with concurrent jobs within one Spark application, but it's a bit easier to control since at least the Spark scheduler knows about everything.

> We plan to fail the job on submit if it requires more slots than available.

What exactly do you mean by "available"? It's not so well defined for dynamic allocation. The resources you have right when the job is submitted? Also, can you point me to where that is being done? I didn't see it here -- is it in another jira?
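For reference, a submit-time check like the one proposed might look roughly like the sketch below. This is purely hypothetical (the object, method, and parameter names are invented, not from the PR); it also illustrates the ambiguity being raised: the slot count is a snapshot of whatever executors are registered at submit time, which dynamic allocation can change a moment later.

```scala
// Hypothetical sketch of a "fail on submit" slot check, NOT actual
// Spark code. `executorCores` stands in for the cores of the executors
// known to the scheduler at the instant the job is submitted.
object BarrierSubmitCheckSketch {
  def checkBarrierJobFits(
      numBarrierTasks: Int,
      executorCores: Seq[Int],
      cpusPerTask: Int): Unit = {
    // Slots available right now; with dynamic allocation this snapshot
    // may be stale as soon as it is taken.
    val availableSlots = executorCores.map(_ / cpusPerTask).sum
    if (availableSlots < numBarrierTasks) {
      throw new IllegalArgumentException(
        s"Barrier stage needs $numBarrierTasks slots but only " +
        s"$availableSlots are currently available")
    }
  }
}
```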