Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21758#discussion_r205652317
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -359,20 +366,55 @@ private[spark] class TaskSchedulerImpl(
         // of locality levels so that it gets a chance to launch local tasks on all of them.
         // NOTE: the preferredLocality order: PROCESS_LOCAL, NODE_LOCAL, NO_PREF, RACK_LOCAL, ANY
         for (taskSet <- sortedTaskSets) {
    -      var launchedAnyTask = false
    -      var launchedTaskAtCurrentMaxLocality = false
    -      for (currentMaxLocality <- taskSet.myLocalityLevels) {
    -        do {
    -          launchedTaskAtCurrentMaxLocality = resourceOfferSingleTaskSet(
    -            taskSet, currentMaxLocality, shuffledOffers, availableCpus, tasks)
    -          launchedAnyTask |= launchedTaskAtCurrentMaxLocality
    -        } while (launchedTaskAtCurrentMaxLocality)
    -      }
    -      if (!launchedAnyTask) {
    -        taskSet.abortIfCompletelyBlacklisted(hostToExecutors)
    +      // Skip the barrier taskSet if the available slots are less than the number of pending tasks.
    +      if (taskSet.isBarrier && availableSlots < taskSet.numTasks) {
    --- End diff --
    
    You'll request the slots, but I think there are a lot more complications.  The whole point of using dynamic allocation is on a multi-tenant cluster, so resources will come and go.  If there aren't enough resources available on the cluster no matter what, then you'll see executors get acquired, have their idle timeout expire, get released, and then get acquired again.  This will be really confusing to the user: the constant logging about executors being acquired and released makes it look like there is some progress, when really the job would just wait indefinitely.
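
    Just to make that churn concrete, here's roughly the setup I have in mind (a sketch -- the values are made up, only the config keys are real dynamic-allocation settings):

        import org.apache.spark.SparkConf

        // Illustrative only: the knobs that drive the acquire -> sit idle -> release ->
        // re-acquire cycle described above.  Values are arbitrary.
        val conf = new SparkConf()
          .set("spark.dynamicAllocation.enabled", "true")
          .set("spark.shuffle.service.enabled", "true")              // external shuffle service, needed with dynamic allocation
          .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // released after sitting idle this long
          .set("spark.dynamicAllocation.minExecutors", "0")
          .set("spark.dynamicAllocation.maxExecutors", "50")
        // With a barrier stage that needs more slots than the cluster will ever grant,
        // executors get requested, sit idle (no barrier task can launch until *all* slots
        // are there), hit the idle timeout, get released, and get requested again.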
    
    Or you might get a deadlock between two concurrent applications.  Even if each could fit on the cluster by itself, they might both acquire some resources, which would prevent either of them from getting enough.  Again, they'd both go through the same loop of acquiring some resources, having them hit the idle timeout and get released, then acquiring resources again -- they might just continually trade resources between each other and only advance by luck.
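
    As a toy model of that livelock (nothing to do with the actual scheduler code; the numbers are made up): a 10-slot cluster and two barrier stages of 8 tasks each --

        // Each app only launches its barrier stage when it holds enough slots for the
        // *whole* stage, so with the cluster's 10 slots split 5/5 neither ever launches.
        case class App(name: String, barrierTasks: Int, heldSlots: Int) {
          def canLaunch: Boolean = heldSlots >= barrierTasks
        }
        val a = App("app-A", barrierTasks = 8, heldSlots = 5)
        val b = App("app-B", barrierTasks = 8, heldSlots = 5)
        println(s"${a.name}: ${a.canLaunch}, ${b.name}: ${b.canLaunch}")  // false, false
        // Each eventually idle-times-out its executors and re-acquires some, but unless
        // one of them happens to grab 8+ slots at once, neither ever makes progress.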
    
    You have similar problems with concurrent jobs within one Spark application, but it's a bit easier to control since at least the Spark scheduler knows about everything.
    
    > We plan to fail the job on submit if it requires more slots than available.
    
    What exactly do you mean by "available"?  It's not so well defined for dynamic allocation.  The resources you have right when the job is submitted?  Also, can you point me to where that is being done?  I didn't see it here -- is it another jira?
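
    To be concrete about the kind of snapshot I'd call "available" at a given moment -- something like the following hypothetical helper (not the scheduler's actual code; Offer is a stand-in for the scheduler's WorkerOffer, and cpusPerTask is spark.task.cpus):

        // Slots derived from whatever offers are in hand right now.
        case class Offer(executorId: String, host: String, cores: Int)
        def slotsRightNow(offers: Seq[Offer], cpusPerTask: Int): Int =
          offers.map(_.cores / cpusPerTask).sum

        // e.g. three 4-core executors with spark.task.cpus = 1 -> 12 slots *right now*
        println(slotsRightNow(Seq(Offer("1", "hostA", 4), Offer("2", "hostB", 4), Offer("3", "hostC", 4)), 1))
        // Under dynamic allocation this snapshot changes whenever executors come or go,
        // so a submit-time check against it is a check against a moving target.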


---
