Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/7274#discussion_r34108478

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -544,38 +544,60 @@ private[master] class Master(
    * has enough cores and memory. Otherwise, each executor grabs all the cores available on the
    * worker by default, in which case only one executor may be launched on each worker.
    */
-  private def startExecutorsOnWorkers(): Unit = {
-    // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
-    // in the queue, then the second app, etc.
+
+  private[master] def scheduleExecutorsOnWorkers(app: ApplicationInfo, usableWorkers: Array[WorkerInfo],
+      spreadOutApps: Boolean): Array[Int] = {
+    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
+    val memoryPerExecutor = app.desc.memoryPerExecutorMB
+    val numUsable = usableWorkers.length
+    val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
+    val assignedMemory = new Array[Int](numUsable) // Amount of memory to give to each worker
+    var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
+    var pos = 0
     if (spreadOutApps) {
-      // Try to spread out each app among all the workers, until it has all its cores
-      for (app <- waitingApps if app.coresLeft > 0) {
-        val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
-          .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
-            worker.coresFree >= app.desc.coresPerExecutor.getOrElse(1))
-          .sortBy(_.coresFree).reverse
-        val numUsable = usableWorkers.length
-        val assigned = new Array[Int](numUsable) // Number of cores to give on each node
-        var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
-        var pos = 0
-        while (toAssign > 0) {
-          if (usableWorkers(pos).coresFree - assigned(pos) > 0) {
-            toAssign -= 1
-            assigned(pos) += 1
-          }
-          pos = (pos + 1) % numUsable
-        }
-        // Now that we've decided how many cores to give on each node, let's actually give them
-        for (pos <- 0 until numUsable if assigned(pos) > 0) {
-          allocateWorkerResourceToExecutors(app, assigned(pos), usableWorkers(pos))
+      // Try to spread out executors among workers (sparse scheduling)
+      while (toAssign > 0) {
+        if (usableWorkers(pos).coresFree - assignedCores(pos) >= coresPerExecutor &&
+            usableWorkers(pos).memoryFree - assignedMemory(pos) >= memoryPerExecutor) {
+          toAssign -= coresPerExecutor
--- End diff --

Also, yes, I see you still have the filtering on available cores, so this shouldn't keep looping over the workers, right? Unless the available count can drop while this is in progress, but that is either not a problem here or was already a problem before, so it's not directly relevant to this change.
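For readers following along, here is a minimal, self-contained Scala sketch of the spread-out branch under discussion. The diff above is truncated mid-loop, so everything past `toAssign -= coresPerExecutor` (the memory bookkeeping, the position advance, and the explicit no-progress guard that makes the termination argument visible) is an assumption for illustration, not a quote of the patch; `WorkerStub` and `scheduleSpreadOut` are likewise hypothetical stand-ins for Spark's `WorkerInfo` and the real method.

```scala
// Hypothetical stand-in for Spark's WorkerInfo; only the fields the loop reads.
case class WorkerStub(coresFree: Int, memoryFree: Int)

// Sketch of the spread-out scheduling pass: round-robin over workers, reserving
// coresPerExecutor / memoryPerExecutor per placement. Returns cores assigned per worker.
def scheduleSpreadOut(
    coresToAssign: Int,
    coresPerExecutor: Int,
    memoryPerExecutor: Int,
    workers: Array[WorkerStub]): Array[Int] = {
  val numUsable = workers.length
  val assignedCores = new Array[Int](numUsable)
  val assignedMemory = new Array[Int](numUsable)
  var toAssign = coresToAssign
  var pos = 0

  // A worker can host one more executor only if its free cores AND free memory,
  // minus what this pass has already reserved on it, still cover one executor.
  def canLaunch(i: Int): Boolean =
    workers(i).coresFree - assignedCores(i) >= coresPerExecutor &&
      workers(i).memoryFree - assignedMemory(i) >= memoryPerExecutor

  // The no-progress guard is the point of the comment above: even pre-filtered
  // workers can become saturated mid-pass, so a bare `while (toAssign > 0)` could
  // spin forever; instead, stop after any full round-robin pass that places nothing.
  var progress = true
  while (toAssign >= coresPerExecutor && progress) {
    progress = false
    var visited = 0
    while (visited < numUsable && toAssign >= coresPerExecutor) {
      if (canLaunch(pos)) {
        toAssign -= coresPerExecutor
        assignedCores(pos) += coresPerExecutor
        assignedMemory(pos) += memoryPerExecutor
        progress = true
      }
      pos = (pos + 1) % numUsable
      visited += 1
    }
  }
  assignedCores
}

// Example: 10 cores to place, 2 cores / 1024 MB per executor, three workers.
// The third worker only has memory for one executor, so it caps at 2 cores.
val assigned = scheduleSpreadOut(
  coresToAssign = 10,
  coresPerExecutor = 2,
  memoryPerExecutor = 1024,
  workers = Array(WorkerStub(8, 4096), WorkerStub(8, 4096), WorkerStub(4, 1024)))
// assigned: Array(4, 4, 2)
```

The sketch decrements by whole executors rather than single cores, which is why the memory check alongside the core check matters: a worker that passed the initial filter can still run out of either resource once earlier rounds of this same pass have reserved capacity on it.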