[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-93073105

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30274/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-93073065

[Test build #30274 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30274/consoleFull) for PR 731 at commit [`6dee808`](https://github.com/apache/spark/commit/6dee808d4da7da43805e4ed16fc553f7bc18f494).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
* This patch does not change any dependencies.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-93051522

Hey @andrewor14, my pleasure, and many thanks for your patient review.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/731
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28367538

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(worker filter in the new startExecutorsOnWorkers, spread-out branch)

        for (app <- waitingApps if app.coresLeft > 0) {
          val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
    -       .filter(canUse(app, _)).sortBy(_.coresFree).reverse
    +       .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
    +         worker.coresFree > 0)

--- End diff --

Hmm... if we remove this, then in the case above (the user prefers 3 cores per executor and all workers have at most 2 cores), even though we will not allocate anything to any worker, we still generate an `assigned` array.
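To make the predicate under discussion concrete, here is a minimal sketch with simplified stand-ins for the master's data structures (the real `WorkerInfo` and application description live in `org.apache.spark.deploy`; the names and fields below are illustrative):

```scala
// Simplified stand-ins for the master-side data structures (illustration only).
case class Worker(coresFree: Int, memoryFree: Int)
case class AppDesc(memoryPerExecutorMB: Int, coresPerExecutor: Option[Int])

// Keep only workers that could host at least one executor of the requested
// shape: enough free memory and at least coresPerExecutor free cores
// (falling back to 1 when no explicit executor size was set).
def usableWorkers(workers: Seq[Worker], app: AppDesc): Seq[Worker] =
  workers.filter { w =>
    w.memoryFree >= app.memoryPerExecutorMB &&
    w.coresFree >= app.coresPerExecutor.getOrElse(1)
  }

// With 2-core workers and 3-core executors, no worker survives the filter,
// so the scheduler never builds an all-zero `assigned` array:
// usableWorkers(Seq(Worker(2, 4096), Worker(2, 4096)), AppDesc(1024, Some(3))) == Seq()
```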
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-93050835

@CodingCat I'm merging this into master. Thanks for keeping this patch open for a long time and patiently iterating on the reviews. I think the final solution we have here is much simpler than the one we began with.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28367067

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(same worker filter in startExecutorsOnWorkers, spread-out branch)

    +       .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
    +         worker.coresFree > 0)

--- End diff --

Again, this predicate is actually not needed because we handle it correctly in the line I pointed out earlier. But not a big deal, we can just leave it.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-93046902

[Test build #30274 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30274/consoleFull) for PR 731 at commit [`6dee808`](https://github.com/apache/spark/commit/6dee808d4da7da43805e4ed16fc553f7bc18f494).
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28365419

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(same worker filter in startExecutorsOnWorkers, spread-out branch)

    +       .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
    +         worker.coresFree > 0)

--- End diff --

This should be `worker.coresFree >= app.desc.coresPerExecutor.getOrElse(1)`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28365151

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(same worker filter in startExecutorsOnWorkers, spread-out branch)

    +       .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
    +         worker.coresFree > 0)

--- End diff --

I replaced it with `worker.coresFree >= app.desc.coresPerExecutor.getOrElse(0)`, so that we do not need to run the following allocation algorithm at all for the case I mentioned above:

> e.g. I have 8 cores, 2 cores per machine, and an application that would like to use all of them. In spread-out mode we will get an assignment of Array(2, 2, 2, 2); if we set --executor-cores to 3, the application will get 0 cores, since no per-worker allocation is at least 3...
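The arithmetic in this example can be checked directly; a small sketch with the values from the comment above (variable names are illustrative):

```scala
// Spread-out mode hands out cores round-robin across 4 two-core workers,
// so an app asking for all 8 cores ends up with this per-worker assignment:
val assigned = Array(2, 2, 2, 2)
val coresPerExecutor = 3 // the user's --executor-cores setting

// Each worker can launch assigned(i) / coresPerExecutor executors of that size:
val executorsPerWorker = assigned.map(_ / coresPerExecutor)            // Array(0, 0, 0, 0)
val coresGranted = executorsPerWorker.map(_ * coresPerExecutor).sum    // 0

// No per-worker share reaches 3 cores, so the application is granted nothing.
```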
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-92532203

@CodingCat We shouldn't have to worry about the case where the user asks for more resources per executor than are available on the cluster. If each machine only has 2 cores, the user shouldn't ask for 3 per executor. This holds regardless of whether spread-out mode is used, since an executor cannot be "split" across machines. The existing approach is fine.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28288632

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(in the reworked schedule())

    +  private def schedule(): Unit = {
    +    if (state != RecoveryState.ALIVE) { return }
    +    // start in-cluster drivers, they take strict precedence over applications

--- End diff --

Can you replace this with

```
// Drivers take strict precedence over executors
```
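The `allocateWorkerResourceToExecutors` method discussed in this and the surrounding comments carves a worker's share of cores into fixed-size executors. A minimal sketch of that arithmetic, with the master's state updates flattened into plain parameters (the signature below is illustrative, not the patch's exact one):

```scala
// Count how many executors a single worker can host, given the cores granted
// to the app on this worker and the worker's free memory. When the user did
// not set an executor size, all granted cores go into one executor.
def executorsToLaunch(
    coresToAllocate: Int,
    coresPerExecutor: Option[Int],
    memoryFree: Int,
    memoryPerExecutorMB: Int): Int = {
  val perExec = coresPerExecutor.getOrElse(coresToAllocate)
  var coresLeft = coresToAllocate
  var memLeft = memoryFree
  var launched = 0
  while (perExec > 0 && coresLeft >= perExec && memLeft >= memoryPerExecutorMB) {
    coresLeft -= perExec
    memLeft -= memoryPerExecutorMB
    launched += 1
  }
  launched
}

// e.g. 8 granted cores, 2-core executors, 4 GB free, 1 GB each => 4 executors:
// executorsToLaunch(8, Some(2), 4096, 1024) == 4
```

With an explicit executor size a worker may host several executors of the same application; without one, the `getOrElse` fallback reproduces the old one-executor-per-worker behavior.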
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28288621

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(at the end of the reworked schedule())

    +    // start executors

--- End diff --

Remove this comment, as it doesn't convey any information.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28288516

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(in startExecutorsOnWorkers, non-spread-out branch)

    +        for (app <- waitingApps if app.coresLeft > 0 &&
    +          worker.memoryFree >= app.desc.memoryPerExecutorMB) {

--- End diff --

No need to check this again here... we already check this in `allocateWorkerResourceToExecutors` L586.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28288450

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(same worker filter in startExecutorsOnWorkers, spread-out branch)

    +       .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
    +         worker.coresFree > 0)

--- End diff --

You technically don't need this check, since we already check in L551:

```
if (usableWorkers(pos).coresFree - assigned(pos) > 0)
```
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28288173

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(new doc comment on startExecutorsOnWorkers)

    + * is usually better for data locality purposes and is the default. The number of cores assigned

--- End diff --

We should split on this sentence to form a new paragraph. Right now it's one huge chunk of text.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r28288122

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(new doc comment on startExecutorsOnWorkers)

    + * Schedule executors to be launched on the workers.There are two modes of launching executors.

--- End diff --

Space after `.`
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91973488

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30097/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91973479

[Test build #30097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30097/consoleFull) for PR 731 at commit [`da102d4`](https://github.com/apache/spark/commit/da102d42b62c98b50e015bdd705af961361c1063).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
* This patch does not change any dependencies.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91962438

[Test build #30098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30098/consoleFull) for PR 731 at commit [`940cb42`](https://github.com/apache/spark/commit/940cb4276c7d06f92a7077f32cb2ea748c59a873).

* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
* This patch does not change any dependencies.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91962442

Test FAILed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30098/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91962279

[Test build #30098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30098/consoleFull) for PR 731 at commit [`940cb42`](https://github.com/apache/spark/commit/940cb4276c7d06f92a7077f32cb2ea748c59a873).
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91961473

Ignore my last comment. I still hold the position that:

1. if we want to define the exact number of cores per executor, we need to change the allocation algorithm (bringing in a larger patch);
2. otherwise, we go back to defining the maximum number of cores per executor.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91959021

[Test build #30097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30097/consoleFull) for PR 731 at commit [`da102d4`](https://github.com/apache/spark/commit/da102d42b62c98b50e015bdd705af961361c1063).
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91956998

After rethinking this patch, I think it is fine to allocate zero cores in the case I mentioned above; we just need to filter out a worker if its free cores are fewer than spark.executor.cores (when it is defined).
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/731#issuecomment-91360706

[Test build #29965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29965/consoleFull) for PR 731 at commit [`bc82820`](https://github.com/apache/spark/commit/bc828200402981cd759dd63e04718fe9ea737e06).
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27933420

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(in the earlier allocateWorkerResourceToExecutors)

    +    val maxCoresPerExecutor = app.desc.maxCorePerExecutor.getOrElse(Int.MaxValue)

--- End diff --

Then this will be

```
val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(coresToAllocate)
```
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27933391

--- Diff: docs/configuration.md ---

    + spark.deploy.maxCoresPerExecutor

--- End diff --

This will be `spark.executor.cores` instead (which is currently not yet documented). The description would be something like:

"""
Default: 2 in YARN mode, all the available cores on the worker in standalone mode. The number of cores to use on each executor. For YARN and standalone mode only. In standalone mode, setting this parameter allows an application to run multiple executors on the same worker, provided that there are enough cores on that worker. Otherwise, only one executor per application will run on each worker.
"""
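For illustration, an application might opt into fixed-size executors like this (a sketch assuming the final `spark.executor.cores` key; the master URL, app name, and sizes are placeholders):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Request 2-core executors on a standalone cluster. With 8 total cores and
// enough memory, the master may now place several executors on one worker.
val conf = new SparkConf()
  .setMaster("spark://master:7077")
  .setAppName("multi-executor-demo")
  .set("spark.executor.cores", "2")   // cores per executor
  .set("spark.executor.memory", "1g") // memory per executor
  .set("spark.cores.max", "8")        // total cores for the application
val sc = new SparkContext(conf)
```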
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27925316

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(in the earlier allocateWorkerResourceToExecutors)

    +    var coresToAssign = coresDemand

--- End diff --

`var coresLeft = coresToAssign`
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27925182

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(doc comment on allocateWorkerResourceToExecutors)

    +   * @param coresDemand the total number of cores to be allocated to this application

--- End diff --

More specifically, this should be "cores on this worker to be allocated to this application". I would rename this variable to `coresToAllocate`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27924903

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---

    +    | Spark standalone and YARN only:
    +    |  --executor-cores NUM    Number of cores to use on each executor. Default:
    +    |                          1 in YARN mode; in standalone mode, all cores in a worker
    +    |                          allocated to an application will be assigned to a single
    +    |                          executor.

--- End diff --

I would rephrase this as:

Number of cores per executor. (Default: 1 in YARN mode, or all available cores on the worker in standalone mode)
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27924772

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---

    +      OptionAssigner(args.executorCores, STANDALONE, ALL_DEPLOY_MODES,
    +        sysProp = "spark.deploy.maxCoresPerExecutor"),

--- End diff --

This will become `spark.executor.cores`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r27924675

--- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---

    -    val memoryPerSlave: Int,
    +    val memoryPerExecutorMB: Int,
         ...
    +    val maxCorePerExecutor: Option[Int] = None)

--- End diff --

As discussed, this would be `coresPerExecutor` instead.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-90743453

@CodingCat Thanks for the latest changes. It is much simpler and I believe it does what we want! On a separate note, I had an offline discussion with @pwendell about the config semantics. He actually proposes that we configure the number of cores an executor will have exactly, rather than the maximum number of cores it could have. Meaning, instead of having `spark.deploy.maxCoresPerExecutor`, we will reuse `spark.executor.cores` as suggested before, but modify the code a little to make sure each executor has exactly N cores instead of at most N cores (where N is the value of `spark.executor.cores`). I will make more suggestions inline to indicate what I mean.
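To make the two semantics concrete, here is a minimal Scala sketch; the helper names `atMostN` and `exactlyN` are illustrative, not code from the patch:

```
// Splitting one worker's cores into executors under the two proposed semantics.
// "At most N": any remainder becomes a final, smaller executor.
def atMostN(coresOnWorker: Int, n: Int): Seq[Int] =
  Seq.fill(coresOnWorker / n)(n) ++
    (if (coresOnWorker % n > 0) Seq(coresOnWorker % n) else Nil)

// "Exactly N" (the semantics proposed here): leftover cores are simply unused.
def exactlyN(coresOnWorker: Int, n: Int): Seq[Int] =
  Seq.fill(coresOnWorker / n)(n)

assert(atMostN(7, 3) == Seq(3, 3, 1))  // remainder forms a 1-core executor
assert(exactlyN(7, 3) == Seq(3, 3))    // the leftover core stays idle
```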
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27916624

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -524,49 +524,26 @@ private[master] class Master(
   }

   /**
-   * Can an app use the given worker? True if the worker has enough memory and we haven't already
-   * launched an executor for the app on it (right now the standalone backend doesn't like having
-   * two executors on the same worker).
+   * Can an app use the given worker?
    */
   private def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
-    worker.memoryFree >= app.desc.memoryPerSlave && !worker.hasExecutor(app)
+    val enoughResources = worker.memoryFree >= app.desc.memoryPerExecutorMB && worker.coresFree > 0
+    val allowToExecute = app.desc.maxCorePerExecutor.isDefined || !worker.hasExecutor(app)
+    allowToExecute && enoughResources
   }

   /**
-   * Schedule the currently available resources among waiting apps. This method will be called
-   * every time a new app joins or resource availability changes.
+   * The resource allocator spread out each app among all the workers until it has all its cores in
+   * spreadOut mode otherwise packs each app into as few workers as possible until it has assigned
+   * all its cores. User can define spark.deploy.maxCoresPerExecutor per application to
+   * limit the maximum number of cores to allocate to each executor on each worker; if the parameter
+   * is not defined, then only one executor will be launched on a worker.
    */
-  private def schedule() {
-    if (state != RecoveryState.ALIVE) { return }
-
-    // First schedule drivers, they take strict precedence over applications
-    // Randomization helps balance drivers
-    val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
-    val numWorkersAlive = shuffledAliveWorkers.size
-    var curPos = 0
-
-    for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
-      // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
-      // start from the last worker that was assigned a driver, and continue onwards until we have
-      // explored all alive workers.
-      var launched = false
-      var numWorkersVisited = 0
-      while (numWorkersVisited < numWorkersAlive && !launched) {
-        val worker = shuffledAliveWorkers(curPos)
-        numWorkersVisited += 1
-        if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
-          launchDriver(worker, driver)
-          waitingDrivers -= driver
-          launched = true
-        }
-        curPos = (curPos + 1) % numWorkersAlive
-      }
-    }
-
+  private def startExecutorsOnWorkers() {
--- End diff --

here and everywhere
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27916393

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
(same hunk as quoted in the previous comment)
+  private def startExecutorsOnWorkers() {
--- End diff --

`Unit` return type
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27916430

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -583,31 +560,68 @@ private[master] class Master(
         }
         // Now that we've decided how many cores to give on each node, let's actually give them
         for (pos <- 0 until numUsable) {
-          if (assigned(pos) > 0) {
-            val exec = app.addExecutor(usableWorkers(pos), assigned(pos))
-            launchExecutor(usableWorkers(pos), exec)
-            app.state = ApplicationState.RUNNING
-          }
+          allocateWorkerResourceToExecutors(app, assigned(pos), usableWorkers(pos))
         }
       }
     } else {
-      // Pack each app into as few nodes as possible until we've assigned all its cores
+      // Pack each app into as few workers as possible until we've assigned all its cores
       for (worker <- workers if worker.coresFree > 0 && worker.state == WorkerState.ALIVE) {
         for (app <- waitingApps if app.coresLeft > 0) {
-          if (canUse(app, worker)) {
-            val coresToUse = math.min(worker.coresFree, app.coresLeft)
-            if (coresToUse > 0) {
-              val exec = app.addExecutor(worker, coresToUse)
-              launchExecutor(worker, exec)
-              app.state = ApplicationState.RUNNING
-            }
-          }
+          allocateWorkerResourceToExecutors(app, app.coresLeft, worker)
         }
       }
     }
   }

-  private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc) {
+  /**
+   * allocate resources in a certain worker to one or more executors
+   * @param app the info of the application which the executors belong to
+   * @param coresDemand the total number of cores to be allocated to this application
+   * @param worker the worker info
+   */
+  private def allocateWorkerResourceToExecutors(
+      app: ApplicationInfo,
+      coresDemand: Int,
+      worker: WorkerInfo): Unit = {
+    if (canUse(app, worker)) {
--- End diff --

unindent this block by 2 spaces
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27916164

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -374,6 +374,8 @@ object SparkSubmit {
       OptionAssigner(args.jars, YARN, CLUSTER, clOption = "--addJars"),

       // Other options
+      OptionAssigner(args.executorCores, STANDALONE, ALL_DEPLOY_MODES,
+        sysProp = "spark.deploy.maxCoresPerExecutor"),
--- End diff --

please revert this change. I don't think we want to confuse `--executor-cores` with the new config
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27916121

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -480,10 +480,15 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S
     | Spark standalone and Mesos only:
     |  --total-executor-cores NUM  Total cores for all executors.
     |
+    | Spark standalone and YARN only:
+    |  --executor-cores NUM        Number of cores to use on each executor. Default:
+    |                              1 in YARN mode; in standalone mode, all cores in a worker
+    |                              allocated to an application will be assigned to a single
+    |                              executor.
+    |
     | YARN-only:
     |  --driver-cores NUM          Number of cores used by the driver, only in cluster mode
     |                              (Default: 1).
-    |  --executor-cores NUM        Number of cores per executor (Default: 1).
--- End diff --

please revert these changes
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27916107

--- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -20,12 +20,13 @@ package org.apache.spark.deploy
 private[spark] class ApplicationDescription(
     val name: String,
     val maxCores: Option[Int],
-    val memoryPerSlave: Int,
+    val memoryPerExecutorMB: Int,
     val command: Command,
     var appUiUrl: String,
     val eventLogDir: Option[String] = None,
     // short name of compression codec used when writing event logs, if any (e.g. lzf)
-    val eventLogCodec: Option[String] = None)
+    val eventLogCodec: Option[String] = None,
+    val maxCorePerExecutor: Option[Int] = None)
--- End diff --

maxCoresPerExecutor
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27915710

--- Diff: docs/configuration.md ---
@@ -714,6 +714,15 @@ Apart from these, the following properties are also available, and may be useful
+  spark.deploy.maxCoresPerExecutor
+  (infinite)
+
+    The maximum number of cores given to the executor. When this parameter is set, Spark will try to
+    run more than 1 executors on each worker in standalone mode; otherwise, only one executor is
+    launched on each worker.
--- End diff --

We should note that this is 1 executor per application. Technically a worker can still run multiple executors if they belong to different applications. I would rephrase this as:

"The maximum number of cores given to an executor. When this parameter is set, multiple executors from the same application may run on the same worker, each with cores equal to or fewer than the configured value. Otherwise, at most one executor per application may run on each worker, as that executor will acquire all the worker's cores by default. This is used in standalone mode only."
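As a usage illustration only, assuming the `spark.deploy.maxCoresPerExecutor` name under discussion (comments earlier in this thread note it was later replaced by `spark.executor.cores`), setting the cap from an application might look like:

```
import org.apache.spark.SparkConf

// Cap every executor of this application at 2 cores: an 8-core worker with
// enough free memory can then host up to 4 executors of the same app.
val conf = new SparkConf()
  .setMaster("spark://master:7077")   // illustrative master URL
  .setAppName("multi-executor-demo")  // illustrative app name
  .set("spark.deploy.maxCoresPerExecutor", "2")
```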
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89811935

@andrewor14 I implemented what you proposed; the only difference might be that, in the while condition, we also need to consider the memory constraint on the worker. Thanks for the review. I just uploaded the snapshots: the second picture shows that if we set per-executor memory to 6G, a worker with only 10G of memory can afford just one executor even if it has extra cores.
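A self-contained sketch of that combined check (the `WorkerInfo` here is a simplification with just the two fields that matter, not Spark's class; the numbers match the 10G/6G example):

```
case class WorkerInfo(var coresFree: Int, var memoryFree: Int)

// Keep carving executors out of one worker only while it can afford BOTH the
// cores and another executor's worth of memory. With 6144 MB per executor, a
// worker with 10240 MB free hosts a single executor even with spare cores.
def executorsOnWorker(w: WorkerInfo, coresPerExecutor: Int, memoryPerExecutorMB: Int): Int = {
  var launched = 0
  while (w.coresFree >= coresPerExecutor && w.memoryFree >= memoryPerExecutorMB) {
    w.coresFree -= coresPerExecutor
    w.memoryFree -= memoryPerExecutorMB
    launched += 1
  }
  launched
}

assert(executorsOnWorker(WorkerInfo(16, 10240), 2, 6144) == 1)
```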
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89811465

![image](https://cloud.githubusercontent.com/assets/678008/6997683/a84d083e-db93-11e4-8e02-8363075293f8.png)
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89811438

![image](https://cloud.githubusercontent.com/assets/678008/6997678/9d33b010-db93-11e4-88d2-a6e0c225ab88.png)
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89792165

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29729/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89776454

Test FAILed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29728/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89773667

Test FAILed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29726/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89103420

I see. Sorry for the misunderstanding before, I will fix it ASAP. The approach you proposed is quite neat~~
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-89100311

@CodingCat I never suggested that we grab all cores in spread out mode. The decision of how many cores to give each worker is the same as before. What's different is how we translate those cores into executors. Previously we launch one executor with all the cores given to a worker. Now I am suggesting that we launch multiple executors on the worker, each of which has at most `spark.deploy.maxCoresPerExecutor` cores. Note that if `maxCoresPerExecutor` is not defined, the behavior is the same as the old one, where we just launch one giant executor on the worker with all the cores it has been given. Does that make sense?
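A small sketch of the translation described here, with the unset cap modeled as `Int.MaxValue`; the helper name `executorSizes` is illustrative, not from the patch:

```
// Turn one worker's core grant into executor sizes. When the cap is unset we
// treat it as unbounded, so the whole grant becomes a single executor --
// exactly the old one-executor-per-worker behavior.
def executorSizes(coresGiven: Int, maxCoresPerExecutor: Option[Int]): Seq[Int] = {
  val cap = maxCoresPerExecutor.getOrElse(Int.MaxValue)
  Iterator.iterate(coresGiven)(_ - cap).takeWhile(_ > 0).map(math.min(cap, _)).toSeq
}

assert(executorSizes(8, None) == Seq(8))          // old behavior: one giant executor
assert(executorSizes(8, Some(3)) == Seq(3, 3, 2)) // new behavior: several smaller ones
```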
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86806544

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29270/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86806537

[Test build #29270 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29270/consoleFull) for PR 731 at commit [`193d687`](https://github.com/apache/spark/commit/193d6873a33cd8582c329e62d34c5fe52d5b6c1e).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86798128

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29269/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86798107

[Test build #29269 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29269/consoleFull) for PR 731 at commit [`f74cd8e`](https://github.com/apache/spark/commit/f74cd8e29368add06e6fc960cf11988f25d5418e).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86788502

[Test build #29270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29270/consoleFull) for PR 731 at commit [`193d687`](https://github.com/apache/spark/commit/193d6873a33cd8582c329e62d34c5fe52d5b6c1e).

* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86788364

snapshot ![image](https://cloud.githubusercontent.com/assets/678008/6860808/99e0f450-d403-11e4-8ac0-2191640c0412.png)
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86787320

@andrewor14 I think in spreadApp mode, grabbing all cores when `spark.deploy.maxCoresPerExecutor` is not defined is *not* the right approach... if we grab all cores, it is exactly equivalent to `non-spreadApp mode` in the implementation in the current master branch. Instead, we should traverse all workers one by one: for each visit, we allocate 1 free core to the executor when `spark.deploy.maxCoresPerExecutor == None`; otherwise, for each visit, we assign at most `spark.deploy.maxCoresPerExecutor` cores. Am I right?
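A self-contained sketch of the traversal described above (worker state is reduced to a free-core count; `coresPerVisit` stands in for `spark.deploy.maxCoresPerExecutor`, defaulting to 1 when unset):

```
// Spread-out mode: visit workers round-robin, granting at most `coresPerVisit`
// cores per visit, until the app's demand is met or no free cores remain.
def spreadOut(freeCores: Array[Int], demand: Int, coresPerVisit: Int): Array[Int] = {
  val assigned = Array.fill(freeCores.length)(0)
  var left = demand
  var pos = 0
  while (left > 0 && freeCores.exists(_ > 0)) {
    val grant = math.min(coresPerVisit, math.min(freeCores(pos), left))
    freeCores(pos) -= grant
    assigned(pos) += grant
    left -= grant
    pos = (pos + 1) % freeCores.length
  }
  assigned
}

// 3 workers with 4 free cores each, demand of 6, 1 core per visit:
// the cores end up spread as evenly as possible.
assert(spreadOut(Array(4, 4, 4), demand = 6, coresPerVisit = 1).toSeq == Seq(2, 2, 2))
```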
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-86785383

[Test build #29269 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29269/consoleFull) for PR 731 at commit [`f74cd8e`](https://github.com/apache/spark/commit/f74cd8e29368add06e6fc960cf11988f25d5418e).

* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-85975750

@andrewor14 thanks for your patient review, I will address them on Thu. or Fri.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-85797557

Also, while I was reviewing this I had more thoughts on the appropriate config to use. Given the semantics that we want to support, I actually think it may not be correct to reuse `spark.executor.cores`. The reason is that here all we guarantee is that the executor gets *at most*, not exactly, `spark.executor.cores`. This is inconsistent with the semantics of this config in YARN. Instead, maybe it makes sense to configure this in a new config: `spark.deploy.maxCoresPerExecutor`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-85795396

@CodingCat Thanks for merging the two cases. However, I still find the code overly complex. In particular, I think we can modify far less by rewriting [this section](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L578) of the original code with something like the following:

```
// Now that we've decided how many cores to give on each node, let's actually give them
for (pos <- 0 until numUsable) {
  while (assigned(pos) > 0) {
    val maxCoresPerExecutor = app.maxCoresPerExecutor.getOrElse(Int.MaxValue)
    val coresForThisExecutor = math.min(maxCoresPerExecutor, assigned(pos))
    val exec = app.addExecutor(usableWorkers(pos), coresForThisExecutor)
    assigned(pos) -= coresForThisExecutor
    launchExecutor(usableWorkers(pos), exec)
    app.state = ApplicationState.RUNNING
  }
}
```

then in the else case (where we don't spread out apps), we have to do something similar.

Another potential issue is that if we schedule multiple executors that belong to the same app on the same worker, we might not be able to distinguish them. This is what Patrick was suggesting on the [JIRA](https://issues.apache.org/jira/browse/SPARK-1706) (see point 3). I haven't verified whether this is actually a problem, but it would be good to keep it in mind.
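For completeness, a purely speculative sketch of what "something similar" in the else case might look like, reusing the names from the snippet above (this is not the code that was merged):

```
// else branch (don't spread out): carve as many executors as possible out of
// the current worker before moving on to the next one.
for (worker <- workers if worker.coresFree > 0 && worker.state == WorkerState.ALIVE) {
  for (app <- waitingApps if app.coresLeft > 0) {
    while (canUse(app, worker) && app.coresLeft > 0) {
      val maxCoresPerExecutor = app.maxCoresPerExecutor.getOrElse(Int.MaxValue)
      val coresForThisExecutor =
        math.min(maxCoresPerExecutor, math.min(worker.coresFree, app.coresLeft))
      val exec = app.addExecutor(worker, coresForThisExecutor)
      launchExecutor(worker, exec)
      app.state = ApplicationState.RUNNING
    }
  }
}
```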
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27090207

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -513,89 +513,122 @@ private[spark] class Master(
   }

   /**
-   * Can an app use the given worker? True if the worker has enough memory and we haven't already
-   * launched an executor for the app on it (right now the standalone backend doesn't like having
-   * two executors on the same worker).
+   * Can an app use the given worker?
    */
-  def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
-    worker.memoryFree >= app.desc.memoryPerSlave && !worker.hasExecutor(app)
+  private def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
+    val enoughResources = worker.memoryFree >= app.desc.memoryPerExecutorMB && worker.coresFree > 0
+    val allowToExecute = app.desc.maxCorePerExecutor.isDefined || !worker.hasExecutor(app)
+    allowToExecute && enoughResources
--- End diff --

I'm not sure if we need `allowToExecute` here. If `maxCorePerExecutor` is not defined then we will grab all the cores, in which case `worker.coresFree` will be 0.
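In other words, the predicate could arguably shrink to the resource test alone; a sketch of that simplification, in the style of the quoted code (not the merged version):

```
// When maxCorePerExecutor is unset, the app's lone executor on this worker has
// already grabbed every core, so `worker.coresFree > 0` alone excludes that
// worker and the explicit hasExecutor() test becomes redundant.
private def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
  worker.memoryFree >= app.desc.memoryPerExecutorMB && worker.coresFree > 0
}
```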
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27089454

--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -513,89 +513,122 @@ private[spark] class Master(
   }

   /**
-   * Can an app use the given worker? True if the worker has enough memory and we haven't already
-   * launched an executor for the app on it (right now the standalone backend doesn't like having
-   * two executors on the same worker).
+   * Can an app use the given worker?
    */
-  def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
-    worker.memoryFree >= app.desc.memoryPerSlave && !worker.hasExecutor(app)
+  private def canUse(app: ApplicationInfo, worker: WorkerInfo): Boolean = {
+    val enoughResources = worker.memoryFree >= app.desc.memoryPerExecutorMB && worker.coresFree > 0
+    val allowToExecute = app.desc.maxCorePerExecutor.isDefined || !worker.hasExecutor(app)
+    allowToExecute && enoughResources
   }

   /**
-   * Schedule the currently available resources among waiting apps. This method will be called
-   * every time a new app joins or resource availability changes.
+   * This functions starts one or more executors on each worker.
+   *
+   * It traverses the available worker list. In spreadOutApps mode, it allocates at most
+   * spark.executor.cores (multiple executors per worker) or 1 core(s) (one executor per worker)
+   * for each visit of the worker (can be less than it when the worker does not have enough cores
+   * or the demand is less than it) and app.desc.memoryPerExecutorMB megabytes memory and tracks
+   * the resource allocation in a 2d array for each visit; Otherwise, it allocates at most
+   * spark.executor.cores (multiple executors per worker) or worker.freeCores (one executor per
+   * worker) cores and app.desc.memoryPerExecutorMB megabytes to each executor.
    */
-  private def schedule() {
-    if (state != RecoveryState.ALIVE) { return }
-
-    // First schedule drivers, they take strict precedence over applications
-    // Randomization helps balance drivers
-    val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
-    val numWorkersAlive = shuffledAliveWorkers.size
-    var curPos = 0
-
-    for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
-      // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
-      // start from the last worker that was assigned a driver, and continue onwards until we have
-      // explored all alive workers.
-      var launched = false
-      var numWorkersVisited = 0
-      while (numWorkersVisited < numWorkersAlive && !launched) {
-        val worker = shuffledAliveWorkers(curPos)
-        numWorkersVisited += 1
-        if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
-          launchDriver(worker, driver)
-          waitingDrivers -= driver
-          launched = true
-        }
-        curPos = (curPos + 1) % numWorkersAlive
-      }
-    }
-
-    // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
-    // in the queue, then the second app, etc.
+  private def startExecutorsOnWorker() {
     if (spreadOutApps) {
-      // Try to spread out each app among all the nodes, until it has all its cores
       for (app <- waitingApps if app.coresLeft > 0) {
-        val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
-          .filter(canUse(app, _)).sortBy(_.coresFree).reverse
+        val memoryPerExecutor = app.desc.memoryPerExecutorMB
+        val usableWorkers = workers.filter(_.state == WorkerState.ALIVE).
+          filter(canUse(app, _)).toArray.sortBy(_.memoryFree / memoryPerExecutor).reverse
+        // the maximum number of cores allocated on each executor per visit on the worker list
+        val maxCoreAllocationPerRound = app.desc.maxCorePerExecutor.getOrElse(1)
+        var maxCoresLeft = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
         val numUsable = usableWorkers.length
-        val assigned = new Array[Int](numUsable) // Number of cores to give on each node
-        var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
+        val maxExecutorPerWorker = {
+          if (app.desc.maxCorePerExecutor.isDefined) {
+            usableWorkers(0).memoryFree / memoryPerExecutor
+          } else {
+            1
+          }
+        }
+        // A 2D array that tracks the number of cores used by each executor launched on
+
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27089201

--- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -41,5 +41,8 @@ private[spark] class ApplicationDescription(
     new ApplicationDescription(
       name, maxCores, memoryPerSlave, command, appUiUrl, eventLogDir, eventLogCodec)

+  // only valid when spark.executor.multiPerWorker is set to true
+  var maxCorePerExecutor: Option[Int] = None
--- End diff --

this is a `private[spark]` class so it won't matter. Also it should be `maxCoresPerExecutor`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27089134

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -498,11 +498,18 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
     |
     | Spark standalone and Mesos only:
     |  --total-executor-cores NUM  Total cores for all executors.
-    |
+    |
+    | Spark standalone and YARN only:
+    |  --executor-cores NUM        Number of cores per executor. Default value: 1 (YARN), 0 (
+    |                              Standalone). In Standalone mode, Spark will try to run more
+    |                              than 1 executors on each worker in standalone mode;
+    |                              otherwise, only one executor on each executor is allowed
+    |                              (the executor will take all available cores of the worker at
+    |                              the moment.
--- End diff --

I don't understand what this is saying. Maybe you mean

```
Number of cores to use on each executor. In standalone mode, the executor will use all available cores on the worker if this is not specified. (Default: 1 in YARN mode, all available cores in standalone mode)
```
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27088973

--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -351,6 +351,8 @@ object SparkSubmit {
       OptionAssigner(args.ivyRepoPath, STANDALONE, CLUSTER, sysProp = "spark.jars.ivy"),
       OptionAssigner(args.driverMemory, STANDALONE, CLUSTER, sysProp = "spark.driver.memory"),
       OptionAssigner(args.driverCores, STANDALONE, CLUSTER, sysProp = "spark.driver.cores"),
+      OptionAssigner(args.executorCores, STANDALONE, ALL_DEPLOY_MODES,
+        sysProp = "spark.executor.cores"),
--- End diff --

this is in the wrong place because it's not just standalone cluster mode. It should be under `// Other options` at L378
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27088974

--- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -41,5 +41,8 @@ private[spark] class ApplicationDescription(
     new ApplicationDescription(
       name, maxCores, memoryPerSlave, command, appUiUrl, eventLogDir, eventLogCodec)

+  // only valid when spark.executor.multiPerWorker is set to true
+  var maxCorePerExecutor: Option[Int] = None
--- End diff --

but will MIMA be happy about it?
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/731#discussion_r27088932

--- Diff: core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -41,5 +41,8 @@ private[spark] class ApplicationDescription(
     new ApplicationDescription(
       name, maxCores, memoryPerSlave, command, appUiUrl, eventLogDir, eventLogCodec)

+  // only valid when spark.executor.multiPerWorker is set to true
+  var maxCorePerExecutor: Option[Int] = None
--- End diff --

This shouldn't be a `var` because we only ever set it in one place (when we create it). This should just go into the constructor.
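The shape of the suggested change, sketched with the field list abbreviated to the ones relevant here (the name follows the `maxCoresPerExecutor` correction above):

```
private[spark] class ApplicationDescription(
    val name: String,
    val maxCores: Option[Int],
    val memoryPerExecutorMB: Int,
    // an immutable constructor parameter instead of a var mutated after
    // construction:
    val maxCoresPerExecutor: Option[Int] = None)
```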
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-78555292

@andrewor14 would you mind taking another look when you have time?
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-77430389

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28302/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-77430367

[Test build #28302 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28302/consoleFull) for PR 731 at commit [`dd39148`](https://github.com/apache/spark/commit/dd391484356e3af490a8aa61dbe1b06e2db10d56).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-77412507

@Du-Li, thanks for looking at the patch. This patch is still under review; we will get it merged ASAP.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-77412586

[Test build #28302 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28302/consoleFull) for PR 731 at commit [`dd39148`](https://github.com/apache/spark/commit/dd391484356e3af490a8aa61dbe1b06e2db10d56).

* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user Du-Li commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-77408577

Has this PR been merged/released? It's a very useful patch. Thanks for working it out.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-75487977

[Test build #27845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27845/consoleFull) for PR 731 at commit [`2c6d26a`](https://github.com/apache/spark/commit/2c6d26a9ba18c628c6a6d0b471d7be01c07dddb1).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-75487978

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27845/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-75484698

**[Test build #27844 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27844/consoleFull)** for PR 731 at commit [`97ed489`](https://github.com/apache/spark/commit/97ed489597ef3599f93a1a06fa889e1536410c88) after a configured wait of `120m`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-75484704

Test FAILed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27844/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-75484429

Test FAILed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27843/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/731#issuecomment-75484422

**[Test build #27843 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27843/consoleFull)** for PR 731 at commit [`d941fa9`](https://github.com/apache/spark/commit/d941fa916b0b77ee00bc4f81fd6157193ed89b9d) after a configured wait of `120m`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75483845 [Test build #27845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27845/consoleFull) for PR 731 at commit [`2c6d26a`](https://github.com/apache/spark/commit/2c6d26a9ba18c628c6a6d0b471d7be01c07dddb1). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75477819 [Test build #27844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27844/consoleFull) for PR 731 at commit [`97ed489`](https://github.com/apache/spark/commit/97ed489597ef3599f93a1a06fa889e1536410c88). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75477541 [Test build #27843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27843/consoleFull) for PR 731 at commit [`d941fa9`](https://github.com/apache/spark/commit/d941fa916b0b77ee00bc4f81fd6157193ed89b9d). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75475562 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27842/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75475561 **[Test build #27842 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27842/consoleFull)** for PR 731 at commit [`d6c19b4`](https://github.com/apache/spark/commit/d6c19b42e29944f0d6a0ff3c32741e0ce6a9a596) after a configured wait of `120m`.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75468070 Hi @andrewor14, thanks for your review. I think I misunderstood your original point. I have now merged the two functions (for starting one or more executors). The difference between my current approach and your proposal is that I didn't introduce `spark.deploy.executorsPerWorker`; instead, the number of executors on each worker is determined by the worker's available cores and memory at the moment of scheduling. Screenshots of 4 executors running on 2 workers are attached in my last comment.
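For readers following the thread, here is a minimal sketch of the scheduling idea described in the comment above. The helper name and parameters are illustrative, not the patch's actual code:

```scala
// Hypothetical sketch: with no fixed executors-per-worker setting, the number
// of executors an application can place on a worker follows from the worker's
// free resources at scheduling time and the app's requested executor size.
def maxExecutorsOnWorker(
    coresFree: Int,
    memoryFreeMB: Int,
    coresPerExecutor: Int,    // e.g. derived from spark.executor.cores
    memoryPerExecutorMB: Int  // e.g. derived from spark.executor.memory
  ): Int = {
  // Whichever resource runs out first bounds the executor count.
  math.min(coresFree / coresPerExecutor, memoryFreeMB / memoryPerExecutorMB)
}
```

Under this model, a worker with 8 free cores and 16384 MB of free memory could host two 4-core/8 GB executors of the same application, with no extra deploy-side knob.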
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75467548 [Test build #27842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27842/consoleFull) for PR 731 at commit [`d6c19b4`](https://github.com/apache/spark/commit/d6c19b42e29944f0d6a0ff3c32741e0ce6a9a596). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75466570 ![image](https://cloud.githubusercontent.com/assets/678008/6320817/b3d4efb2-bab6-11e4-9615-1530d981e286.png) ![image](https://cloud.githubusercontent.com/assets/678008/6320824/c21b00fc-bab6-11e4-8b70-f6c945052527.png)
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75448691 [Test build #27839 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27839/consoleFull) for PR 731 at commit [`5f90cd1`](https://github.com/apache/spark/commit/5f90cd1021b68a9ceb93c9412e84c76ce193df2d). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75448693 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27839/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75444628 [Test build #27839 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27839/consoleFull) for PR 731 at commit [`5f90cd1`](https://github.com/apache/spark/commit/5f90cd1021b68a9ceb93c9412e84c76ce193df2d). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75444258 [Test build #27838 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27838/consoleFull) for PR 731 at commit [`7fd8f7b`](https://github.com/apache/spark/commit/7fd8f7b93c664490b726483b402ba953749e4ac4). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75444259 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27838/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75444168 [Test build #27838 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27838/consoleFull) for PR 731 at commit [`7fd8f7b`](https://github.com/apache/spark/commit/7fd8f7b93c664490b726483b402ba953749e4ac4). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75443741 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27836/
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75443738 [Test build #27836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27836/consoleFull) for PR 731 at commit [`f4abf0c`](https://github.com/apache/spark/commit/f4abf0c9b24b73b05642ee330d95e2743bac1b16). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75443622 [Test build #27836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27836/consoleFull) for PR 731 at commit [`f4abf0c`](https://github.com/apache/spark/commit/f4abf0c9b24b73b05642ee330d95e2743bac1b16). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/731#discussion_r25118056
--- Diff: docs/configuration.md ---
@@ -1207,6 +1207,25 @@ Apart from these, the following properties are also available, and may be useful
 (existing configuration table row: `spark.ui.view.acls`, default: Empty)
+(new configuration table row added here: `spark.executor.multiPerWorker`)
--- End diff --
This is fundamentally a worker property, not an executor property, as it's the worker that reads it and makes decisions based on it, right? If that's the case, this shouldn't be under the `spark.executor` namespace.
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/731#issuecomment-75347509 @CodingCat I looked at this patch much more closely, and I still don't see a need to separate the single and the multiple executors per worker cases. More specifically, I see the single executor case as a special case of the multiple executors case, where each element of your 2D array will have a list of 1 element (because there's only one executor). I think it makes more sense to configure the number of executors per worker directly. Perhaps we need a config that looks something like `spark.deploy.executorsPerWorker`. Then, to prevent one executor from grabbing all the cores on that worker, the user will also need to set `spark.executor.cores`. In fact, we're doing something fairly similar in Mesos coarse-grained mode in PR #4027. It would be good to model the general structure of the changes here after that one.
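A hedged illustration of how the two settings in this proposal would fit together. Note that `spark.deploy.executorsPerWorker` was only a suggestion in this thread, so that key is hypothetical:

```scala
import org.apache.spark.SparkConf

// Hypothetical configuration under the proposal above: ask for two executors
// per worker, and cap each executor's size explicitly so that a single
// executor does not grab every core on the worker.
val conf = new SparkConf()
  .set("spark.deploy.executorsPerWorker", "2") // proposed key only, never shipped
  .set("spark.executor.cores", "4")            // bound each executor to 4 cores
  .set("spark.executor.memory", "4g")          // bound each executor's memory
```

As the surrounding discussion notes, the approach that ultimately prevailed in this PR skips the per-worker count and instead derives the number of executors from the worker's free cores and memory, making the single-executor case fall out naturally rather than being a separate code path.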
[GitHub] spark pull request: SPARK-1706: Allow multiple executors per worke...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/731#discussion_r25116514
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/ui/ApplicationPage.scala ---
@@ -74,18 +74,18 @@ private[spark] class ApplicationPage(parent: MasterWebUI) extends WebUIPage("app
 Name: {app.desc.name}
 User: {app.desc.user}
 Cores:
-{
+ {
   if (app.desc.maxCores.isEmpty) {
     "Unlimited (%s granted)".format(app.coresGranted)
   } else {
     "%s (%s granted, %s left)".format(
       app.desc.maxCores.get, app.coresGranted, app.coresLeft)
   }
-}
+ }
--- End diff --
Let's revert these indentation changes.