[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38534171
  
One or more automated tests failed
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13423/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread kayousterhout
Github user kayousterhout commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38592694
  
LGTM




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38600307
  
One or more automated tests failed
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13437/




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38600305
  
Merged build finished.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38601404
  
Jenkins, retest this please




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38601384
  
Failure was in recovery with file input stream.recovery with file input 
stream -- something that I think is completely unrelated to this change. So 
let's try again ...




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38601510
  
 Merged build triggered.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38608422
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13440/




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-25 Thread mridulm
Github user mridulm commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38608715
  
Looks good, thanks for the change - makes things cleaner.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread shivaram
GitHub user shivaram opened a pull request:

https://github.com/apache/spark/pull/219

Fix scheduler to account for tasks using > 1 CPUs.

Move CPUS_PER_TASK to TaskSchedulerImpl as the value is a constant and use 
it in both Mesos and CoarseGrained scheduler backends.
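
As a rough illustration of the accounting this change is about (not the actual Spark code), the sketch below charges every launched task `cpusPerTask` cores instead of one. `CpuAccountingSketch`, `Offer`, and `assignTasks` are hypothetical names made up for this example; only the idea of a per-task CPU count comes from the description above.

```scala
// Hypothetical sketch only: none of these names exist in Spark.
object CpuAccountingSketch {
  // Stand-in for a resource offer: an executor id and its free cores.
  final case class Offer(executorId: String, cores: Int)

  /** Assigns up to `pendingTasks` tasks round-robin across the offers,
    * charging `cpusPerTask` cores for every launch.
    * Returns the number of tasks launched per executor id. */
  def assignTasks(offers: Seq[Offer], pendingTasks: Int, cpusPerTask: Int): Map[String, Int] = {
    require(cpusPerTask >= 1, "cpusPerTask must be at least 1")
    val available = offers.map(_.cores).toArray
    val launched = Array.fill(offers.length)(0)
    var remaining = pendingTasks
    var launchedAny = true
    // Keep making passes over the offers until nothing more fits.
    while (remaining > 0 && launchedAny) {
      launchedAny = false
      for (i <- offers.indices) {
        // Launch only when a full task's worth of cores is free ...
        if (remaining > 0 && available(i) >= cpusPerTask) {
          // ... and deduct that many cores, not just one.
          available(i) -= cpusPerTask
          launched(i) += 1
          remaining -= 1
          launchedAny = true
        }
      }
    }
    offers.map(_.executorId).zip(launched).toMap
  }
}
```

With offers of 4 and 3 cores and two CPUs per task, this launches two tasks on the first executor and one on the second; the leftover core on the second executor stays idle because it cannot hold a whole task.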

Thanks @kayousterhout for the design discussion

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shivaram/spark-1 multi-cpus

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/219.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #219


commit 647bc4582bb350faf69d1969072a2cc79f2f553e
Author: Shivaram Venkataraman shiva...@eecs.berkeley.edu
Date:   2014-03-25T01:53:03Z

Fix scheduler to account for tasks using > 1 CPUs.
Move CPUS_PER_TASK to TaskSchedulerImpl as the value is a constant
and use it in both Mesos and CoarseGrained scheduler backends.






[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/219#discussion_r10915292
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -62,6 +62,9 @@ private[spark] class TaskSchedulerImpl(
   // Threshold above which we warn user initial TaskSet may be starved
   val STARVATION_TIMEOUT = conf.getLong("spark.starvation.timeout", 15000)
 
+  // CPUs to request per task
+  val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
--- End diff --

Is this intentionally left as an undocumented parameter?




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/219#discussion_r10915339
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -62,6 +62,9 @@ private[spark] class TaskSchedulerImpl(
   // Threshold above which we warn user initial TaskSet may be starved
   val STARVATION_TIMEOUT = conf.getLong("spark.starvation.timeout", 15000)
 
+  // CPUs to request per task
+  val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
--- End diff --

Hmm - we could document this in docs/configuration.md -- but I don't think 
this is a commonly used flag, and it has been around for a while I guess. 




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/219#discussion_r10915407
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -62,6 +62,9 @@ private[spark] class TaskSchedulerImpl(
   // Threshold above which we warn user initial TaskSet may be starved
   val STARVATION_TIMEOUT = conf.getLong("spark.starvation.timeout", 15000)
 
+  // CPUs to request per task
+  val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
--- End diff --

I guess documenting it is better -- added a commit for that.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38524982
  
FYI - there was a test failure in TaskSetManagerSuite as a unit test was 
checking for availableCpus being zero. So I added back a non-zero check in 
TaskSetManager.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38526638
  
One or more automated tests failed
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13414/




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38526636
  
Merged build finished.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38526736
  
 Merged build triggered.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38527171
  
One or more automated tests failed
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13417/




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38527170
  
Merged build finished.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38529622
  
Jenkins, retest this please




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38529627
  
Merged build started.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38529626
  
 Merged build triggered.




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request:

https://github.com/apache/spark/pull/219#discussion_r10917822
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -388,7 +385,7 @@ private[spark] class TaskSetManager(
   maxLocality: TaskLocality.TaskLocality)
 : Option[TaskDescription] =
   {
-if (!isZombie && availableCpus >= CPUS_PER_TASK) {
+if (!isZombie && availableCpus > 0) {
--- End diff --

Can you just remove the availableCpus parameter?  It doesn't look like it's 
used anymore




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/219#issuecomment-38531549
  
All automated tests passed.
Refer to this link for build results: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13421/




[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/219#discussion_r10918164
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -388,7 +385,7 @@ private[spark] class TaskSetManager(
   maxLocality: TaskLocality.TaskLocality)
 : Option[TaskDescription] =
   {
-if (!isZombie && availableCpus >= CPUS_PER_TASK) {
+if (!isZombie && availableCpus > 0) {
--- End diff --

Makes sense -- Removed it now


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] spark pull request: Fix scheduler to account for tasks using 1 C...

2014-03-24 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/219#discussion_r10918370
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -62,6 +62,9 @@ private[spark] class TaskSchedulerImpl(
   // Threshold above which we warn user initial TaskSet may be starved
   val STARVATION_TIMEOUT = conf.getLong("spark.starvation.timeout", 15000)
 
+  // CPUs to request per task
+  val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
--- End diff --

Ah I see that it wasn't part of your change. Anyways it might make sense to 
document it.

