[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2533





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18127631
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+/**
+ * Grouping of data that is accessed by a CourseGrainedScheduler. This class
--- End diff --

Course -> Coarse. I will fix it during merge.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57075299
  
Thanks. Merging in master.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126678
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
+  new WorkerOffer( id, executorData.executorHost, 
executorData.freeCores)}.toSeq))
--- End diff --

done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126677
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
--- End diff --

done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126669
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +176,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorDataMap(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorDataMap.get(executorId) match {
+case Some(executorInfo) =>
+  val numCores = executorInfo.totalCores
--- End diff --

expression inlined
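
For readers skimming the thread, a minimal self-contained sketch of the inlining in question; ExecutorStub is a hypothetical stand-in for the PR's ExecutorData, and totalCoreCount is assumed to be the backend's AtomicInteger counter.

    import java.util.concurrent.atomic.AtomicInteger

    // Hypothetical stand-in for ExecutorData, keeping only the field used here.
    case class ExecutorStub(totalCores: Int)

    val totalCoreCount = new AtomicInteger(8)
    val executorInfo = ExecutorStub(totalCores = 4)

    // Before: the intermediate val the review comment refers to.
    //   val numCores = executorInfo.totalCores
    //   totalCoreCount.addAndGet(-numCores)

    // After: the expression inlined directly into the call.
    totalCoreCount.addAndGet(-executorInfo.totalCores)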





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126667
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -297,6 +291,7 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 
+
--- End diff --

removed





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126665
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -85,16 +79,14 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 def receiveWithLogging = {
   case RegisterExecutor(executorId, hostPort, cores) =>
 Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
-if (executorActor.contains(executorId)) {
+if (executorDataMap.contains(executorId)) {
   sender ! RegisterExecutorFailed("Duplicate executor ID: " + 
executorId)
 } else {
   logInfo("Registered executor: " + sender + " with ID " + 
executorId)
   sender ! RegisteredExecutor
-  executorActor(executorId) = sender
-  executorHost(executorId) = Utils.parseHostPort(hostPort)._1
-  totalCores(executorId) = cores
-  freeCores(executorId) = cores
-  executorAddress(executorId) = sender.path.address
+  executorDataMap.put(executorId,  new ExecutorData(sender, 
sender.path.address,
--- End diff --

Done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126663
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -126,8 +120,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 
   case StopExecutors =>
 logInfo("Asking each executor to shut down")
-for (executor <- executorActor.values) {
-  executor ! StopExecutor
+for ((_,executorData) <- executorDataMap) {
--- End diff --

Done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126658
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
+   var executorActor: ActorRef,
--- End diff --

Good point - All but freeCores changed to vals
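
Putting the review feedback on this file together (a doc comment, the Coarse spelling, vals for everything except freeCores, no empty braces), a sketch of roughly how ExecutorData could end up; the doc wording is illustrative, not necessarily the merged text.

    package org.apache.spark.scheduler.cluster

    import akka.actor.{ActorRef, Address}

    /**
     * Grouping of data for an executor used by a CoarseGrainedSchedulerBackend
     * (illustrative doc comment, not the merged wording).
     */
    private[cluster] class ExecutorData(
        val executorActor: ActorRef,
        val executorAddress: Address,
        val executorHost: String,
        var freeCores: Int,
        val totalCores: Int
    )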





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126545
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -104,13 +96,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   case StatusUpdate(executorId, taskId, state, data) =>
 scheduler.statusUpdate(taskId, state, data.value)
 if (TaskState.isFinished(state)) {
-  if (executorActor.contains(executorId)) {
-freeCores(executorId) += scheduler.CPUS_PER_TASK
-makeOffers(executorId)
-  } else {
-// Ignoring the update since we don't know about the executor.
-val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s"
-logWarning(msg.format(taskId, state, sender, executorId))
+  executorDataMap.get(executorId) match {
+case Some(executorInfo) =>
+  executorInfo.freeCores += scheduler.CPUS_PER_TASK
+  makeOffers(executorId)
+case None =>
+  // Ignoring the update since we don't know about the executor.
+  val msg = "Ignored task status update (%d state %s) " +
--- End diff --

Done, and replaced the format call with an s-interpolated string.
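
For context, a self-contained sketch of that change; the stand-in values and println take the place of the real taskId/state/sender and Spark's logWarning, and the interpolated wording is illustrative rather than the merged line.

    // Stand-in values so the snippet runs on its own.
    val taskId = 42L
    val state = "FINISHED"
    val sender = "sparkDriverActor"
    val executorId = "exec-1"

    // Before (from the quoted diff): %-style format string plus .format(...).
    val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s"
    println(msg.format(taskId, state, sender, executorId))

    // After: the same message as a single s-interpolated string.
    println(s"Ignored task status update ($taskId state $state) " +
      s"from unknown executor $sender with ID $executorId")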





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126510
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
+   var executorActor: ActorRef,
+   var executorAddress: Address,
+   var executorHost: String ,
+   var freeCores: Int,
+   var totalCores: Int
+) {}
--- End diff --

done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-27 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18126499
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
--- End diff --

Done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57043734
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20907/





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57043731
  
  [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20907/consoleFull) for PR 2533 at commit [`6890663`](https://github.com/apache/spark/commit/6890663c9b15914cfb8523b6cf26871c0f1c2727).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57042814
  
LGTM pending a couple of stylistic comments





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122075
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -104,13 +96,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   case StatusUpdate(executorId, taskId, state, data) =>
 scheduler.statusUpdate(taskId, state, data.value)
 if (TaskState.isFinished(state)) {
-  if (executorActor.contains(executorId)) {
-freeCores(executorId) += scheduler.CPUS_PER_TASK
-makeOffers(executorId)
-  } else {
-// Ignoring the update since we don't know about the executor.
-val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s"
-logWarning(msg.format(taskId, state, sender, executorId))
+  executorDataMap.get(executorId) match {
+case Some(executorInfo) =>
+  executorInfo.freeCores += scheduler.CPUS_PER_TASK
+  makeOffers(executorId)
+case None =>
+  // Ignoring the update since we don't know about the executor.
+  val msg = "Ignored task status update (%d state %s) " +
--- End diff --

Not your code, but you probably don't need to put this in a val either





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122072
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
+  new WorkerOffer( id, executorData.executorHost, 
executorData.freeCores)}.toSeq))
--- End diff --

no space before id...





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122067
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
+  new WorkerOffer( id, executorData.executorHost, 
executorData.freeCores)}.toSeq))
--- End diff --

no space before id





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122066
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorDataMap.map{ case(id, executorData) =>
--- End diff --

space after map, and case
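
Applying these nits (space after map and case, no space before id), the quoted line would read roughly as below; the stub types exist only so the formatting example compiles on its own, and this is not the merged code.

    import scala.collection.mutable

    case class ExecutorStub(executorHost: String, freeCores: Int)
    case class WorkerOffer(executorId: String, host: String, cores: Int)
    val executorDataMap = mutable.HashMap("exec-1" -> ExecutorStub("host-a", 4))

    // Spark-style formatting of the line under review.
    val offers = executorDataMap.map { case (id, executorData) =>
      new WorkerOffer(id, executorData.executorHost, executorData.freeCores)
    }.toSeq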





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122064
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +176,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorDataMap(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorDataMap.get(executorId) match {
+case Some(executorInfo) =>
+  val numCores = executorInfo.totalCores
--- End diff --

you probably don't need to put this in a val





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122060
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -297,6 +291,7 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 
+
--- End diff --

random new line





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122048
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
+   var executorActor: ActorRef,
+   var executorAddress: Address,
+   var executorHost: String ,
+   var freeCores: Int,
+   var totalCores: Int
+) {}
--- End diff --

nit: you can remove the `{}`





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18122044
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -85,16 +79,14 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 def receiveWithLogging = {
   case RegisterExecutor(executorId, hostPort, cores) =>
 Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
-if (executorActor.contains(executorId)) {
+if (executorDataMap.contains(executorId)) {
   sender ! RegisterExecutorFailed("Duplicate executor ID: " + 
executorId)
 } else {
   logInfo("Registered executor: " + sender + " with ID " + 
executorId)
   sender ! RegisteredExecutor
-  executorActor(executorId) = sender
-  executorHost(executorId) = Utils.parseHostPort(hostPort)._1
-  totalCores(executorId) = cores
-  freeCores(executorId) = cores
-  executorAddress(executorId) = sender.path.address
+  executorDataMap.put(executorId,  new ExecutorData(sender, 
sender.path.address,
--- End diff --

2 spaces here, need only 1





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57042564
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20907/consoleFull) for PR 2533 at commit [`6890663`](https://github.com/apache/spark/commit/6890663c9b15914cfb8523b6cf26871c0f1c2727).
 * This patch merges cleanly.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57042526
  
I really like this change. Made some minor comments. 

Thanks!





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18121960
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -126,8 +120,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 
   case StopExecutors =>
 logInfo("Asking each executor to shut down")
-for (executor <- executorActor.values) {
-  executor ! StopExecutor
+for ((_,executorData) <- executorDataMap) {
--- End diff --

add a space after comma





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-57042458
  
test this please





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18121953
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
--- End diff --

maybe add some javadoc explaining what this is for ...





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-26 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18121952
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala ---
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import akka.actor.{Address, ActorRef}
+
+private[cluster] class ExecutorData(
+   var executorActor: ActorRef,
--- End diff --

should they be vals?





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18066186
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorData(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorData.get(executorId) match {
+case Some(executorInfo) =>
+  val numCores = executorInfo.totalCores
+  executorData.-=(executorId)
--- End diff --

typo fixed
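
The typo in question is the dotted call executorData.-=(executorId). Both forms below compile; the infix form is the idiomatic one (a minimal illustration, not the PR's code).

    import scala.collection.mutable

    val executorDataMap = mutable.HashMap("exec-1" -> 1, "exec-2" -> 2)

    executorDataMap.-=("exec-1")   // dotted form flagged in the review
    executorDataMap -= "exec-2"    // idiomatic infix form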





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18066183
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorData(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorData.get(executorId) match {
--- End diff --

Pattern matching on an Option is a clear and well-understood pattern; please let clarity rule over conciseness in this case.
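
A small self-contained illustration of the trade-off being discussed, with hypothetical names rather than the PR's code: handling both the found and not-found branches of an Option.

    import scala.collection.mutable

    val executorDataMap = mutable.HashMap("exec-1" -> 4)  // executorId -> totalCores

    // Pattern matching spells out both outcomes explicitly, as argued above.
    def removeExecutor(executorId: String, reason: String): Unit = {
      executorDataMap.get(executorId) match {
        case Some(totalCores) =>
          executorDataMap -= executorId
          println(s"Removed $executorId ($totalCores cores): $reason")
        case None =>
          println(s"Asked to remove unknown executor $executorId")
      }
    }

    // The getOrElse style suggested elsewhere in the thread is shorter,
    // but tucks the "found" branch inside a map.
    def removeExecutorConcise(executorId: String, reason: String): Unit = {
      executorDataMap.get(executorId).map { totalCores =>
        executorDataMap -= executorId
        println(s"Removed $executorId ($totalCores cores): $reason")
      }.getOrElse(println(s"Asked to remove unknown executor $executorId"))
    }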





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18065891
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +147,14 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, 
host, freeCores(id))}))
+executorData.map{ case(k,v) => new WorkerOffer( k, v.executorHost, 
v.freeCores)}.toSeq))
--- End diff --

done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18065662
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -126,8 +124,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 
   case StopExecutors =>
 logInfo("Asking each executor to shut down")
-for (executor <- executorActor.values) {
-  executor ! StopExecutor
+executorData.foreach { case(k,v)  =>
--- End diff --

done





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18065400
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -104,13 +100,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   case StatusUpdate(executorId, taskId, state, data) =>
 scheduler.statusUpdate(taskId, state, data.value)
 if (TaskState.isFinished(state)) {
-  if (executorActor.contains(executorId)) {
-freeCores(executorId) += scheduler.CPUS_PER_TASK
-makeOffers(executorId)
-  } else {
-// Ignoring the update since we don't know about the executor.
-val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s"
-logWarning(msg.format(taskId, state, sender, executorId))
+  executorData.get(executorId) match {
+case Some(executorInfo) =>
--- End diff --

The map is now called executorDataMap.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18065365
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -85,16 +79,18 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 def receiveWithLogging = {
   case RegisterExecutor(executorId, hostPort, cores) =>
 Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
-if (executorActor.contains(executorId)) {
+if (executorData.contains(executorId)) {
   sender ! RegisterExecutorFailed("Duplicate executor ID: " + 
executorId)
 } else {
   logInfo("Registered executor: " + sender + " with ID " + 
executorId)
   sender ! RegisteredExecutor
-  executorActor(executorId) = sender
-  executorHost(executorId) = Utils.parseHostPort(hostPort)._1
-  totalCores(executorId) = cores
-  freeCores(executorId) = cores
-  executorAddress(executorId) = sender.path.address
+  executorData.put(executorId, new ExecutorData(
--- End diff --

removed parameter names and re-ordered parameters





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
Github user tigerquoll commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18065148
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -62,15 +62,9 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   val createTime = System.currentTimeMillis()
 
   class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor 
with ActorLogReceive {
-
 override protected def log = CoarseGrainedSchedulerBackend.this.log
-
-private val executorActor = new HashMap[String, ActorRef]
-private val executorAddress = new HashMap[String, Address]
-private val executorHost = new HashMap[String, String]
-private val freeCores = new HashMap[String, Int]
-private val totalCores = new HashMap[String, Int]
 private val addressToExecutorId = new HashMap[Address, String]
+private val executorData = new HashMap[String, ExecutorData]
--- End diff --

changed to
executorDataMap = new HashMap[String, ExecutorData]





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18027094
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorData(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorData.get(executorId) match {
+case Some(executorInfo) =>
+  val numCores = executorInfo.totalCores
+  executorData.-=(executorId)
--- End diff --

Is the '.' a typo here?





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18027054
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorData(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorData.get(executorId) match {
--- End diff --

getOrElse?





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026933
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   }
 }
 else {
-  freeCores(task.executorId) -= scheduler.CPUS_PER_TASK
-  executorActor(task.executorId) ! LaunchTask(new 
SerializableBuffer(serializedTask))
+  val executorInfo = executorData(task.executorId)
+  executorInfo.freeCores -= scheduler.CPUS_PER_TASK
+  executorInfo.executorActor ! LaunchTask(new 
SerializableBuffer(serializedTask))
 }
   }
 }
 
 // Remove a disconnected slave from the cluster
 def removeExecutor(executorId: String, reason: String) {
-  if (executorActor.contains(executorId)) {
-logInfo("Executor " + executorId + " disconnected, so removing it")
-val numCores = totalCores(executorId)
-executorActor -= executorId
-executorHost -= executorId
-addressToExecutorId -= executorAddress(executorId)
-executorAddress -= executorId
-totalCores -= executorId
-freeCores -= executorId
-totalCoreCount.addAndGet(-numCores)
-scheduler.executorLost(executorId, SlaveLost(reason))
+  executorData.get(executorId) match {
--- End diff --

How about using getOrElse instead? It should be more concise. 





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026850
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -149,13 +147,14 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 // Make fake resource offers on all executors
 def makeOffers() {
   launchTasks(scheduler.resourceOffers(
-executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))}))
+executorData.map{ case(k,v) => new WorkerOffer( k, v.executorHost, v.freeCores)}.toSeq))
--- End diff --

How about case (id, executorInfo)?
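
A sketch of the quoted line with the suggested names applied, purely for illustration:

    def makeOffers() {
      launchTasks(scheduler.resourceOffers(
        executorData.map { case (id, executorInfo) =>
          new WorkerOffer(id, executorInfo.executorHost, executorInfo.freeCores)
        }.toSeq))
    }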





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026763
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -126,8 +124,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 
   case StopExecutors =>
 logInfo("Asking each executor to shut down")
-for (executor <- executorActor.values) {
-  executor ! StopExecutor
+executorData.foreach { case(k,v)  =>
--- End diff --

for ((k,v) <- executorData) {
   v.executorActor ! StopExecutor
}





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026654
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -85,16 +79,18 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 def receiveWithLogging = {
   case RegisterExecutor(executorId, hostPort, cores) =>
 Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
-if (executorActor.contains(executorId)) {
+if (executorData.contains(executorId)) {
   sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
 } else {
   logInfo("Registered executor: " + sender + " with ID " + executorId)
   sender ! RegisteredExecutor
-  executorActor(executorId) = sender
-  executorHost(executorId) = Utils.parseHostPort(hostPort)._1
-  totalCores(executorId) = cores
-  freeCores(executorId) = cores
-  executorAddress(executorId) = sender.path.address
+  executorData.put(executorId, new ExecutorData(
--- End diff --

Maybe 
executorData(executorId) = new ExecutorData(sender, sender.path.address, Utils.parseHostPort(hostPort)._1, cores, cores)





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026675
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -104,13 +100,15 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   case StatusUpdate(executorId, taskId, state, data) =>
 scheduler.statusUpdate(taskId, state, data.value)
 if (TaskState.isFinished(state)) {
-  if (executorActor.contains(executorId)) {
-freeCores(executorId) += scheduler.CPUS_PER_TASK
-makeOffers(executorId)
-  } else {
-// Ignoring the update since we don't know about the executor.
-val msg = "Ignored task status update (%d state %s) from 
unknown executor %s with ID %s"
-logWarning(msg.format(taskId, state, sender, executorId))
+  executorData.get(executorId) match {
+case Some(executorInfo) =>
--- End diff --

See my comment above on executorInfo.





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026563
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -85,16 +79,18 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
 def receiveWithLogging = {
   case RegisterExecutor(executorId, hostPort, cores) =>
 Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
-if (executorActor.contains(executorId)) {
+if (executorData.contains(executorId)) {
   sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId)
 } else {
   logInfo("Registered executor: " + sender + " with ID " + executorId)
   sender ! RegisteredExecutor
-  executorActor(executorId) = sender
-  executorHost(executorId) = Utils.parseHostPort(hostPort)._1
-  totalCores(executorId) = cores
-  freeCores(executorId) = cores
-  executorAddress(executorId) = sender.path.address
+  executorData.put(executorId, new ExecutorData(
--- End diff --

I prefer to construct the ExecutorData object with its parameters in the same order as 
the fields are defined in the ExecutorData class. 
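
For illustration only, a sketch of that call with the arguments annotated against one
possible declaration order; the authoritative order is whatever ExecutorData.scala
actually declares:

    executorData.put(executorId, new ExecutorData(
      sender,                           // executorActor
      sender.path.address,              // executorAddress
      Utils.parseHostPort(hostPort)._1, // executorHost
      cores,                            // freeCores
      cores                             // totalCores
    ))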





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread Ishiihara
Github user Ishiihara commented on a diff in the pull request:

https://github.com/apache/spark/pull/2533#discussion_r18026314
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -62,15 +62,9 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, actorSystem: A
   val createTime = System.currentTimeMillis()
 
   class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor 
with ActorLogReceive {
-
 override protected def log = CoarseGrainedSchedulerBackend.this.log
-
-private val executorActor = new HashMap[String, ActorRef]
-private val executorAddress = new HashMap[String, Address]
-private val executorHost = new HashMap[String, String]
-private val freeCores = new HashMap[String, Int]
-private val totalCores = new HashMap[String, Int]
 private val addressToExecutorId = new HashMap[Address, String]
+private val executorData = new HashMap[String, ExecutorData]
--- End diff --

executorData and executorInfo are confusing here. executorInfo is of type 
ExecutorData. Maybe you can switch the two? 
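
One hypothetical way to resolve the clash, sketched only to show the idea (the names
executorDataMap / executorInfo below are illustrative, not taken from the patch):

    // the map is named for the collection it holds ...
    private val executorDataMap = new HashMap[String, ExecutorData]

    // ... and a looked-up value is named for a single entry
    val executorInfo = executorDataMap(task.executorId)
    executorInfo.freeCores -= scheduler.CPUS_PER_TASK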





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2533#issuecomment-56803421
  
Can one of the admins verify this patch?





[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...

2014-09-25 Thread tigerquoll
GitHub user tigerquoll opened a pull request:

https://github.com/apache/spark/pull/2533

SPARK-CORE [SPARK-3651] Group common CoarseGrainedSchedulerBackend 
variables together

from [SPARK-3651]
In CoarseGrainedSchedulerBackend, we have:

private val executorActor = new HashMap[String, ActorRef]
private val executorAddress = new HashMap[String, Address]
private val executorHost = new HashMap[String, String]
private val freeCores = new HashMap[String, Int]
private val totalCores = new HashMap[String, Int]

We only ever put / remove entries in these maps together. It would simplify the code if 
we consolidated them all into one map, as we have done in JobProgressListener in 
https://issues.apache.org/jira/browse/SPARK-2299.
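
For readers skimming the thread, a minimal sketch of the grouped structure this PR
proposes; the field names follow the discussion above, and the authoritative definition
is ExecutorData.scala in the patch itself:

    import akka.actor.{ActorRef, Address}

    private[cluster] class ExecutorData(
        val executorActor: ActorRef,
        val executorAddress: Address,
        val executorHost: String,
        var freeCores: Int,
        val totalCores: Int)

    // a single map then replaces the five parallel ones
    private val executorData = new HashMap[String, ExecutorData]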

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tigerquoll/spark-tigerquoll SPARK-3651

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2533.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2533


commit 7d671cf230bdad22f42f336174e8e0a8f7bc267b
Author: Dale 
Date:   2014-09-25T10:46:30Z

[SPARK-3651] Grouped variables under an ExecutorData object, and referenced them via a 
single map entry as they are all retrieved under the same key



