[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2533 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18127631 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +/** + * Grouping of data that is accessed by a CourseGrainedScheduler. This class --- End diff -- Course -> Coarse. I will fix it during merge. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57075299 Thanks. Merging in master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126678 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorDataMap.map{ case(id, executorData) => + new WorkerOffer( id, executorData.executorHost, executorData.freeCores)}.toSeq)) --- End diff -- done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126677 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorDataMap.map{ case(id, executorData) => --- End diff -- done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126669 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +176,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorDataMap(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorDataMap.get(executorId) match { +case Some(executorInfo) => + val numCores = executorInfo.totalCores --- End diff -- expression inlined --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126667 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -297,6 +291,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } + --- End diff -- removed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126665 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -85,16 +79,14 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A def receiveWithLogging = { case RegisterExecutor(executorId, hostPort, cores) => Utils.checkHostPort(hostPort, "Host port expected " + hostPort) -if (executorActor.contains(executorId)) { +if (executorDataMap.contains(executorId)) { sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId) } else { logInfo("Registered executor: " + sender + " with ID " + executorId) sender ! RegisteredExecutor - executorActor(executorId) = sender - executorHost(executorId) = Utils.parseHostPort(hostPort)._1 - totalCores(executorId) = cores - freeCores(executorId) = cores - executorAddress(executorId) = sender.path.address + executorDataMap.put(executorId, new ExecutorData(sender, sender.path.address, --- End diff -- Done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126663 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -126,8 +120,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StopExecutors => logInfo("Asking each executor to shut down") -for (executor <- executorActor.values) { - executor ! StopExecutor +for ((_,executorData) <- executorDataMap) { --- End diff -- Done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126658 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +private[cluster] class ExecutorData( + var executorActor: ActorRef, --- End diff -- Good point - All but freeCores changed to vals --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126545 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -104,13 +96,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StatusUpdate(executorId, taskId, state, data) => scheduler.statusUpdate(taskId, state, data.value) if (TaskState.isFinished(state)) { - if (executorActor.contains(executorId)) { -freeCores(executorId) += scheduler.CPUS_PER_TASK -makeOffers(executorId) - } else { -// Ignoring the update since we don't know about the executor. -val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s" -logWarning(msg.format(taskId, state, sender, executorId)) + executorDataMap.get(executorId) match { +case Some(executorInfo) => + executorInfo.freeCores += scheduler.CPUS_PER_TASK + makeOffers(executorId) +case None => + // Ignoring the update since we don't know about the executor. + val msg = "Ignored task status update (%d state %s) " + --- End diff -- Done and replaced format with a 's' interpolated string. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126510 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +private[cluster] class ExecutorData( + var executorActor: ActorRef, + var executorAddress: Address, + var executorHost: String , + var freeCores: Int, + var totalCores: Int +) {} --- End diff -- done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18126499 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +private[cluster] class ExecutorData( --- End diff -- Done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57043734 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20907/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57043731 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20907/consoleFull) for PR 2533 at commit [`6890663`](https://github.com/apache/spark/commit/6890663c9b15914cfb8523b6cf26871c0f1c2727). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57042814 LGTM pending a couple of stylistic comments --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122075 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -104,13 +96,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StatusUpdate(executorId, taskId, state, data) => scheduler.statusUpdate(taskId, state, data.value) if (TaskState.isFinished(state)) { - if (executorActor.contains(executorId)) { -freeCores(executorId) += scheduler.CPUS_PER_TASK -makeOffers(executorId) - } else { -// Ignoring the update since we don't know about the executor. -val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s" -logWarning(msg.format(taskId, state, sender, executorId)) + executorDataMap.get(executorId) match { +case Some(executorInfo) => + executorInfo.freeCores += scheduler.CPUS_PER_TASK + makeOffers(executorId) +case None => + // Ignoring the update since we don't know about the executor. + val msg = "Ignored task status update (%d state %s) " + --- End diff -- Not your code, but you probably don't need to put this in a val either --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122072 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorDataMap.map{ case(id, executorData) => + new WorkerOffer( id, executorData.executorHost, executorData.freeCores)}.toSeq)) --- End diff -- no space before id... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122067 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorDataMap.map{ case(id, executorData) => + new WorkerOffer( id, executorData.executorHost, executorData.freeCores)}.toSeq)) --- End diff -- no space before id --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122066 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +144,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorDataMap.map{ case(id, executorData) => --- End diff -- space after map, and case --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122064 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +176,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorDataMap(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorDataMap.get(executorId) match { +case Some(executorInfo) => + val numCores = executorInfo.totalCores --- End diff -- you probably don't need to put this in a val --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122060 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -297,6 +291,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } + --- End diff -- random new line --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122048 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +private[cluster] class ExecutorData( + var executorActor: ActorRef, + var executorAddress: Address, + var executorHost: String , + var freeCores: Int, + var totalCores: Int +) {} --- End diff -- nit: you can remove the `{}` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18122044 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -85,16 +79,14 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A def receiveWithLogging = { case RegisterExecutor(executorId, hostPort, cores) => Utils.checkHostPort(hostPort, "Host port expected " + hostPort) -if (executorActor.contains(executorId)) { +if (executorDataMap.contains(executorId)) { sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId) } else { logInfo("Registered executor: " + sender + " with ID " + executorId) sender ! RegisteredExecutor - executorActor(executorId) = sender - executorHost(executorId) = Utils.parseHostPort(hostPort)._1 - totalCores(executorId) = cores - freeCores(executorId) = cores - executorAddress(executorId) = sender.path.address + executorDataMap.put(executorId, new ExecutorData(sender, sender.path.address, --- End diff -- 2 spaces here, need only 1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57042564 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20907/consoleFull) for PR 2533 at commit [`6890663`](https://github.com/apache/spark/commit/6890663c9b15914cfb8523b6cf26871c0f1c2727). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57042526 I really like this change. Made some minor comments. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18121960 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -126,8 +120,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StopExecutors => logInfo("Asking each executor to shut down") -for (executor <- executorActor.values) { - executor ! StopExecutor +for ((_,executorData) <- executorDataMap) { --- End diff -- add a space after comma --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-57042458 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18121953 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +private[cluster] class ExecutorData( --- End diff -- maybe add some javadoc explaining what this is for ... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18121952 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorData.scala --- @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import akka.actor.{Address, ActorRef} + +private[cluster] class ExecutorData( + var executorActor: ActorRef, --- End diff -- should they be vals? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18066186 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorData(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorData.get(executorId) match { +case Some(executorInfo) => + val numCores = executorInfo.totalCores + executorData.-=(executorId) --- End diff -- typo fixed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18066183 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorData(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorData.get(executorId) match { --- End diff -- Pattern matching on an option is a clear and well understood pattern. please let clarity rule over conciseness in this case. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18065891 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +147,14 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorData.map{ case(k,v) => new WorkerOffer( k, v.executorHost, v.freeCores)}.toSeq)) --- End diff -- done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18065662 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -126,8 +124,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StopExecutors => logInfo("Asking each executor to shut down") -for (executor <- executorActor.values) { - executor ! StopExecutor +executorData.foreach { case(k,v) => --- End diff -- done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18065400 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -104,13 +100,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StatusUpdate(executorId, taskId, state, data) => scheduler.statusUpdate(taskId, state, data.value) if (TaskState.isFinished(state)) { - if (executorActor.contains(executorId)) { -freeCores(executorId) += scheduler.CPUS_PER_TASK -makeOffers(executorId) - } else { -// Ignoring the update since we don't know about the executor. -val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s" -logWarning(msg.format(taskId, state, sender, executorId)) + executorData.get(executorId) match { +case Some(executorInfo) => --- End diff -- The map is now called ExecutorDataMap --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18065365 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -85,16 +79,18 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A def receiveWithLogging = { case RegisterExecutor(executorId, hostPort, cores) => Utils.checkHostPort(hostPort, "Host port expected " + hostPort) -if (executorActor.contains(executorId)) { +if (executorData.contains(executorId)) { sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId) } else { logInfo("Registered executor: " + sender + " with ID " + executorId) sender ! RegisteredExecutor - executorActor(executorId) = sender - executorHost(executorId) = Utils.parseHostPort(hostPort)._1 - totalCores(executorId) = cores - freeCores(executorId) = cores - executorAddress(executorId) = sender.path.address + executorData.put(executorId, new ExecutorData( --- End diff -- removed parameter names and re-ordered parameters --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user tigerquoll commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18065148 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -62,15 +62,9 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A val createTime = System.currentTimeMillis() class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor with ActorLogReceive { - override protected def log = CoarseGrainedSchedulerBackend.this.log - -private val executorActor = new HashMap[String, ActorRef] -private val executorAddress = new HashMap[String, Address] -private val executorHost = new HashMap[String, String] -private val freeCores = new HashMap[String, Int] -private val totalCores = new HashMap[String, Int] private val addressToExecutorId = new HashMap[Address, String] +private val executorData = new HashMap[String, ExecutorData] --- End diff -- changed to executorDataMap = new HashMap[String, ExecutorData] --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18027094 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorData(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorData.get(executorId) match { +case Some(executorInfo) => + val numCores = executorInfo.totalCores + executorData.-=(executorId) --- End diff -- '.' a typo here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18027054 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorData(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorData.get(executorId) match { --- End diff -- getOrElse? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026933 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -179,25 +178,22 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A } } else { - freeCores(task.executorId) -= scheduler.CPUS_PER_TASK - executorActor(task.executorId) ! LaunchTask(new SerializableBuffer(serializedTask)) + val executorInfo = executorData(task.executorId) + executorInfo.freeCores -= scheduler.CPUS_PER_TASK + executorInfo.executorActor ! LaunchTask(new SerializableBuffer(serializedTask)) } } } // Remove a disconnected slave from the cluster def removeExecutor(executorId: String, reason: String) { - if (executorActor.contains(executorId)) { -logInfo("Executor " + executorId + " disconnected, so removing it") -val numCores = totalCores(executorId) -executorActor -= executorId -executorHost -= executorId -addressToExecutorId -= executorAddress(executorId) -executorAddress -= executorId -totalCores -= executorId -freeCores -= executorId -totalCoreCount.addAndGet(-numCores) -scheduler.executorLost(executorId, SlaveLost(reason)) + executorData.get(executorId) match { --- End diff -- How about using getOrElse instead? It should be more concise. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026850 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -149,13 +147,14 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A // Make fake resource offers on all executors def makeOffers() { launchTasks(scheduler.resourceOffers( -executorHost.toArray.map {case (id, host) => new WorkerOffer(id, host, freeCores(id))})) +executorData.map{ case(k,v) => new WorkerOffer( k, v.executorHost, v.freeCores)}.toSeq)) --- End diff -- How about case (id, executorInfo)? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026763 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -126,8 +124,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StopExecutors => logInfo("Asking each executor to shut down") -for (executor <- executorActor.values) { - executor ! StopExecutor +executorData.foreach { case(k,v) => --- End diff -- for ((k,v) <- executorData) { v.executorActor ! StopExecutor } --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026654 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -85,16 +79,18 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A def receiveWithLogging = { case RegisterExecutor(executorId, hostPort, cores) => Utils.checkHostPort(hostPort, "Host port expected " + hostPort) -if (executorActor.contains(executorId)) { +if (executorData.contains(executorId)) { sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId) } else { logInfo("Registered executor: " + sender + " with ID " + executorId) sender ! RegisteredExecutor - executorActor(executorId) = sender - executorHost(executorId) = Utils.parseHostPort(hostPort)._1 - totalCores(executorId) = cores - freeCores(executorId) = cores - executorAddress(executorId) = sender.path.address + executorData.put(executorId, new ExecutorData( --- End diff -- Maybe executorData(executorID) = new ExecutorData(sender, sender.path.address, Utils.parseHostPort(hostPort)._1,cores, cores) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026675 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -104,13 +100,15 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A case StatusUpdate(executorId, taskId, state, data) => scheduler.statusUpdate(taskId, state, data.value) if (TaskState.isFinished(state)) { - if (executorActor.contains(executorId)) { -freeCores(executorId) += scheduler.CPUS_PER_TASK -makeOffers(executorId) - } else { -// Ignoring the update since we don't know about the executor. -val msg = "Ignored task status update (%d state %s) from unknown executor %s with ID %s" -logWarning(msg.format(taskId, state, sender, executorId)) + executorData.get(executorId) match { +case Some(executorInfo) => --- End diff -- See my comment above on executorInfo. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026563 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -85,16 +79,18 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A def receiveWithLogging = { case RegisterExecutor(executorId, hostPort, cores) => Utils.checkHostPort(hostPort, "Host port expected " + hostPort) -if (executorActor.contains(executorId)) { +if (executorData.contains(executorId)) { sender ! RegisterExecutorFailed("Duplicate executor ID: " + executorId) } else { logInfo("Registered executor: " + sender + " with ID " + executorId) sender ! RegisteredExecutor - executorActor(executorId) = sender - executorHost(executorId) = Utils.parseHostPort(hostPort)._1 - totalCores(executorId) = cores - freeCores(executorId) = cores - executorAddress(executorId) = sender.path.address + executorData.put(executorId, new ExecutorData( --- End diff -- I prefer to construct ExecutorData object with parameters the same order as fields defined in ExecutorData class. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user Ishiihara commented on a diff in the pull request: https://github.com/apache/spark/pull/2533#discussion_r18026314 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -62,15 +62,9 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A val createTime = System.currentTimeMillis() class DriverActor(sparkProperties: Seq[(String, String)]) extends Actor with ActorLogReceive { - override protected def log = CoarseGrainedSchedulerBackend.this.log - -private val executorActor = new HashMap[String, ActorRef] -private val executorAddress = new HashMap[String, Address] -private val executorHost = new HashMap[String, String] -private val freeCores = new HashMap[String, Int] -private val totalCores = new HashMap[String, Int] private val addressToExecutorId = new HashMap[Address, String] +private val executorData = new HashMap[String, ExecutorData] --- End diff -- executorData and executorInfo are confusing here. executorInfo is of type ExecutorData. Maybe you can switch the two? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2533#issuecomment-56803421 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-CORE [SPARK-3651] Group common CoarseGra...
GitHub user tigerquoll opened a pull request: https://github.com/apache/spark/pull/2533 SPARK-CORE [SPARK-3651] Group common CoarseGrainedSchedulerBackend variables together from [SPARK-3651] In CoarseGrainedSchedulerBackend, we have: private val executorActor = new HashMap[String, ActorRef] private val executorAddress = new HashMap[String, Address] private val executorHost = new HashMap[String, String] private val freeCores = new HashMap[String, Int] private val totalCores = new HashMap[String, Int] We only ever put / remove stuff from these maps together. It would simplify the code if we consolidate these all into one map as we have done in JobProgressListener in https://issues.apache.org/jira/browse/SPARK-2299. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tigerquoll/spark-tigerquoll SPARK-3651 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2533.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2533 commit 7d671cf230bdad22f42f336174e8e0a8f7bc267b Author: Dale Date: 2014-09-25T10:46:30Z [SPARK-3651] Grouped variables under a ExecutorDataObject, and reference them via a map entry as they are all retrieved under the same key --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org