Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/214
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-61938307
Hi @qqsun8819, as Matei mentioned, Spark now broadcasts RDD objects, so
it's very unlikely for task serialization to become a bottleneck. I closed the
associated JIRA.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-54694752
Can one of the admins verify this patch?
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-53514576
@qqsun8819 given the recent patch in 1.1 to broadcast RDD objects (and
hence not have to serialize them when we send each task), do you think this
patch is still needed? Un
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-39083121
Hey @qqsun8819, I finally found that there have been some discussions about
removing the DAGScheduler's serializability checking:
https://github.com/apache/spark/pull/143
---
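The serializability checking being discussed (the check PR 143 considered removing) amounts to eagerly serializing an object so the failure surfaces at job-submission time rather than when a task ships to an executor. A minimal sketch; `SerializabilityCheck` and `checkSerializable` are hypothetical names, not Spark's actual API:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Sketch of a closure-serializability check: attempt Java serialization
// eagerly and report failure early instead of at task-launch time.
object SerializabilityCheck {
  def checkSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

For example, a `String` passes the check while a plain `Object` (which does not implement `java.io.Serializable`) does not.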
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-39079998
@CodingCat Thanks very much for your review. I found that your main
concerns concentrate on two points: 1. Merge the two SerializerRunners in the
two scheduler backends into one
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102892
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -149,6 +151,21 @@ private[spark] object Utils extends Logging {
buf
}
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-39050291
@qqsun8819 Good job. I just gave my thoughts on the current solution; I'm
actually far from an expert, so I'm expecting others' feedback.
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102844
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -149,6 +151,21 @@ private[spark] object Utils extends Logging {
buf
}
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102803
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/local/LocalBackend.scala ---
@@ -46,6 +47,7 @@ private[spark] class LocalActor(
private
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102770
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala
---
@@ -62,6 +65,30 @@ private[spark] class MesosSchedule
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102737
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala
---
@@ -62,6 +65,30 @@ private[spark] class MesosSchedule
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102700
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala
---
@@ -29,9 +29,12 @@ import org.apache.mesos.{Scheduler
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r11102691
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala
---
@@ -62,6 +65,30 @@ private[spark] class MesosSchedule
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-39048706
patch updated
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38881858
Can one of the admins verify this patch?
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38651743
@CodingCat @kayousterhout @mridulm Thanks very much for your review.
I think @kayousterhout stated clearly in her last two comments what the ideal
implementation looks like
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38651617
@CodingCat @mridulm @kayousterhout Thanks very much for your review.
I looked through your discussion and basically understand what you mean.
So you all agree on moving
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38615455
Yeah totally agree about the util method!!
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38615396
Btw, we might want to make it some util method somewhere - so that the
various backends don't need to duplicate this code.
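A shared util method along these lines might look like the sketch below; `TaskSerializationUtil` and its method names are illustrative (in Spark such a helper would presumably live in the `Utils` object), not the actual API:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.ByteBuffer

// Sketch of a shared helper so each scheduler backend does not have to
// duplicate its own task-serialization loop. Names are hypothetical.
object TaskSerializationUtil {
  // Serialize a single task payload with plain Java serialization.
  def serializeTask(task: AnyRef): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(task)
    out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  // Serialize a batch of tasks; every backend calls this one method
  // instead of re-implementing the loop.
  def serializeTasks(tasks: Seq[AnyRef]): Seq[ByteBuffer] =
    tasks.map(serializeTask)
}
```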
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38615338
Ah, I see what you mean - pull all of the logic for a successful
resourceOffer schedule into the caller.
Yeah, that should work fine (with the caveat of setting Spa
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38614287
Yeah exactly -- so my proposal was something like, in CBSG.makeOffers():
- still do scheduler.resourceOffers(), only now this returns unserialized
tasks
- update
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38613780
In coarse grained scheduler, freeCores is updated once the task desc's are
returned - in launchTasks; and expected to be used within the actor thread (so
MT-unsafe).
A
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38612977
freeCores would need to be updated by the makeOffers() method and before
the tasks get serialized (otherwise we can have race conditions where we
assign the sa
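The ordering being argued for here can be sketched as follows: `freeCores` is accounted for inside `makeOffers()`, on the single actor thread, before the tasks are handed off for serialization, so two offers cannot assign the same cores. All names below are illustrative, not Spark's actual API:

```scala
// Sketch of the ordering constraint: decrement freeCores before handing
// tasks off for (possibly parallel) serialization, so concurrent offers
// cannot double-book cores. Hypothetical simplified backend, not Spark code.
class SimplifiedBackend(totalCores: Int, coresPerTask: Int) {
  private var freeCores = totalCores

  // Pretend resourceOffers returned these unserialized task descriptions.
  def makeOffers(offered: Seq[String]): Seq[String] = {
    val accepted = offered.take(freeCores / coresPerTask)
    // Account for the cores immediately, on the single actor thread...
    freeCores -= accepted.length * coresPerTask
    // ...and only then would the tasks go to a serialization pool.
    accepted
  }

  def currentFreeCores: Int = freeCores
}
```

With 4 cores and 2 cores per task, a first round of three offers accepts only two tasks and leaves zero free cores, so a second round accepts nothing until cores are returned.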
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38612685
@kayousterhout the backend assumes that there is only a single thread which
is executing inside the actor at a given point of time. We will be changing
this assumption.
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10949757
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10949666
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -243,9 +275,16 @@ private[spark] class TaskSchedulerImpl(
}
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10949633
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -219,18 +229,40 @@ private[spark] class TaskSchedulerImpl(
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38609414
I thought about this a bit more and I think it makes sense to do something
similar to what @CodingCat suggested: in CoarseGrainedSchedulerBackend, when we
call sched
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38556226
Hey, @qqsun8819, on second thought about whether the task serialization
function should call the function directly or send a message to the
ClusterSchedulerBackend, I
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10919090
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user qqsun8819 commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10918799
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user qqsun8819 commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10918762
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38533621
Oh, sorry, it's DAG
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10918503
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38533226
Hi, @kayousterhout, do you mean the CoarseClusterSchedulerBackend blocks,
instead of the DAG?
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10918466
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38532790
Also, if we use a TaskResultGetter-like mechanism, we can create a
threadpool inside it using FixPool from Util just as ResultGetter does
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38532473
Thanks for your advice @kayousterhout. My understanding of what you mean
is: create a TaskResultGetter-like class, and this class maintains a
threadpool inside it, and e
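A TaskResultGetter-like design as described here can be sketched roughly as below, using a fixed-size thread pool to serialize tasks in parallel (Spark's TaskResultGetter uses a daemon fixed thread pool for deserializing results). The class and method names are hypothetical, not Spark's API:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.ByteBuffer
import java.util.concurrent.{Callable, Executors, Future}

// Sketch of a TaskResultGetter-like helper that owns a fixed-size pool
// and serializes tasks in parallel. Names are illustrative only.
class TaskSerializerPool(numThreads: Int) {
  private val pool = Executors.newFixedThreadPool(numThreads)

  private def serialize(task: AnyRef): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(task)
    out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  // Submit each task to the pool, then block until every buffer is ready.
  def serializeAll(tasks: Seq[AnyRef]): Seq[ByteBuffer] = {
    val futures: Seq[Future[ByteBuffer]] = tasks.map { t =>
      pool.submit(new Callable[ByteBuffer] { def call(): ByteBuffer = serialize(t) })
    }
    futures.map(_.get())
  }

  def shutdown(): Unit = pool.shutdown()
}
```

The blocking `serializeAll` keeps the caller's semantics unchanged while fanning the CPU-bound serialization work out across threads; shutting the pool down when the owner stops mirrors how other Spark components manage their executors.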
Github user qqsun8819 commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10918103
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38496480
Echoing what @CodingCat said, I think this solution has the same problem
that I mentioned in response to your design posted in the JIRA
(https://spark-project.atlass
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10899913
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -31,6 +32,7 @@ import org.apache.spark._
import org.apache.spark
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10896811
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -198,6 +201,13 @@ private[spark] class TaskSchedulerImpl(
*/
Github user qqsun8819 commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38463531
Fixed the DriverSuite test failure
Put the threadpool inside resourceOffer and shut it down before it returns
Some other fixes according to @CodingCat's review
Github user qqsun8819 commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10888735
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -93,6 +96,10 @@ private[spark] class TaskSchedulerImpl(
val ma
Github user qqsun8819 commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10888611
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -30,6 +30,9 @@ import scala.util.Random
import org.apache.spark.
Github user qqsun8819 commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10886619
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl(
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884928
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl(
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884645
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -30,6 +30,9 @@ import scala.util.Random
import org.apache.spark.
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884378
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl(
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884300
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -219,18 +226,43 @@ private[spark] class TaskSchedulerImpl(
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884189
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -243,12 +275,18 @@ private[spark] class TaskSchedulerImpl(
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884203
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -243,12 +275,18 @@ private[spark] class TaskSchedulerImpl(
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884024
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -93,6 +96,10 @@ private[spark] class TaskSchedulerImpl(
val ma
Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/spark/pull/214#discussion_r10884042
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -93,6 +96,10 @@ private[spark] class TaskSchedulerImpl(
val ma
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/214#issuecomment-38416907
Can one of the admins verify this patch?
GitHub user qqsun8819 opened a pull request:
https://github.com/apache/spark/pull/214
[SPARK-1141] [WIP] Parallelize Task Serialization
https://spark-project.atlassian.net/browse/SPARK-1141
@kayousterhout
copied from JIRA (the design doc in JIRA is old; I'll update it later)