[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-09-02 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-137161186 @squito Sure, if you want to finish off the test, that would be great. I haven't been prioritizing that since the bug itself was fairly low priority and

[GitHub] spark pull request: [SPARK-10192] [core] simple test w/ failure in...

2015-08-25 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/8402#discussion_r37928805 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -695,6 +695,126 @@ class DAGSchedulerSuite

[GitHub] spark pull request: [SPARK-10192] [core] simple test w/ failure in...

2015-08-25 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/8402#discussion_r37927038 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -695,6 +695,126 @@ class DAGSchedulerSuite

[GitHub] spark pull request: [SPARK-10193] [core] [wip] Eliminate Skipped S...

2015-08-25 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/8427#issuecomment-134753462 I wasn't intending to answer "no", but rather just wanting to make sure that we think through the implications of this change. It will increase m

[GitHub] spark pull request: [SPARK-10192] [core] simple test w/ failure in...

2015-08-25 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/8402#discussion_r37923582 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -695,6 +695,126 @@ class DAGSchedulerSuite

[GitHub] spark pull request: [SPARK-10193] [core] [wip]

2015-08-25 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/8427#issuecomment-134721686 This is going to increase memory pressure. The very early code never cleaned up the Stage-tracking data structures at all, which was clearly unacceptable for long

[GitHub] spark pull request: [SPARK-5259][CORE] don't submit stage until it...

2015-08-25 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7699#issuecomment-134690926 This LGTM. The only issues I have are quibbles, such as that the formatting of `runEvent(CompletionEvent ... )` isn't consistent in DAGSchedulerSuite and I

[GitHub] spark pull request: [SPARK-9851] Support submitting map stages ind...

2015-08-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/8180#issuecomment-133573960 @squito Agree that this is high risk, but SPARK-9851 is also very high reward. On balance, I'm excited about this, but also expecting some difficult debu

[GitHub] spark pull request: [SPARK-9952] Fix N^2 loop when DAGScheduler.ge...

2015-08-16 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/8178#discussion_r37157181 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -302,12 +302,12 @@ class DAGScheduler( shuffleDep

[GitHub] spark pull request: [SPARK-9952] Fix N^2 loop when DAGScheduler.ge...

2015-08-16 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/8178#discussion_r37157171 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -302,12 +302,12 @@ class DAGScheduler( shuffleDep

[GitHub] spark pull request: [SPARK-8366] maxNumExecutorsNeeded should prop...

2015-08-07 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6817#issuecomment-128922978 @andrewor14 Makes better sense to me. Thanks for the explanation, Andrew. --- If your project is set up for it, you can reply to this email and have your reply

[GitHub] spark pull request: [SPARK-8366] maxNumExecutorsNeeded should prop...

2015-08-07 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6817#discussion_r36541063 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -628,6 +621,13 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request: [SPARK-8366] maxNumExecutorsNeeded should prop...

2015-08-07 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6817#discussion_r36539846 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -599,15 +599,6 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-08-04 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-127787651 Yeah, don't bother quite yet -- the test isn't actually exercising the code path that it needs to. :( --- If your project is set up for it, you can rep

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-08-04 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-12154 ping("test added") --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project doe

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-08-01 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-126942997 I just brought it up-to-date. Putting together a test or two is something I'm going to try to get to today or tomorrow. --- If your project is set up for it

[GitHub] spark pull request: [SPARK-9366] use task's stageAttemptId in Task...

2015-07-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7681#issuecomment-125276917 +1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-22 Thread markhamstra
Github user markhamstra closed the pull request at: https://github.com/apache/spark/pull/7572 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7572#issuecomment-123477969 Ok, I'll leave this open for about 24 hr to collect any more comments. If Patrick isn't convinced by then, I'll close this PR. --- If your projec

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7572#issuecomment-123462071 I think the lack of a JIRA earlier is largely a matter of users not even being able to sort out the confusing behavior sufficiently to be able to file a useful bug

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7572#issuecomment-123458799 Nope -- it can be a big problem with earlier versions, too :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7572#issuecomment-123457040 @kayousterhout Yup, precisely -- it can be a big problem for 1.4 users. --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-21 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7572#issuecomment-123453287 @squito @kayousterhout This applies cleanly to branch-1.4, and I'm not seeing any reason why it shouldn't be cherry-picked to that branch.

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-21 Thread markhamstra
GitHub user markhamstra opened a pull request: https://github.com/apache/spark/pull/7572 [SPARK-8103][core] DAGScheduler should not submit multiple concurrent… … attempts for a stage https://issues.apache.org/jira/browse/SPARK-8103 cc kayousterhout (thanks for

[GitHub] spark pull request: [SPARK-9144] Remove DAGScheduler.runLocallyWit...

2015-07-17 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7484#issuecomment-122441437 If nobody is going to miss it, I'd be happy to be rid of localExecution. But this PR should really be "Deprecate DAGScheduler.runLocallyWithin

[GitHub] spark pull request: [SPARK-9144] Remove DAGScheduler.runLocallyWit...

2015-07-17 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7484#discussion_r34938490 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -1758,16 +1758,13 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-8333] [SQL] Spark failed to delete temp...

2015-07-17 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6858#issuecomment-122348869 I'd rather something more structured like case classes were used instead of plain tuples. It is less than immediately apparent what things like `state._2`

[GitHub] spark pull request: [SPARK-8728] Add configuration for limiting th...

2015-07-17 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7119#discussion_r34909405 --- Diff: core/src/test/scala/org/apache/spark/scheduler/PoolSuite.scala --- @@ -178,4 +185,30 @@ class PoolSuite extends SparkFunSuite with

[GitHub] spark pull request: [SPARK-8728] Add configuration for limiting th...

2015-07-17 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7119#discussion_r34909121 --- Diff: core/src/test/scala/org/apache/spark/scheduler/PoolSuite.scala --- @@ -178,4 +185,30 @@ class PoolSuite extends SparkFunSuite with

[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...

2015-07-17 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-122331612 @squito Doh! Yeah, off-by-one error. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-5259][CORE]Make sure shuffle metadata a...

2015-07-16 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-122122767 @suyanNone @squito Thanks for all the work you've done on this PR so far -- your diligence has been really helpful. Sorry that I haven't had time to p

[GitHub] spark pull request: [SPARK-8625] [Core] Propagate user exceptions ...

2015-07-14 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/7014#issuecomment-121311906 @pwendell I'm not seeing anything concerning in the DAGScheduler changes. --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-07-01 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6750#discussion_r33724583 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -163,6 +163,15 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-07-01 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33704621 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -104,11 +106,12 @@ private[spark] class

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-07-01 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33701929 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -23,6 +23,8 @@ import

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-06-30 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6750#issuecomment-117371398 Other than the question of whether doing more while holding the TaskSchedulerImpl lock is a concern, this LGTM. I'll come back later to take a closer look a

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6750#discussion_r33634280 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1002,6 +1001,7 @@ class DAGScheduler( val stageId

[GitHub] spark pull request: [SPARK-8103][core] DAGScheduler should not sub...

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6750#discussion_r33630644 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -163,6 +163,12 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6648#issuecomment-117353879 Ok, I haven't gone through the test suites in detail yet, but I have looked at the rest of it; and what is there looks essentially good to me. I need to come

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33626496 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockId.scala --- @@ -56,18 +56,27 @@ case class RDDBlockId(rddId: Int, splitIndex: Int) extends

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33625132 --- Diff: core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleManager.scala --- @@ -50,15 +53,30 @@ private[spark] class HashShuffleManager(conf

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33624446 --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleManager.scala --- @@ -62,4 +88,31 @@ private[spark] trait ShuffleManager

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33622414 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -104,11 +106,12 @@ private[spark] class

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33622124 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -215,7 +224,8 @@ private[spark] class

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33621871 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -197,7 +201,7 @@ private[spark] class

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33621654 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -104,11 +106,12 @@ private[spark] class

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33621464 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -104,11 +106,12 @@ private[spark] class

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33620330 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockResolver.scala --- @@ -23,6 +23,8 @@ import

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33613862 --- Diff: core/src/main/scala/org/apache/spark/shuffle/ShuffleManager.scala --- @@ -37,7 +44,11 @@ private[spark] trait ShuffleManager

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33610368 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -132,6 +140,7 @@ private[spark] class CompressedMapStatus

[GitHub] spark pull request: [SPARK-8029][core] shuffleoutput per attempt

2015-06-30 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6648#discussion_r33606894 --- Diff: core/src/main/java/org/apache/spark/shuffle/unsafe/UnsafeShuffleWriter.java --- @@ -307,7 +313,7 @@ void forceSorterToSpill() throws

[GitHub] spark pull request: [SPARK-7708] [Core] [WIP] Fixes for Kryo closu...

2015-06-29 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6361#issuecomment-116742527 @koertkuipers While it may be convenient for you that Scalding and Spark have a common serialization format when using chill/kryo, that can't be considered any

[GitHub] spark pull request: [SPARK-8048] Partitionning of an RDD with 0 pa...

2015-06-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7020#discussion_r33377752 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -56,7 +56,7 @@ object Partitioner { */ def defaultPartitioner(rdd

[GitHub] spark pull request: [SPARK-8048] Partitionning of an RDD with 0 pa...

2015-06-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7020#discussion_r33375409 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -56,7 +56,7 @@ object Partitioner { */ def defaultPartitioner(rdd

[GitHub] spark pull request: [SPARK-8625] [Core] Propagate user exceptions ...

2015-06-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7014#discussion_r33371157 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -662,6 +662,7 @@ private[spark] class TaskSetManager

[GitHub] spark pull request: [SPARK-8048] Partitionning of an RDD with 0 pa...

2015-06-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7020#discussion_r33366303 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -56,7 +56,7 @@ object Partitioner { */ def defaultPartitioner(rdd

[GitHub] spark pull request: [SPARK-746][CORE][WIP] Added Avro Serializatio...

2015-06-25 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/7004#discussion_r33267609 --- Diff: core/pom.xml --- @@ -398,6 +398,41 @@ py4j 0.8.2.1 + + org.apache.avro + avro

[GitHub] spark pull request: [SPARK-7050][build] Keep maven build consisten...

2015-06-16 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/5632#issuecomment-112656242 @jerryshao The Maven build is the official/canonical build because, among other reasons, the releases of Spark are produced with the Maven release plugin. That&#

[GitHub] spark pull request: [SPARK-8206][SQL]Add function round [WIP]

2015-06-16 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6836#issuecomment-112474620 Actually, it's worth considering whether in the common case of rounding to zero decimal places we should be producing integral values. --- If your project i

[GitHub] spark pull request: [SPARK-8206][SQL]Add function round [WIP]

2015-06-16 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6836#issuecomment-112471585 I don't believe this is quite right, since `setScale` in BigDecimal will handle negative values differently than SQL `ROUND(column_name, decimals)` will han

[GitHub] spark pull request: [SPARK-8372] History server shows incorrect in...

2015-06-15 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6827#issuecomment-112254025 Yes, @jerryshao, the idiom is a little clearer when the contents of the Option are actually be used to produce a result other than Unit, but it's also a littl

[GitHub] spark pull request: [SPARK-8372] History server shows incorrect in...

2015-06-15 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6827#issuecomment-112120882 @jerryshao The `res.map { r =>` version would be considered idiomatic Scala style, not "weird". In this case, there's not a lot of reason to pref

[GitHub] spark pull request: [SPARK-7708] [Core] [WIP] Fixes for Kryo closu...

2015-06-01 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6361#discussion_r31442435 --- Diff: core/src/main/scala/org/apache/spark/util/SerializableBuffer.scala --- @@ -21,12 +21,16 @@ import java.io.{EOFException, IOException

[GitHub] spark pull request: [SPARK-7826][CORE] Suppress extra calling getC...

2015-05-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6352#discussion_r31099281 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -342,6 +342,35 @@ class DAGSchedulerSuite assert(locs

[GitHub] spark pull request: [SPARK-7878] Rename Stage.jobId to firstJobId

2015-05-26 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6418#issuecomment-105716679 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-05-26 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-105660216 Added the proposed changed in submitTasks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-05-26 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-105644240 @kayousterhout In looking through our other uses of stage.jobId in the DAGScheduler, I didn't see anything that jumped out as an obvious problem except fo

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-05-26 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-105636694 Ok, so what was happening is that a stage would get created as part of a particular job. When we'd getMissingTasks for that stage, we were always using the

[GitHub] spark pull request: [SPARK-7826][CORE] Suppress extra calling getC...

2015-05-26 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6352#issuecomment-105611121 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-05-26 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/6291#issuecomment-105579119 @squito I'm not sure what you want to test. The change is actually very straightforward. In essence, we started with what should have been obviously broken

[GitHub] spark pull request: [SPARK-7655][Core] Deserializing value should ...

2015-05-26 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6195#discussion_r31011780 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala --- @@ -40,6 +40,9 @@ class DirectTaskResult[T](var valueBytes: ByteBuffer, var

[GitHub] spark pull request: [SPARK-7655][Core] Deserializing value should ...

2015-05-25 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/6195#discussion_r31004183 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskResult.scala --- @@ -40,6 +40,9 @@ class DirectTaskResult[T](var valueBytes: ByteBuffer, var

[GitHub] spark pull request: [SPARK-6880][CORE] Use properties from ActiveJ...

2015-05-20 Thread markhamstra
GitHub user markhamstra opened a pull request: https://github.com/apache/spark/pull/6291 [SPARK-6880][CORE] Use properties from ActiveJob associated with a Stage This issue was addressed in https://github.com/apache/spark/pull/5494, but the fix in that PR, while safe in the sense

[GitHub] spark pull request: [SPARK-6746B] Refactor large functions in DAGS...

2015-04-07 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/5396#issuecomment-90750182 Maybe... convince me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-6746B] Refactor large functions in DAGS...

2015-04-07 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/5396#issuecomment-90743681 I guess I just don't see things the same way. I don't have a problem with the DAGScheduler being implemented with scoping idioms that may be unfamiliar

[GitHub] spark pull request: [SPARK-6746B] Refactor large functions in DAGS...

2015-04-07 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/5396#issuecomment-90717805 Ok, in looking further, I see that I'm going to have a more general problem with this PR. I can't see exposing carefully nested and scoped portions of fu

[GitHub] spark pull request: [SPARK-6746B] Refactor large functions in DAGS...

2015-04-07 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/5396#discussion_r27914814 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -418,6 +418,32 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-6183][Deploy] Skip bad workers when re-...

2015-03-07 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4909#discussion_r26001955 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -467,7 +467,9 @@ private[spark] class Master( * two executors on

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-03-04 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-77267879 Interesting. Looks like the failed test results are no longer available. Do you recall what the problem was? --- If your project is set up for it, you can reply

[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-02-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-76464316 I'll take a look over the weekend. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your pr

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75653958 nits: extra spaces in type declarations. E.g,: ```scala var numAvailableOutputs: Long = 0 ``` not ```scala var numAvailableOutputs : Long = 0

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4708#issuecomment-75637201 Look pretty good to me, but left a few more comments. Also, please take a look at the various logging strings to see whether some of them can be expressed more

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25202931 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -210,40 +210,58 @@ class DAGScheduler( * The jobId value

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25188257 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -228,22 +227,41 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25186634 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -77,52 +71,9 @@ private[spark] class Stage( /** Pointer to the latest

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25186416 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -47,26 +47,20 @@ import org.apache.spark.util.CallSite * be updated for

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-23 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4708#discussion_r25186558 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -77,52 +71,9 @@ private[spark] class Stage( /** Pointer to the latest

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25058302 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -912,6 +959,196 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25058069 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -484,7 +505,7 @@ class DAGScheduler( "Total numb

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25057992 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -306,26 +319,31 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25057714 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -229,41 +227,56 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25056762 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -229,41 +227,56 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25056356 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -132,9 +88,7 @@ private[spark] class Stage( } def

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25056104 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -77,53 +70,16 @@ private[spark] class Stage( /** Pointer to the latest

[GitHub] spark pull request: [SPARK-4655] Split Stage into ShuffleMapStage ...

2015-02-20 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4703#discussion_r25055713 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -47,26 +47,19 @@ import org.apache.spark.util.CallSite * be updated for

[GitHub] spark pull request: [SPARK-5522] Accelerate the Histroty Server st...

2015-02-12 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/4525#discussion_r24634767 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -66,6 +68,11 @@ private[history] class FsHistoryProvider

[GitHub] spark pull request: [SPARK-4436][SPARK-3624][BUILD] Debian packagi...

2015-02-12 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/3297#issuecomment-74157814 Agreed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-5522] Accelerate the Histroty Server st...

2015-02-11 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4525#issuecomment-73994082 nit: typo in the PR title --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-5744] [CORE] Making RDD.isEmpty robust ...

2015-02-11 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4534#issuecomment-73941110 perfect --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: Making RDD.isEmpty robust to empty partitions ...

2015-02-11 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/4534#issuecomment-73940039 @tbertelsen Better, but you still should include SPARK-5744 and add [CORE] to the PR title. --- If your project is set up for it, you can reply to this email and

<    1   2   3   4   5   6   7   8   >