[GitHub] spark pull request #17276: [SPARK-19937] Collect metrics of block sizes when...

2017-03-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17276#discussion_r107156339 --- Diff: core/src/main/scala/org/apache/spark/executor/ShuffleWriteMetrics.scala --- @@ -17,8 +17,12 @@ package org.apache.spark.executor

[GitHub] spark pull request #17276: [SPARK-19937] Collect metrics of block sizes when...

2017-03-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17276#discussion_r107158475 --- Diff: core/src/main/scala/org/apache/spark/executor/ShuffleReadMetrics.scala --- @@ -80,13 +92,17 @@ class ShuffleReadMetrics private[spark] () extends

[GitHub] spark pull request #17276: [SPARK-19937] Collect metrics of block sizes when...

2017-03-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17276#discussion_r107161810 --- Diff: core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala --- @@ -72,6 +72,18 @@ private[spark] class SortShuffleWriter[K, V, C

[GitHub] spark pull request #17276: [SPARK-19937] Collect metrics of block sizes when...

2017-03-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17276#discussion_r107157788 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -164,6 +164,8 @@ private[spark] class HighlyCompressedMapStatus private

[GitHub] spark pull request #17276: [SPARK-19937] Collect metrics of block sizes when...

2017-03-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17276#discussion_r107157471 --- Diff: core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala --- @@ -72,6 +72,18 @@ private[spark] class SortShuffleWriter[K, V, C

[GitHub] spark pull request #17276: [SPARK-19937] Collect metrics of block sizes when...

2017-03-21 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17276#discussion_r107160710 --- Diff: core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleWriter.scala --- @@ -72,6 +72,18 @@ private[spark] class SortShuffleWriter[K, V, C

[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2017-03-21 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/14617 Thanks @jerryshao. Two other points: 1. We should have the same treatment in the executor summary table. 2. As Tom mentioned, there is also the storage page; you can do that separately.

[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2017-03-20 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/14617 good points @tgravescs . What about making them additional metrics, turned on by a checkbox, like the extra task metrics? --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request #14617: [SPARK-17019][Core] Expose on-heap and off-heap m...

2017-03-20 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/14617#discussion_r106931457 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -180,7 +180,8 @@ private[spark] class BlockManager( val

[GitHub] spark pull request #14617: [SPARK-17019][Core] Expose on-heap and off-heap m...

2017-03-20 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/14617#discussion_r106930443 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage.js --- @@ -172,6 +172,15 @@ function totalDurationColor(totalGCTime

[GitHub] spark pull request #14617: [SPARK-17019][Core] Expose on-heap and off-heap m...

2017-03-20 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/14617#discussion_r106935786 --- Diff: core/src/main/scala/org/apache/spark/storage/StorageUtils.scala --- @@ -176,17 +185,42 @@ class StorageStatus(val blockManagerId: BlockManagerId

[GitHub] spark pull request #14617: [SPARK-17019][Core] Expose on-heap and off-heap m...

2017-03-20 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/14617#discussion_r106930720 --- Diff: core/src/main/resources/org/apache/spark/ui/static/executorspage.js --- @@ -172,6 +172,15 @@ function totalDurationColor(totalGCTime

[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2017-03-20 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/14617 Looks like the failures are real (you probably just need to regenerate the expectations for the new blacklisting tests)

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106833748 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,95 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106833737 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,95 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-19 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106833868 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,95 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...

2017-03-19 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17297 > when the stage fails because of fetch failure, we remove the stage from the output commiter. So if any task completes between the time of first fetch failure and the time stage is resubmit

[GitHub] spark pull request #17297: [SPARK-14649][CORE] DagScheduler should not run d...

2017-03-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17297#discussion_r106774285 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -193,13 +193,6 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17297 I'm a bit confused by the description: > 1. When a fetch failure happens, the task set manager ask the dag scheduler to abort all the non-running tasks. However, the running task

[GitHub] spark issue #16781: [SPARK-12297][SQL] Hive compatibility for Parquet Timest...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16781 pinging some potential reviewers: @liancheng @yhuai for prior work in hive compatibility @ueshin for work on timezone support Note that the timezone support from SPARK-18350 is to

[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17088 @kayousterhout I don't think https://github.com/apache/spark/pull/14931 is really a complete answer to this. (a) we only get that from standalone mode, no other cluster managers (yarn

[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/14617 hi @jerryshao sorry this went unnoticed for so long, if you bring this up to date I'll keep an eye on it. Before this change, is off-heap storage completely ignored in the UI? Or doe

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16867 @jinxing64 would you mind repeating your performance experiments with the latest version? Both for `checkSpeculatableTasks` and also for inserting the duration on each task completion?

[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17088 One thing which I noticed while making sense of what was going on in the code (even before) -- IIRC, spark standalone is a bit of a special case. I think it used to be the case that to run multiple

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106669055 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1390,7 +1401,34 @@ class DAGScheduler( } } else

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106684005 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1390,7 +1401,34 @@ class DAGScheduler( } } else

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106670128 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -394,6 +394,68 @@ class DAGSchedulerSuite extends SparkFunSuite with

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106668930 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1365,18 +1369,25 @@ class DAGScheduler( */ private

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-17 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106677939 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1390,7 +1401,34 @@ class DAGScheduler( } } else

[GitHub] spark issue #17307: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17307 merged to master (slightly reworded the first line of the commit msg so it all fit). thanks @sitalkedia, especially for sticking with this despite the delays, our nitpickiness, and the

[GitHub] spark issue #17307: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-16 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17307 yeah, sorry I am looking, but keep getting distracted ... I'm sure these failures don't matter but can't merge this second anyhow so let's just test again ...

[GitHub] spark issue #17307: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-16 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17307 Jenkins, retest this please

[GitHub] spark issue #16905: [SPARK-19567][CORE][SCHEDULER] Support some Schedulable ...

2017-03-16 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16905 @shaneknapp I had to trigger jenkins manually via spark-prs. Every once in a while I encounter a pr for which tests are never triggered via comments. It's pretty rare, so it's not a big deal, but I

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-16 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106433116 --- Diff: core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-16 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106436689 --- Diff: core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-16 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106434806 --- Diff: core/src/test/scala/org/apache/spark/util/collection/MedianHeapSuite.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-16 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r106421711 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -172,7 +172,7 @@ private[spark] class TaskSchedulerImpl private

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106335824 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1365,19 +1369,27 @@ class DAGScheduler( */ private

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106330358 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1331,7 +1328,14 @@ class DAGScheduler( // TODO

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r106335566 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -394,6 +394,32 @@ class DAGSchedulerSuite extends SparkFunSuite with

[GitHub] spark issue #16905: [SPARK-19567][CORE][SCHEDULER] Support some Schedulable ...

2017-03-15 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16905 reopened https://issues.apache.org/jira/browse/SPARK-7420 for the failure Jenkins, retest this please

[GitHub] spark pull request #17307: [SPARK-13369] Make number of consecutive fetch fa...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17307#discussion_r106328854 --- Diff: docs/configuration.md --- @@ -1506,6 +1506,11 @@ Apart from these, the following properties are also available, and may be useful of this

[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-03-15 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17113 sorry I was vague -- I'm saying I'm ok with this as long as it's (a) off by default and (b) experimental so we can change it around (which it is).

[GitHub] spark pull request #11254: [SPARK-13369] Make number of consecutive fetch fa...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/11254#discussion_r106289378 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -180,6 +180,11 @@ class DAGScheduler( /** If enabled, FetchFailed

[GitHub] spark pull request #11254: [SPARK-13369] Make number of consecutive fetch fa...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/11254#discussion_r106289559 --- Diff: docs/configuration.md --- @@ -1157,6 +1157,13 @@ Apart from these, the following properties are also available, and may be useful Should

[GitHub] spark pull request #11254: [SPARK-13369] Make number of consecutive fetch fa...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/11254#discussion_r106289060 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Stage.scala --- @@ -118,7 +119,7 @@ private[scheduler] abstract class Stage

[GitHub] spark issue #11254: [SPARK-13369] Make number of consecutive fetch failures ...

2017-03-15 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/11254 sorry this has sat around so long. I agree this is useful, following up on the discussion here: https://github.com/apache/spark/pull/17088 I'd reword the description to something more like

[GitHub] spark issue #17088: [SPARK-19753][CORE] Un-register all shuffle output on a ...

2017-03-15 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17088 first, I think we should change the hard-coded limit of 4 stage retries. It's clear to me there is an important reason why users would want a higher limit, so let's make it a config. That is a very

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r106249028 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -1039,6 +1039,40 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r106246809 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -145,6 +146,63 @@ private[scheduler] class BlacklistTracker

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r106248846 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -1039,6 +1039,40 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r106245985 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -753,6 +753,12 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r106248521 --- Diff: docs/configuration.md --- @@ -1411,6 +1411,15 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-03-15 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r106246521 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -145,6 +146,63 @@ private[scheduler] class BlacklistTracker

[GitHub] spark issue #17208: [SPARK-19868] conflict TasksetManager lead to spark stop...

2017-03-15 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17208 to be clear, I agree with Kay's rewording (in particular, I meant stage attempt, not task attempt). Also I think it's worth including a test. You can use this: https://github.com/s

[GitHub] spark issue #17238: getRackForHost returns None if host is unknown by driver

2017-03-14 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17238 I don't think you're missing anything @tgravescs , it sounds like this is just a misconfiguration and we shouldn't be doing anything special for it (since it could hurt correct conf

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r105751160 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -909,19 +917,20 @@ private[spark] class TaskSetManager

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r105754479 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r105753056 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r105751503 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r105751547 --- Diff: core/src/main/scala/org/apache/spark/util/collection/MedianHeap.scala --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #17238: getRackForHost returns None if host is unknown by driver

2017-03-13 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17238 @mridulm you know the yarn bits involved better than I do. It sounds like you are saying this is the wrong change, and instead it's just a misconfiguration in yarn -- I'll buy that arg

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-10 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16867 @jinxing64 looks like something went wrong with your last push, I think there are lots of unintentional changes

[GitHub] spark issue #17238: getRackForHost returns None if host is unknown by driver

2017-03-10 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17238 The description of the problem makes sense for how it would affect task locality. I will need to look more closely at some yarn bits to make sure it's the right change, but looks reasonable

[GitHub] spark issue #17208: [SPARK-19868] conflict TasksetManager lead to spark stop...

2017-03-08 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17208 This looks like the right change. In fact, I could have sworn we had recently merged in something like this -- maybe there is another pr still in flight which includes this? @jinxing64 perhaps

[GitHub] spark issue #17208: [SPARK-19868] conflict TasksetManager lead to spark stop...

2017-03-08 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17208 Jenkins, ok to test

[GitHub] spark issue #17198: [SPARK-19857][yarn] Correctly calculate next credential ...

2017-03-07 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17198 lgtm I couldn't believe this at first, I had to verify by running this for myself: ```scala scala> :paste // Entering paste mode (ctrl-D to finish) val x = 7

[GitHub] spark pull request #16867: [SPARK-16929] Improve performance when check spec...

2017-03-06 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r104518016 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -740,6 +743,7 @@ private[spark] class TaskSetManager

[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-03-06 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17113 I think killing tasks is only applicable in different scenarios, e.g. if the [*job* fails](https://github.com/apache/spark/blob/12bf832407eaaed90d7c599522457cb36b303b6c/core/src/main/scala/org/apache

[GitHub] spark issue #17140: [SPARK-19796][CORE] Fix serialization of long property v...

2017-03-06 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17140 @kayousterhout > Do you think we should do the same thing for properties keys? I think probably not? yeah I went back and forth on it, thinking maybe I should do the keys as well

[GitHub] spark issue #17140: [SPARK-19796][CORE] Fix serialization of long property v...

2017-03-06 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17140 merged to master

[GitHub] spark pull request #16781: [SPARK-12297][SQL][POC] Hive compatibility for Pa...

2017-03-06 Thread squito
GitHub user squito reopened a pull request: https://github.com/apache/spark/pull/16781 [SPARK-12297][SQL][POC] Hive compatibility for Parquet Timestamps ## What changes were proposed in this pull request? Hive has very strange behavior when writing timestamps to parquet

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16867 more brainstorming: (1) you could lazily update your median collection (whether it's a treeset or median heap). First you'd just dump tasks into an array, and then when you query fo

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-03-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16867 median heap is a good idea to try. in fact, `slice` is `O(n)` because of the way it's implemented: it actually iterates through the first `n/2` elements (even though it should be able to do

[GitHub] spark issue #16639: [SPARK-19276][CORE] Fetch Failure handling robust to use...

2017-03-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16639 @mridulm look ok to you too? I plan on merging soon. I just made a small change to the comments (I copied and pasted incorrect comments in the last test case I added)

[GitHub] spark pull request #17140: [SPARK-19796][CORE] Fix serialization of long pro...

2017-03-02 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/17140 [SPARK-19796][CORE] Fix serialization of long property values in TaskDescription ## What changes were proposed in this pull request? The properties that are serialized with a

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-03-01 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103745227 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #16959: [SPARK-19631][CORE] OutputCommitCoordinator should not a...

2017-03-01 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16959 lgtm, sorry for the noise

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-03-01 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103741578 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103606657 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #17111: [SPARK-19777] Scan runningTasksSet when check speculatab...

2017-02-28 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/17111 lgtm (btw, I just meant to update your old PR to mention the new jira in the title -- you didn't have to open a new pr. but this works too)

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103560298 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r10306 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103555472 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103555964 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103554226 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,8 +55,26 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103556523 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103555074 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -621,6 +615,80 @@ class

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r103554187 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,8 +55,26 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103549134 --- Diff: core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala --- @@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite

[GitHub] spark issue #16959: [SPARK-19631][CORE] OutputCommitCoordinator should not a...

2017-02-28 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16959 @kayousterhout > This commit makes me worried there are more bugs related to #16620. For example, what if a task was OK'ed to commit, but then DAGScheduler decides to ignore it becaus

[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16959#discussion_r103541861 --- Diff: core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala --- @@ -181,11 +185,20 @@ private[spark] class

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16867 @jinxing64 thanks for updating this to be just the simpler fix. Since the original jira has a bit of a longer discussion on it, do you mind opening a new jira for this change, and linking it to the

[GitHub] spark issue #16867: [SPARK-16929] Improve performance when check speculatabl...

2017-02-28 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16867 other than a bit of jira re-organization, lgtm

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r103505374 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FetchFailedException.scala --- @@ -45,6 +50,12 @@ private[spark] class FetchFailedException

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-02-28 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r103503390 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -400,8 +410,16 @@ private[spark] class Executor

[GitHub] spark issue #16930: [SPARK-19597][CORE] test case for task deserialization e...

2017-02-24 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16930 thanks @kayousterhout !

[GitHub] spark pull request #16892: [SPARK-19560] Improve DAGScheduler tests.

2017-02-24 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16892#discussion_r102996593 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2031,6 +2051,11 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark pull request #16946: [SPARK-19554][UI,YARN] Allow SHS URL to be used f...

2017-02-22 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16946#discussion_r102539475 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnProxyRedirectFilterSuite.scala --- @@ -0,0 +1,53

[GitHub] spark pull request #16946: [SPARK-19554][UI,YARN] Allow SHS URL to be used f...

2017-02-22 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/16946#discussion_r102546890 --- Diff: docs/running-on-yarn.md --- @@ -604,3 +604,17 @@ spark.yarn.am.extraJavaOptions -Dsun.security.krb5.debug=true -Dsun.security.spn
