[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52751046 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18947/consoleFull) for PR 2056 at commit [`a49bc80`](https://github.com/ap

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52751321 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18947/consoleFull) for PR 2056 at commit [`a49bc80`](https://github.com/a

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52752829 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18948/consoleFull) for PR 2056 at commit [`f81c6a9`](https://github.com/ap

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52752840 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18948/consoleFull) for PR 2056 at commit [`f81c6a9`](https://github.com/a

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52753087 Jenkins, retest this please.

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52753280 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18949/consoleFull) for PR 2056 at commit [`f81c6a9`](https://github.com/ap

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52759438 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18949/consoleFull) for PR 2056 at commit [`f81c6a9`](https://github.com/a

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52783094 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18956/consoleFull) for PR 2056 at commit [`f388e6d`](https://github.com/ap

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52783109 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18956/consoleFull) for PR 2056 at commit [`f388e6d`](https://github.com/a

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52783748 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18957/consoleFull) for PR 2056 at commit [`0ccf45b`](https://github.com/ap

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52791960 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18957/consoleFull) for PR 2056 at commit [`0ccf45b`](https://github.com/a

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52808087 I don't think this really solves the underlying problem, since it doesn't seem to address _why_ we're getting the timeout. I found this bug on a pretty lightly-loaded

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52828466 Yes, this does not seem to address the root cause, which may be a deadlock. If we hit it again with an infinite timeout, the cleaning thread will just hang forever.

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52855899 I, in fact, advise against it. If something goes wrong, there will be an indefinite, potentially large number of Futures just waiting for a response from the driver, never timing out

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-20 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-52870430 I think this is the root cause here [ShuffleBlockManager.scala#L207](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/ShuffleBlockManag

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-22 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-53143532 @witgo why does this IO wait cause a problem?

[GitHub] spark pull request: [WIP][SPARK-3139] Akka timeouts from ContextCl...

2014-08-23 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2056#issuecomment-53149146 In `removeShuffleBlocks` ``` for (mapId <- state.completedMapTasks; reduceId <- 0 until state.numBuckets) { val blockId = new ShuffleBlockId(shuffle