[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-16 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-131660810 @andrewor14 running in `local` mode is misleading for stage failures -- you really need to have multiple block managers in place to understand the behavior, e.g. with

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-14 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-131232391 @squito here is the answer to your question: ## Short version When we have a fetch failure, we resubmit both the map stage that wrote the shuffle

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-14 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-131233660 In any case, the changes here are a strict improvement and fix a critical bug that needs to be in the release. I have confirmed its correctness and will merge it

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8090 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-12 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130492201 I was thinking that my points about simplifying the logic here and making this safer would be pretty clear, but I guess that is not the case. If that is at all

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130087503 [Test build #1462 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1462/consoleFull) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130088400 [Test build #1462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1462/console) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129778269 Merged build started.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129778174 Merged build triggered.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread carsonwang
Github user carsonwang commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129776041 retest this please

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread carsonwang
Github user carsonwang commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129724889 @squito, here is some further information about the issue. I was running an iterative graph-parallel algorithm which uses RDD cache for iterative computations. In

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129780967 [Test build #40434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40434/consoleFull) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129862901 Thanks @carsonwang , I think I see -- it seems that the key part is that the stage is skipped (because the shuffle map output is computed by a different stage, and the

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129835666 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129835618 [Test build #40434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40434/console) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129979524 Merged build triggered.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129979596 so you don't think we should consider just completely skipping resetting the accumulators, and just always initializing the `internalAccumulators` right away, since that

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129979557 Merged build started.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129978442 I see, thanks @carsonwang for the explanation. This fix LGTM given my understanding. I would also like to see a regression test for this, but I think the existing

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129978458 retest this please

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129983636 @squito ah sorry I missed your comment. Just so I understand your point, are you saying that when a stage is submitted with all missing partitions, none of the tasks

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129980080 [Test build #40473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40473/consoleFull) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130050438 [Test build #40473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40473/console) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130050714 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130090548 retest this please

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130091284 Merged build triggered.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130091309 Merged build started.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130091477 [Test build #40521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40521/consoleFull) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130093251 Hi @andrewor14, no, that is not quite what I was saying. I was basically just repeating my comments from the earlier PR, and how I was confused about what the goal is

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130124548 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-130124407 [Test build #40521 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40521/console) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129708266 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129708237 [Test build #40381 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40381/console) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/8090#discussion_r36708799 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -787,9 +787,10 @@ class DAGScheduler( } } +

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread squito
Github user squito commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129690335 Hi @carsonwang, thanks for reporting this and suggesting a fix. This is clearly really important to fix for 1.5, I'm glad you marked it as a blocker -- but I'd also like to

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129674969 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread carsonwang
Github user carsonwang commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129684834 retest this please

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129685721 Merged build triggered.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129685760 Merged build started.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129686486 [Test build #40381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40381/consoleFull) for PR 8090 at commit

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129673355 Merged build started.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8090#issuecomment-129673308 Merged build triggered.

[GitHub] spark pull request: [SPARK-9809]Task crashes because the internal ...

2015-08-10 Thread carsonwang
GitHub user carsonwang opened a pull request: https://github.com/apache/spark/pull/8090 [SPARK-9809]Task crashes because the internal accumulators are not properly initialized When a stage failed and another stage was resubmitted with only part of the partitions to compute, all the