[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread Ngone51
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20930 No wonder I couldn't understand the issue for so long: I thought it happened on Spark 2.3. Now it makes sense. Thanks @jiangxb1987

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 > Have you applied this patch: #17955 ? No, this happened on Spark 2.1. Thanks xingbo & wenchen, I'll backport this patch to our internal Spark 2.1. > That PR seems to be

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20930 Have you applied this patch: https://github.com/apache/spark/pull/17955 ? --- - To unsubscribe, e-mail:

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test FAILed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89870/ Test FAILed. ---

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89870/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2686/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89870/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89850/ Test PASSed. ---

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89850/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89849/ Test PASSed. ---

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89849/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2676/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89850 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89850/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2675/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89849 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89849/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-21 Thread Ngone51
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20930 > because we can get the MapStatus, but get a 'null'. If I'm not mistaken, this is also because ExecutorLost triggers removeOutputsOnExecutor. If there's a `null` MapStatus for stage 2, how

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-21 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 ![image](https://user-images.githubusercontent.com/4833765/39091106-ff11d0a6-461f-11e8-968f-7fcbe6652bb3.png) Stages 0/1/2/3 correspond to 20/21/22/23 in this screenshot; stage 2's shuffleId

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-21 Thread Ngone51
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20930 Hi @xuanyuanking, thank you sincerely for your patient explanation. With regard to your latest explanation: > stage 2's shuffleID is 1, but stage 3 failed by missing an output for

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 @Ngone51 Ah, maybe I see how the description misled you: in point 5 of the description, 'this stage' refers to 'Stage 2' in the screenshot. Thanks for checking; I modified the description to

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 @Ngone51 You can check the screenshot in detail: stage 2's shuffleID is 1, but stage 3 failed because of a missing output for shuffle '0'! So here skipping stage 2 caused stage 3 to get an error

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89479/ Test PASSed. ---

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89479/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread Ngone51
Github user Ngone51 commented on the issue: https://github.com/apache/spark/pull/20930 Hi @xuanyuanking, I'm still confused (smile & cry). > Stage 2 retry 4 times triggered by Stage 3's fetch failed event. Actually in this scenario, stage 3 will always fail with a fetch failure.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 @Ngone51 Thanks for your review. > Does stage 2 correspond to the stage that never succeeds in the PR description? Stage 3 is the stage that never succeeds; stage 2 is its parent stage.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2411/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test FAILed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2410/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89479/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89389/ Test PASSed. ---

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89389/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2339/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #89389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89389/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 retest this please

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 @cloud-fan @jiangxb1987 Sorry for the late reply. I deleted the useless code as we discussed before.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88806/ Test FAILed. ---

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test FAILed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #88806 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88806/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1895/

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #88806 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88806/testReport)** for PR 20930 at commit

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-31 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 > What's your proposed fix? I fix this by killing the other attempts when a FetchFailed is received in `TaskSetManager`. If we end up ignoring the success events of other attempts anyway, we might as well
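
The idea in the comment above can be sketched as a toy Python model. This is not Spark code: `ToyTaskSet` and its methods are hypothetical names invented for illustration, not the `TaskSetManager` API, and the sketch makes no claim about what the patch finally looked like. It assumes that "other attempts" means the other running attempts of the same task, and it only shows the intended effect: once one attempt reports a FetchFailed, the remaining attempts are killed, so their success can no longer arrive late.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTaskSet:
    """Hypothetical stand-in for a task set; not Spark's TaskSetManager."""
    # task index -> set of attempt numbers that are still running
    running: dict = field(default_factory=dict)
    # (task index, attempt number) pairs that were asked to stop
    killed: list = field(default_factory=list)

    def launch(self, index, attempt):
        self.running.setdefault(index, set()).add(attempt)

    def on_fetch_failed(self, index, attempt):
        attempts = self.running.get(index, set())
        attempts.discard(attempt)
        # Kill every other attempt of the same task (assumed scope).
        for other in sorted(attempts):
            self.killed.append((index, other))
        attempts.clear()

    def on_success(self, index, attempt):
        attempts = self.running.get(index, set())
        if attempt not in attempts:
            return False  # attempt was killed or is unknown: drop the event
        attempts.discard(attempt)
        return True


ts = ToyTaskSet()
ts.launch(1, 0)            # original attempt, like `ShuffleMapTask 1.0`
ts.launch(1, 1)            # speculative attempt
ts.on_fetch_failed(1, 1)   # the speculative attempt hits FetchFailed first
late_success_accepted = ts.on_success(1, 0)
```

In this model the original attempt ends up in `ts.killed`, and its later success event is dropped instead of being recorded.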

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20930 What's your proposed fix? It sounds like we can just ignore `ShuffleMapTask 1.0` if the stage is marked as failed.
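
The alternative suggested in the comment above, ignoring the late success while the stage is marked as failed, can be sketched the same way. Again this is a toy Python model under stated assumptions: `ToyScheduler`, its method names, the stage id `2`, and the executor labels are all hypothetical and do not correspond to Spark's scheduler API.

```python
class ToyScheduler:
    """Hypothetical model of a scheduler tracking failed stages."""

    def __init__(self):
        self.failed_stages = set()
        # stage id -> {partition: location of the registered map output}
        self.map_outputs = {}

    def mark_stage_failed(self, stage_id):
        self.failed_stages.add(stage_id)

    def resubmit_stage(self, stage_id):
        self.failed_stages.discard(stage_id)

    def on_map_task_success(self, stage_id, partition, location):
        # Ignore success events for a stage that is marked as failed
        # and has not been resubmitted yet.
        if stage_id in self.failed_stages:
            return False
        self.map_outputs.setdefault(stage_id, {})[partition] = location
        return True


sched = ToyScheduler()
sched.mark_stage_failed(2)                                   # arbitrary stage id
ignored = not sched.on_map_task_success(2, 1, "executor-A")  # late success
sched.resubmit_stage(2)
accepted = sched.on_map_task_success(2, 1, "executor-B")     # rerun succeeds
```

Here the late success leaves no map output behind, so the partition still counts as missing when the stage is resubmitted and the rerun's output is the one that gets registered.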

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 Yeah, the stage is resubmitted, but there are no missing tasks for this stage, so no task is actually resubmitted. This is mainly because `ShuffleMapTask 1.0` triggered

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20930 Then why is it a problem? The stage should be resubmitted soon, and `ShuffleMapTask 1.0` should be a no-op.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 The first case: the stage is marked as failed but has not been resubmitted yet.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20930 What happened to `ShuffleMapTask 1.0` exactly? There are 3 cases: the stage is marked as failed but has not been resubmitted yet, the stage has been resubmitted, or the stage is aborted.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 `ShuffleMapTask 1.0` succeeded after its speculative task failed with FetchFailed. Thanks for checking; I will modify the PR description.

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-03-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20930 What happened to `ShuffleMapTask 1.0`? I don't get it from your PR description.