[GitHub] spark issue #18213: [SPARK-20996][YARN] Better handling AM reattempt based o...

2017-06-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #1 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/1/testReport)** for PR 18213 at commit [`1fa1415`](https://github.com/apache/spark/commit/1f

2017-06-06 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18213 CC @mridulm, can you please help to review? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this featu

2017-06-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #1 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/1/testReport)** for PR 18213 at commit [`1fa1415`](https://github.com/apache/spark/commit/1

2017-06-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Merged build finished. Test PASSed.

2017-06-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/1/ Test PASSed.

2017-06-06 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/18213 What is the expectation for Spark Streaming (YARN cluster case)? +CC @tgravescs

2017-06-07 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18213 For a Streaming application, I just treat it as a normal Spark application: if it fails internally, the AM will unregister itself; if the failure comes from an external issue, the AM will not unregister itse
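
The rule described in this comment can be sketched roughly as below. This is only an illustrative model, not Spark's actual code: `FailureOrigin`, `am_should_unregister`, and `yarn_may_reattempt` are hypothetical names, and the sketch assumes that YARN only considers a new attempt when the AM has not unregistered and the attempt budget is not used up.

```python
from enum import Enum


class FailureOrigin(Enum):
    """Hypothetical classification of why the application failed."""
    INTERNAL = "internal"  # the application itself failed (e.g. user code threw)
    EXTERNAL = "external"  # something outside the application went wrong


def am_should_unregister(origin: FailureOrigin) -> bool:
    """Model of the rule in the comment: unregister only on internal failures."""
    return origin is FailureOrigin.INTERNAL


def yarn_may_reattempt(origin: FailureOrigin, attempt: int, max_attempts: int) -> bool:
    """Assumption: a reattempt is possible only if the AM stayed registered
    and the number of attempts so far is below the configured maximum."""
    return not am_should_unregister(origin) and attempt < max_attempts
```

Under this model, an internal failure never leads to a reattempt, while an external failure does so only while attempts remain.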

2017-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #77808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77808/testReport)** for PR 18213 at commit [`64c59a4`](https://github.com/apache/spark/commit/64

2017-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #77808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77808/testReport)** for PR 18213 at commit [`64c59a4`](https://github.com/apache/spark/commit/6

2017-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Merged build finished. Test PASSed.

2017-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77808/ Test PASSed.

2017-06-08 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18213 @mridulm, the exit code of a PySpark or R application is entirely user-defined: the user can exit with any code, for example `sys.exit(100)`, so the codes could potentially overlap.
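
A minimal sketch of the overlap concern raised in this comment: a user script is free to exit with any code, so a launcher that reserves certain codes for its own meaning cannot tell the two apart. `RESERVED_EXIT_CODES`, `run_user_script`, and `is_ambiguous` are hypothetical names, and the reserved values are made up for illustration; they are not Spark's real exit codes.

```python
import subprocess
import sys

# Hypothetical exit codes that a launcher reserves for its own use.
# These values are invented for this sketch.
RESERVED_EXIT_CODES = {100, 101}


def run_user_script(source: str) -> int:
    """Run a snippet of user Python code in a child process; return its exit code."""
    return subprocess.run([sys.executable, "-c", source]).returncode


def is_ambiguous(exit_code: int) -> bool:
    """True if a user-chosen exit code collides with a reserved one."""
    return exit_code in RESERVED_EXIT_CODES


# The user's script exits with a code of its own choosing.
user_exit_code = run_user_script("import sys; sys.exit(100)")
print(user_exit_code, is_ambiguous(user_exit_code))
```

Because the child's exit code is whatever the user passes to `sys.exit`, nothing prevents the collision; the launcher would need a separate channel to distinguish the cases.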

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #77809 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77809/testReport)** for PR 18213 at commit [`1559cbb`](https://github.com/apache/spark/commit/15

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #77809 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77809/testReport)** for PR 18213 at commit [`1559cbb`](https://github.com/apache/spark/commit/1

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77809/ Test PASSed.

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Merged build finished. Test PASSed.

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #77835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77835/testReport)** for PR 18213 at commit [`2b364fd`](https://github.com/apache/spark/commit/2b

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #77835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77835/testReport)** for PR 18213 at commit [`2b364fd`](https://github.com/apache/spark/commit/2

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Merged build finished. Test PASSed.

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77835/ Test PASSed.

2017-06-16 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18213 I'm not so sure about this... this is a fundamental change in how the feature works. With this change, there are only two cases where the AM will be retried: - the client-mode AM - when

2017-06-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18213 Thanks @vanzin, that is a valid concern. It is actually hard for the AM to differentiate between the various scenarios and handle each one differently. So your suggestion is only to set max attempt to 1 as

2017-06-20 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18213 @vanzin, how about the current changes? I set the default maxAttempts to 1.
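
For context, an application can already cap its own AM attempts per submission, independently of whatever the default is. The sketch below assumes that the "maxAttempts" mentioned here corresponds to Spark's `spark.yarn.maxAppAttempts` setting; `build_submit_command` and `my_app.py` are hypothetical names used only for illustration.

```python
import shlex
from typing import List


def build_submit_command(app: str, max_app_attempts: int) -> List[str]:
    """Build a spark-submit argument list that caps AM attempts for one application."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # Assumption: this is the setting the comment refers to as maxAttempts.
        "--conf", f"spark.yarn.maxAppAttempts={max_app_attempts}",
        app,
    ]


# Print the command as a single shell-safe string.
print(shlex.join(build_submit_command("my_app.py", 1)))
```

The value that actually takes effect is also bounded by the cluster-wide YARN limit, so a per-application setting can lower the number of attempts but not raise it past that limit.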

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #78283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78283/testReport)** for PR 18213 at commit [`fd9c20f`](https://github.com/apache/spark/commit/fd

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78283/ Test PASSed.

2017-06-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18213 **[Test build #78283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78283/testReport)** for PR 18213 at commit [`fd9c20f`](https://github.com/apache/spark/commit/f

2017-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18213 Merged build finished. Test PASSed.

2017-06-20 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18213 I think this is a very hard thing for us to know; there are too many different failure types. I agree that setting it to 1 is better than us getting it wrong, although I still question a bit whether that is right e

2017-06-20 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18213 With your current change, you won't allow retries of the client-mode AM. It's rarely the user's fault when the client-mode AM needs to be restarted. I kinda agree with Tom that at this point

2017-06-21 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18213 Thanks @tgravescs and @vanzin for your comments. I think they are quite valid, so I will close this PR.