[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12994 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218184446 +1 Thanks @jerryshao --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218177238 I'm not concerned with the downgrade case. It just won't find the file if yarn isn't setting the recovery path any longer (it will create new one in localdir) , but I don't see that as a big issue because if someone is downgrading their cluster or turned off recovery they should kill everything that is running. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218074876 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58216/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218074797 **[Test build #58216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58216/consoleFull)** for PR 12994 at commit [`6d4a8f1`](https://github.com/apache/spark/commit/6d4a8f19bfb4c80e18c397f4fecc7c1c5f771877). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218074874 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218072671 **[Test build #58216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58216/consoleFull)** for PR 12994 at commit [`6d4a8f1`](https://github.com/apache/spark/commit/6d4a8f19bfb4c80e18c397f4fecc7c1c5f771877). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218066986 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58210/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218066984 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218066972 **[Test build #58210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58210/consoleFull)** for PR 12994 at commit [`02752c9`](https://github.com/apache/spark/commit/02752c955812a7f00d52f1c462309953eb296ec2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218065383 **[Test build #58210 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58210/consoleFull)** for PR 12994 at commit [`02752c9`](https://github.com/apache/spark/commit/02752c955812a7f00d52f1c462309953eb296ec2). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218064524 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58207/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218064520 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218064500 **[Test build #58207 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58207/consoleFull)** for PR 12994 at commit [`519bf07`](https://github.com/apache/spark/commit/519bf07cfb91bb3cbf6bf274a71ecbe62e45e8ec). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218062949 **[Test build #58207 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58207/consoleFull)** for PR 12994 at commit [`519bf07`](https://github.com/apache/spark/commit/519bf07cfb91bb3cbf6bf274a71ecbe62e45e8ec). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218062741 Jenkins, retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218060958 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218060959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58205/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218060952 **[Test build #58205 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58205/consoleFull)** for PR 12994 at commit [`519bf07`](https://github.com/apache/spark/commit/519bf07cfb91bb3cbf6bf274a71ecbe62e45e8ec). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218060219 **[Test build #58205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58205/consoleFull)** for PR 12994 at commit [`519bf07`](https://github.com/apache/spark/commit/519bf07cfb91bb3cbf6bf274a71ecbe62e45e8ec). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218051051 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218051038 **[Test build #58201 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58201/consoleFull)** for PR 12994 at commit [`4e5c2fd`](https://github.com/apache/spark/commit/4e5c2fd557dcfc1327449c1e329969b0d401dbbe). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218051052 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58201/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218050017 @tgravescs , I tested locally using Hadoop 2.4 and 2.6 with different scenarios: 1. Only Hadoop 2.4 2. Hadoop 2.4 upgrade to 2.6 with NM recovery disabled. 3. Hadoop 2.4 upgrade to 2.6 with NM recovery enabled. 3. Hadoop 2.6 NM recovery disabled to enabled. Looks fine in all these scenarios. One missing part is do we need to take care of downgrade scenarios, like 2.6 to 2.4 or NM recovery enabled to disabled? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218049583 **[Test build #58201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58201/consoleFull)** for PR 12994 at commit [`4e5c2fd`](https://github.com/apache/spark/commit/4e5c2fd557dcfc1327449c1e329969b0d401dbbe). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-218034727 Thanks @tgravescs for your comments, I will change the code and do a more comprehensive test accordingly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-217873160 few minor comments but mostly looks good. Did you build against both hadoop 2.5+ and hadoop < 2.5? Did you manually test the upgrade path? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/12994#discussion_r62502165 --- Diff: yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala --- @@ -234,7 +236,25 @@ class YarnShuffleServiceSuite extends SparkFunSuite with Matchers with BeforeAnd s3.initializeApplication(app2Data) ShuffleTestAccessor.getExecutorInfo(app2Id, "exec-2", resolver3) should be (Some(shuffleInfo2)) s3.stop() - } + test("get correct recovery path") { --- End diff -- could you also add a test to make sure the move happens properly when upgrading. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/12994#discussion_r62500518 --- Diff: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java --- @@ -222,4 +219,43 @@ protected void serviceStop() { public ByteBuffer getMetaData() { return ByteBuffer.allocate(0); } + + /** + * Set the recovery path for shuffle service recovery when NM is restarted. The method will be + * overrode and called when Hadoop version is 2.5+ and NM recovery is enabled, otherwise we + * have to manually call this to set our own recovery path. + */ + public void setRecoveryPath(Path recoveryPath) { +_recoveryPath = recoveryPath; + } + + /** + * Get the recovery path, this will override the default one to get the our own maintained + * recovery path. + */ + protected Path getRecoveryPath() { +String[] localDirs = _conf.getTrimmedStrings("yarn.nodemanager.local-dirs"); +for (String dir : localDirs) { + File f = new File(new Path(dir).toUri().getPath(), "registeredExecutors.ldb"); + if (f.exists()) { +if (_recoveryPath == null) { + // If NM recovery is not enabled, we should specify the recovery path using NM local + // dirs, which is compatible with the old code. + _recoveryPath = new Path(dir); +} else { + // If NM recovery is enabled and recovery file is existed in old NM local dirs, which --- End diff -- nit change "and recovery file is existed in old" to "the recovery file exists in the old" --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/12994#discussion_r62500214 --- Diff: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java --- @@ -112,14 +118,15 @@ private boolean isAuthenticationEnabled() { */ @Override protected void serviceInit(Configuration conf) { +_conf = conf; // In case this NM was killed while there were running spark applications, we need to restore // lost state for the existing executors. We look for an existing file in the NM's local dirs. // If we don't find one, then we choose a file to use to save the state next time. Even if // an application was stopped while the NM was down, we expect yarn to call stopApplication() // when it comes back registeredExecutorFile = - findRegisteredExecutorFile(conf.getTrimmedStrings("yarn.nodemanager.local-dirs")); + new File(getRecoveryPath().toUri().getPath(), "registeredExecutors.ldb"); --- End diff -- I know you didn't change this but could you make a constant for "registeredExecutors.ldb" --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/12994#discussion_r62493162 --- Diff: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java --- @@ -222,4 +219,43 @@ protected void serviceStop() { public ByteBuffer getMetaData() { return ByteBuffer.allocate(0); } + + /** + * Set the recovery path for shuffle service recovery when NM is restarted. The method will be + * overrode and called when Hadoop version is 2.5+ and NM recovery is enabled, otherwise we + * have to manually call this to set our own recovery path. + */ + public void setRecoveryPath(Path recoveryPath) { +_recoveryPath = recoveryPath; + } + + /** + * Get the recovery path, this will override the default one to get the our own maintained --- End diff -- nit: "to get our" remove "the" --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-217764236 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-217764237 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58116/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-217764184 **[Test build #58116 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58116/consoleFull)** for PR 12994 at commit [`08557bf`](https://github.com/apache/spark/commit/08557bf4e8c47d8114af8188fc90822cc622942b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12994#issuecomment-217763069 **[Test build #58116 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58116/consoleFull)** for PR 12994 at commit [`08557bf`](https://github.com/apache/spark/commit/08557bf4e8c47d8114af8188fc90822cc622942b). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14963][Yarn] Using recoveryPath if NM r...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/12994 [SPARK-14963][Yarn] Using recoveryPath if NM recovery is enabled ## What changes were proposed in this pull request? From Hadoop 2.5+, Yarn NM supports NM recovery which using recovery path for auxiliary services such as spark_shuffle, mapreduce_shuffle. So here change to use this path install of NM local dir if NM recovery is enabled. ## How was this patch tested? Unit test + local test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jerryshao/apache-spark SPARK-14963 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12994.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12994 commit 08557bf4e8c47d8114af8188fc90822cc622942b Author: jerryshao Date: 2016-05-09T02:10:29Z Using recoveryPath if NM recovery is enabled --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org