[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066251#comment-14066251 ] Hudson commented on YARN-1341:
--
FAILURE: Integrated in Hadoop-Yarn-trunk #616 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/616/])
YARN-1341. Recover NMTokens upon nodemanager restart. (Contributed by Jason Lowe) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611512)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/security/BaseNMTokenSecretManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMNullStateStoreService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMStateStoreService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMTokenSecretManagerInNM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMMemoryStateStoreService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/security
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/security/TestNMTokenSecretManagerInNM.java

Recover NMTokens upon nodemanager restart
Key: YARN-1341
URL: https://issues.apache.org/jira/browse/YARN-1341
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Fix For: 2.6.0
Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch, YARN-1341v4-and-YARN-1987.patch, YARN-1341v5.patch, YARN-1341v6.patch, YARN-1341v7.patch
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066342#comment-14066342 ] Hudson commented on YARN-1341:
--
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1835 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1835/])
YARN-1341. Recover NMTokens upon nodemanager restart. (Contributed by Jason Lowe) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611512)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066365#comment-14066365 ] Hudson commented on YARN-1341:
--
FAILURE: Integrated in Hadoop-Hdfs-trunk #1808 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1808/])
YARN-1341. Recover NMTokens upon nodemanager restart. (Contributed by Jason Lowe) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611512)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064870#comment-14064870 ] Hadoop QA commented on YARN-1341:
--
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656064/YARN-1341v7.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesContainers
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4344//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4344//console
This message is automatically generated.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065716#comment-14065716 ] Junping Du commented on YARN-1341:
--
+1. Patch looks good. Will commit it shortly.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065715#comment-14065715 ] Junping Du commented on YARN-1341:
--
I can confirm the test failure is not related to the patch, as it also shows up in YARN-2045. A similar issue happens in the AM WebServices (MAPREDUCE-5973) and the RM WebServices (YARN-2304) as well. Already filed YARN-2316 to track these failures.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065795#comment-14065795 ] Hudson commented on YARN-1341:
--
FAILURE: Integrated in Hadoop-trunk-Commit #5906 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5906/])
YARN-1341. Recover NMTokens upon nodemanager restart. (Contributed by Jason Lowe) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1611512)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063381#comment-14063381 ] Junping Du commented on YARN-1341:
--
Thanks [~jlowe] for the reply above and the documentation work on the umbrella JIRA. I think we can add an error-handling section later covering all cases according to the discussion above, right? I also agree that a retry (configurable or not) may not be necessary at this point; we can add it in the future if we really need it. [~devaraj.k], what do you think?
The latest patch looks good to me overall. Some comments on minor issues. In NMTokenSecretManagerInNM.java:
{code}
+// if there was no master key, try the previous key
+if (super.currentMasterKey == null) {
+  super.currentMasterKey = previousMasterKey;
+}
{code}
Is the above code still necessary, given that currentMasterKey will be updated soon via RM registration as we discussed above?
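For context, the fallback being questioned can be sketched as a small standalone class. This is an illustrative simplification, not the actual NMTokenSecretManagerInNM internals: keys are reduced to Integer ids, and the class and method names are made up for this sketch.

```java
// Simplified stand-in for the recovery fallback quoted above. The
// real code works with org.apache.hadoop.yarn.server.api.records.MasterKey;
// here a key is just an Integer id.
public class KeyRecoverySketch {
    Integer currentMasterKey;   // may be null if it was never persisted
    Integer previousMasterKey;

    // Mirrors the quoted patch logic: if no current key was recovered
    // from the state store, fall back to the previous key so the NM can
    // still validate recently issued NMTokens until re-registration
    // with the RM pushes a fresh current key.
    void recover(Integer storedCurrent, Integer storedPrevious) {
        currentMasterKey = storedCurrent;
        previousMasterKey = storedPrevious;
        if (currentMasterKey == null) {
            currentMasterKey = previousMasterKey;
        }
    }

    public static void main(String[] args) {
        KeyRecoverySketch s = new KeyRecoverySketch();
        s.recover(null, 42);
        System.out.println(s.currentMasterKey);  // prints 42
    }
}
```

The question in the comment is whether this fallback matters in practice, since the RM re-registration that follows restart will overwrite currentMasterKey anyway; the fallback only narrows the window before that registration completes.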
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063704#comment-14063704 ] Hadoop QA commented on YARN-1341:
--
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12656064/YARN-1341v7.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesContainers
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices
org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4323//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4323//console
This message is automatically generated.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063981#comment-14063981 ] Jason Lowe commented on YARN-1341:
--
I believe the test failures are unrelated. All of the nodemanager tests pass for me locally, and the build report with the test failures shows they're all of the java.net.BindException: Address already in use variety. Other web services tests have been failing in MAPREDUCE in a similar way -- I suspect a process from a previous test run got stuck on a popular web service port.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064427#comment-14064427 ] Junping Du commented on YARN-1341:
--
Agree. The test failures shouldn't be related to the latest patch. Kicking off the Jenkins test again.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061891#comment-14061891 ] Junping Du commented on YARN-1341:
--
Hey [~jlowe], I also agree it is better to discuss the inconsistent-state scenarios for each case on separate JIRAs. However, for now, our conclusions from these discussions can only be shown true in theory; there may still be bugs or issues in practice. Thus, I also suggest we have a central place to document these assumptions/conclusions from the discussions; it would help us and others in the community identify potential issues when coming up with unit tests or other integration tests on negative cases later. What do you think? If you also agree, we can separate this documentation effort into another JIRA (the umbrella or a dedicated one, whatever you like) and continue the discussion on this particular case.
On this particular one, the assumptions from the discussion above seem to be: if the NM restarts with stale keys,
a. if currentMasterKey is stale, it will be updated and overridden soon when registering with the RM later. Nothing is affected.
b. if previousMasterKey is stale, then the real previous master key is lost, so the effect is: AMs holding the real previous master key cannot connect to the NM to launch containers.
c. if applicationMasterKeys are stale, then the old keys tracked in applicationMasterKeys are lost after restart. The effect is: AMs holding old keys cannot connect to the NM to launch containers.
Given the effects listed here, I would prefer option 1 too. Anything I am missing?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062861#comment-14062861 ] Jason Lowe commented on YARN-1341:
--
Thanks for commenting, Devaraj! My apologies for the late reply, as I was on vacation and am still catching up.
bq. In addition to option 1), I'd think of bringing the NM down if the NM fails to store RM keys a certain (configurable) number of times consecutively.
As for retries, I mentioned earlier that if retries are likely to help then the state store implementation should do so rather than have the common code do so. For the leveldb implementation it is very unlikely that a retry is going to do anything other than just make the operation take longer to ultimately fail. The firmware of the drive is already going to implement a large number of retries to attempt to recover from hardware errors, and non-hardware local filesystem errors are highly unlikely to be fixed by simply retrying immediately. If that were the case then I'd expect retries to be implemented in many other places where the local filesystem is used by Hadoop code.
bq. And also we can make it (i.e. tear down the NM or not) configurable
I'd like to avoid adding yet more config options unless we think we really need them, but if people agree this needs to be configurable then we can do so. Also I assume in that scenario you would want the NM to shut down while also tearing down containers, cleaning up, etc. as if it didn't support recovery. Tearing down the NM on a state store error just to have it start up again and try to recover with stale state seems pointless -- might as well have just kept running, which is a better outcome. Or am I missing a use case for that?
And thanks, Junping, for the recent comments!
bq. If you also agree, we can separate this documentation effort into another JIRA (the umbrella or a dedicated one, whatever you like) and continue the discussion on this particular case.
Sure, we can discuss general error handling or an overall document for it either on YARN-1336 or a new JIRA.
bq. a. if currentMasterKey is stale, it will be updated and overridden soon when registering with the RM later. Nothing is affected.
Correct, the NM should receive the current master key upon re-registration with the RM after it restarts.
bq. b. if previousMasterKey is stale, then the real previous master key is lost, so the effect is: AMs holding the real previous master key cannot connect to the NM to launch containers.
AMs that have the current master key will still be able to connect because the NM just got the current master key as described in a). AMs that have the previous master key will not be able to connect to the NM unless that particular master key also happened to be successfully associated with the attempt in the state store (related to case c).
bq. c. if applicationMasterKeys are stale, then the old keys tracked in applicationMasterKeys are lost after restart. The effect is: AMs holding old keys cannot connect to the NM to launch containers.
AMs that use an old key (i.e.: not the current or previous master key) would be unable to connect to the NM.
bq. Anything I am missing?
I don't believe so. The bottom line is that an AM may not be able to successfully connect to an NM after a restart with stale NM token state.
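The accept/reject behavior described in cases a) through c) can be summarized in a small sketch. This is illustrative only (not the real NMTokenSecretManagerInNM API): keys are Integer ids, and the names are made up for this sketch. The NM accepts a token only if its key id matches the current key, the previous key, or a key still tracked for the application.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative model of why an AM's NMToken only works after restart
// if the key it was signed with survived in the recovered state.
public class TokenCheckSketch {
    Integer currentKeyId;    // refreshed at RM re-registration (case a)
    Integer previousKeyId;   // lost if the stored copy was stale (case b)
    Set<Integer> recoveredAppKeyIds = new HashSet<>();  // case c

    boolean canConnect(int tokenKeyId) {
        return (currentKeyId != null && currentKeyId == tokenKeyId)
            || (previousKeyId != null && previousKeyId == tokenKeyId)
            || recoveredAppKeyIds.contains(tokenKeyId);
    }

    public static void main(String[] args) {
        TokenCheckSketch t = new TokenCheckSketch();
        t.currentKeyId = 7;          // freshly pushed by the RM
        t.recoveredAppKeyIds.add(3); // an app key that was persisted
        System.out.println(t.canConnect(7));  // true: current key
        System.out.println(t.canConnect(3));  // true: recovered app key
        System.out.println(t.canConnect(5));  // false: key lost in stale state
    }
}
```

With stale state, currentKeyId is repaired by re-registration, but an AM holding a key that falls only under previousKeyId or recoveredAppKeyIds and was not persisted gets canConnect == false, which is the "AM cannot connect to the NM" outcome discussed above.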
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063031#comment-14063031 ] Hadoop QA commented on YARN-1341:
--
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12651342/YARN-1341v6.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4319//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4319//console
This message is automatically generated.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047910#comment-14047910 ] Devaraj K commented on YARN-1341:
--
Sorry for coming late here. +1 for limiting the implementation/discussion to the JIRA title and handling other cases in their respective JIRAs.
In addition to option 1), I'd think of bringing the NM down if the NM fails to store RM keys a certain (configurable) number of times consecutively. And also we can make it (i.e. tear down the NM or not) configurable and let users choose whether to enable or disable bringing the NM down on RM key state-store failures.
Similarly, for container/application state-store failures, the NM can mark that container/application as failed and report it to the RM. These can be discussed in more detail in the corresponding JIRAs, YARN-1337 and YARN-1354. However, for all these NM state-store operations, we could think of having retries before throwing the IOException. Thoughts?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045959#comment-14045959 ] Jason Lowe commented on YARN-1341:
--
Agree it's not ideal to discuss handling state store errors for all NM components in this JIRA. In general I'd prefer to discuss and address each case in the corresponding JIRA, e.g.: application state store errors discussed and addressed in YARN-1354, container state store errors in YARN-1337, etc. If we feel there's significant utility in committing a JIRA before all the issues are addressed then we can file one or more followup JIRAs to track those outstanding issues. That's the normal process we follow with other features/fixes as well.
So if we follow that process then we're back to the discussion about RM master keys not being able to be stored in the state store. The choices we've discussed are:
1) Log an error, update the master key in memory, and continue
2) Log an error, _not_ update the master key in memory, and continue
3) Log an error and tear down the NM
I'd prefer 1) since that is the option that preserves the most work in all scenarios I can think of, and I don't know of a scenario where 2) would handle it better. However I could be convinced given the right scenario. I'd really rather avoid 3) since that seems like a severe way to handle the error and guarantees work is lost.
Oh, there is one more handling scenario we briefly discussed, where we flag the NM as undesirable. When that occurs we don't shoot the containers that are running, but we avoid adding new containers since the node is having issues (i.e.: a drain-decommission). I feel that would be a separate JIRA since it needs YARN-914, and we'd still need to decide how to handle the error until the decommission is complete (i.e.: choice 1 or 2 above).
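Option 1 above can be sketched in a few lines. This is a hypothetical illustration, not the actual NMStateStoreService or NMTokenSecretManagerInNM code: the Store interface and all names here are stand-ins invented for the sketch.

```java
import java.io.IOException;
import java.util.logging.Logger;

// Sketch of option 1: on a state-store write failure, log the error
// but still apply the new RM master key in memory, preserving as much
// running work as possible instead of tearing the NM down.
public class KeyUpdateSketch {
    interface Store { void storeKey(int keyId) throws IOException; }

    private static final Logger LOG = Logger.getLogger("KeyUpdateSketch");
    private final Store store;
    Integer currentKeyId;

    KeyUpdateSketch(Store store) { this.store = store; }

    void onNewMasterKey(int keyId) {
        try {
            store.storeKey(keyId);
        } catch (IOException e) {
            // Option 1: don't tear down the NM; a later key update may
            // persist once the underlying problem (e.g. full disk) clears.
            LOG.severe("Failed to persist master key " + keyId + ": " + e);
        }
        currentKeyId = keyId;  // update in memory regardless
    }

    public static void main(String[] args) {
        KeyUpdateSketch nm =
            new KeyUpdateSketch(k -> { throw new IOException("disk full"); });
        nm.onNewMasterKey(9);
        System.out.println(nm.currentKeyId);  // prints 9
    }
}
```

Option 2 would move the `currentKeyId = keyId;` assignment inside the try block, and option 3 would rethrow from the catch; the trade-off Jason describes is that only option 1 keeps the NM able to validate tokens signed with the newest key while the store is unhealthy.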
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045522#comment-14045522 ] Junping Du commented on YARN-1341: -- [~jlowe], I agree we should treat each state-inconsistency case individually, and it is good that we have already had many discussions covering cases that could go beyond the work in this JIRA. I prefer to continue these discussions in separate JIRAs until we are sure all scenarios are covered properly (without blocking this JIRA's work). What do you think?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042559#comment-14042559 ] Jason Lowe commented on YARN-1341: -- bq. As far as I know, RM restart doesn't track this because these metrics are recovered during event replay on RM restart. With the current NM restart, some metrics could be lost, e.g. allocatedContainers. I think we should either count them back as part of the events replayed during recovery or persist them. Thoughts? Not all of the RM metrics will be recovered, correct? RPC metrics will be zeroed since those aren't persisted (nor should they be, IMHO). Aggregate containers allocated/released in the queue metrics will be wrong since the RM restart work, by design, doesn't store per-container state. If the cluster stays up long enough then apps submitted/completed/failed/killed will not be correct, as I believe it will only count the applications that haven't been reaped due to retention policies. Anyway this is outside the scope of this JIRA, and I'll file a separate JIRA underneath the YARN-1336 umbrella to discuss what we should do about NM metrics and restart. bq. If so, how about we don't apply these changes until they can be persisted? That way we keep the state store consistent with the NM's current state. Even if we choose to fail the NM, we can still load the state and recover the work. Again I think this is a case-by-case thing. For the RM master key, I'd rather keep going with the current master key and hope the next key update is able to persist (e.g.: a full disk where the state is stored that is later cleared up) rather than ditch the new key update and risk bringing down the NM because it can no longer keep talking to the RM or AMs. 
As I mentioned earlier, the consequence of failing to persist the RM master key or the master key used by an AM is that _if_ the NM happens to restart then some AMs _might_ not be able to authenticate with the NM until they get updated to the new master key. If we take down the NM, or keep going but fail to update the master key in memory, then this seems purely worse. The opportunity for error has widened, but I don't see any advantage gained by doing so. bq. Do we expect some operations to fail while other operations succeed? If this means the persistence layer is briefly unavailable, we can just handle it by adding retries. If not, we should expect the fatal operations to fail soon enough as well, in which case logging the error and moving on for non-fatal operations doesn't make much difference. No? I don't expect immediate retry to help, and if the state store implementation is such that immediate retry is likely to help then the state store implementation should do that directly before throwing the error rather than relying on the upper-layer code to do so. However I do expect there to be common failure modes where the error state is temporary but not in the immediate sense (e.g.: the full disk scenario). And although an NM can't launch containers without a working state store, there's still a lot of useful stuff an NM can do with a broken state store -- report status of active containers, serve up shuffle data, etc. So far I don't think any of the state store updates should result in a teardown of the NM if there is a failure, although please let me know if you have a scenario where we should. 
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041077#comment-14041077 ] Junping Du commented on YARN-1341: -- bq. Yes, applications should be like containers. If we fail to store an application start in the state store then we should fail the container launch that triggered the application to be added. This already happens in the current patch for YARN-1354. If we fail to store the completion of an application then worst-case we will report an application to the RM on restart that isn't active, and the RM will correct the NM when it re-registers. That makes sense. I guess we should do some additional work to check that the behavior is what we expect. bq. I wasn't planning on persisting metrics during restart, as there are quite a few (e.g.: RPC metrics, etc.), and I'm not sure it's critical that they be preserved across a restart. Does RM restart do this or are there plans to do so? I think these metrics are important, especially for users' monitoring tools, and we should keep this info consistent across a restart. As far as I know, RM restart doesn't track this because these metrics are recovered during event replay on RM restart. With the current NM restart, some metrics could be lost, e.g. allocatedContainers. I think we should either count them back as part of the events replayed during recovery or persist them. Thoughts? bq. Therefore I don't believe the effort to maintain a stale tag is going to be worth it. Also if we refuse to load a state store that's stale then we are going to leak containers because we won't try to recover anything from a stale state store. If so, how about we don't apply these changes until they can be persisted? That way we keep the state store consistent with the NM's current state. Even if we choose to fail the NM, we can still load the state and recover the work. bq. 
Instead I think we should decide in the various store failure cases whether the error should be fatal to the operation (which may lead to it being fatal to the NM overall) or if we feel the recovery with stale information is a better outcome than taking the NM down. In the latter case we should just log the error and move on. Do we expect some operations to fail while other operations succeed? If this means the persistence layer is briefly unavailable, we can just handle it by adding retries. If not, we should expect the fatal operations to fail soon enough as well, in which case logging the error and moving on for non-fatal operations doesn't make much difference. No?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039138#comment-14039138 ] Jason Lowe commented on YARN-1341: -- bq. The worst case, it seems to me, is: the NM restarts with partial state recovered, and this inconsistent state is not visible to running containers, which could cause some weird bugs. Yes, you're correct. The worst case is likely where we come up, fail to realize a container is running, and therefore the container leaks. I think we should handle store errors on a case-by-case basis, based on the ramifications of how the system will recover without that information. For containers, a container should fail to launch if state store errors occur while recording the container request and/or the container launch. The YARN-1336 prototype patch already does this when containers are requested and in ContainerLaunch. That way the worst-case scenario for a container is that we throw an error for the container request or the container fails before launch due to a state store error. We failed to launch a container, but the whole NM doesn't go down. If we fail to mark the container completed in the store then the worst-case scenario is that we try to recover a container that isn't there, which again will mark the container as failed and we'll report that to the RM. If the RM doesn't know about the failed container (because the container/app is long gone) then it will just ignore it. For the deletion service, if we fail to update the store then we may fail to delete something when we recover, if we happened to restart in between. If we ignore the error then it's very likely the NM will _not_ restart before the deletion time expires and the file is deleted. However if we tear down the NM on a store error then we will also fail to delete it when the NM restarts later since we failed to record it, meaning we made things purely worse -- we lost work _and_ leaked the thing we were supposed to delete. 
Therefore for deletion tasks I think the current behavior is appropriate. For localized resources, failing to update the store means we could end up leaking a resource or thinking a resource is there when it's really not. The latter isn't a huge problem because when we try to reference the resource again it checks whether it's there, and if it isn't it re-localizes it. Not knowing a resource is there is a bigger issue, and there are a couple of ways to tackle that one -- either fail the localization of the resource when the state store error occurs, or have the NM scan the local resource directories for unknown resources when it recovers. For the RM master key, I see it as very similar to the deletion task case. If we fail to store it then the NM will update it in memory and can keep going. If we restart without recovering an older key (the current key will be obtained when the NM re-registers with the RM) then we may fail to let AMs connect that only have an older key. Containers that were still on the NM will still continue. If we take down the NM when the store hiccups then we lose work, which seems worse than the possibility that an AM could fail to connect to the NM (which can and does already happen today due to network cuts, etc.) bq. Maybe we add a stale tag on the NMStateStore, mark it when a store failure happens, and never load a stale store. If we had an error storing then we're likely to have the same error trying to store a stale tag, or am I misunderstanding the proposal? Also as I mentioned above, there are many cases where a partial recovery isn't a bad thing, as the system can recover via other means (e.g.: trying to recover a container that already completed should be benign, trying to delete a container directory that's already deleted is benign, etc.). 
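The second option mentioned above, having the NM scan the local resource directories for resources its recovered state doesn't know about, might look roughly like the following. The class and method names are illustrative, not from the patch:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative sketch: compare a local resource directory against the
 *  set of resources recovered from the state store. */
class OrphanResourceScanner {
  /** Return entries present on disk but absent from the recovered
   *  state; each one is a candidate leak to delete or re-register. */
  static List<Path> findUnknown(Path localDir, Set<Path> knownResources)
      throws IOException {
    List<Path> unknown = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(localDir)) {
      for (Path entry : entries) {
        if (!knownResources.contains(entry)) {
          unknown.add(entry);
        }
      }
    }
    return unknown;
  }
}
```

Such a scan trades a one-time recovery cost for not having to make every localization store update fatal.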
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039206#comment-14039206 ] Junping Du commented on YARN-1341: -- Thanks [~jlowe] for the detailed explanation here! I totally agree that we should deal with this case by case, and I appreciate your analysis of the cases above. I think there are still other cases we should double-check; some of them may suffer more from inconsistency. - Application state - If we fail to store an application update, e.g. from init to finished, then we get the wrong application state after recovery. - NodeManagerMetrics - The NM's metrics will get messed up if partially updated. (We don't have a JIRA to store/recover this yet, do we?) On the side effect of bringing the NM down, as in the deletion-service case: I think we can just clean up these directories (as we plan to do in node decommission cases). About the stale tag on the NMStateStore - I don't mean to put it in the NMStateStore, but I haven't thought through where to put it - maybe we can persist it on local disk directly, or send it to the RM and retrieve it during NM registration?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039235#comment-14039235 ] Jason Lowe commented on YARN-1341: -- bq. Application state - If we fail to store an application update, e.g. from init to finished, then we get the wrong application state after recovery. Yes, applications should be like containers. If we fail to store an application start in the state store then we should fail the container launch that triggered the application to be added. This already happens in the current patch for YARN-1354. If we fail to store the completion of an application then worst-case we will report an application to the RM on restart that isn't active, and the RM will correct the NM when it re-registers. bq. NodeManagerMetrics - The NM's metrics will get messed up if partially updated. I wasn't planning on persisting metrics during restart, as there are quite a few (e.g.: RPC metrics, etc.), and I'm not sure it's critical that they be preserved across a restart. Does RM restart do this or are there plans to do so? bq. About the stale tag on the NMStateStore - I don't mean to put it in the NMStateStore, but I haven't thought through where to put it - maybe we can persist it on local disk directly, or send it to the RM and retrieve it during NM registration? I think in most cases the attempt to update the stale tag, even if it's separate from the NMStateStore, will often fail in a similar way when the state store fails (e.g.: full local disk, read-only filesystem, etc.). Therefore I don't believe the effort to maintain a stale tag is going to be worth it. Also if we refuse to load a state store that's stale then we are going to leak containers because we won't try to recover anything from a stale state store. 
Instead I think we should decide in the various store failure cases whether the error should be fatal to the operation (which may lead to it being fatal to the NM overall) or if we feel the recovery with stale information is a better outcome than taking the NM down. In the latter case we should just log the error and move on.
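The per-operation decision described above could be captured in a small policy table. This is a minimal sketch; the enum, its members, and the classification (only the container request/launch path fails outright, per the earlier discussion of containers, deletion tasks, and master keys) are illustrative, not from the committed code:

```java
/** Sketch: each kind of state-store update declares whether a store
 *  failure should fail the triggering operation or just be logged. */
class StoreFailurePolicy {
  enum StoreOp {
    CONTAINER_REQUEST,   // failing to record a requested container
    CONTAINER_COMPLETED, // failing to record a finished container
    DELETION_TASK,       // failing to record a pending deletion
    MASTER_KEY           // failing to record an RM master key update
  }

  /** Per the discussion above: only the container request/launch path
   *  should fail outright; the rest log the error and move on. */
  static boolean isFatalToOperation(StoreOp op) {
    return op == StoreOp.CONTAINER_REQUEST;
  }
}
```

Centralizing the choice this way keeps the fatal-versus-log decision auditable in one place instead of scattered across callers.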
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037656#comment-14037656 ] Junping Du commented on YARN-1341: -- bq. I'm not sure I understand what you're requesting. Recovering the NM tokens is one line of code (3 if we count the if canRecover part), and recovering the container tokens in YARN-1342 will add one more line for that (inside the same if canRecover block). I went ahead and factored this into a separate method, however I'm not sure it matches what you were expecting as I don't see where we're saving duplicated code. If what's in the updated patch isn't what you expected, please provide some sample pseudo-code to demonstrate how we can avoid duplication of code. I think it is fine for now. However, I would like to refactor NodeManager#serviceInit() a bit once we finish all this recovery work, to avoid some duplicate work; in some code, like createNMContext(), we set some handlers more than once. Anyway, we can do this later. bq. The problem with throwing an exception is what to do with the exception – do we take down the NM? That seems like a drastic answer since the NM will likely chug along just fine without the key stored. It only becomes a problem when the NM restarts and restores an old key. However if we rollback the old key here then we take that only-breaks-if-we-happened-to-restart case and make it an always-breaks scenario. Eventually the old key will no longer be valid to the RM, and none of the AMs will be able to authenticate to the NM. Therefore I thought it would be better to log the error, press onward, and hope we don't restart before we store a valid key again (maybe the store error was transient) rather than either take down the NM or have things start failing even without a restart. We already have a similar tradeoff on the RM side: if any exception happens in the RMStore, it brings down the RM. 
In the NM case, if leveldb stops working, I think we should bring the NM down to avoid any inconsistency after an NM restart. I am not sure what weird things could happen from inconsistency here, but considering it is cheaper to bring down an NM, we should play it safer in our case than in the RM's. Actually, I brought up some thoughts on playing it riskier on the RM side in YARN-2019, which targets reducing RM service downtime. But here, I prefer to be safer. Jason, what do you think?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037868#comment-14037868 ] Junping Du commented on YARN-1341: -- bq. Restarts should be rare, and I'd rather not force a loss of work by taking the NM down instantly when the state store hiccups. Yes, but considering the rolling-upgrade case, restarts should be much more frequent than state store failures (correct me if I am wrong, as I am not a leveldb expert). In that case we always expect some work loss, since even if we don't bring the NM down now, we will suffer after the NM restarts during an upgrade. bq. If the state store is missing some things, we might not be able to recover a localized resource, a token, a container, or possibly anything at all. I am not worried about losing them all, but if we can only partially recover these, would that become a problem and break some assumptions we have? I don't know, but it seems to make things more complicated. bq. in the worst-case, the state store is so corrupted on startup that we don't even survive the NM restart and the NM crashes, which would have an end result just like if we took it down when the state store failed. I am not sure this is the worst case. The worst case, it seems to me, is: the NM restarts with partial state recovered, and this inconsistent state is not visible to running containers, which could cause some weird bugs. I am not sure how likely that is here, please correct me if I am wrong. bq. Therefore I'd rather not guarantee that we'll lose work by crashing the NM on any store error and instead try to preserve the work we have. The NM could theoretically recover (e.g.: if the error is transient then the next RM key store could succeed). If we take the NM down immediately then we're guaranteeing the work is lost. Is that really better? I think it is better to guarantee the work is lost, so that the expectation presented to users is consistent. 
We don't know when a new token from the RM will arrive to refresh the stale one, so preserving work would only succeed by luck. Users shouldn't expect work to still be preserved after an NM restart if the state store failed at some point. bq. May be a better approach is to have errors like this trigger an unhealthy state for the NM when we have the ability to do a graceful decommission. I agree. This could be a better approach. Overall, I agree that we can just log the error here without bringing the NM down (otherwise we would have to change the existing code that updates localizedResources/deletionServices), for the reasons you specified above. However, to avoid loading inconsistent state and to manage users' expectations, I think we shouldn't allow the state to be loaded again if some store failure happened earlier. Maybe we add a stale tag on the NMStateStore, mark it when a store failure happens, and never load a stale store. [~jlowe], what do you think?
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036592#comment-14036592 ] Junping Du commented on YARN-1341: -- Thanks for updating the patch, [~jlowe]! Some minor comments: The change in BaseContainerTokenSecretManager.java is not necessary, and I believe it belongs to YARN-1342. Let's remove it from this patch. In NodeManager.java,
{code}
 NMTokenSecretManagerInNM nmTokenSecretManager =
-    new NMTokenSecretManagerInNM();
+    new NMTokenSecretManagerInNM(nmStore);
+
+if (nmStore.canRecover()) {
+  nmTokenSecretManager.recover(nmStore.loadNMTokenState());
+}
{code}
Can we consolidate this code into a separate method, together with the NMContainerTokenSecretManager, since we will do a similar thing to recover the ContainerToken state and the code would otherwise be duplicated? In NMTokenSecretManagerInNM.java,
{code}
+  private void updateCurrentMasterKey(MasterKeyData key) {
+    super.currentMasterKey = key;
+    try {
+      stateStore.storeNMTokenCurrentMasterKey(key.getMasterKey());
+    } catch (IOException e) {
+      LOG.error("Unable to update current master key in state store", e);
+    }
+  }
+
+  private void updatePreviousMasterKey(MasterKeyData key) {
+    previousMasterKey = key;
+    try {
+      stateStore.storeNMTokenPreviousMasterKey(key.getMasterKey());
+    } catch (IOException e) {
+      LOG.error("Unable to update previous master key in state store", e);
+    }
+  }
{code}
Is logging an error here enough in case the store fails? If the master key is updated in memory but not persisted, it could cause some inconsistency when we recover it. I think we should throw an exception here if the store fails, and roll back the key we just set. Thoughts? 
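The throw-and-rollback alternative proposed above could be sketched as follows. Persisting the key before mutating memory achieves the same end without explicit rollback bookkeeping: a store failure leaves the old key current both on disk and in memory. This is a self-contained illustration; the KeyStore interface and class names are stand-ins, not the real NMStateStoreService wiring:

```java
import java.io.IOException;

/** Illustrative stand-in for the state-store dependency. */
interface KeyStore {
  void storeCurrentMasterKey(byte[] key) throws IOException;
}

/** Sketch of the alternative: persist first, then update the in-memory
 *  key, so memory and disk can never disagree after a store failure. */
class RollbackKeyUpdater {
  private final KeyStore stateStore;
  private byte[] currentMasterKey;

  RollbackKeyUpdater(KeyStore stateStore) {
    this.stateStore = stateStore;
  }

  void updateCurrentMasterKey(byte[] newKey) throws IOException {
    // If this throws, the old key remains current everywhere and the
    // exception propagates to the caller to handle (or take the NM down).
    stateStore.storeCurrentMasterKey(newKey);
    currentMasterKey = newKey;
  }

  byte[] getCurrentMasterKey() {
    return currentMasterKey;
  }
}
```

The trade-off debated in the following comments is exactly whether propagating that exception (and possibly stopping the NM) is better than updating memory anyway and merely logging the store failure.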
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036853#comment-14036853 ] Hadoop QA commented on YARN-1341: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12651342/YARN-1341v6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4022//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4022//console This message is automatically generated.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034021#comment-14034021 ] Junping Du commented on YARN-1341: -- [~jlowe], thanks for the patch here. I am currently reviewing it, and it looks like some of the code, e.g. LeveldbIterator and NMStateStoreService, has already been committed in other patches. Would you resync the patch against trunk? Thanks!
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034588#comment-14034588 ] Hadoop QA commented on YARN-1341: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12650914/YARN-1341v5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/4017//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4017//console This message is automatically generated.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13985928#comment-13985928 ] Hadoop QA commented on YARN-1341: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12642696/YARN-1341v4-and-YARN-1987.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3667//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3667//console This message is automatically generated.
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963581#comment-13963581 ]

Hadoop QA commented on YARN-1341:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12639280/YARN-1341v3.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3534//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3534//console

This message is automatically generated.

Recover NMTokens upon nodemanager restart
-----------------------------------------

                Key: YARN-1341
                URL: https://issues.apache.org/jira/browse/YARN-1341
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: nodemanager
   Affects Versions: 2.3.0
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923276#comment-13923276 ]

Hadoop QA commented on YARN-1341:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12633251/YARN-1341.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 11 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3283//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3283//console

This message is automatically generated.

Recover NMTokens upon nodemanager restart
-----------------------------------------

                Key: YARN-1341
                URL: https://issues.apache.org/jira/browse/YARN-1341
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: nodemanager
   Affects Versions: 2.3.0
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: YARN-1341.patch

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923372#comment-13923372 ]

Hadoop QA commented on YARN-1341:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12633265/YARN-1341v2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3285//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3285//console

This message is automatically generated.

Recover NMTokens upon nodemanager restart
-----------------------------------------

                Key: YARN-1341
                URL: https://issues.apache.org/jira/browse/YARN-1341
            Project: Hadoop YARN
         Issue Type: Sub-task
         Components: nodemanager
   Affects Versions: 2.3.0
           Reporter: Jason Lowe
           Assignee: Jason Lowe
        Attachments: YARN-1341.patch, YARN-1341v2.patch

--
This message was sent by Atlassian JIRA
(v6.2#6252)