[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940746#comment-13940746
 ] 

Hadoop QA commented on YARN-1849:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635573/yarn-1849-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3397//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3397//console

This message is automatically generated.

 NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters
 

 Key: YARN-1849
 URL: https://issues.apache.org/jira/browse/YARN-1849
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
 Attachments: yarn-1849-1.patch


 While running an UnmanagedAM on secure cluster, ran into an NPE on 
 failover/restart. This is similar to YARN-1821. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940640#comment-13940640
 ] 

Karthik Kambatla commented on YARN-1849:


This time around, it turns out the master container is null:
{code}
  if (rmAppAttempt != null) {
    if (rmAppAttempt.getMasterContainer().getId()
        .equals(containerStatus.getContainerId())
        && containerStatus.getState() == ContainerState.COMPLETE) {
{code}
Looks like it is not necessary for an UnmanagedAM to have a master container. 
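
To make the failure concrete, here is a minimal, self-contained sketch. The classes below are hypothetical stand-ins, not the real Hadoop types, and the guarded variant is only one possible shape for a fix, not the attached patch. It assumes an unmanaged AM attempt is modelled as having a null master container.

```java
// Hypothetical stand-ins for the YARN types in the snippet above; these are
// NOT the real Hadoop classes, only enough structure to show the failure.
final class MasterContainerSketch {
  enum ContainerState { RUNNING, COMPLETE }

  static final class Container {
    final String id;
    Container(String id) { this.id = id; }
  }

  static final class ContainerStatus {
    final String containerId;
    final ContainerState state;
    ContainerStatus(String containerId, ContainerState state) {
      this.containerId = containerId;
      this.state = state;
    }
  }

  static final class AppAttempt {
    // Assumption: null when the attempt belongs to an unmanaged AM.
    final Container masterContainer;
    AppAttempt(Container masterContainer) { this.masterContainer = masterContainer; }
  }

  // Same shape as the quoted condition: dereferences the master container
  // unconditionally, so it throws NullPointerException when there is none.
  static boolean masterFinishedUnguarded(AppAttempt attempt, ContainerStatus status) {
    return attempt.masterContainer.id.equals(status.containerId)
        && status.state == ContainerState.COMPLETE;
  }

  // Guarded variant: an attempt without a master container never matches.
  static boolean masterFinishedGuarded(AppAttempt attempt, ContainerStatus status) {
    Container master = attempt.masterContainer;
    return master != null
        && master.id.equals(status.containerId)
        && status.state == ContainerState.COMPLETE;
  }

  public static void main(String[] args) {
    AppAttempt unmanaged = new AppAttempt(null);
    ContainerStatus done = new ContainerStatus("container_1", ContainerState.COMPLETE);
    System.out.println("guarded: " + masterFinishedGuarded(unmanaged, done));
    try {
      masterFinishedUnguarded(unmanaged, done);
    } catch (NullPointerException e) {
      System.out.println("unguarded check threw NullPointerException");
    }
  }
}
```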


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940936#comment-13940936
 ] 

Hadoop QA commented on YARN-1849:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635622/yarn-1849-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1492 javac 
compiler warnings (more than the trunk's current 1491 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStoreZKClientConnections

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3398//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/3398//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3398//console

This message is automatically generated.


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941008#comment-13941008
 ] 

Hadoop QA commented on YARN-1849:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635635/yarn-1849-3.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3400//console

This message is automatically generated.


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941037#comment-13941037
 ] 

Jian He commented on YARN-1849:
---

Hi, I'd like to take a look at the patch. Can you wait a bit? I'll do it 
today.


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941007#comment-13941007
 ] 

Alejandro Abdelnur commented on YARN-1849:
--

+1 pending jenkins.


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941004#comment-13941004
 ] 

Karthik Kambatla commented on YARN-1849:


Tested the newest patch on a secure cluster with UAM and RM HA. Failover works 
fine. 


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941062#comment-13941062
 ] 

Hadoop QA commented on YARN-1849:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12635635/yarn-1849-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3399//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3399//console

This message is automatically generated.


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941275#comment-13941275
 ] 

Jian He commented on YARN-1849:
---

Those null checks should be valid only for a UAM. For a normal AM this should 
not happen; if it does, it's a bug. Instead of null checks, which may hide a 
bug, I suggest checking whether it is a UAM and, if it is, not sending the 
container-finished events.
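
A rough sketch of that suggestion, using hypothetical stand-in types rather than the real ResourceManager classes: the `unmanagedAm` flag is an assumption standing in for however the RM records that an application uses an unmanaged AM, and the exception type is an illustrative choice.

```java
// Hypothetical model of the suggestion: branch on "is this an unmanaged AM?"
// instead of on "is the master container null?". Not the real Hadoop API.
final class UnmanagedAmCheckSketch {
  static final class AppAttempt {
    final boolean unmanagedAm;      // hypothetical flag
    final String masterContainerId; // expected to be non-null for managed AMs
    AppAttempt(boolean unmanagedAm, String masterContainerId) {
      this.unmanagedAm = unmanagedAm;
      this.masterContainerId = masterContainerId;
    }
  }

  // Decides whether a container-finished event should be sent for the attempt.
  static boolean shouldSendContainerFinished(AppAttempt attempt, String completedContainerId) {
    if (attempt.unmanagedAm) {
      // An unmanaged AM has no master container, so there is nothing to send.
      return false;
    }
    if (attempt.masterContainerId == null) {
      // For a managed AM this state is a bug; surface it instead of hiding it.
      throw new IllegalStateException("managed AM attempt has no master container");
    }
    return attempt.masterContainerId.equals(completedContainerId);
  }

  public static void main(String[] args) {
    System.out.println(shouldSendContainerFinished(new AppAttempt(true, null), "c1"));
    System.out.println(shouldSendContainerFinished(new AppAttempt(false, "c1"), "c1"));
  }
}
```

With this shape, a null master container on a managed AM still fails loudly, while the unmanaged case is handled explicitly.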


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941283#comment-13941283
 ] 

Karthik Kambatla commented on YARN-1849:


Thanks [~jianhe]. I partially agree; in fact, I was initially thinking of 
doing that. However, if we do run into these nulls for managed AMs, not 
handling them leads to the NM going down. Logging the errors will tell us 
that something is wrong without taking the nodes down. 
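
The log-and-continue approach could look roughly like the sketch below. The types are hypothetical stand-ins, `java.util.logging` is used only to keep the example self-contained, and the method is illustrative rather than taken from the patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of "log the error, keep the node up". Not the real
// ResourceTrackerService code.
final class LogAndContinueSketch {
  private static final Logger LOG =
      Logger.getLogger(LogAndContinueSketch.class.getName());

  static final class AppAttempt {
    final String attemptId;
    final String masterContainerId; // null models the unexpected state
    AppAttempt(String attemptId, String masterContainerId) {
      this.attemptId = attemptId;
      this.masterContainerId = masterContainerId;
    }
  }

  // Counts the attempts whose master container is the completed container.
  // Attempts with no master container are logged and skipped, not fatal.
  static int countMasterFinished(List<AppAttempt> attempts, String completedContainerId) {
    int matched = 0;
    for (AppAttempt attempt : attempts) {
      if (attempt.masterContainerId == null) {
        LOG.severe("No master container recorded for attempt "
            + attempt.attemptId + "; skipping");
        continue;
      }
      if (attempt.masterContainerId.equals(completedContainerId)) {
        matched++;
      }
    }
    return matched;
  }

  public static void main(String[] args) {
    List<AppAttempt> attempts = Arrays.asList(
        new AppAttempt("attempt_1", "c1"),
        new AppAttempt("attempt_2", null),
        new AppAttempt("attempt_3", "c2"));
    System.out.println(countMasterFinished(attempts, "c1"));
  }
}
```

The trade-off is that the unexpected state shows up only in the logs, so it is easier to miss than a crash.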


[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941310#comment-13941310
 ] 

Vinod Kumar Vavilapalli commented on YARN-1849:
---

Haven't looked at the patch, but in general there is a constant tussle between 
keeping things up and failing fast so that bugs can be found and fixed.

I would in general avoid null checks unless I am sure they are needed. Failing 
the RM/NM at least uncovers the bug, instead of limping along with it and then 
breaking somewhere else, at which point it becomes hard to root-cause. If 
possible, let's fix what is actually broken here instead of putting in a lot of 
null checks (if that is what the above comments are talking about). Sure, we 
may run into one more issue that we haven't foreseen, but we can at least take 
comfort in knowing that we are addressing the right corner cases.
