[ 
https://issues.apache.org/jira/browse/YARN-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508855#comment-14508855
 ] 

gu-chi commented on YARN-3536:
------------------------------

2015-04-21 04:22:33,923 | INFO  | main-EventThread | Recovering app: application_1429597538411_0001 with 2 attempts and final state = FINISHED | org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.recover(RMAppImpl.java:700)
2015-04-21 04:22:33,923 | INFO  | main-EventThread | Recovering attempt: appattempt_1429597538411_0001_000001 with final state: FAILED | org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recover(RMAppAttemptImpl.java:734)
2015-04-21 04:22:33,924 | INFO  | main-EventThread | Recovering attempt: appattempt_1429597538411_0001_000002 with final state: null | org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recover(RMAppAttemptImpl.java:734)
2015-04-21 04:22:33,924 | INFO  | main-EventThread | Create AMRMToken for ApplicationAttempt: appattempt_1429597538411_0001_000002 | org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createAndGetAMRMToken(AMRMTokenSecretManager.java:195)
2015-04-21 04:22:33,924 | INFO  | main-EventThread | Creating password for appattempt_1429597538411_0001_000002 | org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager.createPassword(AMRMTokenSecretManager.java:307)
2015-04-21 04:22:33,924 | INFO  | main-EventThread | appattempt_1429597538411_0001_000001 State change from NEW to FAILED | org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:704)
2015-04-21 04:22:33,925 | INFO  | main-EventThread | Registering app attempt : appattempt_1429597538411_0001_000002 | org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerAppAttempt(ApplicationMasterService.java:656)
2015-04-21 04:22:33,925 | ERROR | main-EventThread | Failed to load/recover state | org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:533)
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:607)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:941)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:97)

> ZK exception occur when updating AppAttempt status, then NPE thrown when RM 
> do recover
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-3536
>                 URL: https://issues.apache.org/jira/browse/YARN-3536
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>    Affects Versions: 2.4.1
>            Reporter: gu-chi
>
> In this scenario the Application status is FAILED/FINISHED but the AppAttempt 
> status is null. This causes an NPE during recovery when 
> yarn.resourcemanager.work-preserving-recovery.enabled is set to true.
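
For anyone trying to reproduce this, the recovery settings involved would look roughly like the yarn-site.xml fragment below. This is only a sketch: the work-preserving property comes from the description above, while yarn.resourcemanager.recovery.enabled and the ZKRMStateStore class are assumptions based on the ZK state store mentioned in the issue title, not values confirmed by the reporter.

```xml
<!-- Sketch of the relevant yarn-site.xml entries; values are assumed, not taken from the reporter's cluster. -->
<property>
  <!-- Assumption: RM recovery must be on for state to be reloaded at startup. -->
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- From the issue description: the NPE is seen with this set to true. -->
  <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Assumption: ZooKeeper-backed state store, inferred from "ZK exception" in the title. -->
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
```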



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
