[ https://issues.apache.org/jira/browse/YARN-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247415#comment-14247415 ]
Jian He commented on YARN-2340:
-------------------------------

Today, the semantics of stopping a queue are to let the existing applications run to completion. We should retain the same semantics for RM restart as well. In this case, I think we need to ignore this exception and continue, because the application was accepted before the queue was changed to STOPPED. A similar problem could happen if we change the application ACL and restart the RM while an application is running.

> NPE thrown when RM restart after queue is STOPPED. There after RM can not recovery application's and remain in standby
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2340
>                 URL: https://issues.apache.org/jira/browse/YARN-2340
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>    Affects Versions: 2.4.1
>         Environment: Capacityscheduler with Queue a, b
>            Reporter: Nishan Shetty
>            Assignee: Rohith
>            Priority: Critical
>
> While a job is in progress, set the queue state to STOPPED and then restart the RM.
> Observe that the standby RM fails to come up as active, throwing the NPE below:
> 2014-07-23 18:43:24,432 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1406116264351_0014_000002 State change from NEW to SUBMITTED
> 2014-07-23 18:43:24,433 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ATTEMPT_ADDED to the scheduler
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.addApplicationAttempt(CapacityScheduler.java:568)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:916)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:101)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:602)
>     at java.lang.Thread.run(Thread.java:662)
> 2014-07-23 18:43:24,434 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
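
For illustration only, the semantics proposed in the comment (a STOPPED queue rejects new submissions, but applications accepted before the stop are still re-added on RM restart) could be sketched roughly as follows. This is a toy model, not the CapacityScheduler code or the eventual patch: the `Scheduler` class, `QueueStoppedException`, and the `isRecovering` flag are all hypothetical names, and the sketch uses a flag to skip the check rather than literally catching and ignoring the exception.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model; none of these types are real YARN classes. */
public class StoppedQueueRecoverySketch {

    enum QueueState { RUNNING, STOPPED }

    /** Raised when a new (non-recovered) application targets a STOPPED queue. */
    static class QueueStoppedException extends Exception {
        QueueStoppedException(String queue) {
            super("Queue " + queue + " is STOPPED");
        }
    }

    static class Scheduler {
        private final Map<String, QueueState> queueStates = new HashMap<>();
        private final Map<String, List<String>> appsByQueue = new HashMap<>();

        void setQueueState(String queue, QueueState state) {
            queueStates.put(queue, state);
            appsByQueue.computeIfAbsent(queue, q -> new ArrayList<>());
        }

        /**
         * Adds an application to a queue. When isRecovering is true (assumed to
         * mean "replayed from the state store after restart"), the STOPPED check
         * is skipped, because the app was accepted before the queue was stopped.
         */
        void addApplication(String appId, String queue, boolean isRecovering)
                throws QueueStoppedException {
            QueueState state = queueStates.get(queue);
            if (state == null) {
                throw new IllegalArgumentException("Unknown queue: " + queue);
            }
            if (state == QueueState.STOPPED && !isRecovering) {
                throw new QueueStoppedException(queue);
            }
            appsByQueue.get(queue).add(appId);
        }

        List<String> appsIn(String queue) {
            return appsByQueue.getOrDefault(queue, new ArrayList<>());
        }
    }

    public static void main(String[] args) throws Exception {
        Scheduler scheduler = new Scheduler();
        scheduler.setQueueState("a", QueueState.STOPPED);

        // Recovered application: accepted even though the queue is STOPPED.
        scheduler.addApplication("app_0014", "a", true);

        // Fresh submission: still rejected.
        try {
            scheduler.addApplication("app_0015", "a", false);
        } catch (QueueStoppedException e) {
            System.out.println("Rejected new app: " + e.getMessage());
        }
        System.out.println("Apps in queue a: " + scheduler.appsIn("a"));
    }
}
```

The same pattern would presumably apply to the ACL case mentioned in the comment: checks that guard admission are applied to new submissions only, not to applications being replayed during recovery.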