Oleksandr Shevchenko created YARN-7998: ------------------------------------------
Summary: RM crashes with NPE during recovering if ACL configuration was changed Key: YARN-7998 URL: https://issues.apache.org/jira/browse/YARN-7998 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.0.0 Reporter: Oleksandr Shevchenko RM crashes with NPE during failover because ACL configurations were changed as a result we no longer have a rights to submit an application to a queue. Scenario: # Submit an application # Change ACL configuration for a queue that accepted the application so that an owner of the application will no longer have a rights to submit this application. # Restart RM. As a result, we get NPE: 2018-02-27 18:14:00,968 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state STARTED; cause: java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplicationAttempt(FairScheduler.java:738) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1286) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:116) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1098) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AttemptRecoveredTransition.transition(RMAppAttemptImpl.java:1044) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385 -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org