[ https://issues.apache.org/jira/browse/YARN-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742077#comment-16742077 ]
Wilfred Spiegelenburg commented on YARN-9198:
---------------------------------------------

As I [commented in the previous jira|https://issues.apache.org/jira/browse/YARN-7913?focusedCommentId=16483490&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16483490]: the CapacityScheduler (CS) and FairScheduler (FS) work differently, and this can happen for a number of reasons. ACL changes or a change in queue configuration are among them.

Just removing a running application on restore is not correct. It really breaks the restore, as you can no longer rely on the restore to pull back all running applications on a failover.
We need to go back and fix the underlying issue around the queues and config.

BTW: The CS forces you to roll back the configuration change and make sure that it always works. That might be a solution, but with the FS doing queue management in a more dynamic way that might not work.

> Corrupted state from a previous version can still cause RM to fail with NPE on FairScheduler
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-9198
>                 URL: https://issues.apache.org/jira/browse/YARN-9198
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler, resourcemanager
>    Affects Versions: 3.1.0, 2.8.5
>            Reporter: Dapeng Sun
>            Assignee: Dapeng Sun
>            Priority: Major
>         Attachments: YARN-9198.001.patch
>
>
> Previously, RM may fail with NPE due to YARN-4347, YARN-4000. After these fixes, FairScheduler still has the same potential issue.
>
> 201x-xx-xx xx:xx:xx,xxx ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart) - Failed to load/recover state
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplicationAttempt(FairScheduler.java)