[ https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15804114#comment-15804114 ]
Sunil G commented on YARN-6031:
-------------------------------

Since RMAppImpl handles the APP_REJECTED event and can move the app from NEW to FAILED (via FINAL_SAVING), I think it's fine.
Currently you are validating the resource request and doing an assert. I suggest raising a YarnException with a more meaningful message instead.
For test cases, I suggest looking at {{TestWorkPreservingRMRestartForNodeLabel}}.

> Application recovery failed after disabling node label
> ------------------------------------------------------
>
>                 Key: YARN-6031
>                 URL: https://issues.apache.org/jira/browse/YARN-6031
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>   Affects Versions: 2.8.0
>            Reporter: Ying Zhang
>            Assignee: Ying Zhang
>            Priority: Minor
>         Attachments: YARN-6031.001.patch, YARN-6031.002.patch
>
>
> Here are the repro steps:
> Enable node label, restart RM, configure CS properly, and run some jobs;
> Disable node label, restart RM, and the following exception is thrown:
> {noformat}
> Caused by: org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: Invalid resource request, node label not enabled but request contains label expression
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a
> node label expression specified while node labels had been disabled.
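
The suggestion in the comment above (throw an exception with a descriptive message instead of asserting, and let the application be rejected rather than abort recovery) could look roughly like the sketch below. It is not the YARN-6031 patch and does not use Hadoop classes: `LabelCheckSketch`, `validateLabelExpression`, `recover`, the local `InvalidLabelRequestException`, and the "RECOVERED" result string are all hypothetical stand-ins, chosen so the example compiles on its own.

```java
public class LabelCheckSketch {

    // Hypothetical stand-in for a YarnException subclass; NOT the Hadoop class.
    static class InvalidLabelRequestException extends Exception {
        InvalidLabelRequestException(String message) {
            super(message);
        }
    }

    // Throws with a descriptive message when a request carries a label
    // expression while node labels are disabled (instead of using an assert).
    static void validateLabelExpression(String appId, String labelExpression,
                                        boolean nodeLabelsEnabled)
            throws InvalidLabelRequestException {
        boolean hasLabel = labelExpression != null && !labelExpression.trim().isEmpty();
        if (hasLabel && !nodeLabelsEnabled) {
            throw new InvalidLabelRequestException(
                "Application " + appId + " requests node label expression '"
                + labelExpression + "' but node labels are disabled; "
                + "rejecting the application instead of failing recovery");
        }
    }

    // Hypothetical recovery step: a rejected application ends up FAILED,
    // and the caller can continue recovering the remaining applications.
    static String recover(String appId, String labelExpression, boolean nodeLabelsEnabled) {
        try {
            validateLabelExpression(appId, labelExpression, nodeLabelsEnabled);
            return "RECOVERED";
        } catch (InvalidLabelRequestException e) {
            System.err.println(e.getMessage());
            return "FAILED";
        }
    }

    public static void main(String[] args) {
        // Labelled request with node labels disabled is rejected.
        System.out.println(recover("application_0001", "x", false));
        // Request without a label expression is unaffected.
        System.out.println(recover("application_0002", "", false));
    }
}
```

The point of the design is that one invalid application produces a readable rejection message and a FAILED state for that application only, so the exception no longer propagates up through RM startup.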