[ https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785845#comment-15785845 ]
Wangda Tan commented on YARN-6031:
----------------------------------

Thanks [~Ying Zhang] for updating the patch, and thanks to [~templedf] and [~sunilg] for the comments.

I agree with the approach suggested by [~sunilg]. We can still create the RMAppImpl when an InvalidResourceRequestException is thrown (for example, by creating it with a null ResourceRequest) and print a proper error message. After {{createAndPopulateNewRMApp}}, we can check whether amRequest == null and the application is not an unmanaged AM; if so, send an APP_REJECTED event, just as {{submitApplication}} does when a credential parse error is found.

Thoughts?

> Application recovery failed after disabling node label
> ------------------------------------------------------
>
>                 Key: YARN-6031
>                 URL: https://issues.apache.org/jira/browse/YARN-6031
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.8.0
>            Reporter: Ying Zhang
>            Assignee: Ying Zhang
>            Priority: Minor
>         Attachments: YARN-6031.001.patch
>
>
> Here are the repro steps:
> Enable node label, restart RM, configure CS properly, and run some jobs;
> Disable node label, restart RM, and the following exception is thrown:
> {noformat}
> Caused by: org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: Invalid resource request, node label not enabled but request contains label expression
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a
> node label expression specified while node labels had been disabled.
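
The approach proposed in the comment (still create the application when validation fails, leave its AM request null, then reject it instead of aborting RM recovery) can be sketched roughly as follows. This is only an illustration under stated assumptions: every class, method, and event string here ({{RecoverySketch}}, {{Submission}}, {{App}}, {{validateAndCreateRequest}}, {{recover}}, "RECOVER") is a hypothetical stand-in, not the actual YARN code or the contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real YARN classes.
public class RecoverySketch {

    /** Stand-in for the invalid-resource-request exception in the stack trace. */
    static class InvalidRequestException extends Exception {
        InvalidRequestException(String msg) { super(msg); }
    }

    /** Minimal submission context: a label expression and the unmanaged-AM flag. */
    static class Submission {
        final String labelExpression;
        final boolean unmanagedAM;
        Submission(String labelExpression, boolean unmanagedAM) {
            this.labelExpression = labelExpression;
            this.unmanagedAM = unmanagedAM;
        }
    }

    /** Stand-in for the recovered application; amRequest may be null. */
    static class App {
        final String amRequest;
        final List<String> events = new ArrayList<>();
        App(String amRequest) { this.amRequest = amRequest; }
    }

    /** Fails when a label expression is present but node labels are disabled. */
    static String validateAndCreateRequest(Submission s, boolean labelsEnabled)
            throws InvalidRequestException {
        boolean hasLabel = s.labelExpression != null && !s.labelExpression.isEmpty();
        if (!labelsEnabled && hasLabel) {
            throw new InvalidRequestException(
                "node label not enabled but request contains label expression");
        }
        return "am-request[" + (hasLabel ? s.labelExpression : "") + "]";
    }

    /** Creates the app even if validation fails, then rejects it if needed. */
    static App recover(Submission s, boolean labelsEnabled) {
        String amRequest = null;
        if (!s.unmanagedAM) {
            try {
                amRequest = validateAndCreateRequest(s, labelsEnabled);
            } catch (InvalidRequestException e) {
                // Log and continue with a null request instead of failing recovery.
                System.err.println("Invalid resource request on recovery: " + e.getMessage());
            }
        }
        App app = new App(amRequest);
        if (app.amRequest == null && !s.unmanagedAM) {
            app.events.add("APP_REJECTED"); // reject this app only; recovery goes on
        } else {
            app.events.add("RECOVER");      // hypothetical label for the normal path
        }
        return app;
    }

    public static void main(String[] args) {
        // Labels disabled, managed AM with a label expression: app is rejected.
        System.out.println(recover(new Submission("gpu", false), false).events);
        // Labels enabled: app recovers normally.
        System.out.println(recover(new Submission("gpu", false), true).events);
    }
}
```

The point of the sketch is that a single invalid application ends up rejected while the recovery loop itself keeps going; an unmanaged AM, which legitimately has no AM request, is not rejected by the null check.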