[ 
https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15783397#comment-15783397
 ] 

Bibin A Chundatt commented on YARN-6031:
----------------------------------------

As [~sunilg] mentioned earlier, ignoring the application could leave a stale 
application in the state store.

[~Ying Zhang] IIUC, ignoring this validation on recovery should also work.

{code}
  private static void validateResourceRequest(ResourceRequest resReq,
      Resource maximumResource, QueueInfo queueInfo, RMContext rmContext)
      throws InvalidResourceRequestException {
    Configuration conf = rmContext.getYarnConfiguration();
    // If Node label is not enabled throw exception
    if (null != conf && !YarnConfiguration.areNodeLabelsEnabled(conf)) {
      String labelExp = resReq.getNodeLabelExpression();
      if (!(RMNodeLabelsManager.NO_LABEL.equals(labelExp)
          || null == labelExp)) {
        throw new InvalidLabelResourceRequestException(
            "Invalid resource request, node label not enabled "
                + "but request contains label expression");
      }
    }
{code}

Thoughts?
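
A rough sketch of the "ignore validation on recovery" idea, to make the proposal concrete. It uses simplified stand-ins rather than the real YARN classes: the {{isRecovery}} flag, the class name and the unchecked exception type are hypothetical, and {{NO_LABEL}} is assumed to be the empty string. It is not taken from the attached patch.

```java
/** Illustration only: hypothetical stand-ins, not the actual YARN classes. */
final class RecoveryValidationSketch {
    // Stand-in for RMNodeLabelsManager.NO_LABEL (assumed to be the empty string).
    static final String NO_LABEL = "";

    /** Simplified stand-in for the label exception; unchecked here for brevity. */
    static final class InvalidLabelRequestException extends RuntimeException {
        InvalidLabelRequestException(String message) {
            super(message);
        }
    }

    /**
     * Rejects a label expression when node labels are disabled,
     * unless the request is being replayed during recovery.
     */
    static void validateLabelExpression(String labelExp,
            boolean nodeLabelsEnabled, boolean isRecovery) {
        if (isRecovery) {
            // Hypothetical flag: skip the check for applications restored from the state store.
            return;
        }
        boolean hasLabel = labelExp != null && !NO_LABEL.equals(labelExp);
        if (!nodeLabelsEnabled && hasLabel) {
            throw new InvalidLabelRequestException(
                "Invalid resource request, node label not enabled "
                    + "but request contains label expression");
        }
    }

    public static void main(String[] args) {
        // Recovery path: accepted even though node labels are disabled.
        validateLabelExpression("labelX", false, true);
        try {
            // New submission path: still rejected.
            validateLabelExpression("labelX", false, false);
        } catch (InvalidLabelRequestException e) {
            System.out.println("Rejected on submission: " + e.getMessage());
        }
    }
}
```

With this shape, new submissions keep the existing behaviour and only the recovery path changes.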

{quote}
The current fact is (with or without this fix): application submitted with node 
label expression explicitly specified will fail during recovery
{quote}
IMHO this should be acceptable, since any application submitted with labels while the 
feature is disabled gets rejected anyway.

Solution 2:
We could ignore or reset labels to the default in the ResourceRequest when node labels 
are disabled. I haven't looked at the impact of this yet. 
Thorough testing would be needed to see how metrics are affected.
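
A minimal sketch of what Solution 2 could look like, assuming a simplified request type. The {{Request}} class and {{resetLabelIfDisabled}} are hypothetical stand-ins for ResourceRequest and whatever hook would do the reset, and {{NO_LABEL}} is again assumed to be the empty string. It says nothing about the metrics impact mentioned above.

```java
/** Illustration only: hypothetical stand-ins, not the actual YARN classes. */
final class LabelResetSketch {
    // Stand-in for RMNodeLabelsManager.NO_LABEL (assumed to be the empty string).
    static final String NO_LABEL = "";

    /** Minimal stand-in for the label-related part of ResourceRequest. */
    static final class Request {
        private String nodeLabelExpression;

        Request(String nodeLabelExpression) {
            this.nodeLabelExpression = nodeLabelExpression;
        }

        String getNodeLabelExpression() {
            return nodeLabelExpression;
        }

        void setNodeLabelExpression(String nodeLabelExpression) {
            this.nodeLabelExpression = nodeLabelExpression;
        }
    }

    /**
     * Resets the label expression to the default when node labels are disabled.
     * Returns true if the request was changed, so the caller can log it.
     */
    static boolean resetLabelIfDisabled(Request request, boolean nodeLabelsEnabled) {
        if (nodeLabelsEnabled) {
            return false;
        }
        String labelExp = request.getNodeLabelExpression();
        if (labelExp == null || NO_LABEL.equals(labelExp)) {
            return false; // nothing to reset
        }
        request.setNodeLabelExpression(NO_LABEL);
        return true;
    }

    public static void main(String[] args) {
        Request request = new Request("labelX");
        boolean changed = resetLabelIfDisabled(request, false);
        System.out.println("changed=" + changed
            + ", label='" + request.getNodeLabelExpression() + "'");
    }
}
```

Returning a flag rather than resetting silently would at least let the RM log which recovered applications had their labels dropped.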

YARN-4562 will try to skip loading the label configuration when node labels are disabled.

[~templedf] I do agree that admins would need some way to dump application 
info when recovery fails, so that a bulk update of the state store is possible.


> Application recovery failed after disabling node label
> ------------------------------------------------------
>
>                 Key: YARN-6031
>                 URL: https://issues.apache.org/jira/browse/YARN-6031
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.8.0
>            Reporter: Ying Zhang
>            Assignee: Ying Zhang
>            Priority: Minor
>         Attachments: YARN-6031.001.patch
>
>
> Here are the repro steps:
> Enable node labels, restart RM, configure CS properly, and run some jobs;
> Disable node labels, restart RM, and the following exception is thrown:
> {noformat}
> Caused by: org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: Invalid resource request, node label not enabled but request contains label expression
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a 
> node label expression specified while node labels had been disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
