[ 
https://issues.apache.org/jira/browse/YARN-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647955#comment-14647955
 ] 

Wangda Tan commented on YARN-3940:
----------------------------------

[~bibinchundatt],
I took a look at the patch, it checked app's node-label-expression AND queue's 
accessible-node-label, which is not enough to me. We should check usage as I 
mentioned at: 
https://issues.apache.org/jira/browse/YARN-3940?focusedCommentId=14633876&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14633876.

Actually I don't have a clear idea about how to solve this problem as well. 
Another related problem to this is, we may need to consider how to deal with 
node label update, currently, if we change labels on a node, all containers 
running on the node will be killed. I suggest to clear think about both of the 
problem before moving forward.

> Application moveToQueue should check NodeLabel permission 
> ----------------------------------------------------------
>
>                 Key: YARN-3940
>                 URL: https://issues.apache.org/jira/browse/YARN-3940
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: 0001-YARN-3940.patch, 0002-YARN-3940.patch
>
>
> Configure capacity scheduler 
> Configure node label an submit application {{queue=A Label=X}}
> Move application to queue {{B}} and x is not having access
> {code}
> 2015-07-20 19:46:19,626 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Application attempt appattempt_1437385548409_0005_000001 released container 
> container_e08_1437385548409_0005_01_000002 on node: host: 
> host-10-19-92-117:64318 #containers=1 available=<memory:2560, vCores:15> 
> used=<memory:512, vCores:1> with event: KILL
> 2015-07-20 19:46:20,970 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
> Invalid resource ask by application appattempt_1437385548409_0005_000001
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
> resource request, queue=b1 doesn't have permission to access all labels in 
> resource request. labelExpression of resource request=x. Queue labels=y
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:250)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:106)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:515)
>         at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>         at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2174)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2170)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2168)
> {code}
> Same exception will be thrown till *heartbeat timeout*
> Then application state will be updated to *FAILED*



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to