[ https://issues.apache.org/jira/browse/YARN-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406156#comment-16406156 ]
Yuqi Wang commented on YARN-7872:
---------------------------------

[~jlowe], could you please also take a look at this?
It is only a small change, and trunk seems to have the same issue.

> labeled node cannot be used to satisfy locality specified request
> -----------------------------------------------------------------
>
>                 Key: YARN-7872
>                 URL: https://issues.apache.org/jira/browse/YARN-7872
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Yuqi Wang
>            Assignee: Yuqi Wang
>            Priority: Blocker
>             Fix For: 2.7.2
>
>         Attachments: YARN-7872-branch-2.7.2.001.patch
>
>
> *Issue summary:*
> A labeled node (i.e. a node with a non-empty node label) cannot be used to
> satisfy a locality-specified request (i.e. a container request with a
> non-ANY resource name and relax locality set to false).
>
> *For example:*
> The node with available resource:
> [Resource: [MemoryMB: [100] CpuNumber: [12]] NodeLabel: [persistent]
> HostName: {SRG} RackName: {/default-rack}]
> The container request:
> [Priority: [1] Resource: [MemoryMB: [1] CpuNumber: [1]] NodeLabel: [null]
> HostNames: {SRG} RackNames: {} RelaxLocality: [false]]
> The current behavior of the RM capacity scheduler (at least in versions 2.7
> and 2.8) is that the node cannot allocate a container for this request,
> because the node label does not match when the leaf queue assigns containers.
>
> *Possible solution:*
> However, node locality and node label should be two orthogonal dimensions for
> selecting candidate nodes for a container request. Node label matching
> should be performed only for container requests with the ANY resource name,
> since only that kind of container request is allowed to have a non-empty
> node label.
> So, for a container request with a non-ANY resource name (which we therefore
> know should not carry a node label), we should match the requested resource
> name against the node instead of matching the requested node label against
> the node. This resource name matching should be safe, since a node whose
> label is not accessible to the queue will not be sent to the leaf queue.
>
> *Discussion:*
> The attached patch is a fix that follows this principle; please help to
> review it. Without it, we cannot use locality to request containers on these
> labeled nodes.
> If the fix is acceptable, we should also recheck whether the same issue
> occurs in trunk and other Hadoop versions.
> If it is not acceptable (i.e. the current behavior is by design), how can we
> use locality to request containers on these labeled nodes?
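
For illustration only, the matching rule proposed under "Possible solution" can be sketched as below. This is not the attached patch and not the capacity scheduler's code: the class `LocalityLabelSketch`, the `Node` and `Request` types, and both method names are hypothetical, the label check for ANY requests is reduced to plain equality, and the RelaxLocality flag is left out. The only value taken from YARN is the wildcard resource name "*" (ResourceRequest.ANY).

```java
public class LocalityLabelSketch {
    // Wildcard resource name; YARN's ResourceRequest.ANY is "*".
    public static final String ANY = "*";

    /** Hypothetical, simplified node: host, rack, and a single label ("" = no label). */
    public static final class Node {
        final String host;
        final String rack;
        final String label;

        public Node(String host, String rack, String label) {
            this.host = host;
            this.rack = rack;
            this.label = label == null ? "" : label;
        }
    }

    /** Hypothetical, simplified container request: resource name plus requested label. */
    public static final class Request {
        final String resourceName;
        final String label;

        public Request(String resourceName, String label) {
            this.resourceName = resourceName;
            this.label = label == null ? "" : label;
        }
    }

    /** True if the request is ANY or names this node's host or rack. */
    public static boolean localityMatches(Node node, Request req) {
        return ANY.equals(req.resourceName)
                || req.resourceName.equals(node.host)
                || req.resourceName.equals(node.rack);
    }

    /** Behavior described in the report: the label is compared for every request. */
    public static boolean reportedBehavior(Node node, Request req) {
        return req.label.equals(node.label) && localityMatches(node, req);
    }

    /** Proposed rule: compare labels only for ANY requests, otherwise match by name. */
    public static boolean proposedBehavior(Node node, Request req) {
        if (ANY.equals(req.resourceName)) {
            return req.label.equals(node.label);
        }
        return localityMatches(node, req);
    }

    public static void main(String[] args) {
        // The node and request from the example in the issue description.
        Node node = new Node("SRG", "/default-rack", "persistent");
        Request byHost = new Request("SRG", null);

        System.out.println(reportedBehavior(node, byHost)); // false
        System.out.println(proposedBehavior(node, byHost)); // true
    }
}
```

With the example from the description, the host-specific request is rejected under the reported behavior because its empty label does not equal "persistent", while the proposed rule accepts it because the requested host name matches the node.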