[ https://issues.apache.org/jira/browse/YARN-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541195#comment-16541195 ]
Weiwei Yang commented on YARN-8511:
-----------------------------------

Hi [~leftnoteasy]

Thanks for helping to review this.

{quote}I'm not sure if your patch works since the {{SchedulerNode#releaseContainer}} could be invoked in scenarios like when an AM release container by invoking allocate call, or app attempt finishes. Scheduler could still place a new container on a node before it terminated by NM.
{quote}

YARN-4148 adds a boolean flag to indicate whether a release is triggered by nodeUpdate,
{code:java}
SchedulerNode#releaseContainer(ContainerId containerId, boolean releasedByNode)
{code}
so the patch removes tags only when {{releasedByNode=true}}. It works like a hook inside {{AbstractYarnScheduler#nodeUpdate}}.

Basically, after YARN-4148, node-resource and app-resource are handled differently. For node-resource, resources are deducted only when the NM confirms; for app-resource, resources are deducted immediately if a container is released by the AM or killed. So I don't think we would run into problem #1.

It's OK if the NM takes some time to terminate a container. In that case, its allocation tags as well as its node-resource won't be deducted at node level until the NM confirms. If another container has anti-affinity with this tag or asks for those resources in the meantime, the scheduler will reject the request.

Please let me know your thoughts, thanks.

> When AM releases a container, RM removes allocation tags before it is released by NM
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-8511
>                 URL: https://issues.apache.org/jira/browse/YARN-8511
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>   Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>            Priority: Major
>         Attachments: YARN-8511.001.patch, YARN-8511.002.patch
>
>
> Users leverage placement constraints (PC) with allocation tags to avoid port
> conflicts between apps, but we found they sometimes still get port conflicts.
> This is similar to YARN-4148: the RM removes allocation tags immediately once
> AM#allocate asks to release a container, but the container on the NM takes
> some time before it is actually killed and releases the port. We should let
> RM remove allocation tags AFTER NM confirms the containers are released.
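
For illustration only, the release behaviour described in the comment above can be sketched roughly as follows. This is not the YARN-8511 patch and not the real SchedulerNode: SketchNode, its maps, allocate and canPlaceWithAntiAffinity are hypothetical names invented for this sketch, and ContainerId is simplified to a String. Only the releaseContainer(containerId, releasedByNode) shape is taken from the comment.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for a scheduler node (not the real SchedulerNode). */
class SketchNode {
    // Allocation tags held by each live container on this node.
    private final Map<String, Set<String>> tagsByContainer = new HashMap<>();
    // Node-level count of containers carrying each tag.
    private final Map<String, Integer> tagCounts = new HashMap<>();

    /** Records a container and its allocation tags on this node. */
    void allocate(String containerId, Set<String> tags) {
        if (tagsByContainer.containsKey(containerId)) {
            return; // already tracked; avoid double counting
        }
        tagsByContainer.put(containerId, new HashSet<>(tags));
        for (String tag : tags) {
            tagCounts.merge(tag, 1, Integer::sum);
        }
    }

    /**
     * Assumed behaviour: node-level tags are dropped only when the release
     * is reported by the node itself (releasedByNode == true).
     */
    void releaseContainer(String containerId, boolean releasedByNode) {
        if (!releasedByNode) {
            // Released by the AM or killed: the container may still hold its
            // port on the node, so keep the tags until the node confirms.
            return;
        }
        Set<String> tags = tagsByContainer.remove(containerId);
        if (tags == null) {
            return; // unknown or already released container
        }
        for (String tag : tags) {
            // Decrement the count; returning null removes the entry at zero.
            tagCounts.computeIfPresent(tag, (k, v) -> v > 1 ? v - 1 : null);
        }
    }

    /** True if no container with the given tag is still counted on this node. */
    boolean canPlaceWithAntiAffinity(String tag) {
        return !tagCounts.containsKey(tag);
    }
}
```

In this sketch, a container tagged "hbase" keeps blocking an anti-affinity request for "hbase" after releaseContainer(id, false), and stops blocking it only after releaseContainer(id, true).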