[ https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284451#comment-14284451 ]
Wangda Tan commented on YARN-3020:
----------------------------------

[~peterdkirchner],

The expected usage of AMRMClient is as follows (thanks to [~hitesh] and [~jianhe] for their input):

When you receive newly allocated containers from the RM, you should manually call {{removeContainerRequest}} to remove the corresponding pending container requests. AMRMClient itself will not automatically decrement the number of pending container requests.

The reason is that when a container is allocated by the RM, AMRMClient doesn't know which ResourceRequest the container was allocated for. You might think that because a container has a priority, capacity and resourceName, AMRMClient could look up the ResourceRequest via {{getMatchingRequests}}. But some applications may use the container for another purpose (AMRMClient cannot understand an application's specific logic), so the AM should call {{removeContainerRequest}} itself.

To improve this, I think:
1) We need to add this behavior to the YARN documentation, so that people better understand how to use AMRMClient.
2) Maybe we should add a default implementation that automatically deducts pending resource requests based on the priority/resource-name/capacity of allocated containers. (Users could disable this default behavior and implement their own logic to deduct pending resource requests.)

Does this make sense to you?

Thanks,
Wangda

> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>
>                 Key: YARN-3020
>                 URL: https://issues.apache.org/jira/browse/YARN-3020
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BUG: If the application master calls addContainerRequest() n times, but with
> the same priority, I get up to 1+2+3+...+n containers = n*(n+1)/2.
> The most containers are requested when the interval between calls to
> addContainerRequest() exceeds the heartbeat interval of calls to allocate()
> (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a
> unique priority each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although allocate() in AMRMClientImpl.java does an ask.clear(), on subsequent
> calls to addContainerRequest(), addResourceRequest() finds the previous
> matching remoteRequest and increments the container count rather than
> starting anew, and does an addResourceRequestToAsk() which defeats the
> ask.clear().
> From documentation and code comments, it was hard for me to discern the
> intended behavior of the API, but the inconsistency reported in this issue
> suggests one case or the other is implemented incorrectly.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
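
For illustration, the counts discussed in this thread can be reproduced with a small toy model. This is only a sketch and not Hadoop code: the class {{PendingRequestModel}}, its {{run}} helper, and the simulated RM that grants everything in the ask on every heartbeat are all assumptions made for the example. It models only the bookkeeping described above: a per-priority pending count that grows on each add, is re-sent in full, and shrinks only when the AM removes a request.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; every name here is hypothetical and nothing calls Hadoop.
class PendingRequestModel {
    // Outstanding container count per priority (stand-in for the client's request table).
    private final Map<Integer, Integer> pending = new HashMap<>();
    // Entries changed since the last heartbeat (stand-in for the "ask").
    private final Map<Integer, Integer> ask = new HashMap<>();

    void addContainerRequest(int priority) {
        int count = pending.merge(priority, 1, Integer::sum);
        ask.put(priority, count); // the accumulated count is re-sent, not just the delta
    }

    void removeContainerRequest(int priority) {
        int count = Math.max(pending.getOrDefault(priority, 0) - 1, 0);
        pending.put(priority, count);
        ask.put(priority, count);
    }

    // One heartbeat: the simulated RM grants everything in the ask, then the ask is cleared.
    int allocate() {
        int granted = 0;
        for (int count : ask.values()) {
            granted += count;
        }
        ask.clear();
        return granted;
    }

    // Issues n requests with one heartbeat after each and returns the total containers granted.
    static int run(int n, boolean samePriority, boolean removeOnAllocation) {
        PendingRequestModel client = new PendingRequestModel();
        int total = 0;
        for (int i = 0; i < n; i++) {
            int priority = samePriority ? 0 : i;
            client.addContainerRequest(priority);
            int granted = client.allocate();
            total += granted;
            if (removeOnAllocation) {
                // What the AM is expected to do for each newly allocated container.
                for (int g = 0; g < granted; g++) {
                    client.removeContainerRequest(priority);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run(4, true, false));  // 10 = 4*(4+1)/2, same priority, no removal
        System.out.println(run(4, false, false)); // 4, unique priority per request
        System.out.println(run(4, true, true));   // 4, same priority, removal on allocation
    }
}
```

In this model, removing one pending request per allocated container brings the same-priority case back to n containers, which is the usage pattern described in the comment above.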