[ 
https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284451#comment-14284451
 ] 

Wangda Tan commented on YARN-3020:
----------------------------------

[~peterdkirchner],
The expected usage of AMRMClient is as follows (thanks to [~hitesh] and 
[~jianhe] for their input): when you receive newly allocated containers from 
the RM, you should manually call {{removeContainerRequest}} to remove the 
corresponding pending container requests. AMRMClient itself does not 
automatically decrement the number of pending container requests.

The reason is that when a container is allocated by the RM, AMRMClient doesn't 
know which ResourceRequest the container was allocated for. You might think 
that because a container has a priority, capacity and resourceName, AMRMClient 
could look up the ResourceRequest via {{getMatchingRequests}}. But some 
applications may use the container for another purpose (AMRMClient cannot 
understand an application's specific logic). So the AM should call 
{{removeContainerRequest}} itself.
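
To illustrate the bookkeeping, here is a minimal toy model in Java. It is not 
AMRMClientImpl itself: the class name PendingRequestModel, the integer 
priorities and the "RM grants everything in the ask immediately" rule are all 
assumptions made for the sketch. It only shows why the pending count keeps 
growing unless the AM removes requests once containers arrive.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the client-side request table; NOT the real AMRMClientImpl. */
final class PendingRequestModel {
    // priority -> number of container requests the client still considers pending
    private final Map<Integer, Integer> pending = new HashMap<>();
    // priority -> container count to send on the next allocate() heartbeat
    private final Map<Integer, Integer> ask = new HashMap<>();

    void addContainerRequest(int priority) {
        int count = pending.merge(priority, 1, Integer::sum);
        ask.put(priority, count); // the ask carries the full pending count
    }

    void removeContainerRequest(int priority) {
        int count = pending.getOrDefault(priority, 0);
        if (count > 0) {
            pending.put(priority, count - 1);
            ask.put(priority, count - 1);
        }
    }

    /** Assumption: the toy RM immediately grants every container in the ask. */
    int allocate() {
        int granted = 0;
        for (int count : ask.values()) {
            granted += count;
        }
        ask.clear();
        return granted;
    }

    /** Adds n same-priority requests, with one heartbeat after each request. */
    static int simulate(int n, boolean removeOnAllocation) {
        PendingRequestModel client = new PendingRequestModel();
        int total = 0;
        for (int i = 0; i < n; i++) {
            client.addContainerRequest(0);
            int granted = client.allocate();
            total += granted;
            if (removeOnAllocation) {
                for (int c = 0; c < granted; c++) {
                    client.removeContainerRequest(0);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(simulate(4, false)); // 1+2+3+4 = 10 containers
        System.out.println(simulate(4, true));  // 4 containers
    }
}
```

In this model, skipping the removal reproduces the n*(n+1)/2 total from the 
issue description, while removing one pending request per allocated container 
yields exactly n.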

To improve this, I think 1) we need to document this behavior in the YARN docs, 
so that people better understand how to use AMRMClient, and 2) maybe we should 
add a default implementation that automatically deducts pending resource 
requests by the priority/resource-name/capacity of allocated containers (users 
could disable this default behavior and implement their own logic to deduct 
pending resource requests).
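
A rough sketch of what option 2) could look like, again as a standalone Java 
toy rather than a patch: AutoDeductSketch, its autoDeduct flag, the string 
request ids and the memoryMb field are hypothetical names chosen for 
illustration, and nothing like this exists in AMRMClient today. Pending 
requests are keyed by priority/resource-name/capacity, and an allocated 
container removes the oldest matching request unless the default is disabled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical auto-deduct bookkeeping; not part of any YARN API. */
final class AutoDeductSketch {
    // "priority/resourceName/memoryMb" -> ids of pending requests, oldest first
    private final Map<String, Deque<String>> pending = new HashMap<>();
    private final boolean autoDeduct;

    AutoDeductSketch(boolean autoDeduct) {
        this.autoDeduct = autoDeduct;
    }

    private static String key(int priority, String resourceName, int memoryMb) {
        return priority + "/" + resourceName + "/" + memoryMb;
    }

    void addRequest(String requestId, int priority, String resourceName, int memoryMb) {
        pending.computeIfAbsent(key(priority, resourceName, memoryMb),
                k -> new ArrayDeque<>()).addLast(requestId);
    }

    /** Returns the request that was deducted, or empty if nothing was removed. */
    Optional<String> onContainerAllocated(int priority, String resourceName, int memoryMb) {
        if (!autoDeduct) {
            return Optional.empty(); // the application deducts with its own logic
        }
        Deque<String> queue = pending.get(key(priority, resourceName, memoryMb));
        if (queue == null || queue.isEmpty()) {
            return Optional.empty(); // no pending request matches this container
        }
        return Optional.of(queue.pollFirst());
    }

    int pendingCount() {
        int total = 0;
        for (Deque<String> queue : pending.values()) {
            total += queue.size();
        }
        return total;
    }
}
```

With autoDeduct set to false the table is left untouched, which corresponds to 
today's behavior where the AM calls {{removeContainerRequest}} itself.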

Does this make sense to you?

Thanks,
Wangda

> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>
>                 Key: YARN-3020
>                 URL: https://issues.apache.org/jira/browse/YARN-3020
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BUG: If the application master calls addContainerRequest() n times with 
> the same priority, I get up to 1+2+3+...+n = n*(n+1)/2 containers. The most 
> containers are requested when the interval between calls to 
> addContainerRequest() exceeds the heartbeat interval of calls to allocate() 
> (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a 
> unique priority each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although allocate() in AMRMClientImpl.java does an ask.clear(), on subsequent 
> calls to addContainerRequest(), addResourceRequest() finds the previous 
> matching remoteRequest and increments the container count rather than 
> starting anew, and does an addResourceRequestToAsk() which defeats the 
> ask.clear().
> From documentation and code comments, it was hard for me to discern the 
> intended behavior of the API, but the inconsistency reported in this issue 
> suggests one case or the other is implemented incorrectly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
