[ 
https://issues.apache.org/jira/browse/YARN-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222472#comment-16222472
 ] 

Jason Lowe commented on YARN-7408:
----------------------------------

I assume you are using CapacityScheduler?  The answers may change for 
FairScheduler as I am not as familiar with how it handles reservations.

Reservations can increase over time as the allocation request remains 
unsatisfied, but the amount of space that can be reserved is limited by the 
user and queue limits.  In other words, the user can't reserve the whole 
cluster unless they are allowed to use the whole cluster normally.
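
To make the limit concrete, here is a minimal sketch of the idea. It is a simplified model, not the actual CapacityScheduler code: the function names, the memory-only accounting, and the assumption that reserved and used resources count against the same limit are all illustrative.

```python
def reservable_headroom(used_mb, reserved_mb, user_limit_mb, queue_limit_mb):
    """Return how much more memory (MB) the user may still reserve.

    Assumption: used and reserved memory both count against the
    smaller of the user limit and the queue limit.
    """
    limit = min(user_limit_mb, queue_limit_mb)
    return max(0, limit - used_mb - reserved_mb)


def try_reserve(request_mb, used_mb, reserved_mb, user_limit_mb, queue_limit_mb):
    """Return the new reserved total; unchanged if the request exceeds the headroom."""
    headroom = reservable_headroom(used_mb, reserved_mb, user_limit_mb, queue_limit_mb)
    if request_mb <= headroom:
        return reserved_mb + request_mb
    return reserved_mb


# A user limited to 40 GB in a 100 GB queue keeps asking to reserve 16 GB containers.
reserved = 0
for _ in range(5):
    reserved = try_reserve(16384, used_mb=0, reserved_mb=reserved,
                           user_limit_mb=40960, queue_limit_mb=102400)
print(reserved)  # 32768: a third 16 GB reservation would exceed the 40 GB user limit
```

However many times the request is retried, the reserved total in this model stops growing once the user or queue limit is reached.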

As for other container requests, it depends upon where these requests are 
coming from.  If they are other requests from the same application, then the 
app needs to change the priority of those requests.  The RM allocates 
containers in priority order, so it won't consider the other requests until the 
reservations are satisfied or the request is cancelled.  If the requests are 
coming from other apps, then it could be a matter of this app's priority 
relative to those apps.  Apps ahead in the queue get first crack at resources; 
otherwise we risk indefinite postponement.  Proposals to artificially limit 
reservations for an app risk the same indefinite postponement if the scheduler 
happens to choose poorly where to place the limited number of reservations.  In 
a cluster with long running containers, this app may never run in a timely 
manner.
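
The priority-order behavior within a single app can be sketched as follows. This is a toy model rather than RM code: the request dictionaries, the state names, and the convention that a smaller priority number is served first are assumptions made for the example.

```python
def next_request_to_serve(requests):
    """Pick the pending request the allocator would look at next.

    `requests` is a list of dicts with 'name', 'priority', and 'state',
    where state is 'pending', 'satisfied', or 'cancelled'.
    Assumption for this sketch: a smaller priority number is served first.
    """
    pending = [r for r in requests if r["state"] == "pending"]
    if not pending:
        return None
    return min(pending, key=lambda r: r["priority"])["name"]


requests = [
    {"name": "large", "priority": 1, "state": "pending"},
    {"name": "small", "priority": 2, "state": "pending"},
]
print(next_request_to_serve(requests))  # large

# Once the large request is cancelled (or satisfied), the small one is considered.
requests[0]["state"] = "cancelled"
print(next_request_to_serve(requests))  # small
```

In this model the small request is never looked at while the large one is still pending, which is why the app would have to reorder its priorities or cancel the large request.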

One way to achieve something close to what you are proposing is to have the 
problematic app run in a separate queue where you can explicitly cap the 
resources associated with that app, reserved or otherwise.  The app will be 
able to reserve at most the queue's capacity.  This should work quite well, 
assuming the total resources required by the app are less than what you are 
willing to allow it to reserve in its attempt to get containers.
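
A quick way to sanity-check that assumption is sketched below. The numbers and function names are hypothetical, and the cap is assumed here to be a percentage of cluster memory (in CapacityScheduler terms, something like the queue's maximum-capacity setting); adjust it to however the queue is actually configured.

```python
def queue_cap_mb(cluster_mb, queue_max_capacity_pct):
    """Memory (MB) the dedicated queue may hold, used or reserved.

    Assumption: the cap is a simple percentage of total cluster memory.
    """
    return cluster_mb * queue_max_capacity_pct / 100


def fits_in_queue(app_total_mb, cluster_mb, queue_max_capacity_pct):
    """True if everything the app needs fits under the queue cap."""
    return app_total_mb <= queue_cap_mb(cluster_mb, queue_max_capacity_pct)


cluster_mb = 1024000          # hypothetical 1000 GB cluster
cap_pct = 20                  # hypothetical cap for the dedicated queue

print(fits_in_queue(10 * 16384, cluster_mb, cap_pct))  # True: 160 GB fits under 200 GB
print(fits_in_queue(15 * 16384, cluster_mb, cap_pct))  # False: 240 GB exceeds 200 GB
```

If the check fails, the app can never hold all of its containers at once inside that queue, so the cap needs to be raised or the app's demand reduced.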


> total capacity could be occupied by a large container request
> -------------------------------------------------------------
>
>                 Key: YARN-7408
>                 URL: https://issues.apache.org/jira/browse/YARN-7408
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: kyungwan nam
>
> If an NM cannot satisfy a large container request, a container is reserved 
> on it instead.
> But in a cluster with long running apps, running containers are not released 
> very often.
> In cases like this, the reserved containers increase as time goes on. As a 
> result, the total capacity could be occupied by reserved resources.
> This makes other container requests starve.



