[ https://issues.apache.org/jira/browse/YARN-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222472#comment-16222472 ]
Jason Lowe commented on YARN-7408:
----------------------------------

I assume you are using CapacityScheduler? The answers may change for FairScheduler, as I am not as familiar with how it handles reservations.

Reservations can increase over time as the allocation request remains unsatisfied, but the amount of space that can be reserved is limited by the user and queue limits. In other words, the user can't reserve the whole cluster unless they are allowed to use the whole cluster normally.

As for other container requests, it depends upon where these requests are coming from. If these are other requests from the same application, then the app needs to change the priority of the other requests. The RM allocates containers in priority order, so it won't consider the other requests until the reservations are satisfied or the request is cancelled. If the requests are coming from other apps, then it could be the priority of the app relative to the other apps. Apps ahead in the queue will get first crack at resources, or we risk indefinite postponement.

Proposals to artificially limit reservations for an app also risk this same indefinite postponement if the scheduler happens to choose poorly where to place the limited number of reservations. In a cluster with long-running containers, this app may never run in a timely manner.

One way to achieve something close to what you are proposing is to have the problematic app run in a separate queue where you can explicitly cap the resources associated with that app, reserved or otherwise. The app will only be able to reserve up to the queue's capacity at most. This should work quite well, assuming the total resources required by the app are less than what you are willing to allow it to reserve in its attempt to get containers.
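
To make the separate-queue suggestion concrete, here is a minimal toy model in Python of how a queue cap bounds reservations. It is only a sketch under stated assumptions, not the CapacityScheduler's actual algorithm: the `Queue` class, its field names, and the numbers (a 100-unit cluster, a 30-unit queue cap, 8-unit container requests) are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    """Toy queue; all quantities share one unit (e.g. GB of memory).

    Hypothetical model for illustration, not a YARN class.
    """
    max_capacity: int  # hard cap on used + reserved for this queue
    used: int = 0      # resources held by running containers
    reserved: int = 0  # resources held by outstanding reservations

    def headroom(self) -> int:
        """Resources the queue may still claim, running or reserved."""
        return max(0, self.max_capacity - self.used - self.reserved)

    def try_reserve(self, size: int) -> bool:
        """Reserve `size` only if used + reserved stays within the cap."""
        if size <= self.headroom():
            self.reserved += size
            return True
        return False


def demo():
    cluster_total = 100              # assumed cluster size
    queue = Queue(max_capacity=30)   # assumed cap for the problematic app
    granted = 0
    # The app keeps asking for large containers that cannot be placed yet.
    for _ in range(10):
        if queue.try_reserve(8):
            granted += 1
    # Whatever the queue could not reserve stays available to other queues.
    return granted, queue.reserved, cluster_total - queue.reserved


if __name__ == "__main__":
    print(demo())
```

In this model the app's reservations stop growing once the next one would exceed the queue cap, so the rest of the cluster stays available to other queues. The same property is also the trade-off noted above: if the app genuinely needs more than the cap, it cannot reserve its way to those resources.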

> total capacity could be occupied by a large container request
> -------------------------------------------------------------
>
>                 Key: YARN-7408
>                 URL: https://issues.apache.org/jira/browse/YARN-7408
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: kyungwan nam
>
> if NM can not afford to allocate a large container request, it will be
> reserved container.
> but, in a cluster with long running apps, it is not often that running
> containers are released.
> in cases like this, reserved containers will be increased as time goes on. as
> a result, total capacity could be occupied by reserved resources.
> it makes other container requests starve.