[ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609163#comment-14609163
 ] 

Rohit Agarwal commented on YARN-3633:
-------------------------------------

Yes, that is very much possible. But without this change, this scenario 
results in none of the applications making any progress. I would take one 
application getting starved over the whole cluster getting starved any day. :-)

FWIW, we have been running our clusters with this patch for a month now and 
haven't seen any cluster logjam since.

> With Fair Scheduler, cluster can logjam when there are too many queues
> ----------------------------------------------------------------------
>
>                 Key: YARN-3633
>                 URL: https://issues.apache.org/jira/browse/YARN-3633
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Rohit Agarwal
>            Assignee: Rohit Agarwal
>            Priority: Critical
>         Attachments: YARN-3633-1.patch, YARN-3633.patch
>
>
> It's possible to logjam a cluster by submitting many applications at once in 
> different queues.
> For example, take a cluster with 20GB of total memory, and say 4 users 
> submit applications at the same time, each to a different queue. The fair 
> share of each queue is 5GB. With maxAMShare set to 0.5, each queue has at 
> most 2.5GB of memory for AMs. If every user requests a 3GB AM, the cluster 
> logjams: nothing gets scheduled even though 20GB of resources are 
> available.
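
The arithmetic in the description above can be checked with a minimal sketch. This is a simplified model, not the Fair Scheduler's actual code: it assumes equal queue weights, one pending AM per queue, and memory as the only resource. The function names are hypothetical and exist only for illustration.

```python
def am_limit_per_queue_gb(cluster_memory_gb, num_queues, max_am_share):
    """Memory each queue may spend on AMs under the simplified model.

    Assumption: every queue gets an equal fair share of the cluster.
    """
    fair_share_gb = cluster_memory_gb / num_queues
    return fair_share_gb * max_am_share


def schedulable_ams(cluster_memory_gb, am_requests_gb, max_am_share):
    """Return, per queue, whether its single AM request fits the AM limit.

    Assumption: one AM request per queue, so the number of queues is
    the number of requests.
    """
    limit_gb = am_limit_per_queue_gb(
        cluster_memory_gb, len(am_requests_gb), max_am_share
    )
    return [request_gb <= limit_gb for request_gb in am_requests_gb]


if __name__ == "__main__":
    # Scenario from the issue: 20GB cluster, 4 queues, maxAMShare = 0.5,
    # and every user asks for a 3GB AM.
    print(am_limit_per_queue_gb(20, 4, 0.5))        # 2.5
    print(schedulable_ams(20, [3, 3, 3, 3], 0.5))   # [False, False, False, False]
```

Under these assumptions every 3GB request exceeds the 2.5GB per-queue AM limit, so no AM starts even though the whole 20GB is free.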



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
