[jira] [Updated] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Agarwal updated YARN-3633:
    Attachment: YARN-3633-1.patch

With Fair Scheduler, cluster can logjam when there are too many queues
----------------------------------------------------------------------

                Key: YARN-3633
                URL: https://issues.apache.org/jira/browse/YARN-3633
            Project: Hadoop YARN
         Issue Type: Bug
         Components: fairscheduler
   Affects Versions: 2.6.0
           Reporter: Rohit Agarwal
           Assignee: Rohit Agarwal
           Priority: Critical
        Attachments: YARN-3633-1.patch, YARN-3633.patch

It's possible to logjam a cluster by submitting many applications at once in different queues.

For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time, each to a different queue. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB of memory for AMs. If every user requests an AM of size 3GB, the cluster logjams: each 3GB request exceeds its queue's 2.5GB AM limit, so no AM can start. Nothing gets scheduled even though 20GB of resources are available.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Agarwal updated YARN-3633:
    Attachment: YARN-3633.patch

With Fair Scheduler, cluster can logjam when there are too many queues
----------------------------------------------------------------------

                Key: YARN-3633
                URL: https://issues.apache.org/jira/browse/YARN-3633
            Project: Hadoop YARN
         Issue Type: Bug
         Components: fairscheduler
   Affects Versions: 2.6.0
           Reporter: Rohit Agarwal
           Assignee: Rohit Agarwal
           Priority: Critical
        Attachments: YARN-3633.patch

It's possible to logjam a cluster by submitting many applications at once in different queues.

For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time, each to a different queue. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB of memory for AMs. If every user requests an AM of size 3GB, the cluster logjams: each 3GB request exceeds its queue's 2.5GB AM limit, so no AM can start. Nothing gets scheduled even though 20GB of resources are available.
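
The arithmetic in the example above can be checked with a small sketch. This is a simplified model, not the Fair Scheduler's actual code: it assumes equally weighted queues (so the fair share is the cluster memory divided by the number of queues), one AM request per queue, and an admission check that compares the AM request against fair share times maxAMShare. The function names `am_limit_gb` and `startable_ams` are hypothetical and chosen only for illustration.

```python
def am_limit_gb(fair_share_gb, max_am_share):
    """Memory a queue may devote to AMs under this simplified model."""
    return fair_share_gb * max_am_share


def startable_ams(cluster_gb, num_queues, max_am_share, am_request_gb):
    """Count the queues whose single AM request fits under the AM limit.

    Assumes equally weighted queues, so each queue's fair share is
    cluster_gb / num_queues (an assumption of this sketch).
    """
    fair_share_gb = cluster_gb / num_queues
    limit_gb = am_limit_gb(fair_share_gb, max_am_share)
    # Every queue makes the same request, so either all fit or none do.
    return num_queues if am_request_gb <= limit_gb else 0


if __name__ == "__main__":
    # Scenario from the report: 20GB cluster, 4 queues, maxAMShare 0.5, 3GB AMs.
    print(am_limit_gb(20 / 4, 0.5))       # 2.5
    print(startable_ams(20, 4, 0.5, 3))   # 0, the logjam
    # With only 2 queues the limit is 5GB per queue, so both AMs fit.
    print(startable_ams(20, 2, 0.5, 3))   # 2
```

Under these assumptions the problem appears as soon as the per-queue AM limit drops below the AM size, which is why adding queues to the same cluster makes the logjam more likely.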