[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015518#comment-14015518 ]
Sandy Ryza commented on YARN-1913:
----------------------------------

This is looking good. A few small things.

AppSchedulingInfo is only used to track pending resources. We should hold amResource in SchedulerApplicationAttempt.

{code}
+      if (! queue.canRunAppAM(app.getAMResource())) {
{code}
Take out the space after the exclamation point.

{code}
   @Override
+  public boolean checkIfAMResourceUsageOverLimit(Resource usage, Resource maxAMResource) {
+    return Resources.greaterThan(RESOURCE_CALCULATOR, null, usage, maxAMResource);
+  }
{code}
Simpler to just use "usage.getMemory() > maxAMResource.getMemory()".

{code}
+    if (request.getPriority().equals(RMAppAttemptImpl.AM_CONTAINER_PRIORITY)) {
{code}
I'm a little nervous about using the priority here because apps could unwittingly submit all requests at that priority. Can we use SchedulerApplicationAttempt.getLiveContainers().isEmpty()?

> With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
> ------------------------------------------------------------------------------
>
>                 Key: YARN-1913
>                 URL: https://issues.apache.org/jira/browse/YARN-1913
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.3.0
>            Reporter: bc Wong
>            Assignee: Wei Yan
>         Attachments: YARN-1913.patch, YARN-1913.patch, YARN-1913.patch,
> YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch,
> YARN-1913.patch
>
>
> It's possible to deadlock a cluster by submitting many applications at once,
> and have all cluster resources taken up by AMs.
> One solution is for the scheduler to limit resources taken up by AMs, as a
> percentage of total cluster resources, via a "maxApplicationMasterShare"
> config.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
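
For readers following the thread, the AM-share limit described in the issue, combined with the memory-only comparison suggested in the review, might look roughly like the sketch below. It is a standalone illustration and not the patch itself: the class AmShareSketch and its methods are hypothetical names, memory is modelled as plain megabyte counts rather than YARN Resource objects, and treating an attempt with no live containers as "requesting its AM" only mirrors the question raised in the comment.

```java
import java.util.Collection;

/** Hypothetical, self-contained sketch; not the FairScheduler implementation. */
final class AmShareSketch {
    private AmShareSketch() {}

    /** Memory (MB) that AMs may hold, as a fraction of total cluster memory. */
    static long maxAmMemoryMb(long clusterMemoryMb, double maxAmShare) {
        if (maxAmShare < 0.0 || maxAmShare > 1.0) {
            throw new IllegalArgumentException("maxAmShare must be in [0, 1]");
        }
        return (long) (clusterMemoryMb * maxAmShare);
    }

    /** Memory-only check: usage is over the limit only if strictly greater. */
    static boolean isOverLimit(long usageMb, long maxAmMemoryMb) {
        return usageMb > maxAmMemoryMb;
    }

    /** True if starting one more AM keeps total AM usage within the limit. */
    static boolean canRunAppAm(long currentAmUsageMb, long amRequestMb,
                               long clusterMemoryMb, double maxAmShare) {
        long limit = maxAmMemoryMb(clusterMemoryMb, maxAmShare);
        return !isOverLimit(currentAmUsageMb + amRequestMb, limit);
    }

    /**
     * Assumption taken from the review question: an attempt with no live
     * containers has not started its AM yet, so its next container is the AM.
     */
    static boolean isAmRequest(Collection<?> liveContainers) {
        return liveContainers.isEmpty();
    }
}
```

With a 10240 MB cluster and a share of 0.5, AMs may hold up to 5120 MB in total, so a 1024 MB AM is admitted when 4096 MB is already in use by AMs, while a 2048 MB AM is not.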