Thanks for your prompt response. I'll take a look at the per-job memory settings, which from your explanation should resolve my issue.
-Matt

On Fri, Aug 28, 2015 at 12:35 PM, Vinod Kumar Vavilapalli <vino...@hortonworks.com> wrote:

> Hi Matt,
>
> Replies inline.
>
> > I'm using the Capacity Scheduler and deploy mapred-site.xml and
> > yarn-site.xml configuration files with various memory settings that are
> > tailored to the resources of a particular machine. The master node and
> > the two slave node classes each get a different configuration file since
> > they have different memory profiles.
>
> We are improving this starting with 2.8 so as to not require different
> configuration files - see https://issues.apache.org/jira/browse/YARN-160.
>
> > yarn.scheduler.minimum-allocation-mb: This appears to behave as a
> > cluster-wide setting; however, due to my two node classes, a per-node
> > yarn.scheduler.minimum-allocation-mb would be desirable.
>
> Actually, the minimum container size is a cluster-level constant by design.
> It doesn't matter how big or small the nodes in the cluster are; the minimum
> size needs to be a constant for applications to have a notion of
> deterministic sizing. What we suggest instead is to simply run more
> containers on the bigger machines via the yarn.nodemanager.resource.memory-mb
> configuration.
>
> On the other hand, the maximum container size should at best be the size
> of the smallest node in the cluster. Otherwise, again, you may cause
> non-deterministic scheduling behavior for apps.
>
> > More concretely, suppose I have two jobs with differing memory
> > requirements--how would I communicate this to YARN and request that my
> > containers be allocated with additional memory?
>
> This is a more apt ask. The minimum container size does not determine the
> container size. Containers can be various multiples of the minimum, driven
> by the application or by frameworks like MapReduce. For example, even if
> the minimum container size in the cluster is 1GB, the MapReduce framework
> can ask for bigger containers if the user sets mapreduce.map.memory.mb to
> 2GB/4GB etc. And this is controllable at the job level!
>
> HTH
> +Vinod
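For anyone else following this thread, here is a minimal yarn-site.xml sketch of the arrangement Vinod describes: the scheduler minimum/maximum allocations stay constant across the cluster, while each node class is sized through the NodeManager's own memory setting. The values are illustrative assumptions only, not recommendations; the resource.memory-mb value is the one you would vary between the larger and smaller slave classes.

  <!-- yarn-site.xml (illustrative values only) -->
  <configuration>
    <!-- Cluster-wide: keep these identical on every node so applications
         see deterministic container sizing -->
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>1024</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>8192</value>  <!-- at best, the size of the smallest node -->
    </property>

    <!-- Per-node: memory this NodeManager offers to YARN; set it higher on
         the bigger slave class so that class simply runs more containers -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>24576</value>
    </property>
  </configuration>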
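And a sketch of the job-level override Vinod mentions, asking for 4GB map containers while the cluster minimum stays at 1GB. The reduce-side properties and the java.opts heap sizes are my own illustrative additions (the JVM heap is conventionally kept somewhat below the container size), not values from the thread.

  <!-- per-job settings, e.g. in the job's configuration object or a
       job-specific mapred-site.xml -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>   <!-- container size requested for each map task -->
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx3276m</value>   <!-- JVM heap kept below the container size -->
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx6553m</value>
  </property>

If the job's driver goes through ToolRunner/GenericOptionsParser, the same properties can also be passed on the command line, e.g. hadoop jar myjob.jar -Dmapreduce.map.memory.mb=4096 ... (myjob.jar here is a placeholder name).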