yarn memory settings in heterogeneous cluster

2015-08-28 Thread Matt Kowalczyk
Hi,

I have deployed a hadoop 2.7.1 cluster with heterogeneous nodes. For the
sake of discussion, suppose one node has 100GB of RAM while another has 50
GB.

I'm using the Capacity Scheduler and deploy mapred-site.xml and
yarn-site.xml configuration files with memory settings tailored to the
resources of each particular machine. The master node and the two slave
node classes each get a different configuration file since they have
different memory profiles.
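
For concreteness, each node class gets a yarn-site.xml along these lines,
with the values scaled to that machine (the numbers below are illustrative,
not my real ones):

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>90112</value>  <!-- 45056 on the 50 GB nodes -->
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>4096</value>   <!-- 2048 on the 50 GB nodes -->
  </property>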

I am trying to configure YARN in such a way as to take advantage of all the
resources available on the nodes, and I'm having particular difficulty with
the minimum allocation setting. From my deployment I can tell that certain
memory settings are node-specific while others are cluster-wide. The
particular configuration setting that's causing me trouble is

yarn.scheduler.minimum-allocation-mb

This appears to behave as a cluster-wide setting; however, due to my two
node classes, a per-node yarn.scheduler.minimum-allocation-mb would be
desirable.

I also notice that YARN _always_ allocates
yarn.scheduler.minimum-allocation-mb to each container, irrespective of how
the per-node memory settings are configured.

A couple of questions to help drive the discussion:

- How should YARN be configured in a heterogeneous cluster?
- YARN exposes a minimum and a maximum allocation; how do I indicate that
additional memory is desirable so that YARN doesn't always allocate the
minimum? More concretely, suppose I have two jobs with differing memory
requirements--how would I communicate this to YARN and request that my
containers be allocated additional memory?

Thanks,
Matt


Re: yarn memory settings in heterogeneous cluster

2015-08-28 Thread Vinod Kumar Vavilapalli
Hi Matt,

Replies inline.

> I'm using the Capacity Scheduler and deploy mapred-site.xml and
> yarn-site.xml configuration files with memory settings tailored to the
> resources of each particular machine. The master node and the two slave
> node classes each get a different configuration file since they have
> different memory profiles.


We are improving this starting with 2.8 so as not to require different
configuration files - see https://issues.apache.org/jira/browse/YARN-160.


> yarn.scheduler.minimum-allocation-mb: This appears to behave as a
> cluster-wide setting; however, due to my two node classes, a per-node
> yarn.scheduler.minimum-allocation-mb would be desirable.

Actually, the minimum container size is a cluster-level constant by design.
It doesn’t matter how big or small the nodes in the cluster are; the minimum
size needs to be a constant for applications to have a notion of
deterministic sizing. What we suggest instead is to simply run more
containers on the bigger machines using the
yarn.nodemanager.resource.memory-mb configuration.
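
To make that concrete with illustrative numbers: keep
yarn.scheduler.minimum-allocation-mb at, say, 2048 everywhere, and let the
per-node capacity differ:

  yarn.nodemanager.resource.memory-mb = 90112 on the 100 GB nodes (44 minimum-size containers)
  yarn.nodemanager.resource.memory-mb = 45056 on the 50 GB nodes  (22 minimum-size containers)

The minimum stays constant across the cluster; the bigger nodes simply
advertise more capacity and therefore run more containers.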

On the other hand, the maximum container size obviously should be at most
the size of the smallest node in the cluster. Otherwise, again, you may
cause nondeterministic scheduling behavior for apps.
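
In your case that would mean something like the following in the
cluster-wide yarn-site.xml (assuming the 50 GB nodes advertise 45056 MB):

  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>45056</value>
  </property>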

> More concretely, suppose I have two jobs with differing memory
> requirements--how would I communicate this to YARN and request that my
> containers be allocated additional memory?

This is a more apt ask. The minimum container size doesn’t determine
container size! Containers can be sized at various multiples of the
minimum, driven by the application or by frameworks like MapReduce. For
example, even if the minimum container size in the cluster is 1GB, the
MapReduce framework can ask for bigger containers if the user sets
mapreduce.map.memory.mb to 2GB/4GB etc. And this is controllable at the
job level!
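
For instance, a MapReduce job whose driver goes through
ToolRunner/GenericOptionsParser can request bigger containers right on the
command line (the jar and class names here are placeholders):

  hadoop jar my-app.jar MyJob \
    -Dmapreduce.map.memory.mb=4096 \
    -Dmapreduce.map.java.opts=-Xmx3276m \
    -Dmapreduce.reduce.memory.mb=8192 \
    -Dmapreduce.reduce.java.opts=-Xmx6553m \
    input output

Note the java.opts heap is set somewhat below memory.mb (roughly 80% is a
common rule of thumb) so the JVM fits within the container size that YARN
enforces.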

HTH
+Vinod

Re: yarn memory settings in heterogeneous cluster

2015-08-28 Thread Matt Kowalczyk
Hi Vinod,

Thanks for your prompt response. I'll take a look at the per-job memory
settings, which, from your explanation, should resolve my issue.

-Matt
