I ran into similar under-utilization a while back when moving one of my MR jobs to 
YARN/Hadoop 2.7.  It will depend on what your job is doing, but in my case the 
biggest improvement came from increasing the split size (from its default of 
128M up to 2G): my mappers were finishing very quickly, so my containers were 
short-lived, and I was paying the allocation/deallocation cost for container 
resources over and over.  I also bumped up memory on the resource manager and 
did a couple of other things.
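
In case it's useful, the split-size bump was essentially just this (I'm going 
from memory, and the 2G value is what worked for my input, so treat it as a 
starting point — for FileInputFormat-based jobs the split size works out to 
max(minsize, min(maxsize, blockSize)), so raising minsize above the HDFS block 
size is what forces bigger splits):

<!-- mapred-site.xml (or per-job via -D): raise the minimum split size to ~2G -->
<property>
  <name>mapreduce.input.fileinputformat.split.minsize</name>
  <value>2147483648</value>
</property>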

Here’s the thread:
https://mail-archives.apache.org/mod_mbox/hadoop-user/201605.mbox/%3cam2pr04mb07084248711dc9138d805a9990...@am2pr04mb0708.eurprd04.prod.outlook.com%3E
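
One more thing worth checking in your case: with mapreduce.map.memory.mb at 
4096, the number of concurrent map containers per node is capped by 
yarn.nodemanager.resource.memory-mb, which defaults to 8192 MB in 2.7.x.  At 
the default, each node can only fit two 4G containers no matter how much 
physical RAM it has, which would line up with the 2-3 map tasks you're seeing.  
Something like this in yarn-site.xml might help (24576 is just an example 
figure — size it to the memory you actually want to give YARN on each node):

<!-- yarn-site.xml: total memory the NodeManager may hand out to containers.
     Example value only; set to the per-node RAM you want YARN to use. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>24576</value>
</property>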

Hope this helps!  Good luck!
-Jeff

From: George Liaw [mailto:george.a.l...@gmail.com]
Sent: Friday, October 07, 2016 2:32 PM
To: user@hadoop.apache.org
Subject: Hadoop 2.7.2 Yarn Memory Utilization

Hi,
I'm setting up a Hadoop 2.7.2 cluster with some configs that I inherited, and 
I'm running into an issue: YARN isn't utilizing all of the memory available on 
the nodemanagers, and each application seems to be limited to 2-3 map tasks for 
some reason. Can anyone shed some light on this or let me know what else I 
should look into?

I've attached screenshots below of what I'm seeing.

Relevant configs that I'm aware of:
yarn-site.xml:

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>128</value>
</property>

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>
mapred-site.xml:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>

<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
[Inline images 4-6: screenshots]

Thanks,
George

--
George A. Liaw
