I ran into similar under-utilization a while back when moving one of my MR jobs to YARN/Hadoop 2.7. It will depend on what your job is doing, but in my case the biggest improvement came from increasing the split size (from its default of 128M up to 2G): my mappers were finishing very quickly, so my containers were short-lived, and I was paying the container allocation/deallocation cost over and over. I also bumped up memory on the resource manager and did a couple of other things.
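If you want to try the split-size change, here's a minimal sketch (the property name is the standard Hadoop 2.x one; the 2 GB value is just what worked for my job, so tune it for yours). Split size is computed as max(minSize, min(maxSize, blockSize)), so raising the minimum above the 128 MB block size forces fewer, longer-lived mappers:

<!-- mapred-site.xml, or pass per-job with -D: force ~2 GB input splits.
     Raising the minimum above the block size yields fewer, longer-lived
     map containers, reducing allocation/deallocation churn. -->
<property>
  <name>mapreduce.input.fileinputformat.split.minsize</name>
  <value>2147483648</value> <!-- 2 GB -->
</property>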
Here's the thread: https://mail-archives.apache.org/mod_mbox/hadoop-user/201605.mbox/%3cam2pr04mb07084248711dc9138d805a9990...@am2pr04mb0708.eurprd04.prod.outlook.com%3E

Hope this helps! Good luck!

-Jeff

From: George Liaw [mailto:george.a.l...@gmail.com]
Sent: Friday, October 07, 2016 2:32 PM
To: user@hadoop.apache.org
Subject: Hadoop 2.7.2 Yarn Memory Utilization

Hi,

I'm playing around with setting up a Hadoop 2.7.2 cluster with some configs that I inherited, and I'm running into an issue where YARN isn't utilizing all of the memory available on the NodeManagers and, for some reason, seems limited to 2-3 map tasks per application. Can anyone shed some light on this or let me know what else I should look into? Screenshots of what I'm seeing are attached below.

The relevant configs that I'm aware of:

yarn-site.xml:

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>128</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>

mapred-site.xml:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>

[Inline images 4-6: screenshots]

Thanks,
George

--
George A. Liaw
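For reference when reading the configs above: the number of concurrent 4096 MB containers each NodeManager can run is bounded by yarn.nodemanager.resource.memory-mb, which is not shown in the configs. It defaults to 8192 MB in Hadoop 2.7, which would cap a node at 8192 / 4096 = 2 map containers and would be consistent with the 2-3 map tasks observed. A minimal sketch of raising it, with a hypothetical value that should be sized to the node's physical RAM:

<!-- yarn-site.xml: total memory YARN may allocate on this NodeManager.
     At the 2.7 default of 8192 MB, 4096 MB containers cap out at two
     per node. The 24576 MB value below is hypothetical; set it to the
     memory you actually want YARN to use on each node. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>24576</value>
</property>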