I'm running a single-node Apache Hadoop 2.4.0 "cluster" and trying to test my application's behavior when it exceeds the memory allocated for its containers. No matter what I do, I can't get the containers killed when they exceed the physical memory allocated to them.
Any suggestions on what I'm doing wrong, or on how to debug further, would be appreciated. Some additional information follows.

Memory requested for the container, as shown in the node manager console:

    *TotalMemoryNeeded 32*

Memory settings for the node, as shown in the node manager console:

    Total Vmem allocated for Containers 16.80 GB
    Vmem enforcement enabled true
    Total Pmem allocated for Container 8 GB
    *Pmem enforcement enabled true*

top output for the container child process, which is using ~111 MB of memory:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    *12013* xxxxxxxx 20 0 243m *111m* 10m R 100.0 1.4 19:52.55 osh

PIDs of the NodeManager (*4229*), the container launch process (*12008*), and the child (*12013*):

    xxxxxxxx *4229* 1 2 12:53 ? 00:01:04 /apt/isinstall/yli/soft/jdk/linux64/jdk/bin/java -Dproc_nodemanager -Xmx2000m <classpath removed for brevity> org.apache.hadoop.yarn.server.nodemanager.NodeManager

    xxxxxxxx *12008 4229* 0 13:00 ? 00:00:00 /bin/bash -c /sandbox/xxxxxxxx/is_11_3_2_hdfs/ORCHESTRATE/orch_master/apt/bin/osh -APT_PMplayerFlag isblade1.swg.usma.ibm.com 1 1 30 isblade1.swg.usma.ibm.com isblade1.swg.usma.ibm.com 1415296853.252169.2c47 12000 12001 12002 /tmp/APTps120026233e6d8_20141106130053.532726 -os_charset ISO-8859-1 --------------------------------------------------? 1>/tmp/logs/application_1415296425234_0003/container_1415296425234_0003_01_000003/stdout 2>/tmp/logs/application_1415296425234_0003/container_1415296425234_0003_01_000003/stderr

    xxxxxxxx *12013 12008* 99 13:00 ? 00:35:23 /sandbox/xxxxxxxx/is_11_3_2_hdfs/ORCHESTRATE/orch_master/apt/bin/osh -APT_PMplayerFlag isblade1.swg.usma.ibm.com 1 1 30 isblade1.swg.usma.ibm.com isblade1.swg.usma.ibm.com 1415296853.252169.2c47 12000 12001 12002 /tmp/APTps120026233e6d8_20141106130053.532726 -os_charset ISO-8859-1 parallel APT_JoinSubOperatorNC in fullouterjoin

What am I missing?

Thanks,
Eric
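P.S. To make the expectation concrete, here is a rough sketch of the comparison I assumed would trigger a kill: add up the resident memory of the container's process tree and compare it with the requested amount. This is only an illustration of my reasoning, not the NodeManager's actual code. The function name `tree_rss_mb` is made up, the 1 MB figure for the bash launcher is a placeholder (its RSS does not appear in the output above), and I have not confirmed that 32 MB is the limit the NodeManager really enforces.

```python
def tree_rss_mb(procs, root_pid):
    """Sum resident memory (MB) of root_pid and all of its descendants.

    procs maps pid -> (parent pid, RSS in MB).
    """
    # Build a parent -> children index so the tree can be walked from the root.
    children = {}
    for pid, (ppid, _rss) in procs.items():
        children.setdefault(ppid, []).append(pid)

    total = 0
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        total += procs[pid][1]
        stack.extend(children.get(pid, []))
    return total


# PIDs and the 111 MB figure come from the ps/top output above. The RSS of the
# bash launcher (12008) is not shown there, so 1 MB is a placeholder assumption.
procs = {
    12008: (4229, 1),     # /bin/bash -c ... (container launch process)
    12013: (12008, 111),  # osh child, RES 111m in top
}
requested_mb = 32  # TotalMemoryNeeded from the node manager console

used_mb = tree_rss_mb(procs, 12008)
over_limit = used_mb > requested_mb
print(f"tree RSS {used_mb} MB, requested {requested_mb} MB, over limit: {over_limit}")
```

By this reckoning the tree is well over the 32 MB that was requested, which is why I expected the container to be killed.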