Hey,
A Tez task is failing with this message:
[pid=14120,containerID=container_1411415114858_0003_01_000036] is
running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical
memory used; 1.5 GB of 2.1 GB virtual memory used. Killing container.
As per my understanding, this can happen if -Xmx for the container is higher
than the resource request in YARN, right?
But checking the -Xmx of the container shows that it is only 819 MB, whereas the
container was requested with 1 GB.
Dump of the process-tree for container_1411415114858_0003_01_000036 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 14601 14120 14120 14120 (java) 243781 3714 1528500224 262110
/usr/java/jdk1.7.0_55-cloudera/bin/java -Xmx819m
-Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
-Dlog4j.configuration=tez-container-log4j.properties
-Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1411415114858_0003/container_1411415114858_0003_01_000036
-Dtez.root.logger=INFO,CLA
-Djava.io.tmpdir=/mnt/data1/yarn/nm/usercache/qa/appcache/application_1411415114858_0003/container_1411415114858_0003_01_000036/tmp
org.apache.tez.runtime.task.TezChild 10.167.165.29 46626
container_1411415114858_0003_01_000036 application_1411415114858_0003 1
|- 14120 5839 14120 14120 (bash) 1 1 110809088 335 /bin/bash -c
/usr/java/jdk1.7.0_55-cloudera/bin/java -Xmx819m
-Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
-Dlog4j.configuration=tez-container-log4j.properties
-Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1411415114858_0003/container_1411415114858_0003_01_000036
-Dtez.root.logger=INFO,CLA
-Djava.io.tmpdir=/mnt/data1/yarn/nm/usercache/qa/appcache/application_1411415114858_0003/container_1411415114858_0003_01_000036/tmp
org.apache.tez.runtime.task.TezChild 10.167.165.29 46626
container_1411415114858_0003_01_000036 application_1411415114858_0003 1
1>/var/log/hadoop-yarn/container/application_1411415114858_0003/container_1411415114858_0003_01_000036/stdout
2>/var/log/hadoop-yarn/container/application_1411415114858_0003/container_1411415114858_0003_01_000036/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
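For reference, the RSSMEM_USAGE(PAGES) values in the dump can be converted to bytes and compared with the 1 GB limit and with -Xmx819m. This is only a rough sketch: the 4 KiB page size is an assumption (the log does not state it), and I am assuming the check sums the RSS of every process in the container's tree. The variable names are made up for illustration.

```python
# Rough check of the numbers in the process-tree dump above.
# ASSUMPTION: 4 KiB memory pages; the log does not state the page size.
PAGE_SIZE = 4096

CONTAINER_LIMIT = 1 * 1024**3   # 1 GB container request, in bytes
XMX = 819 * 1024**2             # -Xmx819m, in bytes

# RSSMEM_USAGE(PAGES) per PID, copied from the dump.
rss_pages = {
    14601: 262110,  # java (TezChild)
    14120: 335,     # bash wrapper
}

def rss_bytes(pages, page_size=PAGE_SIZE):
    """Convert a resident-set size in pages to bytes."""
    return pages * page_size

# ASSUMPTION: the limit is checked against the whole process tree.
total_rss = sum(rss_bytes(p) for p in rss_pages.values())
over_limit = total_rss - CONTAINER_LIMIT

# Resident memory of the JVM beyond the maximum heap size. Since -Xmx
# caps only the Java heap, this is a lower bound on the non-heap memory.
java_rss_minus_xmx = rss_bytes(rss_pages[14601]) - XMX

print("total RSS (bytes):      ", total_rss)
print("over the limit (bytes): ", over_limit)
print("java RSS - Xmx (bytes): ", java_rss_minus_xmx)
```

If those assumptions hold, the process tree's resident memory is slightly above 1 GB even though the heap is capped at 819 MB, so the JVM's resident memory outside the heap would be what tips it over.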
Any ideas?
Johannes