Sure, thank you

On Fri, Oct 19, 2018 at 11:06 AM Thai Bui <[email protected]> wrote:
> Your Tez container size is too small relative to your query and data
> size. Notice the log said *1.0 GB of 1 GB physical memory used.* It's
> because the default Tez container/task size for your cluster is 1024 MB. You
> can increase it to a higher number (such as 2048 or 4096) via the setting
> hive.tez.container.size when you launch your cluster.
>
> Similarly, make sure that your YARN node manager setting is high enough
> (via yarn.nodemanager.resource.memory-mb) so that you can launch a
> container larger than 1 GB in size.
>
> This article may help you understand what to tune, where, and how.
> It should be applicable to an EMR cluster:
> https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html
>
>
>
> On Thu, Oct 18, 2018 at 1:13 PM AgriNut solutions <
> [email protected]> wrote:
>
>> Hi Hive experts,
>>
>> I have an EMR cluster with 1 master node, 3 core nodes, and autoscaled
>> task nodes (min 1, max 20).
>>
>> The Hive table's data is 3.5 GB with 1.3e6 rows and 28 columns, and we
>> can't run any query on it because it fails with a memory error:
>>
>> Initially I got the error below:
>> ```
>> Application application_1538433214426_0296 failed 2 times due to AM
>> Container for appattempt_1538433214426_0296_000002 exited with exitCode:
>> -104
>> *Failing this attempt.Diagnostics: Container
>> [pid=20906,containerID=container_1538433214426_0296_02_000001] is running
>> beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical
>> memory used; 2.8 GB of 5 GB virtual memory used. Killing container.*
>> Dump of the process-tree for container_1538433214426_0296_02_000001 :
>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>> |- 20906 20904 20906 20906 (bash) 0 0 115863552 670 /bin/bash -c
>> /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1538433214426_0296/container_1538433214426_0296_02_000001/tmp
>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>> -XX:+UseParallelGC
>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>> -Dlog4j.configuration=tez-container-log4j.properties
>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001
>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
>> org.apache.tez.dag.app.DAGAppMaster --session
>> 1>/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001/stdout
>> 2>/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001/stderr
>> |- 20921 20906 20906 20906 (java) 4140 141 2911690752 263307
>> /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1538433214426_0296/container_1538433214426_0296_02_000001/tmp
>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>> -XX:+UseParallelGC
>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>> -Dlog4j.configuration=tez-container-log4j.properties
>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001
>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=
>> org.apache.tez.dag.app.DAGAppMaster --session
>> *Container killed on request. Exit code is 143*
>> *Container exited with a non-zero exit code 143*
>> For more detailed output, check the application tracking page:
>> http://ip-172-24-11-108.us-east-2.compute.internal:8088/cluster/app/application_1538433214426_0296
>> Then click on links to logs of each attempt.
>> . Failing the application.
>> FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.tez.TezTask. Application
>> application_1538433214426_0296 failed 2 times due to AM Container for
>> appattempt_1538433214426_0296_000002 exited with exitCode: -104
>> Failing this attempt.Diagnostics: Container
>> [pid=20906,containerID=container_1538433214426_0296_02_000001] is running
>> beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical
>> memory used; 2.8 GB of 5 GB virtual memory used. Killing container.
>> Dump of the process-tree for container_1538433214426_0296_02_000001 :
>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>> |- 20906 20904 20906 20906 (bash) 0 0 115863552 670 /bin/bash -c
>> /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1538433214426_0296/container_1538433214426_0296_02_000001/tmp
>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>> -XX:+UseParallelGC
>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>> -Dlog4j.configuration=tez-container-log4j.properties
>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001
>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
>> org.apache.tez.dag.app.DAGAppMaster --session
>> 1>/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001/stdout
>> 2>/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001/stderr
>> |- 20921 20906 20906 20906 (java) 4140 141 2911690752 263307
>> /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1538433214426_0296/container_1538433214426_0296_02_000001/tmp
>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>> -XX:+UseParallelGC
>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>> -Dlog4j.configuration=tez-container-log4j.properties
>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1538433214426_0296/container_1538433214426_0296_02_000001
>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=
>> org.apache.tez.dag.app.DAGAppMaster --session
>> Container killed on request. Exit code is 143
>> Container exited with a non-zero exit code 143
>> For more detailed output, check the application tracking page:
>> http://ip-172-24-11-108.us-east-2.compute.internal:8088/cluster/app/application_1538433214426_0296
>> Then click on links to logs of each attempt.
>> . Failing the application.
>> ```
>> Can anyone help with what the issue might be? Any suggestions would
>> help. Thanks in advance.
>> Also, no matter how many nodes, mappers, and reducers I have, the query
>> executes in only one container. Any help with this would be appreciated
>> too. Thanks.
>>
>
>
> --
> Thai
>
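
For anyone reading this thread later, the numbers in the log above can be checked with a few lines of Python. This is only a rough sketch, not anything Hive or YARN ships. It assumes 4 KiB memory pages (the log reports RSS in pages but does not give the page size) and a heap fraction of 0.8, which is inferred from -Xmx819m against the 1 GB limit and matches what I understand to be the default of tez.container.max.java.heap.fraction. The function names are made up for illustration.

```python
# Sketch: reproduce the memory arithmetic behind the YARN kill message.
# Assumptions (not stated in the log): 4 KiB pages, heap = 0.8 * container.

PAGE_SIZE = 4096          # bytes per page (assumed)
MIB = 1024 ** 2           # bytes per MiB


def tree_rss_bytes(rss_pages, page_size=PAGE_SIZE):
    """Total resident memory of a process tree, given RSSMEM_USAGE(PAGES) per process."""
    return sum(rss_pages) * page_size


def heap_mb(container_mb, fraction=0.8):
    """JVM heap (-Xmx, in MB) for a container, truncated to a whole MB."""
    return int(container_mb * fraction)


# RSS pages from the process-tree dump: bash (670) and java (263307).
used = tree_rss_bytes([670, 263307])
limit = 1024 * MIB

print(f"used  = {used / MIB:.1f} MiB")
print(f"limit = {limit / MIB:.1f} MiB")
print("over limit:", used > limit)

# Heap that each candidate container size would give under the same fraction.
for size_mb in (1024, 2048, 4096):
    print(f"container {size_mb} MB -> -Xmx{heap_mb(size_mb)}m")
```

Under those assumptions the process tree sits a few MiB above the 1024 MiB limit, which is consistent with "1.0 GB of 1 GB physical memory used". Note also that the killed process in the dump is org.apache.tez.dag.app.DAGAppMaster, so it may be worth checking tez.am.resource.memory.mb in addition to hive.tez.container.size.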
