
I have 4 instances on EC2:
1 master with the namenode and YARN running on it,
1 for the secondary namenode, and 2 slaves.

I used a t2.medium instance for the master node only and left the rest as
they were, and I still got the same exception. t2.medium is a decent
instance with 4GB RAM and 2 CPUs, so I don't think this exception is related
to memory. Any other suggestions, please?

> Hi,
> The actual useful part of the error is:
> Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> If you search Google for this plus "EC2", you will find a couple of
> results that point to memory exhaustion issues. You should try increasing
> the configured memory size.
> Since you are using a t2.micro, you should really try a bigger Amazon
> instance size. This will probably be a lot more useful than trying
> different configurations.
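
The arithmetic behind the memory suggestion can be sketched roughly as
follows. This is only an illustration under stated assumptions: YARN
containers (including the MapReduce ApplicationMaster) run on the
NodeManager hosts, which are assumed here to be the two slaves that were
left as t2.micro; the RAM figures are the published EC2 sizes (1 GiB for
t2.micro, 4 GiB for t2.medium); and reserved_mib is a hypothetical
allowance for the OS and Hadoop daemons, not a measured value. The helper
names are made up for this example.

```python
import re

# Assumed instance memory in MiB (published EC2 sizes).
INSTANCE_RAM_MIB = {"t2.micro": 1024, "t2.medium": 4096}


def parse_xmx_mib(java_opts):
    """Return the -Xmx heap size in MiB from a JVM options string, or None if absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    value, unit = int(match.group(1)), match.group(2).lower()
    # A bare number means bytes; k, m and g are the usual JVM suffixes.
    factor = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return value * factor


def heap_fits(java_opts, instance_type, reserved_mib=256):
    """Check whether the requested heap still leaves reserved_mib free on the host.

    reserved_mib is a hypothetical allowance, not a measured value.
    """
    heap = parse_xmx_mib(java_opts)
    if heap is None:
        return True  # no explicit heap requested, so nothing to check here
    return heap + reserved_mib <= INSTANCE_RAM_MIB[instance_type]


# Value of yarn.app.mapreduce.am.command-opts from the mapred-site.xml in this thread.
am_opts = "-Djava.awt.headless=true -Xmx1024m"

for instance in ("t2.micro", "t2.medium"):
    print(instance, "fits" if heap_fits(am_opts, instance) else "does not fit")
```

If the assumption about where the NodeManagers run holds, a 1024 MB heap
cannot fit on a 1 GiB host regardless of the master's size, so resizing
only the master would not be expected to change the outcome.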
>> Can anyone please help with this?
>> I followed the advice here
>> http://stackoverflow.com/questions/20390217/mapreduce-job-in-headless-environment-fails-n-times-due-to-am-container-exceptio
>> and added the following properties to mapred-site.xml, but I am still
>> getting the same error.
>> <property>
>>     <name>mapred.child.java.opts</name>
>>     <value>-Djava.awt.headless=true</value>
>> </property>
>> <!-- add headless to default -Xmx1024m -->
>> <property>
>>     <name>yarn.app.mapreduce.am.command-opts</name>
>>     <value>-Djava.awt.headless=true -Xmx1024m</value>
>> </property>
>> <property>
>>     <name>yarn.app.mapreduce.am.admin-command-opts</name>
>>     <value>-Djava.awt.headless=true</value>
>> </property>
>>> Hi,
>>> I am using Hive 0.13.1 and Hadoop 2.2.0 on Amazon EC2 t2.micro
>>> instances. I have 4 instances: the master has the namenode and YARN,
>>> the secondary namenode is on a separate instance, and the two slaves are
>>> each on their own instance.
>>> It was working fine until now, but it started to break when I tried to run
>>> the following query on 3GB of TPC-H generated data. The same query worked
>>> fine on 1GB:
>>> SELECT
>>>   l_orderkey
>>>   , sum(l_extendedprice*(1-l_discount)) as revenue
>>>   , o_orderdate
>>>   , o_shippriority
>>> FROM
>>> customer c JOIN orders o
>>>     ON (c.c_custkey = o.o_custkey)
>>> JOIN lineitem l
>>>     ON (l.l_orderkey = o.o_orderkey)
>>> WHERE
>>>  o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15'
>>> AND c.c_mktsegment = 'AUTOMOBILE'
>>> GROUP BY
>>> l_orderkey, o_orderdate, o_shippriority
>>> HAVING
>>> sum(l_extendedprice*(1-l_discount)) > 38500 --average revenue
>>> --LIMIT 10;
>>> I have tried many things, including limiting the memory settings in
>>> mapred-site.xml and yarn-site.xml, but nothing seems to work. I am
>>> attaching my mapred-site.xml and yarn-site.xml files for reference, plus
>>> the error log. For full log details, please see the attached hive.log
>>> file. Please help!
>>> Hadoop job information for Stage-7: number of mappers: 9; number of
>>> reducers: 0
>>> 2014-07-22 06:39:31,643 Stage-7 map = 0%,  reduce = 0%
>>> 2014-07-22 06:39:43,940 Stage-7 map = 6%,  reduce = 0%, Cumulative CPU
>>> 5.34 sec
>>> 2014-07-22 06:39:45,002 Stage-7 map = 11%,  reduce = 0%, Cumulative CPU
>>> 6.94 sec
>>> 2014-07-22 06:40:08,373 Stage-7 map = 17%,  reduce = 0%, Cumulative CPU
>>> 12.6 sec
>>> 2014-07-22 06:40:10,417 Stage-7 map = 22%,  reduce = 0%, Cumulative CPU
>>> 14.06 sec
>>> 2014-07-22 06:40:22,732 Stage-7 map = 28%,  reduce = 0%, Cumulative CPU
>>> 24.46 sec
>>> 2014-07-22 06:40:25,843 Stage-7 map = 33%,  reduce = 0%, Cumulative CPU
>>> 25.74 sec
>>> 2014-07-22 06:40:33,039 Stage-7 map = 44%,  reduce = 0%, Cumulative CPU
>>> 33.32 sec
>>> 2014-07-22 06:40:38,709 Stage-7 map = 56%,  reduce = 0%, Cumulative CPU
>>> 37.19 sec
>>> 2014-07-22 06:41:07,648 Stage-7 map = 61%,  reduce = 0%, Cumulative CPU
>>> 42.83 sec
>>> 2014-07-22 06:41:15,900 Stage-7 map = 56%,  reduce = 0%, Cumulative CPU
>>> 39.49 sec
>>> 2014-07-22 06:41:27,299 Stage-7 map = 67%,  reduce = 0%, Cumulative CPU
>>> 46.07 sec
>>> 2014-07-22 06:41:28,342 Stage-7 map = 56%,  reduce = 0%, Cumulative CPU
>>> 40.9 sec
>>> 2014-07-22 06:41:43,753 Stage-7 map = 61%,  reduce = 0%, Cumulative CPU
>>> 42.84 sec
>>> 2014-07-22 06:41:45,801 Stage-7 map = 100%,  reduce = 0%, Cumulative CPU
>>> 37.19 sec
>>> MapReduce Total cumulative CPU time: 37 seconds 190 msec
>>> Ended Job = job_1406011031680_0002 with errors
>>> Error during job, obtaining debugging information...
>>> Job Tracking URL:
>>> http://ec2-54-77-76-145.eu-west-1.compute.amazonaws.com:8088/proxy/application_1406011031680_0002/
>>> Examining task ID: task_1406011031680_0002_m_000001 (and more) from job
>>> job_1406011031680_0002
>>> Examining task ID: task_1406011031680_0002_m_000005 (and more) from job
>>> job_1406011031680_0002
>>> Task with the most failures(4):
>>> -----
>>> Task ID:
>>>   task_1406011031680_0002_m_000008
>>> URL:
>>> http://ec2-54-77-76-145.eu-west-1.compute.amazonaws.com:8088/taskdetails.jsp?jobid=job_1406011031680_0002&tipid=task_1406011031680_0002_m_000008
>>> -----
>>> Diagnostic Messages for this Task:
>>> Exception from container-launch:
>>> org.apache.hadoop.util.Shell$ExitCodeException:
>>>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>>>         at org.apache.hadoop.util.Shell.run(Shell.java:379)
>>>         at
>>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>>>         at
>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>>         at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>>         at
>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:744)
>>> FAILED: Execution Error, return code 2 from
>>> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>> MapReduce Jobs Launched:
>>> Job 0: Map: 3  Reduce: 1   Cumulative CPU: 24.58 sec   HDFS Read:
>>> 593821601 HDFS Write: 14518009 SUCCESS
>>> Job 1: Map: 9   Cumulative CPU: 37.19 sec   HDFS Read: 1342219615 HDFS
>>> Write: 821879 FAIL
>>> Total MapReduce CPU Time Spent: 1 minutes 1 seconds 770 msec
>>> hive (default)> exit;
