Hi Daniel,

Thank you, it's running fine.
*Another question:* could you please tell me what to do if I get a *Shuffle
Error*? I got that error once while running a join query between the 300 GB
table and a 20 GB table.

Thanks
Amit

On Mon, Nov 24, 2014 at 11:13 PM, Daniel Haviv <
daniel.ha...@veracity-group.com> wrote:

> Good luck.
> Share your results with us.
>
> Daniel
>
> On Nov 24, 2014, at 19:36, Amit Behera <amit.bd...@gmail.com> wrote:
>
> Hi Daniel,
>
> Thanks a lot.
>
> I will do that and rerun the query. :)
>
> On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv <
> daniel.ha...@veracity-group.com> wrote:
>
>> It is a problem, as the application master needs to contact the other
>> nodes.
>>
>> Try updating the hosts file on all the machines and try again.
>>
>> Daniel
>>
>> On Nov 24, 2014, at 19:26, Amit Behera <amit.bd...@gmail.com> wrote:
>>
>> I did not modify it on all the slaves, only on one slave.
>>
>> Will that be a problem?
>>
>> For small data (up to a 20 GB table) queries run, but on the 300 GB
>> table even count(*) succeeds only sometimes and fails at other times.
>>
>> Thanks
>> Amit
>>
>> On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv <
>> daniel.ha...@veracity-group.com> wrote:
>>
>>> Did you copy the hosts file to all the nodes?
>>>
>>> Daniel
>>>
>>> On Nov 24, 2014, at 19:04, Amit Behera <amit.bd...@gmail.com> wrote:
>>>
>>> Hi Daniel,
>>>
>>> The stack trace is the same for other queries; on different runs I get
>>> slave7, sometimes slave8...
>>>
>>> And I also registered all the machines' IPs in /etc/hosts.
>>>
>>> Regards
>>> Amit
>>>
>>> On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv <
>>> daniel.ha...@veracity-group.com> wrote:
>>>
>>>> It seems that the application master can't resolve slave6's name to an
>>>> IP.
>>>>
>>>> Daniel
>>>>
>>>> On Nov 24, 2014, at 18:49, Amit Behera <amit.bd...@gmail.com> wrote:
>>>>
>>>> Hi Users,
>>>>
>>>> *my cluster (1+8) configuration*:
>>>>
>>>> RAM : 32 GB each
>>>> HDFS : 1.5 TB SSD
>>>> CPU : 8 cores each
>>>>
>>>> -----------------------------------------------
>>>>
>>>> I am trying to query a 300 GB table, but only SELECT queries work.
>>>> For every other query I get the following exception:
>>>>
>>>> Total jobs = 1
>>>> Stage-1 is selected by condition resolver.
>>>> Launching Job 1 out of 1
>>>> Number of reduce tasks not specified. Estimated from input data size: 183
>>>> In order to change the average load for a reducer (in bytes):
>>>>   set hive.exec.reducers.bytes.per.reducer=<number>
>>>> In order to limit the maximum number of reducers:
>>>>   set hive.exec.reducers.max=<number>
>>>> In order to set a constant number of reducers:
>>>>   set mapreduce.job.reduces=<number>
>>>> Starting Job = job_1416831990090_0005, Tracking URL =
>>>> http://master:8088/proxy/application_1416831990090_0005/
>>>> Kill Command = /root/hadoop/bin/hadoop job -kill job_1416831990090_0005
>>>> Hadoop job information for Stage-1: number of mappers: 679; number of
>>>> reducers: 183
>>>> 2014-11-24 19:43:01,523 Stage-1 map = 0%, reduce = 0%
>>>> 2014-11-24 19:43:22,730 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 625.19 sec
>>>> 2014-11-24 19:43:23,778 Stage-1 map = 100%, reduce = 100%
>>>> MapReduce Total cumulative CPU time: 10 minutes 25 seconds 190 msec
>>>> Ended Job = job_1416831990090_0005 with errors
>>>> Error during job, obtaining debugging information...
>>>> Examining task ID: task_1416831990090_0005_m_000005 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000042 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000035 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000065 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000002 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000007 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000058 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000043 (and more) from job job_1416831990090_0005
>>>>
>>>> Task with the most failures (4):
>>>> -----
>>>> Task ID: task_1416831990090_0005_m_000005
>>>> URL:
>>>> http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005&tipid=task_1416831990090_0005_m_000005
>>>> -----
>>>> Diagnostic Messages for this Task:
>>>> Container launch failed for container_1416831990090_0005_01_000112 :
>>>> java.lang.IllegalArgumentException: java.net.UnknownHostException: slave6
>>>>     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
>>>>     at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)
>>>>     at org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)
>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)
>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:189)
>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)
>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.net.UnknownHostException: slave6
>>>>     ... 12 more
>>>>
>>>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>> MapReduce Jobs Launched:
>>>> Job 0: Map: 679  Reduce: 183  Cumulative CPU: 625.19 sec  HDFS Read: 0 HDFS Write: 0 FAIL
>>>> Total MapReduce CPU Time Spent: 10 minutes 25 seconds 190 msec
>>>>
>>>> Please help me to fix the issue.
>>>>
>>>> Thanks
>>>> Amit
>>>>
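For reference, the fix Daniel describes amounts to distributing one consistent hosts file to the master and all eight slaves. A minimal sketch, with made-up IP addresses (substitute the cluster's real ones):

```
# /etc/hosts — keep this file identical on the master and on every slave.
# IP addresses below are illustrative only.
127.0.0.1     localhost
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2
192.168.1.13  slave3
192.168.1.14  slave4
192.168.1.15  slave5
192.168.1.16  slave6
192.168.1.17  slave7
192.168.1.18  slave8
```

The UnknownHostException came from whichever machine was asked to launch a container on slave6 but had no entry for it, which is why updating only one slave's file was not enough.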
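As a side note on the reducer hints Hive prints in the log above: they can be set per session before running the query. The values here are illustrative only, not tuned recommendations:

```sql
-- Illustrative values, not recommendations; run in the Hive session
-- before the query. These are the knobs Hive's own log output lists:
SET hive.exec.reducers.bytes.per.reducer=536870912;  -- ~512 MB per reducer
SET hive.exec.reducers.max=200;                      -- cap the reducer count
-- or pin an exact number of reducers:
-- SET mapreduce.job.reduces=183;
```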
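A quick way to confirm the fix took effect on a given machine is to check that every cluster hostname resolves there. A sketch (hostnames taken from this thread; run it on the master and on each slave, e.g. over ssh):

```shell
#!/bin/sh
# Check that each cluster hostname resolves on this machine.
for host in master slave1 slave2 slave3 slave4 slave5 slave6 slave7 slave8; do
    if getent hosts "$host" >/dev/null 2>&1; then
        echo "$host: resolves"
    else
        echo "$host: UNRESOLVED"   # an unresolved name is what raised UnknownHostException
    fi
done
```

Any "UNRESOLVED" line on any node means that node's /etc/hosts still needs the missing entry.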