Hi Daniel,

Thank you, it's running fine.
*Another question:* could you please tell me what to do if I get a *Shuffle
Error*? I got that error once while running a join query between the 300 GB
table and a 20 GB table.

Thanks
Amit

On Mon, Nov 24, 2014 at 11:13 PM, Daniel Haviv <
daniel.ha...@veracity-group.com> wrote:

> Good luck.
> Share your results with us.
>
> Daniel
>
> On Nov 24, 2014, at 19:36, Amit Behera <amit.bd...@gmail.com> wrote:
>
> Hi Daniel,
>
> Thanks a lot.
>
> I will do that and rerun the query. :)
>
> On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv <
> daniel.ha...@veracity-group.com> wrote:
>
>> It is a problem, as the application master needs to contact the other
>> nodes.
>>
>> Try updating the hosts file on all the machines and try again.
>>
>> Daniel
>>
>> On Nov 24, 2014, at 19:26, Amit Behera <amit.bd...@gmail.com> wrote:
>>
>> I did not modify it on all the slaves, only on one slave.
>>
>> Will that be a problem?
>>
>> For small data (up to a 20 GB table) queries run, but on the 300 GB
>> table even count(*) succeeds only sometimes and fails at other times.
>>
>> Thanks
>> Amit
>>
>> On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv <
>> daniel.ha...@veracity-group.com> wrote:
>>
>>> Did you copy the hosts file to all the nodes?
>>>
>>> Daniel
>>>
>>> On Nov 24, 2014, at 19:04, Amit Behera <amit.bd...@gmail.com> wrote:
>>>
>>> Hi Daniel,
>>>
>>> The stack trace is the same for other queries; on different runs I get
>>> slave7, sometimes slave8...
>>>
>>> And I also registered all the machines' IPs in /etc/hosts.
>>>
>>> Regards
>>> Amit
>>>
>>> On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv <
>>> daniel.ha...@veracity-group.com> wrote:
>>>
>>>> It seems that the application master can't resolve slave6's name to an
>>>> IP.
>>>>
>>>> Daniel
>>>>
>>>> On Nov 24, 2014, at 18:49, Amit Behera <amit.bd...@gmail.com> wrote:
>>>>
>>>> Hi Users,
>>>>
>>>> *my cluster (1+8) configuration*:
>>>>
>>>> RAM : 32 GB each
>>>> HDFS : 1.5 TB SSD
>>>> CPU : 8 cores each
>>>>
>>>> -----------------------------------------------
>>>>
>>>> I am trying to query a 300 GB table, but only SELECT queries work.
>>>> For every other query I get the following exception:
>>>>
>>>> Total jobs = 1
>>>> Stage-1 is selected by condition resolver.
>>>> Launching Job 1 out of 1
>>>> Number of reduce tasks not specified. Estimated from input data size: 183
>>>> In order to change the average load for a reducer (in bytes):
>>>>   set hive.exec.reducers.bytes.per.reducer=<number>
>>>> In order to limit the maximum number of reducers:
>>>>   set hive.exec.reducers.max=<number>
>>>> In order to set a constant number of reducers:
>>>>   set mapreduce.job.reduces=<number>
>>>> Starting Job = job_1416831990090_0005, Tracking URL =
>>>> http://master:8088/proxy/application_1416831990090_0005/
>>>> Kill Command = /root/hadoop/bin/hadoop job -kill job_1416831990090_0005
>>>> Hadoop job information for Stage-1: number of mappers: 679; number of
>>>> reducers: 183
>>>> 2014-11-24 19:43:01,523 Stage-1 map = 0%, reduce = 0%
>>>> 2014-11-24 19:43:22,730 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 625.19 sec
>>>> 2014-11-24 19:43:23,778 Stage-1 map = 100%, reduce = 100%
>>>> MapReduce Total cumulative CPU time: 10 minutes 25 seconds 190 msec
>>>> Ended Job = job_1416831990090_0005 with errors
>>>> Error during job, obtaining debugging information...
>>>> Examining task ID: task_1416831990090_0005_m_000005 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000042 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000035 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000065 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000002 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000007 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000058 (and more) from job job_1416831990090_0005
>>>> Examining task ID: task_1416831990090_0005_m_000043 (and more) from job job_1416831990090_0005
>>>>
>>>> Task with the most failures (4):
>>>> -----
>>>> Task ID: task_1416831990090_0005_m_000005
>>>> URL:
>>>> http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005&tipid=task_1416831990090_0005_m_000005
>>>> -----
>>>> Diagnostic Messages for this Task:
>>>> Container launch failed for container_1416831990090_0005_01_000112 :
>>>> java.lang.IllegalArgumentException: java.net.UnknownHostException: slave6
>>>>     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
>>>>     at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)
>>>>     at org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)
>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)
>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:189)
>>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)
>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.net.UnknownHostException: slave6
>>>>     ... 12 more
>>>>
>>>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>> MapReduce Jobs Launched:
>>>> Job 0: Map: 679  Reduce: 183  Cumulative CPU: 625.19 sec  HDFS Read: 0 HDFS Write: 0 FAIL
>>>> Total MapReduce CPU Time Spent: 10 minutes 25 seconds 190 msec
>>>>
>>>> Please help me to fix the issue.
>>>>
>>>> Thanks
>>>> Amit
>>>>
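For reference, the fix Daniel describes amounts to distributing one consistent hosts file to the master and all eight slaves. A minimal sketch, with made-up IP addresses (substitute the cluster's real ones):

```
# /etc/hosts — keep this file identical on the master and on every slave.
# IP addresses below are illustrative only.
127.0.0.1     localhost
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2
192.168.1.13  slave3
192.168.1.14  slave4
192.168.1.15  slave5
192.168.1.16  slave6
192.168.1.17  slave7
192.168.1.18  slave8
```

The UnknownHostException came from whichever machine was asked to launch a container on slave6 but had no entry for it, which is why updating only one slave's file was not enough.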
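As a side note on the reducer hints Hive prints in the log above: they can be set per session before running the query. The values here are illustrative only, not tuned recommendations:

```sql
-- Illustrative values, not recommendations; run in the Hive session
-- before the query. These are the knobs Hive's own log output lists:
SET hive.exec.reducers.bytes.per.reducer=536870912;  -- ~512 MB per reducer
SET hive.exec.reducers.max=200;                      -- cap the reducer count
-- or pin an exact number of reducers:
-- SET mapreduce.job.reduces=183;
```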
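A quick way to confirm the fix took effect on a given machine is to check that every cluster hostname resolves there. A sketch (hostnames taken from this thread; run it on the master and on each slave, e.g. over ssh):

```shell
#!/bin/sh
# Check that each cluster hostname resolves on this machine.
for host in master slave1 slave2 slave3 slave4 slave5 slave6 slave7 slave8; do
    if getent hosts "$host" >/dev/null 2>&1; then
        echo "$host: resolves"
    else
        echo "$host: UNRESOLVED"   # an unresolved name is what raised UnknownHostException
    fi
done
```

Any "UNRESOLVED" line on any node means that node's /etc/hosts still needs the missing entry.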