Hi Daniel,

Thanks a lot. I will do that and rerun the query. :)

On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:

> It is a problem, as the application master needs to contact the other nodes.
>
> Try updating the hosts file on all the machines and try again.
>
> Daniel
>
> On Nov 24, 2014, at 19:26, Amit Behera <amit.bd...@gmail.com> wrote:
>
> I did not modify it on all the slaves, except one slave. Will it be a problem?
>
> But for small data (up to a 20 GB table) it runs, while on the 300 GB table only count(*) runs, sometimes succeeding and sometimes failing.
>
> Thanks,
> Amit
>
> On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>
>> Did you copy the hosts file to all the nodes?
>>
>> Daniel
>>
>> On Nov 24, 2014, at 19:04, Amit Behera <amit.bd...@gmail.com> wrote:
>>
>> Hi Daniel,
>>
>> The stack trace is the same for other queries; on different runs I get slave7, sometimes slave8...
>>
>> I also registered all the machines' IPs in /etc/hosts.
>>
>> Regards,
>> Amit
>>
>> On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv <daniel.ha...@veracity-group.com> wrote:
>>
>>> It seems that the application master can't resolve slave6's name to an IP.
>>>
>>> Daniel
>>>
>>> On Nov 24, 2014, at 18:49, Amit Behera <amit.bd...@gmail.com> wrote:
>>>
>>> Hi Users,
>>>
>>> My cluster (1+8) configuration:
>>>
>>> RAM:  32 GB each
>>> HDFS: 1.5 TB SSD
>>> CPU:  8 cores each
>>>
>>> -----------------------------------------------
>>>
>>> I am trying to query a 300 GB table, but I am only able to run select queries. For every other query I get the following exception:
>>>
>>> Total jobs = 1
>>> Stage-1 is selected by condition resolver.
>>> Launching Job 1 out of 1
>>> Number of reduce tasks not specified. Estimated from input data size: 183
>>> In order to change the average load for a reducer (in bytes):
>>>   set hive.exec.reducers.bytes.per.reducer=<number>
>>> In order to limit the maximum number of reducers:
>>>   set hive.exec.reducers.max=<number>
>>> In order to set a constant number of reducers:
>>>   set mapreduce.job.reduces=<number>
>>> Starting Job = job_1416831990090_0005, Tracking URL = http://master:8088/proxy/application_1416831990090_0005/
>>> Kill Command = /root/hadoop/bin/hadoop job -kill job_1416831990090_0005
>>> Hadoop job information for Stage-1: number of mappers: 679; number of reducers: 183
>>> 2014-11-24 19:43:01,523 Stage-1 map = 0%, reduce = 0%
>>> 2014-11-24 19:43:22,730 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 625.19 sec
>>> 2014-11-24 19:43:23,778 Stage-1 map = 100%, reduce = 100%
>>> MapReduce Total cumulative CPU time: 10 minutes 25 seconds 190 msec
>>> Ended Job = job_1416831990090_0005 with errors
>>> Error during job, obtaining debugging information...
>>> Examining task ID: task_1416831990090_0005_m_000005 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000042 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000035 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000065 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000002 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000007 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000058 (and more) from job job_1416831990090_0005
>>> Examining task ID: task_1416831990090_0005_m_000043 (and more) from job job_1416831990090_0005
>>>
>>> Task with the most failures(4):
>>> -----
>>> Task ID:
>>>   task_1416831990090_0005_m_000005
>>>
>>> URL:
>>>   http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005&tipid=task_1416831990090_0005_m_000005
>>> -----
>>> Diagnostic Messages for this Task:
>>> Container launch failed for container_1416831990090_0005_01_000112 : java.lang.IllegalArgumentException: java.net.UnknownHostException: slave6
>>>     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
>>>     at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)
>>>     at org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)
>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)
>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:189)
>>>     at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)
>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
>>>     at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.net.UnknownHostException: slave6
>>>     ... 12 more
>>>
>>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>> MapReduce Jobs Launched:
>>> Job 0: Map: 679  Reduce: 183  Cumulative CPU: 625.19 sec  HDFS Read: 0  HDFS Write: 0  FAIL
>>> Total MapReduce CPU Time Spent: 10 minutes 25 seconds 190 msec
>>>
>>> Please help me to fix the issue.
>>>
>>> Thanks,
>>> Amit
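
For readers hitting the same UnknownHostException: the fix Daniel describes is to give every machine, not just the master, a complete hosts file, because the YARN application master can run on any node and must be able to resolve every other node's name when launching containers. A minimal sketch, assuming passwordless root SSH; the IP addresses are placeholders, and the slave1-slave5 names are inferred from the 1+8 cluster described in the thread, so substitute your real values:

#!/usr/bin/env bash
# push_hosts.sh -- hypothetical helper, not part of the original thread.
# /etc/hosts on the master is assumed to already contain entries for the
# whole cluster, along these (placeholder) lines:
#   192.168.1.10  master
#   192.168.1.11  slave1
#   ...
#   192.168.1.18  slave8

# Copy the master's hosts file to every slave, then rerun the failing query.
for n in slave1 slave2 slave3 slave4 slave5 slave6 slave7 slave8; do
  scp /etc/hosts "root@$n:/etc/hosts"
done

Changes to /etc/hosts take effect on the next lookup, although a JVM that has just cached a failed lookup may take a few seconds to notice.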
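
To confirm the fix took, or to find which nodes are still missing entries, it helps to check that every node can resolve every other node. A sketch under the same assumptions (passwordless root SSH, node names adjusted to your cluster):

#!/usr/bin/env bash
# check_resolution.sh -- hypothetical helper, not part of the original thread.
# Verifies that every node can resolve every other node's hostname.
NODES="master slave1 slave2 slave3 slave4 slave5 slave6 slave7 slave8"

for src in $NODES; do
  for dst in $NODES; do
    # getent consults /etc/hosts (per nsswitch.conf), the same source the
    # JVM resolver falls back on, so a failure here lines up with the
    # UnknownHostException seen in the container launch log.
    if ! ssh "root@$src" "getent hosts $dst" >/dev/null 2>&1; then
      echo "FAIL: $src cannot resolve $dst"
    fi
  done
done

Any FAIL line names a machine whose hosts file still needs the missing entry, which also explains why the failing host varies between runs (slave6, slave7, slave8...): it depends on where YARN happens to place the application master.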