Hello,

On my 5-node Hadoop 2.7.3 cluster (AWS EC2 instances), everything was running
smoothly until I submitted one query. I tried to create an ORC table using
the query below:

create table dummy_orc stored as orc tblproperties ("orc.compress"="Lz4")
as select * from dummy;

The job reported that it would run 76 mappers and 0 reducers, and it started.
After some 10-12 minutes, when the map phase reached 100%, the job aborted
without producing any output. Since the number of records was large, I did not
mind the long time it took initially. But then all my datanode and nodemanager
daemons died. The hdfs dfsadmin -report command showed 0 cluster capacity, 0
live datanodes, etc.

I restarted the cluster completely: namenode, resource manager, datanode,
nodemanager, zkfc services, QuorumPeerMain, everything. After that, the
cluster capacity etc. is reported correctly, and I am able to run normal
non-MapReduce queries like select *.

But MapReduce jobs are not starting. Spark jobs are not running now either;
they are stuck in the ACCEPTED state, like the MR jobs.

For select count(1) from dummy, the MR job is stuck at:

Query ID = hadoopuser_20170728093320_b1875223-801e-466b-997f-4b58f0e90041
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1501233326257_0003, Tracking URL = http://dev-bigdatamaster1:8088/proxy/application_1501233326257_0003/
Kill Command = /home/hadoopuser/hadoop//bin/hadoop job  -kill job_1501233326257_0003

Which log would give me a better picture for resolving this error? And what
went wrong?
