Dear Guys:

We recently compiled Impala in our development environment, and when we run
the compiled Impala we hit the following problem.

Problem: Impala runs successfully as long as we do not reboot the machine.
After a reboot, however, we cannot restart the Impala processes. We tried
several machines, and the problem occurs on every one of them.

We have struggled with this for a long time without success. Could you help
us solve the problem?

The environment and error messages are as follows.

Environment:
OS: Distributor ID: CentOS
Description:    CentOS Linux release 7.2.1511 (Core)
Release:        7.2.1511
Codename:       Core
Kernel: Linux version 3.10.0-327.28.2.el7.x86_64
Impala version: cdh5-trunk


1. We start Impala using ${IMPALA_HOME}/testdata/bin/run-all.sh and get the
following output.
[root@localhost rtap-on-impala]# ${IMPALA_HOME}/testdata/bin/run-all.sh
Killing running services...
Starting all cluster services...
--> Starting mini-DFS cluster
Stopping kms
Stopping llama
Stopping yarn
Stopping hdfs
Starting hdfs (Web UI - http://localhost:5070)
....Namenode started
Starting yarn (Web UI - http://localhost:8088)
Starting llama (Web UI - http://localhost:1501)
Starting kms (Web UI - http://localhost:16000)
The cluster is running
--> Starting HBase
localhost: starting zookeeper, logging to /home/linxiaoyong/impala_new/rtap-on-impala/impala/cluster_logs/hbase/hbase-root-zookeeper-localhost.localdomain.out
starting master, logging to /home/linxiaoyong/impala_new/rtap-on-impala/impala/cluster_logs/hbase/hbase-root-master-localhost.localdomain.out
16/09/28 17:15:52 INFO util.VersionInfo: HBase 1.2.0-cdh5.8.0-SNAPSHOT
16/09/28 17:15:52 INFO util.VersionInfo: Source code repository file:///var/lib/jenkins/workspace/generic-binary-tarball-and-maven-deploy/CDH5-Packaging-HBase-2016-02-24_17-14-20/hbase-1.2.0-cdh5.8.0-SNAPSHOT revision=Unknown
16/09/28 17:15:52 INFO util.VersionInfo: Compiled by jenkins on Wed Feb 24 17:26:12 PST 2016
16/09/28 17:15:52 INFO util.VersionInfo: From source with checksum 2c2f0626ababf9b47e88728c663df5c7
Waiting for HBase Master
...........................Failure
Hbase master did NOT write /hbase/rs in 30.4s
Error in /home/linxiaoyong/impala_new/rtap-on-impala/impala/testdata/bin/run-hbase.sh at line 87: ${CLUSTER_BIN}/wait-for-hbase-master.py
Error in /home/linxiaoyong/impala_new/rtap-on-impala/impala/testdata/bin/run-all.sh at line 48: tee ${IMPALA_TEST_CLUSTER_LOG_DIR}/run-hbase.log




2. We opened cluster_logs/hbase/hbase-root-master-localhost.localdomain.out
in vim. It contains the following errors:

16/09/28 17:16:10 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
16/09/28 17:16:10 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
16/09/28 17:16:11 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
16/09/28 17:16:11 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
16/09/28 17:16:11 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
16/09/28 17:16:11 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper create failed after 4 attempts
16/09/28 17:16:11 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
16/09/28 17:16:11 ERROR master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster.
        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2428)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:232)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2438)
Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: master:600000x0, quorum=localhost:2181, baseZNode=/hbase Unexpected KeeperException creating base node
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:206)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:187)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:590)
        at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:375)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2421)
        ... 5 more






We then ran "jps" to list the running Java processes:

[root@localhost rtap-on-impala]# jps
26528 LlamaAMMain
25921 NodeManager
25186 DataNode
25890 NodeManager
29188 Jps
25221 DataNode
25864 NodeManager
25162 DataNode
26635 Bootstrap
14194 -- process information unavailable
25246 NameNode
25950 ResourceManager
27423 HQuorumPeer
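
The jps output lists an HQuorumPeer process, yet the HBase master log reports "Connection refused" for localhost:2181 on both 127.0.0.1 and ::1. To check whether anything is actually accepting TCP connections on that port, a small probe along the following lines could be run on the affected machine. This is only a sketch: the helper name port_is_open is our own and is not part of Impala, HBase, or ZooKeeper, and port 2181 is taken from the "quorum=localhost:2181" line in the log.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable addresses.
        return False


if __name__ == "__main__":
    # Probe both loopback addresses that appear in the HBase master log.
    for host in ("127.0.0.1", "::1"):
        if port_is_open(host, 2181):
            state = "accepting connections"
        else:
            state = "refused or unreachable"
        print(f"{host}:2181 is {state}")
```

If both addresses come back as refused while HQuorumPeer is still listed by jps, that would suggest the process is running but not listening on the port the master expects.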



