JoneZhang created HBASE-14074:
---------------------------------
Summary: HBase cluster crashed on-the-hour
Key: HBASE-14074
URL: https://issues.apache.org/jira/browse/HBASE-14074
Project: HBase
Issue Type: Bug
Components: Admin
Affects Versions: 0.96.2
Environment: Hadoop 2.5.1
HBase 0.96.2
Reporter: JoneZhang
I found that the HBase cluster crashes on the hour.
The HBase master log is as follows:
"2015-07-14 14:41:49,832 DEBUG [master:10.240.131.18:60000.oldLogCleaner]
master.ReplicationLogCleaner: Didn't find this log in ZK, deleting:
10-241-125-46%2C60020%2C1436841063572.1436851865226
2015-07-14 14:45:49,822 DEBUG [master:10.240.131.18:60000.oldLogCleaner]
master.ReplicationLogCleaner: Didn't find this log in ZK, deleting:
10-241-85-137%2C60020%2C1436841341086.1436852143141
2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: HBase 0.96.2-hadoop2
2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: Subversion
https://svn.apache.org/repos/asf/hbase/tags/0.96.2RC2 -r 1581096
2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: Compiled by stack on Mon
Mar 24 16:03:18 PDT 2014
2015-07-14 15:00:03,729 INFO [main] zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2015-07-14 15:00:03,730 INFO [main] zookeeper.ZooKeeper: Client
environment:host.name=10-240-131-18
2015-07-14 15:00:03,730 INFO [main] zookeeper.ZooKeeper: Client
environment:java.version=1.7.0_72
...
2015-07-14 15:00:03,749 INFO [main] zookeeper.RecoverableZooKeeper: Process
identifier=clean znode for master connecting to ZooKeeper
ensemble=10.240.131.17:2200,10.240.131.16:2200,10.240.131.15:2200,10.240.131.14:2200,10.240.131.18:2200
2015-07-14 15:00:03,751 INFO [main-SendThread(10-240-131-18:2200)]
zookeeper.ClientCnxn: Opening socket connection to server
10-240-131-18/10.240.131.18:2200. Will not attempt to authenticate using SASL
(unknown error)
2015-07-14 15:00:03,757 INFO [main-SendThread(10-240-131-18:2200)]
zookeeper.ClientCnxn: Socket connection established to
10-240-131-18/10.240.131.18:2200, initiating session
2015-07-14 15:00:03,764 INFO [main-SendThread(10-240-131-18:2200)]
zookeeper.ClientCnxn: Session establishment complete on server
10-240-131-18/10.240.131.18:2200, sessionid = 0x34e8a64b453024a, negotiated
timeout = 40000
2015-07-14 15:00:04,835 INFO [main] zookeeper.ZooKeeper: Session:
0x34e8a64b453024a closed
2015-07-14 15:00:04,835 INFO [main-EventThread] zookeeper.ClientCnxn:
EventThread shut down"
Every hour, after printing "Didn't find this log in ZK...",
the master dies.
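
To check the on-the-hour pattern across a longer stretch of the master log, a small script along these lines could help. This is only a sketch: it assumes that a "util.VersionInfo: HBase" line marks a fresh JVM start in the master log file, and the names startup_times and seconds_past_hour are made up for illustration. The sample lines are copied from the log above.

```python
import re
from datetime import datetime

# Timestamp at the start of an HBase log line, e.g. "2015-07-14 15:00:03,481".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s")

# Assumption: a line carrying this marker denotes a fresh JVM start.
STARTUP_MARKER = "util.VersionInfo: HBase "


def startup_times(lines):
    """Return the timestamp of every line that looks like a JVM start."""
    times = []
    for line in lines:
        match = TS.match(line)
        if match and STARTUP_MARKER in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def seconds_past_hour(ts):
    """Seconds elapsed since the top of the hour for a timestamp."""
    return ts.minute * 60 + ts.second


sample = [
    "2015-07-14 14:45:49,822 DEBUG [master:10.240.131.18:60000.oldLogCleaner]",
    "2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: HBase 0.96.2-hadoop2",
    "2015-07-14 15:00:03,481 INFO [main] util.VersionInfo: Subversion",
]

for ts in startup_times(sample):
    print(ts.isoformat(), "seconds past the hour:", seconds_past_hour(ts))
```

If most starts fall within a few seconds after the top of the hour, the restarts are likely driven by something scheduled rather than by load.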
The ZooKeeper log is as follows:
"2015-07-14 15:00:03,756 [myid:3] - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxnFactory@197] - Accepted
socket connection from /10.240.131.18:52733
2015-07-14 15:00:03,761 [myid:3] - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:ZooKeeperServer@868] - Client
attempting to establish new session at /10.240.131.18:52733
2015-07-14 15:00:03,762 [myid:3] - INFO
[CommitProcessor:3:ZooKeeperServer@617] - Established session 0x34e8a64b453024a
with negotiated timeout 40000 for client /10.240.131.18:52733
2015-07-14 15:00:04,836 [myid:3] - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2200:NIOServerCnxn@1007] - Closed socket
connection for client /10.240.131.18:52733 which had sessionid
0x34e8a64b453024a"