We had a NameNode go down due to a timeout writing to the HDFS HA quorum journal (QJM):
2015-12-09 04:10:42,723 WARN
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19016
ms (timeout=20000 ms) for a response for sendEdits
2015-12-09 04:10:43,708 FATAL
org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for
required journal (JournalAndStream(mgr=QJM to [10.42.28.221:8485,
10.42.28.222:8485, 10.42.28.223:8485], stream=QuorumOutputStream starting
at txid 8781293))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to
respond.
at
org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
at
org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at
org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:490)
at
org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:350)
at
org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:55)
at
org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:486)
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:581)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1695)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1669)
at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:409)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:205)
at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44068)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
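For reference, the 20000 ms in that message appears to match the default QJM write timeout, dfs.qjournal.write-txns.timeout.ms. I assume it could be raised in hdfs-site.xml along the lines below (untested on our side, and it only buys headroom; it doesn't explain why the JournalNodes were slow to respond in the first place):

  <property>
    <name>dfs.qjournal.write-txns.timeout.ms</name>
    <!-- default is 20000 ms; a larger value gives slow JournalNodes more time
         before the NameNode gives up on the required journal and terminates -->
    <value>60000</value>
  </property>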
While this is disturbing in its own right, I'm further annoyed that HBase shut down two region servers. On top of that, we had to run hbck -fixAssignments to repair HBase, and I'm not sure whether the data from the shut-down regions remained available, or whether the HBase service itself was available afterwards.
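For the record, the repair we ran from the master host was just the assignment fix (output elided):

  hbase hbck -fixAssignments

Here is what the master logged as the two region servers aborted: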
2015-12-09 04:10:44,320 ERROR org.apache.hadoop.hbase.master.HMaster:
Region server ^@^@hbase008r09.comp.prod.local,60020,1436412712133 reported
a fatal error:
ABORTING region server hbase008r09.comp.prod.local,60020,1436412712133: IOE
in log roller
Cause:
java.io.IOException: cannot get log writer
at
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:716)
at
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:663)
at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:595)
at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: java.io.IOException: Failed on local
exception: java.io.IOException: Response is null.; Host Details : local
host is: "hbase008r09.comp.prod.local/10.42.28.192"; destination host is:
"hbasenn001.comp.prod.local":8020;
at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:106)
at
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:713)
... 4 more
Caused by: java.io.IOException: Failed on local exception:
java.io.IOException: Response is null.; Host Details : local host is:
"hbase008r09.comp.prod.local/10.42.28.192"; destination host is:
"hbasenn001.comp.prod.local":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
at org.apache.hadoop.ipc.Client.call(Client.java:1228)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy14.create(Unknown Source)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:192)
at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy15.create(Unknown Source)
at
org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1298)
at
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1317)
at org.apache.hadoop.hdfs.DFSClient.primitiveCreate(DFSClient.java:1264)
at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:97)
at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:53)
at
org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:554)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:663)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:660)
at
org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)
at org.apache.hadoop.fs.FileContext.create(FileContext.java:660)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:502)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:469)
at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:87)
... 5 more
Caused by: java.io.IOException: Response is null.
at
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:940)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)
2015-12-09 04:10:44,387 ERROR org.apache.hadoop.hbase.master.HMaster:
Region server ^@^@hbase007r08.comp.prod.local,60020,1436412674179 reported
a fatal error:
ABORTING region server hbase007r08.comp.prod.local,60020,1436412674179: IOE
in log roller
Cause:
java.io.IOException: cannot get log writer
at
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:716)
at
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:663)
at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:595)
at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: java.io.IOException: Failed on local
exception: java.io.IOException: Response is null.; Host Details : local
host is: "hbase007r08.comp.prod.local/10.42.28.191"; destination host is:
"hbasenn001.comp.prod.local":8020;
at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:106)
at
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:713)
... 4 more
Caused by: java.io.IOException: Failed on local exception:
java.io.IOException: Response is null.; Host Details : local host is:
"hbase007r08.comp.prod.local/10.42.28.191"; destination host is:
"hbasenn001.comp.prod.local":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
at org.apache.hadoop.ipc.Client.call(Client.java:1228)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy14.create(Unknown Source)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:192)
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy15.create(Unknown Source)
at
org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1298)
at
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1317)
at org.apache.hadoop.hdfs.DFSClient.primitiveCreate(DFSClient.java:1264)
at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:97)
at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:53)
at
org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:554)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:663)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:660)
at
org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)
at org.apache.hadoop.fs.FileContext.create(FileContext.java:660)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:502)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:469)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:87)
... 5 more
Caused by: java.io.IOException: Response is null.
at
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:940)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)
2015-12-09 04:11:01,444 INFO org.apache.zookeeper.ClientCnxn: Client
session timed out, have not heard from server in 26679ms for sessionid
0x44e6c2f20980003, closing socket connection and attempting reconnect
2015-12-09 04:11:34,636 WARN
org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking
getListing of class ClientNamenodeProtocolTranslatorPB. Trying to fail over
immediately.
2015-12-09 04:11:34,687 WARN
org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking
getListing of class ClientNamenodeProtocolTranslatorPB after 1 fail over
attempts. Trying to fail over after sleeping for 791ms.
2015-12-09 04:11:35,334 WARN org.apache.hadoop.ipc.HBaseServer:
(responseTooSlow):
{"processingtimems":50237,"call":"reportRSFatalError([B@3c97e50c, ABORTING
region server hbase008r09.comp.prod.local,60020,1436412712133: IOE in log
roller\nCause:\njava.io.IOException: cannot get log writer\n\tat
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:716)\n\tat
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:663)\n\tat
org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:595)\n\tat
org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:94)\n\tat
java.lang.Thread.run(Thread.java:722)\nCaused by: java.io.IOException:
java.io.IOException: Failed on local exception: java.io.IOException:
Response is null.; Host Details : local host is:
\"hbase008r09.comp.prod.local/10.42.28.192\"; destination host is:
\"hbasenn001.comp.prod.local\":8020; \n\tat
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:106)\n\tat
org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:713)\n\t...
4 more\nCaused by: java.io.IOException: Failed on local exception:
java.io.IOException: Response is null.; Host Details : local host is:
\"hbase008r09.comp.prod.local/10.42.28.192\"; destination host is:
\"hbasenn001.comp.prod.local\":8020; \n\tat
org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)\n\tat
org.apache.hadoop.ipc.Client.call(Client.java:1228)\n\tat
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)\n\tat
com.sun.proxy.$Proxy14.create(Unknown Source)\n\tat
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:192)\n\tat
sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)\n\tat
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
java.lang.reflect.Method.invoke(Method.java:601)\n\tat
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)\n\tat
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)\n\tat
com.sun.proxy.$Proxy15.create(Unknown Source)\n\tat
org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1298)\n\tat
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1317)\n\tat
org.apache.hadoop.hdfs.DFSClient.primitiveCreate(DFSClient.java:1264)\n\tat
org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:97)\n\tat
org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:53)\n\tat
org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:554)\n\tat
org.apache.hadoop.fs.FileContext$3.next(FileContext.java:663)\n\tat
org.apache.hadoop.fs.FileContext$3.next(FileContext.java:660)\n\tat
org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2333)\n\tat
org.apache.hadoop.fs.FileContext.create(FileContext.java:660)\n\tat
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:502)\n\tat
org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:469)\n\tat
sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)\n\tat
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
java.lang.reflect.Method.invoke(Method.java:601)\n\tat
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:87)\n\t...
5 more\nCaused by: java.io.IOException: Response is null.\n\tat
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:940)\n\tat
org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)\n), rpc
version=1, client version=29, methodsFingerPrint=-525182806","client":"
10.42.28.192:52162
","starttimems":1449659444320,"queuetimems":0,"class":"HMaster","responsesize":0,"method":"reportRSFatalError"}
2015-12-09 04:11:35,409 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server hbase004r08.comp.prod.local/10.42.28.188:2181.
Will not attempt to authenticate using SASL (Unable to locate a login
configuration)
2015-12-09 04:11:35,411 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to hbase004r08.comp.prod.local/10.42.28.188:2181,
initiating session
2015-12-09 04:11:35,413 INFO org.apache.zookeeper.ClientCnxn: Unable to
reconnect to ZooKeeper service, session 0x44e6c2f20980003 has expired,
closing socket connection
2015-12-09 04:11:35,413 FATAL org.apache.hadoop.hbase.master.HMaster:
Master server abort: loaded coprocessors are: []
2015-12-09 04:11:35,414 INFO org.apache.hadoop.hbase.master.HMaster:
Primary Master trying to recover from ZooKeeper session expiry.
2015-12-09 04:11:35,416 INFO org.apache.zookeeper.ZooKeeper: Initiating
client connection,
connectString=hbase004r08.comp.prod.local:2181,hbase003r07.comp.prod.local:2181,hbase005r09.comp.prod.local:2181
sessionTimeout=1200000 watcher=master:60000
...
and eventually:
2015-12-09 04:11:46,724 ERROR org.apache.zookeeper.ClientCnxn: Caught
unexpected throwable
2015-12-09 04:11:46,724 ERROR org.apache.zookeeper.ClientCnxn: Caught
unexpected throwable
java.lang.StackOverflowError
at java.security.AccessController.doPrivileged(Native Method)
at java.io.PrintWriter.<init>(PrintWriter.java:78)
at java.io.PrintWriter.<init>(PrintWriter.java:62)
at
org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58)
at
org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
at
org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
at
org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:576)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:623)
at
org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1286)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at
org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:627)
at
org.apache.hadoop.hbase.master.SplitLogManager.access$1600(SplitLogManager.java:96)
at
org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1106)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at
org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1286)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at
org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:627)
at
org.apache.hadoop.hbase.master.SplitLogManager.access$1600(SplitLogManager.java:96)
at
org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1106)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at
org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1286)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at
org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:627)
at
org.apache.hadoop.hbase.master.SplitLogManager.access$1600(SplitLogManager.java:96)
at
org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1106)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at
org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1286)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at
org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:627)
at
org.apache.hadoop.hbase.master.SplitLogManager.access$1600(SplitLogManager.java:96)
at
org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1106)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at
org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1286)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at
org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:627)
at
org.apache.hadoop.hbase.master.SplitLogManager.access$1600(SplitLogManager.java:96)
...
Since the NameNode failover made the other NameNode active, why did my region servers decide to shut down? The HDFS service itself seems to have stayed up. And how can I make the HBase service more resilient to NameNode failovers?
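My working guess (untested, corrections welcome) is that the region servers' embedded HDFS client gave up too quickly while the failover was still in progress. The "Trying to fail over after sleeping for 791ms" lines above should be governed by the HA client retry settings, so one thing I'm considering is loosening them in hdfs-site.xml on the HBase nodes, e.g.:

  <property>
    <!-- number of times the HA client retries after failing over; default 15 -->
    <name>dfs.client.failover.max.attempts</name>
    <value>30</value>
  </property>
  <property>
    <!-- backoff between failover retries grows from the base toward the max -->
    <name>dfs.client.failover.sleep.base.millis</name>
    <value>1000</value>
  </property>
  <property>
    <name>dfs.client.failover.sleep.max.millis</name>
    <value>30000</value>
  </property>

I'm not confident this covers the WAL roll path, though, since the create() that failed may not be retried across a failover at all.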
HBase: 0.92.1-cdh4.1.3
Hadoop: 2.0.0-cdh4.1.3