[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287467#comment-13287467 ]
Ashutosh Jindal commented on HBASE-6046:
----------------------------------------

Please check the second test case added, testLogSplittingAfterMasterRecoveryDueToZKExpiry(). If the test case is run without the patch, a StackOverflowError is thrown.

{code}
java.lang.StackOverflowError
	at java.lang.System.getProperty(System.java:647)
	at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:67)
	at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:32)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.io.PrintWriter.<init>(PrintWriter.java:78)
	at java.io.PrintWriter.<init>(PrintWriter.java:62)
	at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58)
	at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
	at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
	at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
	at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
	at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
	at org.apache.log4j.Category.callAppenders(Category.java:206)
	at org.apache.log4j.Category.forcedLog(Category.java:391)
	at org.apache.log4j.Category.log(Category.java:856)
	at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:485)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:623)
	at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
	at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
	at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
	at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274)
	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
	at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626)
	at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620)
	at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
	at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
	at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
	at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
	at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274)
	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
	at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626)
	at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620)
	at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
	at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
{code}

This happens because the listener for SplitLogManager is not registered again after the master recovers from an expired ZK session.

> Master retry on ZK session expiry causes inconsistent region assignments.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-6046
>                 URL: https://issues.apache.org/jira/browse/HBASE-6046
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: Gopinathan A
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE_6046_0.94.patch
>
>
> 1> A ZK session timeout in the HMaster leads to bulk assignment even though all the
> RSs are online.
> 2> During bulk assignment, if the master goes down again and restarts (or a
> backup comes up), all the nodes created in ZK will be reassigned
> to the new RSs. This leads to double assignment.
> We had 2800 regions; of these, 1900 regions got double assignment, taking the
> region count to 4700.
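
For readers following the stack trace in the comment: the repeating frames (processResult -> deleteNode -> ZooKeeper.delete -> queuePacket -> conLossPacket -> finishPacket -> processEvent -> processResult) suggest that the failed delete completes on the caller's own stack and that the callback immediately issues another delete. The sketch below reproduces that shape in isolation. It does not use the ZooKeeper or HBase classes; DeadSessionClient, DeleteCallback, the CONNECTION_LOSS constant, and the znode path are hypothetical stand-ins. The capped variant only shows that a bound stops the recursion and is not meant to describe what the attached patch does.

```java
final class ConnectionLossRetrySketch {
    // Hypothetical result code standing in for a connection-loss error.
    static final int CONNECTION_LOSS = -4;

    /** Hypothetical callback shape, loosely modelled on an async delete callback. */
    interface DeleteCallback {
        void processResult(int rc, String path);
    }

    /**
     * Stand-in for a client whose session is gone: every request fails at once
     * and the callback runs synchronously on the caller's thread.
     */
    static final class DeadSessionClient {
        int deleteCalls = 0;

        void delete(String path, DeleteCallback cb) {
            deleteCalls++;
            cb.processResult(CONNECTION_LOSS, path);
        }
    }

    /** Retries from inside the callback without a limit; true if the stack overflowed. */
    static boolean unboundedRetryOverflows(String path) {
        DeadSessionClient client = new DeadSessionClient();
        DeleteCallback retryForever = new DeleteCallback() {
            @Override
            public void processResult(int rc, String p) {
                if (rc == CONNECTION_LOSS) {
                    client.delete(p, this); // re-enters delete() on the same stack
                }
            }
        };
        try {
            client.delete(path, retryForever);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    /** Same retry, but capped; returns the number of delete() calls made. */
    static int boundedRetryCalls(String path, int maxRetries) {
        DeadSessionClient client = new DeadSessionClient();
        int[] retriesLeft = {maxRetries};
        DeleteCallback capped = new DeleteCallback() {
            @Override
            public void processResult(int rc, String p) {
                if (rc == CONNECTION_LOSS && retriesLeft[0] > 0) {
                    retriesLeft[0]--;
                    client.delete(p, this);
                }
            }
        };
        client.delete(path, capped);
        return client.deleteCalls;
    }

    public static void main(String[] args) {
        String path = "/hbase/splitlog/task"; // hypothetical znode path
        System.out.println("unbounded retry overflows: " + unboundedRetryOverflows(path));
        System.out.println("delete calls with 3 retries: " + boundedRetryCalls(path, 3));
    }
}
```

With a cap of 3 retries the capped variant makes one initial call plus three retries and then returns normally, while the uncapped variant only ends when the stack is exhausted.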