[ 
https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287467#comment-13287467
 ] 

Ashutosh Jindal commented on HBASE-6046:
----------------------------------------

Please check the second test case added, 
testLogSplittingAfterMasterRecoveryDueToZKExpiry(). If the test case is run 
without the patch, a StackOverflowError is thrown.
{code}
java.lang.StackOverflowError
        at java.lang.System.getProperty(System.java:647)
        at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:67)
        at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:32)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.io.PrintWriter.<init>(PrintWriter.java:78)
        at java.io.PrintWriter.<init>(PrintWriter.java:62)
        at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58)
        at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
        at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:485)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:623)
        at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
        at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
        at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
        at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274)
        at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
        at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626)
        at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620)
        at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
        at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
        at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
        at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
        at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274)
        at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
        at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626)
        at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620)
        at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
        at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
{code}

This happens because the listener for SplitLogManager is not registered after 
the master recovers from an expired ZK session.
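
To make the loop in the trace easier to follow, here is a minimal, self-contained sketch of the mechanism as I read it from the stack frames: the delete call fails at once with connection loss, the callback runs inline on the same thread, and the callback calls delete again. This is not HBase or ZooKeeper code. ConnLossRecursionSketch, ExpiredSessionClient, DeleteCallback, RetryingCallback and the path "/example/task" are all hypothetical names made up for illustration, and the retry limit is only there for contrast; it is not what the patch does.

```java
public class ConnLossRecursionSketch {

    /** Hypothetical stand-in for an async delete callback. */
    interface DeleteCallback {
        void processResult(boolean connectionLost, String path);
    }

    /** Hypothetical client whose session is already gone. */
    static final class ExpiredSessionClient {
        // Every request fails immediately, and the callback is invoked
        // inline on the caller's thread (assumption taken from the trace).
        void delete(String path, DeleteCallback cb) {
            cb.processResult(true, path);
        }
    }

    /** Retries the delete whenever the result is "connection lost". */
    static final class RetryingCallback implements DeleteCallback {
        private final ExpiredSessionClient client;
        private final int maxRetries; // negative means "retry forever"
        int attempts;

        RetryingCallback(ExpiredSessionClient client, int maxRetries) {
            this.client = client;
            this.maxRetries = maxRetries;
        }

        @Override
        public void processResult(boolean connectionLost, String path) {
            attempts++;
            boolean mayRetry = maxRetries < 0 || attempts <= maxRetries;
            if (connectionLost && mayRetry) {
                // Re-entering delete() from inside the callback adds new
                // frames on top of the current ones each time.
                client.delete(path, this);
            }
        }
    }

    /** Unbounded retry: recurses until the thread's stack is exhausted. */
    static int retryWithoutLimit(ExpiredSessionClient client, String path) {
        RetryingCallback cb = new RetryingCallback(client, -1);
        try {
            client.delete(path, cb);
        } catch (StackOverflowError e) {
            // Expected here: the recursion never terminates on its own.
        }
        return cb.attempts;
    }

    /** Bounded retry: one initial attempt plus at most maxRetries retries. */
    static int retryWithLimit(ExpiredSessionClient client, String path, int maxRetries) {
        RetryingCallback cb = new RetryingCallback(client, maxRetries);
        client.delete(path, cb);
        return cb.attempts;
    }

    public static void main(String[] args) {
        ExpiredSessionClient client = new ExpiredSessionClient();
        System.out.println("attempts before overflow: "
                + retryWithoutLimit(client, "/example/task"));
        System.out.println("attempts with limit of 3 retries: "
                + retryWithLimit(client, "/example/task", 3));
    }
}
```

The number of attempts before the overflow depends on the JVM stack size, so it will differ between runs and machines.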
                
> Master retry on ZK session expiry causes inconsistent region assignments.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-6046
>                 URL: https://issues.apache.org/jira/browse/HBASE-6046
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: Gopinathan A
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE_6046_0.94.patch
>
>
> 1> A ZK session timeout in the HMaster leads to bulk assignment even though all the 
> RSs are online.
> 2> While bulk assignment is in progress, if the master goes down again and restarts (or a 
> backup comes up), the master will now try to reassign all the nodes created in ZK 
> to the new RSs. This leads to double assignment.
> We had 2800 regions; of these, 1900 regions got double assignment, taking the 
> region count to 4700.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
