[ https://issues.apache.org/jira/browse/HBASE-26420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
anonymous updated HBASE-26420:
------------------------------
    Attachment: hbase-root-master-C3HM1.log

> Unexpected crash of meta RegionServer causes the cluster out of service
> -----------------------------------------------------------------------
>
>                 Key: HBASE-26420
>                 URL: https://issues.apache.org/jira/browse/HBASE-26420
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.7.1
>            Reporter: anonymous
>            Priority: Major
>         Attachments: hbase-root-master-C3HM1.log
>
>
> We have a cluster of two HMasters, C3HM1 and C3HM2, and three RegionServers,
> C3RS1, C3RS2, and C3RS3.
> We use an external ZooKeeper cluster, which is pseudo-distributed:
> {code:java}
> <property>
>   <name>hbase.zookeeper.quorum</name>
>   <value>C3hb-zk</value>
> </property>
> <property>
>   <name>hbase.zookeeper.property.clientPort</name>
>   <value>11181</value>
> </property>
> {code}
> For other HBase options, we use the default settings. The buggy scenario is
> as follows:
> 1. Start the cluster; C3HM1 becomes the active master.
> 2. C3RS2 crashes right before creating the znode "/hbase/meta-region-server"
> on ZooKeeper:
> {code:java}
> [org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:665),
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:644),
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1182),
> org.apache.hadoop.hbase.zookeeper.MetaTableLocator.setMetaLocation(MetaTableLocator.java:464),
> org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:2182),
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:329)]
> {code}
> 3. The meta server is still not online after 10 minutes. The data of the znode
> "/hbase/master" is C3HM1.
> If C3RS2 instead crashes after creating the "/hbase/meta-region-server" znode,
> everything works fine. The bug does not appear on HBase 2.4.5.
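
To check the znode state described in step 3, one would first need the ZooKeeper address implied by the two properties quoted above. The following is a minimal sketch of how that address could be derived. The class `MetaZnodeCheck` and its `connectString` helper are hypothetical names introduced for illustration and are not HBase's own code; only the quorum value, the client port, and the two znode paths come from the report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: derives the ZooKeeper address
// from the settings quoted in the report.
class MetaZnodeCheck {
    // Znode paths quoted in the report.
    static final String META_ZNODE = "/hbase/meta-region-server";
    static final String MASTER_ZNODE = "/hbase/master";

    // Joins each quorum host with the client port, unless the host
    // already carries an explicit port of its own.
    static String connectString(String quorum, int clientPort) {
        List<String> servers = new ArrayList<>();
        for (String host : quorum.split(",")) {
            String h = host.trim();
            if (h.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            servers.add(h.contains(":") ? h : h + ":" + clientPort);
        }
        return String.join(",", servers);
    }

    public static void main(String[] args) {
        // Values of hbase.zookeeper.quorum and
        // hbase.zookeeper.property.clientPort from the report.
        String zk = connectString("C3hb-zk", 11181);
        System.out.println("ZooKeeper address: " + zk);
        System.out.println("Znodes to inspect: " + META_ZNODE + ", " + MASTER_ZNODE);
    }
}
```

With that address, any ZooKeeper client could be used to confirm whether "/hbase/meta-region-server" exists and what "/hbase/master" contains after the crash.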
-- This message was sent by Atlassian Jira (v8.3.4#803005)