[ https://issues.apache.org/jira/browse/HBASE-26420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
anonymous updated HBASE-26420:
------------------------------
    Description:
We have a cluster of two HMasters, C3HM1 and C3HM2, and three RegionServers, C3RS1, C3RS2, and C3RS3.

We use an external ZooKeeper cluster, which is pseudo-distributed:
{code:xml}
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>C3hb-zk</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>11181</value>
</property>
{code}
All other HBase options use their default settings. The buggy scenario is as follows:
1. Start the cluster;
2. C3RS2 crashes right before creating the znode "/hbase/meta-region-server" on ZooKeeper;
{code:java}
[org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:665),
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:644),
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1182),
 org.apache.hadoop.hbase.zookeeper.MetaTableLocator.setMetaLocation(MetaTableLocator.java:464),
 org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:2182),
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:329)]
{code}
3. The meta server is still not online after 10 minutes.

If C3RS2 instead crashes after creating the "/hbase/meta-region-server" znode, everything works fine. The bug does not reproduce on HBase-2.4.5.

  was:
We have a cluster of two HMasters, C3HM1 and C3HM2, and three RegionServers, C3RS1, C3RS2, and C3RS3.

We use an external ZooKeeper cluster, which is pseudo-distributed:
{code:xml}
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>C3hb-zk</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>11181</value>
</property>
{code}
All other HBase options use their default settings. The buggy scenario is as follows:
1. Start the cluster;
2. C3RS2 crashes right before creating the znode "/hbase/meta-region-server" on ZooKeeper;
{code:java}
[org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:665),
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:644),
 org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1182),
 org.apache.hadoop.hbase.zookeeper.MetaTableLocator.setMetaLocation(MetaTableLocator.java:464),
 org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:2182),
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:329)]
{code}
3. The meta server is still not online after 10 minutes.

If C3RS2 instead crashes after creating the "/hbase/meta-region-server" znode, everything works fine.


> Unexpected crash of meta RegionServer causes the cluster out of service
> -----------------------------------------------------------------------
>
>                 Key: HBASE-26420
>                 URL: https://issues.apache.org/jira/browse/HBASE-26420
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.7.1
>            Reporter: anonymous
>            Priority: Major
>
> We have a cluster of two HMasters, C3HM1 and C3HM2, and three RegionServers,
> C3RS1, C3RS2, and C3RS3.
> We use an external ZooKeeper cluster, which is pseudo-distributed:
> {code:xml}
> <property>
>   <name>hbase.zookeeper.quorum</name>
>   <value>C3hb-zk</value>
> </property>
> <property>
>   <name>hbase.zookeeper.property.clientPort</name>
>   <value>11181</value>
> </property>
> {code}
> All other HBase options use their default settings. The buggy scenario is
> as follows:
> 1. Start the cluster;
> 2. C3RS2 crashes right before creating the znode "/hbase/meta-region-server"
> on ZooKeeper;
> {code:java}
> [org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:665),
>  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:644),
>  org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:1182),
>  org.apache.hadoop.hbase.zookeeper.MetaTableLocator.setMetaLocation(MetaTableLocator.java:464),
>  org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:2182),
>  org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:329)]
> {code}
> 3. The meta server is still not online after 10 minutes.
> If C3RS2 instead crashes after creating the "/hbase/meta-region-server" znode,
> everything works fine. The bug does not reproduce on HBase-2.4.5.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
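For anyone reproducing the report, it may help to confirm which ZooKeeper endpoint the two quoted properties resolve to before inspecting "/hbase/meta-region-server". The following is a minimal sketch, not HBase code: the class `ZkQuorumSketch` and its methods are hypothetical names, the `<configuration>` wrapper around the fragment is assumed (as in a usual hbase-site.xml), and the rule of appending the client port to each quorum host that lacks an explicit port is an approximation of how HBase is understood to build its connect string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: derives a ZooKeeper connect string from an
// hbase-site.xml style document. Illustration only, not part of HBase.
final class ZkQuorumSketch {

    // Collect every <property><name>/<value> pair into a map.
    static Map<String, String> parseProperties(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new HashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    // Append the client port to each quorum host that has no explicit port.
    // Assumption: this approximates how the connect string is assembled.
    static String connectString(String xml) throws Exception {
        Map<String, String> props = parseProperties(xml);
        String quorum = props.get("hbase.zookeeper.quorum");
        if (quorum == null) {
            throw new IllegalArgumentException("hbase.zookeeper.quorum is not set");
        }
        // 2181 is the default ZooKeeper client port.
        String port = props.getOrDefault("hbase.zookeeper.property.clientPort", "2181");
        StringBuilder sb = new StringBuilder();
        for (String host : quorum.split(",")) {
            String h = host.trim();
            if (h.isEmpty()) {
                continue;
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(h.contains(":") ? h : h + ":" + port);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // The two properties from the report, wrapped in an assumed root element.
        String xml = "<configuration>"
                + "<property><name>hbase.zookeeper.quorum</name><value>C3hb-zk</value></property>"
                + "<property><name>hbase.zookeeper.property.clientPort</name><value>11181</value></property>"
                + "</configuration>";
        System.out.println(connectString(xml)); // prints C3hb-zk:11181
    }
}
```

With the resulting endpoint, a ZooKeeper client can be pointed at the ensemble to check whether the "/hbase/meta-region-server" znode exists after C3RS2 crashes.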