[ https://issues.apache.org/jira/browse/HBASE-13814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cuijianwei updated HBASE-13814:
-------------------------------
    Attachment: HBASE-13814-0.94-v2.patch

> AssignmentManager does not write the correct server name into Zookeeper when
> unassign region
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-13814
>                 URL: https://issues.apache.org/jira/browse/HBASE-13814
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 0.94.27
>            Reporter: cuijianwei
>            Priority: Minor
>         Attachments: HBASE-13814-0.94-v1.patch, HBASE-13814-0.94-v2.patch
>
>
> When a region is moved, it is first unassigned from its current region
> server by AssignmentManager#unassign(). AssignmentManager writes the region
> info and the server name into ZooKeeper with the following code:
> {code}
> versionOfClosingNode = ZKAssign.createNodeClosing(
>     master.getZooKeeper(), region, master.getServerName());
> {code}
> It seems that AssignmentManager mistakenly uses the master's name as the
> server name. Suppose the ROOT region is being moved and the region server
> holding it crashes at that moment. The master starts a
> MetaServerShutdownHandler only if the crashed server is judged to be holding
> a meta region. That judgment is made by AssignmentManager#isCarryingRegion,
> which first checks the server name in ZooKeeper:
> {code}
> ServerName addressFromZK = (data != null && data.getOrigin() != null) ?
>     data.getOrigin() : null;
> if (addressFromZK != null) {
>   // if we get something from ZK, we will use the data
>   boolean matchZK = (addressFromZK != null &&
>       addressFromZK.equals(serverName));
> {code}
> Because the name stored in ZooKeeper is wrong, the crashed server is not
> judged to be holding the ROOT region, so the master starts a
> ServerShutdownHandler instead. Unlike MetaServerShutdownHandler,
> ServerShutdownHandler does not assign the ROOT region first, so the ROOT
> region is never reassigned.
> In our test environment, we encountered this problem when moving the ROOT
> region and stopping the region server concurrently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
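
The failure mode described in this issue can be reproduced in isolation with a small model. The sketch below is not HBase code: `ClosingNodeOriginSketch`, its map-backed stand-in for ZooKeeper, and the server name strings are all hypothetical, and only the "trust the origin found in ZooKeeper" comparison mirrors the quoted snippet. It also assumes that a plausible fix is to record the hosting region server's name. That is an assumption for illustration and is not taken from the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the CLOSING-node origin check. None of these types
// are HBase classes; the names are hypothetical and for illustration only.
final class ClosingNodeOriginSketch {
    // Stand-in for ZooKeeper: region name -> origin server recorded in the
    // region's CLOSING node.
    private final Map<String, String> closingNodeOrigin = new HashMap<>();

    // Records which server the CLOSING node claims as its origin.
    void createNodeClosing(String region, String originServer) {
        closingNodeOrigin.put(region, originServer);
    }

    // Mirrors the quoted check: when ZooKeeper holds an origin for the
    // region, the answer depends only on whether it equals the crashed server.
    boolean isCarryingRegion(String crashedServer, String region) {
        String addressFromZK = closingNodeOrigin.get(region);
        if (addressFromZK != null) {
            return addressFromZK.equals(crashedServer);
        }
        // Whatever the real method checks after ZooKeeper is omitted here.
        return false;
    }

    // The master picks the handler based on the result of the check above.
    static String chooseHandler(boolean carryingRoot) {
        return carryingRoot ? "MetaServerShutdownHandler" : "ServerShutdownHandler";
    }

    public static void main(String[] args) {
        String master = "master.example.org,60000,1";      // hypothetical
        String regionServer = "rs1.example.org,60020,1";   // hypothetical
        String root = "-ROOT-";

        // Behaviour described in the report: the master's own name is recorded.
        ClosingNodeOriginSketch reported = new ClosingNodeOriginSketch();
        reported.createNodeClosing(root, master);
        System.out.println("origin = master:        "
            + chooseHandler(reported.isCarryingRegion(regionServer, root)));

        // Assumed fix: the hosting region server's name is recorded instead.
        ClosingNodeOriginSketch assumedFix = new ClosingNodeOriginSketch();
        assumedFix.createNodeClosing(root, regionServer);
        System.out.println("origin = region server: "
            + chooseHandler(assumedFix.isCarryingRegion(regionServer, root)));
    }
}
```

With the master's name recorded, the model selects ServerShutdownHandler for the crashed region server. With the region server's own name recorded, it selects MetaServerShutdownHandler, which is the handler the report says is needed for the ROOT region to be assigned first.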