[ https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152141#comment-13152141 ]
Ted Yu commented on HBASE-4729:
-------------------------------

In HBASE-4213, the following change is made to CompactSplitThread.java:
{code}
   public synchronized boolean requestSplit(final HRegion r) {
+    waitForInflightSchemaChange(r.getRegionInfo().getTableNameAsString());
...
+  /**
+   * Wait for mid-flight schema alter requests. (if any). We don't want to execute a split
+   * when a schema alter is in progress as we end up in an inconsistent state.
{code}
I think we should define the scope of what problem this JIRA solves.

> Race between online altering and splitting kills the master
> -----------------------------------------------------------
>
>                 Key: HBASE-4729
>                 URL: https://issues.apache.org/jira/browse/HBASE-4729
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0, 0.94.0
>
>         Attachments: 4729.txt
>
>
> I was running an online alter while regions were splitting, and suddenly the
> master died and left my table half-altered (haven't restarted the master yet).
> What killed the master:
> {quote}
> 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception creating node CLOSING
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
>         at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {quote}
> A znode was created because the region server was splitting the region 4
> seconds before:
> {quote}
> 2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
> 2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction: regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101 in SPLITTING state
> 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLITTING
> ...
> 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710 Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING to RS_ZK_REGION_SPLIT
> 2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101
> {quote}
> Now that the master is dead the region server is spewing those last two lines
> like mad.
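
For anyone following along, the race in the description can be reproduced in miniature with plain Java. This is only a sketch and not HBase code: the class ZnodeRaceSketch, the in-memory Unassigned map standing in for /hbase/unassigned, the local NodeExistsException, and the tryUnassign guard are all hypothetical names invented for illustration. The guard shows one possible way to tolerate the collision; it is an assumption, not the fix attached to this issue.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy model of the alter/split race. NOT HBase code; every name is hypothetical. */
class ZnodeRaceSketch {

    /** Local stand-in for ZooKeeper's NodeExistsException. */
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("NodeExists for " + path);
        }
    }

    /** In-memory stand-in for the /hbase/unassigned znode namespace. */
    static class Unassigned {
        private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

        /** Create-only semantics: fails if a node for the region already exists. */
        void create(String region, String state) throws NodeExistsException {
            if (nodes.putIfAbsent(region, state) != null) {
                throw new NodeExistsException("/hbase/unassigned/" + region);
            }
        }

        /** Returns the recorded state, or null if the region has no node. */
        String state(String region) {
            return nodes.get(region);
        }
    }

    /** Unguarded unassign: the exception propagates to the caller. */
    static void unassign(Unassigned zk, String region) throws NodeExistsException {
        zk.create(region, "CLOSING");
    }

    /** Assumed guard: report failure instead of throwing when the node is taken. */
    static boolean tryUnassign(Unassigned zk, String region) {
        try {
            zk.create(region, "CLOSING");
            return true;
        } catch (NodeExistsException e) {
            return false; // another actor (e.g. a split) owns the node; caller may retry
        }
    }

    public static void main(String[] args) {
        Unassigned zk = new Unassigned();
        String region = "f7e1783e65ea8d621a4bc96ad310f101";
        try {
            zk.create(region, "SPLITTING"); // region server starts the split
            unassign(zk, region);           // master reopens the region for the alter
        } catch (NodeExistsException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
        System.out.println("guarded: " + tryUnassign(zk, region));
    }
}
```

The sketch only covers the collision on node creation. Whether the right scope for this JIRA is to make the master tolerate the existing node, or to keep splits and alters from overlapping in the first place (as the HBASE-4213 change attempts on the region server side), is the open question raised in the comment above.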