[ https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124242#comment-13124242 ]
Ted Yu commented on HBASE-4562:
-------------------------------

Apache Jenkins build for 0.90 has 678 tests. The attached test report has fewer test cases.
Also, a patch for 0.92/TRUNK is needed, since the fix would go to those branches as well.

> When split doing offlineParentInMeta occurs error,it'll cause data loss
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4562
>                 URL: https://issues.apache.org/jira/browse/HBASE-4562
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.4
>            Reporter: bluedavy
>            Priority: Blocker
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4562-test.report.txt, HBASE-4562.patch
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to mock a timeout error.
> {code:title=SplitTransaction.java|borderStyle=solid}
> if (!testing) {
>   MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>     this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>   throw new IOException("some unexpected error in split");
> }
> {code}
> 2. Update the regionserver code and restart it.
> 3. Create a table and put some data into it.
> 4. Split the table.
> 5. Kill the regionserver hosting the table.
> 6. Wait some time after the master's ServerShutdownHandler.process executes, then
> scan the table; you'll find that the data written earlier is lost.
> We can fix the bug with the code below:
> {code:title=SplitTransaction.java|borderStyle=solid}
> this.journal.add(JournalEntry.PONR);
> if (!testing) {
>   MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>     this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
>   throw new IOException("some unexpected error in split");
> }
> {code}
> {code:title=CompactSplitThread.java|borderStyle=solid}
> if (st.rollback(this.server, this.server)) {
>   LOG.info("Successful rollback of failed split of " +
>     parent.getRegionNameAsString());
> }
> else {
>   this.server.abort("Abort; we got an error after point-of-no-return");
> }
> {code}
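
The idea behind the proposed fix can be sketched outside HBase as follows. This is a minimal, standalone illustration and not the actual HBase code: the class `SplitJournalSketch`, the `Step` enum and the `failBeforeMetaEdit` flag are hypothetical names introduced here, and only the `PONR` entry and the two log messages are taken from the snippets above. The assumption is that a split records each completed step in a journal, that a point-of-no-return entry is journaled just before the meta edit, and that `rollback()` reports `false` once that entry is present so the caller can abort instead of treating the rollback as successful.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class SplitJournalSketch {
    /** Hypothetical journal steps; only PONR is a name taken from the patch. */
    enum Step { CLOSED_PARENT, CREATED_DAUGHTERS, PONR }

    private final List<Step> journal = new ArrayList<>();
    private final boolean failBeforeMetaEdit;

    SplitJournalSketch(boolean failBeforeMetaEdit) {
        this.failBeforeMetaEdit = failBeforeMetaEdit;
    }

    /** Simulates a split that always fails, either before or after the PONR entry. */
    void execute() throws IOException {
        journal.add(Step.CLOSED_PARENT);
        if (failBeforeMetaEdit) {
            throw new IOException("error before the meta edit");
        }
        journal.add(Step.CREATED_DAUGHTERS);
        // Journal the point of no return BEFORE the meta edit is attempted,
        // so a failure during or after the edit is never rolled back blindly.
        journal.add(Step.PONR);
        // Stand-in for the meta edit timing out after it may have been applied.
        throw new IOException("some unexpected error in split");
    }

    /** Returns true if every journaled step was undone, false if PONR was reached. */
    boolean rollback() {
        if (journal.contains(Step.PONR)) {
            return false; // too late to undo safely; the caller should abort
        }
        // Undo the journaled steps in reverse order (the undo itself is omitted here).
        for (int i = journal.size() - 1; i >= 0; i--) {
            journal.remove(i);
        }
        return true;
    }

    public static void main(String[] args) {
        for (boolean early : new boolean[] {true, false}) {
            SplitJournalSketch st = new SplitJournalSketch(early);
            try {
                st.execute();
            } catch (IOException e) {
                if (st.rollback()) {
                    System.out.println("Successful rollback of failed split");
                } else {
                    System.out.println("Abort; we got an error after point-of-no-return");
                }
            }
        }
    }
}
```

In this sketch a failure before the PONR entry is rolled back normally, while a failure after it makes `rollback()` return `false`, which corresponds to the `this.server.abort(...)` branch in the CompactSplitThread.java snippet.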