[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248031#comment-13248031 ]
Hadoop QA commented on HBASE-5689:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12521635/HBASE-5689v3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1422//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1422//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1422//console

This message is automatically generated.

> Skipping RecoveredEdits may cause data loss
> -------------------------------------------
>
>                 Key: HBASE-5689
>                 URL: https://issues.apache.org/jira/browse/HBASE-5689
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.94.0
>
>         Attachments: 5689-testcase.patch, 5689-v4.txt, HBASE-5689.patch,
> HBASE-5689.patch, HBASE-5689v2.patch, HBASE-5689v3.patch
>
>
> Let's see the following scenario:
> 1. Region is on the server A
> 2. put KV(r1->v1) to the region
> 3. move region from server A to server B
> 4. put KV(r2->v2) to the region
> 5. move region from server B to server A
> 6. put KV(r3->v3) to the region
> 7. kill -9 server B and start it
> 8. kill -9 server A and start it
> 9. scan the region, we could only get two KV(r1->v1,r2->v2), the third
> KV(r3->v3) is lost.
>
> Let's analyse the scenario above from the code:
> 1. The edit logs of KV(r1->v1) and KV(r3->v3) are both recorded in the same
> hlog file on server A.
> 2. When we split server B's hlog file in the process of ServerShutdownHandler,
> we create one RecoveredEdits file f1 for the region.
> 3. When we split server A's hlog file in the process of ServerShutdownHandler,
> we create another RecoveredEdits file f2 for the region.
> 4. However, RecoveredEdits file f2 will be skipped when initializing the region.
> HRegion#replayRecoveredEditsIfAny
> {code}
> for (Path edits: files) {
>   if (edits == null || !this.fs.exists(edits)) {
>     LOG.warn("Null or non-existent edits file: " + edits);
>     continue;
>   }
>   if (isZeroLengthThenDelete(this.fs, edits)) continue;
>   if (checkSafeToSkip) {
>     Path higher = files.higher(edits);
>     long maxSeqId = Long.MAX_VALUE;
>     if (higher != null) {
>       // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
>       String fileName = higher.getName();
>       maxSeqId = Math.abs(Long.parseLong(fileName));
>     }
>     if (maxSeqId <= minSeqId) {
>       String msg = "Maximum possible sequenceid for this log is " + maxSeqId
>           + ", skipped the whole file, path=" + edits;
>       LOG.debug(msg);
>       continue;
>     } else {
>       checkSafeToSkip = false;
>     }
>   }
> {code}
>
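
The skip decision in the excerpt above can be modelled without HBase. The sketch below is only an illustration under stated assumptions, not the HBase implementation. It assumes that each RecoveredEdits file is named after the sequence id of its first edit, that r1, r2 and r3 received the hypothetical sequence ids 1, 2 and 3, and that minSeqId is 2 because r1 and r2 were already flushed when the region was moved. The class name SkipRecoveredEditsSketch and the method filesReplayed are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class SkipRecoveredEditsSketch {

    /**
     * Models only the skip check from the excerpt: the name of the next-higher
     * file is treated as an upper bound on the sequence ids in the current file.
     * Files are represented by their numeric names.
     */
    static List<Long> filesReplayed(TreeSet<Long> fileNames, long minSeqId) {
        List<Long> replayed = new ArrayList<>();
        boolean checkSafeToSkip = true;
        for (Long edits : fileNames) {
            if (checkSafeToSkip) {
                Long higher = fileNames.higher(edits);
                long maxSeqId = (higher != null) ? higher : Long.MAX_VALUE;
                if (maxSeqId <= minSeqId) {
                    continue; // the whole file is skipped
                } else {
                    checkSafeToSkip = false;
                }
            }
            replayed.add(edits);
        }
        return replayed;
    }

    public static void main(String[] args) {
        // Assumed sequence ids (hypothetical): r1 -> 1, r2 -> 2, r3 -> 3.
        // f1 comes from server B's log, holds r2, and is named 2.
        // f2 comes from server A's log, holds r1 and r3, and is named 1.
        TreeSet<Long> files = new TreeSet<>();
        files.add(2L); // f1
        files.add(1L); // f2
        long minSeqId = 2L; // assumption: r1 and r2 are already in store files

        // Prints [2]: f2 (named 1) is skipped although it contains r3 (seq id 3).
        System.out.println(filesReplayed(files, minSeqId));
    }
}
```

Under these assumptions f2 sorts before f1, so the check takes f1's name (2) as the upper bound for f2 and skips it, even though f2 also contains the later edit r3. The bound is only valid when consecutive files come from the same log, which is not the case once the region has moved between servers.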