[ https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16547381#comment-16547381 ]
stack commented on HBASE-20878:
-------------------------------

Patch LGTM. Good one. Nice comments in the code on why this obtuse check is needed. Having to use WALSplitter.getSplitEditFilesSorted is ugly, but I'm impressed you found this method... At least it hides a bunch of the recovered.edits mess.

Nits that are not important:

 + Change NavigableSet<Path> file to Collection<Path> and then do files == null || files.isEmpty()... so you avoid an import and an == 0 on a Collection. No biggie.
 + Throw an HBaseIOE rather than an IOE where the patch does throw new IOException? We throw too much base IOE as it is.

When the test runs, is it repro'ing the condition?

Thanks.

> Data loss if merging regions while ServerCrashProcedure executing
> -----------------------------------------------------------------
>
>                 Key: HBASE-20878
>                 URL: https://issues.apache.org/jira/browse/HBASE-20878
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Critical
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
>         Attachments: HBASE-20878.branch-2.0.001.patch, HBASE-20878.branch-2.0.002.patch, HBASE-20878.branch-2.0.003.patch
>
>
> In MergeTableRegionsProcedure, we close the regions to merge using UnassignProcedure. But if the RS hosting these regions has crashed, a ServerCrashProcedure will execute at the same time. The UnassignProcedures will be blocked until all logs are split. But since these regions are closed for merging, they won't be opened again, so the recovered.edits in the region dir won't be replayed and data will be lost.
> I provided a test to repro this case. I strongly suspect that the split region procedure also has this kind of problem. I will check later.
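
For readers without the patch at hand, the two nits in the comment might look roughly like the sketch below. It is not the patch itself: the class name, method names, and error message are hypothetical, `java.nio.file.Path` stands in for Hadoop's `Path`, and the nested `HBaseIOException` is a local stand-in for `org.apache.hadoop.hbase.HBaseIOException` so the example compiles without HBase on the classpath.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.TreeSet;

public class RecoveredEditsCheck {

    // Local stand-in (assumption) for org.apache.hadoop.hbase.HBaseIOException,
    // which also extends IOException.
    public static class HBaseIOException extends IOException {
        public HBaseIOException(String message) {
            super(message);
        }
    }

    // First nit: accept any Collection<Path> and use isEmpty() instead of
    // declaring a NavigableSet<Path> and comparing its size to 0.
    public static boolean noRecoveredEdits(Collection<Path> files) {
        return files == null || files.isEmpty();
    }

    // Second nit: throw the more specific exception type rather than a
    // bare IOException. The method name and message are hypothetical.
    public static void failIfRecoveredEditsExist(String regionName, Collection<Path> files)
            throws HBaseIOException {
        if (!noRecoveredEdits(files)) {
            throw new HBaseIOException("Region " + regionName + " still has "
                + files.size() + " recovered.edits file(s) that were not replayed");
        }
    }

    public static void main(String[] args) {
        // A sorted set still satisfies the Collection<Path> parameter.
        Collection<Path> files = new TreeSet<>();
        files.add(Paths.get("recovered.edits", "0000000000000000042"));
        try {
            failIfRecoveredEditsExist("exampleRegion", files);
        } catch (HBaseIOException e) {
            System.out.println("Check failed: " + e.getMessage());
        }
    }
}
```

Because the stand-in exception still extends `IOException`, callers that already catch `IOException` would not need to change, which is presumably why the reviewer treats the swap as a low-cost nit.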