[ https://issues.apache.org/jira/browse/HBASE-20001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373621#comment-16373621 ]
Thiruvel Thirumoolan commented on HBASE-20001:
----------------------------------------------

Uploaded HBASE-20001.branch-1.4.003.patch to address the issues we ([~toffer] and I) found. This addresses two issues:
# The regionName fix, for the bug that caused the data loss issue for us.
# ZK split/merge rollback on failure, with unit tests.

testRSSplitEphemeralsDisappearButDaughtersAreOnlinedAfterShutdownHandling - This test failed and caused subsequent tests to fail. It was failing while deleting the test table (finally clause) because, with the regionName fix, the daughters were in transition (SPLITTING_NEW). Without the regionName fix, the daughters were offlined and their HDFS dir removed, and the test passed, which is wrong.

[~toffer] pointed out that the test was waiting for the daughters to be online, but in zk based assignment we roll back rather than roll forward, so we should be waiting for the parent. The test was still passing all these checks because there were not enough barriers. So we fixed the test to comply with the zk based behavior. We also introduced a similar test for merge in zk mode. I will raise a separate Jira to reintroduce the zkless tests and will add the appropriate zkless tests in a follow-up.

Once we fixed the test, we realized the failed daughters were in transition and not offlined. We fixed that as well in RegionStates.java as part of this Jira.

Please let us know what you think. Thanks!

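To illustrate why the regionName fix matters, here is a minimal, self-contained sketch. It does not use the HBase API: the class RegionLookupSketch, its methods, and the sample region name are hypothetical stand-ins, and the map only mimics a meta lookup that is keyed by the full region name. It shows how passing the encoded name to such a lookup returns nothing, so a healthy region would look like it has no meta entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the bug; not HBase code.
public class RegionLookupSketch {
    // Stand-in for the meta table: full region name -> hosting server.
    private final Map<String, String> metaByRegionName = new HashMap<>();

    public void addRegion(String regionName, String serverName) {
        metaByRegionName.put(regionName, serverName);
    }

    // Stand-in for a getRegion-style lookup that expects the full region name.
    public String getRegion(String regionName) {
        return metaByRegionName.get(regionName);
    }

    // Mirrors the shape of the check in the issue: clean up when no meta entry is found.
    public boolean wouldClean(String lookupKey) {
        return getRegion(lookupKey) == null;
    }

    public static void main(String[] args) {
        // Illustrative values only; the real name layout is an assumption here.
        String encodedName = "0123456789abcdef0123456789abcdef";
        String regionName = "t1,,1519000000000." + encodedName + ".";

        RegionLookupSketch meta = new RegionLookupSketch();
        meta.addRegion(regionName, "rs1.example.org,16020,1519000000000");

        // Lookup by encoded name misses, so a good region would be cleaned.
        System.out.println("clean when using encoded name: " + meta.wouldClean(encodedName));
        // Lookup by full region name finds the entry, so nothing is cleaned.
        System.out.println("clean when using region name:  " + meta.wouldClean(regionName));
    }
}
```

In this sketch the first check reports true and the second reports false, which matches the concern in the issue description quoted below: with the wrong key, a region that is present in meta is treated as missing.
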
> cleanIfNoMetaEntry() uses encoded instead of region name to lookup region
> -------------------------------------------------------------------------
>
>                 Key: HBASE-20001
>                 URL: https://issues.apache.org/jira/browse/HBASE-20001
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.1.7
>            Reporter: Francis Liu
>            Assignee: Thiruvel Thirumoolan
>            Priority: Major
>             Fix For: 1.3.2, 1.5.0, 1.2.7, 1.4.3
>
>         Attachments: HBASE-20001.branch-1.4.001.patch,
>                      HBASE-20001.branch-1.4.002.patch,
>                      HBASE-20001.branch-1.4.003.patch
>
>
> In RegionStates.cleanIfNoMetaEntry()
> {{if (MetaTableAccessor.getRegion(server.getConnection(), hri.getEncodedNameAsBytes()) == null) {}}
> {{regionOffline(hri);}}
> {{FSUtils.deleteRegionDir(server.getConfiguration(), hri);}}
> }
> But api expects regionname
> {{public static Pair<HRegionInfo, ServerName> getRegion(Connection connection, byte [] regionName)}}
> So we might end up cleaning good regions.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)