Guanghao Zhang created HBASE-23044:
--------------------------------------

             Summary: CatalogJanitor#cleanMergeQualifier may clean wrong parent regions
                 Key: HBASE-23044
                 URL: https://issues.apache.org/jira/browse/HBASE-23044
             Project: HBase
          Issue Type: Improvement
            Reporter: Guanghao Zhang

2019-09-17,19:42:40,539 INFO [PEWorker-1] org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=1223589, state=SUCCESS; GCMultipleMergedRegionsProcedure child={color:red}647600d28633bb2fe06b40682bab0593{color}, parents:[81b6fc3c560a00692bc7c3cd266a626a], [472500358997b0dc8f0002ec86593dcf] in 2.6470sec

2019-09-17,19:59:54,179 INFO [PEWorker-6] org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=1223651, state=SUCCESS; GCMultipleMergedRegionsProcedure child={color:red}647600d28633bb2fe06b40682bab0593{color}, parents:[9c52f24e0a9cc9b4959c1ebdfea29d64], [a623f298870df5581bcfae7f83311b33] in 1.0340sec

The child is the same region {color:red}647600d28633bb2fe06b40682bab0593{color} in both procedures, but the parent regions are different.

MergeTableRegionProcedure#prepareMergeRegion tries to cleanMergeQualifier for each of the regions to merge.

{code:java}
for (RegionInfo ri: this.regionsToMerge) {
  if (!catalogJanitor.cleanMergeQualifier(ri)) {
    String msg = "Skip merging " + RegionInfo.getShortNameToLog(regionsToMerge) +
        ", because parent " + RegionInfo.getShortNameToLog(ri) + " has a merge qualifier";
    LOG.warn(msg);
    throw new MergeRegionException(msg);
  }
}
{code}

Suppose regions A and B merge into C, and regions D and E merge into F. When C and F are merged, the procedure tries to cleanMergeQualifier for C and F. catalogJanitor.cleanMergeQualifier succeeds for region C but fails for region F, because region F still has references.

When C and F are merged again, the procedure tries to cleanMergeQualifier for C and F again, but MetaTableAccessor.getMergeRegions now returns the wrong parents. It uses a scan with a filter to fetch the result, but region C's merge qualifier was already deleted in the first attempt. The scan therefore returns a wrong result, possibly the row of another region.

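The failure mode described above can be sketched with a toy in-memory model of hbase:meta. This is not HBase code: the class MergeQualifierScanSketch, its methods, the "merge" prefix value, and the single-letter row keys are all hypothetical names chosen for illustration. It assumes, as the description suggests, that a scan with a start row, a one-row limit, and a qualifier filter skips rows that have no matching cell and returns the next row that does. The guardedParents method shows one possible guard (checking that the returned row is the requested row); it is an assumption for discussion, not necessarily the fix applied in HBase.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of hbase:meta, NOT HBase code: row key -> (qualifier -> value).
final class MergeQualifierScanSketch {
  // Illustrative prefix; the real value lives in HConstants.MERGE_QUALIFIER_PREFIX_STR.
  static final String MERGE_PREFIX = "merge";

  private final TreeMap<String, TreeMap<String, String>> meta = new TreeMap<>();

  void put(String row, String qualifier, String value) {
    meta.computeIfAbsent(row, r -> new TreeMap<>()).put(qualifier, value);
  }

  // Models a successful cleanMergeQualifier: drop every merge* cell of the row.
  void deleteMergeQualifiers(String row) {
    TreeMap<String, String> cells = meta.get(row);
    if (cells != null) {
      cells.keySet().removeIf(q -> q.startsWith(MERGE_PREFIX));
    }
  }

  // Models start row + one-row limit + qualifier filter: returns the first row
  // at or after startRow that has at least one merge* cell, or null if none.
  Map.Entry<String, List<String>> scanFirstMatch(String startRow) {
    for (Map.Entry<String, TreeMap<String, String>> row
        : meta.tailMap(startRow, true).entrySet()) {
      List<String> parents = new ArrayList<>();
      for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
        if (cell.getKey().startsWith(MERGE_PREFIX)) {
          parents.add(cell.getValue());
        }
      }
      if (!parents.isEmpty()) {
        return new AbstractMap.SimpleImmutableEntry<>(row.getKey(), parents);
      }
    }
    return null;
  }

  // Behaviour described in the report: trust whatever row the scan returns.
  List<String> unguardedParents(String region) {
    Map.Entry<String, List<String>> hit = scanFirstMatch(region);
    return hit == null ? Collections.<String>emptyList() : hit.getValue();
  }

  // Hypothetical guard: ignore the result unless it belongs to the requested row.
  List<String> guardedParents(String region) {
    Map.Entry<String, List<String>> hit = scanFirstMatch(region);
    if (hit == null || !hit.getKey().equals(region)) {
      return Collections.<String>emptyList();
    }
    return hit.getValue();
  }

  // A and B merged into C; D and E merged into F.
  static MergeQualifierScanSketch demoMeta() {
    MergeQualifierScanSketch m = new MergeQualifierScanSketch();
    m.put("C", "regioninfo", "C");
    m.put("C", "merge0000", "A");
    m.put("C", "merge0001", "B");
    m.put("F", "regioninfo", "F");
    m.put("F", "merge0000", "D");
    m.put("F", "merge0001", "E");
    return m;
  }

  public static void main(String[] args) {
    MergeQualifierScanSketch m = demoMeta();
    System.out.println(m.unguardedParents("C")); // [A, B]
    m.deleteMergeQualifiers("C");                // first merge attempt cleaned C
    System.out.println(m.unguardedParents("C")); // [D, E], which are F's parents
    System.out.println(m.guardedParents("C"));   // []
  }
}
```

In this model the second lookup for C silently picks up F's parents, which matches the two log lines above where one child is garbage-collected with two different parent pairs. A point lookup on the exact row would avoid the issue in the same way as the row-key check.
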
The relevant code:

{code:java}
// CatalogJanitor
public boolean cleanMergeQualifier(final RegionInfo region) throws IOException {
  // Get merge regions if it is a merged region and already has merge qualifier
  List<RegionInfo> parents = MetaTableAccessor.getMergeRegions(this.services.getConnection(),
      region.getRegionName());
  if (parents == null || parents.isEmpty()) {
    // It doesn't have merge qualifier, no need to clean
    return true;
  }
  return cleanMergeRegion(region, parents);
}

// MetaTableAccessor
public static List<RegionInfo> getMergeRegions(Connection connection, byte[] regionName)
    throws IOException {
  return getMergeRegions(getMergeRegionsRaw(connection, regionName));
}

private static Cell [] getMergeRegionsRaw(Connection connection, byte [] regionName)
    throws IOException {
  // The scan starts at regionName but is not bounded to that row.
  Scan scan = new Scan().withStartRow(regionName).
      setOneRowLimit().
      readVersions(1).
      addFamily(HConstants.CATALOG_FAMILY).
      setFilter(new QualifierFilter(CompareOperator.EQUAL,
          new RegexStringComparator(HConstants.MERGE_QUALIFIER_PREFIX_STR+ ".*")));
  try (Table m = getMetaHTable(connection); ResultScanner scanner = m.getScanner(scan)) {
    // Should be only one result in this scanner if any.
    Result result = scanner.next();
    if (result == null) {
      return null;
    }
    // Should be safe to just return all Cells found since we had filter in place.
    // All values should be RegionInfos or something wrong.
    return result.rawCells();
  }
}
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)