[ https://issues.apache.org/jira/browse/HBASE-15940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311627#comment-15311627 ]
Stephen Yuan Jiang commented on HBASE-15940:
--------------------------------------------

One possible solution: we do not need to copy reference files to the new region during fixHdfsOverlaps at all. Either the reference file is orphaned, or the real files that it points to are already in the new region.

> HBCK unnecessary moves reference files when a table has split region to fix
> non-existing overlap regions
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15940
>                 URL: https://issues.apache.org/jira/browse/HBASE-15940
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck
>    Affects Versions: 1.0.0
>            Reporter: Stephen Yuan Jiang
>            Assignee: Stephen Yuan Jiang
>         Attachments: org.apache.hadoop.hbase.util.TestHBaseFsck-output.txt,
> repro-hbck-repair-healthy-splitted=region.patch
>
>
> When a repair option (specifically the -fixHdfsOverlaps option) is run
> against a table that has split regions (the parent region and its child
> regions both exist, with reference files), Hbck wrongly concludes that
> there are overlapping regions and tries to merge them to fix it.
> This is by design, as the current implementation of Hbck uses HDFS as the
> trusted source without consulting the META table.
> Here is the comment from one of the unit tests:
> {code}
> // TODO: fixHdfsHoles does not work against splits, since the parent dir lingers on
> // for some time until children references are deleted. HBCK erroneously sees this as
> // overlapping regions
> {code}
> However, this is undesirable. When the reference files are moved to a new
> region, the parent region would have no daughter regions and hence could be
> cleaned up by CatalogJanitor. This would create a real inconsistency:
> lingering reference files.
> Another bad consequence is that the split regions would be merged back into
> one. Although this is undesirable, at least it would not cause more
> inconsistency.
> This JIRA does not try to solve the unsplit issue, as that requires a bigger
> design change in Hbck.
> This JIRA tries to address the potential lingering reference files issue,
> as multiple customers using branch-1 have faced it in production.
> (The workaround is to run a major compaction on all split regions before
> running HBCK; this can take a long time and have production impact.)
> Attached are the log and a modified unit test to repro the issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
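
As a rough illustration of the suggestion in the comment (do not copy reference files when fixHdfsOverlaps moves store files into the new region), here is a minimal sketch in plain Java. It is not the Hbck implementation and has no HBase dependency: the class and method names are hypothetical, and the file-name pattern is a simplifying assumption (a reference file named "<hfile name>.<encoded parent region name>", both parts lowercase hex).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch only: hypothetical names, not part of HBase or Hbck.
class OverlapMergeSketch {

    // ASSUMPTION: a reference file is named
    // "<hfile name>.<encoded parent region name>", both lowercase hex.
    // Real HBase naming has more cases (for example HFileLinks and
    // bulk-load suffixes) that this sketch deliberately ignores.
    private static final Pattern REFERENCE_NAME =
        Pattern.compile("^[0-9a-f]+\\.[0-9a-f]+$");

    static boolean isReferenceFileName(String fileName) {
        return REFERENCE_NAME.matcher(fileName).matches();
    }

    // Returns the store files that would be moved into the merged region.
    // Reference files are skipped, so they stay in their original region
    // directory instead of being copied to the new region.
    static List<String> filesToMove(List<String> storeFileNames) {
        List<String> toMove = new ArrayList<>();
        for (String name : storeFileNames) {
            if (!isReferenceFileName(name)) {
                toMove.add(name);
            }
        }
        return toMove;
    }

    public static void main(String[] args) {
        // Made-up file names: one plain store file and one reference file.
        List<String> files = Arrays.asList(
            "3f2a9c0d1e4b4c6f8a7b6c5d4e3f2a1b",
            "9a8b7c6d5e4f40319a8b7c6d5e4f3a2b.1c2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f");
        System.out.println(filesToMove(files));
    }
}
```

In this sketch the reference file is simply left where it is. A real change would presumably rely on HBase's own reference-file detection rather than a hand-written pattern.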