Stephen Yuan Jiang created HBASE-13576:
------------------------------------------

             Summary: HBCK enhancement: Failure in checking one region should 
not fail the entire HBCK operation.
                 Key: HBASE-13576
                 URL: https://issues.apache.org/jira/browse/HBASE-13576
             Project: HBase
          Issue Type: Bug
          Components: hbck
    Affects Versions: 2.0.0, 1.1.0, 1.2.0
            Reporter: Stephen Yuan Jiang
            Assignee: Stephen Yuan Jiang


HBaseFsck#checkRegionConsistency() checks region consistency and repairs the 
corruption if requested.  However, this function can throw exceptions.  For 
example, as part of region repair it calls 
HBaseFsckRepair#waitUntilAssigned(); if a region stays in transition for over 
120 seconds (the default value of the "hbase.hbck.assign.timeout" 
configuration), an IOException is thrown.

The problem is that a single exception in checkRegionConsistency() kills the 
entire hbck operation, because the exception propagates up.

The proposal is that if the region is not the META region (or a system table 
region, if we prefer), we skip the region when 
HBaseFsck#checkRegionConsistency() fails.  We could print the skipped regions 
in the summary section so that users know to either re-run hbck or investigate 
potential issues with those regions.
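
A minimal sketch of the proposed behavior, for illustration only.  It does not 
use the real HBaseFsck classes: RegionChecker, checkAll() and the region name 
strings are hypothetical stand-ins, and identifying the META region by a name 
prefix is an assumption made to keep the example self-contained.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipFailedRegionsSketch {

    /** Hypothetical stand-in for the per-region consistency check. */
    interface RegionChecker {
        void check(String regionName) throws IOException;
    }

    /**
     * Checks every region in order. A failure on a non-META region is
     * recorded and the region is skipped; a failure on a META region is
     * rethrown, which aborts the run as it does today.
     */
    static List<String> checkAll(List<String> regions, String metaPrefix,
                                 RegionChecker checker) throws IOException {
        List<String> skipped = new ArrayList<>();
        for (String region : regions) {
            try {
                checker.check(region);
            } catch (IOException e) {
                if (region.startsWith(metaPrefix)) {
                    throw e; // META failures still fail the whole operation
                }
                skipped.add(region); // remember it for the summary section
            }
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        List<String> regions =
            Arrays.asList("hbase:meta,,1", "t1,aaa,1", "t1,bbb,2");

        // Simulate one user region that times out while being assigned.
        List<String> skipped = checkAll(regions, "hbase:meta", name -> {
            if (name.equals("t1,bbb,2")) {
                throw new IOException("region stuck in transition: " + name);
            }
        });

        // Summary section: tell the user which regions were not checked.
        System.out.println("Skipped regions: " + skipped);
    }
}
```

Collecting the skipped regions in a list, rather than only logging each 
failure as it happens, is what lets the summary report them together at the 
end of the run.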


