[
https://issues.apache.org/jira/browse/HBASE-20018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell resolved HBASE-20018.
-----------------------------------------
Resolution: Not A Problem
Some of these ideas were implemented in HbckChore in HBase v2 and the remainder
of this issue is old.
> Safe online META repair
> -----------------------
>
> Key: HBASE-20018
> URL: https://issues.apache.org/jira/browse/HBASE-20018
> Project: HBase
> Issue Type: New Feature
> Components: hbck
> Reporter: Andrew Kyle Purtell
> Priority: Major
>
> HBCK is a tank, or a giant shotgun, or choose the battlefield metaphor you
> feel is most appropriate. It rolls onto the field and leaves problems crushed
> in its wake, but if you point it in the wrong direction, it will crush
> your production data too. As such, it is a means of last resort to fix an
> ailing cluster. It is also imperative that user request traffic, writes in
> particular, is stopped before attempting a number of the fixes. It is
> unlikely that the default "-repair" option is what you want; it turns on too
> many fixes to risk at one time. There is a large number of command-line
> switches for individual checks and fixes, which are very useful but also
> error-prone when cobbling together a command line for a cluster fix under
> pressure.
> An operations team might hesitate to employ hbck to fix some accumulating bad
> state, because of the disruption its use requires and the risk of
> compounding the problem if it is not done carefully. That of course would be bad
> because the accumulating bad state will eventually have an availability
> impact.
> It should be safer to use hbck, but changing hbck also carries risk. We can
> leave it be as the useful (but dangerous) tool it is and focus on making a
> subset of its functionality safer.
> There is a class of META corruptions of mild to moderate severity that
> could in theory be handled more safely in an online manner without requiring
> a suspension of user traffic. Some things hbck does are safe enough to use
> directly for this. Others need tweaks to do more preflight checks (like
> checking region states) first. Develop these as a separate tool, maybe even a
> new HMaster or Admin component.
> Look for opportunities to share code with existing hbck by refactoring it
> into a shared library.