[
https://issues.apache.org/jira/browse/HBASE-21745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-21745:
--------------------------
Release Note:
This issue adds via its subtasks:
* An 'HBCK Report' page in the Master UI, added by
HBASE-22527+HBASE-22709+HBASE-22723+ (since 2.1.6, 2.2.1, 2.3.0). It lists
inconsistencies and anomalies found by the new hbase:meta consistency checks
added to CatalogJanitor (holes, overlaps, bad servers) and by a new
'HBCK chore' that runs less frequently and notes filesystem
orphans and overlaps as well as the following conditions:
** Master thought this region opened, but no regionserver reported it.
** Master thought this region opened on Server1, but regionserver reported
Server2
** More than one regionserver reported this region as opened.
Both chores can be triggered from the shell to regenerate ‘new’ reports.
* Means of scheduling a ServerCrashProcedure (HBASE-21393).
* An ‘offline’ hbase:meta rebuild (HBASE-22680).
* Offline replacement of hbase.version and hbase.id.
* Documentation on how to use the completebulkload tool to ‘adopt’ orphaned data
found by the new HBCK2 ‘filesystem’ check (see below) and the ‘HBCK chore’ (HBASE-22859).
* A ‘holes’ and ‘overlaps’ fix that runs in the master and uses the new
bulk-merge facility to collapse many overlaps in one go.
* hbase-operator-tools HBCK2 client tool got a bunch of additions:
** A specialized 'fix' for the case where operators ran old hbck 'offlinemeta'
repair and destroyed their hbase:meta; it ties together holes in meta with
orphaned data in the fs (HBASE-22567)
** A ‘filesystem’ command that reports on orphan data as well as bad
references and hlinks, with a ‘fix’ for the latter two (based on an updated
hbck1 facility).
** Adds back the ‘replication’ fix facility from hbck1 (HBASE-22717)
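
To illustrate the three assignment conditions listed above for the 'HBCK
chore', here is a minimal Python sketch. It is not HBase code: the function
name, the labels, and the dictionary-based inputs are all hypothetical,
chosen only to show how the Master's view can be compared against what the
regionservers report.

```python
from collections import defaultdict

# Hypothetical labels for the three conditions; these are not HBase names.
NOT_REPORTED = "opened per Master, but no regionserver reported it"
WRONG_SERVER = "reported by a different regionserver than Master expects"
MULTIPLE_SERVERS = "reported as opened by more than one regionserver"


def find_assignment_inconsistencies(master_view, server_reports):
    """Compare the Master's view with regionserver reports.

    master_view:    {region: server the Master believes hosts the region}
    server_reports: {server: set of regions that server reports as opened}
    Returns {region: label} for every region with a mismatch.
    """
    # Invert the reports so each region maps to the servers claiming it.
    reported_by = defaultdict(set)
    for server, regions in server_reports.items():
        for region in regions:
            reported_by[region].add(server)

    problems = {}
    for region, expected in master_view.items():
        servers = reported_by.get(region, set())
        if not servers:
            problems[region] = NOT_REPORTED
        elif len(servers) > 1:
            problems[region] = MULTIPLE_SERVERS
        elif expected not in servers:
            problems[region] = WRONG_SERVER
    return problems


if __name__ == "__main__":
    master = {"r1": "s1", "r2": "s1", "r3": "s2", "r4": "s2"}
    reports = {"s1": {"r1", "r3"}, "s2": {"r3"}, "s3": {"r2"}}
    for region, label in sorted(find_assignment_inconsistencies(master, reports).items()):
        print(region, "->", label)
```

In this sketch a region claimed by several servers is flagged as a multiple
report even if one of those servers is the expected one; that ordering of the
checks is a design choice of the example, not a statement about the chore.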
The combined result is that hbck2 now exceeds hbck1's abilities. The
functionality is deliberately disaggregated, in keeping with the hbck2
philosophy of providing 'plumbing' rather than 'porcelain', so there is still
work to do adding fix-it playbooks, scripting across outages, and automation.
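
As a rough illustration of what 'holes' and 'overlaps' in the region chain
mean, the following Python sketch walks a list of region key ranges sorted by
start key. It is a simplified model under stated assumptions, not the
CatalogJanitor implementation: regions are hypothetical (start_key, end_key)
string pairs, an empty string stands for an unbounded first or last key, and
Python string ordering stands in for HBase's byte-wise key comparison.

```python
def check_region_chain(regions):
    """Find holes and overlaps in a list of (start_key, end_key) pairs.

    An empty start key means "from the beginning of the table" and an empty
    end key means "to the end of the table".
    Returns (holes, overlaps): holes are uncovered (from_key, to_key) gaps,
    overlaps are the regions that start inside already-covered keyspace.
    """
    holes, overlaps = [], []
    reach = ""           # keyspace is covered from the table start up to here
    covered_all = False  # True once a region with an unbounded end is seen

    for start, end in sorted(regions):
        if covered_all or start < reach:
            overlaps.append((start, end))
        elif start > reach:
            holes.append((reach, start))
        if end == "":
            covered_all = True
        elif not covered_all and end > reach:
            reach = end

    if not covered_all:
        # Nothing covers the tail of the keyspace.
        holes.append((reach, ""))
    return holes, overlaps


if __name__ == "__main__":
    chain = [("", "b"), ("b", "d"), ("c", "f"), ("h", "")]
    found_holes, found_overlaps = check_region_chain(chain)
    print("holes:", found_holes)
    print("overlaps:", found_overlaps)
```

A fix along the lines described above would then plug each hole and merge the
overlapping regions; the sketch only covers detection.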
> Make HBCK2 be able to fix issues other than region assignment
> -------------------------------------------------------------
>
> Key: HBASE-21745
> URL: https://issues.apache.org/jira/browse/HBASE-21745
> Project: HBase
> Issue Type: Umbrella
> Components: hbase-operator-tools, hbck2
> Reporter: Duo Zhang
> Assignee: stack
> Priority: Critical
>
> This is what [~apurtell] posted on the mailing list. HBCK2 should support:
> * -Rebuild meta from region metadata in the filesystem, aka offline meta
> rebuild.-
> * -Fix assignment errors (undeployed regions, double assignments (yes,
> should not be possible), etc)- (See
> https://issues.apache.org/jira/browse/HBASE-21745?focusedCommentId=16888302&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16888302)
> * -Fix region holes, overlaps, and other errors in the region chain- (See
> HBASE-22796 and HBASE-22771 -- adds hole and overlap fixing to master; hbck2
> client can ask for a fixMeta).
> * -Fix failed split and merge transactions that have failed to roll back due
> to some bug (related to previous)- (The 'overlaps' fix in the previous item
> will take care of these).
> * -Enumerate store files to determine file level corruption and sideline
> corrupt files-
> * -Fix hfile link problems (dangling / broken)-