[
https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Hsieh updated HBASE-5128:
----------------------------------
Description:
The current (0.90.5, 0.92.0rc2) versions of hbck detect most region
consistency and table integrity invariant violations. However, with '-fix' they
can only automatically repair region consistency cases having to do with
deployment problems. This updated version should be able to handle all cases
(including a new orphan regiondir case). When complete, it will likely deprecate
the OfflineMetaRepair tool and subsume several open META-hole related issues.
Here's the approach (from the comment at the top of the new version of the
file).
{code}
/**
* HBaseFsck (hbck) is a tool for checking and repairing region consistency and
* table integrity.
*
* Region consistency checks verify that META, region deployment on
* region servers and the state of data in HDFS (.regioninfo files) all are in
* accordance.
*
* Table integrity checks verify that all possible row keys resolve to
* exactly one region of a table. This means there are no individual degenerate
* or backwards regions; no holes between regions; and that there are no
* overlapping regions.
*
* The general repair strategy works in these steps.
* 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
* 2) Repair Region Consistency with META and assignments
*
* For table integrity repairs, the tables' region directories are scanned
* for .regioninfo files. Each table's integrity is then verified. If there
* are any orphan regions (regions with no .regioninfo files), or holes, new
* regions are fabricated. Backwards regions are sidelined as well as empty
* degenerate (endkey==startkey) regions. If there are any overlapping regions,
* a new region is created and all data is merged into the new region.
*
* Table integrity repairs deal solely with HDFS and can be done offline -- the
* hbase region servers or master do not need to be running. This phase can be
* used to completely reconstruct the META table in an offline fashion.
*
* Region consistency requires three conditions -- 1) valid .regioninfo file
* present in an hdfs region dir, 2) valid row with .regioninfo data in META,
* and 3) a region is deployed only at the regionserver that it was assigned to.
*
* Region consistency requires hbck to contact the HBase master and region
* servers, so the connect() must first be called successfully. Much of the
* region consistency information is transient and less risky to repair.
*/
{code}
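To make the table integrity invariant concrete, here is a minimal sketch of how degenerate regions, backwards regions, holes, and overlaps could be detected from a list of region key ranges. It is an illustration only, not the code in the new HBaseFsck: the class name, the Region type, and the check() method are hypothetical, keys are modeled as Strings rather than the byte[] row keys HBase uses, and an empty string is assumed to mean "unbounded" (start or end of the table).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a table integrity check; not the HBaseFsck implementation. */
final class TableIntegritySketch {

  /** Key range [startKey, endKey). Assumption: "" means unbounded. */
  static final class Region {
    final String startKey;
    final String endKey;

    Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  /** Returns one message per violation; an empty list means every key maps to one region. */
  static List<String> check(List<Region> regions) {
    List<String> problems = new ArrayList<>();
    List<Region> valid = new ArrayList<>();

    // Per-region checks: degenerate (endkey==startkey) and backwards regions.
    for (Region r : regions) {
      if (!r.endKey.isEmpty() && r.startKey.equals(r.endKey)) {
        problems.add("degenerate region at '" + r.startKey + "'");
      } else if (!r.endKey.isEmpty() && r.startKey.compareTo(r.endKey) > 0) {
        problems.add("backwards region '" + r.startKey + "'..'" + r.endKey + "'");
      } else {
        valid.add(r);
      }
    }

    // Walk the remaining regions in start-key order, tracking how far the
    // key space has been covered so far.
    valid.sort(Comparator.comparing((Region r) -> r.startKey));
    String covered = "";        // every key below this is already covered
    boolean reachedEnd = false; // true once a region with an unbounded end key is seen
    for (Region r : valid) {
      if (reachedEnd || r.startKey.compareTo(covered) < 0) {
        problems.add("overlap at '" + r.startKey + "'");
      } else if (r.startKey.compareTo(covered) > 0) {
        problems.add("hole between '" + covered + "' and '" + r.startKey + "'");
      }
      if (r.endKey.isEmpty()) {
        reachedEnd = true;
      } else if (r.endKey.compareTo(covered) > 0) {
        covered = r.endKey;
      }
    }
    if (!reachedEnd) {
      problems.add("hole between '" + covered + "' and end of table");
    }
    return problems;
  }

  public static void main(String[] args) {
    // A table whose two regions leave the range ['b', 'c') uncovered.
    List<Region> regions = Arrays.asList(new Region("", "b"), new Region("c", ""));
    System.out.println(check(regions));
  }
}
```

In this sketch the per-region checks run first, so a degenerate or backwards region is reported once and excluded from the hole/overlap walk. That loosely mirrors the description above, where such regions are sidelined before holes are filled and overlaps merged.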
was:
The current (0.90.5, 0.92.0rc2) versions of hbck detect most region
consistency and table integrity invariant violations. However, with '-fix' they
can only automatically handle deployment problems with region consistency
cases. This updated version should be able to handle all cases (including a
new orphan regiondir case). When complete, it will likely deprecate the
OfflineMetaRepair tool and subsume several open META-hole related issues.
Here's the approach (from the comment at the top of the new version of the
file).
{code}
/**
* HBaseFsck (hbck) is a tool for checking and repairing region consistency and
* table integrity.
*
* Region consistency checks verify that META, region deployment on
* region servers and the state of data in HDFS (.regioninfo files) all are in
* accordance.
*
* Table integrity checks verify that all possible row keys resolve to
* exactly one region of a table. This means there are no individual degenerate
* or backwards regions; no holes between regions; and that there are no
* overlapping regions.
*
* The general repair strategy works in these steps.
* 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
* 2) Repair Region Consistency with META and assignments
*
* For table integrity repairs, the tables' region directories are scanned
* for .regioninfo files. Each table's integrity is then verified. If there
* are any orphan regions (regions with no .regioninfo files), or holes, new
* regions are fabricated. Backwards regions are sidelined as well as empty
* degenerate (endkey==startkey) regions. If there are any overlapping regions,
* a new region is created and all data is merged into the new region.
*
* Table integrity repairs deal solely with HDFS and can be done offline -- the
* hbase region servers or master do not need to be running. This phase can be
* used to completely reconstruct the META table in an offline fashion.
*
* Region consistency requires three conditions -- 1) valid .regioninfo file
* present in an hdfs region dir, 2) valid row with .regioninfo data in META,
* and 3) a region is deployed only at the regionserver that it was assigned to.
*
* Region consistency requires hbck to contact the HBase master and region
* servers, so the connect() must first be called successfully. Much of the
* region consistency information is transient and less risky to repair.
*/
{code}
> [uber hbck] Enable hbck to automatically repair table integrity problems as
> well as region consistency problems while online.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5128
> URL: https://issues.apache.org/jira/browse/HBASE-5128
> Project: HBase
> Issue Type: New Feature
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh