[ https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236997#comment-13236997 ]
jirapos...@reviews.apache.org commented on HBASE-5128:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3435/#review6304
-----------------------------------------------------------

Ship it!

Went through a third. Minors below that should not hold up commit. Get it in!!! Great stuff Jon.

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/3435/#comment13682>
Good doc (though I've said this previously, it's still good doc).

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/3435/#comment13683>
Why TreeMap it if it's encoded region names? These are hashes, so is there no value in sorting them?

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/3435/#comment13684>
Ditto on sort here? Why sort by table name? How does sort prevent dupes?

src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
<https://reviews.apache.org/r/3435/#comment13685>
This almost recommends that HBaseFsck becomes a shell that does nothing but instantiate another class that does the actual fixup. clearState in that case would throw away the instantiated 'Fsck' class and create a completely new instance rather than zero out data members as this does. For the future.

- Michael

On 2012-03-23 16:13:50, jmhsieh wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/3435/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2012-03-23 16:13:50)
bq.
bq. Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and Jean-Daniel Cryans.
bq.
bq. Summary
bq. -------
bq.
bq. This should be nearly ready for integration. This has the same control flow as the trunk/0.92/0.94 versions but has a few differences.
bq.
bq. - It needs to track HTableDescriptors instead of reading them from the file system.
bq. - It uses a different HBaseFsckRepair.forceOfflineInZK method -- which for some reason means we don't need HBASE-5563.
bq. - Uses HServerAddress instead of ServerName
bq.
bq. This version is close to what we've used on production clusters.
bq.
bq. This addresses bug HBASE-5128.
bq. https://issues.apache.org/jira/browse/HBASE-5128
bq.
bq. Diffs
bq. -----
bq.
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1a4f7f1
bq. src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 3c635d4
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java d47ef10
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java cd1755f
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java c0aaf65
bq. src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 5916d9c
bq. src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java d57bb6b
bq. src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java PRE-CREATION
bq. src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java PRE-CREATION
bq. src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java d9a2a02
bq. src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java 937781d
bq. src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java 0599da1
bq. src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java dbb97f8
bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 2b4cac8
bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java ebbeead
bq. src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java b175548
bq.
bq. Diff: https://reviews.apache.org/r/3435/diff
bq.
bq. Testing
bq. -------
bq.
bq. All TestHBaseFsck unit tests pass. Currently running full suite.
bq.
bq. Thanks,
bq.
bq. jmhsieh
bq.
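
The review comment on clearState above (have HBaseFsck act as a shell around a disposable worker object) could look roughly like the sketch below. This is only an illustration of the idea, not code from the patch: FsckShellSketch, FsckRun, FsckShell, recordError and the fields are all hypothetical names invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; every name here is hypothetical and not taken from the patch. */
final class FsckShellSketch {

    /** Holds all state for a single check-and-repair pass. */
    static final class FsckRun {
        // Keyed by encoded region name; a HashMap, since the keys are hashes
        // and sorting them carries no meaning.
        final Map<String, String> regionProblems = new HashMap<>();
        int errorCount;

        void recordError(String encodedRegionName, String problem) {
            regionProblems.put(encodedRegionName, problem);
            errorCount++;
        }
    }

    /** Thin shell: it owns nothing except the current run. */
    static final class FsckShell {
        private FsckRun current = new FsckRun();

        FsckRun current() {
            return current;
        }

        /** Discards the old run wholesale instead of zeroing each data member. */
        void clearState() {
            current = new FsckRun();
        }
    }
}
```

With this shape, a field added to the per-run class later cannot be forgotten in clearState, because nothing is reset member by member.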
> [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5128
> URL: https://issues.apache.org/jira/browse/HBASE-5128
> Project: HBase
> Issue Type: New Feature
> Components: hbck
> Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
> Reporter: Jonathan Hsieh
> Assignee: Jonathan Hsieh
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch,
>   hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch,
>   hbase-5128-0.94-v4.patch, hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch,
>   hbase-5128-v3.patch, hbase-5128-v4.patch
>
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most region
> consistency and table integrity invariant violations. However, with '-fix' it
> can only automatically repair region consistency cases having to do with
> deployment problems. This updated version should be able to handle all cases
> (including a new orphan regiondir case). When complete, it will likely deprecate
> the OfflineMetaRepair tool and subsume several open META-hole related issues.
> Here's the approach (from the comment at the top of the new version of the
> file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
>  * table integrity.
>  *
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) all are in
>  * accordance.
>  *
>  * Table integrity checks verify that all possible row keys can resolve to
>  * exactly one region of a table. This means there are no individual degenerate
>  * or backwards regions; no holes between regions; and that there are no
>  * overlapping regions.
>  *
>  * The general repair strategy works in these steps.
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  *
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files. Each table's integrity is then verified. If there
>  * are any orphan regions (regions with no .regioninfo files), or holes, new
>  * regions are fabricated. Backwards regions are sidelined as well as empty
>  * degenerate (endkey==startkey) regions. If there are any overlapping regions,
>  * a new region is created and all data is merged into the new region.
>  *
>  * Table integrity repairs deal solely with HDFS and can be done offline -- the
>  * hbase region servers or master do not need to be running. This phase can be
>  * used to completely reconstruct the META table in an offline fashion.
>  *
>  * Region consistency requires three conditions -- 1) valid .regioninfo file
>  * present in an hdfs region dir, 2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned to.
>  *
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully. Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
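
The table integrity invariant in the quoted class comment (every row key resolves to exactly one region, so no degenerate or backwards regions, no holes, no overlaps) can be illustrated with a small check over region key ranges. This is a sketch under stated assumptions, not the HBaseFsck implementation: TableIntegritySketch, Region and check are hypothetical names, keys are plain Strings compared lexicographically rather than HBase byte arrays, and an empty string stands for an open-ended start or end key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only; not the HBaseFsck implementation. */
final class TableIntegritySketch {

    /** Hypothetical stand-in for a region's key range; "" means open-ended. */
    static final class Region {
        final String name;
        final String startKey;
        final String endKey;

        Region(String name, String startKey, String endKey) {
            this.name = name;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Returns one message per integrity problem; an empty list means the table is sound. */
    static List<String> check(List<Region> regions) {
        List<String> problems = new ArrayList<>();
        List<Region> usable = new ArrayList<>();

        // Pass 1: flag individually malformed regions and set them aside.
        for (Region r : regions) {
            boolean openEnd = r.endKey.isEmpty();
            if (!openEnd && r.startKey.equals(r.endKey)) {
                problems.add("degenerate region: " + r.name);
            } else if (!openEnd && r.startKey.compareTo(r.endKey) > 0) {
                problems.add("backwards region: " + r.name);
            } else {
                usable.add(r);
            }
        }

        // Pass 2: walk the remaining regions in start-key order and track
        // how much of the keyspace is covered so far.
        usable.sort(Comparator.comparing((Region r) -> r.startKey));
        String covered = "";          // keys in ["", covered) are accounted for
        boolean coveredToEnd = false; // true once an open-ended region is seen
        for (Region r : usable) {
            if (coveredToEnd || r.startKey.compareTo(covered) < 0) {
                problems.add("overlap: " + r.name + " starts inside covered range");
            } else if (r.startKey.compareTo(covered) > 0) {
                problems.add("hole between '" + covered + "' and '" + r.startKey + "'");
            }
            if (r.endKey.isEmpty()) {
                coveredToEnd = true;
            } else if (r.endKey.compareTo(covered) > 0) {
                covered = r.endKey;
            }
        }
        if (!coveredToEnd) {
            problems.add("hole between '" + covered + "' and end of table");
        }
        return problems;
    }
}
```

The sketch only detects problems; the repairs the comment describes (fabricating regions for holes, sidelining backwards and degenerate regions, merging overlaps) are not modeled here.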