Kicking this thread....

 * Lots of progress on hbck2. It has some basic utility (see below) that has been useful, to me at least, hacking on a test cluster I've been doing damage to this last week or so. It exits with a complaint if run against an hbase that doesn't have support for hbck2 ops (i.e. < 2.0.3 or < 2.1.0), and it is itself versioned. I'll work on a bit of doc, and our Sean is working on making it easy to find and run over in HBASE-21215 <https://issues.apache.org/jira/browse/HBASE-21215>. We could cut a 1.0.0RC inside the next week or so, I'd say.
 * A bunch of messy stuff has been fixed over the last few weeks on the tip of branch-2.1 thanks to our Duo, Allan, and Jingyun, among others (and backported to branch-2.0 <= look for a 2.0.3RC soon after the 2.1.1RC...). In cluster testing, we're not looking bad.

So, I think a 2.1.1RC0 is not far off. If you want to help out, there are just a few outstanding issues [1]. If any are yours, please do an update (including moving it out of 2.1.1 if you don't think it will make it). The other area that needs love is failing unit tests. There are just a few. Pick one and have a go at it [2]. Let's try and get an RC0 up next week or so?

Thanks,
S

1. https://issues.apache.org/jira/projects/HBASE/versions/12343470
2. https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-2.1/ and https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-2.1/lastSuccessfulBuild/artifact/dashboard.html

Below is usage for HBCK2 as of today:

$ HBASE_CLASSPATH_PREFIX=~/checkouts/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar ./bin/hbase org.apache.hbase.HBCK2
usage: HBCK2 [OPTIONS] COMMAND <ARGS>

Options:
 -d,--debug                                 run with debug output
 -h,--help                                  output this help message
 -p,--hbase.zookeeper.property.clientPort   port of target hbase ensemble
 -q,--hbase.zookeeper.quorum <arg>          ensemble of target hbase
 -v,--version                               this hbck2 version
 -z,--zookeeper.znode.parent                parent znode of target hbase

Commands:
 assigns [OPTIONS] <ENCODED_REGIONNAME>...
   Options:
    -o,--override  override ownership by another procedure
   A 'raw' assign that can be used even during Master initialization.
   Skirts Coprocessors. Pass one or more encoded RegionNames.
   1588230740 is the hard-coded name for the hbase:meta region and
   de00010733901a05f5a2a3a382e27dd4 is an example of what a user-space
   encoded Region name looks like. For example:
     $ HBCK2 assign 1588230740 de00010733901a05f5a2a3a382e27dd4
   Returns the pid(s) of the created AssignProcedure(s) or -1 if none.

 bypass [OPTIONS] <PID>...
   Options:
    -o,--override   override if procedure is running/stuck
    -r,--recursive  bypass parent and its children. SLOW! EXPENSIVE!
    -w,--lockWait   milliseconds to wait on lock before giving up; default=1
   Pass one (or more) procedure 'pid's to skip to procedure finish.
   Parent of bypassed procedure will also be skipped to the finish.
   Entities will be left in an inconsistent state and will require
   manual fixup. May need Master restart to clear locks still held.
   Bypass fails if procedure has children. Add 'recursive' if all
   you have is a parent pid to finish parent and children. This
   is SLOW, and dangerous so use selectively. Does not always work.

 unassigns <ENCODED_REGIONNAME>...
   Options:
    -o,--override  override ownership by another procedure
   A 'raw' unassign that can be used even during Master initialization.
   Skirts Coprocessors. Pass one or more encoded RegionNames:
   1588230740 is the hard-coded name for the hbase:meta region and
   de00010733901a05f5a2a3a382e27dd4 is an example of what a user-space
   encoded Region name looks like. For example:
     $ HBCK2 unassign 1588230740 de00010733901a05f5a2a3a382e27dd4
   Returns the pid(s) of the created UnassignProcedure(s) or -1 if none.

 setTableState <TABLENAME> <STATE>
   Possible table states: ENABLED, DISABLED, DISABLING, ENABLING
   To read current table state, in the hbase shell run:
     hbase> get 'hbase:meta', '<TABLENAME>', 'table:state'
   A value of \x08\x00 == ENABLED, \x08\x01 == DISABLED, etc.
   An example making table name 'user' ENABLED:
     $ HBCK2 setTableState users ENABLED
   Returns whatever the previous table state was.

On Mon, Oct 8, 2018 at 4:34 PM Stack <st...@duboce.net> wrote:

> On Mon, Oct 8, 2018 at 4:01 PM Josh Elser <els...@apache.org> wrote:
>
>> Best place to find hbck2 issue needing review is off of HBASE-19121 or
>> somewhere else?
>>
>
> For 2.1.1 issues, see the 2.1.1 release listing:
> https://issues.apache.org/jira/projects/HBASE/versions/12343470
> Half these items are items turned up testing branch-2.1 and trying to
> use hbck2. Will link a few others.
>
>> All: please feel free to ping directly if you want/need reviews.
>>
>
> Will do.
>
> Thanks,
> S
>
>
>> On 10/5/18 7:41 PM, 张铎(Duo Zhang) wrote:
>> > Stack has a plan on the 2.1.1 release where we want to finish the
>> > first version on hbck2. In the real deploy we have met a stuck
>> > cluster several times, and lots of users have asked that why hbck
>> > can not work any more...
>> >
>> > So the current opening issue is not important, please help reviewing
>> > the patches for hbck2 to speed up the release...
>> >
>> > Thanks for bringing this up
>> >
>> > On Fri, Oct 5, 2018 at 23:53, Mike Drob <md...@apache.org> wrote:
>> >
>> >> Devs,
>> >>
>> >> It's been almost 3 months since 2.1.0 was released (Jul 19) and we
>> >> have 150 commits on branch-2.1 in that time. What do folks think of
>> >> getting a release going? I know that there's been some discussion
>> >> around the HBCK2 stuff landing, but I feel like the conversation has
>> >> gotten a bit lost without an actual release to relate to.
>> >>
>> >> Duo, as the 2.1.0 release manager, are you interested in maintaining
>> >> the 2.1 branch release cadence? If you've gotten busy, then let's
>> >> find another volunteer.
>> >>
>> >> There are 18 issues open or in progress currently. Only one is
>> >> labelled blocker, and five more are critical -- let's evaluate these
>> >> and the rest to figure out what we need for a release to happen. I
>> >> went ahead and created a 2.1.2 version in Jira so that we have
>> >> somewhere to move issues that aren't getting done soon.
>> >>
>> >> Meanwhile, I think we also need to look at test stabilization --
>> >> there's 15 tests on the dashboard that might need attention.
>> >>
>> >> Mike
>> >>
>> >
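
A footnote on the setTableState usage above: the 'table:state' value comes back from the hbase shell as escaped bytes, so a small decoder can save some squinting. The following is only a rough sketch, not part of hbck2. The helper names (parse_shell_escaped, decode_table_state) are made up for illustration, and it only knows the two values the usage text spells out (\x08\x00 and \x08\x01); anything else is reported as UNKNOWN rather than guessed.

```python
# Sketch only: decode the 'table:state' cell value shown by the hbase shell.
# The two mappings below are the ones the usage text spells out; any other
# value is reported as UNKNOWN rather than guessed.
KNOWN_STATES = {
    b"\x08\x00": "ENABLED",
    b"\x08\x01": "DISABLED",
}


def parse_shell_escaped(text: str) -> bytes:
    """Turn the escaped text printed by the shell into raw bytes."""
    # unicode_escape interprets the backslash-x escapes; latin-1 maps each
    # resulting code point (0-255) back to a single byte.
    return text.encode("ascii").decode("unicode_escape").encode("latin-1")


def decode_table_state(raw: bytes) -> str:
    """Map a raw 'table:state' value to a state name (hypothetical helper)."""
    return KNOWN_STATES.get(raw, "UNKNOWN")


if __name__ == "__main__":
    # Paste the value exactly as the shell printed it, as a raw string.
    print(decode_table_state(parse_shell_escaped(r"\x08\x01")))
```

Paste the value as a raw string so Python does not interpret the backslashes before the helper sees them.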