In HBase 2 you should never delete master proc WALs. Unlike in earlier releases, doing so will almost certainly damage the cluster. You are probably now in a known-bad state, independent of whatever your earlier issue was. I think we can fix it, though.

Some baseline info:

1) Did you follow the upgrade process to go from 2.0.z to 2.2.0? I can't link directly to the section due to HBASE-22010, but it's the first one here:

http://hbase.apache.org/book.html#_upgrade_paths

2) I think your meta issue is something we'll need HBCK2 to fix, so I'd like to work out what's not working for you there. We have not done a release of HBCK2 yet, so unfortunately you'll have to build it yourself. I think you've already realized that's non-trivial. We have, however, successfully gone through using it with prior releases. Can you tell me where in this high-level outline things fell down? Or where I should drill in more?

2a) Get the code from the git repo:

https://github.com/apache/hbase-operator-tools

2b) Build for use with the RC. It is important that you specify your HBase version:

    mvn -Dhbase.version=2.2.0 package

Note that since 2.2.0 hasn't been released yet, you'll need to tell Maven to point at the staged repository posted in the VOTE. E.g. save this gist

https://gist.github.com/busbey/ce2293e78440f060fa60aa2dcf1333f1

as ~/hbase-2.2.0rc0.settings.xml and then do:

    mvn --settings ~/hbase-2.2.0rc0.settings.xml -Dhbase.version=2.2.0 package

2c) Grab the jar from hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar and put it where you can access it on the cluster. Let's call it ~/hbase-hbck2-for-2.2.0.jar

2d) Run hbck2 on the cluster to verify that you get the correct help:

    hbase hbck -j ~/hbase-hbck2-for-2.2.0.jar

3) Are there outstanding procedures? When the master isn't finishing initialization, what does it print out about the meta region?

On Thu, Mar 7, 2019 at 1:57 PM Jean-Marc Spaggiari <[email protected]> wrote:
>
> Sure! here it is!
>
> I cleaned all WALs (old, master, etc.) and it seems to be a bit more clean
> now but it's still stuck trying to assign the META table.
>
> 2019-03-07 14:50:35,286 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:36,287 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:38,287 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:42,288 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:50,289 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:51:06,290 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:51:29,765 INFO
> [ReadOnlyZKClient-latitude.distparser.com:2181@0x71707c27]
> zookeeper.ZooKeeper: Session: 0x16911bd542a00fa closed
> 2019-03-07 14:51:29,766 INFO
> [ReadOnlyZKClient-latitude.distparser.com:2181@0x71707c27-EventThread]
> zookeeper.ClientCnxn: EventThread shut down for session: 0x16911bd542a00fa
> 2019-03-07 14:51:38,292 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:52:42,292 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:54:50,293 WARN [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980,
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
>
> I will keep it running for some time and see if it ends up doing
> something...
>
> JMS
>
>
> On Thu, Mar 7, 2019 at 14:54, Sean Busbey <[email protected]> wrote:
>
> > JMS, could you start a new thread with your upgrade issue so we can go
> > through some things without pinging the VOTE thread?
> >
> >
> > On Thu, Mar 7, 2019 at 1:48 PM Jean-Marc Spaggiari
> > <[email protected]> wrote:
> > >
> > > Downloaded the version and checked the MD5SUM.
> > > Checked documentation and README
> > > Checked license => FAILED. *Too many files with unapproved license*
> > > Ran in standalone, checked logs and UI, ran some load, went well.
> > >
> > > I tried to deploy on top of 2.0.4 and it doesn't start.
> > > 2019-03-07 14:38:14,848 WARN [master/node2:60000.Chore.1]
> > > master.CatalogJanitor: CatalogJanitor is disabled! Enabled=true,
> > > maintenanceMode=false,
> > > am=org.apache.hadoop.hbase.master.assignment.AssignmentManager@7deaf821,
> > > metaLoaded=true, hasRIT=true clusterShutDown=false
> > > 2019-03-07 14:39:13,869 WARN [ProcExecTimeout]
> > > assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING,
> > > location=node1.distparser.com,16020,1551986838653, table=dns,
> > > region=bb65f685cdefc4f2491d246f376fc1f0
> > >
> > > Tried to disable the tables but I'm not able. Tried to move the regions but
> > > HBCK2 doesn't want.
> > > Caused by:
> > > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException):
> > > org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> > >
> > >
> > > Few things I found:
> > >
> > > Command line says:
> > >
> > > hbck    Run the HBase 'fsck' tool. Defaults read-only hbck1.
> > >         Pass '-j /path/to/HBCK2.jar' to run hbase-2.x HBCK2.
> > >
> > > However there is no HBCK2.jar. Google gave me
> > > https://github.com/apache/hbase-operator-tools but that's not trivial.
> > > Downloaded and built it, but the -j option doesn't seem to exist. Found
> > > that it should be -jar . Was still not working. After fighting with it I
> > > finally got it working by calling directly java org.apache.hbase.HBCK2
> > >
> > >
> > > I got this in the logs at some point:
> > > Caused by: java.lang.ClassNotFoundException:
> > > org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure
> > > at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> > > at java.lang.Class.forName0(Native Method)
> > > at java.lang.Class.forName(Class.java:264)
> > > at org.apache.hadoop.hbase.procedure2.ProcedureUtil.newProcedure(ProcedureUtil.java:50)
> > > ... 17 more
> > >
> > > I will keep fighting with it to try to get something working. I can of
> > > course rm -rm /hbase and get it run, but I would like to see if it can
> > > recover...
> > >
> > > Thanks,
> > >
> > > JMS
> > >
> > > On Thu, Mar 7, 2019 at 04:44, Guanghao Zhang <[email protected]> wrote:
> > >
> > > > Please vote on this release candidate (RC) for Apache HBase 2.2.0.
> > > > This is the first release of the branch-2.2 line.
> > > >
> > > > The VOTE will remain open for at least 72 hours.
> > > >
> > > > [ ] +1 Release this package as Apache HBase 2.2.0
> > > > [ ] -1 Do not release this package because ...
> > > >
> > > > The tag to be voted on is 2.2.0-RC0 (commit
> > > > 4ab2dc20f15e9b59477de4bd971c367f3ce342cb):
> > > >
> > > > https://github.com/apache/hbase/tree/2.2.0-RC0
> > > >
> > > > The release files, including signatures, digests, etc. can be found at:
> > > >
> > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/
> > > >
> > > > Maven artifacts are available in a staging repository at:
> > > >
> > > > https://repository.apache.org/content/repositories/orgapachehbase-1286
> > > >
> > > > Signatures used for HBase RCs can be found in this file:
> > > >
> > > > https://dist.apache.org/repos/dist/release/hbase/KEYS
> > > >
> > > > The list of bug fixes going into 2.2.0 can be found in included
> > > > CHANGES.md and RELEASENOTES.md available here:
> > > >
> > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/CHANGES.md
> > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/RELEASENOTES.md
> > > >
> > > > To learn more about Apache HBase, please see http://hbase.apache.org/
> > > >
> > > > Thanks,
> > > > Guanghao Zhang
> > > >
> >
