In HBase 2 you should never delete the master proc WALs. Unlike in
earlier releases, doing so will almost certainly damage the cluster.
You are probably now in a known-bad state, independent of whatever
your earlier issue was. I think we can fix it, though.

Some baseline info:

1) Did you follow the upgrade process to go from 2.0.z to 2.2.0?

I can't link directly to the section due to HBASE-22010, but it's the
first one here:

http://hbase.apache.org/book.html#_upgrade_paths

2) I think your meta issue is something we'll need HBCK2 to fix, so
I'd like to work out what's not working for you there.

We have not done a release of HBCK2 yet, so unfortunately you'll have
to build it yourself. I think you've already realized that's
non-trivial. We have, however, successfully gone through using it with
prior releases.

Can you tell me where in this high-level outline things fell down, or
where I should drill in more?

2a) Get the code from the git repo:
https://github.com/apache/hbase-operator-tools
2b) Build for use with the RC. It is important that you specify your
HBase version:

mvn -Dhbase.version=2.2.0 package

Note that since 2.2.0 hasn't been released yet, you'll need to tell
maven to point at the staged repository posted in the VOTE. e.g. save
this gist

https://gist.github.com/busbey/ce2293e78440f060fa60aa2dcf1333f1

as ~/hbase-2.2.0rc0.settings.xml and then do

mvn --settings ~/hbase-2.2.0rc0.settings.xml -Dhbase.version=2.2.0 package

2c) Grab the jar from
hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar and put it where you
can access it on the cluster. Let's call it
~/hbase-hbck2-for-2.2.0.jar

2d) Run HBCK2 on the cluster to verify that you get the correct help output:

hbase hbck -j ~/hbase-hbck2-for-2.2.0.jar
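
For reference, steps 2a through 2d could be strung together roughly as in
the sketch below. This is only an illustration, not a tested procedure: the
`run` helper, the `DRY_RUN` switch, and the function name are made up for
this example. It assumes the gist has already been saved as the settings
file, and that you build on a machine that is also a cluster node, so a
plain `cp` is enough to place the jar. By default it only prints the
commands.

```shell
# Sketch of steps 2a-2d. Names below (run, DRY_RUN, build_and_check_hbck2)
# are hypothetical helpers for this example, not part of HBase or HBCK2.
HBASE_VERSION="2.2.0"
SETTINGS_FILE="$HOME/hbase-2.2.0rc0.settings.xml"   # the gist, saved locally
HBCK2_JAR="$HOME/hbase-hbck2-for-2.2.0.jar"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

build_and_check_hbck2() {
  # 2a) get the code
  run git clone https://github.com/apache/hbase-operator-tools &&
  run cd hbase-operator-tools &&
  # 2b) build against the RC, using the staged repository settings
  run mvn --settings "$SETTINGS_FILE" -Dhbase.version="$HBASE_VERSION" package &&
  # 2c) assumption: the build machine is a cluster node, so cp is enough
  run cp hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar "$HBCK2_JAR" &&
  # 2d) should print the HBCK2 help
  run hbase hbck -j "$HBCK2_JAR"
}
```

Source the file and call `build_and_check_hbck2` to see the commands; set
`DRY_RUN=0` to actually run them.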

3) Are there outstanding procedures? When master isn't finishing
initialization, what does it print out about the meta region?
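
If it helps, here is a small sketch for boiling the repeated "is NOT online"
messages down to the distinct meta states the master reports. It assumes the
log lines have the same shape as the ones quoted below (unwrapped, one
message per line); the function name is made up for this example.

```shell
# Hypothetical helper: reads master log text on stdin and prints each
# distinct combination of meta region state, hosting server, and
# ServerCrashProcedures flag found in "is NOT online" messages.
meta_state_summary() {
  grep 'hbase:meta.* is NOT online' \
    | sed -E 's/.*state=([A-Z_]+), ts=[0-9]+, server=([^}]+)[}]; ServerCrashProcedures=([a-z]+).*/state=\1 server=\2 ServerCrashProcedures=\3/' \
    | sort -u
}
```

Pipe the master log into it, for example `meta_state_summary <
/path/to/master.log`, where the path is a placeholder for wherever your
master log lives.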



On Thu, Mar 7, 2019 at 1:57 PM Jean-Marc Spaggiari
<[email protected]> wrote:
>
> Sure! here it is!
>
> I cleaned all WALs (old, master, etc.) and it seems to be a bit cleaner
> now, but it's still stuck trying to assign the META table.
>
> 2019-03-07 14:50:35,286 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:36,287 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:38,287 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:42,288 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:50:50,289 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:51:06,290 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:51:29,765 INFO
> [ReadOnlyZKClient-latitude.distparser.com:2181@0x71707c27]
> zookeeper.ZooKeeper: Session: 0x16911bd542a00fa closed
> 2019-03-07 14:51:29,766 INFO
> [ReadOnlyZKClient-latitude.distparser.com:2181@0x71707c27-EventThread]
> zookeeper.ClientCnxn: EventThread shut down for session: 0x16911bd542a00fa
> 2019-03-07 14:51:38,292 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:52:42,292 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region onlined.
> 2019-03-07 14:54:50,293 WARN  [master/node2:60000:becomeActiveMaster]
> master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> state=OPEN, ts=1551988229980, 
> server=node5.distparser.com,16020,1551986838747};
> ServerCrashProcedures=false. Master startup cannot progress, in
> holding-pattern until region
> onlined.
>
> I will keep it running for some time and see if it ends up doing
> something...
>
> JMS
>
>
> On Thu, Mar 7, 2019 at 14:54, Sean Busbey <[email protected]> wrote:
>
> > JMS, could you start a new thread with your upgrade issue so we can go
> > through some things without pinging the VOTE thread?
> >
> >
> > On Thu, Mar 7, 2019 at 1:48 PM Jean-Marc Spaggiari
> > <[email protected]> wrote:
> > >
> > > Downloaded the version and checked the MD5SUM.
> > > Checked documentation and README
> > > Checked license => FAILED. *Too many files with unapproved license*
> > > Ran in standalone, checked logs and UI, ran some load, went well.
> > >
> > > I tried to deploy on top of 2.0.4 and it doesn't start.
> > > 2019-03-07 14:38:14,848 WARN  [master/node2:60000.Chore.1]
> > > master.CatalogJanitor: CatalogJanitor is disabled! Enabled=true,
> > > maintenanceMode=false,
> > > am=org.apache.hadoop.hbase.master.assignment.AssignmentManager@7deaf821,
> > > metaLoaded=true, hasRIT=true clusterShutDown=false
> > > 2019-03-07 14:39:13,869 WARN  [ProcExecTimeout]
> > > assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING,
> > > location=node1.distparser.com,16020,1551986838653, table=dns,
> > > region=bb65f685cdefc4f2491d246f376fc1f0
> > >
> > > Tried to disable the tables but I'm not able. Tried to move the regions
> > > but HBCK2 doesn't want.
> > > Caused by:
> > >
> > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException):
> > > org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> > >
> > >
> > > Few things I found:
> > >
> > > Command line says:
> > >
> > >   hbck            Run the HBase 'fsck' tool. Defaults read-only hbck1.
> > >                   Pass '-j /path/to/HBCK2.jar' to run hbase-2.x HBCK2.
> > >
> > > However there is no HBCK2.jar. Google gave me
> > > https://github.com/apache/hbase-operator-tools but that's not trivial.
> > > Downloaded and built it, but the -j option doesn't seem to exist. Found
> > > that it should be -jar . Was still not working. After fighting with it I
> > > finally got it working by calling directly java org.apache.hbase.HBCK2
> > >
> > >
> > > I got this in the logs at some point:
> > > Caused by: java.lang.ClassNotFoundException:
> > > org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure
> > > at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> > > at java.lang.Class.forName0(Native Method)
> > > at java.lang.Class.forName(Class.java:264)
> > > at
> > >
> > org.apache.hadoop.hbase.procedure2.ProcedureUtil.newProcedure(ProcedureUtil.java:50)
> > > ... 17 more
> > >
> > > I will keep fighting with it to try to get something working. I can of
> > > course rm -rm /hbase and get it run, but I would like to see if it can
> > > recover...
> > >
> > > Thanks,
> > >
> > > JMS
> > >
> > > On Thu, Mar 7, 2019 at 04:44, Guanghao Zhang <[email protected]> wrote:
> > >
> > > > Please vote on this release candidate (RC) for Apache HBase 2.2.0.
> > > > This is the first release of the branch-2.2 line.
> > > >
> > > > The VOTE will remain open for at least 72 hours.
> > > >
> > > > [ ] +1 Release this package as Apache HBase 2.2.0
> > > > [ ] -1 Do not release this package because ...
> > > >
> > > > The tag to be voted on is 2.2.0-RC0 (commit
> > > > 4ab2dc20f15e9b59477de4bd971c367f3ce342cb):
> > > >
> > > >  https://github.com/apache/hbase/tree/2.2.0-RC0
> > > >
> > > > The release files, including signatures, digests, etc. can be found at:
> > > >
> > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/
> > > >
> > > > Maven artifacts are available in a staging repository at:
> > > >
> > > > https://repository.apache.org/content/repositories/orgapachehbase-1286
> > > >
> > > > Signatures used for HBase RCs can be found in this file:
> > > >
> > > > https://dist.apache.org/repos/dist/release/hbase/KEYS
> > > >
> > > > The list of bug fixes going into 2.2.0 can be found in included
> > > > CHANGES.md and RELEASENOTES.md available here:
> > > >
> > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/CHANGES.md
> > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/RELEASENOTES.md
> > > >
> > > > To learn more about Apache HBase, please see http://hbase.apache.org/
> > > >
> > > > Thanks,
> > > > Guanghao Zhang
> > > >
> >
