Ok. First, thanks to all of you for taking a look at that!

Sean, I didn't follow the steps. 2.0.4 was failing and not able to start. I
got the recommendation to remove the master WALs, which I did. So I didn't
set the procedure upgrade tag, because there was no longer any data on the
procedure side. Should I still set it even though I wiped everything?


"Can you tell me where in this high level things fell down? Or where I
should drill in more?"

It's hard to say. I think I tried way too many things to be able to point
to something specific. HBCK2 is definitely one place where I struggled, so
the instructions you sent are very helpful.

Right now, I still have a 2.2.0 instance that doesn't want to start. I can
wipe /hbase in both ZK and HDFS and I'm sure it will run, but I'm
interested in figuring out how to get it back to a stable state, instead of
taking the easy path.



@Allan: I removed the znodes for both meta and namespace, and they still
can't be assigned. Thanks for the suggestion.

@Wellington: I followed your proposed steps, but I still get
PleaseHoldException: Master is initializing. I don't see any difference
with or without this parameter.

2019-03-12 10:37:08,385 WARN  [master/node2:60000:becomeActiveMaster]
master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
state=OPEN, ts=1552400910746, server=node3,60000,-1};
ServerCrashProcedures=true. Master startup cannot progress, in
holding-pattern until region onlined.

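In case it is useful for comparing with the log from last week, here is a
small sketch that pulls the recorded meta location out of that WARN line. It
is plain sed, nothing HBase-specific, and the function name meta_server is
just something made up for illustration:

```shell
# Sample line copied from the master log above (joined onto one line).
line='2019-03-12 10:37:08,385 WARN  [master/node2:60000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1552400910746, server=node3,60000,-1}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.'

# Hypothetical helper: print the server (host,port,startcode) that the
# master has recorded for the meta region in a "is NOT online" WARN line.
meta_server() {
  printf '%s\n' "$1" | sed -n 's/.*server=\([^}]*\)}.*/\1/p'
}

meta_server "$line"
```

For the line above this prints node3,60000,-1, whereas the log I sent on
March 7 had server=node5.distparser.com,16020,1551986838747.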

I will keep trying to get this cluster started. It helps to understand the
new constraints...

Thanks again all,

JM


On Fri, Mar 8, 2019 at 06:36, Wellington Chevreuil <
[email protected]> wrote:

> JMS, if you are still stuck assigning the meta table, and assuming you
> managed to get an hbck2 jar built, you can try putting the master into
> maintenance mode by setting "hbase.master.maintenance_mode" to "true" in the
> master's hbase-site.xml, restarting the master, and then manually bringing
> meta online with the hbck2 command below:
>
> $ hbase hbck -j /path/to/hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 1588230740
>
>
> On Fri, Mar 8, 2019 at 08:07, Allan Yang <[email protected]>
> wrote:
>
> > Try to delete meta Znode from Zookeeper, and restart master.
> > Best Regards
> > Allan Yang
> >
> >
> > On Fri, Mar 8, 2019 at 4:37 AM, Sean Busbey <[email protected]> wrote:
> >
> > > In HBase 2 you should never delete master proc WALs. Unlike in earlier
> > > releases, it will almost certainly damage the cluster. Probably now
> > > you are in a known-bad state independent of whatever your earlier
> > > issue was. I think we can fix it, though.
> > >
> > > Some baseline info:
> > >
> > > 1) Did you follow the upgrade process to go from 2.0.z to 2.2.0?
> > >
> > > I can't link directly to the section due to HBASE-22010, but it's the
> > > first one here:
> > >
> > > http://hbase.apache.org/book.html#_upgrade_paths
> > >
> > > 2) I think your meta issue is something we'll need HBCK2 to fix, so
> > > I'd like to work out what's not working for you there.
> > >
> > > We have not done a release of HBCK2 yet, so unfortunately you'll have
> > > to build it yourself. I think you've already realized that's
> > > non-trivial. We have, however, successfully gone through using it with
> > > prior releases.
> > >
> > > Can you tell me where in this high level things fell down? Or where I
> > > should drill in more?
> > >
> > > 2a) Get the code from the git repo:
> > > https://github.com/apache/hbase-operator-tools
> > > 2b) Build for use with the RC. It is important that you specify your
> > > hbase version
> > >
> > > mvn -Dhbase.version=2.2.0 package
> > >
> > > Note that since 2.2.0 hasn't been released yet, you'll need to tell
> > > maven to point at the staged repository posted in the VOTE. e.g. save
> > > this gist
> > >
> > > https://gist.github.com/busbey/ce2293e78440f060fa60aa2dcf1333f1
> > >
> > >  as ~/hbase-2.2.0rc0.settings.xml and then do
> > >
> > > mvn --settings ~/hbase-2.2.0rc0.settings.xml -Dhbase.version=2.2.0 package
> > >
> > > 2c) Grab the jar from
> > > hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar and put it where you
> > > can access it on the cluster. Let's call it
> > > ~/hbase-hbck2-for-2.2.0.jar
> > >
> > > 2d) run hbck2 on the cluster to verify that you get the correct help
> > >
> > > hbase hbck -j ~/hbase-hbck2-for-2.2.0.jar
> > >
> > > 3) Are there outstanding procedures? When the master isn't finishing
> > > initialization, what does it print out about the meta region?
> > >
> > >
> > >
> > > On Thu, Mar 7, 2019 at 1:57 PM Jean-Marc Spaggiari
> > > <[email protected]> wrote:
> > > >
> > > > Sure! here it is!
> > > >
> > > > I cleaned all WALs (old, master, etc.) and it seems to be a bit cleaner
> > > > now, but it's still stuck trying to assign the META table.
> > > >
> > > > 2019-03-07 14:50:35,286 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:50:36,287 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:50:38,287 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:50:42,288 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:50:50,289 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:51:06,290 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:51:29,765 INFO
> > > > [ReadOnlyZKClient-latitude.distparser.com:2181@0x71707c27]
> > > > zookeeper.ZooKeeper: Session: 0x16911bd542a00fa closed
> > > > 2019-03-07 14:51:29,766 INFO
> > > > [ReadOnlyZKClient-latitude.distparser.com:2181@0x71707c27-EventThread]
> > > > zookeeper.ClientCnxn: EventThread shut down for session: 0x16911bd542a00fa
> > > > 2019-03-07 14:51:38,292 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:52:42,292 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > > 2019-03-07 14:54:50,293 WARN  [master/node2:60000:becomeActiveMaster]
> > > > master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740
> > > > state=OPEN, ts=1551988229980, server=node5.distparser.com,16020,1551986838747};
> > > > ServerCrashProcedures=false. Master startup cannot progress, in
> > > > holding-pattern until region onlined.
> > > >
> > > > I will keep it running for some time and see if it ends up doing
> > > > something...
> > > >
> > > > JMS
> > > >
> > > >
> > > > On Thu, Mar 7, 2019 at 14:54, Sean Busbey <[email protected]> wrote:
> > > >
> > > > > JMS, could you start a new thread with your upgrade issue so we can go
> > > > > through some things without pinging the VOTE thread?
> > > > >
> > > > >
> > > > > On Thu, Mar 7, 2019 at 1:48 PM Jean-Marc Spaggiari
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > Downloaded the version and checked the MD5SUM.
> > > > > > Checked documentation and README
> > > > > > Checked license => FAILED. *Too many files with unapproved license*
> > > > > > Ran in standalone, checked logs and UI, ran some load, went well.
> > > > > >
> > > > > > I tried to deploy on top of 2.0.4 and it doesn't start.
> > > > > > 2019-03-07 14:38:14,848 WARN  [master/node2:60000.Chore.1]
> > > > > > master.CatalogJanitor: CatalogJanitor is disabled! Enabled=true,
> > > > > > maintenanceMode=false,
> > > > > > am=org.apache.hadoop.hbase.master.assignment.AssignmentManager@7deaf821,
> > > > > > metaLoaded=true, hasRIT=true clusterShutDown=false
> > > > > > 2019-03-07 14:39:13,869 WARN  [ProcExecTimeout]
> > > > > > assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING,
> > > > > > location=node1.distparser.com,16020,1551986838653, table=dns,
> > > > > > region=bb65f685cdefc4f2491d246f376fc1f0
> > > > > >
> > > > > > Tried to disable the tables but wasn't able to. Tried to move the
> > > > > > regions, but HBCK2 won't do it.
> > > > > > Caused by:
> > > > > > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException):
> > > > > > org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> > > > > >
> > > > > >
> > > > > > A few things I found:
> > > > > >
> > > > > > Command line says:
> > > > > >
> > > > > >   hbck            Run the HBase 'fsck' tool. Defaults read-only hbck1.
> > > > > >                   Pass '-j /path/to/HBCK2.jar' to run hbase-2.x HBCK2.
> > > > > >
> > > > > > However, there is no HBCK2.jar. Google gave me
> > > > > > https://github.com/apache/hbase-operator-tools but that's not trivial.
> > > > > > Downloaded and built it, but the -j option doesn't seem to exist. Found
> > > > > > that it should be -jar. It was still not working. After fighting with
> > > > > > it, I finally got it working by calling java org.apache.hbase.HBCK2
> > > > > > directly.
> > > > > >
> > > > > >
> > > > > > I got this in the logs at some point:
> > > > > > Caused by: java.lang.ClassNotFoundException:
> > > > > > org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure
> > > > > > at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> > > > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> > > > > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
> > > > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> > > > > > at java.lang.Class.forName0(Native Method)
> > > > > > at java.lang.Class.forName(Class.java:264)
> > > > > > at org.apache.hadoop.hbase.procedure2.ProcedureUtil.newProcedure(ProcedureUtil.java:50)
> > > > > > ... 17 more
> > > > > >
> > > > > > I will keep fighting with it to try to get something working. I can of
> > > > > > course rm -rm /hbase and get it running, but I would like to see if it
> > > > > > can recover...
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > JMS
> > > > > >
> > > > > > On Thu, Mar 7, 2019 at 04:44, Guanghao Zhang <[email protected]> wrote:
> > > > > >
> > > > > > > Please vote on this release candidate (RC) for Apache HBase 2.2.0.
> > > > > > > This is the first release of the branch-2.2 line.
> > > > > > >
> > > > > > > The VOTE will remain open for at least 72 hours.
> > > > > > >
> > > > > > > [ ] +1 Release this package as Apache HBase 2.2.0
> > > > > > > [ ] -1 Do not release this package because ...
> > > > > > >
> > > > > > > The tag to be voted on is 2.2.0-RC0 (commit
> > > > > > > 4ab2dc20f15e9b59477de4bd971c367f3ce342cb):
> > > > > > >
> > > > > > >  https://github.com/apache/hbase/tree/2.2.0-RC0
> > > > > > >
> > > > > > > The release files, including signatures, digests, etc. can be found at:
> > > > > > >
> > > > > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/
> > > > > > >
> > > > > > > Maven artifacts are available in a staging repository at:
> > > > > > >
> > > > > > > https://repository.apache.org/content/repositories/orgapachehbase-1286
> > > > > > >
> > > > > > > Signatures used for HBase RCs can be found in this file:
> > > > > > >
> > > > > > > https://dist.apache.org/repos/dist/release/hbase/KEYS
> > > > > > >
> > > > > > > The list of bug fixes going into 2.2.0 can be found in included
> > > > > > > CHANGES.md and RELEASENOTES.md available here:
> > > > > > >
> > > > > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/CHANGES.md
> > > > > > > https://dist.apache.org/repos/dist/dev/hbase/2.2.0RC0/RELEASENOTES.md
> > > > > > >
> > > > > > > To learn more about Apache HBase, please see http://hbase.apache.org/
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Guanghao Zhang
> > > > > > >
> > > > >
> > >
> >
>
