"1588230740" is the meta region name, not the namespace region's. Meta appears
to be already online, per the log below:
...
2019-03-12 20:53:41,037 INFO  [master/node2:60000:becomeActiveMaster]
master.HMaster: hbase:meta {1588230740 state=OPEN, ts=1552438420570, server=
node7.distparser.com,16020,1552421510124}
...

The maintenance mode I suggested earlier was meant to have the master do only
the minimum required work while trying to get meta/namespace online, but I
guess it wouldn't be able to avoid such timeouts. The message below also means
the AM could read the meta table, another indication that meta is fine:
...
2019-03-12 20:53:45,942 INFO  [master/node2:60000:becomeActiveMaster]
assignment.AssignmentManager: Joined the cluster in 308msec
...

The issue now is the namespace table. For some reason, the AM is not able to
kick off APs before the 5-minute timeout expires, and that's probably why the
namespace table never becomes available:
...
2019-03-12 20:53:45,942 INFO  [master/node2:60000:becomeActiveMaster]
assignment.AssignmentManager: Joined the cluster in 308msec
2019-03-12 20:54:45,725 INFO
[ReadOnlyZKClient-latitude.distparser.com:2181@0x7ea9b2c0]
zookeeper.ZooKeeper: Session: 0x16911bd542a02a2 closed
2019-03-12 20:54:45,725 INFO
[ReadOnlyZKClient-latitude.distparser.com:2181@0x7ea9b2c0-EventThread]
zookeeper.ClientCnxn: EventThread shut down for session: 0x16911bd542a02a2
2019-03-12 20:58:46,603 ERROR [master/node2:60000:becomeActiveMaster]
master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service
ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
...
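
As a quick sanity check on the timing, the gap between the "Joined the
cluster" line and the "Failed to become active master" line can be compared
with the 300000ms timeout reported in the exception quoted further down. This
is only a sketch: the helper name elapsed_seconds is made up for illustration,
and the two timestamps are copied from the log excerpt above.

```python
from datetime import datetime

# Timestamp layout used by the master log lines quoted in this thread.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Seconds between two log timestamps such as '2019-03-12 20:53:45,942'."""
    t0 = datetime.strptime(start, LOG_TS_FORMAT)
    t1 = datetime.strptime(end, LOG_TS_FORMAT)
    return (t1 - t0).total_seconds()


joined = "2019-03-12 20:53:45,942"  # AssignmentManager: Joined the cluster
failed = "2019-03-12 20:58:46,603"  # HMaster: Failed to become active master
print(elapsed_seconds(joined, failed))
```

The gap comes out just over 300 seconds, which is consistent with the master
giving up once the namespace wait times out.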

You may be able to force the namespace region online with the hbck2 assigns
command. You would first need to find the namespace region name: you can
either scan the meta table or check the region dir name in HDFS with "hdfs
dfs -ls -R /hbase | grep namespace", in order to pass it as a param for
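
To make the region dir lookup less error-prone, here is a rough sketch that
pulls region dir names out of saved "hdfs dfs -ls -R" output. It assumes the
usual /hbase/data/<namespace>/<table>/<region> layout and that region dirs are
either 32 hex characters or the special meta name 1588230740. The function
name region_names is hypothetical, and the sample lines are taken from the
meta listing quoted later in this thread.

```python
import re

# Assumed layout: /hbase/data/<namespace>/<table>/<encoded-region-name>
REGION_DIR = re.compile(
    r"/hbase/data/(?P<ns>[^/\s]+)/(?P<table>[^/\s]+)/"
    r"(?P<region>[0-9a-f]{32}|1588230740)(?:/|$)"
)


def region_names(ls_output, namespace, table):
    """Return region dir names found for namespace:table, in first-seen order."""
    found = []
    for line in ls_output.splitlines():
        match = REGION_DIR.search(line.strip())
        if match and match["ns"] == namespace and match["table"] == table:
            if match["region"] not in found:
                found.append(match["region"])
    return found


# Sample lines from the hbase:meta listing in this thread.
sample = """\
drwxr-xr-x   - hbase supergroup  0 2019-03-12 15:42 /hbase/data/hbase/meta/.tabledesc
drwxr-xr-x   - hbase supergroup  0 2019-03-12 15:49 /hbase/data/hbase/meta/1588230740
-rw-r--r--   3 hbase supergroup 32 2019-03-12 15:40 /hbase/data/hbase/meta/1588230740/.regioninfo
"""
print(region_names(sample, "hbase", "meta"))       # the meta region only
print(region_names(sample, "hbase", "namespace"))  # nothing: wrong directory listed
```

Run against a listing that includes the namespace table's directory, the
second call should return the name to try with the assigns command.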

On Wed, Mar 13, 2019 at 13:00, Jean-Marc Spaggiari <
[email protected]> wrote:

> Hi Sean,
>
> I tried. I looked up the region name for hbase:namespace like this:
>
> hdfs dfs -ls /hbase/data/hbase/meta/
>
> And found the region to be 1588230740.
>
> The master dies after 5 minutes, so I start the master, wait 2 minutes to
> be sure it's up, and run the following command:
>
> bin/hbase hbck -j
> test/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar
> assigns 1588230740
>
> But HBCK2 doesn't like it:
> 08:57:35.273 [main] INFO
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl - Call exception,
> tries=9, retries=16, started=29322 ms ago, cancelled=false,
> msg=org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> at
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3057)
> at
>
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:942)
> at
>
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>
>
> It keeps retrying, and after 16 tries it stops, saying the master is not
> initialized.
>
> On the WebUI I can see that there is a single region assigned, the META
> region.
>
> Also, here is the HDFS structure of my META table. It looks like some parts
> got lost in the process (the info content).
>
> hbase@node2:~/hbase-2.2.0$ hdfs dfs -ls -R /hbase/data/hbase/meta/
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:42
> /hbase/data/hbase/meta/.tabledesc
> -rw-r--r--   3 hbase supergroup       1447 2019-03-12 15:42
> /hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:42
> /hbase/data/hbase/meta/.tmp
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:49
> /hbase/data/hbase/meta/1588230740
> -rw-r--r--   3 hbase supergroup         32 2019-03-12 15:40
> /hbase/data/hbase/meta/1588230740/.regioninfo
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:40
> /hbase/data/hbase/meta/1588230740/info
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:40
> /hbase/data/hbase/meta/1588230740/recovered.edits
> -rw-r--r--   3 hbase supergroup          0 2019-03-12 15:40
> /hbase/data/hbase/meta/1588230740/recovered.edits/2.seqid
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:42
> /hbase/data/hbase/meta/1588230740/rep_barrier
> drwxr-xr-x   - hbase supergroup          0 2019-03-12 15:47
> /hbase/data/hbase/meta/1588230740/table
> -rw-r--r--   3 hbase supergroup       5454 2019-03-12 15:47
> /hbase/data/hbase/meta/1588230740/table/b65e8774ff284e77bf22641de36110cc
>
> What will be the next best step?
>
> Thanks,
>
> JMS
>
>
>
> On Wed, Mar 13, 2019 at 08:45, Sean Busbey <[email protected]> wrote:
>
> > Okay so master thinks hbase:namespace is already enabled, but no RS
> > believes it should be hosting the regions.
> >
> > Can you find the region name for the hbase:namespace region and issue
> > an hbck2 assigns command for it?
> >
> > On Tue, Mar 12, 2019 at 8:26 PM Jean-Marc Spaggiari
> > <[email protected]> wrote:
> > >
> > > It doesn't say that much :(
> > >
> > > hbase@node2:~/hbase-2.2.0$ cat logs/hbase-hbase-master-node2.log  |
> > grep -i
> > > namespace
> > > Caused by: java.io.IOException: Timedout 300000ms waiting for namespace
> > > table to be assigned and enabled: tableName=hbase:namespace,
> > state=ENABLED
> > > at
> > >
> >
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108)
> > > Caused by: java.io.IOException: Timedout 300000ms waiting for namespace
> > > table to be assigned and enabled: tableName=hbase:namespace,
> > state=ENABLED
> > > at
> > >
> >
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108)
> > >
> > > I cleared the logs before restarting the instance. That is all it says
> > > about namespace.
> > >
> > > Full logs are available there: https://pastebin.com/9j2Rzdcg
> > >
> > > On Tue, Mar 12, 2019 at 20:47, Sean Busbey <[email protected]> wrote:
> > >
> > > > okay so the master spent ~5 minutes waiting to see if it could get the
> > > > namespace table working. when it couldn't it aborted.
> > > >
> > > > can you look back over that 5 minutes and see what the master had to
> > > > say about the namespace table? did the master think some particular
> > > > server should have it open already? was it waiting for someone to
> > > > finish opening or closing it?
> > > >
> > > > On Tue, Mar 12, 2019 at 6:39 PM Jean-Marc Spaggiari
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Tue, Mar 12, 2019 at 19:25, Sean Busbey <[email protected]> wrote:
> > > > >
> > > > > > your command above points at the wrong jar from the hbck2 repo. it's
> > > > > > pointing at the one where you need to manually assemble all the
> > > > > > dependencies it has.
> > > > > >
> > > > > > You want the one that does not say "original" in the name.
> > > > > >
> > > > > >
> > > > >
> > > > > Ha!!! That's why! Way easier ;)
> > > > >
> > > > > indeed, this works even without removing all environment variables:
> > > > >  bin/hbase hbck -j
> > > > >
> > > >
> >
> test/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > Can you confirm it's the one in
> > > > > > > > the bin tarball? what does the version command output? What
> > does
> > > > the
> > > > > > > > mapredcp command output? What does the cli help for the hbase
> > > > command
> > > > > > > > show?
> > > > > > > >
> > > > > > >
> > > > > > >  hbase@node2:~/hbase-2.2.0$ hbase mapredcp
> > > > > > >
> > > > > >
> > > >
> >
> /home/hbase/hbase-2.2.0/bin/../lib/shaded-clients/hbase-shaded-mapreduce-2.2.0.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/audience-annotations-0.5.0.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/commons-logging-1.2.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/findbugs-annotations-1.3.9-1.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/log4j-1.2.17.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/slf4j-api-1.7.25.jar
> > > > > > >
> > > > > >
> > > > > > that looks great now.
> > > > > >
> > > > > >
> > > > > > Once you correct the hbck2 jar above I think you'll be good for invoking
> > > > > > HBCK2.
> > > > > >
> > > > > > Next, what does the initializing master say it's doing? It should be
> > > > > > on the master UI near the bottom. If it hasn't made progress since
> > > > > > your last update it'll be waiting for the hbase:namespace table. If it
> > > > > > is, find the region and see what the last few messages in the master
> > > > > > log are about that region.
> > > > > >
> > > > >
> > > > > The master died some time ago. It dies after 5 minutes.
> > > > >
> > > > > 2019-03-12 19:35:58,568 ERROR
> [master/node2:60000:becomeActiveMaster]
> > > > > master.HMaster: Failed to become active master
> > > > > java.lang.IllegalStateException: Expected the service
> > > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service
> has
> > > > FAILED
> > > > > at
> > > > >
> > > >
> >
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1341)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1119)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2347)
> > > > > at
> > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:595)
> > > > > at java.lang.Thread.run(Thread.java:748)
> > > > > Caused by: java.io.IOException: Timedout 300000ms waiting for
> > namespace
> > > > > table to be assigned and enabled: tableName=hbase:namespace,
> > > > state=ENABLED
> > > > > at
> > > > >
> > > >
> >
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> > > > > at
> > > > >
> > > >
> >
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1339)
> > > > > ... 4 more
> > > > >
> > > > >
> > > > > I just restarted it. I can see the meta table being assigned. I can access
> > > > > the WebUI and I don't see any initializing information. On the table
> > > > > section, I don't see anything, in any tab. However, when doing "list" on
> > > > > the shell, I can see my tables. But I cannot scan them. Scanning any table
> > > > > gives:
> > > > > hbase(main):001:0> scan 'hbase:namespace'
> > > > > ROW                                   COLUMN+CELL
> > > > >
> > > > >
> > > > > ERROR: Unknown table hbase:namespace!
> > > > >
> > > > > For usage try 'help "scan"'
> > > > >
> > > > > Took 1.0395 seconds
> > > > >
> > > > >
> > > > >
> > > > > JMS
> > > >
> > > >
> > > >
> > > > --
> > > > Sean
> > > >
> >
>
