Hi Sean, I tried. I looked-up the region name for base:namespace like this:
hdfs dfs -ls /hbase/data/hbase/meta/ And found the region to be 1588230740. The master dies after 5 minutes, so I start the master, wait 2 minutes to be sure it's up, and run the following command: bin/hbase hbck -j test/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 1588230740 But HBCK2 doesn't like it: 08:57:35.273 [main] INFO org.apache.hadoop.hbase.client.RpcRetryingCallerImpl - Call exception, tries=9, retries=16, started=29322 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.PleaseHoldException: Master is initializing at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3057) at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:942) at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) It keeps retrying and after 16 times it stopped saying the master is not initialized. On the WebUI I can see that there is a single region assigned, the META region. Also, here is the HDFS structure of my META table. Sounds like some parts got lost in the process (The info content). hbase@node2:~/hbase-2.2.0$ hdfs dfs -ls -R /hbase/data/hbase/meta/ drwxr-xr-x - hbase supergroup 0 2019-03-12 15:42 /hbase/data/hbase/meta/.tabledesc -rw-r--r-- 3 hbase supergroup 1447 2019-03-12 15:42 /hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001 drwxr-xr-x - hbase supergroup 0 2019-03-12 15:42 /hbase/data/hbase/meta/.tmp drwxr-xr-x - hbase supergroup 0 2019-03-12 15:49 /hbase/data/hbase/meta/1588230740 -rw-r--r-- 3 hbase supergroup 32 2019-03-12 15:40 /hbase/data/hbase/meta/1588230740/.regioninfo drwxr-xr-x - hbase supergroup 0 2019-03-12 15:40 /hbase/data/hbase/meta/1588230740/info drwxr-xr-x - hbase supergroup 0 2019-03-12 15:40 /hbase/data/hbase/meta/1588230740/recovered.edits -rw-r--r-- 3 hbase supergroup 0 2019-03-12 15:40 /hbase/data/hbase/meta/1588230740/recovered.edits/2.seqid drwxr-xr-x - hbase supergroup 0 2019-03-12 15:42 /hbase/data/hbase/meta/1588230740/rep_barrier drwxr-xr-x - hbase supergroup 0 2019-03-12 15:47 /hbase/data/hbase/meta/1588230740/table -rw-r--r-- 3 hbase supergroup 5454 2019-03-12 15:47 /hbase/data/hbase/meta/1588230740/table/b65e8774ff284e77bf22641de36110cc What will be the next best step? Thanks, JMS Le mer. 13 mars 2019 à 08:45, Sean Busbey <[email protected]> a écrit : > Okay so master thinks hbase:namespace is already enabled, but no RS > believes it should be hosting the regions. > > Can you find the region name for the hbase:namespace region and issue > an hbck2 assigns command for it? > > On Tue, Mar 12, 2019 at 8:26 PM Jean-Marc Spaggiari > <[email protected]> wrote: > > > > It doesn't say that much :( > > > > hbase@node2:~/hbase-2.2.0$ cat logs/hbase-hbase-master-node2.log | > grep -i > > namespace > > Caused by: java.io.IOException: Timedout 300000ms waiting for namespace > > table to be assigned and enabled: tableName=hbase:namespace, > state=ENABLED > > at > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > Caused by: java.io.IOException: Timedout 300000ms waiting for namespace > > table to be assigned and enabled: tableName=hbase:namespace, > state=ENABLED > > at > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > > I cleared the logs before restarting the instance. That all what it says > > about namespace. > > > > Full logs are available there: https://pastebin.com/9j2Rzdcg > > > > Le mar. 12 mars 2019 à 20:47, Sean Busbey <[email protected]> a > écrit : > > > > > okay so the master spent ~5 minutes waiting to see if it could get the > > > namespace table working. when it couldn't it aborted. > > > > > > can you look back over that 5 minutes and see what the master had to > > > say about the namespace table? did the master think some particular > > > server should have it open already? was it waiting for someone to > > > finish opening or closing it? > > > > > > On Tue, Mar 12, 2019 at 6:39 PM Jean-Marc Spaggiari > > > <[email protected]> wrote: > > > > > > > > Le mar. 12 mars 2019 à 19:25, Sean Busbey <[email protected]> a > écrit : > > > > > > > > > your command above points at the wrong jar from the hbck2 repo. > it's > > > > > > > > > pointing at the one where you need to manually assemble all the > > > > > dependencies it has. > > > > > > > > > > You want the one that does not say "original" in the name. > > > > > > > > > > > > > > > > > > Ha!!! That's why! Way easier ;) > > > > > > > > indeed, this works even without removing all environment variables: > > > > bin/hbase hbck -j > > > > > > > > test/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Can you confirm it's the one in > > > > > > > the bin tarball? what does the version command output? What > does > > > the > > > > > > > mapredcp command output? What does the cli help for the hbase > > > command > > > > > > > show? > > > > > > > > > > > > > > > > > > > hbase@node2:~/hbase-2.2.0$ hbase mapredcp > > > > > > > > > > > > > > > /home/hbase/hbase-2.2.0/bin/../lib/shaded-clients/hbase-shaded-mapreduce-2.2.0.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/audience-annotations-0.5.0.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/commons-logging-1.2.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/findbugs-annotations-1.3.9-1.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/log4j-1.2.17.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/slf4j-api-1.7.25.jar > > > > > > > > > > > > > > > > that looks great now. > > > > > > > > > > > > > > > Once you correct the hbck2 jar above I think you'll be good for > > > invoking > > > > > HBCK2. > > > > > > > > > > Next, what does the initializing master say it's doing? It should > be > > > > > on the master UI near the bottom. If it hasn't made progress since > > > > > your last update it'll be waiting for the hbase:namespace table. > If it > > > > > is, find the region and see what the last few messages in the > master > > > > > log are about that region. > > > > > > > > > > > > > The master died some times ago. It dies after 5 minutes. > > > > > > > > 2019-03-12 19:35:58,568 ERROR [master/node2:60000:becomeActiveMaster] > > > > master.HMaster: Failed to become active master > > > > java.lang.IllegalStateException: Expected the service > > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has > > > FAILED > > > > at > > > > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345) > > > > at > > > > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291) > > > > at > > > > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1341) > > > > at > > > > > > > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1119) > > > > at > > > > > > > > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2347) > > > > at > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:595) > > > > at java.lang.Thread.run(Thread.java:748) > > > > Caused by: java.io.IOException: Timedout 300000ms waiting for > namespace > > > > table to be assigned and enabled: tableName=hbase:namespace, > > > state=ENABLED > > > > at > > > > > > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > > at > > > > > > > > org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63) > > > > at > > > > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226) > > > > at > > > > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1339) > > > > ... 4 more > > > > > > > > > > > > I just restarted it. I can see the meta table being assigned. I can > > > access > > > > the WebUI and I don't see any initializing information. On the table > > > > section, I don't see anything, in any tab. However, when doing > "list" on > > > > the shell, I can see my tables. But I can not scan them. Scanning any > > > table > > > > gives : > > > > hbase(main):001:0> scan 'hbase:namespace' > > > > ROW COLUMN+CELL > > > > > > > > > > > > ERROR: Unknown table hbase:namespace! > > > > > > > > For usage try 'help "scan"' > > > > > > > > Took 1.0395 seconds > > > > > > > > > > > > > > > > JMS > > > > > > > > > > > > -- > > > Sean > > > >
