[ https://issues.apache.org/jira/browse/HBASE-25260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228995#comment-17228995 ]
Michael Stack commented on HBASE-25260:
---------------------------------------
It would be good to see the whole log since startup, not just a snippet.
It looks like meta might be online, given that we are able to migrate table
state, but it is odd that we can't find the hbase:namespace table in
hbase:meta. Was the data in good health pre-upgrade?
Can you upgrade to hbase-2.3.x instead of 2.1.x? It is our stable offering. The
tooling to fix issues is also much better than it was back on 2.1.1 (2.1.1 is
no longer maintained by the community). Thanks.
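
When going through the full log since startup, a small script can help pull out the WARN/ERROR records and the root-cause lines in one pass. The following is only a rough sketch: the function name `summarize_log` is made up for illustration, and it assumes the raw log file has one record per line in the timestamp/level layout seen in the snippets quoted below (unlike the wrapped text in this email).

```python
import re
from typing import Iterable, List, Tuple

# Assumed record layout: "2020-11-06 02:01:26,791 ERROR [thread] logger: message"
RECORD = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+(.*)$")
CAUSED_BY = "Caused by: "


def summarize_log(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (WARN/ERROR/FATAL records, root-cause lines) in file order."""
    problems: List[str] = []
    root_causes: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = RECORD.match(line)
        if match and match.group(2) in ("WARN", "ERROR", "FATAL"):
            problems.append(line)
        elif line.startswith(CAUSED_BY):
            # Keep only the exception class and message.
            root_causes.append(line[len(CAUSED_BY):])
    return problems, root_causes
```

It could be run against the attached file with something like `summarize_log(open("hmaster.log"))`; the records before the first ERROR are usually the interesting part.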
> upgrading hbase from 2.0.6 to 2.1.1, HMaster failed to become active because
> it cannot find hbase:namespace table
> -----------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-25260
> URL: https://issues.apache.org/jira/browse/HBASE-25260
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1, 2.0.6
> Reporter: Yongle Zhang
> Priority: Major
> Attachments: hmaster.log
>
>
> When we upgraded an HBase cluster from 2.0.6 to 2.1.1, the HMaster on the
> upgraded node failed to start.
> A stack trace from the error log:
> {code:java}
> 2020-11-06 02:01:26,420 WARN [PEWorker-12] assignment.RegionTransitionProcedure: Failed transition, suspend 1secs pid=12, ppid=9, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=TestTable, region=37d62d2c1934da269a592e0e5cbca82a; rit=OFFLINE, location=null; waiting on rectified condition fixed by other Procedure or operator intervention
> org.apache.hadoop.hbase.master.TableStateManager$TableStateNotFoundException: TestTable
> at org.apache.hadoop.hbase.master.TableStateManager.getTableState(TableStateManager.java:215)
> at org.apache.hadoop.hbase.master.assignment.AssignProcedure.assign(AssignProcedure.java:194)
> at org.apache.hadoop.hbase.master.assignment.AssignProcedure.startTransition(AssignProcedure.java:205)
> at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:355)
> at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:97)
> at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:957)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1835)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1595)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:80)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2140)
> {code}
> It seems to be caused by the master being unable to find the hbase:namespace table after the upgrade:
> {code:java}
> 2020-11-06 02:01:26,791 ERROR [master/399fd6ca0c6d:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
> 2020-11-06 02:01:26,791 ERROR [master/399fd6ca0c6d:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master 399fd6ca0c6d,16000,1604628075265: Unhandled exception. Starting shutdown. *****
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
> at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
> at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
> at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1253)
> at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1031)
> at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2254)
> at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:583)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.TableNotFoundException: hbase:namespace
> at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:864)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:759)
> at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:745)
> at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:716)
> at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.getRegionLocation(ConnectionImplementation.java:594)
> at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.getRegionLocation(ConnectionUtils.java:131)
> at org.apache.hadoop.hbase.client.HRegionLocator.getRegionLocation(HRegionLocator.java:72)
> at org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:223)
> at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
> at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
> at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:142)
> at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:279)
> at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
> at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
> at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
> at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1251)
> ... 4 more
> {code}
> Attached the error log file.
> [^hmaster.log]
>
> Steps to reproduce:
> # Start up a cluster of version 2.0.6 with 3 nodes
> # Use hbase pe to write data.
> {code:java}
> /hbase/bin/hbase pe --nomapred --oneCon=true --valueSize=10 --rows=100 sequentialWrite 1
> {code}
> # Stop the cluster:
> ## Use graceful_stop.sh to stop all regionservers.
> ## Then run stop-hbase.sh.
> # Upgrade the node to 2.1.1.
> # After upgrading, HMaster failed to start.
>
>
>