[ 
https://issues.apache.org/jira/browse/HBASE-14190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697985#comment-14697985
 ] 

Ted Yu commented on HBASE-14190:
--------------------------------

I did some debugging of the test failure in 
TestNamespaceAuditor#testRegionOperations.
In the middle of the test, this is called:
{code}
    restartMaster();
{code}
When the new master initializes, it assigns hbase:quota again, resulting in:
{code}
2015-08-13 02:43:33,420 FATAL [PriorityRpcServer.handler=16,queue=0,port=50627] regionserver.HRegionServer(2072): ABORTING region server 192.168.0.12,50627,1439458945214: Received OPEN for the region:hbase:quota,,1439458947828.c74bc53e12f6aae7f979d1ed6d8b4387., which is already online
2015-08-13 02:43:33,443 INFO  [PriorityRpcServer.handler=16,queue=0,port=50627] regionserver.HRegionServer(1873): STOPPED: Received OPEN for the region:hbase:quota,,1439458947828.c74bc53e12f6aae7f979d1ed6d8b4387., which is already online
{code}
Trying to find a solution for this scenario.
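
One possible direction, sketched in plain Java rather than against the HBase code base: before sending OPEN requests, the new master filters out regions that are already reported as online. The class `RegionAssignmentPlanner` and the method `regionsToOpen` are hypothetical names used only for illustration, and the sketch assumes the master can obtain the set of online regions from the region servers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not HBase code.
class RegionAssignmentPlanner {

    // Returns the regions that still need an OPEN request, preserving input order.
    // Regions listed in alreadyOnline are skipped so that a region server never
    // receives OPEN for a region it is already serving.
    static List<String> regionsToOpen(List<String> candidates, Set<String> alreadyOnline) {
        List<String> toOpen = new ArrayList<>();
        for (String region : candidates) {
            if (!alreadyOnline.contains(region)) {
                toOpen.add(region);
            }
        }
        return toOpen;
    }
}
```

With hbase:quota already online, only the remaining candidates would be opened by the restarted master.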

> Assign system tables ahead of user region assignment
> ----------------------------------------------------
>
>                 Key: HBASE-14190
>                 URL: https://issues.apache.org/jira/browse/HBASE-14190
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Critical
>         Attachments: 14190-v12.txt, 14190-v6.txt, 14190-v7.txt, 14190-v8.txt
>
>
> Currently the namespace table region is assigned like user regions.
> I spent several hours working with a customer whose master couldn't finish 
> initialization.
> Even though the master was restarted quite a few times, it went down with the 
> following:
> {code}
> 2015-08-05 17:16:57,530 FATAL [hdpmaster1:60000.activeMasterManager] 
> master.HMaster: Master server abort: loaded coprocessors are: []
> 2015-08-05 17:16:57,530 FATAL [hdpmaster1:60000.activeMasterManager] 
> master.HMaster: Unhandled exception. Starting shutdown.
> java.io.IOException: Timedout 300000ms waiting for namespace table to be 
> assigned
>   at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
>   at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:985)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:779)
>   at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
>   at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
>   at java.lang.Thread.run(Thread.java:744)
> {code}
> During previous run(s), the namespace table was created, leaving an entry 
> in hbase:meta.
> The following if block in TableNamespaceManager#start() was therefore skipped:
> {code}
>     if (!MetaTableAccessor.tableExists(masterServices.getConnection(),
>       TableName.NAMESPACE_TABLE_NAME)) {
> {code}
> TableNamespaceManager#start() then spins, waiting for the namespace region 
> to be assigned.
> There was an issue with the master assigning user regions.
> We tried issuing the 'assign' command from the hbase shell, which didn't work 
> because of the following check in MasterRpcServices#assignRegion():
> {code}
>       master.checkInitialized();
> {code}
> This scenario can be avoided if we assign the hbase:namespace table after 
> hbase:meta is assigned but before user table regions are assigned.
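
The ordering proposed above can be sketched as a three-tier sort: hbase:meta first, then other tables in the `hbase` system namespace, then user tables. This is a minimal illustration in plain Java, not the attached patch; `AssignmentOrder`, `tier`, and `order` are hypothetical names, and identifying system tables by the `hbase:` prefix is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; this is not HBase code.
class AssignmentOrder {

    // Lower tier means earlier assignment.
    static int tier(String tableName) {
        if (tableName.equals("hbase:meta")) {
            return 0; // the catalog table must come first
        }
        if (tableName.startsWith("hbase:")) {
            return 1; // other system tables, e.g. hbase:namespace
        }
        return 2; // user tables
    }

    // Returns a copy of the table list ordered by tier. List.sort is stable,
    // so tables within the same tier keep their original relative order.
    static List<String> order(List<String> tables) {
        List<String> sorted = new ArrayList<>(tables);
        sorted.sort(Comparator.comparingInt(AssignmentOrder::tier));
        return sorted;
    }
}
```

With this ordering, a problem assigning user regions would no longer hold up assignment of hbase:namespace, which is what master initialization waits on.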


