[ https://issues.apache.org/jira/browse/HBASE-14190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697985#comment-14697985 ]

Ted Yu commented on HBASE-14190:
--------------------------------

I debugged the test failure in TestNamespaceAuditor#testRegionOperations.

In the middle of the test, this is called:
{code}
    restartMaster();
{code}
When the new master initializes, it assigns hbase:quota again, resulting in:
{code}
2015-08-13 02:43:33,420 FATAL [PriorityRpcServer.handler=16,queue=0,port=50627] regionserver.HRegionServer(2072): ABORTING region server 192.168.0.12,50627,1439458945214: Received OPEN for the region:hbase:quota,,1439458947828.c74bc53e12f6aae7f979d1ed6d8b4387., which is already online
2015-08-13 02:43:33,443 INFO  [PriorityRpcServer.handler=16,queue=0,port=50627] regionserver.HRegionServer(1873): STOPPED: Received OPEN for the region:hbase:quota,,1439458947828.c74bc53e12f6aae7f979d1ed6d8b4387., which is already online
{code}
I am trying to find a solution for this scenario.

> Assign system tables ahead of user region assignment
> ----------------------------------------------------
>
>                 Key: HBASE-14190
>                 URL: https://issues.apache.org/jira/browse/HBASE-14190
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Critical
>         Attachments: 14190-v12.txt, 14190-v6.txt, 14190-v7.txt, 14190-v8.txt
>
>
> Currently the namespace table region is assigned like user regions.
> I spent several hours working with a customer where master couldn't finish initialization.
> Even though master was restarted quite a few times, it went down with the following:
> {code}
> 2015-08-05 17:16:57,530 FATAL [hdpmaster1:60000.activeMasterManager] master.HMaster: Master server abort: loaded coprocessors are: []
> 2015-08-05 17:16:57,530 FATAL [hdpmaster1:60000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
> java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned
>     at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
>     at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:985)
>     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:779)
>     at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182)
>     at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1646)
>     at java.lang.Thread.run(Thread.java:744)
> {code}
> During previous run(s), the namespace table was created, leaving an entry in hbase:meta.
> As a result, the following if block in TableNamespaceManager#start() was skipped:
> {code}
>     if (!MetaTableAccessor.tableExists(masterServices.getConnection(),
>       TableName.NAMESPACE_TABLE_NAME)) {
> {code}
> TableNamespaceManager#start() spins, waiting for the namespace region to be assigned.
> There was an issue in master assigning user regions.
> We tried issuing the 'assign' command from hbase shell, which didn't work because of the following check in MasterRpcServices#assignRegion():
> {code}
>   master.checkInitialized();
> {code}
> This scenario can be avoided if we assign the hbase:namespace table after hbase:meta is assigned but before user table region assignment.
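
The ordering proposed in the description (hbase:meta first, then other system tables such as hbase:namespace, then user tables) can be sketched roughly as below. This is only an illustration, not the code in the attached patches: the class AssignmentOrderSketch, the Tier enum, and both methods are hypothetical names, and the sketch assumes that a table counts as a system table when its name is in the "hbase" namespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names come from HBase.
public class AssignmentOrderSketch {

    // Assignment tiers; a lower ordinal is assigned earlier.
    enum Tier { META, SYSTEM, USER }

    // Assumption: system tables are exactly those in the "hbase" namespace.
    static Tier tierOf(String tableName) {
        if (tableName.equals("hbase:meta")) {
            return Tier.META;
        }
        if (tableName.startsWith("hbase:")) {
            return Tier.SYSTEM;
        }
        return Tier.USER;
    }

    // Returns a sorted copy: meta, then system tables, then user tables.
    // List.sort is stable, so tables keep their relative order within a tier.
    static List<String> assignmentOrder(List<String> tables) {
        List<String> ordered = new ArrayList<>(tables);
        ordered.sort(Comparator.comparing(AssignmentOrderSketch::tierOf));
        return ordered;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList(
            "orders", "hbase:namespace", "hbase:meta", "users", "hbase:quota");
        // prints [hbase:meta, hbase:namespace, hbase:quota, orders, users]
        System.out.println(assignmentOrder(tables));
    }
}
```

With this ordering, a problem in user region assignment would no longer delay assignment of hbase:namespace, which is what TableNamespaceManager#start() waits on in the trace above.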