[ 
https://issues.apache.org/jira/browse/HBASE-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136703#comment-14136703
 ] 

stack commented on HBASE-11988:
-------------------------------

bq. As I said, IMO we must wait for region ENABLED also. 

The debug patch I committed this afternoon added this.

bq. But again these waits are timeout based. So still there is some chance that 
the time out elapsed and master continue with its start with out waiting for NS.

The debug patch actually throws an exception now, so we should fail startup.
We can't progress w/o the ns table (I upped the timeout too, to 5mins).
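
To make the fail-fast wait concrete, here is a minimal sketch of the idea. It is not the code from the debug patch: the class name StartupWait, the method waitOrFail, and the stand-in readiness condition are all hypothetical, and a real master would check namespace region assignment and the table's ENABLED state instead of a clock.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not HBase code: poll a condition and fail hard on timeout.
public class StartupWait {

    /**
     * Polls 'ready' every pollMs milliseconds. Returns once it is true;
     * throws IOException if timeoutMs elapses first, so the caller cannot
     * silently continue startup without the thing it was waiting for.
     */
    static void waitOrFail(String what, BooleanSupplier ready, long timeoutMs, long pollMs)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!ready.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new IOException("Timed out after " + timeoutMs + "ms waiting for " + what);
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();

        // Stand-in for "ns region assigned AND table ENABLED": true after ~50 ms.
        BooleanSupplier nsReady = () -> System.currentTimeMillis() - start >= 50;
        waitOrFail("namespace table to be assigned and ENABLED", nsReady,
                TimeUnit.MINUTES.toMillis(5), 10);
        System.out.println("ns table ready, continuing startup");

        // A condition that never becomes true aborts instead of being skipped.
        try {
            waitOrFail("a table that never comes up", () -> false, 100, 10);
        } catch (IOException e) {
            System.out.println("startup aborted: " + e.getMessage());
        }
    }
}
```

The point of throwing rather than returning a boolean is that an elapsed timeout cannot be ignored by accident; the caller has to handle it or fail startup.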

I think your patch is similar to the debug patch -- the waiting-on-enabled bit
at least -- but we probably need to step back and think through all these
system tables and what it takes to get a cluster off the ground, especially if
we have interesting scenarios like a splitting meta.

Good on you [~anoopsamjohn]



> AC/VC system table create on postStartMaster fails too often in test
> --------------------------------------------------------------------
>
>                 Key: HBASE-11988
>                 URL: https://issues.apache.org/jira/browse/HBASE-11988
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Critical
>         Attachments: 11988.debug.txt, HBASE-11988.patch
>
>
> See for example
> {noformat}
> 2014-09-16 04:02:08,833 ERROR [ActiveMasterManager] master.HMaster(633): Coprocessor postStartMaster() hook failed
> java.io.IOException: Table Namespace Manager not ready yet, try again later
>       at org.apache.hadoop.hbase.master.HMaster.checkNamespaceManagerReady(HMaster.java:1669)
>       at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:1852)
>       at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1096)
>       at org.apache.hadoop.hbase.security.access.AccessControlLists.init(AccessControlLists.java:143)
>       at org.apache.hadoop.hbase.security.access.AccessController.postStartMaster(AccessController.java:1059)
>       at org.apache.hadoop.hbase.master.MasterCoprocessorHost$58.call(MasterCoprocessorHost.java:692)
>       at org.apache.hadoop.hbase.master.MasterCoprocessorHost.execOperation(MasterCoprocessorHost.java:861)
>       at org.apache.hadoop.hbase.master.MasterCoprocessorHost.postStartMaster(MasterCoprocessorHost.java:688)
>       at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:631)
>       at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:155)
>       at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1244)
>       at java.lang.Thread.run(Thread.java:744)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
