[ https://issues.apache.org/jira/browse/HBASE-13802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564612#comment-14564612 ]
Hudson commented on HBASE-13802: -------------------------------- SUCCESS: Integrated in HBase-1.2 #115 (See [https://builds.apache.org/job/HBase-1.2/115/]) HBASE-13802 Procedure V2: Master fails to come up due to rollback of create namespace table (Stephen Jiang) (tedyu: rev 46c9a5b339a77167960ba5382e6d2f380fc312a8) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java HBASE-13802 Addendum fixes checkstyle warning by removing the whitespace before the right parenthesis (tedyu: rev 695c4790cdb2f6ba57ace674fb9bc4cc7744d6e4) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java > Procedure V2: Master fails to come up due to rollback of create namespace > table > ------------------------------------------------------------------------------- > > Key: HBASE-13802 > URL: https://issues.apache.org/jira/browse/HBASE-13802 > Project: HBase > Issue Type: Bug > Components: master, proc-v2 > Affects Versions: 2.0.0, 1.1.0, 1.2.0 > Reporter: Stephen Yuan Jiang > Assignee: Stephen Yuan Jiang > Fix For: 2.0.0, 1.2.0, 1.1.1 > > Attachments: 13802-addendum.txt, HBASE-13802.v1.patch > > > In Procedure V2 (HBASE-13203) implementation, Rollback of a > CreateTableProcedure would call the Quota Manager to remove the table from > namespace quota. > {code} > protected static void deleteTableStates(final MasterProcedureEnv env, final > TableName tableName) { > > ProcedureSyncWait.getMasterQuotaManager(env).removeTableFromNamespaceQuota(tableName); > } > {code} > This could lead to a 'deadlock'-like situation during master starting up: > (1) The create namespace table procedure failed in the middle of master > crash/failover. When master re-started, it tried to rollback, one step of > rollback is to call QuotaManager to remove the table from NameSpaceQuota, but > the QuotaManager has NOT started - so the rollback has to wait. > (2). The QuotaManager would start in master after Namespace Manager starts. > (3). The Namespace Manager is waiting for the table lock to be released by > rollback of create namespace table procedure so that it can create namespace > table as part of Namespace Manager initialization. > {code} > HMaster#finishActiveMasterInitialization() { > ... > status.setStatus("Starting namespace manager"); > initNamespace(); > ... > status.setStatus("Starting quota manager"); > initQuotaManager(); > ... > } > {code} > (4). Now (1) waits for (2), which waits for (3), which waits for (1) - no one > make progress & master could not complete initialization and fails to come up. > {noformat} > 2015-05-28 10:01:26,890 INFO [ip-111-22-33-444:16000.activeMasterManager] > master.TableNamespaceManager: Namespace table not found. Creating... > 2015-05-28 10:06:22,016 WARN [ProcedureExecutorThread-0] > procedure.CreateTableProcedure: Failed rollback attempt > step=CREATE_TABLE_PRE_OPERATION table=hbase:namespace > org.apache.hadoop.hbase.exceptions.TimeoutIOException: Timed out while > waiting on quota manager to be available > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:122) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:102) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.getMasterQuotaManager(ProcedureSyncWait.java:184) > at > org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure.deleteTableStates(DeleteTableProcedure.java:408) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.rollbackState(CreateTableProcedure.java:169) > at > org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.rollbackState(CreateTableProcedure.java:58) > at > org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:121) > at > org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:414) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:808) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:773) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:653) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:626) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:70) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.run(ProcedureExecutor.java:413) > 2015-05-28 10:06:22,169 WARN [ProcedureExecutorThread-1] > procedure.CreateTableProcedure: The table hbase:namespace does not exist in > meta but has a znode. run hbck to fix inconsistencies. > 2015-05-28 10:06:27,292 FATAL [ip-111-22-33-444:16000.activeMasterManager] > master.HMaster: Failed to become active master > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104) > at > org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:980) > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:779) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1632) > at java.lang.Thread.run(Thread.java:745) > 2015-05-28 10:06:27,293 FATAL [ip-111-22-33-444:16000.activeMasterManager] > master.HMaster: Master server abort: loaded coprocessors are: > [org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor] > 2015-05-28 10:06:27,293 FATAL [ip-111-22-33-444:16000.activeMasterManager] > master.HMaster: Unhandled exception. Starting shutdown. > java.io.IOException: Timedout 300000ms waiting for namespace table to be > assigned > at > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104) > at > org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:980) > at > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:779) > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:182) > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1632) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)