I found this problem while running ITBLL on branch-3, but it affects all
active branches. It causes master initialization to fail with a message
like this:

java.lang.UnsupportedOperationException: Unexpected INITIALIZING state for pid=-1, state=INITIALIZING, hasLock=false; CloseRegionProcedure a13d6f17eba604f7e37d981aefc62212, server=data04,16020,1744450050895
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.initializeStacks(ProcedureExecutor.java:453) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:593) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:688) ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1875) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1030) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2554) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:624) ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155) ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at java.lang.Thread.run(Thread.java:840) ~[?:?]

It is not easy to trigger, but once it happens the cluster cannot
recover unless you are able to modify the HBase code to ignore the
invalid entry. That would be a huge problem for our users.

So I suggest we make new releases for 2.5.x and 2.6.x as soon as this
fix is merged. We should probably also consider introducing a repair
tool to help users on older release lines fix this problem.

Thanks.
