LiangJun He created HBASE-27029:
-----------------------------------

             Summary: When HMaster is stopped, the HMaster local region cache 
cannot be flushed normally
                 Key: HBASE-27029
                 URL: https://issues.apache.org/jira/browse/HBASE-27029
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 3.0.0-alpha-3
            Reporter: LiangJun He
            Assignee: LiangJun He
             Fix For: 3.0.0-alpha-3


After HBASE-26951, HMaster can be stoped gracefully. For example, the internal 
threads of HMaster can be closed normally, but I found that the local region of 
HMaster still cannot be closed normally.

The following is my test error message:
{code:java}
Wed May 11 14:48:56 CST 2022 Terminating master
2022-05-11 14:48:56,382 INFO  [shutdown-hook-0] regionserver.ShutdownHook: 
Shutdown hook starting; hbase.shutdown.hook=true; 
fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@4f4c789f
2022-05-11 14:48:56,382 INFO  [shutdown-hook-0] master.HMaster: ***** STOPPING 
master 'emr-header-1.cluster-xxxxx,16000,1652240899395' *****
2022-05-11 14:48:56,382 INFO  [shutdown-hook-0] master.HMaster: STOPPED: 
Shutdown hook
......
......
2022-05-11 14:48:57,367 ERROR [KeepAlivePEWorker-41] 
assignment.RegionStateStore: FAILED persisting 
region=23a692981e91e944d380a8bdf4b50c7e state=OPEN
org.apache.hadoop.hbase.ipc.StoppedRpcClientException: Call to 
address=emr-worker-1.cluster-xxxxx:16020 failed on local exception: 
org.apache.hadoop.hbase.ipc.StoppedRpcClientException
    at java.lang.Thread.getStackTrace(Thread.java:1559)
    at 
org.apache.hadoop.hbase.util.FutureUtils.setStackTrace(FutureUtils.java:130)
    at org.apache.hadoop.hbase.util.FutureUtils.rethrow(FutureUtils.java:149)
    at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:172)
    at 
org.apache.hadoop.hbase.client.TableOverAsyncTable.put(TableOverAsyncTable.java:214)
    at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:259)
    at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:224)
    at 
org.apache.hadoop.hbase.master.assignment.AssignmentManager.persistToMeta(AssignmentManager.java:2034)
    at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:297)
    at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:57)
    at 
org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1667)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1414)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1981)
    at --------Future.get--------(Unknown Source)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:210)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:388)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:422)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:417)
    at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:114)
    at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:129)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:443)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$300(AbstractRpcClient.java:92)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$RpcChannelImplementation.callMethod(AbstractRpcClient.java:614)
    at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$Stub.mutate(ClientProtos.java:46147)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.lambda$mutate$0(RawAsyncTableImpl.java:175)
    at 
org.apache.hadoop.hbase.client.ConnectionUtils.call(ConnectionUtils.java:616)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.mutate(RawAsyncTableImpl.java:174)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.voidMutate(RawAsyncTableImpl.java:181)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.lambda$null$8(RawAsyncTableImpl.java:249)
    at 
org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.call(AsyncSingleRequestRpcRetryingCaller.java:82)
    at 
org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.lambda$doCall$7(AsyncSingleRequestRpcRetryingCaller.java:115)
    at 
org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:68)
    at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at 
java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:778)
    at 
java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2140)
    at org.apache.hadoop.hbase.util.FutureUtils.addListener(FutureUtils.java:61)
    at 
org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.doCall(AsyncSingleRequestRpcRetryingCaller.java:106)
    at 
org.apache.hadoop.hbase.client.AsyncRpcRetryingCaller.lambda$tryScheduleRetry$0(AsyncRpcRetryingCaller.java:142)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
    at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.ipc.StoppedRpcClientException
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.getConnection(AbstractRpcClient.java:360)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:440)
    ... 23 more
2022-05-11 14:48:57,367 WARN  [KeepAlivePEWorker-41] 
assignment.RegionRemoteProcedureBase: Failed updating meta, suspend 1secs 
pid=1793261, ppid=1791968, state=RUNNABLE, hasLock=true; OpenRegionProcedure 
23a692981e91e944d380a8bdf4b50c7e, 
server=emr-worker-2.cluster-xxxx,16020,1652240898881, 
retry=org.apache.hadoop.hbase.util.RetryCounter@2f42033b; state=OPEN, 
location=emr-worker-2.cluster-18941,16020,1652240898881, table=usertable, 
region=23a692981e91e944d380a8bdf4b50c7e;
org.apache.hadoop.hbase.ipc.StoppedRpcClientException: Call to 
address=emr-worker-1.cluster-18941:16020 failed on local exception: 
org.apache.hadoop.hbase.ipc.StoppedRpcClientException
    at java.lang.Thread.getStackTrace(Thread.java:1559)
    at 
org.apache.hadoop.hbase.util.FutureUtils.setStackTrace(FutureUtils.java:130)
    at org.apache.hadoop.hbase.util.FutureUtils.rethrow(FutureUtils.java:149)
    at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:172)
    at 
org.apache.hadoop.hbase.client.TableOverAsyncTable.put(TableOverAsyncTable.java:214)
    at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:259)
    at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:224)
    at 
org.apache.hadoop.hbase.master.assignment.AssignmentManager.persistToMeta(AssignmentManager.java:2034)
    at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:297)
    at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:57)
    at 
org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1667)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1414)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
    at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1981)
    at --------Future.get--------(Unknown Source)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:210)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:388)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:422)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:417)
    at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:114)
    at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:129)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:443)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$300(AbstractRpcClient.java:92)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$RpcChannelImplementation.callMethod(AbstractRpcClient.java:614)
    at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$Stub.mutate(ClientProtos.java:46147)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.lambda$mutate$0(RawAsyncTableImpl.java:175)
    at 
org.apache.hadoop.hbase.client.ConnectionUtils.call(ConnectionUtils.java:616)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.mutate(RawAsyncTableImpl.java:174)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.voidMutate(RawAsyncTableImpl.java:181)
    at 
org.apache.hadoop.hbase.client.RawAsyncTableImpl.lambda$null$8(RawAsyncTableImpl.java:249)
    at 
org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.call(AsyncSingleRequestRpcRetryingCaller.java:82)
    at 
org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.lambda$doCall$7(AsyncSingleRequestRpcRetryingCaller.java:115)
    at 
org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(FutureUtils.java:68)
    at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at 
java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:778)
    at 
java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2140)
    at org.apache.hadoop.hbase.util.FutureUtils.addListener(FutureUtils.java:61)
    at 
org.apache.hadoop.hbase.client.AsyncSingleRequestRpcRetryingCaller.doCall(AsyncSingleRequestRpcRetryingCaller.java:106)
    at 
org.apache.hadoop.hbase.client.AsyncRpcRetryingCaller.lambda$tryScheduleRetry$0(AsyncRpcRetryingCaller.java:142)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
    at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
    at 
org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.ipc.StoppedRpcClientException
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.getConnection(AbstractRpcClient.java:360)
    at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callMethod(AbstractRpcClient.java:440)
    ... 23 more
.....
.....
2022-05-11 14:49:13,232 INFO  [master/emr-header-1:16000] region.MasterRegion: 
Closing local region {ENCODED => 1595e783b53d99cd5eef43b6debb2682, NAME => 
'master:store,,1.1595e783b53d99cd5eef43b6debb2682.', STARTKEY => '', ENDKEY => 
''}, isAbort=true
2022-05-11 14:49:13,237 INFO  [master/emr-header-1:16000] regionserver.HRegion: 
Closing region master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-05-11 14:49:13,243 ERROR [master/emr-header-1:16000] regionserver.HRegion: 
Memstore data size is 139232 in region 
master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-05-11 14:49:13,243 INFO  [master/emr-header-1:16000] regionserver.HRegion: 
Closed master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-05-11 14:49:13,377 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
    at 
org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:251)
    at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:144)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:144)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:3209) {code}
 

Finally HMaster exit due to abort, the local region cache cannot be flushed 
normally

successfully.

 

I also found another error that also causes the same problem:
{code:java}
2022-05-11 23:57:03,228 INFO  [master/emr-header-1:16000] 
hbase.HBaseServerBase: Shutdown executor service
2022-05-11 23:57:03,229 INFO  [master/emr-header-1:16000] 
procedure2.RemoteProcedureDispatcher: Stopping procedure remote dispatcher
2022-05-11 23:57:03,229 INFO  [master/emr-header-1:16000] 
procedure2.ProcedureExecutor: Stopping
2022-05-11 23:57:05,481 WARN  [master/emr-header-1:16000] 
procedure2.StoppableThread: Waiting termination of thread PEWorker-7, 2.2510 
sec; sending interrupt
2022-05-11 23:57:05,481 ERROR [PEWorker-7] assignment.RegionStateStore: FAILED 
persisting region=7ff63515f9ce9d876618e3f1bee7dc03 state=OPEN
java.io.InterruptedIOException
        at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:170)
        at 
org.apache.hadoop.hbase.client.TableOverAsyncTable.put(TableOverAsyncTable.java:214)
        at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:259)
        at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:224)
        at 
org.apache.hadoop.hbase.master.assignment.AssignmentManager.persistToMeta(AssignmentManager.java:2034)
        at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:297)
        at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:57)
        at 
org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1667)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1414)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1981)
Caused by: java.lang.InterruptedException
        at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
        at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:168)
        ... 11 more
2022-05-11 23:57:05,482 WARN  [PEWorker-7] 
assignment.RegionRemoteProcedureBase: Failed updating meta, suspend 1secs 
pid=1845211, ppid=1844964, state=RUNNABLE, hasLock=true; OpenRegionProcedure 
7ff63515f9
ce9d876618e3f1bee7dc03, server=emr-worker-2.cluster-18941,16020,1652263346995, 
retry=org.apache.hadoop.hbase.util.RetryCounter@4a8e233; state=OPEN, 
location=emr-worker-2.cluster-18941,16020,1652263346995,
 table=usertable, region=7ff63515f9ce9d876618e3f1bee7dc03;
java.io.InterruptedIOException
        at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:170)
        at 
org.apache.hadoop.hbase.client.TableOverAsyncTable.put(TableOverAsyncTable.java:214)
        at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:259)
        at 
org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:224)
        at 
org.apache.hadoop.hbase.master.assignment.AssignmentManager.persistToMeta(AssignmentManager.java:2034)
        at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:297)
        at 
org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:57)
        at 
org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:953)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1667)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1414)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1981)
Caused by: java.lang.InterruptedException
        at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
        at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:168)
        ... 11 more {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to