[ https://issues.apache.org/jira/browse/HBASE-23613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lijin Bin resolved HBASE-23613.
-------------------------------
    Fix Version/s: 2.2.3
                   2.3.0
                   3.0.0
       Resolution: Fixed

> ProcedureExecutor check StuckWorkers blocked by DeadServerMetricRegionChore
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-23613
>                 URL: https://issues.apache.org/jira/browse/HBASE-23613
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.2.2
>            Reporter: Lijin Bin
>            Assignee: Lijin Bin
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0, 2.2.3
>
>
> After debugging, I found that the WorkerMonitor in ProcedureExecutor did not
> execute for a while because it was blocked by DeadServerMetricRegionChore.
> The TimeoutExecutorThread executes not only WorkerMonitor, but also
> DeadServerMetricRegionChore, RegionInTransitionChore...
> {code}
> "ProcExecTimeout" #1052 daemon prio=5 os_prio=0 tid=0x00007f5c98cc4000 nid=0x229 waiting on condition [0x00007f5c2f857000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x00000005c312ad80> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>         at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>         at org.apache.hadoop.hbase.master.assignment.RegionStateNode.lock(RegionStateNode.java:313)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager$DeadServerMetricRegionChore.periodicExecute(AssignmentManager.java:1186)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager$DeadServerMetricRegionChore.periodicExecute(AssignmentManager.java:1163)
>         at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.executeInMemoryChore(TimeoutExecutorThread.java:120)
>         at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.execDelayedProcedure(TimeoutExecutorThread.java:99)
>         at org.apache.hadoop.hbase.procedure2.TimeoutExecutorThread.run(TimeoutExecutorThread.java:66)
> "PEWorker-1" #1053 daemon prio=5 os_prio=0 tid=0x00007f5c98cc5800 nid=0x22a in Object.wait() [0x00007f5c2f756000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:168)
>         - locked <0x00000005839f18b0> (a java.util.concurrent.atomic.AtomicBoolean)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:540)
>         at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:209)
>         at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateUserRegionLocation(RegionStateStore.java:203)
>         at org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:141)
>         at org.apache.hadoop.hbase.master.assignment.AssignmentManager.persistToMeta(AssignmentManager.java:1742)
>         at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:298)
>         at org.apache.hadoop.hbase.master.assignment.RegionRemoteProcedureBase.execute(RegionRemoteProcedureBase.java:58)
>         at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:962)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1648)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1395)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
>         at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1965)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
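
The blocking pattern in the thread dump above can be reproduced in isolation. The following is a minimal sketch using only java.util.concurrent, not HBase code, and it does not show the fix that was committed for this issue. The class name, the lock name and the event labels are all hypothetical. A single-threaded executor stands in for the shared timeout thread: a "chore" task parks on a lock held by a worker thread, so the "monitor" task queued behind it cannot run until the worker releases the lock.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names come from HBase.
public class SharedTimeoutThreadSketch {

    /** Runs the scenario once and returns the events in the order they happened. */
    public static List<String> runScenario() throws Exception {
        List<String> events = new CopyOnWriteArrayList<>();
        ReentrantLock regionLock = new ReentrantLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch releaseWorker = new CountDownLatch(1);

        // Stand-in for a worker that holds a lock while doing slow work.
        Thread worker = new Thread(() -> {
            regionLock.lock();
            try {
                lockHeld.countDown();
                releaseWorker.await();
                events.add("worker-done");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                regionLock.unlock();
            }
        }, "worker");
        worker.start();
        lockHeld.await();

        // Stand-in for the single thread that runs both the chore and the monitor.
        ExecutorService timeoutThread = Executors.newSingleThreadExecutor();
        try {
            Future<?> chore = timeoutThread.submit(() -> {
                regionLock.lock(); // parks here while the worker holds the lock
                try {
                    events.add("chore-ran");
                } finally {
                    regionLock.unlock();
                }
            });
            Future<?> monitor = timeoutThread.submit(() -> {
                events.add("monitor-ran");
            });

            // Wait until the chore is actually parked on the lock.
            while (!regionLock.hasQueuedThreads()) {
                Thread.sleep(1);
            }
            // The only executor thread is parked, so the monitor cannot have run.
            events.add(monitor.isDone() ? "monitor-done-early" : "monitor-starved");

            releaseWorker.countDown();
            chore.get();
            monitor.get();
            worker.join();
        } finally {
            timeoutThread.shutdown();
        }
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runScenario());
    }
}
```

In this sketch the monitor runs only after the worker has released the lock and the chore has finished. This is the same ordering problem the dump shows between "PEWorker-1" and "ProcExecTimeout".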