[ https://issues.apache.org/jira/browse/HBASE-24673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155666#comment-17155666 ]

Nick Dimiduk commented on HBASE-24673:
--------------------------------------

Here's a stack trace collected from a stuck PEWorker thread, taken at the point when the worker pool is expanding.

{noformat}
Thread 175 (PEWorker-1):
  State: TIMED_WAITING
  Blocked count: 3136
  Waited count: 4122
  Stack:
    java.base@11.0.6/java.lang.Object.wait(Native Method)
    app//org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:168)
    app//org.apache.hadoop.hbase.client.HTable.put(HTable.java:539)
    app//org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:224)
    app//org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateUserRegionLocation(RegionStateStore.java:218)
    app//org.apache.hadoop.hbase.master.assignment.RegionStateStore.updateRegionLocation(RegionStateStore.java:156)
    app//org.apache.hadoop.hbase.master.assignment.AssignmentManager.transitStateAndUpdate(AssignmentManager.java:1743)
    app//org.apache.hadoop.hbase.master.assignment.AssignmentManager.regionOpening(AssignmentManager.java:1758)
    app//org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.openRegion(TransitRegionStateProcedure.java:211)
    app//org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.executeFromState(TransitRegionStateProcedure.java:357)
    app//org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.executeFromState(TransitRegionStateProcedure.java:102)
    app//org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:194)
    app//org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.execute(TransitRegionStateProcedure.java:324)
    app//org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure.execute(TransitRegionStateProcedure.java:102)
    app//org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:962)
    app//org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1667)
    app//org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1414)
    app//org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:77)
    app//org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1984)
{noformat}

> TransitionRegionStateProcedure of non-meta regions should yield when meta is
> unavailable
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-24673
>                 URL: https://issues.apache.org/jira/browse/HBASE-24673
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>            Priority: Major
>
> One observation from HBASE-24526 is that while meta is unavailable, other
> region movement procedures are getting stuck on meta RPCs. Let's make it so
> that non-meta transitions check the state of meta before attempting any RPCs.
> If meta is known unavailable, release the thread back to the scheduler.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
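
For illustration only, the idea in the issue description (check meta's state before attempting any RPC, and hand the worker thread back if meta is known to be unavailable) might look roughly like the toy model below. This is a sketch, not the actual patch: `MetaAwareTransitionSketch`, `MetaStatus`, `MetaWriter`, `StepResult`, and `openNonMetaRegionStep` are all hypothetical names made up here, and none of them are HBase classes or APIs.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model only: none of these types exist in HBase.
final class MetaAwareTransitionSketch {

    /** Outcome of one execution step of a hypothetical non-meta region transition. */
    enum StepResult { YIELDED, META_UPDATED }

    /** Hypothetical stand-in for the master's knowledge of whether meta is online. */
    static final class MetaStatus {
        private final AtomicBoolean available = new AtomicBoolean(false);

        boolean isAvailable() { return available.get(); }

        void setAvailable(boolean value) { available.set(value); }
    }

    /** Hypothetical stand-in for the blocking meta write seen in the stack trace. */
    interface MetaWriter {
        void updateRegionLocation(String regionName);
    }

    /**
     * One step of a non-meta region transition. The meta check happens before
     * any write is attempted, so a worker never parks inside a retrying RPC
     * while meta is known to be down.
     */
    static StepResult openNonMetaRegionStep(String regionName, MetaStatus meta, MetaWriter writer) {
        if (!meta.isAvailable()) {
            // Return without touching meta; the caller can reschedule this step later.
            return StepResult.YIELDED;
        }
        writer.updateRegionLocation(regionName);
        return StepResult.META_UPDATED;
    }

    public static void main(String[] args) {
        MetaStatus meta = new MetaStatus();
        int[] writes = {0};
        MetaWriter writer = region -> writes[0]++;

        // Meta starts out unavailable, so the first step yields without writing.
        System.out.println(openNonMetaRegionStep("region-a", meta, writer));

        // Once meta is marked available, the same step performs the write.
        meta.setAvailable(true);
        System.out.println(openNonMetaRegionStep("region-a", meta, writer));
        System.out.println("meta writes: " + writes[0]);
    }
}
```

The point of the sketch is only the ordering: the availability check comes first and the step returns immediately when it fails, instead of entering the retrying `HTable.put` path shown in the stack trace above. How the real procedure framework suspends and later wakes such a procedure is not modeled here.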