[ https://issues.apache.org/jira/browse/HBASE-24896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani updated HBASE-24896:
---------------------------------
    Release Note:
- Untangle RegionInfo, RegionInfoBuilder, and MutableRegionInfo static initializations.
- Undo static initializing references from RegionInfo to RegionInfoBuilder.
- Mark RegionInfo#UNDEFINED IA.Private and deprecated; it is for internal use only and likely to be removed in HBase4. (sub-task HBASE-24918)
- Move MutableRegionInfo out of RegionInfoBuilder and have it as a standalone class. (sub-task HBASE-24918)

> 'Stuck' in static initialization creating RegionInfo instance
> -------------------------------------------------------------
>
>                 Key: HBASE-24896
>                 URL: https://issues.apache.org/jira/browse/HBASE-24896
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.3.1
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.2
>
>         Attachments: hbasedn192-jstack-0.webarchive, hbasedn192-jstack-1.webarchive, hbasedn192-jstack-2.webarchive
>
>
> We ran into the following deadlocked server in testing. The priority handlers seem stuck across multiple thread dumps.
> Seven of the ten total priority threads have this state:
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=5,queue=1,port=16020" #82 daemon prio=5 os_prio=0 cpu=0.70ms elapsed=315627.86s allocated=3744B defined_classes=0 tid=0x00007f3da0983040 nid=0x62d9 in Object.wait() [0x00007f3d9bc8c000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1491)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3143)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3478)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44858)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> The anomalous three are as follows:
> h3. #1
> {code:java}
> "RpcServer.priority.RWQ.Fifo.write.handler=0,queue=0,port=16020" #77 daemon prio=5 os_prio=0 cpu=175.98ms elapsed=315627.86s allocated=2153K defined_classes=14 tid=0x00007f3da0ae6ec0 nid=0x62d4 in Object.wait() [0x00007f3d9c190000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hbase.client.RegionInfo.<clinit>(RegionInfo.java:72)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1491)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2912)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44856)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> ...which is the creation of UNDEFINED in RegionInfo here:
>
> @InterfaceAudience.Public
> public interface RegionInfo extends Comparable<RegionInfo> {
>   RegionInfo UNDEFINED = RegionInfoBuilder.newBuilder(TableName.valueOf("__UNDEFINED__")).build();
>
> h3. #2
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=4,queue=1,port=16020" #81 daemon prio=5 os_prio=0 cpu=53.85ms elapsed=315627.86s allocated=81984B defined_classes=3 tid=0x00007f3da0981590 nid=0x62d8 in Object.wait() [0x00007f3d9bd8c000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hbase.client.RegionInfoBuilder.<clinit>(RegionInfoBuilder.java:49)
>         at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3231)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3755)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3827)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$173/0x00000017c0e40040.accept(Unknown Source)
>         at java.util.ArrayList.forEach(java.base@11.0.6/ArrayList.java:1540)
>         at java.util.Collections$UnmodifiableCollection.forEach(java.base@11.0.6/Collections.java:1085)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3827)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:34896)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> ...which is here, creating the meta RegionInfo:
>
> public static final RegionInfo FIRST_META_REGIONINFO =
>     new MutableRegionInfo(1L, TableName.META_TABLE_NAME, RegionInfo.DEFAULT_REPLICA_ID);
>
> h3. #3
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=8,queue=1,port=16020" #85 daemon prio=5 os_prio=0 cpu=0.50ms elapsed=315627.85s allocated=1960B defined_classes=0 tid=0x00007f3da0d851d0 nid=0x62dc in Object.wait() [0x00007f3d9b989000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3231)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3755)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3827)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$173/0x00000017c0e40040.accept(Unknown Source)
>         at java.util.ArrayList.forEach(java.base@11.0.6/ArrayList.java:1540)
>         at java.util.Collections$UnmodifiableCollection.forEach(java.base@11.0.6/Collections.java:1085)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3827)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:34896)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> ...which is here in the code:
>
> if (tableName.equals(TableName.META_TABLE_NAME) && replicaId == defaultReplicaId) {
>   return RegionInfoBuilder.FIRST_META_REGIONINFO;
> }
>
> The thread dump does not seem to recognize the above as a deadlock.
>
> ...at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327) is doing the below:
>
> return this.onlineRegions.get(encodedRegionName);
>
> ...where onlineRegions is a concurrent Map of String to HRegion.
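
The stuck state described above can be illustrated with a minimal sketch. This is not HBase code: `Info` and `Builder` are hypothetical stand-ins for RegionInfo and RegionInfoBuilder, and the `Gate` latch is an assumption added only to force the unlucky interleaving in which each thread starts initializing one class before touching the other. Each static initializer then waits for the other class to finish initializing, and neither ever does.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ClassInitDeadlockSketch {

    /** Demo-only gate: holds each static initializer until both have started. */
    static final class Gate {
        static final CountDownLatch BOTH_STARTED = new CountDownLatch(2);

        static void arrive() {
            BOTH_STARTED.countDown();
            try {
                BOTH_STARTED.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Hypothetical stand-in for RegionInfo: a static field built through the builder. */
    static final class Info {
        static final Object UNDEFINED;
        static {
            Gate.arrive();
            UNDEFINED = Builder.build(); // requires Builder to be initialized
        }
        static void touch() { }
    }

    /** Hypothetical stand-in for RegionInfoBuilder: a static field that needs Info. */
    static final class Builder {
        static final Object FIRST_META;
        static {
            Gate.arrive();
            FIRST_META = Info.UNDEFINED; // requires Info to be initialized
        }
        static Object build() { return new Object(); }
        static void touch() { }
    }

    /** Returns true if both threads are still alive after waiting up to waitMillis for each. */
    static boolean reproduce(long waitMillis) throws InterruptedException {
        // Lambdas (not method references) so class initialization happens inside the threads.
        Thread a = new Thread(() -> Info.touch(), "init-Info");
        Thread b = new Thread(() -> Builder.touch(), "init-Builder");
        a.setDaemon(true); // daemon threads, so the JVM can still exit
        b.setDaemon(true);
        a.start();
        b.start();
        a.join(waitMillis);
        b.join(waitMillis);
        return a.isAlive() && b.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("both threads stuck: " + reproduce(1000));
        // Printed for comparison with the report; the result is not asserted here.
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        System.out.println("findDeadlockedThreads reported: " + (ids == null ? 0 : ids.length));
    }
}
```

Any later thread that touches either class also blocks behind the unfinished initialization, which would be consistent with the seven handlers parked in getRegion in the dumps above. The release note's approach of removing static initializing references from one class to the other breaks the cycle, so neither initializer has to wait on the other.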