[ https://issues.apache.org/jira/browse/HBASE-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010749#comment-13010749 ]
Subbu M Iyer commented on HBASE-3654: ------------------------------------- Stack: Also initialized the AL so we should not see the blocking due to Arrays.copyOf at least for getOnlineRegions calls (which is the hotspot). The array list sizing issue may not be an issue during buildServerLoad as we may not have multiple concurrent threads trying to do this. I will run the load tests with my dummy mock class to confirm this. > Weird blocking between getOnlineRegion and createRegionLoad > ----------------------------------------------------------- > > Key: HBASE-3654 > URL: https://issues.apache.org/jira/browse/HBASE-3654 > Project: HBase > Issue Type: Bug > Affects Versions: 0.90.1 > Reporter: Jean-Daniel Cryans > Priority: Blocker > Fix For: 0.90.2 > > Attachments: ConcurrentHM, ConcurrentSKLM, CopyOnWrite, > HBASE-3654_Weird_blocking_getOnlineRegions_and_createServerLoad_-_COWAL.patch, > > HBASE-3654_Weird_blocking_getOnlineRegions_and_createServerLoad_-_COWAL1.patch, > > HBASE-3654_Weird_blocking_getOnlineRegions_and_createServerLoad_-_ConcurrentHM.patch, > TestOnlineRegions.java, hashmap > > > Saw this when debugging something else: > {code} > "regionserver60020" prio=10 tid=0x00007f538c1c0000 nid=0x4c7 runnable > [0x00007f53931da000] > java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hbase.regionserver.Store.getStorefilesIndexSize(Store.java:1380) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:916) > - locked <0x0000000672aa0a00> (a > java.util.concurrent.ConcurrentSkipListMap) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:767) > - locked <0x0000000656f62710> (a java.util.HashMap) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:722) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:591) > at java.lang.Thread.run(Thread.java:662) > "IPC Reader 9 on port 60020" prio=10 tid=0x00007f538c1be000 nid=0x4c6 waiting > for monitor entry [0x00007f53932db000] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getFromOnlineRegions(HRegionServer.java:2295) > - waiting to lock <0x0000000656f62710> (a java.util.HashMap) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getOnlineRegion(HRegionServer.java:2307) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2333) > at > org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.isMetaRegion(HRegionServer.java:379) > at > org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:422) > at > org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:361) > at > org.apache.hadoop.hbase.ipc.HBaseServer.getQosLevel(HBaseServer.java:1126) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:982) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316) > - locked <0x0000000656e60068> (a > org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > ... > "IPC Reader 0 on port 60020" prio=10 tid=0x00007f538c08b000 nid=0x4bd waiting > for monitor entry [0x00007f5393be4000] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getFromOnlineRegions(HRegionServer.java:2295) > - waiting to lock <0x0000000656f62710> (a java.util.HashMap) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getOnlineRegion(HRegionServer.java:2307) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2333) > at > org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.isMetaRegion(HRegionServer.java:379) > at > org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:422) > at > org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:361) > at > org.apache.hadoop.hbase.ipc.HBaseServer.getQosLevel(HBaseServer.java:1126) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:982) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316) > - locked <0x0000000656e635c8> (a > org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > {code} > All the readers are blocked! I have the feeling something much better could > be done. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira