[ https://issues.apache.org/jira/browse/HBASE-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907462#action_12907462 ]
Patrick Hunt commented on HBASE-2966: ------------------------------------- btw, the large number of event threads in the stack dump is related to this issue, which is fixed in 3.3.2 (not yet released): https://issues.apache.org/jira/browse/ZOOKEEPER-795 however that's unrelated to this specific issue. > HBase client stuck on > org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1278) holding > regionLockObject lock > ----------------------------------------------------------------------------------------------------------------------- > > Key: HBASE-2966 > URL: https://issues.apache.org/jira/browse/HBASE-2966 > Project: HBase > Issue Type: Bug > Reporter: Kannan Muthukkaruppan > Attachments: stack.txt > > > We noticed in one case the HBase client program got stuck on > Zookeeper.exists() call. > > One of the threads was stuck here on the ZK call while holding an HBase level > lock (regionLockObject in locateRegionInMeta()). > {code} > "thrift-0-thread-8" prio=10 tid=0x00007f189ca4c000 nid=0x550f in > Object.wait() [0x0000000044241000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:485) > at > org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1278) > - locked <0x00007f1903a0c280> (a > org.apache.zookeeper.ClientCnxn$Packet) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:804) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.getRSDirectoryCount(ZooKeeperWrapper.java:765) > at > org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:173) > at > org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147) > at > org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:124) > at > org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:89) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.prefetchRegionCache(HConnectionManager.java:734) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:785) > - locked <0x00007f190d868848> (a java.lang.Object) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:679) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:646) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:472) > at > org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1147) > at org.apache.hadoop.hbase.client.HTable.get(HTable.java:503) > {code} > The remaining other threads are all waiting on the regionLockObject lock > (held by the above thread) with stacks like: > > {code} > thrift-0-thread-7" prio=10 tid=0x00007f189ca4a800 nid=0x550e waiting for > monitor entry [0x0000000044141000] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:783) > - waiting to lock <0x00007f190d868848> (a java.lang.Object) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:679) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:646) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:472) > at > org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57) > at > org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1147) > at org.apache.hadoop.hbase.client.HTable.get(HTable.java:503) > {code} > Any ideas? > > Meanwhile, I'll look into the ZK logs from the relevant time some more and > get back if I have more information. > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.