[ https://issues.apache.org/jira/browse/HBASE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005438#comment-13005438 ]
Jean-Daniel Cryans commented on HBASE-3621:
-------------------------------------------

For example:

{code}
"somenode.prod.twitter.com:60000.timeoutMonitor" daemon prio=10 tid=0x00002aacb8567800 nid=0x772 in Object.wait() [0x0000000045bf1000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
	- locked <0x00002aaab2a10da8> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
	at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
	at $Proxy6.closeRegion(Unknown Source)
	at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
	at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1093)
	at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1672)
	- locked <0x00002aaabf759858> (a java.util.concurrent.ConcurrentSkipListMap)
	at org.apache.hadoop.hbase.Chore.run(Chore.java:66
...
"main-EventThread" daemon prio=10 tid=0x00002aacb850b000 nid=0x761 waiting for monitor entry [0x00000000455eb000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
	- waiting to lock <0x00002aaabf759858> (a java.util.concurrent.ConcurrentSkipListMap)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
{code}

The ZK event thread is blocked by that other thread that talks to a RS that doesn't answer. All ZK events get severely delayed.
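
To illustrate the general shape of a fix, here is a minimal sketch, not the actual HBase patch: collect the timed-out regions while holding the lock, release it, and only then do the RPC. The names RegionCloser, TimeoutMonitorSketch, addRegionInTransition and holdsRitLock are hypothetical stand-ins invented for this example; only regionsInTransition, chore, nodeDataChanged and closeRegion echo names from the trace above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical stand-in for the region server RPC; not HBase's real interface. */
interface RegionCloser {
    void closeRegion(String regionName) throws Exception;
}

/** Sketch of a timeout monitor that never does an RPC while holding the RIT lock. */
class TimeoutMonitorSketch {
    // Region name -> time (ms) the transition started. Used as the monitor,
    // like the ConcurrentSkipListMap in the thread dump.
    private final ConcurrentSkipListMap<String, Long> regionsInTransition =
        new ConcurrentSkipListMap<>();
    private final RegionCloser closer;
    private final long timeoutMs;

    TimeoutMonitorSketch(RegionCloser closer, long timeoutMs) {
        this.closer = closer;
        this.timeoutMs = timeoutMs;
    }

    void addRegionInTransition(String regionName, long startMs) {
        synchronized (regionsInTransition) {
            regionsInTransition.put(regionName, startMs);
        }
    }

    /** Stand-in for the ZK callback; it needs the same monitor as chore(). */
    void nodeDataChanged(String regionName, long nowMs) {
        synchronized (regionsInTransition) {
            regionsInTransition.put(regionName, nowMs);
        }
    }

    /** Exposed only so the sketch can be checked; true if the caller holds the RIT lock. */
    boolean holdsRitLock() {
        return Thread.holdsLock(regionsInTransition);
    }

    /** Returns the regions for which a close was attempted. */
    List<String> chore(long nowMs) {
        List<String> timedOut = new ArrayList<>();
        // Step 1: hold the lock only long enough to snapshot what timed out.
        synchronized (regionsInTransition) {
            for (Map.Entry<String, Long> e : regionsInTransition.entrySet()) {
                if (nowMs - e.getValue() > timeoutMs) {
                    timedOut.add(e.getKey());
                }
            }
        }
        // Step 2: lock released; a region server that doesn't answer now stalls
        // only this thread, not callers of nodeDataChanged().
        for (String regionName : timedOut) {
            try {
                closer.closeRegion(regionName);
            } catch (Exception ex) {
                // A real implementation would log and retry on the next run.
            }
        }
        return timedOut;
    }
}
```

The trade-off is that the snapshot can be stale by the time the RPC goes out, so the code acting on it would have to tolerate a region whose state changed in between.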
> The timeout handler in AssignmentManager does an RPC while holding lock on RIT; a big no-no
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3621
>                 URL: https://issues.apache.org/jira/browse/HBASE-3621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.90.2
>
>
> J-D found this debugging a failure on Dmitriy's cluster; we're RPC'ing under a synchronized(regionsInTransition). Fix.