[ 
https://issues.apache.org/jira/browse/HBASE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005438#comment-13005438
 ] 

Jean-Daniel Cryans commented on HBASE-3621:
-------------------------------------------

For example:

{code}
"somenode.prod.twitter.com:60000.timeoutMonitor" daemon prio=10 tid=0x00002aacb8567800 nid=0x772 in Object.wait() [0x0000000045bf1000]
   java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at java.lang.Object.wait(Object.java:485)
  at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
  - locked <0x00002aaab2a10da8> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
  at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
  at $Proxy6.closeRegion(Unknown Source)
  at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
  at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1093)
  at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1672)
  - locked <0x00002aaabf759858> (a java.util.concurrent.ConcurrentSkipListMap)
  at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
...

"main-EventThread" daemon prio=10 tid=0x00002aacb850b000 nid=0x761 waiting for monitor entry [0x00000000455eb000]
   java.lang.Thread.State: BLOCKED (on object monitor)
  at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
  - waiting to lock <0x00002aaabf759858> (a java.util.concurrent.ConcurrentSkipListMap)
  at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
  at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
{code}

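The ZK event thread (main-EventThread) is blocked waiting for the lock on
the ConcurrentSkipListMap (0x00002aaabf759858). That lock is held by the
timeoutMonitor thread, which is itself stuck in a closeRegion RPC to a RS
that doesn't answer. Until that RPC returns, all ZK events get severely
delayed.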
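
One possible shape for a fix, as a rough sketch only: collect the timed-out
regions while holding the lock, release it, and only then do the RPCs. The
class below is hypothetical and is not the actual AssignmentManager code;
TimeoutMonitorSketch, closeRegionRpc, and the region-name-to-timestamp map
are stand-ins invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Consumer;

/** Hypothetical sketch; names and types are assumptions, not HBase code. */
class TimeoutMonitorSketch {
    // Stand-in for regionsInTransition: region name -> time (ms) it entered transition.
    private final ConcurrentSkipListMap<String, Long> regionsInTransition =
        new ConcurrentSkipListMap<>();
    private final long timeoutMs;
    // Stand-in for the blocking closeRegion RPC to a region server.
    private final Consumer<String> closeRegionRpc;

    TimeoutMonitorSketch(long timeoutMs, Consumer<String> closeRegionRpc) {
        this.timeoutMs = timeoutMs;
        this.closeRegionRpc = closeRegionRpc;
    }

    void addRegionInTransition(String region, long startMs) {
        synchronized (regionsInTransition) {
            regionsInTransition.put(region, startMs);
        }
    }

    /** Returns the regions for which a close was sent. */
    List<String> chore(long nowMs) {
        List<String> timedOut = new ArrayList<>();
        // Step 1: hold the lock only long enough to decide what timed out.
        synchronized (regionsInTransition) {
            for (Map.Entry<String, Long> e : regionsInTransition.entrySet()) {
                if (nowMs - e.getValue() >= timeoutMs) {
                    timedOut.add(e.getKey());
                }
            }
        }
        // Step 2: do the slow, possibly hanging RPCs with the lock released,
        // so other threads (e.g. the ZK event thread) can still take it.
        for (String region : timedOut) {
            closeRegionRpc.accept(region);
        }
        return timedOut;
    }

    /** Lets a caller check whether the current thread holds the RIT lock. */
    boolean lockHeldByCurrentThread() {
        return Thread.holdsLock(regionsInTransition);
    }
}
```

With this shape, a region server that never answers still hangs the timeout
thread, but it no longer holds up anyone waiting on the lock. The trade-off
is that the state may have changed between the snapshot and the RPC, so a
real fix would likely need to re-check each region before acting on it.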

> The timeout handler in AssignmentManager does an RPC while holding lock on 
> RIT; a big no-no
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3621
>                 URL: https://issues.apache.org/jira/browse/HBASE-3621
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.90.2
>
>
> J-D found this debugging a failure on Dmitriy's cluster; we're RPC'ing under 
> a synchronized(regionsInTransition).  Fix.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
