[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-4504:
--------------------------------------
    Labels: pull-request-available  (was: )

> ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
> ----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4504
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Mohammad Arshad
>            Assignee: Mohammad Arshad
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem and Analysis:*
> After integrating ZooKeeper 3.6.3, we observed a deadlock in HDFS HA 
> functionality, as shown in the thread dumps below.
> {code:java}
> "main-EventThread" #33 daemon prio=5 os_prio=0 tid=0x00007f9c017f1000 nid=0x101b waiting for monitor entry [0x00007f9bda8a6000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:603)
>       - waiting to lock <0x00000000c17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector)
>       at org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.process(ActiveStandbyElector.java:1193)
>       at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:626)
>       at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:582)
> {code}
> {code:java}
> "main" #1 prio=5 os_prio=0 tid=0x00007f9c00060000 nid=0xea3 waiting on condition [0x00007f9c06404000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000000c1b383c8> (a java.util.concurrent.Semaphore$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1306)
>       at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
>       at org.apache.zookeeper.ZKUtil.deleteInBatch(ZKUtil.java:122)
>       at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:64)
>       at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:76)
>       at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:386)
>       at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:383)
>       at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1103)
>       at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095)
>       at org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:383)
>       - locked <0x00000000c17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector)
>       at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:290)
>       at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:227)
>       at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:66)
>       at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186)
>       at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:360)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1741)
>       at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:498)
>       at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182)
>       at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:220)
> {code}
> org.apache.hadoop.ha.ActiveStandbyElector#clearParentZNode is synchronized 
> on the ActiveStandbyElector instance and calls 
> ZKUtil.deleteRecursive(zk, pathRoot).
> ZKUtil.deleteRecursive now uses the async API and waits for its callbacks, 
> which are delivered on the event thread. That thread is blocked in 
> ActiveStandbyElector#processWatchEvent, which is synchronized on the same 
> ActiveStandbyElector instance.
> The result is a deadlock: clearParentZNode() is waiting for the event thread 
> to deliver the callbacks, and processWatchEvent() on the event thread is 
> waiting for clearParentZNode() to release the lock.
>  
> *Why did this problem not occur with earlier versions (3.5.x)?*
> In earlier ZooKeeper versions, ZKUtil.deleteRecursive used the sync 
> ZooKeeper API internally, so no callback (processWatchEvent) was involved in 
> this scenario.
> *Proposed Fix:*
> There are two approaches to fixing this problem:
> 1. Fix the problem in HDFS by modifying the HDFS code to avoid the 
> deadlock. However, similar bugs may appear in other projects.
> 2. Fix the problem in ZooKeeper: restore the old behavior of the existing 
> API (use the sync API to delete the znodes) and provide a new overloaded API 
> with the new behavior (use the async API to delete the znodes).
> I propose fixing the problem with the second approach.
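
The shape of the second approach can be illustrated with a small sketch. This is not the ZooKeeper patch and does not use the ZKUtil signatures: DeleteApiSketch, its in-memory node set, and both deleteRecursive overloads are hypothetical names chosen for illustration. The one-argument method keeps the blocking behavior, and callers opt in to asynchronous deletion through the overload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.function.Consumer;

public class DeleteApiSketch {
    // Hypothetical in-memory stand-in for a znode tree; paths are plain strings.
    private final ConcurrentSkipListSet<String> nodes = new ConcurrentSkipListSet<>();

    public void create(String path) { nodes.add(path); }

    public boolean exists(String path) { return nodes.contains(path); }

    private static int depth(String path) {
        return (int) path.chars().filter(c -> c == '/').count();
    }

    // Collects root and its descendants, deepest first, so children go before parents.
    private List<String> subtreeDeepestFirst(String root) {
        List<String> out = new ArrayList<>();
        for (String p : nodes) {
            if (p.equals(root) || p.startsWith(root + "/")) {
                out.add(p);
            }
        }
        out.sort((a, b) -> Integer.compare(depth(b), depth(a)));
        return out;
    }

    // Existing API, old behavior: all deletes finish on the calling thread, no callbacks.
    public void deleteRecursive(String root) {
        for (String p : subtreeDeepestFirst(root)) {
            nodes.remove(p);
        }
    }

    // New overload, opt-in: deletes run on another thread and report each path via callback.
    public CompletableFuture<Void> deleteRecursive(String root, Consumer<String> onDeleted) {
        List<String> targets = subtreeDeepestFirst(root);
        return CompletableFuture.runAsync(() -> {
            for (String p : targets) {
                if (nodes.remove(p)) {
                    onDeleted.accept(p);
                }
            }
        });
    }

    public static void main(String[] args) {
        DeleteApiSketch tree = new DeleteApiSketch();
        tree.create("/ha");
        tree.create("/ha/lock");
        tree.deleteRecursive("/ha"); // blocking call, safe to make while holding a lock
        System.out.println("/ha exists: " + tree.exists("/ha"));
    }
}
```

With this split, a caller such as a synchronized method that cannot tolerate callbacks keeps using the one-argument form, and only callers that can safely wait on another thread choose the overload.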



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
