[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679247#comment-16679247
 ] 

Hudson commented on ZOOKEEPER-3162:
-----------------------------------

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #261 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/261/])
ZOOKEEPER-3162: Broken lock semantics in C client lock-recipe. (andor: rev 
477fa0724fa66cc41d14e8a974ab4ac2a1b68433)
* (edit) zookeeper-recipes/zookeeper-recipes-queue/src/main/c/configure.ac
* (edit) zookeeper-recipes/zookeeper-recipes-lock/src/main/c/src/zoo_lock.c
* (edit) zookeeper-recipes/zookeeper-recipes-lock/build.xml
* (edit) zookeeper-recipes/zookeeper-recipes-lock/src/main/c/configure.ac
* (edit) zookeeper-recipes/zookeeper-recipes-lock/src/main/c/tests/TestClient.cc
* (edit) zookeeper-recipes/zookeeper-recipes-queue/build.xml


> Broken lock semantics in C client lock-recipe
> ---------------------------------------------
>
>                 Key: ZOOKEEPER-3162
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3162
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.4.13
>            Reporter: Andrea Reale
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> As reported (but never fixed) in the past by ZOOKEEPER-2409, ZOOKEEPER-2038 
> and (partly) ZOOKEEPER-2878, the C client lock-recipe implementation is 
> broken.
> I identified three issues.
> The main one (as also reported in the aforementioned reports) is that the 
> logic that goes through the lock waiting list is broken. child_floor uses 
> strcmp and compares the full node name (i.e., sessionID-sequence) rather than 
> only comparing the sequence number. This makes it possible for two different 
> clients to hold the lock at the same time: assume two clients, one associated 
> with session A, the other with session B, with A < B lexicographically. Now 
> assume that at some point a thread in B holds a lock and a thread in A tries 
> to acquire the same lock. A's node name sorts before B's under strcmp, 
> regardless of their sequence numbers, so A will manage to get the lock 
> because of the wrong comparison function, and now two clients hold the lock.
> The second issue is a possible deadlock inside zkr_lock_operation. 
> zkr_lock_operation is always called with the mutex associated with the 
> client lock held. In some cases, zkr_lock_operation may decide to give up 
> locking and call zkr_lock_unlock to release the lock. When this happens, 
> zkr_lock_unlock tries to acquire the same pthread mutex again, which leads 
> to a deadlock.
> The third issue relates to the return value of zkr_lock_lock. According to 
> the API docs, the function returns 0 when there are no errors. It is then up 
> to the invoker to check whether the lock is held by calling zkr_lock_isowner. 
> However, when there is no error, the implementation returns the result of 
> zkr_lock_isowner. This is wrong because it becomes impossible to distinguish 
> an error condition from a success without ownership. Instead, the API (as 
> described in the docs) should always return 0 when no errors occur.
> Shortly I will add the link to a PR fixing the issues.
>  
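
The first issue in the quoted description (strcmp over the full node name) can be illustrated with a minimal sketch. It is not the code from the committed fix. It assumes node names end in "-<decimal sequence>", and the helper names node_sequence and cmp_by_sequence are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the numeric sequence suffix of a lock node
 * name. Assumes the name ends in "-<decimal digits>". */
static long node_sequence(const char *name)
{
    const char *dash = strrchr(name, '-');
    assert(dash != NULL);
    return strtol(dash + 1, NULL, 10);
}

/* Compare two node names by sequence number only, ignoring the session
 * part of the name. Returns <0, 0 or >0, like strcmp. */
static int cmp_by_sequence(const char *a, const char *b)
{
    long sa = node_sequence(a);
    long sb = node_sequence(b);
    return (sa > sb) - (sa < sb);
}
```

With the illustrative names "x-A-0000000002" and "x-B-0000000001", strcmp orders the A node first even though B was created earlier, while cmp_by_sequence orders B first.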

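The self-deadlock described as the second issue is commonly avoided by splitting the unlock routine into a public wrapper that takes the mutex and an internal helper that expects the mutex to be held already. The sketch below shows that pattern only. All demo_* names are hypothetical stand-ins, not the recipe's real functions, and the committed fix may be structured differently.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the recipe's lock handle. */
struct demo_lock {
    pthread_mutex_t mutex;
    int owned;              /* 1 while we believe we hold the lock */
};

/* Internal helper: the caller must already hold l->mutex. */
static int demo_unlock_nolock(struct demo_lock *l)
{
    l->owned = 0;
    return 0;
}

/* Public entry point: takes the mutex, then delegates to the helper. */
static int demo_unlock(struct demo_lock *l)
{
    pthread_mutex_lock(&l->mutex);
    int rc = demo_unlock_nolock(l);
    pthread_mutex_unlock(&l->mutex);
    return rc;
}

/* Runs with l->mutex held by the caller, so the give-up path must use
 * the _nolock helper instead of demo_unlock(). */
static int demo_lock_operation(struct demo_lock *l, int give_up)
{
    if (give_up)
        return demo_unlock_nolock(l);
    l->owned = 1;
    return 0;
}

/* Public lock call: holds the mutex around the whole operation. */
static int demo_lock(struct demo_lock *l, int give_up)
{
    pthread_mutex_lock(&l->mutex);
    int rc = demo_lock_operation(l, give_up);
    pthread_mutex_unlock(&l->mutex);
    return rc;
}
```

Calling demo_unlock() from inside demo_lock_operation() would try to lock a non-recursive mutex that the same thread already holds, which is the situation the report describes.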

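The calling convention the third issue asks for (0 on success, ownership checked separately) could be consumed roughly as follows. The demo_* functions and the demo_handle struct are hypothetical stand-ins that only mimic the documented contract. They are not the real zkr_lock_lock or zkr_lock_isowner.

```c
#include <assert.h>

/* Hypothetical handle: 'fail' simulates an error, 'owner' simulates
 * whether this client currently owns the lock. */
struct demo_handle {
    int fail;
    int owner;
};

/* Stand-in for the lock call: 0 on success, nonzero on error.
 * It never reports ownership through its return value. */
static int demo_lock_lock(struct demo_handle *h)
{
    return h->fail ? -1 : 0;
}

/* Stand-in for the ownership query: nonzero if the lock is held. */
static int demo_lock_isowner(const struct demo_handle *h)
{
    return h->owner;
}

enum demo_outcome { DEMO_ERROR, DEMO_WAITING, DEMO_OWNER };

/* The caller can now tell all three outcomes apart. */
static enum demo_outcome demo_try_acquire(struct demo_handle *h)
{
    if (demo_lock_lock(h) != 0)
        return DEMO_ERROR;
    return demo_lock_isowner(h) ? DEMO_OWNER : DEMO_WAITING;
}
```

If the lock call returned the ownership flag directly, the DEMO_ERROR and DEMO_WAITING cases could collapse into the same return value.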

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
