Hi,

ZKHelixLock is a thin wrapper around the ZooKeeper WriteLock recipe (which was 
last changed over 5 years ago). Although we haven't tested it extensively in 
production, we haven't seen it fail to return as described.

Do you know if ZKHelixLock._listener.lockAcquired() is ever called?

Feel free to examine the code here: 
https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/lock/zk/ZKHelixLock.java
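
For context on the znodes you see under LOCKS, here is a rough sketch of how 
a ZooKeeper-style lock recipe orders its waiters. It is not the Helix or 
ZooKeeper code. It only assumes the general recipe behavior: each requester 
creates a sequential znode, the lowest sequence number owns the lock, and 
every other requester waits on the znode just ahead of it. The class name, 
method names, and znode names below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only: models the ordering rule of a
// ZooKeeper-style lock queue on plain strings, with no ZooKeeper connection.
class LockQueueSketch {

    // ZooKeeper appends a 10-digit, zero-padded counter to sequential znode names.
    static long sequenceOf(String znodeName) {
        return Long.parseLong(znodeName.substring(znodeName.length() - 10));
    }

    // Returns a copy of the children ordered by their sequence suffix.
    static List<String> sortedBySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(LockQueueSketch::sequenceOf));
        return sorted;
    }

    // The requester with the lowest sequence number is treated as the lock owner.
    static Optional<String> owner(List<String> children) {
        List<String> sorted = sortedBySequence(children);
        return sorted.isEmpty() ? Optional.empty() : Optional.of(sorted.get(0));
    }

    // Every other requester waits for the znode immediately ahead of it to go away.
    static Optional<String> predecessorOf(List<String> children, String me) {
        List<String> sorted = sortedBySequence(children);
        int index = sorted.indexOf(me);
        return index > 0 ? Optional.of(sorted.get(index - 1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up child names, standing in for what a listing of the lock path might return.
        List<String> children = Arrays.asList(
                "x-222-0000000007", "x-111-0000000005", "x-333-0000000006");
        System.out.println("owner: " + owner(children).orElse("none"));
        System.out.println("x-222-0000000007 waits on: "
                + predecessorOf(children, "x-222-0000000007").orElse("nobody"));
    }
}
```

If the ordering works this way in your setup, one possible explanation for 
the symptom is that the lowest-numbered znode is never deleted, in which case 
every later lock() call would stay queued behind it.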

> From: neutronsh...@gmail.com
> Date: Mon, 9 May 2016 14:26:43 -0700
> Subject: calling ZKHelixLock from state machine transition
> To: dev@helix.apache.org
> 
> Hi Helix team,
> 
> We observed an issue in a state machine transition handler:
> 
> // statemodel.java:
> 
> public void offlineToSlave(Message message, NotificationContext context) {
> 
>   // do work to start a local shard
> 
>   // we want to save the new shard info to resource config
> 
> 
>   ZKHelixLock zklock = new ZKHelixLock(clusterId, resource, zkclient);
>   try {
>     zklock.lock();    // ==> will be blocked here
> 
>     ZNRecord record = zkclient.readData(scope.getZkPath(), true);
>     // update record fields
>     zkclient.writeData(scope.getZkPath(), record);
>   } finally {
>     zklock.unlock();
>   }
> }
> 
> After several invocations of this method, zklock.lock() doesn't
> return (so the lock is not acquired), and the state machine threads
> become blocked.
> 
> At zk path "<cluster>/LOCKS/RESOURCE_resource" I see several znodes,
> representing outstanding lock requests.
> 
> Is there anything special we should be aware of when using the zk lock? Thanks.
> 
> 
> -neutronsharc
                                          
