Hi Kanak,

The same problem with the zk helix lock has re-appeared.  I found some
clues about the potential bug.  It causes all threads competing for the
same zk helix lock to block.

In my test there are two Java threads blocked while trying to grab the
zk lock (thread 15 and thread 19).

Here are the related logs from just before the threads block (interleaved with my comments):

[INFO  2016-05-18 22:19:54,057 com.hcd.hcdadmin.M1Rebalancer:70]
rebalancer thread 15 before zklock
  =>  T15 enters

[DEBUG 2016-05-18 22:19:54,069 org.apache.helix.lock.zk.WriteLock:193]
Created id: /shawn1/LOCKS/RESOURCE_Pool0/x-72233245264911661-0000000078
   =>  T15 creates its znode; T15 is the smallest, so it owns the lock

[INFO  2016-05-18 22:19:54,071 com.hcd.hcdadmin.M1Rebalancer:70]
rebalancer thread 19 before zklock
   => T19 enters

[INFO  2016-05-18 22:19:54,071 com.hcd.hcdadmin.M1Rebalancer:72]
rebalancer thread 15 start computing for controller host1_admin
   => T15 performs its work

[DEBUG 2016-05-18 22:19:54,080 org.apache.helix.lock.zk.WriteLock:193]
Created id: /shawn1/LOCKS/RESOURCE_Pool0/x-72233245264911662-0000000079
    =>  T19 creates its znode

[DEBUG 2016-05-18 22:19:54,081 org.apache.helix.lock.zk.WriteLock:233]
watching less than me node:
/shawn1/LOCKS/RESOURCE_Pool0/x-72233245264911661-0000000078
    =>  T19 found its predecessor to wait for, which is T15

[WARN  2016-05-18 22:19:54,084 org.apache.helix.lock.zk.WriteLock:239]
Could not find the stats for less than me:
/shawn1/LOCKS/RESOURCE_Pool0/x-72233245264911661-0000000078
    =>  T19 calls zookeeper.exists() to register a watcher on T15's
znode, but T15 has called unlock() and deleted that znode at the same
moment, so exists() returns null and no watcher is registered.  T19
then falls back to the while (id == null) check.  Because T19's id is
not null at this point, the loop exits and T19's
LockZooKeeperOperation.execute() returns false.  T19 blocks in wait(),
hoping somebody else will notify it.  But T19 now holds the smallest
znode, so nobody else can grab the lock and wake T19 up; T19 blocks
forever, and every subsequent caller blocks behind it.

The code that leads to the problem is here:
https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/lock/zk/WriteLock.java#L238
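
For reference, here is a stripped-down, self-contained paraphrase of
the decision each contender makes in execute() (the class and helper
names are hypothetical and the sorting is simplified; the real logic is
in the linked WriteLock.java, which sorts children by sequence number
via ZNodeName):

// Hypothetical, simplified paraphrase of one pass of LockZooKeeperOperation.execute().
// Names and structure are illustrative only; see the linked WriteLock.java for the real code.
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

class LockAttemptSketch {
  private final ZooKeeper zookeeper;
  private final String dir;  // e.g. "/shawn1/LOCKS/RESOURCE_Pool0"
  private final String id;   // full path of our own sequential znode, already created

  LockAttemptSketch(ZooKeeper zookeeper, String dir, String id) {
    this.zookeeper = zookeeper;
    this.dir = dir;
    this.id = id;
  }

  // Returns true if we own the lock, false if the caller should wait() for a notification.
  boolean tryAcquire() throws KeeperException, InterruptedException {
    List<String> names = zookeeper.getChildren(dir, false);
    SortedSet<String> sorted = new TreeSet<>();
    for (String name : names) {
      sorted.add(dir + "/" + name);
    }
    SortedSet<String> lessThanMe = sorted.headSet(id);
    if (lessThanMe.isEmpty()) {
      return true;                       // we hold the smallest znode: we own the lock
    }
    String predecessor = lessThanMe.last();
    // Register a watcher on the predecessor so that its deletion wakes us up.
    Stat stat = zookeeper.exists(predecessor, event -> { /* notify the waiting thread */ });
    if (stat != null) {
      return false;                      // watcher is in place, safe to wait()
    }
    // Race window: the predecessor was deleted between getChildren() and exists().
    // The current WriteLock code only logs a warning here, and execute() still ends
    // up returning false, so the caller wait()s even though no watcher was registered.
    return false;
  }
}

When tryAcquire() returns false from that last branch, nothing is left
to notify the caller, which is exactly the state T19 ends up in above.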

One possible fix is to simply set id to null at line 240 and let the
while() loop retry:
https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/lock/zk/WriteLock.java#L240
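
In the shape of the recipe's do/while loop, that change would look
roughly like this (a paraphrased, untested fragment; once id is reset
to null, the next loop iteration re-finds our znode and re-evaluates
who owns the lock):

// Paraphrased fragment around the linked lines of WriteLock.java (untested sketch of the fix):
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;   // watcher registered on the predecessor, safe to wait()
} else {
    LOG.warn("Could not find the stats for less than me: " + lastChildName.getName());
    id = null;              // proposed fix: force the do { ... } while (id == null) loop
                            // to re-read the children and re-evaluate lock ownership
}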


[INFO  2016-05-18 22:19:54,092 com.hcd.hcdadmin.M1Rebalancer:125]
rebalancer thread 15 released zklock and returned
  =>  T15 deleted its znode a short while ago and now returns from this method


[INFO  2016-05-18 22:19:54,179 com.hcd.hcdadmin.M1Rebalancer:70]
rebalancer thread 15 before zklock
  =>  T15 calls this method again

[DEBUG 2016-05-18 22:19:54,191 org.apache.helix.lock.zk.WriteLock:193]
Created id: /shawn1/LOCKS/RESOURCE_Pool0/x-72233245264911663-0000000080
  => T15 creates znode

[DEBUG 2016-05-18 22:19:54,193 org.apache.helix.lock.zk.WriteLock:233]
watching less than me node:
/shawn1/LOCKS/RESOURCE_Pool0/x-72233245264911662-0000000079

  => T15 finds T19's znode ahead of it (T19 is now the smallest), so
it waits for T19.  But nobody will ever wake up T19, so T15 is blocked
as well.




Any comments appreciated. Thanks.


-Neutronsharc



On Sat, May 14, 2016 at 5:20 PM, Neutron sharc <neutronsh...@gmail.com> wrote:
> We increased the max connections allowed per client on the zk server side.
> The problem is gone now.
>
> On Tue, May 10, 2016 at 2:50 PM, Neutron sharc <neutronsh...@gmail.com> wrote:
>> Hi Kanak,  thanks for reply.
>>
>> The problem is gone if we set a constraint of 1 on "STATE_TRANSITION"
>> for the resource.  If we allow multiple state transitions to be
>> executed in the resource,  then this zklock problem occurs.
>>
>> btw, we run multiple participants in the same JVM in our test.  In
>> other words, there are multiple Java threads in the same JVM competing
>> for the zklock.
>>
>> We haven't profiled the ZKHelixLock._listener.lockAcquired() since we
>> bypassed this problem using constraint.  Will revisit it later.
>>
>>
>>
>>
>> On Mon, May 9, 2016 at 8:28 PM, Kanak Biscuitwala <kana...@hotmail.com> 
>> wrote:
>>> Hi,
>>>
>>> ZkHelixLock is a thin wrapper around the ZooKeeper WriteLock recipe (which
>>> was last changed over 5 years ago). We haven't extensively tested it in
>>> production, but we haven't seen it fail to return as described.
>>>
>>> Do you know if ZKHelixLock._listener.lockAcquired() is ever called?
>>>
>>> Feel free to examine the code here: 
>>> https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/lock/zk/ZKHelixLock.java
>>>
>>>> From: neutronsh...@gmail.com
>>>> Date: Mon, 9 May 2016 14:26:43 -0700
>>>> Subject: calling ZKHelixLock from state machine transition
>>>> To: dev@helix.apache.org
>>>>
>>>> Hi Helix team,
>>>>
>>>> We observed an issue at state machine transition handle:
>>>>
>>>> // statemodel.java:
>>>>
>>>> public void offlineToSlave(Message message, NotificationContext context) {
>>>>
>>>>   // do work to start a local shard
>>>>
>>>>   // we want to save the new shard info to resource config
>>>>
>>>>
>>>>   ZKHelixLock zklock = new ZKHelixLock(clusterId, resource, zkclient);
>>>>   try {
>>>>     zklock.lock();    // ==> will be blocked here
>>>>
>>>>     ZNRecord record = zkclient.readData(scope.getZkPath(), true);
>>>>     // ... update the record fields ...
>>>>     zkclient.writeData(scope.getZkPath(), record);
>>>>   } finally {
>>>>     zklock.unlock();
>>>>   }
>>>> }
>>>>
>>>> After several invocations of this method,  zklock.lock() doesn't return
>>>> (so the lock is never acquired), and the state machine threads become
>>>> blocked.
>>>>
>>>> At zk path "<cluster>/LOCKS/RESOURCE_resource"  I see several znodes
>>>> there, representing outstanding lock requests.
>>>>
>>>> Is there any special care we should take when using the zk lock?  Thanks.
>>>>
>>>>
>>>> -neutronsharc
>>>
