[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517245#comment-17517245
 ] 

Mate Szalay-Beko edited comment on ZOOKEEPER-3652 at 4/5/22 6:33 AM:
---------------------------------------------------------------------

Great catch, thanks for the fix!

Is there a reason why not to backport this to branch-3.5? (it is an important 
bugfix and 3.5 is not EoL yet)


was (Author: symat):
Great catch, thanks for the fix!

Is there a reason why not to backport this to branch-3.5? (it is an important 
bugfix and c.5 is not EoL yet)

> Improper synchronization in ClientCnxn
> --------------------------------------
>
>                 Key: ZOOKEEPER-3652
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3652
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.5.6
>            Reporter: Sylvain Wallez
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> ZOOKEEPER-2111 introduced {{synchronized(state)}} statements in 
> {{ClientCnxn}} and {{ClientCnxn.SendThread}} to coordinate insertion in 
> {{outgoingQueue}} and draining it when the client connection isn't alive.
> There are several issues with this approach:
>  - the value of the {{state}} field is not stable, meaning we don't always 
> synchronize on the same object.
>  - the {{state}} field is an enum value, which are global objects. So in an 
> application with several ZooKeeper clients connected to different servers, 
> this causes some contention between clients.
> An easy fix is change those {{synchronized(state)}} statements to 
> {{synchronized(outgoingQueue)}} since it is local to each client and is what 
> we want to coordinate.
> I'll be happy to prepare a PR with the above change if this is deemed to be 
> the correct way to fix it.
>  
> Another issue that makes contention worse is 
> {{ClientCnxnSocketNIO.cleanup()}} that is called from within the above 
> synchronized block and contains {{Thread.sleep(100)}}. Why is this sleep 
> statement needed, and can we remove it?
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to