[ https://issues.apache.org/jira/browse/HELIX-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dafu resolved HELIX-195.
------------------------

    Resolution: Fixed

rb:
https://reviews.apache.org/r/13345/

commit:
https://git-wip-us.apache.org/repos/asf?p=incubator-helix.git;a=commitdiff;h=6d5397990d9629009e304ae6eea4a046d5429871
                
> Race condition between FINALIZE callbacks and Zk Callbacks
> ----------------------------------------------------------
>
>                 Key: HELIX-195
>                 URL: https://issues.apache.org/jira/browse/HELIX-195
>             Project: Apache Helix
>          Issue Type: Sub-task
>            Reporter: dafu
>            Assignee: dafu
>
> FINALIZE callbacks are sent asynchronously via CallbackHandler#reset(), while
> Zk callbacks are queued in ZkEventThread. It is therefore possible to handle a
> FINALIZE callback before all pending Zk callbacks have been cleaned up. This
> creates race conditions. For example, on zk session expiry, when a
> GenericController gets a FINALIZE callback, it cleans up all listeners using
> ZkClient#unsubscribe(), but Zk callbacks left over in ZkEventThread arrive
> later and re-subscribe all listeners, leaking zk watchers.
> This was observed by setting up two controllers and expiring the leader (by
> simulating a long gc). The second controller takes the leadership and adds all
> listeners, but when the former leader recovers from gc, it gets the leftover Zk
> callbacks and re-subscribes the live-instance listener. It therefore reacts to
> all live-instance changes even though it has not acquired the leadership.
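
The ordering problem described above can be reproduced in a small, single-threaded
simulation. This is only an illustrative sketch, not Helix code and not the change
made in the linked commit: the class FinalizeRaceSketch, the Handler type, the
guardAfterFinalize flag, and the "/LIVEINSTANCES" path are all hypothetical names.
A plain queue stands in for ZkEventThread, and a set of paths stands in for the
watches installed through ZkClient. The optional guard shows one possible way to
drop leftover callbacks, assuming a handler can remember that it was finalized.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

public class FinalizeRaceSketch {

    // Hypothetical stand-in for a callback handler and the watches it holds.
    static final class Handler {
        final Set<String> watches = new HashSet<>();
        final boolean guardAfterFinalize;
        boolean finalized = false;

        Handler(boolean guardAfterFinalize) {
            this.guardAfterFinalize = guardAfterFinalize;
        }

        // Simulated Zk callback: handling a change re-installs the watch.
        void onZkCallback(String path) {
            if (guardAfterFinalize && finalized) {
                return; // assumption: leftover events are dropped after FINALIZE
            }
            watches.add(path);
        }

        // Simulated FINALIZE callback: remove every watch.
        void onFinalize() {
            finalized = true;
            watches.clear();
        }
    }

    // Replays the problematic ordering and returns the watches left afterwards.
    static Set<String> run(boolean guard) {
        Handler handler = new Handler(guard);
        Queue<Runnable> eventQueue = new ArrayDeque<>(); // stands in for ZkEventThread

        handler.watches.add("/LIVEINSTANCES");
        // A Zk callback is still waiting in the queue when the session expires.
        eventQueue.add(() -> handler.onZkCallback("/LIVEINSTANCES"));

        // FINALIZE bypasses the queue, so it is handled first.
        handler.onFinalize();

        // The leftover callback is delivered afterwards.
        while (!eventQueue.isEmpty()) {
            eventQueue.poll().run();
        }
        return handler.watches;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + run(false)); // one leaked watch
        System.out.println("guarded:   " + run(true));  // no watches left
    }
}
```

In the unguarded run the handler ends up holding a watch on "/LIVEINSTANCES" again
even though FINALIZE already cleared it, which mirrors the leak reported here. In
the guarded run the leftover callback is ignored and no watch remains.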

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
