GitHub user jiajunwang opened a pull request:

    https://github.com/apache/helix/pull/103

    Fix disconnected zkConnection issue.

    When zkConnection holds an invalid (null) ZooKeeper object, callers hit a
    NullPointerException (NPE).
    Affected Helix components are ZKHelixManager, ZkHelixPropertyStore, and
    other ZK-related classes.
    To fix this issue:
    1. Override retryUntilConnected() in the Helix ZkClient to check the
    connection before triggering callbacks. This prevents the NPE, but the
    user still needs to catch IllegalStateException and re-create the ZkClient
    if necessary.
    2. For ZKHelixManager, implement handleSessionEstablishmentError to retry
    establishing a new connection. If the retry fails, Helix invokes a
    user-registered state handler.
    3. Add a unit test that simulates a connection error and checks whether the
    error handler can recover the connection or trigger the user-registered
    callback.
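
    The caller-side handling mentioned in item 1 (catch IllegalStateException,
    then re-create the client) might look roughly like the sketch below. It is
    only an illustration: Client, FakeClient, and readWithRecreate are
    hypothetical names invented here and are not Helix or ZkClient APIs; the
    only assumption taken from the description is that an unusable connection
    surfaces as an IllegalStateException.

```java
import java.util.function.Supplier;

public class ReconnectSketch {

    /** Hypothetical stand-in for a ZooKeeper-backed client; NOT a Helix type. */
    public interface Client {
        String read(String path);
        void close();
    }

    /** Fake client for the demo: a "broken" instance fails with IllegalStateException. */
    public static final class FakeClient implements Client {
        private final boolean broken;

        public FakeClient(boolean broken) {
            this.broken = broken;
        }

        @Override
        public String read(String path) {
            if (broken) {
                throw new IllegalStateException("connection is not usable");
            }
            return "value-of-" + path;
        }

        @Override
        public void close() {
            // Nothing to release in the fake.
        }
    }

    /**
     * Tries the read with a fresh client per attempt. When a client reports an
     * unusable connection, it is discarded and a new one is created, up to
     * maxAttempts times. The last failure is rethrown if every attempt fails.
     */
    public static String readWithRecreate(Supplier<Client> factory, String path, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IllegalStateException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Client client = factory.get();
            try {
                return client.read(path);
            } catch (IllegalStateException e) {
                last = e; // remember the failure and fall through to re-create
            } finally {
                // Closed on every path only to keep the sketch leak-free;
                // a real caller would keep and reuse a healthy client.
                client.close();
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] created = {0};
        // The first client produced by the factory is broken, the second is healthy.
        String value = readWithRecreate(() -> new FakeClient(created[0]++ == 0), "/demo", 3);
        System.out.println(value + " (clients created: " + created[0] + ")");
    }
}
```

    Bounding the number of attempts keeps a permanently unreachable ZooKeeper
    from turning into an endless re-create loop; the caller sees the last
    IllegalStateException instead.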

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiajunwang/helix zkFix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #103
    
----
commit 3064c40556afa5afa71c102708c519ffa2ba7b8c
Author: Jiajun Wang <jjw...@linkedin.com>
Date:   2017-07-25T23:50:22Z

    Fix disconnected zkConnection issue.
    
    zkConnection may be using an invalid (null) ZooKeeper object, in which
    case related calls hit a NullPointerException (NPE).
    Affected Helix components are ZKHelixManager, ZkHelixPropertyStore, and
    other ZK-related classes.
    To fix this issue:
    1. Override retryUntilConnected() in the Helix ZkClient to check the
    connection before triggering callbacks. This prevents the NPE, but the
    user still needs to catch IllegalStateException and re-create the ZkClient
    if necessary.
    2. For ZKHelixManager, implement handleSessionEstablishmentError to retry
    establishing a new connection. If the retry fails, Helix invokes a
    user-registered state handler.
    3. Add a unit test that simulates a connection error and checks whether the
    error handler can recover the connection or trigger the user-registered
    callback.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
