[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541588#comment-14541588
 ] 

Rakesh R commented on ZOOKEEPER-2188:
-------------------------------------

[~haitao-tony], IIUC {{zkclient#isAlive}} only reports whether the client
itself is dead. In your case the cluster is down, so every attempt to open a
socket connection to a server fails and the client keeps retrying
indefinitely. The client is therefore still alive, and it will establish a
connection as soon as the ZK quorum becomes available. There is a way out of
this retry loop, but it has to be implemented in an application thread: that
thread would apply its own connection timeout, based on either the
{{zk.getState().isConnected()}} status or a connection watcher event of
{{Event#SyncConnected}}/{{Event#SaslAuthenticated}}. Does this satisfy your
case?
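
For illustration, here is a minimal sketch of the application-side timeout
described above. It is not part of ZooKeeper: the class name
{{ConnectTimeout}}, the method {{awaitConnected}} and its parameters are
hypothetical, and the connection check is passed in as a plain
{{BooleanSupplier}} so the helper depends only on the JDK.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not a ZooKeeper API: polls a connection probe from an
// application thread and gives up once a deadline has passed.
public class ConnectTimeout {

    /**
     * Polls isConnected until it returns true or timeoutMs elapses.
     * Returns true if connected in time, false on timeout or interruption.
     */
    public static boolean awaitConnected(BooleanSupplier isConnected,
                                         long timeoutMs, long pollMs) {
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (isConnected.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // timed out; the caller decides what to do next
            }
            try {
                // Sleep for the poll interval, but never past the deadline.
                long remainingMs = Math.max(1L, remainingNanos / 1_000_000L);
                Thread.sleep(Math.min(pollMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in probe that reports "connected" after roughly 200 ms.
        final long start = System.nanoTime();
        BooleanSupplier probe = () -> System.nanoTime() - start > 200_000_000L;
        System.out.println(awaitConnected(probe, 2000, 20));

        // A probe that never connects, as with a cluster that stays down.
        System.out.println(awaitConnected(() -> false, 100, 20));
    }
}
```

With a real client the probe would presumably be
{{() -> zk.getState().isConnected()}}, and on a {{false}} result the
application could call {{zk.close()}} so that the client stops retrying. The
timeout and poll interval are application choices, not ZooKeeper defaults.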

> client connection hung up because of  dead loop
> -----------------------------------------------
>
>                 Key: ZOOKEEPER-2188
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2188
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.5.0
>            Reporter: sunhaitao
>
> There is something wrong with the client code in ClientCnxn.java: it keeps
> trying to connect to the server in an endless loop.
> These are my test steps: shut down the zookeeper cluster, then execute the
> zkCli.sh script to connect to it. The client keeps trying to connect to the
> zookeeper server without stopping.
> public void run() {
>             clientCnxnSocket.introduce(this, sessionId, outgoingQueue);
>             clientCnxnSocket.updateNow();
>             clientCnxnSocket.updateLastSendAndHeard();
>             int to;
>             long lastPingRwServer = Time.currentElapsedTime();
>             final int MAX_SEND_PING_INTERVAL = 10000; //10 seconds
>             while (state.isAlive()) {
>                 try {
>                     if (!clientCnxnSocket.isConnected()) {
>                         // don't re-establish connection if we are closing
>                         if (closing) {
>                             break;
>                         }
>                         startConnect();
>                         clientCnxnSocket.updateLastSendAndHeard();
>                     }
> public boolean isAlive() {
>             return this != CLOSED && this != AUTH_FAILED;
>         }
> Because the state is CONNECTING at the beginning, isAlive always returns
> true, which leads to the endless loop.
> We should add a retry limit to stop this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
