[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haoze Wu updated ZOOKEEPER-4419:
--------------------------------
    Description: 
We were doing some testing in ZooKeeper 3.6.2, and found that sometimes a 
follower may take longer time (about 10s) to join the quorum. In our cluster, 
usually we expect that the leader and followers are ready to provide service in 
1 or 2 seconds.

*Analysis*

We did some investigations and found that the root cause is the exception 
thrown from `Learner.createSocket` 
([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L410-L423]).

`Learner.createSocket` is used in `Learner$LeaderConnector.connectToLeader`.

The code snippet 
([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L344-L407])
 is as follows:
{code:java}
        private Socket connectToLeader() throws IOException, X509Exception, 
InterruptedException {
            Socket sock = createSocket();               // line 345
            // ...
            int remainingTimeout;
            long startNanoTime = nanoTime();
            for (int tries = 0; tries < 5 && socket.get() == null; tries++) {
                try {
                    remainingTimeout = connectTimeout - (int) ((nanoTime() - 
startNanoTime) / 1_000_000);
                    if (remainingTimeout <= 0) {
                        LOG.error(...);
                        throw new IOException("connectToLeader exceeded on 
retries.");
                    }
                    sockConnect(sock, address, Math.min(connectTimeout, 
remainingTimeout));
                    // ...
                    break;
                } catch (IOException e) {
                    remainingTimeout = connectTimeout - (int) ((nanoTime() - 
startNanoTime) / 1_000_000);
                    if (remainingTimeout <= leaderConnectDelayDuringRetryMs) {
                        LOG.error(...);
                        throw e;
                    } else if (tries >= 4) {
                        LOG.error(...);
                        throw e;
                    } else {
                        LOG.warn(...);
                        sock = createSocket();          // line 400
                    }
                }
                Thread.sleep(leaderConnectDelayDuringRetryMs);
            }
            return sock;
        }
{code}
The retry (5 times) in this code implies that sometimes exceptions are expected 
to appear, and the program should proceed to try again. However, 
`Learner.createSocket` is invoked in either line 345 or line 400. And it may 
also throw exceptions. But it’s not within the try block. Therefore, this 
exception will turn to `Follower.followLeader` (the caller of 
`Learner$LeaderConnector.connectToLeader`). And then this server will change 
from FOLLOWING state to LOOKING state.

`Learner.createSocket` may throw exceptions in line 417 or 421:
{code:java}
    /**
     * Creating a simple or and SSL socket.
     * This can be overridden in tests to fake already connected sockets for 
connectToLeader.
     */
    protected Socket createSocket() throws X509Exception, IOException {
        Socket sock;
        if (self.isSslQuorum()) {
            sock = self.getX509Util().createSSLSocket();    // line 417
        } else {
            sock = new Socket();
        }
        sock.setSoTimeout(self.tickTime * self.initLimit);  // line 421
        return sock;
    } {code}
The reason for throwing the exception may be 
[https://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation]
 or 
[https://stackoverflow.com/questions/7064788/when-does-java-net-socket-setsotimeout-throws-socketexception.]

We reproduced this bug by intentionally injecting an IOException right before 
the invocation of `Learner.createSocket` in line 345. We only grant one single 
injection in a testing run of a ZooKeeper cluster (3 nodes). Our analysis 
proves to be true. In terms of the symptom, sometimes the affected follower 
takes more time to join the quorum; sometimes the leader election is also 
affected, and the whole cluster does the re-election, meaning that every node 
is affected.

*Fix*

We propose that the `Learner.createSocket` should be moved to the try-catch 
block. We have created a pull request.

  was:
We were doing some testing in ZooKeeper 3.6.2, and found that sometimes a 
follower may take longer time (about 10s) to join the quorum. In our cluster, 
usually we expect that the leader and followers are ready to provide service in 
1 or 2 seconds.

*Analysis*

We did some investigations and found that the root cause is the exception 
thrown from `Learner.createSocket` 
([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L410-L423]).

`Learner.createSocket` is used in `Learner$LeaderConnector.connectToLeader`.

The code snippet 
([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L344-L407])
 is as follows:

 
{code:java}
        private Socket connectToLeader() throws IOException, X509Exception, 
InterruptedException {
            Socket sock = createSocket();               // line 345
            // ...
            int remainingTimeout;
            long startNanoTime = nanoTime();
            for (int tries = 0; tries < 5 && socket.get() == null; tries++) {
                try {
                    remainingTimeout = connectTimeout - (int) ((nanoTime() - 
startNanoTime) / 1_000_000);
                    if (remainingTimeout <= 0) {
                        LOG.error(...);
                        throw new IOException("connectToLeader exceeded on 
retries.");
                    }
                    sockConnect(sock, address, Math.min(connectTimeout, 
remainingTimeout));
                    // ...
                    break;
                } catch (IOException e) {
                    remainingTimeout = connectTimeout - (int) ((nanoTime() - 
startNanoTime) / 1_000_000);
                    if (remainingTimeout <= leaderConnectDelayDuringRetryMs) {
                        LOG.error(...);
                        throw e;
                    } else if (tries >= 4) {
                        LOG.error(...);
                        throw e;
                    } else {
                        LOG.warn(...);
                        sock = createSocket();          // line 400
                    }
                }
                Thread.sleep(leaderConnectDelayDuringRetryMs);
            }
            return sock;
        }
{code}
The retry (5 times) in this code implies that sometimes exceptions are expected 
to appear, and the program should proceed to try again. However, 
`Learner.createSocket` is invoked in either line 345 or line 400. And it may 
also throw exceptions. But it’s not within the try block. Therefore, this 
exception will turn to `Follower.followLeader` (the caller of 
`Learner$LeaderConnector.connectToLeader`). And then this server will change 
from FOLLOWING state to LOOKING state.

 

`Learner.createSocket` may throw exceptions in line 417 or 421:

 
{code:java}
    /**
     * Creating a simple or and SSL socket.
     * This can be overridden in tests to fake already connected sockets for 
connectToLeader.
     */
    protected Socket createSocket() throws X509Exception, IOException {
        Socket sock;
        if (self.isSslQuorum()) {
            sock = self.getX509Util().createSSLSocket();    // line 417
        } else {
            sock = new Socket();
        }
        sock.setSoTimeout(self.tickTime * self.initLimit);  // line 421
        return sock;
    } {code}
The reason for throwing the exception may be 
[https://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation]
 or 
[https://stackoverflow.com/questions/7064788/when-does-java-net-socket-setsotimeout-throws-socketexception.]

 

We reproduced this bug by intentionally injecting an IOException right before 
the invocation of `Learner.createSocket` in line 345. We only grant one single 
injection in a testing run of a ZooKeeper cluster (3 nodes). Our analysis 
proves to be true. In terms of the symptom, sometimes the affected follower 
takes more time to join the quorum; sometimes the leader election is also 
affected, and the whole cluster does the re-election, meaning that every node 
is affected.

*Fix*

We propose that the `Learner.createSocket` should be moved to the try-catch 
block. We will create a pull request.


> The potential exception in Learner$LeaderConnector.connectToLeader may cause 
> unnecessary re-election or service delay
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4419
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4419
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.6.2
>            Reporter: Haoze Wu
>            Priority: Major
>
> We were doing some testing in ZooKeeper 3.6.2, and found that sometimes a 
> follower may take longer time (about 10s) to join the quorum. In our cluster, 
> usually we expect that the leader and followers are ready to provide service 
> in 1 or 2 seconds.
> *Analysis*
> We did some investigations and found that the root cause is the exception 
> thrown from `Learner.createSocket` 
> ([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L410-L423]).
> `Learner.createSocket` is used in `Learner$LeaderConnector.connectToLeader`.
> The code snippet 
> ([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L344-L407])
>  is as follows:
> {code:java}
>         private Socket connectToLeader() throws IOException, X509Exception, 
> InterruptedException {
>             Socket sock = createSocket();               // line 345
>             // ...
>             int remainingTimeout;
>             long startNanoTime = nanoTime();
>             for (int tries = 0; tries < 5 && socket.get() == null; tries++) {
>                 try {
>                     remainingTimeout = connectTimeout - (int) ((nanoTime() - 
> startNanoTime) / 1_000_000);
>                     if (remainingTimeout <= 0) {
>                         LOG.error(...);
>                         throw new IOException("connectToLeader exceeded on 
> retries.");
>                     }
>                     sockConnect(sock, address, Math.min(connectTimeout, 
> remainingTimeout));
>                     // ...
>                     break;
>                 } catch (IOException e) {
>                     remainingTimeout = connectTimeout - (int) ((nanoTime() - 
> startNanoTime) / 1_000_000);
>                     if (remainingTimeout <= leaderConnectDelayDuringRetryMs) {
>                         LOG.error(...);
>                         throw e;
>                     } else if (tries >= 4) {
>                         LOG.error(...);
>                         throw e;
>                     } else {
>                         LOG.warn(...);
>                         sock = createSocket();          // line 400
>                     }
>                 }
>                 Thread.sleep(leaderConnectDelayDuringRetryMs);
>             }
>             return sock;
>         }
> {code}
> The retry (5 times) in this code implies that sometimes exceptions are 
> expected to appear, and the program should proceed to try again. However, 
> `Learner.createSocket` is invoked in either line 345 or line 400. And it may 
> also throw exceptions. But it’s not within the try block. Therefore, this 
> exception will turn to `Follower.followLeader` (the caller of 
> `Learner$LeaderConnector.connectToLeader`). And then this server will change 
> from FOLLOWING state to LOOKING state.
> `Learner.createSocket` may throw exceptions in line 417 or 421:
> {code:java}
>     /**
>      * Creating a simple or and SSL socket.
>      * This can be overridden in tests to fake already connected sockets for 
> connectToLeader.
>      */
>     protected Socket createSocket() throws X509Exception, IOException {
>         Socket sock;
>         if (self.isSslQuorum()) {
>             sock = self.getX509Util().createSSLSocket();    // line 417
>         } else {
>             sock = new Socket();
>         }
>         sock.setSoTimeout(self.tickTime * self.initLimit);  // line 421
>         return sock;
>     } {code}
> The reason for throwing the exception may be 
> [https://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation]
>  or 
> [https://stackoverflow.com/questions/7064788/when-does-java-net-socket-setsotimeout-throws-socketexception.]
> We reproduced this bug by intentionally injecting an IOException right before 
> the invocation of `Learner.createSocket` in line 345. We only grant one 
> single injection in a testing run of a ZooKeeper cluster (3 nodes). Our 
> analysis proves to be true. In terms of the symptom, sometimes the affected 
> follower takes more time to join the quorum; sometimes the leader election is 
> also affected, and the whole cluster does the re-election, meaning that every 
> node is affected.
> *Fix*
> We propose that the `Learner.createSocket` should be moved to the try-catch 
> block. We have created a pull request.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to