Jiafu Jiang created ZOOKEEPER-3099:
--------------------------------------
Summary: ZooKeeper cluster is unavailable for session_timeout time
when the leader shutdown in a three-node environment.
Key: ZOOKEEPER-3099
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3099
Project: ZooKeeper
Issue Type: Bug
Components: c client, java client
Affects Versions: 3.4.13, 3.4.12, 3.5.4, 3.4.11
Reporter: Jiafu Jiang
By default, the ZooKeeper client's readTimeout is 2/3 * session_timeout and its
connectTimeout is session_timeout / hostProvider.size(). If the ZooKeeper
cluster has 3 nodes, then connectTimeout is 1/3 * session_timeout.
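The two defaults above can be sketched in a few lines of Java. This is only an illustration of the formulas as stated in this report, not a copy of the client code; the class name, method names, and the 30-second session timeout are hypothetical.

```java
public class TimeoutSketch {
    // Read timeout as described in the report: two thirds of the session timeout.
    static int readTimeout(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    // Connect timeout as described in the report: the session timeout divided
    // by the number of servers known to the host provider.
    static int connectTimeout(int sessionTimeoutMs, int serverCount) {
        return sessionTimeoutMs / serverCount;
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 30_000; // hypothetical session timeout
        int serverCount = 3;           // three-node cluster, as in this report
        System.out.println("readTimeout    = " + readTimeout(sessionTimeoutMs) + " ms");
        System.out.println("connectTimeout = " + connectTimeout(sessionTimeoutMs, serverCount) + " ms");
    }
}
```

With these assumed inputs the two values are 20000 ms and 10000 ms, which together add up to the full session timeout.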
Suppose we have three ZooKeeper servers deployed: zk1, zk2, and zk3, with zk3
as the current leader. Client c1 is connected to zk2 (a follower). We then shut
down the network of zk3 (the leader), and at the same time client c1 begins to
write some data to ZooKeeper. After a (syncLimit * tickTime) timeout, zk2
disconnects from the leader and starts a new election, and zk2 becomes the leader.
The write operation will not succeed because the leader is down. It can
take up to readTimeout for c1 to detect the failure, after which c1
tries to connect to another ZooKeeper server. Unfortunately, c1 may choose zk3,
which is now unreachable, and then spend connectTimeout finding out that
zk3 is unavailable. Note that readTimeout + connectTimeout = session_timeout in
my case (a three-node cluster).
Therefore, in this case, the ZooKeeper cluster can be unavailable for up to the
full session timeout even though only one ZooKeeper server is shut down.
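The worst case described here can be worked through numerically. The sketch below simply adds one full read timeout (to notice the failure) and one full connect timeout (spent on the unreachable server), using the formulas stated in this report. The class name, method name, and the 30-second session timeout are hypothetical, and the five-server row is included only for comparison.

```java
public class FailoverWindowSketch {
    // Worst-case time before the client reaches a live server in the scenario
    // above: one read timeout plus one connect timeout on the dead server.
    static int worstCaseUnavailableMs(int sessionTimeoutMs, int serverCount) {
        int readTimeout = sessionTimeoutMs * 2 / 3;
        int connectTimeout = sessionTimeoutMs / serverCount;
        return readTimeout + connectTimeout;
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 30_000; // hypothetical session timeout
        for (int servers : new int[] {3, 5}) {
            System.out.println(servers + " servers: "
                    + worstCaseUnavailableMs(sessionTimeoutMs, servers) + " ms");
        }
    }
}
```

Under these assumptions a three-server ensemble gives 30000 ms, the entire session timeout, while a five-server ensemble gives 26000 ms, so the problem is most severe for the smallest fault-tolerant cluster.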
I have some suggestions:
# Allow the HostProvider used by the ZooKeeper client to be specified through an argument.
# Allow readTimeout to be configured explicitly as well.