Ruslan Nigmatullin created ZOOKEEPER-3086:
---------------------------------------------
Summary: [server] Lack of write timeouts causes quorum to stuck
Key: ZOOKEEPER-3086
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3086
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.4.12, 3.5.4
Environment: Linux 4.13.0-32-generic, Java HotSpot(TM) 64-Bit Server
VM (build 25.121-b13, mixed mode)
Reporter: Ruslan Nigmatullin
Attachments: zookeeper-threads.txt
A network outage on the leader host can cause the `QuorumPeer` thread to get
stuck for a prolonged period (2+ hours, depending on TCP keepalive settings).
This effectively stalls the whole ZooKeeper server, making it inoperable. We
found it during one of our internal DRTs (Disaster Recovery Tests).
The scenario that triggers the behavior (requires a relatively high ping load
on the follower):
# `Follower.processPacket` processes a `Leader.PING` message
# The leader becomes network partitioned
# `Learner.ping` attempts to write to the leader socket
# If the socket's write buffer is full (due to other ping/sync calls),
`Learner.ping` blocks
# Because the leader is partitioned, `Learner.ping` blocks forever due to the
lack of a write timeout
# `QuorumPeer` is the only thread reading from the leader socket, so the whole
server is stuck and cannot recover without a manual process restart.
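The blocking behavior in the steps above can be reproduced outside ZooKeeper, and a bounded write is one way to avoid it. The following is a minimal sketch, not ZooKeeper code and not a proposed patch. It assumes a switch to a non-blocking `SocketChannel` plus a `Selector`, because `Socket.setSoTimeout` only bounds reads and there is no equivalent socket option for blocking writes. The names `WriteTimeoutSketch`, `writeFully` and `demo`, the 500 ms timeout and the 1 MiB chunk size are all hypothetical and chosen for illustration. A local peer that accepts the connection but never reads stands in for the partitioned leader.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical illustration only; none of these names exist in ZooKeeper.
final class WriteTimeoutSketch {

    /**
     * Writes all remaining bytes of buf to a non-blocking channel, or throws
     * SocketTimeoutException once timeoutMs has elapsed without finishing.
     */
    static void writeFully(SocketChannel ch, ByteBuffer buf, long timeoutMs)
            throws IOException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        try (Selector selector = Selector.open()) {
            ch.register(selector, SelectionKey.OP_WRITE);
            while (buf.hasRemaining()) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    throw new SocketTimeoutException("write timed out");
                }
                // Wait until the socket is writable or the deadline passes.
                selector.select(remainingMs);
                selector.selectedKeys().clear();
                // Non-blocking write: returns 0 when the send buffer is full.
                ch.write(buf);
            }
        }
    }

    /**
     * Connects to a local peer that never reads (a stand-in for the
     * partitioned leader) and writes until the socket buffers fill up.
     * Returns true if the write was aborted by the timeout.
     */
    static boolean demo() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel peer = server.accept()) {
                client.configureBlocking(false);
                ByteBuffer chunk = ByteBuffer.allocate(1 << 20); // 1 MiB of zeros
                try {
                    // Up to 256 MiB in total, assumed to exceed the socket buffers.
                    for (int i = 0; i < 256; i++) {
                        chunk.clear();
                        writeFully(client, chunk, 500);
                    }
                    return false; // everything was buffered; no stall observed
                } catch (SocketTimeoutException expected) {
                    return true;  // the stalled write was bounded by the timeout
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("write aborted by timeout: " + demo());
    }
}
```

With a plain blocking `OutputStream.write` in place of `writeFully`, the same loop would hang once the buffers are full, which is the state described for `Learner.ping`. On timeout the caller would still need to close the socket and re-enter leader election; that part is not shown here.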
A thread dump from the affected server is attached.