[ https://issues.apache.org/jira/browse/ZOOKEEPER-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551051#comment-16551051 ]
Ruslan Nigmatullin commented on ZOOKEEPER-3086:
-----------------------------------------------

The ZK codebase uses `Socket.setSoTimeout` to set up a timeout; however, according to the [documentation|https://docs.oracle.com/javase/8/docs/api/java/net/Socket.html#setSoTimeout-int-], it applies only to read operations.
{quote}With this option set to a non-zero timeout, a read() call on the InputStream associated with this Socket will block for only this amount of time.
{quote}

> [server] Lack of write timeouts causes quorum to stuck
> ------------------------------------------------------
>
> Key: ZOOKEEPER-3086
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3086
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.5.4, 3.4.12
> Environment: Linux 4.13.0-32-generic, Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
> Reporter: Ruslan Nigmatullin
> Priority: Major
> Attachments: zookeeper-threads.txt
>
>
> A network outage on the leader host can cause the `QuorumPeer` thread to get stuck for a
> prolonged period of time (2+ hours, depending on TCP keep-alive settings). This
> effectively stalls the whole ZooKeeper server, making it inoperable. We
> found it during one of our internal DRTs (Disaster Recovery Tests).
> The scenario that triggers the behavior (it requires a relatively high ping load
> to the follower):
> # `Follower.processPacket` processes a `Leader.PING` message
> # The leader is network partitioned
> # `Learner.ping` attempts to write to the leader socket
> # If the socket's write buffer is full (due to other ping/sync calls),
> `Learner.ping` blocks
> # Because the leader is partitioned, `Learner.ping` blocks forever due to the lack of a
> write timeout
> # `QuorumPeer` is the only thread reading from the leader socket,
> which effectively means that the whole server is stuck and can't recover without
> a manual process restart.
>
> A thread dump from the affected server is in the attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
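
To illustrate the point made in the comment above, here is a minimal sketch, assuming a loopback connection whose peer neither reads nor writes. `SoTimeoutDemo` and its methods are hypothetical names for illustration only and are not ZooKeeper code. The first method shows `setSoTimeout` bounding a `read()`; the second shows a large `write()` on the same kind of socket still blocked well after the timeout has elapsed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; not part of the ZooKeeper codebase.
class SoTimeoutDemo {

    // Returns true if read() on a socket with a silent peer fails with
    // SocketTimeoutException once SO_TIMEOUT has elapsed.
    static boolean readTimesOut(int soTimeoutMs) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) {
            client.setSoTimeout(soTimeoutMs);
            try {
                client.getInputStream().read(); // the peer never writes
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    // Returns true if a large write to a peer that never reads is still
    // blocked after waitMs, even though SO_TIMEOUT is set on the socket.
    static boolean writeIgnoresSoTimeout(int soTimeoutMs, long waitMs)
            throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        ExecutorService writer = Executors.newSingleThreadExecutor();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) {
            client.setSoTimeout(soTimeoutMs);
            OutputStream out = client.getOutputStream();
            byte[] chunk = new byte[1 << 20]; // 1 MiB per write call
            Future<?> pending = writer.submit(() -> {
                // 256 MiB in total, assumed to exceed the kernel socket buffers.
                for (int i = 0; i < 256; i++) {
                    out.write(chunk);
                }
                return null;
            });
            try {
                pending.get(waitMs, TimeUnit.MILLISECONDS);
                return false; // the write completed, so it never blocked
            } catch (TimeoutException stillBlocked) {
                return true;  // SO_TIMEOUT did not interrupt the write
            } catch (ExecutionException writeFailed) {
                return false; // the write failed instead of blocking
            }
        } finally {
            // The sockets are closed before this runs, which unblocks the writer.
            writer.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read timed out: " + readTimesOut(200));
        System.out.println("write still blocked: " + writeIgnoresSoTimeout(200, 1000));
    }
}
```

In this sketch the blocked writer is released only when the sockets are closed at the end of the `try` block. Closing the socket from another thread after a deadline is one possible way to bound a write, but whether that approach suits `Learner.ping` is not something this example settles.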