[
https://issues.apache.org/jira/browse/ZOOKEEPER-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mutu updated ZOOKEEPER-4817:
----------------------------
Attachment: system3_25s.log
system3_20s.log
system2_25s.log
system2_20s.log
system1_25s.log
system1_20s.log
> Client disconnection warning is missed in system log sometimes.
> ---------------------------------------------------------------
>
> Key: ZOOKEEPER-4817
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4817
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.10.0
> Reporter: mutu
> Priority: Major
> Attachments: system1_20s.log, system1_25s.log, system2_20s.log,
> system2_25s.log, system3_20s.log, system3_25s.log
>
>
> Recently we encountered a confusing issue: the client disconnection warning
> is sometimes missing from the system log, although at other times it does
> appear. The cluster consists of three nodes. A client sends many creation
> requests and then reads the node created by the first request. The read
> fails because the node is missing. When we check the system log, sometimes
> there is a client disconnection warning and sometimes there is not. This
> incomplete system log misleads the client's diagnosis of the problem.
> After investigating, we found that when NIOServerCnxn.doIO is stuck at any
> I/O point in the function for more than 20s, the client disconnection
> warning does not appear. If it is stuck for less than 20s, the warning
> appears in the system log.
> We found that the root cause is that selectorThread is set as a daemon
> thread. When one node has a fail-slow NIC, the client disconnects from
> that node. If NIOServerCnxn.doIO is stuck for more than 20s, the
> corresponding selectorThread is killed by the JVM. Hence, the client
> disconnection warning is missed.
> Are there any suggestions for resolving this issue and improving the
> diagnosability of ZooKeeper? I would greatly appreciate them.
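
For readers unfamiliar with daemon threads, the following is a minimal sketch of the general Java behavior the description relies on: the JVM does not wait for daemon threads when it exits, so a log statement that sits behind a blocked call in such a thread may never run. This is not ZooKeeper code. The class name, thread name, and warning text are hypothetical, the latch only stands in for a stalled I/O call, and the sketch does not reproduce or confirm the 20s threshold reported above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; none of these names come from ZooKeeper.
class DaemonLogDemo {

    // Starts a worker that blocks on `release` (a stand-in for stalled I/O)
    // and only afterwards records and prints its warning.
    static Thread startWorker(boolean daemon, CountDownLatch release, AtomicBoolean warned) {
        Thread t = new Thread(() -> {
            try {
                release.await();              // simulated stuck I/O point
                warned.set(true);
                System.out.println("WARN: client disconnected (hypothetical message)");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "hypothetical-selector-thread");
        t.setDaemon(daemon);                  // must be set before start()
        t.start();
        return t;
    }

    // Waits up to `millis` for the thread and reports whether it has finished.
    static boolean finishedWithin(Thread t, long millis) {
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        CountDownLatch release = new CountDownLatch(1);   // never released here
        AtomicBoolean warned = new AtomicBoolean(false);
        Thread worker = startWorker(true, release, warned);

        finishedWithin(worker, 200);          // worker is still blocked
        System.out.println("main exiting; warned=" + warned.get());
        // main returns while the daemon worker is still blocked, so the JVM
        // exits without waiting for it and the warning line is never printed.
    }
}
```

If `startWorker` were called with `daemon` set to `false` in `main`, the JVM would instead stay alive waiting on the blocked thread, which is the trade-off to weigh before changing a thread's daemon flag.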
--
This message was sent by Atlassian Jira
(v8.20.10#820010)