[ https://issues.apache.org/jira/browse/ZOOKEEPER-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Seawood updated ZOOKEEPER-1237:
-------------------------------------
    Attachment: zookeeper-3.4.5-ZK1237.patch

This patch downgrades the EndOfStreamException to a debug message. It also adds .isValid() checks to avoid the CancelledKeyException.

At a glance, the real problem appears to be that there's no way for the client to deregister itself from the server via the C API. According to zookeeper.h, calling zookeeper_close() is supposed to close the filehandle and free up resources (presumably on the client side) but afaict, there's nothing on the server side to acknowledge that a client has legitimately disconnected. In NIOServerCnxn.java, there's even a comment to the effect that when the server initiates a disconnect, it only closes the socket and then lets the doIO() routine clean things up, which results in the additional exceptions being thrown.

> ERRORs being logged when queued responses are sent after socket has closed.
> ---------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1237
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1237
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.4, 3.4.0, 3.5.0
>            Reporter: Patrick Hunt
>             Fix For: 3.5.0
>
>         Attachments: zookeeper-3.4.5-ZK1237.patch
>
>
> After applying ZOOKEEPER-1049 to 3.3.3 (I believe the same problem exists in
> 3.4/3.5 but haven't tested this) I'm seeing the following exception more
> frequently:
> {noformat}
> Oct 19, 1:31:53 PM ERROR
> Unexpected Exception:
> java.nio.channels.CancelledKeyException
>     at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>     at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>     at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
>     at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
>     at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>     at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> {noformat}
> This is a long-standing problem where we try to send a response after the
> socket has been closed. Prior to ZOOKEEPER-1049 this issue happened much
> less frequently (2 sec linger), but I believe it was possible. The timing
> window is just wider now.
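
For illustration only, here is a minimal sketch of the kind of .isValid() guard the comment above describes. It is not taken from the attached patch. ValidKeyGuard and enableWrite are hypothetical names, and a java.nio Pipe stands in for the client socket so the example runs without a server. Because another thread can cancel the key between the check and the update, the sketch also catches CancelledKeyException rather than relying on isValid() alone.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class ValidKeyGuard {

    /**
     * Hypothetical helper (not ZooKeeper code): request write interest
     * only if the key has not been cancelled. Returns true when the
     * interest set was updated, false when the key was already invalid.
     */
    public static boolean enableWrite(SelectionKey key) {
        if (!key.isValid()) {
            // Channel closed or key cancelled: nothing left to write to.
            return false;
        }
        try {
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
            return true;
        } catch (CancelledKeyException e) {
            // The key was cancelled between isValid() and interestOps().
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.sink().configureBlocking(false);
            // Register with an empty interest set, as a stand-in for a connection.
            SelectionKey key = pipe.sink().register(selector, 0);

            System.out.println(enableWrite(key)); // true: key is still valid

            key.cancel(); // simulates the connection having been closed
            System.out.println(enableWrite(key)); // false: no exception thrown

            pipe.sink().close();
            pipe.source().close();
        }
    }
}
```

Without the guard, the second call would reach interestOps() on a cancelled key and throw the CancelledKeyException shown in the stack trace.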