[ https://issues.apache.org/jira/browse/ZOOKEEPER-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Seawood updated ZOOKEEPER-1237:
-------------------------------------
Attachment: zookeeper-3.4.5-ZK1237.patch
This patch downgrades the EndOfStreamException to a debug-level log message.
It also adds .isValid() checks on the selection key to avoid the
CancelledKeyException.
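For illustration only, here is a minimal sketch of the kind of .isValid() guard described above. It is not the code from the attached patch: the class KeyGuardSketch and the helper enableWriteIfValid() are hypothetical names, and a java.nio Pipe stands in for the client socket so the example runs on its own.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class KeyGuardSketch {
    /**
     * Hypothetical helper: request write interest only if the key is still
     * valid. Returns true if the interest set was updated.
     */
    static boolean enableWriteIfValid(SelectionKey key) {
        if (!key.isValid()) {
            // The channel was closed or the key was cancelled; drop the response.
            return false;
        }
        try {
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
            return true;
        } catch (CancelledKeyException e) {
            // The key can still be cancelled between the check and the call.
            return false;
        }
    }

    /** Returns {result before cancel, result after cancel}. */
    static boolean[] demo() throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.sink().configureBlocking(false);
                SelectionKey key = pipe.sink().register(selector, 0);
                boolean before = enableWriteIfValid(key);
                key.cancel(); // simulates the connection having been closed
                boolean after = enableWriteIfValid(key);
                return new boolean[] { before, after };
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] r = demo();
        System.out.println("before cancel: " + r[0] + ", after cancel: " + r[1]);
    }
}
```

The check alone does not close the race, since another thread can cancel the key right after isValid() returns, which is why the sketch also catches CancelledKeyException.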
At a glance, the real problem appears to be that the client has no way to
deregister itself from the server via the C API. According to zookeeper.h,
calling zookeeper_close() is supposed to close the file handle and free up
resources (presumably on the client side), but afaict nothing on the server
side acknowledges that a client has legitimately disconnected. NIOServerCnxn.java
even has a comment to the effect that when the server initiates a disconnect,
it only closes the socket and then lets the doIO() routine clean things up,
which results in the additional exceptions being thrown.
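As a rough sketch of the end-of-stream side of this, Java NIO reports that the peer has closed its end by returning -1 from read(). The example below treats that as an orderly disconnect and logs it at a debug-like level. This is an assumption-laden illustration, not the NIOServerCnxn code: EndOfStreamSketch and readOnce() are hypothetical names, java.util.logging stands in for the server's logger, and a Pipe stands in for the client socket.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.util.logging.Level;
import java.util.logging.Logger;

class EndOfStreamSketch {
    private static final Logger LOG = Logger.getLogger("EndOfStreamSketch");

    /**
     * Hypothetical helper: performs one read and returns true if the peer
     * has closed its end of the channel.
     */
    static boolean readOnce(ReadableByteChannel ch, ByteBuffer buf) throws IOException {
        int n = ch.read(buf);
        if (n < 0) {
            // End of stream is an expected event, so log it quietly.
            LOG.log(Level.FINE, "peer closed the connection; cleaning up");
            return true;
        }
        return false;
    }

    /** Closes the writing end first, then reads from the other end. */
    static boolean demo() throws IOException {
        Pipe pipe = Pipe.open();
        try {
            pipe.sink().close(); // simulates the client going away
            return readOnce(pipe.source(), ByteBuffer.allocate(16));
        } finally {
            pipe.source().close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer closed: " + demo());
    }
}
```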
> ERRORs being logged when queued responses are sent after socket has closed.
> ---------------------------------------------------------------------------
>
> Key: ZOOKEEPER-1237
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1237
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.3.4, 3.4.0, 3.5.0
> Reporter: Patrick Hunt
> Fix For: 3.5.0
>
> Attachments: zookeeper-3.4.5-ZK1237.patch
>
>
> After applying ZOOKEEPER-1049 to 3.3.3 (I believe the same problem exists in
> 3.4/3.5 but haven't tested this) I'm seeing the following exception more
> frequently:
> {noformat}
> Oct 19, 1:31:53 PM ERROR
> Unexpected Exception:
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
> at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
> at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
> at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
> {noformat}
> This is a long-standing problem where we try to send a response after the
> socket has been closed. Prior to ZOOKEEPER-1049 this issue happened much
> less frequently (2 sec linger), but I believe it was possible. The timing
> window is just wider now.