[ https://issues.apache.org/jira/browse/CASSANDRA-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139144#comment-14139144 ]
graham sanderson edited comment on CASSANDRA-7849 at 9/18/14 4:42 PM:
----------------------------------------------------------------------

Updated patch v3 that
# Adds channel info to the logged error from call sites in {{Message.java}}
# Keeps everything at ERROR level except for the code path {{Dispatcher.exceptionCaught}}, which logs IOException at INFO, except for 3 specific message strings that are logged at DEBUG ("Connection reset by peer", "Broken pipe", "Connection timed out") because they correspond to likely client disconnects

Note that since {{Throwable#getLocalizedMessage}} exists, and the Windows JVM code path seems to map Windows error codes to the *nix error messages, I think these message strings are actually more robust across platforms and/or locales than I first thought


was (Author: graham sanderson):
    Updated patch v3 that
# Adds channel info to the logged error from call sites in {{Message.java}}
# Keeps everything at ERROR level except for the code path {{Dispatcher.exceptionCaught}}, which logs IOException at INFO or DEBUG for 3 specific messages ("Connection reset by peer", "Broken pipe", "Connection timed out")

> Server logged error messages (in binary protocol) for unexpected exceptions could be more helpful
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7849
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7849
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 1.2.19, 2.0.11
>
>         Attachments: cassandra-1.2-7849.txt, cassandra-1.2-7849_v2.txt, cassandra-1.2-7849_v3.txt
>
>
> From time to time (actually quite frequently) we get error messages in the server logs like this
> {code}
> ERROR [Native-Transport-Requests:288] 2014-08-29 04:48:07,118 ErrorMessage.java (line 222) Unexpected exception during request
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> These particular cases are almost certainly problems with the client driver, client machine, or client process. However, after the fact this particular exception is practically impossible to debug, because there is no indication in the underlying JVM/netty exception of who the peer was. I should note we have lots of different types of applications running against the cluster, so it is very hard to correlate these to anything.
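
For readers following the thread, the level selection described in the v3 comment can be sketched roughly as below. This is only an illustration and not the attached patch: the class name {{DisconnectLogLevel}}, the {{Level}} enum, and the methods {{levelFor}} and {{describe}} are hypothetical, and it assumes an exact match against {{Throwable#getMessage}}, which the comment does not confirm.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Cassandra source or the patch.
final class DisconnectLogLevel {
    enum Level { DEBUG, INFO, ERROR }

    // The three message strings named in the comment, treated as likely client disconnects.
    private static final Set<String> CLIENT_DISCONNECT_MESSAGES = new HashSet<>(Arrays.asList(
            "Connection reset by peer",
            "Broken pipe",
            "Connection timed out"));

    // IOException logs at INFO, or at DEBUG when the message is one of the three strings.
    // Any other throwable stays at ERROR.
    static Level levelFor(Throwable cause) {
        if (cause instanceof IOException) {
            String message = cause.getMessage();
            // Assumption: exact match on getMessage(); the real patch may compare differently.
            if (message != null && CLIENT_DISCONNECT_MESSAGES.contains(message)) {
                return Level.DEBUG;
            }
            return Level.INFO;
        }
        return Level.ERROR;
    }

    // Adds a caller-supplied channel description so the peer can be identified in the log.
    static String describe(String channel, Throwable cause) {
        return "Unexpected exception during request; channel = " + channel + "; cause = " + cause;
    }

    public static void main(String[] args) {
        Throwable reset = new IOException("Connection reset by peer");
        // The channel string here is made up for the example.
        System.out.println(levelFor(reset) + " " + describe("[id: 0x01, /10.0.0.5:51234]", reset));
        System.out.println(levelFor(new IOException("No space left on device")));
        System.out.println(levelFor(new IllegalStateException("bug")));
    }
}
```

Matching on fixed strings is only as reliable as the platform's error text, which is the concern the note about {{Throwable#getLocalizedMessage}} addresses.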