[ 
https://issues.apache.org/jira/browse/CASSANDRA-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136390#comment-14136390
 ] 

graham sanderson commented on CASSANDRA-7849:
---------------------------------------------

Well, our options certainly include deciding at the calling code path (as it 
happens, the only errors I have seen were all client related, all 
{{IOException}}, and all came from the Message.exceptionCaught path).

I am happy to do whatever; I'd suggest:

1) Leave in the new extra information in the message (this IS useful) - 
i.e. IP addresses of each end of the channel
2) Just use DEBUG level for Message.exceptionCaught code path
3) Possibly make the decision at that code path for ERROR vs DEBUG based on 
{{instanceof IOException}}?
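
To make 1) to 3) concrete, here is a rough sketch of the decision, not the 
actual patch. The class and method names (ExceptionLogPolicy, levelFor, 
describe) and the message wording are made up for illustration, and it uses 
only the JDK so that it does not depend on the netty or logging APIs.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketAddress;

// Hypothetical helper, not part of Cassandra: illustrates suggestions 1) and 3).
final class ExceptionLogPolicy {
    enum Level { DEBUG, ERROR }

    // Suggestion 3): treat IOException as likely client-side noise (DEBUG),
    // and anything else as a real problem (ERROR).
    static Level levelFor(Throwable cause) {
        return (cause instanceof IOException) ? Level.DEBUG : Level.ERROR;
    }

    // Suggestion 1): include both ends of the channel in the message so the
    // peer can be identified after the fact. The wording is an assumption.
    static String describe(SocketAddress local, SocketAddress remote) {
        return "Unexpected exception during request; channel = [local: "
                + local + ", remote: " + remote + "]";
    }

    public static void main(String[] args) {
        // Example addresses only; unresolved so no DNS lookup is performed.
        SocketAddress local = InetSocketAddress.createUnresolved("server.example", 9042);
        SocketAddress remote = InetSocketAddress.createUnresolved("client.example", 51234);
        Throwable cause = new IOException("Connection reset by peer");
        System.out.println(levelFor(cause) + " " + describe(local, remote));
    }
}
```

With 2) instead of 3), levelFor would simply return DEBUG for everything 
arriving via the Message.exceptionCaught path.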

The crux of the issue (and hence the suggestions above) is that an IOException 
in particular does not let you tell noise from an actual issue (except by 
examining the message, which is obviously bad). I had leaned towards INFO at 
some point.

Note that this error message is not logged by netty but does originate there; 
I think 2) pretty much covers that (since it is netty-specific exception 
handling).

Thoughts? I'll go ahead and update the patch if we're in agreement (currently 
with 1,2,3 and no INFO level)

> Server logged error messages (in binary protocol) for unexpected exceptions 
> could be more helpful
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7849
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7849
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>             Fix For: 1.2.19, 2.0.11
>
>         Attachments: cassandra-1.2-7849.txt, cassandra-1.2-7849_v2.txt
>
>
> From time to time (actually quite frequently) we get error messages in the 
> server logs like this
> {code}
> ERROR [Native-Transport-Requests:288] 2014-08-29 04:48:07,118 ErrorMessage.java (line 222) Unexpected exception during request
> java.io.IOException: Connection reset by peer
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> These particular cases are almost certainly problems with the client driver, 
> client machine, or client process. However, after the fact this particular 
> exception is practically impossible to debug because there is no indication 
> in the underlying JVM/netty exception of who the peer was. I should note we 
> have lots of different types of applications running against the cluster, so 
> it is very hard to correlate these to anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
