Hi,

We have a Giraph program that works fine when the graph is small, but it fails with “Connection reset by peer” when the graph is big. We added a log statement right before the sendMessageToAllEdges call to capture the message size and the number of messages being sent. The exception is raised whenever a vertex attempts to send large messages to many neighbors. For example, the log below shows that the vertex with 64 edges is fine, but the vertex with 3853 edges causes the exception. We cannot see any other error in the log. It seems to be related to Netty communication. Maybe the buffer size? Any advice is highly appreciated.

2014-04-10 09:55:36,070 INFO [compute-0] com.neimanmarcus.api.matching.giraph.cc.CCVertex3: !!!*** omx136986717936641d8a715150240372f9ef8da1a306 has 64 neighbors
2014-04-10 09:55:36,129 INFO [compute-0] com.neimanmarcus.api.matching.giraph.cc.CCVertex3: !!!*** omx135795296165908348809827498ffc192bedd9f136 has 3853 neighbors
2014-04-10 09:55:58,161 INFO [netty-server-exec-0] org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server window metrics MBytes/sec sent = 0.0004, MBytes/sec received = 12.7831, MBytesSent = 0.0134, MBytesReceived = 383.7369, ave sent req MBytes = 0, ave received req MBytes = 0.0481, secs waited = 30.018
2014-04-10 09:56:08,806 WARN [netty-server-exec-5] org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /10.241.17.33:43398
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
    at sun.nio.ch.IOUtil.write(IOUtil.java:26)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$UnpooledSendBuffer.transferTo(SocketSendBufferPool.java:198)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:468)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromTaskLoop(AbstractNioWorker.java:423)
    at org.jboss.netty.channel.socket.nio.AbstractNioChannel$WriteTask.run(AbstractNioChannel.java:364)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processWriteTaskQueue(AbstractNioWorker.java:341)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:237)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2014-04-10 09:56:09,040 WARN [netty-server-exec-6] org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /10.241.17.33:43398
java.nio.channels.ClosedChannelException
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:673)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:400)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:120)
text: Unable to write to output stream.
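
To give a rough sense of scale for the two vertices in the log, here is a minimal back-of-the-envelope sketch of the outgoing volume when a vertex sends one message to each neighbor. The per-message size is not shown in the log excerpt, so the 100 KiB value below is purely a hypothetical placeholder, and the class and method names (FanOutEstimate, totalBytes, toMiB) are made up for illustration. It does not use any Giraph API.

```java
public class FanOutEstimate {

    /** Total payload bytes one vertex emits when it sends one message to each neighbor. */
    public static long totalBytes(int neighbors, long bytesPerMessage) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact((long) neighbors, bytesPerMessage);
    }

    /** Converts a byte count to mebibytes for easier reading. */
    public static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // ASSUMPTION: 100 KiB per message; replace with the size our log statement reports
        long assumedMessageBytes = 100L * 1024;

        // Neighbor counts taken from the two CCVertex3 log lines
        for (int neighbors : new int[] {64, 3853}) {
            long total = totalBytes(neighbors, assumedMessageBytes);
            System.out.printf("%d neighbors -> %d bytes (%.2f MiB)%n",
                    neighbors, total, toMiB(total));
        }
    }
}
```

Since the volume grows linearly with the neighbor count, the 3853-edge vertex sends about 60 times as much data in a single call as the 64-edge vertex, whatever the real message size turns out to be.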