Hi All, I'm seeing Netty connection errors (stack trace pasted below) at random across Giraph runs. The error is not persistent; sometimes it goes away. I'm using 2230 workers and 1 master.
Do you know what might have caused this? Is there any configuration change I can try that might fix it?

Stack trace:

2016-07-14 20:21:26,818 WARN [main] org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Future failed to connect with ds1376.mycompany.net/11.11.11.11:11111 with 1079 failures because of java.net.SocketException: No buffer space available
2016-07-14 20:21:26,818 INFO [netty-client-worker-3] org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2016-07-14 20:21:26,818 INFO [netty-client-worker-2] org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2016-07-14 20:21:26,818 INFO [netty-client-worker-1] org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2016-07-14 20:21:26,818 INFO [main] org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 0 connections, (2122 total connected) 108 failed, 1080 failures total.
2016-07-14 20:21:26,819 INFO [netty-client-worker-2] org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2016-07-14 20:21:26,819 ERROR [main] org.apache.giraph.graph.GraphMapper: Caught an unrecoverable exception connectAllAddresses: Too many failures (1080).
java.lang.IllegalStateException: connectAllAddresses: Too many failures (1080).
    at org.apache.giraph.comm.netty.NettyClient.connectAllAddresses(NettyClient.java:488)
    at org.apache.giraph.comm.netty.NettyWorkerClient.openConnections(NettyWorkerClient.java:132)
    at org.apache.giraph.comm.netty.NettyWorkerClient.setup(NettyWorkerClient.java:168)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:584)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)