Can anybody help me understand what underlying issue might be causing the request re-issues shown below, which happen a handful of times during graph execution? The job appears to recover, but there is a significant delay while it waits for the request to time out, so I would rather understand the root cause and fix it. What puzzles me is that all of the logged fields (connected, future done, success, cause) seem to indicate that everything worked fine.

2014-03-18 15:47:42,925 WARN org.apache.giraph.comm.netty.NettyClient: checkRequestsForProblems: Problem with request id (destTask=148,reqId=24) connected = true, future done = true, success = true, cause = null, elapsed time = 602583, destination = host/1.1.1.1:30148 (reqId=24,destAddr=host:30148,elapsedNanos=602583065000,started=Tue Mar 18 15:37:40 EDT 2014,writeDone=true,writeSuccess=true)
2014-03-18 15:47:42,925 INFO org.apache.giraph.comm.netty.NettyClient: checkRequestsForProblems: Re-issuing request (reqId=24,destAddr=host:30148,elapsedNanos=27000,started=Tue Mar 18 15:47:42 EDT 2014)
2014-03-18 15:47:42,929 INFO org.apache.giraph.comm.netty.handler.ResponseClientHandler: messageReceived: Already completed request (taskId = 148, requestId = 24)

As a bit of background, I'm running with 432 cores, 216 workers, 2 input/compute threads, and 216 partitions, with data being loaded using both a vertex and an edge input format.

Thanks,
Craig M.