Client Calls are not cancelled after a call timeout
---------------------------------------------------

         Key: HADOOP-255
         URL: http://issues.apache.org/jira/browse/HADOOP-255
     Project: Hadoop
        Type: Bug

  Components: ipc  
    Versions: 0.2.1    
 Environment: Tested on Linux 2.6
    Reporter: Naveen Nalam


In ipc/Client.java, if a call times out, a SocketTimeoutException is thrown but 
the Call object still exists on the queue.

What I found was that when transferring very large amounts of data, it's common 
for queued-up calls to time out. Yet even though the caller is no longer 
waiting, the request is still serviced on the server and the data is sent to 
the client. After receiving the full response, the client calls callComplete(), 
which is a no-op since nobody is waiting.

The problem is that the calls that time out are retried, and the system gets 
into a situation where data is being transferred around, but it's all data for 
timed-out requests and no progress is ever made.

My quick solution to this was to add a "boolean timedout" to the Call object, 
which I set to true whenever the queued caller times out. Then, when the 
client starts to pull over the response data (in Connection::run), it first 
checks whether the Call is timed out and, if so, immediately closes the 
connection.
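
To make the quick solution concrete, here is a minimal sketch of the idea. It 
is not the actual patch and not the real ipc/Client.java code: the class names 
TimedCall and ResponseReader, and the methods await(), complete() and 
shouldReadResponse(), are hypothetical stand-ins for the Call object and the 
check done in Connection::run.

```java
import java.net.SocketTimeoutException;

/** Hypothetical stand-in for the Call object in ipc/Client.java. */
class TimedCall {
    private boolean done = false;
    private boolean timedOut = false; // the proposed "boolean timedout" flag

    /** Caller side: wait for the response, flagging the call if the wait expires. */
    synchronized void await(long timeoutMillis)
            throws SocketTimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!done) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                timedOut = true; // remember that nobody is waiting any more
                throw new SocketTimeoutException("timed out waiting for response");
            }
            wait(remaining);
        }
    }

    /** Receiver side: mark the call finished and wake the waiting caller. */
    synchronized void complete() {
        done = true;
        notifyAll();
    }

    synchronized boolean isTimedOut() {
        return timedOut;
    }
}

/** Hypothetical stand-in for the check made before reading response data. */
class ResponseReader {
    /**
     * Returns true if the response body should be read, or false if the
     * caller gave up and the connection should be closed instead.
     */
    static boolean shouldReadResponse(TimedCall call) {
        return call != null && !call.isTimedOut();
    }
}
```

In this sketch the receiving thread would call shouldReadResponse() as soon as 
it knows which call a response belongs to, and close the socket rather than 
pull the data over when it returns false.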

I think a good fix for this is to queue requests on the client, and do a single 
sendParam only when there is no outstanding request. This would let the client 
close the connection when it receives a response for a request that is no 
longer pending, reopen the connection, and send the next queued request. I can 
provide a patch for this, but I've seen a lot of recent activity in this area 
so I'd like to get some feedback first.
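
As a rough illustration of the proposal, the following sketch models the 
client-side queue with at most one request outstanding. Everything in it is an 
assumption made for illustration: the SerializedSender class, its method 
names, and the log list that records actions in place of real socket I/O. It 
also assumes that a call which times out while still queued is simply dropped 
before it is ever sent.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of a client that keeps one request in flight at a time. */
class SerializedSender {
    private final Queue<Integer> pending = new ArrayDeque<>();
    private Integer inFlight = null;           // id of the single outstanding request
    private boolean inFlightAbandoned = false; // true once its caller has timed out
    final List<String> log = new ArrayList<>(); // stands in for real network actions

    /** Queue a request; it is sent right away only if nothing is outstanding. */
    synchronized void submit(int id) {
        pending.add(id);
        sendNextIfIdle();
    }

    /** The caller of request id gave up waiting. */
    synchronized void timeout(int id) {
        if (inFlight != null && inFlight == id) {
            inFlightAbandoned = true;          // already sent: drop its response later
        } else {
            pending.remove(Integer.valueOf(id)); // never sent: just forget it
        }
    }

    /** Response data for the outstanding request starts to arrive. */
    synchronized void onResponseStarted() {
        if (inFlight == null) {
            return;
        }
        if (inFlightAbandoned) {
            log.add("reconnect");              // close and reopen instead of reading
        } else {
            log.add("read " + inFlight);
        }
        inFlight = null;
        inFlightAbandoned = false;
        sendNextIfIdle();
    }

    private void sendNextIfIdle() {
        if (inFlight == null && !pending.isEmpty()) {
            inFlight = pending.poll();
            log.add("send " + inFlight);       // the single sendParam
        }
    }
}
```

With requests 1, 2 and 3 submitted and request 1 timing out while in flight, 
the recorded actions are: send 1, reconnect, send 2, read 2, send 3. The 
trade-off is that requests on one connection are no longer pipelined, so this 
gives up some throughput in exchange for never transferring data nobody wants.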

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
