RPC should accept connections even when the RPC queue is full (i.e. undo part of 
HADOOP-2910)
-----------------------------------------------------------------------------------------

                 Key: HADOOP-3109
                 URL: https://issues.apache.org/jira/browse/HADOOP-3109
             Project: Hadoop Core
          Issue Type: Bug
            Reporter: Sanjay Radia
            Assignee: Hairong Kuang
             Fix For: 0.17.0


HADOOP-2910 changed HDFS to stop accepting new connections when the RPC queue 
is full. It should continue to accept connections and let the OS deal with 
limiting them.

HADOOP-2910's decision not to read from open sockets when the queue is full is 
exactly right: requests back up on the client sockets and the clients simply 
wait (especially with HADOOP-2188, which removes client timeouts). 
However, we should continue to accept connections:

The OS refuses new connections only after a large number of connections are open 
(this is a configurable parameter). With the HADOOP-2910 change, we have a new, 
lower limit on the number of open connections once the RPC queue is full.
The problem is that during a surge of requests we stop accepting connections and 
clients get a connection failure (a change from the old behavior).
If we instead continue to accept connections, the surge will likely be over 
shortly and the clients will be served. Of course, if the surge lasts a long 
time the OS will stop accepting connections, clients will fail, and there is not 
much one can do (except raise the OS limit).
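For reference, a minimal sketch of where these OS-level limits show up for a 
Java server socket (the port and backlog values below are illustrative, not 
Hadoop settings):

    import java.net.InetSocketAddress;
    import java.nio.channels.ServerSocketChannel;

    public class ListenBacklogExample {
      public static void main(String[] args) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        // The second argument to bind() is the listen backlog: how many
        // established connections the kernel will queue before refusing new
        // ones. The effective value is also capped by kernel settings (e.g.
        // net.core.somaxconn on Linux), and the total number of accepted
        // sockets a process can hold is bounded by its open file descriptor
        // limit (ulimit -n), i.e. the "OS limit" one would have to raise.
        server.socket().bind(new InetSocketAddress(9000), 128);
      }
    }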

I propose that we continue accepting connections, but not read from them while 
the RPC queue is full (i.e. undo part of the HADOOP-2910 work, back to the old 
behavior).
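
As a rough illustration (this is not the actual Hadoop IPC Server code; the 
selector loop and the callQueue stand-in below are assumptions for the sketch), 
the idea is to keep OP_ACCEPT registered at all times and only drop OP_READ 
interest on client channels while the call queue is full:

    import java.net.InetSocketAddress;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class AcceptButDeferReads {
      // Stand-in for the server's RPC call queue.
      private static final BlockingQueue<Object> callQueue =
          new ArrayBlockingQueue<Object>(100);

      public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress(9000));
        server.register(selector, SelectionKey.OP_ACCEPT); // always accept

        while (true) {
          selector.select(1000);
          Iterator<SelectionKey> it = selector.selectedKeys().iterator();
          while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isAcceptable()) {
              // Accept new connections even when the call queue is full.
              SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
              if (client == null) continue;
              client.configureBlocking(false);
              client.register(selector, SelectionKey.OP_READ);
            } else if (key.isReadable()) {
              if (callQueue.remainingCapacity() == 0) {
                // Queue is full: stop reading from this connection for now.
                // Requests back up in the client's socket buffer and the
                // client simply waits.
                key.interestOps(key.interestOps() & ~SelectionKey.OP_READ);
              } else {
                // Read the request and enqueue it for the handler threads.
                // (Omitted: actual read and request deserialization.)
              }
            }
          }
          // Elsewhere, once the handler threads drain the queue, OP_READ
          // interest would be re-enabled on the paused connections.
        }
      }
    }

With this approach, a short surge just delays the backed-up requests until the 
handlers catch up; only a sustained surge pushes the pressure back onto the OS 
listen backlog and connection limits described above.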


