[
https://issues.apache.org/jira/browse/HADOOP-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582785#action_12582785
]
Raghu Angadi commented on HADOOP-3109:
--------------------------------------
> It should continue to accept connections and let the OS deal with limiting
> connections.
How can the OS limit connections properly if the application keeps accepting
them? There could be some global limit in the OS, but isn't that very harsh on
everything else on the machine? Which parameter is this?
> The problem is that when there is a surge of requests, we would stop
> accepting connections and clients will get a connection failure (a change from
> the old behavior).
The timeout is removed in HADOOP-2188. If that is good enough for 2188, it is
good here too. Ideally we should just have 2188 :).
> RPC should accept connections even when rpc queue is full (i.e., undo part of
> HADOOP-2910)
> -----------------------------------------------------------------------------------------
>
> Key: HADOOP-3109
> URL: https://issues.apache.org/jira/browse/HADOOP-3109
> Project: Hadoop Core
> Issue Type: Bug
> Reporter: Sanjay Radia
> Assignee: Hairong Kuang
> Fix For: 0.17.0
>
>
> HADOOP-2910 changed HDFS to stop accepting new connections when the rpc queue
> is full. It should continue to accept connections and let the OS deal with
> limiting connections.
> HADOOP-2910's decision to not read from open sockets when the queue is full is
> exactly right: requests back up on the client sockets and the clients just
> wait (especially with HADOOP-2188, which removes client timeouts).
> However, we should continue to accept connections:
> The OS refuses new connections after a large number of connections are open
> (this is a configurable parameter). With this patch, we have a new, lower limit
> on the number of open connections when the RPC queue is full.
> The problem is that when there is a surge of requests, we would stop
> accepting connections and clients will get a connection failure (a change from
> the old behavior).
> Instead, if we continue to accept connections, it is likely that the surge
> will be over shortly and clients will get served. Of course, if the surge lasts
> a long time, the OS will stop accepting connections and clients will fail, and
> there is not much one can do (except raise the OS limit).
> I propose that we continue accepting connections, but not read from
> connections when the RPC queue is full (i.e., undo part of the 2910 work and go
> back to the old behavior).
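
The proposal in the description (keep accepting, but stop reading while the RPC
queue is full) can be sketched as a choice of selector interest ops. This is
only an illustration, not the Hadoop server code: the class name, the method
names, and the use of a BlockingQueue as the call queue are all assumptions
made for the example.

```java
import java.nio.channels.SelectionKey;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch; names here are not taken from the Hadoop code base.
public class AcceptButDeferReads {

    // The listening channel stays registered for OP_ACCEPT regardless of how
    // full the call queue is (the queue argument is deliberately ignored).
    public static int listenerInterest(BlockingQueue<?> callQueue) {
        return SelectionKey.OP_ACCEPT;
    }

    // An accepted connection is only read from while the queue has room.
    // Returning 0 clears OP_READ, so unread requests back up on the socket.
    public static int connectionInterest(BlockingQueue<?> callQueue) {
        return callQueue.remainingCapacity() > 0 ? SelectionKey.OP_READ : 0;
    }

    public static void main(String[] args) {
        // Assumed bounded call queue with room for two pending calls.
        BlockingQueue<String> callQueue = new ArrayBlockingQueue<>(2);
        callQueue.add("call-1");
        callQueue.add("call-2"); // queue is now full

        System.out.println("accepting: "
            + (listenerInterest(callQueue) == SelectionKey.OP_ACCEPT));
        System.out.println("reading:   "
            + (connectionInterest(callQueue) == SelectionKey.OP_READ));

        callQueue.poll(); // a handler drains one call, freeing a slot

        System.out.println("reading after drain: "
            + (connectionInterest(callQueue) == SelectionKey.OP_READ));
    }
}
```

In a real selector loop these values would be passed to
SelectionKey.interestOps(int) on each key. How many accepted but unread
connections the process can hold is still bounded by OS limits, which is the
concern raised in the comment above.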