[ 
https://issues.apache.org/jira/browse/HADOOP-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299249#comment-14299249
 ] 

Daryn Sharp commented on HADOOP-9137:
-------------------------------------

I think the best approach would be to accept the connection, return a 
RetriableException, and close it.  Alas, it's not trivial, but it shouldn't be 
too hard.

The current approach of relying on the idle scan doesn't work because the 
outstanding call metrics are wrong.  Out-of-band RPC messages (e.g. SASL) don't 
increment the count on read, but do decrement it on response.  I think I filed 
a JIRA about it a long time ago.  The acceptor will go into a spin loop 
scanning for connections that will never be considered idle, all the while 
spewing garbage and increasing GC pressure.
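
To illustrate the counter drift described above, here is a minimal model in 
Java.  It is a sketch only, not Hadoop's actual server code: the class name 
CallCounterDrift and all of its methods are hypothetical, and it assumes the 
idle check requires the outstanding count to be exactly zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of a per-connection outstanding-call counter. */
public class CallCounterDrift {
    private final AtomicInteger outstanding = new AtomicInteger();

    /** Normal calls are counted on read; out-of-band messages are not. */
    public void readCall(boolean outOfBand) {
        if (!outOfBand) {
            outstanding.incrementAndGet();
        }
    }

    /** Every response decrements, including replies to out-of-band messages. */
    public void sendResponse() {
        outstanding.decrementAndGet();
    }

    /** Assumed idle check: only a count of exactly zero qualifies. */
    public boolean isIdle() {
        return outstanding.get() == 0;
    }

    public int outstanding() {
        return outstanding.get();
    }

    public static void main(String[] args) {
        CallCounterDrift conn = new CallCounterDrift();
        conn.readCall(true);   // out-of-band message: no increment
        conn.sendResponse();   // its response: decrement
        // prints "outstanding=-1 idle=false"
        System.out.println("outstanding=" + conn.outstanding()
            + " idle=" + conn.isIdle());
    }
}
```

Under that assumption, a single out-of-band exchange leaves the count at -1, 
so the connection never looks idle and a scan that waits for idle connections 
has nothing to reclaim.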

The easiest middle ground is probably to just accept & close the connection.  
It's not very friendly to clients, but it's a lot better than the NN dying.
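
A rough sketch of the accept & close idea, using plain java.nio rather than 
the real Hadoop listener.  The class AcceptAndClose, the maxConnections limit, 
and the method names are all assumptions made for illustration; sending a 
RetriableException before closing is not shown.

```java
import java.io.IOException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical connection gate: accept, then close if over the limit. */
public class AcceptAndClose {
    private final int maxConnections;                 // assumed configurable limit
    private final AtomicInteger open = new AtomicInteger();

    public AcceptAndClose(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    /** Reserves a connection slot; returns false if the limit is reached. */
    public boolean tryReserve() {
        while (true) {
            int cur = open.get();
            if (cur >= maxConnections) {
                return false;
            }
            if (open.compareAndSet(cur, cur + 1)) {
                return true;
            }
        }
    }

    /** Gives a slot back when a tracked connection is closed. */
    public void releaseSlot() {
        open.decrementAndGet();
    }

    /**
     * Accepts the pending connection. If no slot is available, the channel is
     * closed right away so its file descriptor is released, and null is
     * returned. Also returns null if a non-blocking server has nothing pending.
     */
    public SocketChannel acceptOrReject(ServerSocketChannel server)
            throws IOException {
        SocketChannel ch = server.accept();
        if (ch == null) {
            return null;
        }
        if (!tryReserve()) {
            ch.close();
            return null;
        }
        return ch;
    }
}
```

Accepting before closing matters here: leaving connections in the listen 
backlog would not free anything, whereas accept followed by close returns the 
fd immediately and keeps the total bounded.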

> NN connections can use up all fds leaving none for rolling journal files
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-9137
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9137
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Sanjay Radia
>            Assignee: Kihwal Lee
>         Attachments: HADOOP-9137.patch, hadoop-9137.trunk.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
