[ 
https://issues.apache.org/jira/browse/HADOOP-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557313#comment-13557313
 ] 

Todd Lipcon commented on HADOOP-9229:
-------------------------------------

Hey Kihwal. Have you been watching HDFS-4404? Looks like basically the same 
issue, if I'm understanding you correctly. In particular, see this comment: 
https://issues.apache.org/jira/browse/HDFS-4404?focusedCommentId=13555680&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13555680
                
> IPC: Retry on connection reset or socket timeout during SASL negotiation
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-9229
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9229
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.7
>            Reporter: Kihwal Lee
>
> When an RPC server is overloaded, incoming connections may not get accepted 
> in time, causing listen queue overflow. The impact on clients varies 
> depending on the OS in use. On Linux, connections in this state look fully 
> connected to the clients, but they are without buffers, so any data sent to 
> the server will get dropped.
> This won't be a problem for protocols where the client first waits for the 
> server's greeting. Even for client-speaks-first protocols, it will be fine if 
> the overload is transient and such connections are accepted before the 
> retransmissions of dropped packets arrive. Otherwise, clients can hit a 
> socket timeout after several retransmissions. In certain situations, the 
> connection will get reset while clients are still waiting for an ACK.
> We have seen this happen to IPC clients during SASL negotiation. Since no 
> call has been sent yet, we should allow a retry when a connection reset or 
> socket timeout happens at this stage.
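
For illustration only, the retry proposed in the description could be sketched 
roughly as below. This is not the actual Hadoop IPC client code: the class 
SaslSetupRetry, the ConnectionSetup interface, and the maxAttempts and 
backoffMillis parameters are all hypothetical names. The sketch assumes that 
connection setup (socket connect plus SASL negotiation) can be rerun as a 
whole, and that this is safe because no RPC call has been sent yet.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

/** Hypothetical sketch: retry connection setup on reset or socket timeout. */
final class SaslSetupRetry {

    /** Hypothetical stand-in for "connect and run SASL negotiation". */
    @FunctionalInterface
    public interface ConnectionSetup<T> {
        T run() throws IOException;
    }

    /** True for the two failure modes named in the issue description. */
    public static boolean isRetriable(IOException e) {
        if (e instanceof SocketTimeoutException) {
            return true;
        }
        // Heuristic (an assumption of this sketch): a reset usually surfaces
        // as an exception whose message contains "Connection reset".
        String msg = e.getMessage();
        return msg != null && msg.contains("Connection reset");
    }

    /** Runs setup up to maxAttempts times, retrying only retriable failures. */
    public static <T> T setupWithRetry(ConnectionSetup<T> setup,
                                       int maxAttempts,
                                       long backoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return setup.run();
            } catch (IOException e) {
                if (!isRetriable(e)) {
                    throw e; // unrelated failure: do not mask it
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis); // give the server time to drain
                }
            }
        }
        throw last; // every attempt failed with a retriable error
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated setup: times out twice, then succeeds.
        String result = setupWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new SocketTimeoutException("simulated timeout");
            }
            return "negotiated";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Bounding the attempts and sleeping between them is a design choice in this 
sketch: an overloaded server is unlikely to benefit from clients reconnecting 
immediately and without limit.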

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
