[ https://issues.apache.org/jira/browse/HADOOP-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810687#comment-13810687 ]
Andrew Wang commented on HADOOP-9898:
-------------------------------------

+1, LGTM. Thanks Todd. Will commit shortly.

> Set SO_KEEPALIVE on all our sockets
> -----------------------------------
>
>                 Key: HADOOP-9898
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9898
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc, net
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: hadoop-9898.txt
>
>
> We recently saw an issue where network issues between slaves and the NN
> caused ESTABLISHED TCP connections to pile up and leak on the NN side. It
> looks like the RST packets were getting dropped, which meant that the client
> thought the connections were closed, while they hung open forever on the
> server.
> Setting the SO_KEEPALIVE option on our sockets would prevent this kind of
> leak from going unchecked.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
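
For readers unfamiliar with the option discussed in the issue, the following is a minimal sketch of enabling SO_KEEPALIVE on both ends of a TCP connection using the standard java.net API. It is not the contents of hadoop-9898.txt; the class name KeepAliveSketch and the helper withKeepAlive are hypothetical names chosen for illustration, and the loopback connection exists only to make the example self-contained.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveSketch {

    /**
     * Hypothetical helper: turns on SO_KEEPALIVE for the given socket and
     * returns the same socket so the call can be chained.
     */
    static Socket withKeepAlive(Socket socket) throws IOException {
        socket.setKeepAlive(true);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; this is only for the demo.
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket client = withKeepAlive(new Socket(loopback, server.getLocalPort()));
             Socket accepted = withKeepAlive(server.accept())) {
            // Accepted (server-side) sockets need the option set explicitly too.
            System.out.println("client keepalive: " + client.getKeepAlive());
            System.out.println("server keepalive: " + accepted.getKeepAlive());
        }
    }
}
```

Note that Socket.setKeepAlive only switches the option on. How long a connection must be idle before probes are sent, and how many unanswered probes close it, are governed by operating system settings rather than by this call.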