Daryn Sharp created HDFS-7005:
---------------------------------

             Summary: DFS input streams do not timeout
                 Key: HDFS-7005
                 URL: https://issues.apache.org/jira/browse/HDFS-7005
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs-client
    Affects Versions: 2.5.0, 3.0.0
            Reporter: Daryn Sharp
            Assignee: Daryn Sharp
            Priority: Critical


DFS input streams have lost their read timeout.  The problem appears to be 
that {{DFSClient#newConnectedPeer}} does not set the read timeout.  During a 
temporary network interruption, the server may close the socket without the 
client host noticing, and the client then blocks on a read forever.
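
To illustrate the general failure mode (this is not the HDFS code path), here 
is a minimal sketch using plain {{java.net.Socket}}.  The class name 
{{ReadTimeoutDemo}} and the method {{readTimesOut}} are hypothetical names 
chosen for this example.  It assumes a local peer that accepts the connection 
and then stays silent: with a read timeout set via {{Socket#setSoTimeout}}, the 
blocked read is aborted with a {{SocketTimeoutException}}; with the default 
timeout of 0, the same read would block indefinitely.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Hadoop.
public class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends any data and attempts a
     * single read. Returns true if the read was aborted by a
     * SocketTimeoutException, false if the read returned.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without this call the timeout is 0, meaning "block forever".
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();      // the peer never writes, so this blocks
                return false;   // data or EOF arrived; no timeout occurred
            } catch (SocketTimeoutException e) {
                return true;    // the read timeout fired as intended
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

The 200 ms value is arbitrary.  A real client would presumably take the 
timeout from its configuration rather than hard-coding it.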

The results are dire.  Services such as the RM, JHS, NMs, and Oozie servers 
all need to be restarted to recover, unless you want to wait many hours for 
the TCP stack keepalive to detect the broken socket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
