[ https://issues.apache.org/jira/browse/HDFS-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Boudnik updated HDFS-4646:
-------------------------------------

    Fix Version/s: 2.0.5-beta

> createNNProxyWithClientProtocol ignores configured timeout value
> ----------------------------------------------------------------
>
>                 Key: HDFS-4646
>                 URL: https://issues.apache.org/jira/browse/HDFS-4646
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0, 2.0.3-alpha, 2.0.4-alpha
>         Environment: Linux
>            Reporter: Jagane Sundar
>            Priority: Minor
>             Fix For: 3.0.0, 2.0.5-beta, 2.0.4-alpha
>
>         Attachments: HDFS-4646.001.patch, HDFS-4646.patch
>
>
> The Client RPC I/O timeout mechanism appears to be configured by two
> core-site.xml parameters:
> 1. A boolean, ipc.client.ping
> 2. A numeric value, ipc.ping.interval
> If ipc.client.ping is true, then we send an RPC ping every ipc.ping.interval
> milliseconds.
> If ipc.client.ping is false, then ipc.ping.interval becomes the socket
> timeout value.
> The bug here is that while creating a non-HA proxy, the configured timeout
> value is ignored, and 0 is passed in. 0 is taken to mean 'wait forever', so
> the client RPC socket never times out.
> Note that this bug is reproducible only in the case where the NN machine
> dies, i.e. the TCP stack with the NN IP address stops responding completely.
> The code does not take this path when you do a 'kill -9' of the NN process,
> since there is a TCP stack that is alive and sends out a TCP RST to the
> client, and that results in a socket error (not a timeout).
> The fix is to pass in the correct configured value for timeout by calling
> Client.getTimeout(conf) instead of passing in 0.
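
For illustration only, the timeout selection described in the report can be sketched in plain Java as below. This is not the Hadoop source: the class RpcTimeoutSketch, the method configuredTimeoutMs, the Map-based configuration, the default values, and the -1 sentinel for the ping-enabled case are all assumptions made for this sketch. Only the two property names and the meaning of 0 ('wait forever') come from the report.

```java
import java.util.Map;

/** Hypothetical stand-in for the timeout selection described in the report. */
final class RpcTimeoutSketch {
    static final String PING_KEY = "ipc.client.ping";
    static final String PING_INTERVAL_KEY = "ipc.ping.interval";

    // Defaults assumed for this sketch only.
    static final boolean DEFAULT_PING = true;
    static final int DEFAULT_PING_INTERVAL_MS = 60_000;

    // Assumed sentinel: pings are enabled, so no fixed RPC timeout is derived here.
    static final int NO_RPC_TIMEOUT = -1;

    /** Returns the interval as the timeout when pings are off, else the sentinel. */
    static int configuredTimeoutMs(Map<String, String> conf) {
        boolean ping = Boolean.parseBoolean(
                conf.getOrDefault(PING_KEY, Boolean.toString(DEFAULT_PING)));
        int intervalMs = Integer.parseInt(
                conf.getOrDefault(PING_INTERVAL_KEY,
                        Integer.toString(DEFAULT_PING_INTERVAL_MS)));
        return ping ? NO_RPC_TIMEOUT : intervalMs;
    }

    public static void main(String[] args) {
        Map<String, String> conf =
                Map.of(PING_KEY, "false", PING_INTERVAL_KEY, "5000");

        int hardCoded = 0;                          // reported behavior: 0 means wait forever
        int fromConfig = configuredTimeoutMs(conf); // behavior the fix asks for

        System.out.println("hard-coded timeout: " + hardCoded + " ms");
        System.out.println("configured timeout: " + fromConfig + " ms");
    }
}
```

With ipc.client.ping set to false, the sketch returns the configured interval rather than 0, which is the distinction the report draws between the buggy call and the proposed one.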