[ https://issues.apache.org/jira/browse/HADOOP-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560123#action_12560123 ]
Raghu Angadi commented on HADOOP-2638:
--------------------------------------

Yup, the implementation is simple. I guess it is kind of like Thread.yield()... a good system/app should not need it, but it might be OK as a temporary fix. Alternately, we could have the extra thread on the client only when an idle timeout is configured, so that other users don't complain about it.

> Add close of idle connection to DFSClient and to DataNode DataXceiveServer
> --------------------------------------------------------------------------
>
>          Key: HADOOP-2638
>          URL: https://issues.apache.org/jira/browse/HADOOP-2638
>      Project: Hadoop
>   Issue Type: Improvement
>   Components: dfs
>     Reporter: stack
>
> This issue is for adding timeout and shutdown of idle DFSClient <-> DataNode connections.
> Applications can have DFS usage patterns that deviate from the MR 'norm', where files are generally opened, sucked down as fast as possible, and then closed. For example, at the other extreme, hbase wants to support fast random reading of key values over a sometimes relatively large set of MapFiles or MapFile equivalents. To avoid paying startup costs on every random read -- opening the file and reading in the index each time -- hbase just keeps all of its MapFiles open all the time.
> In an hbase cluster of any significant size, this can add up to lots of file handles per process: see HADOOP-2577, "[hbase] Scaling: Too many open file handles to datanodes" for an accounting.
> Given how DFSClient and DataXceiveServer interact when random reading, and given past observations that have the client-side file handles mostly stuck in CLOSE_WAIT (see HADOOP-2341, 'Datanode active connections never returns to 0'), a suggestion made on the list today, that idle connections should be timed out and closed, would help applications that have hbase-like access patterns conserve file handles and allow them to scale.
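The cleanup thread suggested above could look something like the following minimal sketch. The class and method names (IdleConnectionReaper, markUsed, evictIdle) are hypothetical and not Hadoop APIs; it only illustrates the bookkeeping a DFSClient-side reaper would need: track the last-use time of each connection and close any that exceed the configured idle timeout.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of client-side idle-connection cleanup.
 * A background daemon thread (started only when an idle timeout is
 * configured, per Raghu's suggestion) would call evictIdle() periodically.
 */
public class IdleConnectionReaper {
    private final long idleTimeoutMs;
    // Last-use timestamp (ms) per open connection.
    private final Map<Closeable, Long> lastUsed = new ConcurrentHashMap<Closeable, Long>();

    public IdleConnectionReaper(long idleTimeoutMs) {
        this.idleTimeoutMs = idleTimeoutMs;
    }

    /** Record that a connection was just used. */
    public void markUsed(Closeable conn, long nowMs) {
        lastUsed.put(conn, nowMs);
    }

    /**
     * Close and forget every connection idle longer than the timeout,
     * freeing its file handle so the socket does not sit in CLOSE_WAIT.
     * Returns the number of connections closed.
     */
    public int evictIdle(long nowMs) {
        int closed = 0;
        Iterator<Map.Entry<Closeable, Long>> it = lastUsed.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Closeable, Long> e = it.next();
            if (nowMs - e.getValue() > idleTimeoutMs) {
                try {
                    e.getKey().close();
                } catch (IOException ignored) {
                    // Best effort: an error closing an idle socket is not fatal.
                }
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    /** Number of connections still tracked. */
    public int size() {
        return lastUsed.size();
    }
}
```

A caller would mark each connection on every read and let the daemon thread sweep; the thread itself costs nothing for users who never configure a timeout, since it would only be started when the setting is present.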
> Below is context that comes off the mailing list under the subject: 'Re: Multiplexing sockets in DFSClient/datanodes?'
> {code}
> stack wrote:
> > Doug Cutting wrote:
> >> RPC also tears down idle connections, which HDFS does not. I wonder how
> >> much doing that alone might help your case? That would probably be much
> >> simpler to implement. Both client and server must already handle
> >> connection failures, so it shouldn't be too great of a change to have one
> >> or both sides actively close things down if they're idle for more than a
> >> few seconds.
> >
> > If we added tear down of idle sockets, that'd work for us and, as you
> > suggest, should be easier to do than rewriting the client to use async i/o.
> > Currently, when random reading, it's probably rare that the currently opened
> > HDFS block has the wanted offset, and so a tear down of the current socket
> > and an open of a new one is being done anyway.
> HADOOP-2346 helps with the Datanode side of the problem. We still need
> DFSClient to clean up idle connections (otherwise these sockets will stay in
> CLOSE_WAIT state on the client). This would require an extra thread on the client
> to clean up these connections. You could file a jira for it.
> Raghu.
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.