[ https://issues.apache.org/jira/browse/HDFS-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029802#comment-13029802 ]
Bharath Mundlapudi commented on HDFS-1836:
------------------------------------------

Dennis,

To me, the following code looks like the problem:

    try {
      blockStream.close();
      blockReplyStream.close();
    } catch (IOException e) {
    }

Reason: if blockStream.close() throws an exception, blockReplyStream is never closed.

Can we replace both occurrences (2 places) in DFSClient with the following, so that each stream is closed independently?

    try {
      blockStream.close();
    } catch (IOException e) {
    }

    try {
      blockReplyStream.close();
    } catch (IOException e) {
    }

Can you try this change in your environment?

> Thousand of CLOSE_WAIT socket
> ------------------------------
>
>                 Key: HDFS-1836
>                 URL: https://issues.apache.org/jira/browse/HDFS-1836
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20.2
>         Environment: Linux 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST
> 2011 x86_64 x86_64 x86_64 GNU/Linux
> java version "1.6.0_23"
> Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
> Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
>            Reporter: Dennis Cheung
>         Attachments: patch-draft-1836.patch
>
>
> $ /usr/sbin/lsof -i TCP:50010 | grep -c CLOSE_WAIT
> 4471
> It is better if everything runs normal.
> However, from time to time there are some "DataStreamer Exception:
> java.net.SocketTimeoutException" and "DFSClient.processDatanodeError(2507) |
> Error Recovery for" can be found from log file and the number of CLOSE_WAIT
> socket just keep increasing
> The CLOSE_WAIT handles may remain for hours and days; then "Too many open
> file" some day.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira