[ https://issues.apache.org/jira/browse/HDFS-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568284#comment-17568284 ]
JiangHua Zhu commented on HDFS-16565:
-------------------------------------

Thanks to [~weichiu] for the suggestion. I think I will use it.

> DataNode holds a large number of CLOSE_WAIT connections that are not released
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16565
>                 URL: https://issues.apache.org/jira/browse/HDFS-16565
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, ec
>    Affects Versions: 3.3.0
>         Environment: CentOS Linux release 7.5.1804 (Core)
>            Reporter: JiangHua Zhu
>            Priority: Major
>         Attachments: screenshot-1.png, screenshot-2.png
>
>
> There is a strange phenomenon here: the DataNode holds a large number of
> connections in the CLOSE_WAIT state and does not release them.
> netstat -na | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
> LISTEN 20
> CLOSE_WAIT 17707
> ESTABLISHED 1450
> TIME_WAIT 12
> The number of connections in the CLOSE_WAIT state has reached 17k and is
> still growing. Inspecting these connections with the lsof command shows the
> following:
> lsof -i tcp | grep -E 'CLOSE_WAIT|COMMAND'
> !screenshot-1.png!
> It can be seen that the cause is that Socket#close() is not called correctly
> on connections where the DataNode interacts with other nodes as a client.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
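
For readers unfamiliar with the CLOSE_WAIT state mentioned in the issue description: a TCP socket enters CLOSE_WAIT when the remote peer has closed its side and the local process has not yet called close(). The following is a minimal, self-contained sketch of the general pattern that avoids this on the client side, using try-with-resources so that Socket#close() runs on every exit path. It is not DataNode code and not the fix for HDFS-16565; the class name `CloseWaitSketch` and the methods `readUntilEofAndClose` and `demo` are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseWaitSketch {

    // Hypothetical helper: connects as a client, reads until the peer closes
    // its side (EOF), and always closes the local socket afterwards.
    static boolean readUntilEofAndClose(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        // try-with-resources closes the stream and the socket even if read() throws.
        // Without this close, the socket would stay in CLOSE_WAIT after the peer
        // closes, until the socket object is eventually cleaned up.
        try (Socket s = socket; InputStream in = s.getInputStream()) {
            while (in.read() != -1) {
                // Discard the data; only the EOF matters for this sketch.
            }
        }
        return socket.isClosed();
    }

    // Hypothetical demo: a loopback server sends one byte and closes first,
    // which is the situation that leaves an unclosed client in CLOSE_WAIT.
    static boolean demo() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = new Thread(() -> {
                try (Socket accepted = server.accept()) {
                    accepted.getOutputStream().write(42);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            peer.start();
            boolean closed = readUntilEofAndClose(server.getInetAddress(), server.getLocalPort());
            peer.join();
            return closed;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("client socket closed: " + demo());
    }
}
```

While a process like this runs, the netstat command quoted in the issue can be used to confirm that no CLOSE_WAIT entries accumulate for it. Whether the leak reported here comes from a missing close on an error path or from something else is not established by this sketch.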