[ https://issues.apache.org/jira/browse/HDFS-17357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liuguanghua updated HDFS-17357:
-------------------------------
Description:

NioInetPeer.close() currently does not close the underlying socket connection.

In my environment, all data is stored with erasure coding (EC). I found more than 30,000 leaked connections on a datanode, along with many warning messages like the one below:

2024-01-22 15:27:57,500 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: hostname:50010:DataXceiverServer

When any exception is raised in DataXceiverServer, it closes the peer via IOUtils.closeStream(peer) -> Peer.close() -> NioInetPeer.close(). But NioInetPeer.close() does not close the socket connection, and this leads to connection leakage. The other Peer subclasses implement close() with socket.close(); see EncryptedPeer, DomainPeer, and BasicInetPeer. A sketch of the difference follows the reproduction steps below.

The issue can be reproduced as follows:
(1) A client writes data to HDFS.
(2) The datanode's Xceiver count reaches the limit configured by DFS_DATANODE_MAX_RECEIVER_THREADS_KEY (dfs.datanode.max.transfer.threads); each new Xceiver fails with an IOException, and its socket is not released.
(3) The client crashes so that no new data is written, or client.close() is executed.
(4) Socket connections leak between the datanodes.

The leaked connections look like this:
dn1
dn1:57042 dn2:50010 ESTABLISHED
dn2
dn2:50010 dn1:57042 ESTABLISHED
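A minimal sketch of the difference, assuming NioInetPeer holds a Socket plus SocketInputStream/SocketOutputStream wrappers over its channel; the field names and method bodies below are paraphrased for illustration, not copied verbatim from the Hadoop source:

// BasicInetPeer.close() releases the connection directly.
@Override
public void close() throws IOException {
  socket.close();
}

// NioInetPeer.close() before the fix (sketch): only the stream
// wrappers are closed, so the TCP connection can stay ESTABLISHED.
@Override
public void close() throws IOException {
  try {
    in.close();    // SocketInputStream over the socket's channel
  } finally {
    out.close();   // SocketOutputStream over the socket's channel
  }
}

// Sketch of the proposed direction: also close the underlying socket,
// mirroring BasicInetPeer, DomainPeer, and EncryptedPeer.
@Override
public void close() throws IOException {
  try {
    in.close();
  } finally {
    try {
      out.close();
    } finally {
      socket.close();  // ensure the connection itself is released
    }
  }
}

This shape keeps IOUtils.closeStream(peer) safe to call from the DataXceiverServer error path: closeStream() swallows any IOException, but every underlying resource is still released.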
> NioInetPeer.close() should close socket connection.
> ---------------------------------------------------
>
>                 Key: HDFS-17357
>                 URL: https://issues.apache.org/jira/browse/HDFS-17357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: liuguanghua
>            Assignee: liuguanghua
>            Priority: Major
>              Labels: pull-request-available
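A hypothetical way to check the behavior end to end, assuming DFSUtilClient.peerFromSocket() from hadoop-hdfs-client, which wraps a channel-backed socket in a NioInetPeer; the loopback scaffolding below is illustrative only, not the patch's actual test:

import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.channels.SocketChannel;

import org.apache.hadoop.hdfs.DFSUtilClient;
import org.apache.hadoop.hdfs.net.Peer;

public class PeerCloseCheck {
  public static void main(String[] args) throws Exception {
    // Local listener so the client channel has something to connect to.
    try (ServerSocket server = new ServerSocket(0)) {
      SocketChannel channel = SocketChannel.open(
          new InetSocketAddress("127.0.0.1", server.getLocalPort()));
      Socket socket = channel.socket();

      // A channel-backed socket should be wrapped in a NioInetPeer.
      Peer peer = DFSUtilClient.peerFromSocket(socket);
      peer.close();

      // With the fix this prints true; a leaked connection would
      // instead linger in ESTABLISHED state (visible via netstat).
      System.out.println("closed after peer.close(): " + socket.isClosed());
    }
  }
}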