[ https://issues.apache.org/jira/browse/YARN-11575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhixing updated YARN-11575:
---------------------------
    Attachment: image-2023-09-21-23-51-01-583.png
                5B985F102FAF4EECA0CAD1D20019D181.PNG-1.crdownload
                5B985F102FAF4EECA0CAD1D20019D181.PNG.crdownload
                微信图片_20230827103545.png
                微信图片_20230827102340.png
                微信图片_20230827102251.png
    Component/s: ATSv2
                 resourcemanager
    Affects Version/s: 3.1.0
    Description:
I encountered a YARN bug: executing the command "yarn application -status ats-hbase" leaks connections between the ResourceManager and the DataNode. The ResourceManager does not close its connections to the DataNode, so many TCP connections to the DataNode remain in the CLOSE_WAIT state on the ResourceManager node.

The relevant logs and screenshots follow. The tcpdump capture is of port 1019.

This is the tcpdump capture of traffic between the ResourceManager and DataNode port 1019.

This is the ResourceManager log:
!微信图片_20230827102251.png!

This is the ResourceManager process:
!微信图片_20230827102340.png!

This is the tcpdump packet information for traffic between the ResourceManager and DataNode port 1019:
!微信图片_20230827103545.png!

These are the TCP connections between the ResourceManager and the DataNode. After the ResourceManager has been running for a while, many connections are left in the CLOSE_WAIT state.
!image-2023-09-21-23-51-01-583.png!
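
The CLOSE_WAIT build-up described above can also be measured as text rather than read from a screenshot. The following is only a minimal Python sketch and is not part of the original report: the function name `count_close_wait`, the `SAMPLE` data, and its addresses are hypothetical and exist purely for illustration. Only port 1019 and the CLOSE_WAIT state come from the description. It assumes the column layout printed by `netstat -tan` on Linux.

```python
from collections import Counter

# Hypothetical sample in `netstat -tan` layout; the addresses are made up.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        1      0 192.0.2.1:45012         192.0.2.10:1019         CLOSE_WAIT
tcp        1      0 192.0.2.1:45014         192.0.2.10:1019         CLOSE_WAIT
tcp        0      0 192.0.2.1:45020         192.0.2.11:1019         ESTABLISHED
tcp        0      0 192.0.2.1:8088          192.0.2.20:51234        CLOSE_WAIT
"""


def count_close_wait(netstat_output: str, remote_port: int = 1019) -> Counter:
    """Count CLOSE_WAIT connections per remote host for one remote port.

    Expects rows of the form:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    counts: Counter = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows have at least six columns and start with tcp/tcp6;
        # anything else (banner, column header) is skipped.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # Split on the last colon so IPv6 addresses keep their own colons.
        host, _, port = foreign.rpartition(":")
        if state == "CLOSE_WAIT" and port == str(remote_port):
            counts[host] += 1
    return counts
```

Feeding the function the output of `netstat -tan` captured on the ResourceManager node before and after running "yarn application -status ats-hbase" several times would show whether the per-DataNode count keeps growing.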

My service versions are:
Ambari: 3.1.1.3.1.0.0-78
HDFS: 3.1.1.3.1
YARN: 3.1.0

        Summary: the connection of resourcemanager with datanode cannot close after executing the command yarn application -status ats-hbase  (was: resourcemanager connection with datanode leak)

> the connection of resourcemanager with datanode cannot close after executing the command yarn application -status ats-hbase
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-11575
>                 URL: https://issues.apache.org/jira/browse/YARN-11575
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: ATSv2, resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: zhixing
>            Priority: Major
>         Attachments: 5B985F102FAF4EECA0CAD1D20019D181.PNG-1.crdownload,
>                      5B985F102FAF4EECA0CAD1D20019D181.PNG.crdownload,
>                      image-2023-09-21-23-51-01-583.png,
>                      微信图片_20230827102251.png,
>                      微信图片_20230827102340.png,
>                      微信图片_20230827103545.png
>
>
> I encountered a YARN bug: executing the command "yarn application -status ats-hbase" leaks connections between the ResourceManager and the DataNode. The ResourceManager does not close its connections to the DataNode, so many TCP connections to the DataNode remain in the CLOSE_WAIT state on the ResourceManager node.
>
> The relevant logs and screenshots follow. The tcpdump capture is of port 1019.
>
> This is the tcpdump capture of traffic between the ResourceManager and DataNode port 1019.
>
> This is the ResourceManager log:
> !微信图片_20230827102251.png!
>
> This is the ResourceManager process:
> !微信图片_20230827102340.png!
>
> This is the tcpdump packet information for traffic between the ResourceManager and DataNode port 1019:
> !微信图片_20230827103545.png!
>
> These are the TCP connections between the ResourceManager and the DataNode. After the ResourceManager has been running for a while, many connections are left in the CLOSE_WAIT state.
> !image-2023-09-21-23-51-01-583.png!
>
>
> My service versions are:
> Ambari: 3.1.1.3.1.0.0-78
> HDFS: 3.1.1.3.1
> YARN: 3.1.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org