[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuanxin Zhu updated HDFS-16293:
-------------------------------
    Summary: Client sleep and hold 'dataQueue' when DataNodes are congested  (was: Client sleep and hold 'dataqueue' when datanode are condensed)

> Client sleep and hold 'dataQueue' when DataNodes are congested
> --------------------------------------------------------------
>
>                 Key: HDFS-16293
>                 URL: https://issues.apache.org/jira/browse/HDFS-16293
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.2.2
>            Reporter: Yuanxin Zhu
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When I enable ECN and run Terasort as a test, the DataNodes become
> congested (HDFS-8008). After receiving ACKs many times, the client goes to
> sleep but does not release the lock on 'dataQueue'. The ResponseProcessor
> thread needs that same lock to execute 'ackQueue.getFirst()', so it has to
> wait until the client releases 'dataQueue'. In effect the ResponseProcessor
> thread sleeps as well, and ACK handling is delayed. MapReduce tasks can be
> delayed by tens of minutes or even hours.
> Proposed fix: the DataStreamer thread first executes
> 'one = dataQueue.getFirst()', releases 'dataQueue', and only then decides
> whether to call 'backOffIfNecessary()', based on 'one.isHeartbeatPacket()'.
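
The proposal in the description can be sketched roughly as follows. This is a
simplified model, not the actual DataStreamer code: the class
BackoffOutsideLockSketch, its Packet type, and the methods enqueue, sendNext
and firstUnacked are hypothetical. Only the names dataQueue, ackQueue,
isHeartbeatPacket and backOffIfNecessary come from the report. The sketch
assumes a single streamer thread is the only consumer of dataQueue, and it
records whether the dataQueue monitor is held while backing off.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified, hypothetical model of the streamer/responder interaction.
// None of these types are the real HDFS client classes.
class BackoffOutsideLockSketch {

    // Stand-in for a packet; only the heartbeat flag matters here.
    static final class Packet {
        final boolean heartbeat;
        Packet(boolean heartbeat) { this.heartbeat = heartbeat; }
        boolean isHeartbeatPacket() { return heartbeat; }
    }

    final Deque<Packet> dataQueue = new ArrayDeque<>();
    final Deque<Packet> ackQueue = new ArrayDeque<>();
    final long backoffMillis;

    // Recorded for demonstration only.
    boolean lockHeldDuringBackoff = false;
    int backoffCount = 0;

    BackoffOutsideLockSketch(long backoffMillis) {
        this.backoffMillis = backoffMillis;
    }

    void enqueue(Packet p) {
        synchronized (dataQueue) {
            dataQueue.addLast(p);
        }
    }

    // Sleeps when the pipeline is congested; called WITHOUT holding dataQueue.
    void backOffIfNecessary(boolean congested) {
        if (!congested) {
            return;
        }
        lockHeldDuringBackoff |= Thread.holdsLock(dataQueue);
        backoffCount++;
        try {
            Thread.sleep(backoffMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }

    // Streamer side: read the head packet under the lock, release the lock,
    // then back off only for non-heartbeat packets.
    Packet sendNext(boolean congested) {
        Packet one;
        synchronized (dataQueue) {
            one = dataQueue.getFirst(); // throws NoSuchElementException if empty
        }
        if (!one.isHeartbeatPacket()) {
            backOffIfNecessary(congested);
        }
        synchronized (dataQueue) {
            // Assumes this thread is the only consumer, so the head is still 'one'.
            dataQueue.removeFirst();
            ackQueue.addLast(one);
        }
        return one;
    }

    // Responder side: needs the dataQueue monitor only briefly.
    Packet firstUnacked() {
        synchronized (dataQueue) {
            return ackQueue.peekFirst();
        }
    }

    public static void main(String[] args) {
        BackoffOutsideLockSketch s = new BackoffOutsideLockSketch(10);
        s.enqueue(new Packet(false));
        s.sendNext(true);
        System.out.println("lock held during backoff: " + s.lockHeldDuringBackoff);
    }
}
```

Running main prints "lock held during backoff: false", since the sleep happens
after the synchronized block exits. A real change would also have to account
for anything else that can modify the head of dataQueue while the lock is
released, which this sketch does not model.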