[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881469#comment-16881469 ]

Yongjun Zhang commented on HDFS-9178:
--------------------------------------

Hi [~kihwal], many thanks for the work here!

> Slow datanode I/O can cause a wrong node to be marked bad
> ---------------------------------------------------------
>
>                 Key: HDFS-9178
>                 URL: https://issues.apache.org/jira/browse/HDFS-9178
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>             Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
>         Attachments: 002-HDFS-9178.branch-2.6.patch, HDFS-9178.branch-2.6.patch, HDFS-9178.patch
>
> When a non-leaf datanode in a pipeline is slow on or stuck at disk I/O, the downstream node can time out on reading a packet, since even the heartbeat packets will not be relayed down.
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
> peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, the upstream node's {{PacketResponder}} gets an {{EOFException}} and sends an ack upstream with the downstream node's status set to {{ERROR}}. This causes the client to exclude the downstream node, even though the upstream node was the one that got stuck.
> The connection to the downstream node has a longer timeout, so the downstream node will always time out first. The downstream timeout is set in {{writeBlock()}}:
> {code}
> int timeoutValue = dnConf.socketTimeout +
>     (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
> int writeTimeout = dnConf.socketWriteTimeout +
>     (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
> NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
> OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock, writeTimeout);
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
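The two formulas quoted in the description imply the ordering the bug report relies on: the downstream node's packet read timeout is plain {{socketTimeout}}, while the upstream node's timeout toward the downstream is extended per remaining target, so the downstream always fires first. A minimal sketch of that arithmetic, with assumed example values (the real defaults live in DNConf/HdfsConstants, and the class and constant names here are hypothetical):

```java
// Sketch (assumed example values) of why the downstream node always times
// out before the upstream's timeout toward it does.
public class TimeoutSketch {
    // Assumed stand-ins for dnConf.socketTimeout and
    // HdfsConstants.READ_TIMEOUT_EXTENSION; not the authoritative defaults.
    static final int SOCKET_TIMEOUT = 60_000;        // downstream packet read timeout (ms)
    static final int READ_TIMEOUT_EXTENSION = 5_000; // per-remaining-target extension (ms)

    // Mirrors the timeoutValue formula quoted from writeBlock().
    static int readTimeoutTowardDownstream(int remainingTargets) {
        return SOCKET_TIMEOUT + READ_TIMEOUT_EXTENSION * remainingTargets;
    }

    public static void main(String[] args) {
        for (int targets = 1; targets <= 3; targets++) {
            int upstream = readTimeoutTowardDownstream(targets);
            // The upstream's timeout is strictly larger than SOCKET_TIMEOUT for
            // any targets >= 1, so the downstream EOFs the connection first.
            System.out.println("targets=" + targets
                + " upstreamTimeout=" + upstream
                + " downstreamReadTimeout=" + SOCKET_TIMEOUT);
        }
    }
}
```

This gap is what turns a stuck upstream node into a downstream-blaming {{ERROR}} ack: the downstream is the first party to give up, so it is the one that appears to fail.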
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128715#comment-15128715 ]

Junping Du commented on HDFS-9178:
-----------------------------------

Thanks [~kihwal] for reviewing the patch! I have committed the patch to branch-2.6.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128657#comment-15128657 ]

Kihwal Lee commented on HDFS-9178:
-----------------------------------

+1 It looks like the only difference is in {{DataNodeFaultInjector}}.
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15080547#comment-15080547 ]

Junping Du commented on HDFS-9178:
-----------------------------------

Hi [~kihwal], I saw you already attached the patch for the 2.6 branch. Shall we commit this patch to branch-2.6? Thanks!
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15019094#comment-15019094 ]

Sangjin Lee commented on HDFS-9178:
------------------------------------

Does this issue exist in 2.6.x? Should this be backported to branch-2.6?
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947023#comment-14947023 ]

Hudson commented on HDFS-9178:
-------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #8587 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8587/])
HDFS-9178. Slow datanode I/O can cause a wrong node to be marked bad. (kihwal: rev 99e5204ff5326430558b6f6fd9da7c44654c15d7)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNodeFaultInjector.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946988#comment-14946988 ]

Daryn Sharp commented on HDFS-9178:
------------------------------------

+1 Seems to have helped fix our problems.
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947041#comment-14947041 ]

Kihwal Lee commented on HDFS-9178:
-----------------------------------

Thanks for the review, Daryn. I've committed this to trunk, branch-2 and branch-2.7.
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939789#comment-14939789 ]

Kihwal Lee commented on HDFS-9178:
-----------------------------------

- release audit: caused by the EC branch merge
- checkstyle: file length, which was already over the "limit"
- test failures: mostly new EC-related tests; they seem to pass when run locally, including {{TestLazyWriter}}
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14938762#comment-14938762 ] Kihwal Lee commented on HDFS-9178: A simple solution is to let the datanode check when it last sent a packet whenever the downstream closes the connection. If it has not sent a packet for a long time (e.g. 0.9*timeout; it is supposed to send a packet at least every 0.5*timeout), it or its upstream might be at fault. In this case, it will simply close the connection to its upstream, so that the same check is triggered upstream. If an upstream node thinks it has sent packets in time, the downstream node will be reported as bad. When this goes all the way to the client, the client will remove the first node and rebuild the pipeline. Since {{DataStreamer}} does not get stuck on disk I/O (except on the rare occasion when it logs and the disk is having an issue), the cause would be either a slow first node or a communication problem between the client and the first node. So removing the first node seems reasonable.
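The check described above can be sketched as follows. This is a minimal illustration of the idea, not the actual HDFS patch; the class and method names are hypothetical, and the 0.9/0.5 fractions come from the comment above.

```java
// Sketch (hypothetical names): when the downstream closes the connection,
// decide whether to blame the downstream node or to propagate the close
// upstream, based on how recently this node last relayed a packet.
public class AckDecision {
    // A node is supposed to relay a packet at least every 0.5 * timeout
    // (heartbeats included). If nothing was sent for 0.9 * timeout, this
    // node or its upstream is likely the stuck party.
    static final double STALE_FRACTION = 0.9;

    /**
     * @param lastPacketSentMs when the last packet was relayed downstream
     * @param nowMs            current time
     * @param timeoutMs        the downstream read timeout
     * @return true if the downstream should be reported as bad; false if this
     *         node should instead close its own upstream connection so the
     *         same check runs one hop closer to the client.
     */
    static boolean blameDownstream(long lastPacketSentMs, long nowMs,
                                   long timeoutMs) {
        long sinceLastSend = nowMs - lastPacketSentMs;
        // Sent recently: we were keeping up, so the downstream timed out on
        // its own fault. Stale: propagate the close upstream instead.
        return sinceLastSend < (long) (STALE_FRACTION * timeoutMs);
    }
}
```

When the check fails on every hop, the close propagates to the client's {{DataStreamer}}, which then drops the first node, matching the reasoning in the comment.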
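The staggered timeouts quoted in the issue description are why the downstream always times out first: each hop's timeout toward its mirror grows with the number of remaining targets, while the plain read timeout does not. A small sketch (the 60s base and 5s extension match the defaults for {{dfs.client.socket-timeout}} and {{HdfsConstants.READ_TIMEOUT_EXTENSION}}, but treat the exact values here as illustrative):

```java
// Sketch: per-hop timeouts in a write pipeline. A node reading from its
// upstream uses the base socket timeout, but its connection toward the
// mirror gets an extension per remaining downstream target, so the
// downstream reader always times out before the upstream writer does.
public class PipelineTimeouts {
    static final int SOCKET_TIMEOUT_MS = 60_000;        // dfs.client.socket-timeout default
    static final int READ_TIMEOUT_EXTENSION_MS = 5_000; // per remaining target

    // Timeout this node uses on its connection to the next (downstream)
    // node, where `targets` is the number of nodes remaining downstream.
    static int mirrorReadTimeout(int targets) {
        return SOCKET_TIMEOUT_MS + READ_TIMEOUT_EXTENSION_MS * targets;
    }
}
```

With a three-node pipeline the first node waits 70s on its mirror while the last node's plain 60s read timeout fires first, producing the misleading ERROR ack described above.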
[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad
[ https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14938946#comment-14938946 ] Hadoop QA commented on HDFS-9178:
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 17m 37s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 1s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 59s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 1 new checkstyle issues (total was 61, now 61). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 64m 32s | Tests failed in hadoop-hdfs. |
| | | | 109m 51s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestWriteReadStripedFile |
| Timed out tests | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
| | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12764472/HDFS-9178.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6c17d31 |
| Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12753/artifact/patchprocess/patchReleaseAuditProblems.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12753/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12753/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12753/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12753/console |
This message was automatically generated.