[ https://issues.apache.org/jira/browse/HDFS-17397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825931#comment-17825931 ]

ASF GitHub Bot commented on HDFS-17397:
---------------------------------------

xleoken commented on code in PR #6591:
URL: https://github.com/apache/hadoop/pull/6591#discussion_r1508372843


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java:
##########
@@ -1182,10 +1182,12 @@ public void run() {
             if (begin != null) {
               long duration = Time.monotonicNowNanos() - begin;
               if (TimeUnit.NANOSECONDS.toMillis(duration) > dfsclientSlowLogThresholdMs) {
-                LOG.info("Slow ReadProcessor read fields for block " + block
+                final String msg = "Slow ReadProcessor read fields for block " + block
                     + " took " + TimeUnit.NANOSECONDS.toMillis(duration) + "ms (threshold="
                     + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
-                    + ", targets: " + Arrays.asList(targets));
+                    + ", targets: " + Arrays.asList(targets);
+                LOG.warn(msg);
+                throw new IOException(msg);

Review Comment:
   Welcome @ZanderXu 
   
   > How to identify this case
   
   When the client takes longer than `dfsclientSlowLogThresholdMs` to read an ack.
   
   > Which datanode should be marked as a bad or slow DN
   
   When some datanodes are in a poor network environment.
   
   > Maybe Datastreamer can identify this case and recovery it through PipelineRecovery
   
   The core issue is that when the response time between the client and a DN 
exceeds `dfsclientSlowLogThresholdMs`, the client only prints a log and takes 
no action. We should print the log and also throw an `IOException`.
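   
   To make the proposed behaviour concrete, here is a minimal stand-alone 
   sketch. It is not the actual `DataStreamer` code: the `SlowAckCheck` class 
   and `checkAckLatency` method are hypothetical names used only for 
   illustration, and `java.util.logging` stands in for the client's real 
   logger.
   
   ```java
   import java.io.IOException;
   import java.util.concurrent.TimeUnit;
   import java.util.logging.Logger;

   // Hypothetical helper for illustration only; not part of Hadoop.
   final class SlowAckCheck {
     private static final Logger LOG =
         Logger.getLogger(SlowAckCheck.class.getName());

     /**
      * Logs a warning and throws if the time elapsed between beginNanos and
      * nowNanos is greater than thresholdMs. Returns normally otherwise.
      */
     static void checkAckLatency(long beginNanos, long nowNanos, long thresholdMs)
         throws IOException {
       long durationMs = TimeUnit.NANOSECONDS.toMillis(nowNanos - beginNanos);
       if (durationMs > thresholdMs) {
         String msg = "Slow ack read took " + durationMs
             + "ms (threshold=" + thresholdMs + "ms)";
         LOG.warning(msg);
         // Throwing, instead of only logging, lets the caller react.
         throw new IOException(msg);
       }
     }

     public static void main(String[] args) {
       long begin = System.nanoTime();
       try {
         // Simulate an ack that arrives 40 s after begin with a 30 s threshold.
         checkAckLatency(begin, begin + TimeUnit.SECONDS.toNanos(40), 30_000L);
       } catch (IOException e) {
         System.out.println("caller would handle: " + e.getMessage());
       }
     }
   }
   ```
   
   How the caller handles the exception (for example, whether it leads to 
   another DN being chosen) is outside this sketch.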
   
   > but I don't think your modification is a good solution.
   
   Maybe you're right, but this may be the simplest modification. With this 
patch applied, we resolved the slow DN problem in our production environment.
   





> Choose another DN as soon as possible, when encountering network issues
> -----------------------------------------------------------------------
>
>                 Key: HDFS-17397
>                 URL: https://issues.apache.org/jira/browse/HDFS-17397
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: xleoken
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: hadoop.png
>
>
> Choose another DN as soon as possible, when encountering network issues.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

