[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694214#comment-17694214 ]
ASF GitHub Bot commented on HDFS-16896: --------------------------------------- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119413264 ########## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ########## @@ -197,6 +197,15 @@ private void clearLocalDeadNodes() { deadNodes.clear(); } + /** + * Clear list of ignored nodes used for hedged reads. + */ + private void clearIgnoredNodes(Collection<DatanodeInfo> ignoredNodes) { Review Comment: I could add a `clearSkippedNodes` and then clear both dead and ignored in there, but that might be confusing as it sounds like theres another node type/list. I don't think `clearLocalDeadNodes` should also clear ignoredNodes because thats a bit miselading. The deadnodes and ignore nodes are handled differently, so i don't think its crazy to keep them separate and clear. (like i said before) i would add a method that clears both dead and ignore, but concerned it may be confusing. lmk what you think > HDFS Client hedged read has increased failure rate than without hedged read > --------------------------------------------------------------------------- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client > Reporter: Tom McCormick > Assignee: Tom McCormick > Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. > *stacktrace* > > {code:java} > Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain > block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 > file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc > at > org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039) > at > org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365) > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535) > at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172) > at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137) > at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36) > at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136) > at > org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76) > ... 46 more > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org