[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log
[ https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=672941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672941 ]

ASF GitHub Bot logged work on HDFS-16266:
- Author: ASF GitHub Bot
- Created on: 02/Nov/21 02:38
- Start Date: 02/Nov/21 02:38
- Worklog Time Spent: 10m

Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957045765
Hi @aajisaka, could you please review this again? Thanks a lot.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
--- Worklog Id: (was: 672941) Time Spent: 7h (was: 6h 50m)

> Add remote port information to HDFS audit log
> ---------------------------------------------
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: tomscut
> Assignee: tomscut
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h
> Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a user submits an abnormal computation task, triggering a sudden flood of requests that drives the NameNode's queueTime and processingTime very high and creates a large backlog of tasks.
> We usually locate and kill the specific Spark, Flink, or MapReduce tasks based on metrics and audit logs. Currently, IP and UGI are recorded in the audit logs, but there is no port information, so it is sometimes difficult to locate specific processes. I therefore propose adding the port information to the audit log, so that we can easily track the upstream process.
> Some projects, such as HBase and Alluxio, already include port information in their audit logs. I think it is also necessary to add port information to the HDFS audit logs.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
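The proposal above amounts to widening the client identity recorded in each audit entry from a bare IP to an ip:port pair, so that one of several processes on the same host can be singled out. A minimal sketch of what such an entry could look like; the tab-separated field layout mimics the general shape of HDFS audit log lines, and the `format` helper and its parameters are purely illustrative, not the actual patch:

```java
// Hypothetical sketch of an audit log entry that carries the client's
// remote port alongside its IP (field names are illustrative).
public class AuditLogSketch {

    // Builds one tab-separated audit line; the ":port" suffix on the ip
    // field is the addition proposed in HDFS-16266.
    static String format(String ugi, String ip, int port, String cmd, String src) {
        return String.format("ugi=%s\tip=/%s:%d\tcmd=%s\tsrc=%s", ugi, ip, port, cmd, src);
    }

    public static void main(String[] args) {
        String line = format("alice", "192.168.1.10", 45678, "getfileinfo", "/user/alice");
        // With the port present, the entry pins down a single client process,
        // not just a host.
        assert line.contains("ip=/192.168.1.10:45678");
    }
}
```

The value of the port is exactly what the reporter describes: an IP alone identifies a host, while ip:port identifies the TCP connection of one upstream process.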
[jira] [Resolved] (HDFS-14240) blockReport test in NNThroughputBenchmark throws ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HDFS-14240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka resolved HDFS-14240.
- Assignee: (was: Ranith Sardar)
- Resolution: Duplicate

Closing as duplicate.

> blockReport test in NNThroughputBenchmark throws ArrayIndexOutOfBoundsException
> -------------------------------------------------------------------------------
>
> Key: HDFS-14240
> URL: https://issues.apache.org/jira/browse/HDFS-14240
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Shen Yinjie
> Priority: Major
> Attachments: screenshot-1.png
>
> When I run a blockReport test with NNThroughputBenchmark, BlockReportStats.addBlocks() throws ArrayIndexOutOfBoundsException.
> Digging into the code:
> {code:java}
> for (DatanodeInfo dnInfo : loc.getLocations()) {
>   int dnIdx = dnInfo.getXferPort() - 1;
>   datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock());
> {code}
> The problem is here: the length of the datanodes array is determined by the "-datanodes" or "-threads" arguments, but dnIdx is derived from dnInfo.getXferPort(), which is a random port.
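The mismatch described above, an array sized by a benchmark argument but indexed by an essentially random port number, can be sketched in isolation. The class and method names below are hypothetical, not the NNThroughputBenchmark code; the second helper shows one generic way out, mapping each distinct port to a dense index instead of using the port itself:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the indexing hazard in HDFS-14240 (illustrative names only).
public class DnIndexSketch {

    // Mirrors the buggy pattern: treats (xferPort - 1) as an index into an
    // array of numDatanodes elements. A randomly assigned port almost always
    // falls outside the bounds.
    static boolean isSafeIndex(int xferPort, int numDatanodes) {
        int dnIdx = xferPort - 1;
        return dnIdx >= 0 && dnIdx < numDatanodes;
    }

    // A safer alternative: assign each distinct port a dense 0-based index
    // on first sight, so the index space matches the array size.
    static int denseIndex(Map<Integer, Integer> portToIdx, int xferPort) {
        return portToIdx.computeIfAbsent(xferPort, p -> portToIdx.size());
    }

    public static void main(String[] args) {
        // 10 simulated datanodes, but a typical ephemeral port like 50010
        // blows straight past the array bounds.
        assert !isSafeIndex(50010, 10);

        Map<Integer, Integer> portToIdx = new HashMap<>();
        assert denseIndex(portToIdx, 50010) == 0;
        assert denseIndex(portToIdx, 50011) == 1;
        assert denseIndex(portToIdx, 50010) == 0; // same port, same index
    }
}
```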
[jira] [Comment Edited] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437082#comment-17437082 ]

tomscut edited comment on HDFS-16292 at 11/2/21, 12:57 AM:
---
Hi [~weichiu], do you mean this issue: HDFS-10223?

was (Author: tomscut): Hi [~weichiu], do you mean this issue [HDFS-10223|https://issues.apache.org/jira/browse/HDFS-10223].

> The DFS Input Stream is waiting to be read
> ------------------------------------------
>
> Key: HDFS-16292
> URL: https://issues.apache.org/jira/browse/HDFS-16292
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.5.2
> Reporter: Hualong Zhang
> Priority: Minor
> Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png, image-2021-11-02-08-54-27-273.png
>
> The input stream has been waiting. The problem seems to be that BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout. We can solve this problem by setting the timeout in BlockReaderFactory#nextTcpPeer.
> Jstack as follows:
> !image-2021-11-01-18-36-54-329.png!
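The effect of the missing timeout can be demonstrated with plain java.net sockets: a blocking read on a connection with no SO_TIMEOUT parks the thread indefinitely (the epollWait state visible in the jstack), while setting one turns the hang into a recoverable SocketTimeoutException. This is a standalone sketch of the mechanism, not the BlockReaderFactory change itself:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Demonstrates why a read timeout matters: the server side accepts the
// connection but never writes, just like an unresponsive DataNode.
public class ReadTimeoutSketch {

    // Returns true if the read was bounded by SO_TIMEOUT rather than
    // blocking forever.
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(timeoutMillis); // 0 would mean "wait forever"
            try {
                client.getInputStream().read(); // the peer never sends a byte
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // hang converted into a recoverable error
            }
        }
    }

    public static void main(String[] args) throws IOException {
        assert readTimesOut(100);
    }
}
```

Without `setSoTimeout`, the `read()` above never returns, which matches the "waiting to be read" symptom reported in this issue.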
[jira] [Comment Edited] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437084#comment-17437084 ]

tomscut edited comment on HDFS-16292 at 11/2/21, 12:57 AM:
---
Coincidentally, our cluster also encountered this problem yesterday, but our version is 3.1.0 (this patch HDFS-10223 has been merged).

Client stack:

!image-2021-11-02-08-54-27-273.png|width=607,height=341!

{code:java}
"Executor task launch worker for task 2690" #47 daemon prio=5 os_prio=0 tid=0x7f3730286800 nid=0x1abc4 runnable [0x7f37109ed000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
	- locked <0x0006cb9cf3a0> (a sun.nio.ch.Util$2)
	- locked <0x0006cb9cf390> (a java.util.Collections$UnmodifiableSet)
	- locked <0x0006cb9cf168> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
	at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
	at java.io.FilterInputStream.read(FilterInputStream.java:83)
	at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:547)
	at org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.newBlockReader(BlockReaderRemote.java:407)
	at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:853)
	at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:749)
	at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:379)
	at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:669)
	at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1117)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1069)
	at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1501)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1465)
	at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
	at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)
	at org.apache.orc.impl.RecordReaderUtils.readDiskRanges(RecordReaderUtils.java:566)
	at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readRowIndex(RecordReaderUtils.java:219)
	at org.apache.orc.impl.RecordReaderImpl.readRowIndex(RecordReaderImpl.java:1419)
	at org.apache.orc.impl.RecordReaderImpl.readRowIndex(RecordReaderImpl.java:1402)
	at org.apache.orc.impl.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:1056)
	at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1087)
	at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1254)
	at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1289)
	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1325)
	at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.nextBatch(OrcColumnarBatchReader.java:196)
	at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.nextKeyValue(OrcColumnarBatchReader.java:99)
	at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:145)
	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
	at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:492)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage5.columnartorow_nextBatch_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage5.agg_doAggregateWithKeys_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage5.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
	at
{code}
[jira] [Commented] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437082#comment-17437082 ] tomscut commented on HDFS-16292: Hi [~weichiu], do you mean this issue: [HDFS-10223|https://issues.apache.org/jira/browse/HDFS-10223]? > The DFS Input Stream is waiting to be read > -- > > Key: HDFS-16292 > URL: https://issues.apache.org/jira/browse/HDFS-16292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Affects Versions: 2.5.2 > Reporter: Hualong Zhang > Priority: Minor > Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png > > > The input stream has been waiting. The problem seems to be that > BlockReaderPeer#peer does not set a read timeout or a write timeout. We can solve > this problem by setting the timeouts in BlockReaderFactory#nextTcpPeer. > Jstack as follows: > !image-2021-11-01-18-36-54-329.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
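The fix proposed in HDFS-16292 above, setting explicit read/write timeouts when the block reader opens a new TCP peer, can be sketched as follows. This is a minimal illustration with a plain `java.net.Socket`; the class and constant names are hypothetical, not the actual Hadoop code.

```java
import java.io.IOException;
import java.net.Socket;

// Sketch of the idea behind HDFS-16292: without a socket timeout, a read
// from a stalled DataNode can wait indefinitely, matching the reported
// stack stuck in epollWait. Names here are illustrative only.
public class PeerTimeoutSketch {
    static final int READ_TIMEOUT_MS = 60_000; // hypothetical default

    static Socket newPeerSocket() throws IOException {
        Socket s = new Socket();
        // With SO_TIMEOUT set, a blocked read() throws
        // SocketTimeoutException instead of hanging forever.
        s.setSoTimeout(READ_TIMEOUT_MS);
        return s;
    }

    public static void main(String[] args) throws IOException {
        Socket s = newPeerSocket();
        System.out.println(s.getSoTimeout());
        s.close();
    }
}
```

In the real client the equivalent knobs are the dfs.client.socket-timeout and dfs.datanode.socket.write.timeout settings mentioned elsewhere in this thread.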
[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client
[ https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437043#comment-17437043 ] Clay B. commented on HDFS-6994: --- Hi [~wangzw], I updated this JIRA's description to have a working URL to libhdfs3 for those looking. Please revert my changes if that is undesirable. > libhdfs3 - A native C/C++ HDFS client > - > > Key: HDFS-6994 > URL: https://issues.apache.org/jira/browse/HDFS-6994 > Project: Hadoop HDFS > Issue Type: New Feature > Components: hdfs-client >Reporter: Zhanwei Wang >Assignee: Zhanwei Wang >Priority: Major > Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch > > > Hi All > I just got the permission to open source libhdfs3, which is a native C/C++ > HDFS client based on Hadoop RPC protocol and HDFS Data Transfer Protocol. > libhdfs3 provide the libhdfs style C interface and a C++ interface. Support > both HADOOP RPC version 8 and 9. Support Namenode HA and Kerberos > authentication. > libhdfs3 is currently used by Apache HAWQ at: > https://github.com/apache/hawq/tree/master/depends/libhdfs3 > I'd like to integrate libhdfs3 into HDFS source code to benefit others. > The libhdfs3 code originally from Pivotal was available on github at: > https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3 > http://pivotal-data-attic.github.io/pivotalrd-libhdfs3/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client
[ https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Clay B. updated HDFS-6994: -- Description: Hi All I just got the permission to open source libhdfs3, which is a native C/C++ HDFS client based on Hadoop RPC protocol and HDFS Data Transfer Protocol. libhdfs3 provide the libhdfs style C interface and a C++ interface. Support both HADOOP RPC version 8 and 9. Support Namenode HA and Kerberos authentication. libhdfs3 is currently used by Apache HAWQ at: https://github.com/apache/hawq/tree/master/depends/libhdfs3 I'd like to integrate libhdfs3 into HDFS source code to benefit others. The libhdfs3 code originally from Pivotal was available on github at: https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3 http://pivotal-data-attic.github.io/pivotalrd-libhdfs3/ was: Hi All I just got the permission to open source libhdfs3, which is a native C/C++ HDFS client based on Hadoop RPC protocol and HDFS Data Transfer Protocol. libhdfs3 provide the libhdfs style C interface and a C++ interface. Support both HADOOP RPC version 8 and 9. Support Namenode HA and Kerberos authentication. libhdfs3 is currently used by HAWQ of Pivotal I'd like to integrate libhdfs3 into HDFS source code to benefit others. You can find libhdfs3 code from github https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3 http://pivotal-data-attic.github.io/pivotalrd-libhdfs3/ > libhdfs3 - A native C/C++ HDFS client > - > > Key: HDFS-6994 > URL: https://issues.apache.org/jira/browse/HDFS-6994 > Project: Hadoop HDFS > Issue Type: New Feature > Components: hdfs-client >Reporter: Zhanwei Wang >Assignee: Zhanwei Wang >Priority: Major > Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch > > > Hi All > I just got the permission to open source libhdfs3, which is a native C/C++ > HDFS client based on Hadoop RPC protocol and HDFS Data Transfer Protocol. > libhdfs3 provide the libhdfs style C interface and a C++ interface. 
Support > both HADOOP RPC version 8 and 9. Support Namenode HA and Kerberos > authentication. > libhdfs3 is currently used by Apache HAWQ at: > https://github.com/apache/hawq/tree/master/depends/libhdfs3 > I'd like to integrate libhdfs3 into HDFS source code to benefit others. > The libhdfs3 code originally from Pivotal was available on github at: > https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3 > http://pivotal-data-attic.github.io/pivotalrd-libhdfs3/ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation
[ https://issues.apache.org/jira/browse/HDFS-16269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-16269: - Fix Version/s: 3.3.2 Backported to branch-3.3. > [Fix] Improve NNThroughputBenchmark#blockReport operation > - > > Key: HDFS-16269 > URL: https://issues.apache.org/jira/browse/HDFS-16269 > Project: Hadoop HDFS > Issue Type: Bug > Components: benchmarks, namenode >Affects Versions: 2.9.2 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.2 > > Time Spent: 5h > Remaining Estimate: 0h > > When using NNThroughputBenchmark to verify the blockReport, you will get some > exception information. > Commands used: > ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op blockReport -datanodes 3 -reports 1 > The exception information: > 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: > blockReport > 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with > 10 blocks each. 
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: > java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Checked some code and found that the problem appeared here. 
> private ExtendedBlock addBlocks(String fileName, String clientName) > throws IOException { > for(DatanodeInfo dnInfo: loc.getLocations()) { > int dnIdx = dnInfo.getXferPort()-1; > datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock()); > } > } > This shows that dnInfo.getXferPort() returns a port number, which should not be used as an array index. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
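The HDFS-16269 bug above can be demonstrated in isolation: with three simulated datanodes, a transfer port such as 50010 minus one lands far outside the array, while a lookup keyed by port stays in bounds. This is an illustrative sketch, not the actual benchmark fix; the names are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Demonstrates why DatanodeInfo#getXferPort() must not be used as an
// array index (HDFS-16269): ports are in the tens of thousands, while
// the benchmark's datanode array has only a handful of slots.
public class XferPortIndexSketch {
    public static void main(String[] args) {
        String[] datanodes = {"dn-0", "dn-1", "dn-2"};
        int xferPort = 50010; // a typical DataNode transfer port

        // Buggy pattern: port - 1 as an index -> 50009, out of bounds.
        int dnIdx = xferPort - 1;
        System.out.println(dnIdx >= datanodes.length); // true: would throw AIOOBE

        // Safer pattern: map each registered port to its slot.
        Map<Integer, Integer> portToIndex = new HashMap<>();
        for (int i = 0; i < datanodes.length; i++) {
            portToIndex.put(50010 + i, i);
        }
        System.out.println(datanodes[portToIndex.get(xferPort)]); // dn-0
    }
}
```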
[jira] [Work logged] (HDFS-16291) Make the comment of INode#ReclaimContext more standardized
[ https://issues.apache.org/jira/browse/HDFS-16291?focusedWorklogId=672744=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672744 ] ASF GitHub Bot logged work on HDFS-16291: - Author: ASF GitHub Bot Created on: 01/Nov/21 16:15 Start Date: 01/Nov/21 16:15 Worklog Time Spent: 10m Work Description: jianghuazhu commented on pull request #3602: URL: https://github.com/apache/hadoop/pull/3602#issuecomment-956373649 Thank you @tomscut for your comments and reviews. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 672744) Time Spent: 40m (was: 0.5h) > Make the comment of INode#ReclaimContext more standardized > -- > > Key: HDFS-16291 > URL: https://issues.apache.org/jira/browse/HDFS-16291 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation, namenode >Affects Versions: 3.4.0 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Labels: pull-request-available > Attachments: image-2021-10-31-20-25-08-379.png > > Time Spent: 40m > Remaining Estimate: 0h > > In the INode#ReclaimContext class, there are some comments that are not > standardized enough. > E.g: > !image-2021-10-31-20-25-08-379.png! > We should make comments more standardized. This will be more readable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15413) DFSStripedInputStream throws exception when datanodes close idle connections
[ https://issues.apache.org/jira/browse/HDFS-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436908#comment-17436908 ] Jeff Kubina edited comment on HDFS-15413 at 11/1/21, 4:11 PM: -- Is this issue being worked or was it resolved already? was (Author: jmkubin): Is this issue being worked to was it resolved already? > DFSStripedInputStream throws exception when datanodes close idle connections > > > Key: HDFS-15413 > URL: https://issues.apache.org/jira/browse/HDFS-15413 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, erasure-coding, hdfs-client >Affects Versions: 3.1.3 > Environment: - Hadoop 3.1.3 > - erasure coding with ISA-L and RS-3-2-1024k scheme > - running in kubernetes > - dfs.client.socket-timeout = 1 > - dfs.datanode.socket.write.timeout = 1 >Reporter: Andrey Elenskiy >Priority: Critical > Attachments: out.log > > > We've run into an issue with compactions failing in HBase when erasure coding > is enabled on a table directory. After digging further I was able to narrow > it down to a seek + read logic and able to reproduce the issue with hdfs > client only: > {code:java} > import org.apache.hadoop.conf.Configuration; > import org.apache.hadoop.fs.Path; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.FSDataInputStream; > public class ReaderRaw { > public static void main(final String[] args) throws Exception { > Path p = new Path(args[0]); > int bufLen = Integer.parseInt(args[1]); > int sleepDuration = Integer.parseInt(args[2]); > int countBeforeSleep = Integer.parseInt(args[3]); > int countAfterSleep = Integer.parseInt(args[4]); > Configuration conf = new Configuration(); > FSDataInputStream istream = FileSystem.get(conf).open(p); > byte[] buf = new byte[bufLen]; > int readTotal = 0; > int count = 0; > try { > while (true) { > istream.seek(readTotal); > int bytesRemaining = bufLen; > int bufOffset = 0; > while (bytesRemaining > 0) { > int nread = istream.read(buf, 0, bufLen); > if (nread < 0) 
{ > throw new Exception("nread is less than zero"); > } > readTotal += nread; > bufOffset += nread; > bytesRemaining -= nread; > } > count++; > if (count == countBeforeSleep) { > System.out.println("sleeping for " + sleepDuration + " > milliseconds"); > Thread.sleep(sleepDuration); > System.out.println("resuming"); > } > if (count == countBeforeSleep + countAfterSleep) { > System.out.println("done"); > break; > } > } > } catch (Exception e) { > System.out.println("exception on read " + count + " read total " > + readTotal); > throw e; > } > } > } > {code} > The issue appears to be due to the fact that datanodes close the connection > of EC client if it doesn't fetch next packet for longer than > dfs.client.socket-timeout. The EC client doesn't retry and instead assumes > that those datanodes went away resulting in "missing blocks" exception. > I was able to consistently reproduce with the following arguments: > {noformat} > bufLen = 100 (just below 1MB which is the size of the stripe) > sleepDuration = (dfs.client.socket-timeout + 1) * 1000 (in our case 11000) > countBeforeSleep = 1 > countAfterSleep = 7 > {noformat} > I've attached the entire log output of running the snippet above against > erasure coded file with RS-3-2-1024k policy. And here are the logs from > datanodes of disconnecting the client: > datanode 1: > {noformat} > 2020-06-15 19:06:20,697 INFO datanode.DataNode: Likely the client has stopped > reading, disconnecting it (datanode-v11-0-hadoop.hadoop:9866:DataXceiver > error processing READ_BLOCK operation src: /10.128.23.40:53748 dst: > /10.128.14.46:9866); java.net.SocketTimeoutException: 1 millis timeout > while waiting for channel to be ready for write. 
ch : > java.nio.channels.SocketChannel[connected local=/10.128.14.46:9866 > remote=/10.128.23.40:53748] > {noformat} > datanode 2: > {noformat} > 2020-06-15 19:06:20,341 INFO datanode.DataNode: Likely the client has stopped > reading, disconnecting it (datanode-v11-1-hadoop.hadoop:9866:DataXceiver > error processing READ_BLOCK operation src: /10.128.23.40:48772 dst: > /10.128.9.42:9866); java.net.SocketTimeoutException: 1 millis timeout > while waiting for channel to be ready for write. ch : >
[jira] [Resolved] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation
[ https://issues.apache.org/jira/browse/HDFS-16269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka resolved HDFS-16269. -- Fix Version/s: 3.4.0 Resolution: Fixed Committed to trunk. Thank you [~jianghuazhu] for your contribution. > [Fix] Improve NNThroughputBenchmark#blockReport operation > - > > Key: HDFS-16269 > URL: https://issues.apache.org/jira/browse/HDFS-16269 > Project: Hadoop HDFS > Issue Type: Bug > Components: benchmarks, namenode >Affects Versions: 2.9.2 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 5h > Remaining Estimate: 0h > > When using NNThroughputBenchmark to verify the blockReport, you will get some > exception information. > Commands used: > ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op blockReport -datanodes 3 -reports 1 > The exception information: > 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: > blockReport > 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with > 10 blocks each. 
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: > java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Checked some code and found that the problem appeared here. 
> private ExtendedBlock addBlocks(String fileName, String clientName) > throws IOException { > for(DatanodeInfo dnInfo: loc.getLocations()) { > int dnIdx = dnInfo.getXferPort()-1; > datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock()); > } > } > This shows that dnInfo.getXferPort() returns a port number, which should not be used as an array index. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation
[ https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=672733=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672733 ] ASF GitHub Bot logged work on HDFS-16269: - Author: ASF GitHub Bot Created on: 01/Nov/21 15:56 Start Date: 01/Nov/21 15:56 Worklog Time Spent: 10m Work Description: aajisaka merged pull request #3544: URL: https://github.com/apache/hadoop/pull/3544 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 672733) Time Spent: 4h 50m (was: 4h 40m) > [Fix] Improve NNThroughputBenchmark#blockReport operation > - > > Key: HDFS-16269 > URL: https://issues.apache.org/jira/browse/HDFS-16269 > Project: Hadoop HDFS > Issue Type: Bug > Components: benchmarks, namenode >Affects Versions: 2.9.2 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 4h 50m > Remaining Estimate: 0h > > When using NNThroughputBenchmark to verify the blockReport, you will get some > exception information. > Commands used: > ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op blockReport -datanodes 3 -reports 1 > The exception information: > 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: > blockReport > 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with > 10 blocks each. 
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: > java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Checked some code and found that the problem appeared here. 
> private ExtendedBlock addBlocks(String fileName, String clientName) > throws IOException { > for(DatanodeInfo dnInfo: loc.getLocations()) { > int dnIdx = dnInfo.getXferPort()-1; > datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock()); > } > } > This shows that dnInfo.getXferPort() returns a port number, which should not be used as an array index. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation
[ https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=672735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672735 ] ASF GitHub Bot logged work on HDFS-16269: - Author: ASF GitHub Bot Created on: 01/Nov/21 15:56 Start Date: 01/Nov/21 15:56 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #3544: URL: https://github.com/apache/hadoop/pull/3544#issuecomment-956355572 Merged. Thank you @jianghuazhu @ferhui @jojochuang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 672735) Time Spent: 5h (was: 4h 50m) > [Fix] Improve NNThroughputBenchmark#blockReport operation > - > > Key: HDFS-16269 > URL: https://issues.apache.org/jira/browse/HDFS-16269 > Project: Hadoop HDFS > Issue Type: Bug > Components: benchmarks, namenode >Affects Versions: 2.9.2 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 5h > Remaining Estimate: 0h > > When using NNThroughputBenchmark to verify the blockReport, you will get some > exception information. > Commands used: > ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op blockReport -datanodes 3 -reports 1 > The exception information: > 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: > blockReport > 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with > 10 blocks each. 
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: > java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009 > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430) > at > org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550) > Checked some code and found that the problem appeared here. 
> private ExtendedBlock addBlocks(String fileName, String clientName) > throws IOException { > for(DatanodeInfo dnInfo: loc.getLocations()) { > int dnIdx = dnInfo.getXferPort()-1; > datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock()); > } > } > This shows that dnInfo.getXferPort() returns a port number, which should not be used as an array index. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15413) DFSStripedInputStream throws exception when datanodes close idle connections
[ https://issues.apache.org/jira/browse/HDFS-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436908#comment-17436908 ]

Jeff Kubina commented on HDFS-15413:
------------------------------------

Is this issue still being worked on, or was it resolved already?

> DFSStripedInputStream throws exception when datanodes close idle connections
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-15413
>                 URL: https://issues.apache.org/jira/browse/HDFS-15413
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec, erasure-coding, hdfs-client
>    Affects Versions: 3.1.3
>         Environment: - Hadoop 3.1.3
> - erasure coding with ISA-L and RS-3-2-1024k scheme
> - running in kubernetes
> - dfs.client.socket-timeout = 1
> - dfs.datanode.socket.write.timeout = 1
>            Reporter: Andrey Elenskiy
>            Priority: Critical
>         Attachments: out.log
>
> We've run into an issue with compactions failing in HBase when erasure coding is enabled on a table directory. After digging further, I was able to narrow it down to the seek + read logic and to reproduce the issue with the hdfs client only:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.FSDataInputStream;
>
> public class ReaderRaw {
>     public static void main(final String[] args) throws Exception {
>         Path p = new Path(args[0]);
>         int bufLen = Integer.parseInt(args[1]);
>         int sleepDuration = Integer.parseInt(args[2]);
>         int countBeforeSleep = Integer.parseInt(args[3]);
>         int countAfterSleep = Integer.parseInt(args[4]);
>         Configuration conf = new Configuration();
>         FSDataInputStream istream = FileSystem.get(conf).open(p);
>         byte[] buf = new byte[bufLen];
>         int readTotal = 0;
>         int count = 0;
>         try {
>             while (true) {
>                 istream.seek(readTotal);
>                 int bytesRemaining = bufLen;
>                 int bufOffset = 0;
>                 while (bytesRemaining > 0) {
>                     int nread = istream.read(buf, 0, bufLen);
>                     if (nread < 0) {
>                         throw new Exception("nread is less than zero");
>                     }
>                     readTotal += nread;
>                     bufOffset += nread;
>                     bytesRemaining -= nread;
>                 }
>                 count++;
>                 if (count == countBeforeSleep) {
>                     System.out.println("sleeping for " + sleepDuration + " milliseconds");
>                     Thread.sleep(sleepDuration);
>                     System.out.println("resuming");
>                 }
>                 if (count == countBeforeSleep + countAfterSleep) {
>                     System.out.println("done");
>                     break;
>                 }
>             }
>         } catch (Exception e) {
>             System.out.println("exception on read " + count + " read total " + readTotal);
>             throw e;
>         }
>     }
> }
> {code}
> The issue appears to be due to the fact that datanodes close the connection of the EC client if it doesn't fetch the next packet for longer than dfs.client.socket-timeout. The EC client doesn't retry and instead assumes that those datanodes went away, resulting in a "missing blocks" exception.
> I was able to consistently reproduce it with the following arguments:
> {noformat}
> bufLen = 100 (just below 1MB which is the size of the stripe)
> sleepDuration = (dfs.client.socket-timeout + 1) * 1000 (in our case 11000)
> countBeforeSleep = 1
> countAfterSleep = 7
> {noformat}
> I've attached the entire log output of running the snippet above against an erasure-coded file with the RS-3-2-1024k policy. And here are the logs from the datanodes disconnecting the client:
> datanode 1:
> {noformat}
> 2020-06-15 19:06:20,697 INFO datanode.DataNode: Likely the client has stopped reading, disconnecting it (datanode-v11-0-hadoop.hadoop:9866:DataXceiver error processing READ_BLOCK operation src: /10.128.23.40:53748 dst: /10.128.14.46:9866); java.net.SocketTimeoutException: 1 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.128.14.46:9866 remote=/10.128.23.40:53748]
> {noformat}
> datanode 2:
> {noformat}
> 2020-06-15 19:06:20,341 INFO datanode.DataNode: Likely the client has stopped reading, disconnecting it (datanode-v11-1-hadoop.hadoop:9866:DataXceiver error processing READ_BLOCK operation src: /10.128.23.40:48772 dst: /10.128.9.42:9866); java.net.SocketTimeoutException: 1 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/10.128.9.42:9866 remote=/10.128.23.40:48772]
> {noformat}
> datanode 3:
> {noformat}
> 2020-06-15 19:06:20,467 INFO
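A side note on the reproduction snippet: its inner loop always passes `(0, bufLen)` to `read()`, even though `bufOffset` and `bytesRemaining` are tracked. The repro still works because the outer loop seeks back to `readTotal`, but the conventional fill-the-buffer loop would advance the offset. A generic, Hadoop-free sketch of that loop:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFully {
    // Conventional fill-the-buffer loop: advance the offset and shrink the
    // remaining count on each read, instead of passing (0, bufLen) every
    // iteration as the reproduction snippet does.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new IOException("unexpected end of stream");
            }
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[]{1, 2, 3, 4, 5});
        byte[] buf = new byte[5];
        System.out.println(readFully(in, buf)); // prints 5
    }
}
```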
[jira] [Updated] (HDFS-16293) Client sleeps and holds 'dataQueue' when DataNodes are congested
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuanxin Zhu updated HDFS-16293:
-------------------------------
    Summary: Client sleeps and holds 'dataQueue' when DataNodes are congested  (was: Client sleep and hold 'dataQueue' when DataNodes are congested)

> Client sleeps and holds 'dataQueue' when DataNodes are congested
> ----------------------------------------------------------------
>
>                 Key: HDFS-16293
>                 URL: https://issues.apache.org/jira/browse/HDFS-16293
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.2.2
>            Reporter: Yuanxin Zhu
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When I enable ECN and run Terasort for testing, the DataNodes become congested (HDFS-8008). The client enters the sleep state after repeatedly receiving congestion-marked ACKs, but does not release the 'dataQueue' lock. The ResponseProcessor thread needs 'dataQueue' to execute 'ackQueue.getFirst()', so it waits for the client to release 'dataQueue'; in effect the ResponseProcessor thread sleeps as well, resulting in ACK delay. MapReduce tasks can be delayed by tens of minutes or even hours.
> The DataStreamer thread could first execute 'one = dataQueue.getFirst()', release 'dataQueue', and then decide whether to execute 'backOffIfNecessary()' based on 'one.isHeartbeatPacket()'.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
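The reordering proposed in HDFS-16293 can be sketched with plain Java monitors. This is a minimal, illustrative simulation only; the class and method names below are invented and this is not the actual DataStreamer code:

```java
import java.util.LinkedList;

public class BackoffOrdering {
    private final LinkedList<String> dataQueue = new LinkedList<>();

    // Simulates adding a packet under the dataQueue monitor.
    void offer(String packet) {
        synchronized (dataQueue) {
            dataQueue.add(packet);
        }
    }

    // Proposed ordering: take the first packet while holding the dataQueue
    // monitor, then RELEASE the monitor before any congestion back-off sleep,
    // so threads that also synchronize on dataQueue (like ResponseProcessor
    // doing ackQueue.getFirst()) are not blocked for the whole sleep.
    String nextPacket(boolean congested) throws InterruptedException {
        String one;
        synchronized (dataQueue) {
            one = dataQueue.getFirst();   // 'one = dataQueue.getFirst()'
        }
        // Outside the lock: back off only for real data packets, mirroring
        // the 'one.isHeartbeatPacket()' check in the proposal.
        if (congested && !isHeartbeat(one)) {
            Thread.sleep(10);             // stand-in for backOffIfNecessary()
        }
        return one;
    }

    boolean isHeartbeat(String packet) {
        return packet.startsWith("HB");
    }

    public static void main(String[] args) throws InterruptedException {
        BackoffOrdering b = new BackoffOrdering();
        b.offer("P1");
        System.out.println(b.nextPacket(true)); // prints P1
    }
}
```

The point of the sketch is only the lock scope: the sleep happens after the `synchronized` block ends, so the monitor is free while the writer backs off.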
[jira] [Updated] (HDFS-16293) Client sleep and hold 'dataQueue' when DataNodes are congested
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Summary: Client sleep and hold 'dataQueue' when DataNodes are congested (was: Client sleep and hold 'dataqueue' when datanode are condensed) > Client sleep and hold 'dataQueue' when DataNodes are congested > -- > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2 >Reporter: Yuanxin Zhu >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > When I open the ECN and use Terasort for testing, DataNodes are > congested(HDFS-8008). The client enters the sleep state after receiving the > ACK for many times, but does not release the 'dataQueue'. The > ResponseProcessor thread needs the 'dataQueue' to execute > 'ackQueue.getFirst()', so the ResponseProcessor will wait for the client to > release the 'dataQueue', which is equivalent to that the ResponseProcessor > thread also enters sleep, resulting in ACK delay.MapReduce tasks can be > delayed by tens of minutes or even hours. > The DataStreamer thread can first execute 'one = dataQueue. getFirst()', > release 'dataQueue', and then judge whether to execute 'backOffIfNecessary()' > according to 'one.isHeartbeatPacket()' > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16293) Client sleep and hold 'dataqueue' when datanode are condensed
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Description: When I open the ECN and use Terasort for testing, DataNodes are congested(HDFS-8008). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataQueue'. The ResponseProcessor thread needs the 'dataQueue' to execute 'ackQueue.getFirst()', so the ResponseProcessor will wait for the client to release the 'dataQueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ACK delay.MapReduce tasks can be delayed by tens of minutes or even hours. The DataStreamer thread can first execute 'one = dataQueue. getFirst()', release 'dataQueue', and then judge whether to execute 'backOffIfNecessary()' according to 'one.isHeartbeatPacket()' was: When I open the ECN and use Terasort for testing, datanodes are congested([HDFS-8008|https://issues.apache.org/jira/browse/HDFS-8008]). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataqueue'. The ResponseProcessor thread needs the 'dataqueue' to execute 'ackqueue. getfirst()', so the ResponseProcessor will wait for the client to release the 'dataqueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ack delay.MapReduce tasks can be delayed by tens of minutes or even hours > Client sleep and hold 'dataqueue' when datanode are condensed > - > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2 >Reporter: Yuanxin Zhu >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > When I open the ECN and use Terasort for testing, DataNodes are > congested(HDFS-8008). The client enters the sleep state after receiving the > ACK for many times, but does not release the 'dataQueue'. 
The > ResponseProcessor thread needs the 'dataQueue' to execute > 'ackQueue.getFirst()', so the ResponseProcessor will wait for the client to > release the 'dataQueue', which is equivalent to that the ResponseProcessor > thread also enters sleep, resulting in ACK delay.MapReduce tasks can be > delayed by tens of minutes or even hours. > The DataStreamer thread can first execute 'one = dataQueue. getFirst()', > release 'dataQueue', and then judge whether to execute 'backOffIfNecessary()' > according to 'one.isHeartbeatPacket()' > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation
[ https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=672657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672657 ] ASF GitHub Bot logged work on HDFS-16269: - Author: ASF GitHub Bot Created on: 01/Nov/21 13:22 Start Date: 01/Nov/21 13:22 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #3544: URL: https://github.com/apache/hadoop/pull/3544#issuecomment-956229769 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 3s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 37m 15s | | trunk passed | | +1 :green_heart: | compile | 1m 33s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 1m 25s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 1m 4s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 31s | | trunk passed | | +1 :green_heart: | javadoc | 1m 5s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 33s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 35s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 35s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 19s | | the patch passed | | +1 :green_heart: | compile | 1m 28s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 1m 28s | | the patch passed | | +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 1m 16s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 56s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 24s | | the patch passed | | +1 :green_heart: | javadoc | 0m 55s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 1m 25s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 3m 39s | | the patch passed | | +1 :green_heart: | shadedclient | 25m 32s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 376m 41s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3544/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. 
| | | | 487m 51s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3544/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/3544 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 55004c7946a0 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / bfedca87dc856dec491c81af77c3cc2eb58b1537 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3544/6/testReport/ | | Max. process+thread count | 2103 (vs. ulimit of 5500) | | modules | C:
[jira] [Updated] (HDFS-16293) Client sleep and hold 'dataqueue' when datanode are condensed
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Description: When I open the ECN and use Terasort for testing, datanodes are congested([HDFS-8008|https://issues.apache.org/jira/browse/HDFS-8008]). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataqueue'. The ResponseProcessor thread needs the 'dataqueue' to execute 'ackqueue. getfirst()', so the ResponseProcessor will wait for the client to release the 'dataqueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ack delay.MapReduce tasks can be delayed by tens of minutes or even hours was: When I open the ECN and use Terasort for testing, datanodes are congested([https://issues.apache.org/jira/browse/HDFS-8008|http://example.com]). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataqueue'. The ResponseProcessor thread needs the 'dataqueue' to execute 'ackqueue. getfirst()', so the ResponseProcessor will wait for the client to release the 'dataqueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ack delay.MapReduce tasks can be delayed by tens of minutes or even hours > Client sleep and hold 'dataqueue' when datanode are condensed > - > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2 >Reporter: Yuanxin Zhu >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > When I open the ECN and use Terasort for testing, datanodes are > congested([HDFS-8008|https://issues.apache.org/jira/browse/HDFS-8008]). The > client enters the sleep state after receiving the ACK for many times, but > does not release the 'dataqueue'. The ResponseProcessor thread needs the > 'dataqueue' to execute 'ackqueue. 
getfirst()', so the ResponseProcessor will > wait for the client to release the 'dataqueue', which is equivalent to that > the ResponseProcessor thread also enters sleep, resulting in ack > delay.MapReduce tasks can be delayed by tens of minutes or even hours > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16293) Client sleep and hold 'dataqueue' when datanode are condensed
[ https://issues.apache.org/jira/browse/HDFS-16293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanxin Zhu updated HDFS-16293: --- Description: When I open the ECN and use Terasort for testing, datanodes are congested([https://issues.apache.org/jira/browse/HDFS-8008|http://example.com]). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataqueue'. The ResponseProcessor thread needs the 'dataqueue' to execute 'ackqueue. getfirst()', so the ResponseProcessor will wait for the client to release the 'dataqueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ack delay.MapReduce tasks can be delayed by tens of minutes or even hours was: When I open the ECN and use Terasort for testing, datanodes are congested([#https://issues.apache.org/jira/browse/HDFS-8008]). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataqueue'. The ResponseProcessor thread needs the 'dataqueue' to execute 'ackqueue. getfirst()', so the ResponseProcessor will wait for the client to release the 'dataqueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ack delay.MapReduce tasks can be delayed by tens of minutes or even hours > Client sleep and hold 'dataqueue' when datanode are condensed > - > > Key: HDFS-16293 > URL: https://issues.apache.org/jira/browse/HDFS-16293 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 3.2.2 >Reporter: Yuanxin Zhu >Priority: Major > Original Estimate: 24h > Remaining Estimate: 24h > > When I open the ECN and use Terasort for testing, datanodes are > congested([https://issues.apache.org/jira/browse/HDFS-8008|http://example.com]). > The client enters the sleep state after receiving the ACK for many times, > but does not release the 'dataqueue'. The ResponseProcessor thread needs the > 'dataqueue' to execute 'ackqueue. 
getfirst()', so the ResponseProcessor will > wait for the client to release the 'dataqueue', which is equivalent to that > the ResponseProcessor thread also enters sleep, resulting in ack > delay.MapReduce tasks can be delayed by tens of minutes or even hours > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16293) Client sleep and hold 'dataqueue' when datanode are condensed
Yuanxin Zhu created HDFS-16293: -- Summary: Client sleep and hold 'dataqueue' when datanode are condensed Key: HDFS-16293 URL: https://issues.apache.org/jira/browse/HDFS-16293 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 3.2.2 Reporter: Yuanxin Zhu When I open the ECN and use Terasort for testing, datanodes are congested([#https://issues.apache.org/jira/browse/HDFS-8008]). The client enters the sleep state after receiving the ACK for many times, but does not release the 'dataqueue'. The ResponseProcessor thread needs the 'dataqueue' to execute 'ackqueue. getfirst()', so the ResponseProcessor will wait for the client to release the 'dataqueue', which is equivalent to that the ResponseProcessor thread also enters sleep, resulting in ack delay.MapReduce tasks can be delayed by tens of minutes or even hours -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file
[ https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=672650=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672650 ] ASF GitHub Bot logged work on HDFS-16286: - Author: ASF GitHub Bot Created on: 01/Nov/21 12:54 Start Date: 01/Nov/21 12:54 Worklog Time Spent: 10m Work Description: sodonnel commented on a change in pull request #3593: URL: https://github.com/apache/hadoop/pull/3593#discussion_r740188593 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java ## @@ -387,6 +414,211 @@ int run(List args) throws IOException { } } + /** + * The command for verifying the correctness of erasure coding on an erasure coded file. + */ + private class VerifyECCommand extends DebugCommand { +private DFSClient client; +private int dataBlkNum; +private int parityBlkNum; +private int cellSize; +private boolean useDNHostname; +private CachingStrategy cachingStrategy; +private int stripedReadBufferSize; +private CompletionService readService; +private RawErasureDecoder decoder; +private BlockReader[] blockReaders; + + +VerifyECCommand() { + super("verifyEC", + "verifyEC -file ", + " Verify HDFS erasure coding on all block groups of the file."); +} + +int run(List args) throws IOException { + if (args.size() < 2) { +System.out.println(usageText); +System.out.println(helpText + System.lineSeparator()); +return 1; + } + String file = StringUtils.popOptionWithArgument("-file", args); + Path path = new Path(file); + DistributedFileSystem dfs = AdminHelper.getDFS(getConf()); + this.client = dfs.getClient(); + + FileStatus fileStatus; + try { +fileStatus = dfs.getFileStatus(path); + } catch (FileNotFoundException e) { +System.err.println("File " + file + " does not exist."); +return 1; + } + + if (!fileStatus.isFile()) { +System.err.println("File " + file + " is not a regular file."); +return 1; + } + if (!dfs.isFileClosed(path)) { +System.err.println("File " + file + " is not closed."); +return 1; + } 
+ this.useDNHostname = getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME, + DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT); + this.cachingStrategy = CachingStrategy.newDefaultStrategy(); + this.stripedReadBufferSize = getConf().getInt( + DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY, + DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT); + + LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, fileStatus.getLen()); + if (locatedBlocks.getErasureCodingPolicy() == null) { +System.err.println("File " + file + " is not erasure coded."); +return 1; + } + ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy(); + this.dataBlkNum = ecPolicy.getNumDataUnits(); + this.parityBlkNum = ecPolicy.getNumParityUnits(); + this.cellSize = ecPolicy.getCellSize(); + this.decoder = CodecUtil.createRawDecoder(getConf(), ecPolicy.getCodecName(), + new ErasureCoderOptions( + ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits())); + int blockNum = dataBlkNum + parityBlkNum; + this.readService = new ExecutorCompletionService<>( + DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60, + new LinkedBlockingQueue<>(), "read-", false)); + this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum]; + + for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) { +System.out.println("Checking EC block group: blk_" + locatedBlock.getBlock().getBlockId()); +LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock; + +try { + verifyBlockGroup(blockGroup); + System.out.println("Status: OK"); +} catch (Exception e) { + System.err.println("Status: ERROR, message: " + e.getMessage()); + return 1; +} finally { + closeBlockReaders(); +} + } + System.out.println("\nAll EC block group status: OK"); + return 0; +} + +private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws Exception { + final LocatedBlock[] indexedBlocks = StripedBlockUtil.parseStripedBlockGroup(blockGroup, + 
cellSize, dataBlkNum, parityBlkNum); + + int blockNumExpected = Math.min(dataBlkNum, + (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + parityBlkNum; + if (blockGroup.getBlockIndices().length < blockNumExpected) { +throw new Exception("Block group is under-erasure-coded."); + } + + long
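The under-erasure-coded check in the diff above derives how many internal blocks a group should have from the group size, the cell size, and the EC policy. That arithmetic can be restated self-contained; the RS-6-3-1024k numbers below are only example inputs, not values taken from the pull request:

```java
public class EcBlockCount {
    // Number of internal blocks expected in an EC block group:
    // min(dataBlkNum, ceil(blockSize / cellSize)) data blocks, plus all
    // parity blocks -- the same formula as the verifyEC snippet's
    // blockNumExpected computation.
    static int expected(long blockSize, int dataBlkNum, int parityBlkNum, int cellSize) {
        int dataBlocks = Math.min(dataBlkNum, (int) ((blockSize - 1) / cellSize + 1));
        return dataBlocks + parityBlkNum;
    }

    public static void main(String[] args) {
        int cell = 1024 * 1024; // 1 MiB cells, as in RS-6-3-1024k
        // A full group: all 6 data blocks plus 3 parity blocks.
        System.out.println(expected(6L * cell, 6, 3, cell)); // prints 9
        // A small group holding half a cell: 1 data block plus 3 parity blocks.
        System.out.println(expected(cell / 2, 6, 3, cell)); // prints 4
    }
}
```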
[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file
[ https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=672649=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672649 ] ASF GitHub Bot logged work on HDFS-16286: - Author: ASF GitHub Bot Created on: 01/Nov/21 12:52 Start Date: 01/Nov/21 12:52 Worklog Time Spent: 10m Work Description: sodonnel commented on a change in pull request #3593: URL: https://github.com/apache/hadoop/pull/3593#discussion_r740187297 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java ## @@ -387,6 +414,211 @@ int run(List args) throws IOException { } } + /** + * The command for verifying the correctness of erasure coding on an erasure coded file. + */ + private class VerifyECCommand extends DebugCommand { +private DFSClient client; +private int dataBlkNum; +private int parityBlkNum; +private int cellSize; +private boolean useDNHostname; +private CachingStrategy cachingStrategy; +private int stripedReadBufferSize; +private CompletionService readService; +private RawErasureDecoder decoder; +private BlockReader[] blockReaders; + + +VerifyECCommand() { + super("verifyEC", + "verifyEC -file ", + " Verify HDFS erasure coding on all block groups of the file."); +} + +int run(List args) throws IOException { + if (args.size() < 2) { +System.out.println(usageText); +System.out.println(helpText + System.lineSeparator()); +return 1; + } + String file = StringUtils.popOptionWithArgument("-file", args); + Path path = new Path(file); + DistributedFileSystem dfs = AdminHelper.getDFS(getConf()); + this.client = dfs.getClient(); + + FileStatus fileStatus; + try { +fileStatus = dfs.getFileStatus(path); + } catch (FileNotFoundException e) { +System.err.println("File " + file + " does not exist."); +return 1; + } + + if (!fileStatus.isFile()) { +System.err.println("File " + file + " is not a regular file."); +return 1; + } + if (!dfs.isFileClosed(path)) { +System.err.println("File " + file + " is not closed."); +return 1; + } 
+ this.useDNHostname = getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME, + DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT); + this.cachingStrategy = CachingStrategy.newDefaultStrategy(); + this.stripedReadBufferSize = getConf().getInt( + DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY, + DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT); + + LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, fileStatus.getLen()); + if (locatedBlocks.getErasureCodingPolicy() == null) { +System.err.println("File " + file + " is not erasure coded."); +return 1; + } + ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy(); + this.dataBlkNum = ecPolicy.getNumDataUnits(); + this.parityBlkNum = ecPolicy.getNumParityUnits(); + this.cellSize = ecPolicy.getCellSize(); + this.decoder = CodecUtil.createRawDecoder(getConf(), ecPolicy.getCodecName(), + new ErasureCoderOptions( + ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits())); + int blockNum = dataBlkNum + parityBlkNum; + this.readService = new ExecutorCompletionService<>( + DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60, + new LinkedBlockingQueue<>(), "read-", false)); + this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum]; + + for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) { +System.out.println("Checking EC block group: blk_" + locatedBlock.getBlock().getBlockId()); +LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock; + +try { + verifyBlockGroup(blockGroup); + System.out.println("Status: OK"); +} catch (Exception e) { + System.err.println("Status: ERROR, message: " + e.getMessage()); + return 1; +} finally { + closeBlockReaders(); +} + } + System.out.println("\nAll EC block group status: OK"); + return 0; +} + +private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws Exception { + final LocatedBlock[] indexedBlocks = StripedBlockUtil.parseStripedBlockGroup(blockGroup, + 
cellSize, dataBlkNum, parityBlkNum); + + int blockNumExpected = Math.min(dataBlkNum, + (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + parityBlkNum; + if (blockGroup.getBlockIndices().length < blockNumExpected) { +throw new Exception("Block group is under-erasure-coded."); + } + + long
[jira] [Commented] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436790#comment-17436790 ]

Wei-Chiu Chuang commented on HDFS-16292:
----------------------------------------

Hi, thanks for reporting the issue. However, 2.5.2 is ancient. HADOOP-10223 appears to have addressed this issue already.

> The DFS Input Stream is waiting to be read
> ------------------------------------------
>
>                 Key: HDFS-16292
>                 URL: https://issues.apache.org/jira/browse/HDFS-16292
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.2
>            Reporter: Hualong Zhang
>            Priority: Minor
>         Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png
>
> The input stream has been waiting. The problem seems to be that BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout. We can solve this problem by setting the timeouts in BlockReaderFactory#nextTcpPeer.
> Jstack as follows:
> !image-2021-11-01-18-36-54-329.png!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
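The fix suggested in HDFS-16292 is to give the TCP peer explicit timeouts so a stalled DataNode surfaces as a SocketTimeoutException instead of an indefinite wait. In plain java.net terms, the read side of that idea looks like the following sketch; the actual Hadoop patch would configure the Peer in BlockReaderFactory#nextTcpPeer rather than touch raw sockets, and the helper name here is invented:

```java
import java.net.Socket;
import java.net.SocketException;

public class PeerTimeouts {
    // Apply a read timeout (SO_TIMEOUT) to a socket: a subsequent blocking
    // read() that stalls longer than readTimeoutMs throws
    // java.net.SocketTimeoutException instead of waiting forever.
    // (In Hadoop itself this corresponds to setting the Peer's read/write
    // timeouts when the TCP peer is created; this shows only the java.net idea.)
    static Socket withReadTimeout(Socket s, int readTimeoutMs) throws SocketException {
        s.setSoTimeout(readTimeoutMs);
        return s;
    }

    public static void main(String[] args) throws Exception {
        // An unconnected socket is enough to demonstrate the option itself.
        Socket s = withReadTimeout(new Socket(), 60_000);
        System.out.println(s.getSoTimeout()); // prints 60000
    }
}
```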
[jira] [Assigned] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-16292: -- Assignee: (was: Wei-Chiu Chuang) > The DFS Input Stream is waiting to be read > -- > > Key: HDFS-16292 > URL: https://issues.apache.org/jira/browse/HDFS-16292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.5.2 >Reporter: Hualong Zhang >Priority: Minor > Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png > > > The input stream has been waiting.The problem seems to be that > BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout.We can solve > this problem by setting the timeout in BlockReaderFactory#nextTcpPeer > Jstack as follows > !image-2021-11-01-18-36-54-329.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-16292: -- Assignee: Wei-Chiu Chuang > The DFS Input Stream is waiting to be read > -- > > Key: HDFS-16292 > URL: https://issues.apache.org/jira/browse/HDFS-16292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.5.2 >Reporter: Hualong Zhang >Assignee: Wei-Chiu Chuang >Priority: Minor > Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png > > > The input stream has been waiting.The problem seems to be that > BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout.We can solve > this problem by setting the timeout in BlockReaderFactory#nextTcpPeer > Jstack as follows > !image-2021-11-01-18-36-54-329.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436758#comment-17436758 ] Hadoop QA commented on HDFS-16292: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 4s{color} | {color:blue}{color} | {color:blue} The patch file was not named according to hadoop's naming conventions. Please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for instructions. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red}{color} | {color:red} HDFS-16292 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-16292 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13035552/HDFS-16292.path | | Console output | https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/730/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org | This message was automatically generated. 
> The DFS Input Stream is waiting to be read > -- > > Key: HDFS-16292 > URL: https://issues.apache.org/jira/browse/HDFS-16292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.5.2 >Reporter: Hualong Zhang >Priority: Minor > Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png > > > The input stream has been waiting.The problem seems to be that > BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout.We can solve > this problem by setting the timeout in BlockReaderFactory#nextTcpPeer > Jstack as follows > !image-2021-11-01-18-36-54-329.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16292) The DFS Input Stream is waiting to be read
[ https://issues.apache.org/jira/browse/HDFS-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hualong Zhang updated HDFS-16292: - Attachment: HDFS-16292.path Status: Patch Available (was: Open) > The DFS Input Stream is waiting to be read > -- > > Key: HDFS-16292 > URL: https://issues.apache.org/jira/browse/HDFS-16292 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.5.2 >Reporter: Hualong Zhang >Priority: Minor > Attachments: HDFS-16292.path, image-2021-11-01-18-36-54-329.png > > > The input stream has been waiting.The problem seems to be that > BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout.We can solve > this problem by setting the timeout in BlockReaderFactory#nextTcpPeer > Jstack as follows > !image-2021-11-01-18-36-54-329.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16292) The DFS Input Stream is waiting to be read
Hualong Zhang created HDFS-16292:

Summary: The DFS Input Stream is waiting to be read
Key: HDFS-16292
URL: https://issues.apache.org/jira/browse/HDFS-16292
Project: Hadoop HDFS
Issue Type: Improvement
Components: datanode
Affects Versions: 2.5.2
Reporter: Hualong Zhang
Attachments: image-2021-11-01-18-36-54-329.png

The input stream has been waiting. The problem seems to be that BlockReaderPeer#peer does not set ReadTimeout and WriteTimeout. We can solve this problem by setting the timeout in BlockReaderFactory#nextTcpPeer. Jstack output as follows:

!image-2021-11-01-18-36-54-329.png!

-- This message was sent by Atlassian Jira (v8.3.4#803005)

- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
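[Editor's note] The fix proposed above, setting read/write timeouts on the TCP peer obtained in BlockReaderFactory#nextTcpPeer, comes down to putting a socket timeout on the connection before the first blocking read. A minimal, self-contained sketch of that behavior with plain java.net sockets (the class name ReadTimeoutDemo and all values below are hypothetical stand-ins, not the actual Hadoop patch):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    public static void main(String[] args) throws IOException {
        // A local server that accepts connections but never writes anything,
        // standing in for an unresponsive DataNode.
        try (ServerSocket server = new ServerSocket(0);
             Socket peer = new Socket()) {
            peer.connect(
                new InetSocketAddress("127.0.0.1", server.getLocalPort()), 1000);
            // Without this, the read below would block indefinitely -- the
            // symptom shown in the jstack. With it, the read fails fast.
            peer.setSoTimeout(500); // read timeout in milliseconds
            try {
                peer.getInputStream().read();
                System.out.println("unexpected: read returned");
            } catch (SocketTimeoutException e) {
                System.out.println("read timed out as expected");
            }
        }
    }
}
```

In the real code path the same idea applies to the Peer wrapped by the block reader: a bounded read timeout turns an indefinitely parked reader thread into a retryable I/O error.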
[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation
[ https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=672570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672570 ]

ASF GitHub Bot logged work on HDFS-16269:
-
Author: ASF GitHub Bot
Created on: 01/Nov/21 07:59
Start Date: 01/Nov/21 07:59
Worklog Time Spent: 10m
Work Description: jianghuazhu commented on a change in pull request #3544:
URL: https://github.com/apache/hadoop/pull/3544#discussion_r739952380

## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNThroughputBenchmark.java ##
@@ -166,4 +166,25 @@ public void testNNThroughputForAppendOp() throws Exception {
     }
   }
 }
+
+  /**
+   * This test runs {@link NNThroughputBenchmark} against a mini DFS cluster
+   * for block report operation.
+   */
+  @Test(timeout = 12)
+  public void testNNThroughputForBlockReportOp() throws Exception {
+    final Configuration conf = new HdfsConfiguration();
+    conf.setInt(DFSConfigKeys.DFS_NAMENODE_MIN_BLOCK_SIZE_KEY, 16);
+    conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, 16);
+    try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).
+        numDataNodes(3).build()) {
+      cluster.waitActive();
+      final Configuration benchConf = new HdfsConfiguration();
+      benchConf.setInt(DFSConfigKeys.DFS_NAMENODE_MIN_BLOCK_SIZE_KEY, 16);
+      benchConf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, 16);
+      NNThroughputBenchmark.runBenchmark(benchConf,
+          new String[]{"-fs", cluster.getURI().toString(), "-op",
+              "blockReport", "-datanodes", "3", "-reports", "2"});
+    }
+  }

Review comment:
Sorry, the information shown here cannot fully demonstrate my thoughts. I will submit some updates.
![image](https://user-images.githubusercontent.com/6416939/139640085-b474b30b-a117-453e-bc90-520074b5b9b3.png)

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 672570)
Time Spent: 4.5h (was: 4h 20m)

> [Fix] Improve NNThroughputBenchmark#blockReport operation
> -
>
> Key: HDFS-16269
> URL: https://issues.apache.org/jira/browse/HDFS-16269
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: benchmarks, namenode
> Affects Versions: 2.9.2
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> When using NNThroughputBenchmark to verify the blockReport operation, you will get some exception information.
> Command used:
> ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs -op blockReport -datanodes 3 -reports 1
> The exception information:
> 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: blockReport
> 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with 10 blocks each.
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark:
> java.lang.ArrayIndexOutOfBoundsException: 50009
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
>   at org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
>   at
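[Editor's note] The out-of-range index 50009 in the trace falls in the range of default DataNode port numbers, which suggests an identifier much larger than the datanode count being used directly as an array index inside addBlocks. Whether that is the actual root cause is for the patch to say; as a generic, hypothetical illustration of this class of bug and its guard (the names IndexBoundsDemo, blocksPerDatanode, and addBlock are invented for the example, not Hadoop code):

```java
public class IndexBoundsDemo {
    // Stand-in for per-datanode state in a benchmark run with "-datanodes 3":
    // the array is sized by the configured datanode count.
    static int[] blocksPerDatanode = new int[3];

    static void addBlock(int datanodeId) {
        // Guard: map the incoming identifier into the valid index range
        // instead of using it as an array index directly. An identifier such
        // as a port number (e.g. 50009) would otherwise throw
        // ArrayIndexOutOfBoundsException, as in the trace above.
        int idx = Math.floorMod(datanodeId, blocksPerDatanode.length);
        blocksPerDatanode[idx]++;
    }

    public static void main(String[] args) {
        addBlock(50009); // port-like identifier, safely mapped in range
        addBlock(0);     // ordinary index
        System.out.println("ok, no ArrayIndexOutOfBoundsException");
    }
}
```

The real fix must of course pick the semantically correct index rather than just clamp it, but the shape of the defect (array sized by one quantity, indexed by another) is the same.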
[jira] [Work logged] (HDFS-16285) Make HDFS ownership tools cross platform
[ https://issues.apache.org/jira/browse/HDFS-16285?focusedWorklogId=672560=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672560 ]

ASF GitHub Bot logged work on HDFS-16285:
-
Author: ASF GitHub Bot
Created on: 01/Nov/21 07:04
Start Date: 01/Nov/21 07:04
Worklog Time Spent: 10m
Work Description: hadoop-yetus commented on pull request #3588:
URL: https://github.com/apache/hadoop/pull/3588#issuecomment-955983417

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 55s | | Docker mode activated. |
| | _ Prechecks _ | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 6 new or modified test files. |
| | _ trunk Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 24m 2s | | trunk passed |
| +1 :green_heart: | compile | 3m 23s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | compile | 3m 16s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | mvnsite | 0m 21s | | trunk passed |
| +1 :green_heart: | shadedclient | 53m 28s | | branch has no errors when building and testing our client artifacts. |
| | _ Patch Compile Tests _ | | | |
| +1 :green_heart: | mvninstall | 0m 14s | | the patch passed |
| +1 :green_heart: | compile | 3m 18s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | cc | 3m 18s | | the patch passed |
| +1 :green_heart: | golang | 3m 18s | | the patch passed |
| +1 :green_heart: | javac | 3m 18s | | the patch passed |
| +1 :green_heart: | compile | 3m 20s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | cc | 3m 20s | | the patch passed |
| +1 :green_heart: | golang | 3m 20s | | the patch passed |
| +1 :green_heart: | javac | 3m 20s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | mvnsite | 0m 16s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 2s | | patch has no errors when building and testing our client artifacts. |
| | _ Other Tests _ | | | |
| +1 :green_heart: | unit | 101m 35s | | hadoop-hdfs-native-client in the patch passed. |
| +1 :green_heart: | asflicense | 0m 30s | | The patch does not generate ASF License warnings. |
| | | 187m 51s | | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3588/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3588 |
| Optional Tests | dupname asflicense compile cc mvnsite javac unit codespell golang |
| uname | Linux 7b6f500dd2cd 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 15677758e0447b99c25fc7e0158bc3969a6c1544 |
| Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3588/5/testReport/ |
| Max. process+thread count | 522 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3588/5/console |
| versions | git=2.25.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 672560)
Time Spent: 1h 20m (was: 1h 10m)

> Make HDFS ownership tools cross platform
>