[jira] [Commented] (HDFS-16937) Delete RPC should also record number of delete blocks in audit log
[ https://issues.apache.org/jira/browse/HDFS-16937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694346#comment-17694346 ] ASF GitHub Bot commented on HDFS-16937: --- hfutatzhanghb opened a new pull request, #5442: URL: https://github.com/apache/hadoop/pull/5442 Please see https://issues.apache.org/jira/browse/HDFS-16937. > Delete RPC should also record number of delete blocks in audit log > -- > > Key: HDFS-16937 > URL: https://issues.apache.org/jira/browse/HDFS-16937 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.3.4 >Reporter: ZhangHB >Priority: Minor > > To better trace the jitter caused by delete RPCs, we should also record the > number of deleted blocks in the audit log. With this information, we can tell > which user causes the jitter. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16937) Delete RPC should also record number of delete blocks in audit log
[ https://issues.apache.org/jira/browse/HDFS-16937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16937: -- Labels: pull-request-available (was: ) > Delete RPC should also record number of delete blocks in audit log > -- > > Key: HDFS-16937 > URL: https://issues.apache.org/jira/browse/HDFS-16937 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.3.4 >Reporter: ZhangHB >Priority: Minor > Labels: pull-request-available > > To better trace the jitter caused by delete RPCs, we should also record the > number of deleted blocks in the audit log. With this information, we can tell > which user causes the jitter.
[jira] [Created] (HDFS-16937) Delete RPC should also record number of delete blocks in audit log
ZhangHB created HDFS-16937: -- Summary: Delete RPC should also record number of delete blocks in audit log Key: HDFS-16937 URL: https://issues.apache.org/jira/browse/HDFS-16937 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.3.4 Reporter: ZhangHB To better trace the jitter caused by delete RPCs, we should also record the number of deleted blocks in the audit log. With this information, we can tell which user causes the jitter.
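A minimal sketch of the proposal, assuming a hypothetical `blocks=` field appended to the audit entry of a delete. The class name, method name, and the `blocks` key are illustrative and are not taken from the pull request; the remaining fields only imitate the general key=value shape of an audit line.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: formats an audit entry for a delete with a block count. */
final class DeleteAuditSketch {

  /**
   * Builds a tab-separated key=value audit line.
   * The "blocks" key is a hypothetical name for the number of deleted blocks.
   */
  static String formatDeleteAudit(String user, String src, long deletedBlocks) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("allowed", "true");
    fields.put("ugi", user);
    fields.put("cmd", "delete");
    fields.put("src", src);
    fields.put("blocks", Long.toString(deletedBlocks)); // hypothetical field
    StringJoiner line = new StringJoiner("\t");
    fields.forEach((key, value) -> line.add(key + "=" + value));
    return line.toString();
  }

  public static void main(String[] args) {
    System.out.println(formatDeleteAudit("alice", "/tmp/big-dir", 120000L));
  }
}
```

With a field like this, summing the block counts per `ugi` over the window in which the jitter occurred would point at the user issuing the large deletes.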
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694224#comment-17694224 ] ASF GitHub Bot commented on HDFS-16896: --- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119427557 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -197,6 +197,15 @@ private void clearLocalDeadNodes() { deadNodes.clear(); } + /** + * Clear list of ignored nodes used for hedged reads. + */ + private void clearIgnoredNodes(Collection ignoredNodes) { Review Comment: Sounds good. To be clear, this is what I'm planning:
```
private void clearCachedNodeState(Collection ignoredNodes) {
  clearLocalDeadNodes();
  // 2nd option is to remove only nodes[blockId]
  clearIgnoredNodes(ignoredNodes);
}
```
> HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
> *stacktrace* > > {code:java} > Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain > block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 > file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc > at > org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039) > at > org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365) > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535) > at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172) > at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137) > at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36) > at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136) > at > org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76) > ... 46 more > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
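The combined reset discussed in this review thread can be sketched in isolation, assuming a simplified stand-in for the stream's bookkeeping. `NodeStateSketch` is hypothetical and uses `String` where the real client uses its datanode type; only the three method names come from the discussion.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative stand-in for per-stream node bookkeeping; not DFSInputStream. */
final class NodeStateSketch {
  // Nodes this stream currently considers dead (String stands in for a datanode).
  private final Map<String, String> deadNodes = new ConcurrentHashMap<>();

  void markDead(String node) {
    deadNodes.put(node, node);
  }

  int deadCount() {
    return deadNodes.size();
  }

  private void clearLocalDeadNodes() {
    deadNodes.clear();
  }

  private void clearIgnoredNodes(Collection<String> ignoredNodes) {
    ignoredNodes.clear();
  }

  /** Clears both kinds of cached state so the next retry considers every node again. */
  void clearCachedNodeState(Collection<String> ignoredNodes) {
    clearLocalDeadNodes();
    clearIgnoredNodes(ignoredNodes);
  }

  public static void main(String[] args) {
    NodeStateSketch state = new NodeStateSketch();
    state.markDead("dn1");
    List<String> ignored = new ArrayList<>(List.of("dn2", "dn3"));
    state.clearCachedNodeState(ignored);
    System.out.println(state.deadCount() + " dead, " + ignored.size() + " ignored");
  }
}
```

Routing every reset through one method is what guarantees that the two collections are always cleared together, which is the point made in the review.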
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694222#comment-17694222 ] ASF GitHub Bot commented on HDFS-16896: --- mkuchenbecker commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119426257 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -197,6 +197,15 @@ private void clearLocalDeadNodes() { deadNodes.clear(); } + /** + * Clear list of ignored nodes used for hedged reads. + */ + private void clearIgnoredNodes(Collection ignoredNodes) { Review Comment: I'd personally err on the side of "slightly confusing name with documentation but ensure it always happens." `clearCachedNodeState`? > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694217#comment-17694217 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1447270824 @omalley Can you also help in backporting the PR to branch 3.3 > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Assignee: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.3.0, 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
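As a rough illustration of what a transfer rate quantile over a window means, here is a plain Java sketch. It does not use Hadoop's metrics classes; `TransferRateQuantiles`, the bytes-per-millisecond unit, and the nearest-rank method are assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative only: transfer rates for one window of reads and their quantiles. */
final class TransferRateQuantiles {

  /** Transfer rate in bytes per millisecond; a zero duration is treated as 1 ms. */
  static long transferRate(long bytes, long durationMs) {
    return bytes / Math.max(1L, durationMs);
  }

  /** Nearest-rank quantile for 0 < q <= 1 over a non-empty window of samples. */
  static long quantile(long[] samples, double q) {
    long[] sorted = samples.clone();
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(q * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }

  public static void main(String[] args) {
    long[] window = {
        transferRate(1000, 10), transferRate(4000, 10), transferRate(2000, 10),
        transferRate(500, 10), transferRate(3000, 10)
    };
    System.out.println("p50=" + quantile(window, 0.5));
    System.out.println("p90=" + quantile(window, 0.9));
  }
}
```

A datanode whose low quantiles stay well below those of its peers across several windows is a candidate slow node, while a sudden drop on one host can hint at a hotspot.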
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694215#comment-17694215 ] ASF GitHub Bot commented on HDFS-16896: --- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119413903 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -1337,8 +1352,12 @@ private void hedgedFetchBlockByteRange(LocatedBlock block, long start, } catch (InterruptedException ie) { // Ignore and retry } -if (refetch) { - refetchLocations(block, ignored); +// If refetch is true, then all nodes are in deadNodes or ignoredNodes. +// We should loop through all futures and remove them, so we do not +// have concurrent requests to the same node. +// Once all futures are cleared, we can clear the ignoredNodes and retry. Review Comment: Yes. The thing I am trying to emphasize is the `&& futures.isEmpty()` check, which is specific to how the ignored nodes are cleared. > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
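The ordering described in this review comment can be sketched as follows, assuming a much simplified loop state. `RefetchGate` and `maybeResetForRetry` are hypothetical names; only the `refetch && futures.isEmpty()` condition and the idea of clearing the ignored nodes afterwards come from the discussion.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

/** Illustrative only: decides when a hedged read may reset its ignored nodes. */
final class RefetchGate {

  /**
   * Clears the ignored nodes and returns true only when a refetch is needed
   * and no hedged request is still in flight. Otherwise nothing is changed,
   * so the caller keeps draining the outstanding futures first.
   */
  static boolean maybeResetForRetry(boolean refetch,
      Collection<Future<?>> futures, Collection<String> ignoredNodes) {
    if (refetch && futures.isEmpty()) {
      ignoredNodes.clear();
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    List<Future<?>> futures = new ArrayList<>();
    futures.add(new CompletableFuture<Void>()); // one request still outstanding
    List<String> ignored = new ArrayList<>(List.of("dn1", "dn2"));
    System.out.println(maybeResetForRetry(true, futures, ignored)); // still waiting
    futures.clear();
    System.out.println(maybeResetForRetry(true, futures, ignored)); // safe to retry
  }
}
```

Waiting for the futures to drain before clearing is what prevents a second concurrent request from being sent to a node that still has one in flight.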
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694216#comment-17694216 ] ASF GitHub Bot commented on HDFS-16896: --- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119414105 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -224,7 +233,7 @@ boolean deadNodesContain(DatanodeInfo nodeInfo) { } /** - * Grab the open-file info from namenode + * Grab the open-file info from namenode. Review Comment: It came up in checkstyle. It said I had added one new checkstyle issue, so I just fixed it. > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694214#comment-17694214 ] ASF GitHub Bot commented on HDFS-16896: --- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119413264 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -197,6 +197,15 @@ private void clearLocalDeadNodes() { deadNodes.clear(); } + /** + * Clear list of ignored nodes used for hedged reads. + */ + private void clearIgnoredNodes(Collection ignoredNodes) { Review Comment: I could add a `clearSkippedNodes` and then clear both dead and ignored nodes in there, but that might be confusing, as it sounds like there's another node type/list. I don't think `clearLocalDeadNodes` should also clear ignoredNodes, because that's a bit misleading. The dead nodes and ignored nodes are handled differently, so I don't think it's crazy to keep them separate and clear. As I said before, I would add a method that clears both dead and ignored nodes, but I'm concerned it may be confusing. Let me know what you think. > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694212#comment-17694212 ] ASF GitHub Bot commented on HDFS-16896: --- mccormickt12 commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1119409826 ## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java: ## @@ -603,7 +603,9 @@ public Void answer(InvocationOnMock invocation) throws Throwable { input.read(0, buffer, 0, 1024); Assert.fail("Reading the block should have thrown BlockMissingException"); } catch (BlockMissingException e) { - assertEquals(3, input.getHedgedReadOpsLoopNumForTesting()); + // The result of 9 is due to 2 blocks by 4 iterations plus one because + // hedgedReadOpsLoopNumForTesting is incremented at start of the loop. + assertEquals(9, input.getHedgedReadOpsLoopNumForTesting()); Review Comment: We are actually 4x'ing; the comment was meant to help clarify. If you recall, the issue was that we previously tried each block only once. The change makes hedged reads follow the same number of retries as non-hedged reads, which have 3 retry loops. This example has 2 blocks, and the last loop is when it exits. Previously 2+1, and now 8+1. > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
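The arithmetic behind the expected value in the test can be written out as a small worked example. `HedgedLoopCount` is a hypothetical helper, not part of the test; the figures of 2 blocks, 1 iteration before the change, and 4 iterations after it are the ones given in the review comment.

```java
/** Illustrative only: reproduces the loop-count arithmetic from the review comment. */
final class HedgedLoopCount {

  /** Blocks tried per iteration times the number of iterations, plus the final loop that exits. */
  static int expectedLoops(int blocks, int iterations) {
    return blocks * iterations + 1;
  }

  public static void main(String[] args) {
    System.out.println(expectedLoops(2, 1)); // before the change: 2 + 1
    System.out.println(expectedLoops(2, 4)); // after the change: 8 + 1
  }
}
```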
[jira] [Assigned] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reassigned HDFS-16917: Assignee: Ravindra Dingankar > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Assignee: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.3.0, 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Dingankar updated HDFS-16917: -- Fix Version/s: 3.3.0 > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.3.0, 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694151#comment-17694151 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1446947721 Thanks @xinglin and @mkuchenbecker for the reviews and @omalley for helping to merge the change. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16917. -- Fix Version/s: 3.4.0 Resolution: Fixed > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694122#comment-17694122 ] ASF GitHub Bot commented on HDFS-16917: --- omalley merged PR #5397: URL: https://github.com/apache/hadoop/pull/5397 > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16890) RBF: Add period state refresh to keep router state near active namenode's
[ https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-16890: - Fix Version/s: (was: 3.3.6) > RBF: Add period state refresh to keep router state near active namenode's > - > > Key: HDFS-16890 > URL: https://issues.apache.org/jira/browse/HDFS-16890 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > When using the ObserverReadProxyProvider, clients can set > *dfs.client.failover.observer.auto-msync-period...* to periodically get the > Active namenode's state. When using routers without the > ObserverReadProxyProvider, this periodic update is lost. > In a busy cluster, the Router constantly gets updated with the active > namenode's state when > # There is a write operation. > # There is an operation (read/write) from a new client. > However, in the scenario when there are no new clients and no write > operations, the state kept in the router can lag behind the active's. The > router does update its state with responses from the Observer, but the > observer may be lagging behind too. > We should have a periodic refresh in the router to serve a similar role as > *dfs.client.failover.observer.auto-msync-period*.
[jira] [Resolved] (HDFS-16890) RBF: Add period state refresh to keep router state near active namenode's
     [ https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved HDFS-16890.
----------------------------------
    Fix Version/s: 3.4.0
                   3.3.6
       Resolution: Fixed

> RBF: Add period state refresh to keep router state near active namenode's
> -------------------------------------------------------------------------
>
>                 Key: HDFS-16890
>                 URL: https://issues.apache.org/jira/browse/HDFS-16890
>             Project: Hadoop HDFS
>          Issue Type: Task
>            Reporter: Simbarashe Dzinamarira
>            Assignee: Simbarashe Dzinamarira
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.6
>
> When using the ObserverReadProxyProvider, clients can set
> *dfs.client.failover.observer.auto-msync-period...* to periodically get the
> Active namenode's state. When using routers without the
> ObserverReadProxyProvider, this periodic update is lost.
> In a busy cluster, the Router constantly gets updated with the active
> namenode's state when
> # There is a write operation.
> # There is an operation (read/write) from a new client.
> However, in the scenario when there are no new clients and no write
> operations, the state kept in the router can lag behind the active's. The
> router does update its state with responses from the Observer, but the
> observer may be lagging behind too.
> We should have a periodic refresh in the router to serve a similar role as
> *dfs.client.failover.observer.auto-msync-period*
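A minimal sketch of the periodic refresh idea from HDFS-16890, for orientation only. It is not the router code: `PeriodicStateRefresher`, `refreshOnce` and the `LongSupplier` standing in for "ask the active namenode for its state ID" are all assumptions made for illustration. The sketch only shows the shape of the mechanism: a background task polls the active's state on a fixed delay, and the cached value never moves backwards.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Illustrative only: keeps a cached state ID close to the active's. */
class PeriodicStateRefresher implements AutoCloseable {
  private final LongSupplier activeStateId;  // hypothetical stand-in for an RPC to the active
  private final AtomicLong lastSeenStateId = new AtomicLong(Long.MIN_VALUE);
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "state-refresher");
        t.setDaemon(true);
        return t;
      });

  PeriodicStateRefresher(LongSupplier activeStateId) {
    this.activeStateId = activeStateId;
  }

  /** Starts refreshing every periodMillis, even when no client traffic arrives. */
  void start(long periodMillis) {
    scheduler.scheduleWithFixedDelay(
        this::refreshOnce, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
  }

  /** Fetches the active's state ID once; the cached value only ever increases. */
  void refreshOnce() {
    lastSeenStateId.accumulateAndGet(activeStateId.getAsLong(), Math::max);
  }

  long lastSeenStateId() {
    return lastSeenStateId.get();
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

Taking the maximum on each refresh reflects the concern in the description that a lagging source (such as an observer) should not pull the cached state backwards.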
[jira] [Commented] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
    [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694114#comment-17694114 ]

ASF GitHub Bot commented on HDFS-16936:
---------------------------------------

hadoop-yetus commented on PR #5438:
URL: https://github.com/apache/hadoop/pull/5438#issuecomment-1446790776

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 56s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 38m 28s | | trunk passed |
| +1 :green_heart: | compile | 1m 27s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 22s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 7s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 30s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 34s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 29s | | trunk passed |
| +1 :green_heart: | shadedclient | 23m 0s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 19s | | the patch passed |
| +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 51s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 125 unchanged - 1 fixed = 127 total (was 126) |
| +1 :green_heart: | mvnsite | 1m 19s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 27s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 23s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 244m 19s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 50s | | The patch does not generate ASF License warnings. |
| | | 351m 6s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestObserverNode |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5438 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux faa79acf889c 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 100b50c1ca907f2003d1974cc3f7d16d60f8 |
| Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Test Results | https://ci-hadoo
[jira] [Commented] (HDFS-16890) RBF: Add period state refresh to keep router state near active namenode's
    [ https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694113#comment-17694113 ]

ASF GitHub Bot commented on HDFS-16890:
---------------------------------------

omalley merged PR #5298:
URL: https://github.com/apache/hadoop/pull/5298

> RBF: Add period state refresh to keep router state near active namenode's
> -------------------------------------------------------------------------
>
>                 Key: HDFS-16890
>                 URL: https://issues.apache.org/jira/browse/HDFS-16890
>             Project: Hadoop HDFS
>          Issue Type: Task
>            Reporter: Simbarashe Dzinamarira
>            Assignee: Simbarashe Dzinamarira
>            Priority: Major
>              Labels: pull-request-available
>
> When using the ObserverReadProxyProvider, clients can set
> *dfs.client.failover.observer.auto-msync-period...* to periodically get the
> Active namenode's state. When using routers without the
> ObserverReadProxyProvider, this periodic update is lost.
> In a busy cluster, the Router constantly gets updated with the active
> namenode's state when
> # There is a write operation.
> # There is an operation (read/write) from a new client.
> However, in the scenario when there are no new clients and no write
> operations, the state kept in the router can lag behind the active's. The
> router does update its state with responses from the Observer, but the
> observer may be lagging behind too.
> We should have a periodic refresh in the router to serve a similar role as
> *dfs.client.failover.observer.auto-msync-period*
[jira] [Commented] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
    [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694093#comment-17694093 ]

ASF GitHub Bot commented on HDFS-16936:
---------------------------------------

hadoop-yetus commented on PR #5438:
URL: https://github.com/apache/hadoop/pull/5438#issuecomment-1446723423

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 42s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 38m 28s | | trunk passed |
| +1 :green_heart: | compile | 1m 28s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 24s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 8s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 30s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 34s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 28s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 42s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 19s | | the patch passed |
| +1 :green_heart: | compile | 1m 18s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 18s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 125 unchanged - 1 fixed = 127 total (was 126) |
| +1 :green_heart: | mvnsite | 1m 21s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 51s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 18s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 18s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 206m 12s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 49s | | The patch does not generate ASF License warnings. |
| | | 312m 25s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5438 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 1814bf9208ed 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 3544ef331ed04d247069bbe4536c57bc4b09bf31 |
| Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0
[jira] [Commented] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
    [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694081#comment-17694081 ]

ASF GitHub Bot commented on HDFS-16936:
---------------------------------------

hadoop-yetus commented on PR #5438:
URL: https://github.com/apache/hadoop/pull/5438#issuecomment-1446644535

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 42s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 39m 9s | | trunk passed |
| +1 :green_heart: | compile | 1m 28s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 21s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 7s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 34s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 10s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 35s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 38s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 43s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 18s | | the patch passed |
| +1 :green_heart: | compile | 1m 24s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 24s | | the patch passed |
| +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 16s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 125 unchanged - 1 fixed = 127 total (was 126) |
| +1 :green_heart: | mvnsite | 1m 24s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 53s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 32s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 22s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 214m 40s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 50s | | The patch does not generate ASF License warnings. |
| | | 322m 1s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5438 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 18737bfe128c 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 36d4eab85b664c4f3c674642f0fea16709d03459 |
| Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.or
[jira] [Commented] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
    [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694079#comment-17694079 ]

ASF GitHub Bot commented on HDFS-16936:
---------------------------------------

hadoop-yetus commented on PR #5438:
URL: https://github.com/apache/hadoop/pull/5438#issuecomment-1446642704

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 37s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 39m 13s | | trunk passed |
| +1 :green_heart: | compile | 1m 31s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | compile | 1m 23s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 5s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 33s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 12s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 29s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 35s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 25s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 19s | | the patch passed |
| +1 :green_heart: | compile | 1m 22s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javac | 1m 22s | | the patch passed |
| +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 15s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 55s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 125 unchanged - 1 fixed = 127 total (was 126) |
| +1 :green_heart: | mvnsite | 1m 23s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 55s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
| +1 :green_heart: | javadoc | 1m 29s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 27s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 20s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 212m 46s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 48s | | The patch does not generate ASF License warnings. |
| | | 319m 38s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5438 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux ae7505acdd07 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 36d4eab85b664c4f3c674642f0fea16709d03459 |
| Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5438/2/testReport/ |
| Max. process+thread count | 3574 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR
[jira] [Commented] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
    [ https://issues.apache.org/jira/browse/HDFS-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694041#comment-17694041 ]

ASF GitHub Bot commented on HDFS-16934:
---------------------------------------

slfan1989 commented on PR #5434:
URL: https://github.com/apache/hadoop/pull/5434#issuecomment-1446370125

@steveloughran Can you help review this PR? Thank you very much!

> org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-16934
>                 URL: https://issues.apache.org/jira/browse/HDFS-16934
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: dfsadmin, test
>    Affects Versions: 3.4.0, 3.3.5, 3.3.9
>            Reporter: Steve Loughran
>            Assignee: Shilun Fan
>            Priority: Minor
>              Labels: pull-request-available
>
> Jenkins test failure, as the logged output is in the wrong order for the
> assertions. HDFS-16624 flipped the order...without that this would have
> worked.
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1149)
> {code}
> Here the code is asserting about the contents of the output:
> {code}
> assertTrue(outs.get(0).startsWith("Reconfiguring status for node"));
> assertTrue("SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(2))
>     || "SUCCESS: Changed property dfs.datanode.peer.stats.enabled".equals(outs.get(1))); // here
> assertTrue("\tFrom: \"false\"".equals(outs.get(3)) || "\tFrom: \"false\"".equals(outs.get(2)));
> assertTrue("\tTo: \"true\"".equals(outs.get(4)) || "\tTo: \"true\"".equals(outs.get(3)));
> {code}
> If you look at the log, the actual line is appearing in that list, just in a
> different place. This is a race condition.
> {code}
> 2023-02-24 01:02:06,275 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:testAllDatanodesReconfig(1146)) - dfsadmin -status -livenodes output:
> 2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:41795]: started at Fri Feb 24 01:02:03 GMT 2023 and finished at Fri Feb 24 01:02:03 GMT 2023.
> 2023-02-24 01:02:06,276 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - Reconfiguring status for node [127.0.0.1:34007]: started at Fri Feb 24 01:02:03 GMT 2023SUCCESS: Changed property dfs.datanode.peer.stats.enabled
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - From: "false"
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - To: "true"
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - and finished at Fri Feb 24 01:02:03 GMT 2023.
> 2023-02-24 01:02:06,277 [Listener at localhost/41795] INFO tools.TestDFSAdmin (TestDFSAdmin.java:lambda$testAllDatanodesReconfig$0(1147)) - SUCCESS: Changed property dfs.datanode.peer.stats.enabled
> {code}
> We have a race condition in output generation, and the assertions are clearly
> too brittle.
> For the 3.3.5 release I'm not going to make this a blocker. What I will do is
> propose that the asserts move to AssertJ with an assertion that the
> collection "containsExactlyInAnyOrder" all the strings.
> That will
> 1. not be brittle.
> 2. give nice errors on failure
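To make the proposal concrete, here is a JDK-only sketch of what an order-insensitive check does. It is not the patch in PR #5434 and it does not use AssertJ itself; `AnyOrderAssert` and `assertContainsExactlyInAnyOrder` are hypothetical names that only imitate the behaviour of AssertJ's `containsExactlyInAnyOrder`: the actual list must hold exactly the expected elements, duplicates included, in any order.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: an order-insensitive "exactly these elements" check. */
class AnyOrderAssert {
  static void assertContainsExactlyInAnyOrder(List<String> actual, String... expected) {
    // Count every actual element so duplicates are handled correctly.
    Map<String, Integer> remaining = new HashMap<>();
    for (String s : actual) {
      remaining.merge(s, 1, Integer::sum);
    }
    // Consume one occurrence per expected element.
    for (String s : expected) {
      Integer left = remaining.get(s);
      if (left == null) {
        throw new AssertionError("expected element not found: <" + s + "> in " + actual);
      }
      if (left == 1) {
        remaining.remove(s);
      } else {
        remaining.put(s, left - 1);
      }
    }
    // Anything left over was not expected.
    if (!remaining.isEmpty()) {
      throw new AssertionError("unexpected elements: " + remaining.keySet());
    }
  }
}
```

With a check of this shape, the position at which the "SUCCESS: Changed property ..." line lands no longer matters, and a failure names the offending element instead of reporting a bare `AssertionError`.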
[jira] [Updated] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
     [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Bukhner updated HDFS-16936:
--------------------------------
    Description: 
Currently it's impossible to configure the directory path used by {*}NNThroughputBenchmark{*}.

This improvement is helpful for testing *RBF* features.

Example of usage:
{code:java}
sudo bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs ... -op create -threads 16 -baseDir /cluster1{code}

  was:
Currently it's impossible to configure the directory path in {*}NNThroughputBenchmark{*}.

This improvement is helpful for testing *RBF* features.

Example of usage:
{code:java}
sudo bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs ... -op create -threads 16 -baseDir /cluster1{code}

> Add baseDir option in NNThroughputBenchmark
> -------------------------------------------
>
>                 Key: HDFS-16936
>                 URL: https://issues.apache.org/jira/browse/HDFS-16936
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.3.4
>            Reporter: Mark Bukhner
>            Priority: Trivial
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.5
>
> Currently it's impossible to configure the directory path used by
> {*}NNThroughputBenchmark{*}.
> This improvement is helpful for testing *RBF* features.
> Example of usage:
> {code:java}
> sudo bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs ... -op create -threads 16 -baseDir /cluster1{code}
[jira] [Updated] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
     [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Bukhner updated HDFS-16936:
--------------------------------
    Description: 
Currently it's impossible to configure the directory path in {*}NNThroughputBenchmark{*}.

This improvement is helpful for testing *RBF* features.

Example of usage:
{code:java}
sudo bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs ... -op create -threads 16 -baseDir /cluster1{code}

  was:
Currently it's impossible to configure the directory path in {*}NNThroughputBenchmark{*}.

This improvement is helpful for testing *RBF* features.

       Priority: Trivial  (was: Minor)

> Add baseDir option in NNThroughputBenchmark
> -------------------------------------------
>
>                 Key: HDFS-16936
>                 URL: https://issues.apache.org/jira/browse/HDFS-16936
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.3.4
>            Reporter: Mark Bukhner
>            Priority: Trivial
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.5
>
> Currently it's impossible to configure the directory path in
> {*}NNThroughputBenchmark{*}.
> This improvement is helpful for testing *RBF* features.
> Example of usage:
> {code:java}
> sudo bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs ... -op create -threads 16 -baseDir /cluster1{code}
[jira] [Commented] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
    [ https://issues.apache.org/jira/browse/HDFS-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693958#comment-17693958 ]

ASF GitHub Bot commented on HDFS-16936:
---------------------------------------

Alowator opened a new pull request, #5438:
URL: https://github.com/apache/hadoop/pull/5438

### Description of PR
Currently it's impossible to configure the directory path in **NNThroughputBenchmark**.

This improvement is helpful for testing **RBF** features.

Example of usage:
`sudo bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs ... -op create -threads 16 -baseDir /cluster1`

### How was this patch tested?
Ran 2 local subclusters in RBF mode, then ran two NNThroughputBenchmark instances in parallel with different -baseDir options.

### For code changes:
- [+] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [+] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

> Add baseDir option in NNThroughputBenchmark
> -------------------------------------------
>
>                 Key: HDFS-16936
>                 URL: https://issues.apache.org/jira/browse/HDFS-16936
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.3.4
>            Reporter: Mark Bukhner
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.5
>
> Currently it's impossible to configure the directory path in
> {*}NNThroughputBenchmark{*}.
> This improvement is helpful for testing *RBF* features.
[jira] [Created] (HDFS-16936) Add baseDir option in NNThroughputBenchmark
Mark Bukhner created HDFS-16936:
-----------------------------------

             Summary: Add baseDir option in NNThroughputBenchmark
                 Key: HDFS-16936
                 URL: https://issues.apache.org/jira/browse/HDFS-16936
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 3.3.4
            Reporter: Mark Bukhner
             Fix For: 3.4.0, 3.3.5

Currently it's impossible to configure the directory path in {*}NNThroughputBenchmark{*}.

This improvement is helpful for testing *RBF* features.
[jira] [Resolved] (HDFS-14548) Cannot create snapshot when the snapshotCounter reaches MaxSnapshotID
     [ https://issues.apache.org/jira/browse/HDFS-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen O'Donnell resolved HDFS-14548.
--------------------------------------
    Resolution: Duplicate

> Cannot create snapshot when the snapshotCounter reaches MaxSnapshotID
> ---------------------------------------------------------------------
>
>                 Key: HDFS-14548
>                 URL: https://issues.apache.org/jira/browse/HDFS-14548
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: zhangqianqiong
>            Priority: Major
>         Attachments: 1559717485296.jpg
>
> When a new snapshot is created, the snapshotCounter is incremented, but when
> a snapshot is deleted, the snapshotCounter is not decremented. Over time,
> when the snapshotCounter reaches the MaxSnapshotID, no new snapshot can
> be created.
> By the way, how can I reset the snapshotCounter?
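The behaviour reported in HDFS-14548 can be reproduced in miniature. The class below is a toy model, not the NameNode's snapshot manager: `SnapshotIdSketch`, its methods and the tiny limit used in the usage note are assumptions chosen for illustration. It only demonstrates the mechanism from the report: IDs come from a counter that grows on every create and is untouched by delete, so creation eventually fails even when no snapshots remain.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: a monotonically increasing snapshot ID counter. */
class SnapshotIdSketch {
  private final int maxSnapshotId;          // hypothetical upper bound on IDs
  private int snapshotCounter = 0;          // next ID to hand out
  private final Set<Integer> live = new HashSet<>();

  SnapshotIdSketch(int maxSnapshotId) {
    this.maxSnapshotId = maxSnapshotId;
  }

  /** Allocates the next ID, or fails once the counter has reached the limit. */
  int create() {
    if (snapshotCounter >= maxSnapshotId) {
      throw new IllegalStateException("snapshot IDs exhausted at " + snapshotCounter);
    }
    int id = snapshotCounter++;
    live.add(id);
    return id;
  }

  /** Removes the snapshot but deliberately leaves the counter alone. */
  void delete(int id) {
    live.remove(id);
  }

  int liveCount() {
    return live.size();
  }
}
```

With a limit of 2, two create/delete cycles leave `liveCount()` at 0, yet the third `create()` throws, which is the situation the reporter describes.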
[jira] [Commented] (HDFS-16600) Fix deadlock of fine-grain lock for FsDatastImpl of DataNode.
    [ https://issues.apache.org/jira/browse/HDFS-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693871#comment-17693871 ]

ZhangHB commented on HDFS-16600:
--------------------------------

[~xuzq_zander], hi, brother. Could you please provide some performance results? Thanks. Looking forward to your reply.

> Fix deadlock of fine-grain lock for FsDatastImpl of DataNode.
> -------------------------------------------------------------
>
>                 Key: HDFS-16600
>                 URL: https://issues.apache.org/jira/browse/HDFS-16600
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The UT
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.testSynchronousEviction
> failed because a deadlock occurred, which was introduced by
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534].
> Deadlock:
> {code:java}
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.createRbw line 1588 needs a read lock
> try (AutoCloseableLock lock = lockManager.readLock(LockLevel.BLOCK_POOl, b.getBlockPoolId()))
> // org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.evictBlocks line 3526 needs a write lock
> try (AutoCloseableLock lock = lockManager.writeLock(LockLevel.BLOCK_POOl, bpid))
> {code}
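One way to read the two quoted lock sites is as a read-to-write upgrade on the same block pool lock. The sketch below demonstrates that hazard with the JDK's `ReentrantReadWriteLock`. Two things are assumptions here, not statements from the issue: that the DataNode lock manager behaves like `ReentrantReadWriteLock`, and that both acquisitions happen on the same thread. `LockUpgradeSketch` and its method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative only: read-to-write upgrade fails, write-to-read downgrade works. */
class LockUpgradeSketch {
  /** Returns true if the write lock can be taken while this thread holds the read lock. */
  static boolean canUpgradeReadToWrite() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    lock.readLock().lock();
    try {
      // tryLock() is used so the demo returns; a plain lock() would block forever here.
      boolean acquired = lock.writeLock().tryLock();
      if (acquired) {
        lock.writeLock().unlock();
      }
      return acquired;
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Returns true if the read lock can be taken while this thread holds the write lock. */
  static boolean canDowngradeWriteToRead() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    lock.writeLock().lock();
    try {
      boolean acquired = lock.readLock().tryLock();
      if (acquired) {
        lock.readLock().unlock();
      }
      return acquired;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

`canUpgradeReadToWrite()` returns false and `canDowngradeWriteToRead()` returns true, so under these assumptions a code path that holds the block pool read lock and then requests the write lock with a blocking call would never make progress.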