[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673001
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 07:23
Start Date: 02/Nov/21 07:23
Worklog Time Spent: 10m 
  Work Description: haiyang1987 commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957160686


   @ferhui @tomscut I submitted some code. Can you help review it?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673001)
Time Spent: 1h 20m  (was: 1h 10m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow a rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.
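
For background, a minimal sketch of how a NameNode property is typically wired
into the reconfiguration framework; the constant names and the DatanodeManager
setter are assumptions based on the discussion in this thread, not the actual
patch:

{code:java}
// Sketch only: inside NameNode (a ReconfigurableBase subclass).
@Override
protected String reconfigurePropertyImpl(String property, String newVal)
    throws ReconfigurationException {
  if (DFS_NAMENODE_AVOID_SLOW_DATANODE_FOR_READ_KEY.equals(property)) {
    boolean enable = (newVal == null)
        ? DFS_NAMENODE_AVOID_SLOW_DATANODE_FOR_READ_DEFAULT
        : Boolean.parseBoolean(newVal);
    // assumed setter on DatanodeManager, mirroring the PR discussion below
    namesystem.getBlockManager().getDatanodeManager()
        .setAvoidSlowDataNodesForReadEnabled(enable);
    return Boolean.toString(enable);
  }
  throw new ReconfigurationException(property, newVal, getConf().get(property));
}
{code}

The new value can then be applied at runtime with
{{hdfs dfsadmin -reconfig namenode <host:ipc_port> start}}, which is what makes
a rapid rollback possible without a NameNode restart.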



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu reassigned HDFS-16294:
---

Assignee: JiangHua Zhu

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
> Fix For: 2.9.2
>
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020
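
For context, a minimal sketch of how the simulated dataset is selected today,
assuming the current test API (SimulatedFSDataset.setFactory is the usual
entry point in tests):

{code:java}
// Sketch only: the simulated dataset is chosen via the factory key,
// not the removed DataNode#CONFIG_PROPERTY_SIMULATED constant.
Configuration conf = new HdfsConfiguration();
conf.setClass(DFSConfigKeys.DFS_DATANODE_FSDATASET_FACTORY_KEY,
    SimulatedFSDataset.Factory.class, FsDatasetSpi.Factory.class);
// equivalent convenience helper used throughout the tests:
// SimulatedFSDataset.setFactory(conf);
{code}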



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)
JiangHua Zhu created HDFS-16294:
---

 Summary: Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
 Key: HDFS-16294
 URL: https://issues.apache.org/jira/browse/HDFS-16294
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: JiangHua Zhu
 Fix For: 2.9.2


As early as when HDFS-2907 was resolved, 
SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
SimulatedFSDataset#Factory and DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in the 
DataNode.


Here are some traces related to HDFS-2907:
https://issues.apache.org/jira/browse/HDFS-2907
https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu updated HDFS-16294:

Description: 
As early as when HDFS-2907 was resolved, 
SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
SimulatedFSDataset#Factory and DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in the 
DataNode.
 !screenshot.png! 

Here are some traces related to HDFS-2907:
https://issues.apache.org/jira/browse/HDFS-2907
https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020

  was:
As early as when HDFS-2907 was resolved, 
SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
SimulatedFSDataset#Factory and DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in the 
DataNode.


Here are some traces related to HDFS-2907:
https://issues.apache.org/jira/browse/HDFS-2907
https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020


> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: screenshot.png
>
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu updated HDFS-16294:

Attachment: screenshot.png

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: screenshot.png
>
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16294 started by JiangHua Zhu.
---
> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.9.2
>
> Attachments: screenshot.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16294:
--
Labels: pull-request-available  (was: )

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.9.2
>
> Attachments: screenshot.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=673018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673018
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 08:15
Start Date: 02/Nov/21 08:15
Worklog Time Spent: 10m 
  Work Description: jianghuazhu opened a new pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605


   ### Description of PR
   As early as when HDFS-2907 was resolved, 
SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
SimulatedFSDataset#Factory and DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
   However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
the DataNode.
   Details: HDFS-16294
   
   ### How was this patch tested?
   This PR mainly changes code comments, so the testing burden is small.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673018)
Remaining Estimate: 0h
Time Spent: 10m

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
> Fix For: 2.9.2
>
> Attachments: screenshot.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu updated HDFS-16294:

Affects Version/s: 2.9.2

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.9.2
>
> Attachments: screenshot.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu updated HDFS-16294:

Fix Version/s: (was: 2.9.2)

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673033
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 09:05
Start Date: 02/Nov/21 09:05
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740842196



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file ",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long m

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673051
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 09:39
Start Date: 02/Nov/21 09:39
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740874125



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file ",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List<String> args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long m

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673053
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 09:44
Start Date: 02/Nov/21 09:44
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957273204


   @sodonnel Thanks for your review. 
   Update: I have addressed the review comments and added some tests in 
`TestDebugAdmin#testVerifyECCommand`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673053)
Time Spent: 1h  (was: 50m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) is unable to discover the corruption in some cases such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption, either under conditions other 
> than EC reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).
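
The verification idea, as a minimal sketch (not the actual patch; the helper
methods and the buffer position handling are illustrative assumptions): read
the data cells of each stripe, re-encode them, and compare the result against
the parity cells stored on the datanodes.

{code:java}
// Sketch only: re-encode data cells and compare with stored parity.
ErasureCoderOptions options = new ErasureCoderOptions(dataBlkNum, parityBlkNum);
RawErasureEncoder encoder =
    CodecUtil.createRawEncoder(conf, ecPolicy.getCodecName(), options);
ByteBuffer[] dataCells = readDataCells(stripe);       // hypothetical helper
ByteBuffer[] storedParity = readParityCells(stripe);  // hypothetical helper
ByteBuffer[] computedParity = new ByteBuffer[parityBlkNum];
for (int i = 0; i < parityBlkNum; i++) {
  computedParity[i] = ByteBuffer.allocate(cellSize);
}
encoder.encode(dataCells, computedParity);
for (int i = 0; i < parityBlkNum; i++) {
  computedParity[i].flip();  // position handling simplified
  if (!computedParity[i].equals(storedParity[i])) {
    throw new IOException("EC block group corrupt at parity index " + i);
  }
}
{code}

Judging by the pull request, the command would be invoked through the debug
admin entry point, e.g. {{hdfs debug verifyEC -file <file>}}.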



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15413) DFSStripedInputStream throws exception when datanodes close idle connections

2021-11-02 Thread Hemanth Boyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437284#comment-17437284
 ] 

Hemanth Boyina commented on HDFS-15413:
---

Hi [~jmkubin], AFAIK the issue is not resolved yet; you can work on it if you 
want to.

> DFSStripedInputStream throws exception when datanodes close idle connections
> 
>
> Key: HDFS-15413
> URL: https://issues.apache.org/jira/browse/HDFS-15413
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, erasure-coding, hdfs-client
>Affects Versions: 3.1.3
> Environment: - Hadoop 3.1.3
> - erasure coding with ISA-L and RS-3-2-1024k scheme
> - running in kubernetes
> - dfs.client.socket-timeout = 1
> - dfs.datanode.socket.write.timeout = 1
>Reporter: Andrey Elenskiy
>Priority: Critical
> Attachments: out.log
>
>
> We've run into an issue with compactions failing in HBase when erasure coding 
> is enabled on a table directory. After digging further I was able to narrow 
> it down to a seek + read logic and able to reproduce the issue with hdfs 
> client only:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.FSDataInputStream;
> public class ReaderRaw {
> public static void main(final String[] args) throws Exception {
> Path p = new Path(args[0]);
> int bufLen = Integer.parseInt(args[1]);
> int sleepDuration = Integer.parseInt(args[2]);
> int countBeforeSleep = Integer.parseInt(args[3]);
> int countAfterSleep = Integer.parseInt(args[4]);
> Configuration conf = new Configuration();
> FSDataInputStream istream = FileSystem.get(conf).open(p);
> byte[] buf = new byte[bufLen];
> int readTotal = 0;
> int count = 0;
> try {
>   while (true) {
> istream.seek(readTotal);
> int bytesRemaining = bufLen;
> int bufOffset = 0;
> while (bytesRemaining > 0) {
>   int nread = istream.read(buf, 0, bufLen);
>   if (nread < 0) {
>   throw new Exception("nread is less than zero");
>   }
>   readTotal += nread;
>   bufOffset += nread;
>   bytesRemaining -= nread;
> }
> count++;
> if (count == countBeforeSleep) {
> System.out.println("sleeping for " + sleepDuration + " 
> milliseconds");
> Thread.sleep(sleepDuration);
> System.out.println("resuming");
> }
> if (count == countBeforeSleep + countAfterSleep) {
> System.out.println("done");
> break;
> }
>   }
> } catch (Exception e) {
> System.out.println("exception on read " + count + " read total " 
> + readTotal);
> throw e;
> }
> }
> }
> {code}
> The issue appears to be due to the fact that datanodes close the connection 
> of EC client if it doesn't fetch next packet for longer than 
> dfs.client.socket-timeout. The EC client doesn't retry and instead assumes 
> that those datanodes went away resulting in "missing blocks" exception.
> I was able to consistently reproduce with the following arguments:
> {noformat}
> bufLen = 100 (just below 1MB which is the size of the stripe) 
> sleepDuration = (dfs.client.socket-timeout + 1) * 1000 (in our case 11000)
> countBeforeSleep = 1
> countAfterSleep = 7
> {noformat}
> I've attached the entire log output of running the snippet above against 
> erasure coded file with RS-3-2-1024k policy. And here are the logs from 
> datanodes of disconnecting the client:
> datanode 1:
> {noformat}
> 2020-06-15 19:06:20,697 INFO datanode.DataNode: Likely the client has stopped 
> reading, disconnecting it (datanode-v11-0-hadoop.hadoop:9866:DataXceiver 
> error processing READ_BLOCK operation  src: /10.128.23.40:53748 dst: 
> /10.128.14.46:9866); java.net.SocketTimeoutException: 1 millis timeout 
> while waiting for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.128.14.46:9866 
> remote=/10.128.23.40:53748]
> {noformat}
> datanode 2:
> {noformat}
> 2020-06-15 19:06:20,341 INFO datanode.DataNode: Likely the client has stopped 
> reading, disconnecting it (datanode-v11-1-hadoop.hadoop:9866:DataXceiver 
> error processing READ_BLOCK operation  src: /10.128.23.40:48772 dst: 
> /10.128.9.42:9866); java.net.SocketTimeoutException: 1 millis timeout 
> while waiting for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.128.9.42:9866 
> remote=/10.128.23.40:48772]
> {noformat}
> datan
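
For reference, the reproduction snippet above takes its parameters positionally
on the command line, so a run looks roughly like this (class name and classpath
are placeholders):

{noformat}
java -cp <hadoop-classpath> ReaderRaw <path> <bufLen> <sleepDurationMs> \
    <countBeforeSleep> <countAfterSleep>
{noformat}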

[jira] [Work logged] (HDFS-16273) RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16273?focusedWorklogId=673082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673082
 ]

ASF GitHub Bot logged work on HDFS-16273:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 11:01
Start Date: 02/Nov/21 11:01
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3553:
URL: https://github.com/apache/hadoop/pull/3553#issuecomment-957333859


   Hi @tasanuma would you check this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673082)
Time Spent: 40m  (was: 0.5h)

> RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics
> -
>
> Key: HDFS-16273
> URL: https://issues.apache.org/jira/browse/HDFS-16273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Add the availableHandlerOnPerNs metrics to monitor whether the number of 
> handlers configured for each NS is reasonable when using 
> RouterRpcFairnessPolicyController.
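
A minimal sketch of the kind of metric this describes, assuming a permit
(Semaphore) based fairness controller; the field and method names are
illustrative, not the actual patch:

{code:java}
// Sketch only: expose the free handler count per downstream nameservice.
private final Map<String, Semaphore> permitsPerNs = new ConcurrentHashMap<>();

public Map<String, Integer> getAvailableHandlerOnPerNs() {
  Map<String, Integer> available = new HashMap<>();
  permitsPerNs.forEach((ns, permits) ->
      available.put(ns, permits.availablePermits()));
  // a persistently low value suggests the handler quota for that NS is too small
  return available;
}
{code}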



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673088
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 11:06
Start Date: 02/Nov/21 11:06
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#discussion_r740948534



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -260,17 +257,14 @@
 final Timer timer = new Timer();
 this.slowPeerTracker = dataNodePeerStatsEnabled ?
 new SlowPeerTracker(conf, timer) : null;
-this.excludeSlowNodesEnabled = conf.getBoolean(

Review comment:
   Is it unused? If so we can remove it from hdfs-default.xml.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673088)
Time Spent: 1.5h  (was: 1h 20m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow a rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673090
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 11:07
Start Date: 02/Nov/21 11:07
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957338822


   @haiyang1987 Thanks for the contribution. Some comments:
   We can change the title here and in the JIRA if 
dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is not 
reconfigurable.
   And I will check whether 
dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is unused.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673090)
Time Spent: 1h 40m  (was: 1.5h)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow a rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16291) Make the comment of INode#ReclaimContext more standardized

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16291?focusedWorklogId=673091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673091
 ]

ASF GitHub Bot logged work on HDFS-16291:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 11:10
Start Date: 02/Nov/21 11:10
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3602:
URL: https://github.com/apache/hadoop/pull/3602#issuecomment-957340418


   Hi @jojochuang @ferhui @Hexiaoqiao, would you be willing to spend some time 
helping review this PR?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673091)
Time Spent: 50m  (was: 40m)

> Make the comment of INode#ReclaimContext more standardized
> --
>
> Key: HDFS-16291
> URL: https://issues.apache.org/jira/browse/HDFS-16291
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, namenode
>Affects Versions: 3.4.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2021-10-31-20-25-08-379.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In the INode#ReclaimContext class, there are some comments that are not 
> standardized enough.
> E.g.:
>  !image-2021-10-31-20-25-08-379.png! 
> We should make the comments more standardized, which will make them more 
> readable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16259) Catch and re-throw sub-classes of AccessControlException thrown by any permission provider plugins (eg Ranger)

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16259?focusedWorklogId=673096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673096
 ]

ASF GitHub Bot logged work on HDFS-16259:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 11:14
Start Date: 02/Nov/21 11:14
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3598:
URL: https://github.com/apache/hadoop/pull/3598


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673096)
Time Spent: 40m  (was: 0.5h)

> Catch and re-throw sub-classes of AccessControlException thrown by any 
> permission provider plugins (eg Ranger)
> --
>
> Key: HDFS-16259
> URL: https://issues.apache.org/jira/browse/HDFS-16259
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a permission provider plugin is enabled (eg Ranger) there are some 
> scenarios where it can throw a sub-class of an AccessControlException (eg 
> RangerAccessControlException). If this exception is allowed to propagate up 
> the stack, it can give problems in the HDFS Client, when it unwraps the 
> remote exception containing the AccessControlException sub-class.
> Ideally, we should make AccessControlException final so it cannot be 
> sub-classed, but that would be a breaking change at this point. Therefore I 
> believe the safest thing to do, is to catch any AccessControlException that 
> comes out of the permission enforcer plugin, and re-throw an 
> AccessControlException instead.
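
A minimal sketch of the approach described above (the enforcer call is
illustrative; where exactly the catch sits is up to the patch):

{code:java}
// Sketch only: flatten any AccessControlException subclass thrown by the
// plugin (e.g. RangerAccessControlException) into the base class so the
// HDFS client can always unwrap the remote exception.
try {
  enforcer.checkPermission(/* ... */);
} catch (AccessControlException ace) {
  throw new AccessControlException(ace.getMessage());
}
{code}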



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16259) Catch and re-throw sub-classes of AccessControlException thrown by any permission provider plugins (eg Ranger)

2021-11-02 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-16259:
-
Fix Version/s: 3.3.2
   3.4.0

> Catch and re-throw sub-classes of AccessControlException thrown by any 
> permission provider plugins (eg Ranger)
> --
>
> Key: HDFS-16259
> URL: https://issues.apache.org/jira/browse/HDFS-16259
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a permission provider plugin is enabled (eg Ranger) there are some 
> scenarios where it can throw a sub-class of an AccessControlException (eg 
> RangerAccessControlException). If this exception is allowed to propagate up 
> the stack, it can give problems in the HDFS Client, when it unwraps the 
> remote exception containing the AccessControlException sub-class.
> Ideally, we should make AccessControlException final so it cannot be 
> sub-classed, but that would be a breaking change at this point. Therefore I 
> believe the safest thing to do, is to catch any AccessControlException that 
> comes out of the permission enforcer plugin, and re-throw an 
> AccessControlException instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16259) Catch and re-throw sub-classes of AccessControlException thrown by any permission provider plugins (eg Ranger)

2021-11-02 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell resolved HDFS-16259.
--
Resolution: Fixed

Thanks for the review [~weichiu] and the discussion on this [~ayushtkn]

> Catch and re-throw sub-classes of AccessControlException thrown by any 
> permission provider plugins (eg Ranger)
> --
>
> Key: HDFS-16259
> URL: https://issues.apache.org/jira/browse/HDFS-16259
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a permission provider plugin is enabled (eg Ranger) there are some 
> scenarios where it can throw a sub-class of an AccessControlException (eg 
> RangerAccessControlException). If this exception is allowed to propagate up 
> the stack, it can give problems in the HDFS Client, when it unwraps the 
> remote exception containing the AccessControlException sub-class.
> Ideally, we should make AccessControlException final so it cannot be 
> sub-classed, but that would be a breaking change at this point. Therefore I 
> believe the safest thing to do, is to catch any AccessControlException that 
> comes out of the permission enforcer plugin, and re-throw an 
> AccessControlException instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673119&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673119
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:05
Start Date: 02/Nov/21 12:05
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957399841


   > routerClientIp
   
   Thanks @aajisaka for your review and comment. I will make the changes in a 
separate JIRA if necessary, because this might involve the callContext from the 
Router. What do you think? Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673119)
Time Spent: 7h 10m  (was: 7h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task that triggers a sudden flood of 
> requests, driving the queueTime and processingTime of the NameNode very 
> high and creating a large backlog of tasks.
> We usually locate and kill the specific Spark, Flink, or MapReduce tasks based 
> on metrics and audit logs. Currently, IP and UGI are recorded in the audit 
> logs, but there is no port information, so it is sometimes difficult to locate 
> the specific process. Therefore, I propose that we add port information to the 
> audit log so that we can easily track the upstream process.
> Some projects, such as HBase and Alluxio, already include port information in 
> their audit logs. I think it is also necessary to add port information to the 
> HDFS audit logs.
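
For illustration, a hypothetical audit line with the proposed port appended to
the ip field (the exact layout is up to the patch):

{noformat}
2021-11-02 12:00:00,000 INFO FSNamesystem.audit: allowed=true ugi=spark (auth:SIMPLE) ip=/10.0.0.12:46238 cmd=open src=/data/part-00000 dst=null perm=null proto=rpc
{noformat}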



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673125
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:22
Start Date: 02/Nov/21 12:22
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957437161


   > @ferhui @tomscut I submitted some code. Can you help review it? Thank you 
very much.
   
   Thanks for reminding me. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673125)
Time Spent: 1h 50m  (was: 1h 40m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow a rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673129
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:26
Start Date: 02/Nov/21 12:26
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957445801


   > I will make the changes in a separate JIRA if necessary, because this 
might involve the callContext from the Router. What do you think? Thank you.
   
   Agreed. Let's discuss in a separate JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673129)
Time Spent: 7h 20m  (was: 7h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task that triggers a sudden flood of 
> requests, driving the queueTime and processingTime of the NameNode very 
> high and creating a large backlog of tasks.
> We usually locate and kill the specific Spark, Flink, or MapReduce tasks based 
> on metrics and audit logs. Currently, IP and UGI are recorded in the audit 
> logs, but there is no port information, so it is sometimes difficult to locate 
> the specific process. Therefore, I propose that we add port information to the 
> audit log so that we can easily track the upstream process.
> Some projects, such as HBase and Alluxio, already include port information in 
> their audit logs. I think it is also necessary to add port information to the 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673130&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673130
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:27
Start Date: 02/Nov/21 12:27
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#discussion_r741006484



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -511,7 +505,16 @@ private boolean isInactive(DatanodeInfo datanode) {
   private boolean isSlowNode(String dnUuid) {
 return avoidSlowDataNodesForRead && slowNodesUuidSet.contains(dnUuid);
   }
-  
+
+  public void setAvoidSlowDataNodesForReadEnabled(boolean enable) {

Review comment:
   We might need to check whether the slowPeerTracker is started; otherwise we 
might not get the slow peers.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -260,17 +257,14 @@
 final Timer timer = new Timer();
 this.slowPeerTracker = dataNodePeerStatsEnabled ?
 new SlowPeerTracker(conf, timer) : null;
-this.excludeSlowNodesEnabled = conf.getBoolean(
-DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_KEY,
-DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_DEFAULT);
 this.maxSlowPeerReportNodes = conf.getInt(
 DFSConfigKeys.DFS_NAMENODE_MAX_SLOWPEER_COLLECT_NODES_KEY,
 DFSConfigKeys.DFS_NAMENODE_MAX_SLOWPEER_COLLECT_NODES_DEFAULT);
 this.slowPeerCollectionInterval = conf.getTimeDuration(
 DFSConfigKeys.DFS_NAMENODE_SLOWPEER_COLLECT_INTERVAL_KEY,
 DFSConfigKeys.DFS_NAMENODE_SLOWPEER_COLLECT_INTERVAL_DEFAULT,
 TimeUnit.MILLISECONDS);
-if (slowPeerTracker != null && excludeSlowNodesEnabled) {

Review comment:
   If this change is made, the SlowPeerCollector thread will be started 
regardless of whether we enable this feature.
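
For context, the behavior after the change (per item 2 of the issue
description), sketched from the quoted diff; this is assumed, not the final
patch:

{code:java}
// slowPeerTracker is only non-null when dfs.datanode.peer.stats.enabled is
// true, so the collector is gated on that setting alone, which is why it
// now starts regardless of the exclude-slow-nodes feature.
if (slowPeerTracker != null) {
  startSlowPeerCollector();
}
{code}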




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673130)
Time Spent: 2h  (was: 1h 50m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, so
> that it can be rolled back quickly if the feature
> [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076] causes
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673132&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673132
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 12:29
Start Date: 02/Nov/21 12:29
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957450485


   > > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   > 
   > Agreed. Let's discuss in a separate JIRA.
   
   Thanks @aajisaka for your quick reply.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673132)
Time Spent: 7.5h  (was: 7h 20m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a
> user submits an abnormal computation task, causing a sudden flood of requests
> that drives the queueTime and processingTime of the NameNode very high and
> creates a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but
> there is no port information, so it is sometimes difficult to locate the
> specific process. Therefore, I propose that we add the port information to
> the audit log, so that we can easily track the upstream process.
> Currently, some projects, such as HBase and Alluxio, include port information
> in their audit logs. I think it is also necessary to add port information to
> HDFS audit logs.
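
As an illustration of the proposal, a minimal sketch of an audit entry that
carries the remote port alongside the IP (the helper below is hypothetical,
not the actual FSNamesystem audit logger):

```
  // Sketch: key the audit entry by ip:port instead of just ip, so the
  // exact upstream client process can be identified.
  static String auditEntry(String ugi, String ip, int port,
      String cmd, String src) {
    return String.format("ugi=%s\tip=/%s:%d\tcmd=%s\tsrc=%s",
        ugi, ip, port, cmd, src);
  }
```

With the port present, an operator can match the audit line against the
client host's connection table and find the offending process.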



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=673164&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673164
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 13:36
Start Date: 02/Nov/21 13:36
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605#issuecomment-957588812


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 43s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 223m 16s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 319m 45s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3605 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux e0355e340cb5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d46bd52ef0cb43aea364d0b2a32f704103c0eec2 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/testReport/ |
   | Max. process+thread count | 3390 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/console |
   | versions | git=2

[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=673176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673176
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 14:11
Start Date: 02/Nov/21 14:11
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605#issuecomment-957661245


   It seems that Jenkins did not execute successfully, but the failures appear
to have little to do with the code I submitted.
   @tasanuma @prasad-acit @tomscut, would you be willing to spend some time
reviewing this PR?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673176)
Time Spent: 0.5h  (was: 20m)

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved,
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by
> SimulatedFSDataset#Factory and
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020
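
For context, a minimal sketch of how a test selects the simulated dataset
since HDFS-2907, through the factory key that superseded the old boolean
property (illustrative; SimulatedFSDataset lives in the HDFS test sources):

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class SimulatedDatasetSetup {
  public static Configuration newConf() {
    // Sketch: point the dataset factory key at the simulated
    // implementation instead of setting a boolean property.
    Configuration conf = new HdfsConfiguration();
    conf.set(DFSConfigKeys.DFS_DATANODE_FSDATASET_FACTORY_KEY,
        "org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset$Factory");
    return conf;
  }
}
```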



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16295) DataNode cannot resolve the NameNode DNS name at startup, so the handshake RPC always fails

2021-11-02 Thread zhangkun (Jira)
zhangkun created HDFS-16295:
---

 Summary: DataNode cannot resolve the NameNode DNS name at startup,
so the handshake RPC always fails
 Key: HDFS-16295
 URL: https://issues.apache.org/jira/browse/HDFS-16295
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.2.1
Reporter: zhangkun


Hadoop on Kubernetes.

I use a Kubernetes StatefulSet to start the HDFS DataNode and NameNode.

When the DataNode starts, if it cannot resolve the NameNode host, the
InetSocketAddress remains unresolved, and every RPC request to the NameNode
throws an UnknownHostException.

Even if the NameNode is started and becomes reachable after the DataNode has
started, the DataNode still cannot reconnect to it.

Hadoop version: 3.2.1

Relevant code:

org.apache.hadoop.hdfs.server.datanode.BPServiceActor.retrieveNamespaceInfo()
calls bpNamenode.versionRequest();

On the RPC side, the constructor of org.apache.hadoop.ipc.Client.Connection
throws the UnknownHostException.
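
A minimal sketch of a possible client-side mitigation, re-resolving the
address before each handshake attempt (names are illustrative, not the actual
Hadoop fix):

```
import java.net.InetSocketAddress;

final class AddressUtil {
  // Sketch: rebuild an unresolved InetSocketAddress before each RPC
  // attempt, so a NameNode that only becomes resolvable after the
  // DataNode starts can still be reached.
  static InetSocketAddress reresolve(InetSocketAddress addr) {
    if (addr.isUnresolved()) {
      // Constructing a new instance triggers a fresh DNS lookup.
      return new InetSocketAddress(addr.getHostName(), addr.getPort());
    }
    return addr;
  }
}
```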

 

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673280
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 16:53
Start Date: 02/Nov/21 16:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957941102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 13 unchanged - 
0 fixed = 14 total (was 13)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 363m 34s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 469m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8846eb1a8063 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 
06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d30b66ba08b5ad4404363477591cb1681c12cb6c |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673291
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:15
Start Date: 02/Nov/21 17:15
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741306774



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 6)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+cluster.waitActive();
 assertTrue(runCmd(new String[] {
 "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
 "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 6)
+  public void testVerifyECCommand() throws Exception {
+final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+cluster.waitActive();
+DistributedFileSystem fs = cluster.getFileSystem();
+
+assertEquals("ret: 1, verifyEC -file   Verify HDFS erasure coding on 
" +
+"all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+assertEquals("ret: 1, File /bar does not exist.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+fs.create(new Path("/bar")).close();
+assertEquals("ret: 1, File /bar is not erasure coded.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+
+final Path ecDir = new Path("/ec");
+fs.mkdir(ecDir, FsPermission.getDirDefault());
+fs.enableErasureCodingPolicy(ecPolicy.getName());
+fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+assertEquals("ret: 1, File /ec is not a regular file.",
+runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+fs.create(new Path(ecDir, "foo"));
+assertEquals("ret: 1, File /ec/foo is not closed.",
+runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+final short repl = 1;
+final long k = 1024;
+final long m = k * k;
+final long seed = 0x1234567L;
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+.contains("All EC block group status: OK"));
+

Review comment:
   Could you add one more test case for a file that has multiple block 
groups, so we test the command looping over more than 1 block? You are using EC 
3-2, so write a file that is 6MB, with a 1MB block size. That should create 2 
block groups, with a length of 3MB each. Each block would then have a single 
1MB EC chunk in it. 
   
   In `DFSTestUtil` there is a method to pass the blocksize already, so the 
test would be almost the same as the ones above:
   
   ```
 public static void createFile(FileSystem fs, Path fileName, int bufferLen,
 long fileLen, long blockSize, short replFactor, long seed)
   ```
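
   A sketch of what that extra case might look like, reusing the names from
the quoted test (the expected output string is an assumption):

   ```
    // Sketch: 6MB file with a 1MB block size on RS-3-2, which should
    // produce two block groups of 3MB each.
    DFSTestUtil.createFile(fs, new Path(ecDir, "foo_6m_1m_bs"), 1024,
        6 * m, m, repl, seed);
    assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_6m_1m_bs"})
        .contains("All EC block group status: OK"));
   ```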




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673291)
Time Spent: 1h 20m  (was: 1h 10m)

> Debug tool to verify the correctness of erasure coding on file
> -

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673293
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:16
Start Date: 02/Nov/21 17:16
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957960788


   Thanks for the update @cndaimin - There is just one style issue detected and 
I have one suggestion about adding another test case inside your existing test. 
Aside from that, I think this change looks good.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673293)
Time Spent: 1.5h  (was: 1h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block
> meta (checksum) may fail to discover the corruption in some cases such as EC
> reconstruction; related issues are: HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, we need a tool that checks an erasure coded file
> for data corruption in any block group under conditions other than EC
> reconstruction, or when the HDFS-15759 feature (validation during EC
> reconstruction) is not enabled (it is disabled by default).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673309
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:41
Start Date: 02/Nov/21 17:41
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#discussion_r740948534



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -260,17 +257,14 @@
 final Timer timer = new Timer();
 this.slowPeerTracker = dataNodePeerStatsEnabled ?
 new SlowPeerTracker(conf, timer) : null;
-this.excludeSlowNodesEnabled = conf.getBoolean(

Review comment:
   Is it unused? If so we can remove it from hdfs-default.xml.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673309)
Time Spent: 2h 10m  (was: 2h)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, so
> that it can be rolled back quickly if the feature
> [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076] causes
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673320&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673320
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:46
Start Date: 02/Nov/21 17:46
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957960788


   Thanks for the update @cndaimin - There is just one style issue detected and 
I have one suggestion about adding another test case inside your existing test. 
Aside from that, I think this change looks good.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673320)
Time Spent: 1h 40m  (was: 1.5h)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block
> meta (checksum) may fail to discover the corruption in some cases such as EC
> reconstruction; related issues are: HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, we need a tool that checks an erasure coded file
> for data corruption in any block group under conditions other than EC
> reconstruction, or when the HDFS-15759 feature (validation during EC
> reconstruction) is not enabled (it is disabled by default).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=673330&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673330
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:46
Start Date: 02/Nov/21 17:46
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605#issuecomment-957661245


   It seems that Jenkins did not execute successfully, but the failures appear
to have little to do with the code I submitted.
   @tasanuma @prasad-acit @tomscut, would you be willing to spend some time
reviewing this PR?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673330)
Time Spent: 40m  (was: 0.5h)

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved,
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by
> SimulatedFSDataset#Factory and
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, a stale reference to CONFIG_PROPERTY_SIMULATED is still retained in
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=673337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673337
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:47
Start Date: 02/Nov/21 17:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605#issuecomment-957588812


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 43s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 223m 16s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 319m 45s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3605 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux e0355e340cb5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d46bd52ef0cb43aea364d0b2a32f704103c0eec2 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/testReport/ |
   | Max. process+thread count | 3390 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/console |
   | versions | git=2

[jira] [Work logged] (HDFS-16291) Make the comment of INode#ReclaimContext more standardized

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16291?focusedWorklogId=673338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673338
 ]

ASF GitHub Bot logged work on HDFS-16291:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:48
Start Date: 02/Nov/21 17:48
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3602:
URL: https://github.com/apache/hadoop/pull/3602#issuecomment-956373649






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673338)
Time Spent: 1h  (was: 50m)

> Make the comment of INode#ReclaimContext more standardized
> --
>
> Key: HDFS-16291
> URL: https://issues.apache.org/jira/browse/HDFS-16291
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, namenode
>Affects Versions: 3.4.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2021-10-31-20-25-08-379.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In the INode#ReclaimContext class, some comments are not standardized enough.
> E.g.:
>  !image-2021-10-31-20-25-08-379.png! 
> We should make the comments more standardized, which will make the code more
> readable.
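
As an illustration, a sketch of the kind of standardized comment being
proposed (the wording is hypothetical, not the committed Javadoc):

```
  /**
   * Context for reclaiming resources when an INode (or a snapshot of
   * it) is deleted: collects the blocks to be removed and the inodes
   * whose quota usage must be updated afterwards.
   */
```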



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673371&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673371
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:50
Start Date: 02/Nov/21 17:50
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957445801


   > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   
   Agreed. Let's discuss in a separate JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673371)
Time Spent: 7h 40m  (was: 7.5h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a
> user submits an abnormal computation task, causing a sudden flood of requests
> that drives the queueTime and processingTime of the NameNode very high and
> creates a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but
> there is no port information, so it is sometimes difficult to locate the
> specific process. Therefore, I propose that we add the port information to
> the audit log, so that we can easily track the upstream process.
> Currently, some projects, such as HBase and Alluxio, include port information
> in their audit logs. I think it is also necessary to add port information to
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=673395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673395
 ]

ASF GitHub Bot logged work on HDFS-16269:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:53
Start Date: 02/Nov/21 17:53
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3544:
URL: https://github.com/apache/hadoop/pull/3544#issuecomment-956355572


   Merged. Thank you @jianghuazhu @ferhui @jojochuang 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673395)
Time Spent: 5h 10m  (was: 5h)

> [Fix] Improve NNThroughputBenchmark#blockReport operation
> -
>
> Key: HDFS-16269
> URL: https://issues.apache.org/jira/browse/HDFS-16269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: benchmarks, namenode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When using NNThroughputBenchmark to verify the blockReport operation, an
> exception is thrown.
> Commands used:
> ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
>  -op blockReport -datanodes 3 -reports 1
> The exception information:
> 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: 
> blockReport
> 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with 
> 10 blocks each.
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: 
> java.lang.ArrayIndexOutOfBoundsException: 50009
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Checked some code and found that the problem appeared here.
> private ExtendedBlock addBlocks(String fileName, String clientName)
>     throws IOException {
>   for (DatanodeInfo dnInfo : loc.getLocations()) {
>     int dnIdx = dnInfo.getXferPort() - 1;
>     datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock());
>   }
> }
> It can be seen from this that dnInfo.getXferPort() returns a port number,
> which should not be used as an array index.
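
A sketch of the kind of fix being described, translating each DataNode's
transfer port into its position in the datanodes array (the map-based
approach and the getXferPort() accessor on the array entries are assumptions,
not necessarily the committed patch):

```
    // Sketch: ports are arbitrary values, so map each xfer port to a
    // dense array index instead of using the port itself as the index.
    Map<Integer, Integer> portToIndex = new HashMap<>();
    for (int i = 0; i < datanodes.length; i++) {
      portToIndex.put(datanodes[i].getXferPort(), i);
    }
    for (DatanodeInfo dnInfo : loc.getLocations()) {
      int dnIdx = portToIndex.get(dnInfo.getXferPort());
      datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock());
    }
```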



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673449
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 17:58
Start Date: 02/Nov/21 17:58
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957437161


   > @ferhui @tomscut I submitted some code. Can you help review. thank you 
very much.
   
   Thanks for reminding me. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673449)
Time Spent: 2h 20m  (was: 2h 10m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, so
> that it can be rolled back quickly if the feature
> [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076] causes
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=673504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673504
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:04
Start Date: 02/Nov/21 18:04
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957045765






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673504)
Time Spent: 7h 50m  (was: 7h 40m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a
> user submits an abnormal computation task, causing a sudden flood of requests
> that drives the queueTime and processingTime of the NameNode very high and
> creates a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but
> there is no port information, so it is sometimes difficult to locate the
> specific process. Therefore, I propose that we add the port information to
> the audit log, so that we can easily track the upstream process.
> Currently, some projects, such as HBase and Alluxio, include port information
> in their audit logs. I think it is also necessary to add port information to
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673509
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:05
Start Date: 02/Nov/21 18:05
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741306774



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 6)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+cluster.waitActive();
 assertTrue(runCmd(new String[] {
 "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
 "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 6)
+  public void testVerifyECCommand() throws Exception {
+final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+cluster.waitActive();
+DistributedFileSystem fs = cluster.getFileSystem();
+
+assertEquals("ret: 1, verifyEC -file   Verify HDFS erasure coding on 
" +
+"all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+assertEquals("ret: 1, File /bar does not exist.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+fs.create(new Path("/bar")).close();
+assertEquals("ret: 1, File /bar is not erasure coded.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+
+final Path ecDir = new Path("/ec");
+fs.mkdir(ecDir, FsPermission.getDirDefault());
+fs.enableErasureCodingPolicy(ecPolicy.getName());
+fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+assertEquals("ret: 1, File /ec is not a regular file.",
+runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+fs.create(new Path(ecDir, "foo"));
+assertEquals("ret: 1, File /ec/foo is not closed.",
+runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+final short repl = 1;
+final long k = 1024;
+final long m = k * k;
+final long seed = 0x1234567L;
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, 
seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+.contains("All EC block group status: OK"));
+

Review comment:
   Could you add one more test case for a file that has multiple block 
groups, so we test the command looping over more than 1 block? You are using EC 
3-2, so write a file that is 6MB, with a 1MB block size. That should create 2 
block groups, with a length of 3MB each. Each block would then have a single 
1MB EC chunk in it. 
   
   In `DFSTestUtil` there is a method to pass the blocksize already, so the 
test would be almost the same as the ones above:
   
   ```
 public static void createFile(FileSystem fs, Path fileName, int bufferLen,
 long fileLen, long blockSize, short replFactor, long seed)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673509)
Time Spent: 1h 50m  (was: 1h 40m)

> Debug tool to verify the correctness of erasure coding on file
> -

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673512&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673512
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:05
Start Date: 02/Nov/21 18:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957941102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 13 unchanged - 
0 fixed = 14 total (was 13)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 363m 34s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 469m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8846eb1a8063 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 
06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d30b66ba08b5ad4404363477591cb1681c12cb6c |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/

[jira] [Work logged] (HDFS-16273) RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16273?focusedWorklogId=673518&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673518
 ]

ASF GitHub Bot logged work on HDFS-16273:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:06
Start Date: 02/Nov/21 18:06
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3553:
URL: https://github.com/apache/hadoop/pull/3553#issuecomment-957333859


   Hi @tasanuma, would you check this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673518)
Time Spent: 50m  (was: 40m)

> RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics
> -
>
> Key: HDFS-16273
> URL: https://issues.apache.org/jira/browse/HDFS-16273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Add the availableHandlerOnPerNs metrics to monitor whether the number of 
> handlers configured for each NS is reasonable when using 
> RouterRpcFairnessPolicyController.
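
For illustration, such a gauge could be computed from a per-nameservice permit pool. The sketch below is self-contained and hypothetical; the real metric would read the permits held by RouterRpcFairnessPolicyController.

```
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical, self-contained sketch of an availableHandlerOnPerNs gauge;
// the real metric would hook into RouterRpcFairnessPolicyController's permits.
public class PerNsHandlerGauge {
  private final Map<String, Semaphore> permitsPerNs = new ConcurrentHashMap<>();

  void register(String nsId, int handlers) {
    permitsPerNs.put(nsId, new Semaphore(handlers));
  }

  // Gauge value per nameservice: how many handler permits are still free.
  int availableHandlerOnPerNs(String nsId) {
    return permitsPerNs.get(nsId).availablePermits();
  }

  public static void main(String[] args) throws InterruptedException {
    PerNsHandlerGauge gauge = new PerNsHandlerGauge();
    gauge.register("ns0", 8);
    gauge.permitsPerNs.get("ns0").acquire(3); // three handlers busy on ns0
    System.out.println(gauge.availableHandlerOnPerNs("ns0")); // prints 5
  }
}
```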



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=673530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673530
 ]

ASF GitHub Bot logged work on HDFS-16269:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:07
Start Date: 02/Nov/21 18:07
Worklog Time Spent: 10m 
  Work Description: aajisaka merged pull request #3544:
URL: https://github.com/apache/hadoop/pull/3544


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673530)
Time Spent: 5h 20m  (was: 5h 10m)

> [Fix] Improve NNThroughputBenchmark#blockReport operation
> -
>
> Key: HDFS-16269
> URL: https://issues.apache.org/jira/browse/HDFS-16269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: benchmarks, namenode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> When using NNThroughputBenchmark to verify the blockReport operation, you get 
> the following exception.
> Commands used:
> ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
>  -op blockReport -datanodes 3 -reports 1
> The exception information:
> 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: 
> blockReport
> 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with 
> 10 blocks each.
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: 
> java.lang.ArrayIndexOutOfBoundsException: 50009
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Checking the code shows that the problem appears here:
> private ExtendedBlock addBlocks(String fileName, String clientName)
>     throws IOException {
>   for (DatanodeInfo dnInfo : loc.getLocations()) {
>     int dnIdx = dnInfo.getXferPort() - 1;
>     datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock());
>   }
> }
> It can be seen that dnInfo.getXferPort() returns a port number, which should 
> not be used as an array index.
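
As a hedged sketch of the fix direction (names and structure assumed, not the actual patch): keep an explicit port-to-index map instead of deriving the array index from the raw transfer port.

```
import java.util.HashMap;
import java.util.Map;

// Self-contained sketch of the bug and one possible fix; TinyDatanode and
// the port values are stand-ins for the benchmark's internal datanode array.
public class PortIndexSketch {
  static class TinyDatanode {
    final int xferPort;
    int blockCount;
    TinyDatanode(int xferPort) { this.xferPort = xferPort; }
  }

  public static void main(String[] args) {
    TinyDatanode[] datanodes = new TinyDatanode[3];
    Map<Integer, Integer> portToIndex = new HashMap<>();
    for (int i = 0; i < datanodes.length; i++) {
      datanodes[i] = new TinyDatanode(50010 + i); // realistic xfer ports
      portToIndex.put(datanodes[i].xferPort, i);  // remember port -> index
    }
    int reportedPort = 50012;
    // Buggy pattern: datanodes[reportedPort - 1] throws
    // ArrayIndexOutOfBoundsException, as in the stack trace above.
    // Safe pattern: translate the port back to the array index first.
    int dnIdx = portToIndex.get(reportedPort);
    datanodes[dnIdx].blockCount++;
    System.out.println("Added block to datanode index " + dnIdx);
  }
}
```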



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=673563&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673563
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:11
Start Date: 02/Nov/21 18:11
Worklog Time Spent: 10m 
  Work Description: jianghuazhu opened a new pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605


   ### Description of PR
   As early as when HDFS-2907 was resolved, 
SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
SimulatedFSDataset#Factory and DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
   However, the reference to CONFIG_PROPERTY_SIMULATED is still retained in 
the DataNode.
   Details: HDFS-16294
   
   ### How was this patch tested?
   This PR mainly changes code comments, so the testing burden is minimal.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673563)
Time Spent: 1h  (was: 50m)

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed and replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, the reference to CONFIG_PROPERTY_SIMULATED is still retained in 
> the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673575
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:13
Start Date: 02/Nov/21 18:13
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#discussion_r741006484



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -511,7 +505,16 @@ private boolean isInactive(DatanodeInfo datanode) {
   private boolean isSlowNode(String dnUuid) {
 return avoidSlowDataNodesForRead && slowNodesUuidSet.contains(dnUuid);
   }
-  
+
+  public void setAvoidSlowDataNodesForReadEnabled(boolean enable) {

Review comment:
   We might need to check whether the slowPeerTracker has started; otherwise we 
might not get the slow peers.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -260,17 +257,14 @@
 final Timer timer = new Timer();
 this.slowPeerTracker = dataNodePeerStatsEnabled ?
 new SlowPeerTracker(conf, timer) : null;
-this.excludeSlowNodesEnabled = conf.getBoolean(
-DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_KEY,
-DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_DEFAULT);
 this.maxSlowPeerReportNodes = conf.getInt(
 DFSConfigKeys.DFS_NAMENODE_MAX_SLOWPEER_COLLECT_NODES_KEY,
 DFSConfigKeys.DFS_NAMENODE_MAX_SLOWPEER_COLLECT_NODES_DEFAULT);
 this.slowPeerCollectionInterval = conf.getTimeDuration(
 DFSConfigKeys.DFS_NAMENODE_SLOWPEER_COLLECT_INTERVAL_KEY,
 DFSConfigKeys.DFS_NAMENODE_SLOWPEER_COLLECT_INTERVAL_DEFAULT,
 TimeUnit.MILLISECONDS);
-if (slowPeerTracker != null && excludeSlowNodesEnabled) {

Review comment:
   If this change is made, the SlowPeerCollector thread will be started 
regardless of whether we enable this feature.
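
For context, the runtime toggle under discussion boils down to a flag that the read path checks and an admin reconfiguration call flips. A minimal, self-contained sketch (method names mirror the diff above; the wiring through the NameNode's reconfiguration hook is assumed, not shown):

```
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the toggle discussed above; the real code lives in
// DatanodeManager and is wired through the NameNode's reconfiguration hook.
public class AvoidSlowNodesToggle {
  private final AtomicBoolean avoidSlowDataNodesForRead = new AtomicBoolean(false);

  // Invoked from the (assumed) reconfigurePropertyImpl handler.
  public void setAvoidSlowDataNodesForReadEnabled(boolean enable) {
    avoidSlowDataNodesForRead.set(enable);
  }

  // Read path: skip a node only if the feature is on AND the node is slow.
  public boolean isSlowNode(String dnUuid, Set<String> slowNodesUuidSet) {
    return avoidSlowDataNodesForRead.get() && slowNodesUuidSet.contains(dnUuid);
  }

  public static void main(String[] args) {
    AvoidSlowNodesToggle toggle = new AvoidSlowNodesToggle();
    Set<String> slow = Set.of("dn-uuid-1");
    System.out.println(toggle.isSlowNode("dn-uuid-1", slow)); // false: feature off
    toggle.setAvoidSlowDataNodesForReadEnabled(true);         // admin flips the flag
    System.out.println(toggle.isSlowNode("dn-uuid-1", slow)); // true: feature on
  }
}
```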




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673575)
Time Spent: 2.5h  (was: 2h 20m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673584
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:14
Start Date: 02/Nov/21 18:14
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740842196



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file ",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long m

[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673594&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673594
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:15
Start Date: 02/Nov/21 18:15
Worklog Time Spent: 10m 
  Work Description: haiyang1987 commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957160686


   @ferhui @tomscut I submitted some code. Could you help review it?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673594)
Time Spent: 2h 40m  (was: 2.5h)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16259) Catch and re-throw sub-classes of AccessControlException thrown by any permission provider plugins (eg Ranger)

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16259?focusedWorklogId=673600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673600
 ]

ASF GitHub Bot logged work on HDFS-16259:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:16
Start Date: 02/Nov/21 18:16
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3598:
URL: https://github.com/apache/hadoop/pull/3598


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673600)
Time Spent: 50m  (was: 40m)

> Catch and re-throw sub-classes of AccessControlException thrown by any 
> permission provider plugins (eg Ranger)
> --
>
> Key: HDFS-16259
> URL: https://issues.apache.org/jira/browse/HDFS-16259
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When a permission provider plugin is enabled (e.g. Ranger), there are some 
> scenarios where it can throw a sub-class of AccessControlException (e.g. 
> RangerAccessControlException). If this exception is allowed to propagate up 
> the stack, it can cause problems in the HDFS client when it unwraps the 
> remote exception containing the AccessControlException sub-class.
> Ideally, we would make AccessControlException final so it cannot be 
> sub-classed, but that would be a breaking change at this point. Therefore I 
> believe the safest thing to do is to catch any AccessControlException that 
> comes out of the permission enforcer plugin and re-throw a plain 
> AccessControlException instead.
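
A minimal sketch of the catch-and-rethrow pattern described above, assuming a hypothetical PermissionCheck wrapper around the enforcer call (illustrative only; the actual patch sits in the NameNode's permission checking path):

```
import org.apache.hadoop.security.AccessControlException;

// Illustrative sketch only; PermissionCheck is a hypothetical stand-in for
// the call into the permission enforcer plugin (e.g. Ranger).
interface PermissionCheck {
  void run() throws AccessControlException;
}

final class SafePermissionCheck {
  static void check(PermissionCheck plugin) throws AccessControlException {
    try {
      plugin.run(); // may throw a sub-class such as RangerAccessControlException
    } catch (AccessControlException ace) {
      // Re-throw as the base class so client-side unwrapping always succeeds.
      throw new AccessControlException(ace.getMessage());
    }
  }
}
```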



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673661&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673661
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:22
Start Date: 02/Nov/21 18:22
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957273204


   @sodonnel Thanks for your review. 
   Update: I have fixed the review comments and added some tests in 
`TestDebugAdmin#testVerifyECCommand`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673661)
Time Spent: 2h 20m  (was: 2h 10m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) is unable to discover the corruption in some cases, such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).
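
Conceptually, such a tool reads all blocks of a block group, re-computes the parity from the data units, and compares it with the stored parity. A toy, self-contained illustration using single-parity XOR (the real verifyEC command decodes with the file's configured RS codec via RawErasureDecoder instead):

```
import java.util.Arrays;

// Toy XOR-parity illustration of EC verification; the actual tool uses the
// file's erasure coding policy and a RawErasureDecoder rather than XOR.
public class XorVerifySketch {
  static byte[] xorParity(byte[][] dataUnits) {
    byte[] parity = new byte[dataUnits[0].length];
    for (byte[] unit : dataUnits) {
      for (int i = 0; i < parity.length; i++) {
        parity[i] ^= unit[i];
      }
    }
    return parity;
  }

  public static void main(String[] args) {
    byte[][] data = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
    byte[] storedParity = xorParity(data); // what a parity block would hold
    data[1][2] = 42;                       // simulate silent data corruption
    boolean ok = Arrays.equals(storedParity, xorParity(data));
    System.out.println(ok ? "Status: OK" : "Status: ERROR (parity mismatch)");
  }
}
```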



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673676
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 18:24
Start Date: 02/Nov/21 18:24
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957338822


   @haiyang1987 Thanks for the contribution. Some comments:
   We can change the title here and in the Jira if 
dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is not 
reconfigurable.
   I will also check whether 
dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is unused.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673676)
Time Spent: 2h 50m  (was: 2h 40m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=673893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673893
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:17
Start Date: 02/Nov/21 21:17
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#discussion_r740948534



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -260,17 +257,14 @@
 final Timer timer = new Timer();
 this.slowPeerTracker = dataNodePeerStatsEnabled ?
 new SlowPeerTracker(conf, timer) : null;
-this.excludeSlowNodesEnabled = conf.getBoolean(

Review comment:
   Is it unused? If so we can remove it from hdfs-default.xml.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673893)
Time Spent: 3h  (was: 2h 50m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=673905&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673905
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:19
Start Date: 02/Nov/21 21:19
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957960788


   Thanks for the update @cndaimin - there is just one style issue detected, and 
I have one suggestion about adding another test case inside your existing test. 
Aside from that, I think this change looks good.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673905)
Time Spent: 2.5h  (was: 2h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) is unable to discover the corruption in some cases, such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16291) Make the comment of INode#ReclaimContext more standardized

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16291?focusedWorklogId=673921&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-673921
 ]

ASF GitHub Bot logged work on HDFS-16291:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:20
Start Date: 02/Nov/21 21:20
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3602:
URL: https://github.com/apache/hadoop/pull/3602#issuecomment-957340418


   Hi @jojochuang @ferhui @Hexiaoqiao, would you be willing to spend some time 
helping review this PR?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 673921)
Time Spent: 1h 10m  (was: 1h)

> Make the comment of INode#ReclaimContext more standardized
> --
>
> Key: HDFS-16291
> URL: https://issues.apache.org/jira/browse/HDFS-16291
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, namenode
>Affects Versions: 3.4.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2021-10-31-20-25-08-379.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In the INode#ReclaimContext class, some comments are not sufficiently 
> standardized.
> E.g.:
>  !image-2021-10-31-20-25-08-379.png! 
> We should make the comments more standardized, which will improve readability.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=674043&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674043
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:33
Start Date: 02/Nov/21 21:33
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605#issuecomment-957588812


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  5s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 43s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 223m 16s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 319m 45s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3605 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux e0355e340cb5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d46bd52ef0cb43aea364d0b2a32f704103c0eec2 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/testReport/ |
   | Max. process+thread count | 3390 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3605/1/console |
   | versions | git=2

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674046
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:34
Start Date: 02/Nov/21 21:34
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957273204


   @sodonnel Thanks for your review. 
   Update: I have fixed the review comments and added some tests in 
`TestDebugAdmin#testVerifyECCommand`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674046)
Time Spent: 2h 40m  (was: 2.5h)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) is unable to discover the corruption in some cases, such as EC 
> reconstruction; related issues are HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=674048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674048
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:34
Start Date: 02/Nov/21 21:34
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957045765






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674048)
Time Spent: 8h  (was: 7h 50m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task that triggers a sudden flood of 
> requests; the queueTime and processingTime of the NameNode then rise very 
> high, creating a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is sometimes difficult to locate specific 
> processes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Some projects, such as HBase and Alluxio, already include port information in 
> their audit logs. I think it is also necessary to add port information to the 
> HDFS audit logs.
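
For illustration, an audit line carrying the remote port could look like the sketch below. The format and helper are assumptions for illustration; the real HDFS audit logger assembles its line inside the NameNode.

```
import java.net.InetSocketAddress;

// Hypothetical audit-line format showing ip plus port; the field names follow
// the HDFS audit log style, but the helper itself is illustrative only.
public class AuditLineSketch {
  static String auditLine(InetSocketAddress remote, String ugi,
                          String cmd, String src) {
    // ip=10.0.0.5:39482 lets operators trace the client process, not just the host
    return String.format("ugi=%s\tip=%s:%d\tcmd=%s\tsrc=%s",
        ugi, remote.getAddress().getHostAddress(), remote.getPort(), cmd, src);
  }

  public static void main(String[] args) {
    System.out.println(auditLine(
        new InetSocketAddress("10.0.0.5", 39482), "alice", "create", "/tmp/f"));
  }
}
```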



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16273) RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16273?focusedWorklogId=674060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674060
 ]

ASF GitHub Bot logged work on HDFS-16273:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:35
Start Date: 02/Nov/21 21:35
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3553:
URL: https://github.com/apache/hadoop/pull/3553#issuecomment-957333859


   Hi @tasanuma, would you check this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674060)
Time Spent: 1h  (was: 50m)

> RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics
> -
>
> Key: HDFS-16273
> URL: https://issues.apache.org/jira/browse/HDFS-16273
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Xiangyi Zhu
>Assignee: Xiangyi Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Add the availableHandlerOnPerNs metrics to monitor whether the number of 
> handlers configured for each NS is reasonable when using 
> RouterRpcFairnessPolicyController.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674056&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674056
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:35
Start Date: 02/Nov/21 21:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-957941102


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 13 unchanged - 
0 fixed = 14 total (was 13)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 363m 34s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 469m 10s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3593/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3593 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8846eb1a8063 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 
06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d30b66ba08b5ad4404363477591cb1681c12cb6c |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674089
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:39
Start Date: 02/Nov/21 21:39
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r740842196



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
##
@@ -387,6 +414,211 @@ int run(List args) throws IOException {
 }
   }
 
+  /**
+   * The command for verifying the correctness of erasure coding on an erasure 
coded file.
+   */
+  private class VerifyECCommand extends DebugCommand {
+private DFSClient client;
+private int dataBlkNum;
+private int parityBlkNum;
+private int cellSize;
+private boolean useDNHostname;
+private CachingStrategy cachingStrategy;
+private int stripedReadBufferSize;
+private CompletionService readService;
+private RawErasureDecoder decoder;
+private BlockReader[] blockReaders;
+
+
+VerifyECCommand() {
+  super("verifyEC",
+  "verifyEC -file ",
+  "  Verify HDFS erasure coding on all block groups of the file.");
+}
+
+int run(List args) throws IOException {
+  if (args.size() < 2) {
+System.out.println(usageText);
+System.out.println(helpText + System.lineSeparator());
+return 1;
+  }
+  String file = StringUtils.popOptionWithArgument("-file", args);
+  Path path = new Path(file);
+  DistributedFileSystem dfs = AdminHelper.getDFS(getConf());
+  this.client = dfs.getClient();
+
+  FileStatus fileStatus;
+  try {
+fileStatus = dfs.getFileStatus(path);
+  } catch (FileNotFoundException e) {
+System.err.println("File " + file + " does not exist.");
+return 1;
+  }
+
+  if (!fileStatus.isFile()) {
+System.err.println("File " + file + " is not a regular file.");
+return 1;
+  }
+  if (!dfs.isFileClosed(path)) {
+System.err.println("File " + file + " is not closed.");
+return 1;
+  }
+  this.useDNHostname = 
getConf().getBoolean(DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME,
+  DFSConfigKeys.DFS_DATANODE_USE_DN_HOSTNAME_DEFAULT);
+  this.cachingStrategy = CachingStrategy.newDefaultStrategy();
+  this.stripedReadBufferSize = getConf().getInt(
+  DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_KEY,
+  
DFSConfigKeys.DFS_DN_EC_RECONSTRUCTION_STRIPED_READ_BUFFER_SIZE_DEFAULT);
+
+  LocatedBlocks locatedBlocks = client.getLocatedBlocks(file, 0, 
fileStatus.getLen());
+  if (locatedBlocks.getErasureCodingPolicy() == null) {
+System.err.println("File " + file + " is not erasure coded.");
+return 1;
+  }
+  ErasureCodingPolicy ecPolicy = locatedBlocks.getErasureCodingPolicy();
+  this.dataBlkNum = ecPolicy.getNumDataUnits();
+  this.parityBlkNum = ecPolicy.getNumParityUnits();
+  this.cellSize = ecPolicy.getCellSize();
+  this.decoder = CodecUtil.createRawDecoder(getConf(), 
ecPolicy.getCodecName(),
+  new ErasureCoderOptions(
+  ecPolicy.getNumDataUnits(), ecPolicy.getNumParityUnits()));
+  int blockNum = dataBlkNum + parityBlkNum;
+  this.readService = new ExecutorCompletionService<>(
+  DFSUtilClient.getThreadPoolExecutor(blockNum, blockNum, 60,
+  new LinkedBlockingQueue<>(), "read-", false));
+  this.blockReaders = new BlockReader[dataBlkNum + parityBlkNum];
+
+  for (LocatedBlock locatedBlock : locatedBlocks.getLocatedBlocks()) {
+System.out.println("Checking EC block group: blk_" + 
locatedBlock.getBlock().getBlockId());
+LocatedStripedBlock blockGroup = (LocatedStripedBlock) locatedBlock;
+
+try {
+  verifyBlockGroup(blockGroup);
+  System.out.println("Status: OK");
+} catch (Exception e) {
+  System.err.println("Status: ERROR, message: " + e.getMessage());
+  return 1;
+} finally {
+  closeBlockReaders();
+}
+  }
+  System.out.println("\nAll EC block group status: OK");
+  return 0;
+}
+
+private void verifyBlockGroup(LocatedStripedBlock blockGroup) throws 
Exception {
+  final LocatedBlock[] indexedBlocks = 
StripedBlockUtil.parseStripedBlockGroup(blockGroup,
+  cellSize, dataBlkNum, parityBlkNum);
+
+  int blockNumExpected = Math.min(dataBlkNum,
+  (int) ((blockGroup.getBlockSize() - 1) / cellSize + 1)) + 
parityBlkNum;
+  if (blockGroup.getBlockIndices().length < blockNumExpected) {
+throw new Exception("Block group is under-erasure-coded.");
+  }
+
+  long m

[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=674104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674104
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:41
Start Date: 02/Nov/21 21:41
Worklog Time Spent: 10m 
  Work Description: tomscut commented on a change in pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#discussion_r741006484



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -511,7 +505,16 @@ private boolean isInactive(DatanodeInfo datanode) {
   private boolean isSlowNode(String dnUuid) {
 return avoidSlowDataNodesForRead && slowNodesUuidSet.contains(dnUuid);
   }
-  
+
+  public void setAvoidSlowDataNodesForReadEnabled(boolean enable) {

Review comment:
   We might need to check whether the slowPeerTracker has started; otherwise we 
might not get the slow peers.

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##
@@ -260,17 +257,14 @@
 final Timer timer = new Timer();
 this.slowPeerTracker = dataNodePeerStatsEnabled ?
 new SlowPeerTracker(conf, timer) : null;
-this.excludeSlowNodesEnabled = conf.getBoolean(
-DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_KEY,
-DFS_NAMENODE_BLOCKPLACEMENTPOLICY_EXCLUDE_SLOW_NODES_ENABLED_DEFAULT);
 this.maxSlowPeerReportNodes = conf.getInt(
 DFSConfigKeys.DFS_NAMENODE_MAX_SLOWPEER_COLLECT_NODES_KEY,
 DFSConfigKeys.DFS_NAMENODE_MAX_SLOWPEER_COLLECT_NODES_DEFAULT);
 this.slowPeerCollectionInterval = conf.getTimeDuration(
 DFSConfigKeys.DFS_NAMENODE_SLOWPEER_COLLECT_INTERVAL_KEY,
 DFSConfigKeys.DFS_NAMENODE_SLOWPEER_COLLECT_INTERVAL_DEFAULT,
 TimeUnit.MILLISECONDS);
-if (slowPeerTracker != null && excludeSlowNodesEnabled) {

Review comment:
   If this change is made, the SlowPeerCollector thread will be started 
regardless of whether we enable this feature.
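   
   For illustration, the guard being defended looks roughly like this (a 
standalone sketch, not the actual DatanodeManager code):
   
   ```java
   class SlowPeerCollectorGuard {
     private Thread slowPeerCollectorDaemon;
   
     SlowPeerCollectorGuard(boolean dataNodePeerStatsEnabled) {
       // Start the collector only when peer stats are enabled, so the thread
       // is not spawned on clusters that never use the feature.
       if (dataNodePeerStatsEnabled) {
         startSlowPeerCollector();
       }
     }
   
     private void startSlowPeerCollector() {
       slowPeerCollectorDaemon = new Thread(() -> {
         // Periodically poll DataNode peer latency reports here.
       });
       slowPeerCollectorDaemon.setDaemon(true);
       slowPeerCollectorDaemon.start();
     }
   }
   ```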




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674104)
Time Spent: 3h 10m  (was: 3h)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.
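
For context: once the property is registered with the NameNode's 
ReconfigurationProtocol, the rollback described in point 1 needs no restart; 
the operator edits hdfs-site.xml and runs 
`hdfs dfsadmin -reconfig namenode <host:ipc_port> start`, then polls progress 
with `hdfs dfsadmin -reconfig namenode <host:ipc_port> status`.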



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674054
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:34
Start Date: 02/Nov/21 21:34
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741306774



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 6)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+cluster.waitActive();
 assertTrue(runCmd(new String[] {
 "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
 "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 6)
+  public void testVerifyECCommand() throws Exception {
+final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+cluster.waitActive();
+DistributedFileSystem fs = cluster.getFileSystem();
+
+assertEquals("ret: 1, verifyEC -file   Verify HDFS erasure coding on 
" +
+"all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+assertEquals("ret: 1, File /bar does not exist.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+fs.create(new Path("/bar")).close();
+assertEquals("ret: 1, File /bar is not erasure coded.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+
+final Path ecDir = new Path("/ec");
+fs.mkdir(ecDir, FsPermission.getDirDefault());
+fs.enableErasureCodingPolicy(ecPolicy.getName());
+fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+assertEquals("ret: 1, File /ec is not a regular file.",
+runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+fs.create(new Path(ecDir, "foo"));
+assertEquals("ret: 1, File /ec/foo is not closed.",
+runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+final short repl = 1;
+final long k = 1024;
+final long m = k * k;
+final long seed = 0x1234567L;
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+.contains("All EC block group status: OK"));
+

Review comment:
   Could you add one more test case for a file that has multiple block 
groups, so we test the command looping over more than 1 block? You are using EC 
3-2, so write a file that is 6MB, with a 1MB block size. That should create 2 
block groups, with a length of 3MB each. Each block would then have a single 
1MB EC chunk in it. 
   
   In `DFSTestUtil` there is a method to pass the blocksize already, so the 
test would be almost the same as the ones above:
   
   ```
 public static void createFile(FileSystem fs, Path fileName, int bufferLen,
 long fileLen, long blockSize, short replFactor, long seed)
   ```
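   
   Roughly, following that suggestion, the added case might look like this 
(a sketch reusing the test's existing `ecDir`, `m`, `repl` and `seed`; the 
file name is made up):
   
   ```java
   // 6MB RS-3-2 file with a 1MB block size: two block groups of 3MB each.
   DFSTestUtil.createFile(fs, new Path(ecDir, "foo_6m_1m_bs"), 1024,
       6 * m, m, repl, seed);
   assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_6m_1m_bs"})
       .contains("All EC block group status: OK"));
   ```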




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674054)
Time Spent: 2h 50m  (was: 2h 40m)

> Debug tool to verify the correctness of erasure coding on file
> -

[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=674122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674122
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:43
Start Date: 02/Nov/21 21:43
Worklog Time Spent: 10m 
  Work Description: haiyang1987 commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957160686


   @ferhui @tomscut I submitted some code. Could you help review it?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674122)
Time Spent: 3h 20m  (was: 3h 10m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16259) Catch and re-throw sub-classes of AccessControlException thrown by any permission provider plugins (eg Ranger)

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16259?focusedWorklogId=674129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674129
 ]

ASF GitHub Bot logged work on HDFS-16259:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:44
Start Date: 02/Nov/21 21:44
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3598:
URL: https://github.com/apache/hadoop/pull/3598


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674129)
Time Spent: 1h  (was: 50m)

> Catch and re-throw sub-classes of AccessControlException thrown by any 
> permission provider plugins (eg Ranger)
> --
>
> Key: HDFS-16259
> URL: https://issues.apache.org/jira/browse/HDFS-16259
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a permission provider plugin is enabled (eg Ranger) there are some 
> scenarios where it can throw a sub-class of an AccessControlException (eg 
> RangerAccessControlException). If this exception is allowed to propagate up 
> the stack, it can give problems in the HDFS Client, when it unwraps the 
> remote exception containing the AccessControlException sub-class.
> Ideally, we should make AccessControlException final so it cannot be 
> sub-classed, but that would be a breaking change at this point. Therefore I 
> believe the safest thing to do, is to catch any AccessControlException that 
> comes out of the permission enforcer plugin, and re-throw an 
> AccessControlException instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=674158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674158
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:46
Start Date: 02/Nov/21 21:46
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-957445801


   > I will make some changes in the other JIRA if necessary, because this 
might involve the callContext from Router. What do you think of this? Thank you.
   
   Agreed. Let's discuss in a separate JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674158)
Time Spent: 8h 10m  (was: 8h)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests that drives the queueTime and processingTime of the NameNode very 
> high and creates a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is sometimes difficult to locate specific 
> processes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects include port information in audit logs, such as 
> HBase and Alluxio. I think it is also necessary to add port information to 
> HDFS audit logs.
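
As an illustration of the proposal (a made-up sample, not output from the 
patch), an audit entry that today records only the client IP would also carry 
the client's ephemeral port:

```
allowed=true  ugi=spark (auth:SIMPLE)  ip=/10.0.0.12        cmd=getfileinfo  src=/warehouse/t1  dst=null  perm=null  proto=rpc
allowed=true  ugi=spark (auth:SIMPLE)  ip=/10.0.0.12:39644  cmd=getfileinfo  src=/warehouse/t1  dst=null  perm=null  proto=rpc
```

With the port present, the upstream process can be matched directly against 
netstat output or container allocation logs.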



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=674168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674168
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:47
Start Date: 02/Nov/21 21:47
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957437161


   > @ferhui @tomscut I submitted some code. Could you help review it? Thank 
you very much.
   
   Thanks for reminding me. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674168)
Time Spent: 3.5h  (was: 3h 20m)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=674165&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674165
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:47
Start Date: 02/Nov/21 21:47
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605#issuecomment-957661245


   It seems that Jenkins did not execute successfully, but this appears to 
have little to do with the code I submitted.
   @tasanuma @prasad-acit @tomscut, would you be willing to spend some time 
reviewing this PR?
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674165)
Time Spent: 1h 20m  (was: 1h 10m)

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed, replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, the comment referring to CONFIG_PROPERTY_SIMULATED is still 
> retained in the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16287) Support to make dfs.namenode.avoid.read.slow.datanode reconfigurable

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16287?focusedWorklogId=674197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674197
 ]

ASF GitHub Bot logged work on HDFS-16287:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:50
Start Date: 02/Nov/21 21:50
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3596:
URL: https://github.com/apache/hadoop/pull/3596#issuecomment-957338822


   @haiyang1987 Thanks for the contribution, some comments:
   We can change the title here and in the jira, if 
dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is not 
reconfigurable.
   And I will check whether 
dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is unused.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674197)
Time Spent: 3h 40m  (was: 3.5h)

> Support to make dfs.namenode.avoid.read.slow.datanode  reconfigurable
> -
>
> Key: HDFS-16287
> URL: https://issues.apache.org/jira/browse/HDFS-16287
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> 1. Consider making dfs.namenode.avoid.read.slow.datanode reconfigurable, to 
> allow rapid rollback in case this feature 
> ([HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076]) causes 
> unexpected problems in a production environment.
> 2. Control DatanodeManager#startSlowPeerCollector via the parameter 
> 'dfs.datanode.peer.stats.enabled'.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16294) Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16294?focusedWorklogId=674201&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674201
 ]

ASF GitHub Bot logged work on HDFS-16294:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 21:51
Start Date: 02/Nov/21 21:51
Worklog Time Spent: 10m 
  Work Description: jianghuazhu opened a new pull request #3605:
URL: https://github.com/apache/hadoop/pull/3605


   ### Description of PR
   As early as when HDFS-2907 was resolved, 
SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed, replaced by 
SimulatedFSDataset#Factory and DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
   However, the comment referring to CONFIG_PROPERTY_SIMULATED is still 
retained in the DataNode.
   Details: HDFS-16294
   
   ### How was this patch tested?
   This PR mainly involves changes to code comments, so the testing burden is 
small.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674201)
Time Spent: 1.5h  (was: 1h 20m)

> Remove invalid DataNode#CONFIG_PROPERTY_SIMULATED
> -
>
> Key: HDFS-16294
> URL: https://issues.apache.org/jira/browse/HDFS-16294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As early as when HDFS-2907 was resolved, 
> SimulatedFSDataset#CONFIG_PROPERTY_SIMULATED was removed, replaced by 
> SimulatedFSDataset#Factory and 
> DFSConfigKeys#DFS_DATANODE_FSDATASET_FACTORY_KEY.
> However, the comment referring to CONFIG_PROPERTY_SIMULATED is still 
> retained in the DataNode.
>  !screenshot.png! 
> Here are some traces related to HDFS-2907:
> https://issues.apache.org/jira/browse/HDFS-2907
> https://github.com/apache/hadoop/commit/efbc58f30c8e8d9f26c6a82d32d53716fb2b222a#diff-ab77612831fcb9a35e14c294417f0919c7a30c0cef9a4aec6b32d5f2df957020



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16283) RBF: improve renewLease() to call only a specific NameNode rather than make fan-out calls

2021-11-02 Thread Aihua Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HDFS-16283:

Attachment: RBF_ improve renewLease() to call only a specific NameNode 
rather than make fan-out calls.pdf

> RBF: improve renewLease() to call only a specific NameNode rather than make 
> fan-out calls
> -
>
> Key: HDFS-16283
> URL: https://issues.apache.org/jira/browse/HDFS-16283
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: RBF_ improve renewLease() to call only a specific 
> NameNode rather than make fan-out calls.pdf
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently renewLease() against a router will fan out to all the 
> NameNodes. Since the renewLease() call is so frequent, if one of the 
> NameNodes is slow, the router queues eventually get blocked by renewLease() 
> calls, causing router degradation. 
> We will make a change on the client side to keep track of the NameNode Id in 
> addition to the current fileId, so routers understand which NameNodes the 
> client is renewing leases against.
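
A minimal sketch of the client-side bookkeeping described above, with 
hypothetical names (not the actual DFSClient/LeaseRenewer code):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class LeaseTargetTracker {
  // fileId -> namespace (NameNode Id), recorded when the file is opened.
  private final Map<Long, String> fileToNamespace = new ConcurrentHashMap<>();

  void onFileOpened(long fileId, String namespace) {
    fileToNamespace.put(fileId, namespace);
  }

  void onFileClosed(long fileId) {
    fileToNamespace.remove(fileId);
  }

  // Only these namespaces need a renewLease() call, so the router can
  // target specific NameNodes instead of fanning out to all of them.
  Set<String> namespacesToRenew() {
    return new HashSet<>(fileToNamespace.values());
  }
}
```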



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16283) RBF: improve renewLease() to call only a specific NameNode rather than make fan-out calls

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16283?focusedWorklogId=674228&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674228
 ]

ASF GitHub Bot logged work on HDFS-16283:
-

Author: ASF GitHub Bot
Created on: 02/Nov/21 22:20
Start Date: 02/Nov/21 22:20
Worklog Time Spent: 10m 
  Work Description: aihuaxu commented on pull request #3595:
URL: https://github.com/apache/hadoop/pull/3595#issuecomment-958268688


   Please take a look at the simple design doc I posted in the jira and let me 
know your thoughts. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674228)
Time Spent: 1h 40m  (was: 1.5h)

> RBF: improve renewLease() to call only a specific NameNode rather than make 
> fan-out calls
> -
>
> Key: HDFS-16283
> URL: https://issues.apache.org/jira/browse/HDFS-16283
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: RBF_ improve renewLease() to call only a specific 
> NameNode rather than make fan-out calls.pdf
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently renewLease() against a router will fan out to all the 
> NameNodes. Since the renewLease() call is so frequent, if one of the 
> NameNodes is slow, the router queues eventually get blocked by renewLease() 
> calls, causing router degradation. 
> We will make a change on the client side to keep track of the NameNode Id in 
> addition to the current fileId, so routers understand which NameNodes the 
> client is renewing leases against.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674337
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 02:38
Start Date: 03/Nov/21 02:38
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on a change in pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#discussion_r741582271



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDebugAdmin.java
##
@@ -166,8 +179,91 @@ public void testComputeMetaCommand() throws Exception {
 
   @Test(timeout = 6)
   public void testRecoverLeaseforFileNotFound() throws Exception {
+cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
+cluster.waitActive();
 assertTrue(runCmd(new String[] {
 "recoverLease", "-path", "/foo", "-retries", "2" }).contains(
 "Giving up on recoverLease for /foo after 1 try"));
   }
+
+  @Test(timeout = 6)
+  public void testVerifyECCommand() throws Exception {
+final ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByID(
+SystemErasureCodingPolicies.RS_3_2_POLICY_ID);
+cluster = DFSTestUtil.setupCluster(conf, 6, 5, 0);
+cluster.waitActive();
+DistributedFileSystem fs = cluster.getFileSystem();
+
+assertEquals("ret: 1, verifyEC -file   Verify HDFS erasure coding on 
" +
+"all block groups of the file.", runCmd(new String[]{"verifyEC"}));
+
+assertEquals("ret: 1, File /bar does not exist.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+fs.create(new Path("/bar")).close();
+assertEquals("ret: 1, File /bar is not erasure coded.",
+runCmd(new String[]{"verifyEC", "-file", "/bar"}));
+
+
+final Path ecDir = new Path("/ec");
+fs.mkdir(ecDir, FsPermission.getDirDefault());
+fs.enableErasureCodingPolicy(ecPolicy.getName());
+fs.setErasureCodingPolicy(ecDir, ecPolicy.getName());
+
+assertEquals("ret: 1, File /ec is not a regular file.",
+runCmd(new String[]{"verifyEC", "-file", "/ec"}));
+
+fs.create(new Path(ecDir, "foo"));
+assertEquals("ret: 1, File /ec/foo is not closed.",
+runCmd(new String[]{"verifyEC", "-file", "/ec/foo"}));
+
+final short repl = 1;
+final long k = 1024;
+final long m = k * k;
+final long seed = 0x1234567L;
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_65535"), 65535, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_65535"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_256k"), 256 * k, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_256k"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_1m"), m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_1m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_2m"), 2 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_2m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_3m"), 3 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_3m"})
+.contains("All EC block group status: OK"));
+DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
+assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
+.contains("All EC block group status: OK"));
+

Review comment:
   Thanks, that's a good advice, updated.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674337)
Time Spent: 3h 20m  (was: 3h 10m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) is unable to discover the corrup

[jira] [Work logged] (HDFS-16286) Debug tool to verify the correctness of erasure coding on file

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16286?focusedWorklogId=674341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674341
 ]

ASF GitHub Bot logged work on HDFS-16286:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 02:44
Start Date: 03/Nov/21 02:44
Worklog Time Spent: 10m 
  Work Description: cndaimin commented on pull request #3593:
URL: https://github.com/apache/hadoop/pull/3593#issuecomment-958610440


   @sodonnel Thanks for your review. 
   Update: Removed the unused import and added a test verifying a file with 2 
block groups.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674341)
Time Spent: 3.5h  (was: 3h 20m)

> Debug tool to verify the correctness of erasure coding on file
> --
>
> Key: HDFS-16286
> URL: https://issues.apache.org/jira/browse/HDFS-16286
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, tools
>Affects Versions: 3.3.0, 3.3.1
>Reporter: daimin
>Assignee: daimin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Block data in an erasure coded block group may become corrupt, and the block 
> meta (checksum) is unable to discover the corruption in some cases such as EC 
> reconstruction; related issues are: HDFS-14768, HDFS-15186, HDFS-15240.
> In addition to HDFS-15759, a tool is needed to check whether any block group 
> of an erasure coded file has data corruption under conditions other than EC 
> reconstruction, or when the HDFS-15759 feature (validation during EC 
> reconstruction) is not enabled (it is disabled by default now).
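
One way to detect such corruption, sketched below under stated assumptions 
(RS-3-2, one stripe of equal-sized cells already read into memory; the 
erasure-coding classes are real Hadoop APIs, but this is not necessarily the 
exact approach of the linked patch), is to re-encode the data cells and 
compare the result with the stored parity:

```java
import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.erasurecode.CodecUtil;
import org.apache.hadoop.io.erasurecode.ErasureCodeConstants;
import org.apache.hadoop.io.erasurecode.ErasureCoderOptions;
import org.apache.hadoop.io.erasurecode.rawcoder.RawErasureEncoder;

public class StripeParityCheckSketch {
  /** Re-encode 3 data cells and compare with the 2 stored parity cells. */
  static boolean parityMatches(byte[][] dataCells, byte[][] storedParity)
      throws IOException {
    ErasureCoderOptions options = new ErasureCoderOptions(3, 2);
    RawErasureEncoder encoder = CodecUtil.createRawEncoder(
        new Configuration(), ErasureCodeConstants.RS_CODEC_NAME, options);
    byte[][] computedParity = new byte[2][dataCells[0].length];
    encoder.encode(dataCells, computedParity);
    // Any mismatch means data or parity in this stripe is corrupt.
    return Arrays.deepEquals(computedParity, storedParity);
  }
}
```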



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16296) RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice

2021-11-02 Thread Janus Chow (Jira)
Janus Chow created HDFS-16296:
-

 Summary: RBF: RouterRpcFairnessPolicyController add denied permits 
for each nameservice
 Key: HDFS-16296
 URL: https://issues.apache.org/jira/browse/HDFS-16296
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Janus Chow
Assignee: Janus Chow


Currently RouterRpcFairnessPolicyController has a metric, 
"getProxyOpPermitRejected", showing the total number of invocations rejected 
due to lack of permits.

This ticket adds that metric per nameservice, to give a better view of the 
load on each nameservice.
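
A minimal sketch of per-nameservice rejection counters (hypothetical names, 
not the actual RouterRpcFairnessPolicyController fields):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class PermitRejectionCounters {
  private final Map<String, LongAdder> rejectedPerNs = new ConcurrentHashMap<>();

  // Called when a permit cannot be acquired for the given nameservice.
  void incrementRejected(String nsId) {
    rejectedPerNs.computeIfAbsent(nsId, k -> new LongAdder()).increment();
  }

  long getRejected(String nsId) {
    LongAdder adder = rejectedPerNs.get(nsId);
    return adder == null ? 0L : adder.sum();
  }
}
```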



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16266) Add remote port information to HDFS audit log

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?focusedWorklogId=674377&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674377
 ]

ASF GitHub Bot logged work on HDFS-16266:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 04:30
Start Date: 03/Nov/21 04:30
Worklog Time Spent: 10m 
  Work Description: tomscut commented on pull request #3538:
URL: https://github.com/apache/hadoop/pull/3538#issuecomment-958648376


   Hi @tasanuma @jojochuang @aajisaka, could you please help merge this PR? I 
will then open a new JIRA based on it. Thanks a lot.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674377)
Time Spent: 8h 20m  (was: 8h 10m)

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests that drives the queueTime and processingTime of the NameNode very 
> high and creates a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is sometimes difficult to locate specific 
> processes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects include port information in audit logs, such as 
> HBase and Alluxio. I think it is also necessary to add port information to 
> HDFS audit logs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16296) RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16296:
--
Labels: pull-request-available  (was: )

> RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice
> --
>
> Key: HDFS-16296
> URL: https://issues.apache.org/jira/browse/HDFS-16296
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently RouterRpcFairnessPolicyController has a metric, 
> "getProxyOpPermitRejected", showing the total number of invocations rejected 
> due to lack of permits.
> This ticket adds that metric per nameservice, to give a better view of the 
> load on each nameservice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16296) RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16296?focusedWorklogId=674379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674379
 ]

ASF GitHub Bot logged work on HDFS-16296:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 04:43
Start Date: 03/Nov/21 04:43
Worklog Time Spent: 10m 
  Work Description: symious opened a new pull request #3613:
URL: https://github.com/apache/hadoop/pull/3613


   
   
   ### Description of PR
   
   Currently RouterRpcFairnessPolicyController has a metric, 
"getProxyOpPermitRejected", showing the total number of invocations rejected 
due to lack of permits.
   
   This ticket adds that metric per nameservice, to give a better view of the 
load on each nameservice.
   
   Jira ticket: https://issues.apache.org/jira/browse/HDFS-16296
   
   ### How was this patch tested?
   
   unit test
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674379)
Remaining Estimate: 0h
Time Spent: 10m

> RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice
> --
>
> Key: HDFS-16296
> URL: https://issues.apache.org/jira/browse/HDFS-16296
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently RouterRpcFairnessPolicyController has a metric, 
> "getProxyOpPermitRejected", showing the total number of invocations rejected 
> due to lack of permits.
> This ticket adds that metric per nameservice, to give a better view of the 
> load on each nameservice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16296) RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16296?focusedWorklogId=674380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674380
 ]

ASF GitHub Bot logged work on HDFS-16296:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 04:44
Start Date: 03/Nov/21 04:44
Worklog Time Spent: 10m 
  Work Description: symious commented on pull request #3613:
URL: https://github.com/apache/hadoop/pull/3613#issuecomment-958652432


   @goiri @ferhui Could you help to check?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674380)
Time Spent: 20m  (was: 10m)

> RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice
> --
>
> Key: HDFS-16296
> URL: https://issues.apache.org/jira/browse/HDFS-16296
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently RouterRpcFairnessPolicyController has a metric, 
> "getProxyOpPermitRejected", showing the total number of invocations rejected 
> due to lack of permits.
> This ticket adds that metric per nameservice, to give a better view of the 
> load on each nameservice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16291) Make the comment of INode#ReclaimContext more standardized

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16291?focusedWorklogId=674398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674398
 ]

ASF GitHub Bot logged work on HDFS-16291:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 06:10
Start Date: 03/Nov/21 06:10
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #3602:
URL: https://github.com/apache/hadoop/pull/3602#discussion_r741639733



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
##
@@ -993,15 +993,13 @@ public long getNsDelta() {
 private final QuotaDelta quotaDelta;
 
 /**
- * @param bsps
- *  block storage policy suite to calculate intended storage type

Review comment:
   How about just adding the same leading blanks as line 996 to line 997 and 
the other wrongly formatted lines below, and not changing the other lines?
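   
   For illustration, the alignment being asked for looks like this 
(hypothetical parameter list, not the actual constructor javadoc):
   
   ```java
   /**
    * @param bsps
    *          block storage policy suite to calculate intended storage type
    * @param collectedBlocks
    *          blocks collected from the descendants for further processing
    */
   ```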




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674398)
Time Spent: 1h 20m  (was: 1h 10m)

> Make the comment of INode#ReclaimContext more standardized
> --
>
> Key: HDFS-16291
> URL: https://issues.apache.org/jira/browse/HDFS-16291
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation, namenode
>Affects Versions: 3.4.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2021-10-31-20-25-08-379.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In the INode#ReclaimContext class, there are some comments that are not 
> standardized enough.
> E.g.:
>  !image-2021-10-31-20-25-08-379.png! 
> We should make the comments more standardized; this will make them more 
> readable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16296) RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16296?focusedWorklogId=674401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674401
 ]

ASF GitHub Bot logged work on HDFS-16296:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 06:18
Start Date: 03/Nov/21 06:18
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3613:
URL: https://github.com/apache/hadoop/pull/3613#issuecomment-958680152


   @symious Thanks for the contribution, it looks good. Let's wait for the CI 
reports.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674401)
Time Spent: 0.5h  (was: 20m)

> RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice
> --
>
> Key: HDFS-16296
> URL: https://issues.apache.org/jira/browse/HDFS-16296
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Assignee: Janus Chow
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently RouterRpcFairnessPolicyController has a metric, 
> "getProxyOpPermitRejected", showing the total number of invocations rejected 
> due to lack of permits.
> This ticket adds that metric per nameservice, to give a better view of the 
> load on each nameservice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16296) RBF: RouterRpcFairnessPolicyController add denied permits for each nameservice

2021-11-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16296?focusedWorklogId=674408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674408
 ]

ASF GitHub Bot logged work on HDFS-16296:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 06:48
Start Date: 03/Nov/21 06:48
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3613:
URL: https://github.com/apache/hadoop/pull/3613#issuecomment-958690121


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 14s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 43s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 28s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 35s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 18s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3613/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  34m 58s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3613/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 124m 21s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.federation.router.TestRouterFederationRename |
   |   | hadoop.hdfs.rbfbalance.TestRouterDistCpProcedure |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3613/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3613 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux ea5a4d6e0025 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d9f54bfb62cf0e5efc6e1973beebe3e018884217 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK vers