[ https://issues.apache.org/jira/browse/HDFS-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caozhiqiang updated HDFS-16626:
-------------------------------
Description:
In the output of the command 'hdfs dfsadmin -report', the values of "Under replicated blocks" and the EC "Low redundancy block groups" only count the blocks in BlockManager::neededReconstruction. They should also count the blocks in BlockManager::pendingReconstruction, including timed-out items.

In particular, in some scenarios, for example when decommissioning a DataNode with many EC blocks, there can be many blocks sitting in pendingReconstruction for a long time while neededReconstruction's size is 0. That confuses users, who cannot see the real decommissioning progress.

{code:java}
Configured Capacity: 1036741707829248 (942.91 TB)
Present Capacity: 983872491622400 (894.83 TB)
DFS Remaining: 974247450424426 (886.07 TB)
DFS Used: 9625041197974 (8.75 TB)
DFS Used%: 0.98%

Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

Erasure Coded Block Groups:
	Low redundancy block groups: 3481
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 245
{code}

The graph below shows the under_replicated_blocks and pending_replicated_blocks metrics while decommissioning a DataNode. The value of pending_replicated_blocks is not included in the dfsadmin report.

!image-2022-06-08-18-30-13-757.png|width=836,height=157!
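As a rough sketch of the proposed accounting (illustrative names only, using plain lists as stand-ins for the real BlockManager queues, which are not part of the actual HDFS API shown here), the report would sum both queues so that blocks already scheduled for reconstruction during a decommission stay visible:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch, assuming simplified stand-ins for
// BlockManager::neededReconstruction and BlockManager::pendingReconstruction.
public class LowRedundancySketch {
    final List<String> neededReconstruction = new ArrayList<>();
    final List<String> pendingReconstruction = new ArrayList<>();

    // Current behavior: only the needed-reconstruction queue is counted.
    long reportedLowRedundancyBlocks() {
        return neededReconstruction.size();
    }

    // Proposed behavior: also count blocks pending reconstruction
    // (including timed-out items), so 'hdfs dfsadmin -report' reflects
    // work still in flight during decommissioning.
    long proposedLowRedundancyBlocks() {
        return neededReconstruction.size() + pendingReconstruction.size();
    }

    public static void main(String[] args) {
        LowRedundancySketch bm = new LowRedundancySketch();
        // Decommission scenario: reconstruction is already scheduled, so the
        // needed queue is empty while the pending queue is not.
        bm.pendingReconstruction.add("blk_1");
        bm.pendingReconstruction.add("blk_2");
        System.out.println(bm.reportedLowRedundancyBlocks()); // prints 0
        System.out.println(bm.proposedLowRedundancyBlocks()); // prints 2
    }
}
```

Under this proposal, the decommission scenario above would report 2 low-redundancy block groups instead of 0, matching the pending_replicated_blocks metric shown in the graph.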
> Under replicated blocks in dfsadmin report should contain
> pendingReconstruction's blocks
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-16626
>                 URL: https://issues.apache.org/jira/browse/HDFS-16626
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ec, namenode
>    Affects Versions: 3.4.0
>            Reporter: caozhiqiang
>            Assignee: caozhiqiang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-06-08-18-30-13-757.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org