Mark Ormesher created HDFS-15103:
------------------------------------

             Summary: JMX endpoint and "dfsadmin" report 1 corrupt block; "fsck" reports 0
                 Key: HDFS-15103
                 URL: https://issues.apache.org/jira/browse/HDFS-15103
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 3.2.1
         Environment: * CentOS 7
* HDFS 3.2.1
* 2x HA NNs
* 5x identical DNs
            Reporter: Mark Ormesher

We're seeing a long-running discrepancy between the number of corrupt blocks reported by the JMX endpoint and {{dfsadmin -report}} (1) and the number reported by {{fsck /}} (0). This has persisted through rolling restarts of the NNs and DNs, and through complete shutdowns of the HDFS cluster for unrelated maintenance.

{panel:title=JMX endpoint snippet}
{code}
(...)
"CorruptBlocks" : 1,
"ScheduledReplicationBlocks" : 0,
"PendingDeletionBlocks" : 0,
"LowRedundancyReplicatedBlocks" : 0,
"CorruptReplicatedBlocks" : 1,
"MissingReplicatedBlocks" : 0,
"MissingReplicationOneBlocks" : 0,
(...)
{code}
{panel}

{panel:title=dfsadmin -report}
{code}
$ ./hdfs dfsadmin -report | grep -i corrupt
Blocks with corrupt replicas: 1
Block groups with corrupt internal blocks: 0
{code}
{panel}

{panel:title=fsck /}
{code}
$ ./hdfs fsck / -files -blocks | grep -i corrupt
Corrupt blocks: 0
Corrupt block groups: 0
{code}
{panel}

I've read through the related tickets below, all of which suggest this issue was resolved in 2.7.8, but we're seeing it in 3.2.1.

https://issues.apache.org/jira/browse/HDFS-8533
https://issues.apache.org/jira/browse/HDFS-10213
https://issues.apache.org/jira/browse/HDFS-13999

How can we work out whether we really do have a corrupt block and, if we do, how can we work out which block it is when {{fsck}} thinks everything is fine?
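
For anyone who wants to track the mismatch over time, here is a minimal sketch of how the three counts above could be compared from captured output. It is not part of HDFS: the function names are hypothetical, the sample strings are copied from the snippets in this report, and the braces around the JMX fragment are added only so that it parses as JSON. It assumes each tool prints its count as "label: number" on a single line.

```python
import json
import re

# Sample text copied from the report; the wrapping braces around the JMX
# fragment are an addition so that the fragment is valid JSON.
JMX_SNIPPET = '{"CorruptBlocks": 1, "CorruptReplicatedBlocks": 1, "MissingReplicatedBlocks": 0}'
DFSADMIN_OUTPUT = "Blocks with corrupt replicas: 1\nBlock groups with corrupt internal blocks: 0\n"
FSCK_OUTPUT = " Corrupt blocks: 0\n Corrupt block groups: 0\n"


def labelled_count(text, label):
    """Return the integer that follows 'label:' on a line, or None if absent."""
    pattern = r"^\s*" + re.escape(label) + r"\s*:\s*(\d+)\s*$"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) if match else None


def corrupt_counts(jmx_text, dfsadmin_text, fsck_text):
    """Collect the corrupt-block count that each source reports."""
    return {
        "jmx": json.loads(jmx_text).get("CorruptBlocks"),
        "dfsadmin": labelled_count(dfsadmin_text, "Blocks with corrupt replicas"),
        "fsck": labelled_count(fsck_text, "Corrupt blocks"),
    }


counts = corrupt_counts(JMX_SNIPPET, DFSADMIN_OUTPUT, FSCK_OUTPUT)
# The sources disagree if more than one distinct value was reported.
disagree = len(set(counts.values())) > 1
print(counts, "MISMATCH" if disagree else "consistent")
```

Run against the output shown in this report, the sketch flags a mismatch between {{fsck}} and the other two sources. It only detects the disagreement; it does not identify which block is affected.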