Hi, Hadoopers! I am running hadoop-2.7.1 on a 130-node cluster, and I have recently run into a problem: Nagios reported a corrupt block. The Nagios check requests http://namenode01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem, and the result is below. Notice that CorruptBlocks is 1.
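For context, this is roughly what the Nagios check does with the JMX response. A minimal sketch, assuming the endpoint returns the JSON quoted below; the SAMPLE_RESPONSE here is abbreviated to the relevant counters, and the helper function is illustrative, not part of any Hadoop or Nagios API:

```python
import json

# Abbreviated sample of the response from
# http://namenode01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
# (values copied from the full output quoted below).
SAMPLE_RESPONSE = """
{ "beans" : [ { "name" : "Hadoop:service=NameNode,name=FSNamesystem",
                "MissingBlocks" : 0,
                "UnderReplicatedBlocks" : 0,
                "CorruptBlocks" : 1 } ] }
"""

def fsnamesystem_metric(jmx_json, metric):
    """Pull one FSNamesystem counter out of a /jmx response body."""
    beans = json.loads(jmx_json)["beans"]
    return beans[0][metric]

print(fsnamesystem_metric(SAMPLE_RESPONSE, "CorruptBlocks"))  # prints 1
```

In the real check the JSON body would of course come from an HTTP GET against the NameNode instead of a literal string.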
{ "beans" : [ {
    "name" : "Hadoop:service=NameNode,name=FSNamesystem",
    "modelerType" : "FSNamesystem",
    "tag.Context" : "dfs",
    "tag.HAState" : "active",
    "tag.Hostname" : "css0700.nhnsystem.com",
    "MissingBlocks" : 0,
    "MissingReplOneBlocks" : 0,
    "ExpiredHeartbeats" : 10,
    "TransactionsSinceLastCheckpoint" : 820630,
    "TransactionsSinceLastLogRoll" : 1916,
    "LastWrittenTransactionId" : 376685578,
    "LastCheckpointTime" : 1452583950883,
    "CapacityTotal" : 1650893130075660,
    "CapacityTotalGB" : 1537514.0,
    "CapacityUsed" : 1237079990848257,
    "CapacityUsedGB" : 1152121.0,
    "CapacityRemaining" : 410189981364473,
    "CapacityRemainingGB" : 382019.0,
    "CapacityUsedNonDFS" : 3623157862930,
    "TotalLoad" : 6717,
    "SnapshottableDirectories" : 0,
    "Snapshots" : 0,
    "BlocksTotal" : 4034155,
    "FilesTotal" : 2866690,
    "PendingReplicationBlocks" : 0,
    "UnderReplicatedBlocks" : 0,
    "CorruptBlocks" : 1,
    "ScheduledReplicationBlocks" : 0,
    "PendingDeletionBlocks" : 0,
    "ExcessBlocks" : 87,
    "PostponedMisreplicatedBlocks" : 0,
    "PendingDataNodeMessageCount" : 0,
    "MillisSinceLastLoadedEdits" : 0,
    "BlockCapacity" : 67108864,
    "StaleDataNodes" : 0,
    "TotalFiles" : 2866690
} ] }

However, 'hdfs fsck / -list-corruptfileblocks' finds nothing. This is the fsck result:

The filesystem under path '/' has 0 CORRUPT files

'hdfs dfsadmin -report' agrees with the JMX result ("Blocks with corrupt replicas: 1"):

Configured Capacity: 1650892318461452 (1.47 PB)
Present Capacity: 1647353181258422 (1.46 PB)
DFS Remaining: 408711410865856 (371.72 TB)
DFS Used: 1238641770392566 (1.10 PB)
DFS Used%: 75.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0

My questions are:
1. Why do JMX and fsck give different results?
2. How do I find the name of the corrupt file? I want to know which file is affected.

Thank you.