Hi All,

I have a CDH4 Hadoop cluster set up with 3 datanodes and a data replication factor of 2.
When I try to check the consumed DFS space, I get different values from the "hdfs dfsadmin -report" and "hdfs fsck" commands. Could anyone please help me understand the reason behind the discrepancy in the values?

I get the following output:

*# sudo -u hdfs hdfs dfsadmin -report*

Configured Capacity: 321252989337600 (292.18 TB)
Present Capacity: 264896108259328 (240.92 TB)
DFS Remaining: 264665811648512 (240.71 TB)
DFS Used: 230296610816 (214.48 GB)
DFS Used%: 0.09%
Under replicated blocks: 19
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Live datanodes:
Name: (slave1)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 107084329779200 (97.39 TB)
DFS Used: 77728510976 (72.39 GB)
Non DFS Used: 18784664751104 (17.08 TB)
DFS Remaining: 88221936517120 (80.24 TB)
DFS Used%: 0.07%
DFS Remaining%: 82.39%
Last contact: Fri Aug 09 13:26:38 IST 2013

Name: (slave3)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 107084329779200 (97.39 TB)
DFS Used: 76206287872 (70.97 GB)
Non DFS Used: 18786185925632 (17.09 TB)
DFS Remaining: 88221937565696 (80.24 TB)
DFS Used%: 0.07%
DFS Remaining%: 82.39%
Last contact: Fri Aug 09 13:26:37 IST 2013

Name: (slave2)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 107084329779200 (97.39 TB)
DFS Used: 76361811968 (71.12 GB)
Non DFS Used: 18786030401536 (17.09 TB)
DFS Remaining: 88221937565696 (80.24 TB)
DFS Used%: 0.07%
DFS Remaining%: 82.39%

-------------------------------------------------

*# sudo -u hdfs hadoop fsck /*

Connecting to namenode via http://master1:50070
Status: HEALTHY
Total size: 75245213337 B
Total dirs: 3203
Total files: 7893
Total blocks (validated): 7642 (avg. block size 9846272 B)
Minimally replicated blocks: 7642 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 19 (0.24862601 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0024862
Corrupt blocks: 0
Missing replicas: 133 (0.86162215 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Fri Aug 09 14:01:47 IST 2013 in 266 milliseconds

The filesystem under path '/' is HEALTHY

-------------------------------------------------

*# sudo -u hdfs hadoop fs -count -q /*

2147483647 2147472547 none inf 3203 7897 75245470999 /

Thanks & Regards,

*Yogini Gulkotwar*
*Flutura Decision Sciences & Analytics, Bangalore*
*Email*: yogini.gulkot...@flutura.com<yogini.gulkot...@fluturasolutions.com>
*Website*: www.fluturasolutions.com
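
To make the size of the discrepancy concrete, here is a minimal Python sketch that cross-checks the figures quoted above. It uses only numbers copied from the output in this post. It assumes that the fsck "Total size" counts file bytes before replication and that multiplying it by "Average block replication" approximates the raw bytes the blocks should occupy; that interpretation is an assumption, not something confirmed in this thread. The variable names are illustrative only.

```python
# Figures copied verbatim from the dfsadmin -report and fsck output in this post.
dfs_used_per_node = {
    "slave1": 77728510976,
    "slave2": 76361811968,
    "slave3": 76206287872,
}
dfs_used_cluster = 230296610816   # cluster-wide "DFS Used" (dfsadmin -report)
fsck_total_size = 75245213337     # "Total size" (fsck /)
avg_replication = 2.0024862       # "Average block replication" (fsck /)


def to_gib(num_bytes):
    """Convert bytes to GiB, the unit dfsadmin labels as 'GB'."""
    return num_bytes / 2**30


# Check 1: do the per-datanode values add up to the cluster-wide value?
summed_per_node = sum(dfs_used_per_node.values())

# Check 2 (assumption): logical size times average replication should
# approximate the raw space taken by block replicas.
expected_raw = fsck_total_size * avg_replication

# The part of "DFS Used" that this estimate does not account for.
gap = dfs_used_cluster - expected_raw

print(f"Sum of per-node DFS Used: {summed_per_node} B ({to_gib(summed_per_node):.2f} GiB)")
print(f"Cluster DFS Used:         {dfs_used_cluster} B ({to_gib(dfs_used_cluster):.2f} GiB)")
print(f"fsck size x replication:  {expected_raw:.0f} B ({to_gib(expected_raw):.2f} GiB)")
print(f"Unexplained difference:   {gap:.0f} B ({to_gib(gap):.2f} GiB)")
```

Under that assumption, the per-node values sum exactly to the cluster-wide "DFS Used", while the fsck-based estimate comes to roughly 150.7 billion bytes, leaving a difference of roughly 79.6 billion bytes that the two reports do not explain between them.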