The block size is a per-file metadata attribute. If you append to the file later, HDFS still needs to know when to start a new block, so it keeps that value as metadata and uses it to decide where the write boundaries fall.
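To make that concrete, here is a rough sketch of the arithmetic only. It is not HDFS code; the helper names block_lengths and average_block_size are made up for illustration, and the model assumes a file is simply cut into chunks of at most the block size. The numbers are the ones from the derby.jar example quoted below.

```python
# Simplified model, NOT actual HDFS code: helper names are hypothetical.

def block_lengths(file_length, block_size):
    """Cut file_length bytes into blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_length, block_size)
    lengths = [block_size] * full_blocks
    if remainder:
        lengths.append(remainder)  # last block holds only what is left
    return lengths


def average_block_size(lengths):
    """Mean block length, similar in spirit to the fsck summary figure."""
    return sum(lengths) // len(lengths) if lengths else 0


BLOCK_SIZE = 134217728   # the %o value printed by -stat (128 MB)
FILE_LENGTH = 2673375    # the %b value printed by -stat

blocks = block_lengths(FILE_LENGTH, BLOCK_SIZE)
print(blocks)                      # [2673375]
print(average_block_size(blocks))  # 2673375

# After appending one more block's worth of data, the attribute decides
# where the split happens: the first block fills up, the rest spills over.
print(block_lengths(FILE_LENGTH + BLOCK_SIZE, BLOCK_SIZE))
```

With a single block shorter than the block size, the average equals the file length, which is why fsck and stat print different numbers for the same file.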

On Sat, Apr 5, 2014 at 7:35 PM, sam liu <samliuhad...@gmail.com> wrote:
> Thanks for your comments!
>
> As I mentioned, HDFS uses only what it needs on the local file system. For
> example, a 16 KB HDFS file uses only 16 KB of local file system storage,
> not 64 MB (its HDFS block size). In this case, what is the use of the
> block size (64 MB) for the 16 KB file?
>
>
> 2014-04-05 17:12 GMT+08:00 Harsh J <ha...@cloudera.com>:
>
>> The fsck is showing you an "average block size", not the block size
>> metadata attribute of the file like stat shows. In this specific case,
>> the average is just the length of your file, which is less than one
>> whole block.
>>
>> On Sat, Apr 5, 2014 at 8:21 AM, sam liu <samliuhad...@gmail.com> wrote:
>> > Hi Experts,
>> >
>> > First, I believe there is no doubt that HDFS uses only what it needs
>> > on the local file system. For example, if we store a 12 KB file in
>> > HDFS, HDFS uses only 12 KB on the local file system, and won't use
>> > 64 MB (the block size) on the local file system for that file.
>> >
>> > However, I found the block sizes shown by 'fsck' and '-stat' are
>> > inconsistent:
>> >
>> > 1) hadoop fsck /user/user1/filesize/derby.jar -files -blocks -locations:
>> > output:
>> > ...
>> > BP-1600629425-9.30.122.112-1395627917492:blk_1073743264_2443 len=2673375
>> > ...
>> > Total blocks (validated): 1 (avg. block size 2673375 B)
>> > ...
>> > conclusion:
>> > The block size is 2673375 B shown by fsck.
>> >
>> > 2) hadoop dfs -stat "%b %n %o %r %Y" /user/user1/filesize/derby.jar:
>> > output:
>> > 2673375 derby.jar 134217728 2 1396662626191
>> > conclusion:
>> > The block size is 134217728 B shown by stat.
>> >
>> > Also, if I browse this file from http://namenode:50070, the file size
>> > of /user/user1/filesize/derby.jar equals 2.5 MB (2673375 B), however
>> > the block size equals 128 MB (134217728 B).
>> >
>> > Why are the block sizes shown by 'fsck' and '-stat' inconsistent?
>>
>> --
>> Harsh J

--
Harsh J