Have you tried it to see what difference it makes?

--
Kind regards, Niels Basjes (Sent from mobile)

On 3 Oct. 2011 07:06, "Jinsong Hu" <jinsong...@hotmail.com> wrote:
> Hi, there:
> I just had an idea. When we format a disk, the block size is usually
> 1K to 4K. For HDFS, the block size is usually 64M.
> I wonder: if we change the raw file system's block size to something
> significantly bigger, say 1M or 8M, will that improve disk IO
> performance for Hadoop's HDFS?
> Currently, I notice that the MapR distribution uses MFS, its own file
> system. That resulted in a 4x performance gain in terms of disk IO.
> I just wonder whether, by tuning the host OS parameters, we can achieve
> better disk IO performance with just the regular Apache Hadoop
> distribution.
> I understand that making the block size bigger can result in some
> wasted disk space for small files. However, for disks dedicated to
> HDFS, where most of the files are very big, I just wonder if it is a
> good idea. Anybody have any comments?
>
> Jimmy
>
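[Editor's note: for context, the 64M HDFS block size mentioned above is a logical, per-file Hadoop parameter and is set in Hadoop itself, separately from whatever block size the local disk was formatted with. A minimal Java sketch of where that parameter lives is below; the file path, sizes, and replication factor are illustrative assumptions, not values from this thread.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Cluster-wide default HDFS block size (64 MB here). Older releases
            // use the property name "dfs.block.size"; newer ones use "dfs.blocksize".
            conf.setLong("dfs.block.size", 64L * 1024 * 1024);

            FileSystem fs = FileSystem.get(conf);

            // The HDFS block size can also be chosen per file at create time:
            // create(path, overwrite, bufferSize, replication, blockSize)
            FSDataOutputStream out = fs.create(
                    new Path("/tmp/blocksize-demo.dat"), // hypothetical path
                    true,               // overwrite if it exists
                    4096,               // io buffer size in bytes
                    (short) 3,          // replication factor
                    128L * 1024 * 1024  // 128 MB HDFS block size for this file
            );
            out.writeBytes("hello");
            out.close();
        }
    }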