Re: making file system block size bigger to improve hdfs performance ?

2011-10-10 Thread Brian Bockelman
I can provide another data point here: XFS works very well on modern Linux kernels (in the 2.6.9 era it had many memory-management headaches, especially around the switch to 4K stacks), and its advantage is significant when you run file systems over 95% full. Brian
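A quick way to see whether your data volumes are anywhere near that 95% mark is to poll their occupancy. A minimal Python sketch; the data directory paths are assumptions and should be replaced with your own dfs.data.dir entries:

import shutil

# Hypothetical DataNode data directories; substitute your dfs.data.dir entries.
DATA_DIRS = ["/data/1", "/data/2", "/data/3"]

for d in DATA_DIRS:
    usage = shutil.disk_usage(d)                 # total, used, free (bytes)
    pct = 100.0 * usage.used / usage.total
    warn = "  <-- over 95% full" if pct > 95 else ""
    print("%s: %5.1f%% used%s" % (d, pct, warn))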

Re: making file system block size bigger to improve hdfs performance ?

2011-10-10 Thread M. C. Srivas
XFS was created in 1993 by Silicon Graphics. It was designed for streaming workloads. The Linux port was in 2002 or so. I've used it extensively for the past 8 years. It is very stable, and many NAS companies have embedded it in their products. In particular, it works well even when the disk starts getting full.

Re: making file system block size bigger to improve hdfs performance ?

2011-10-10 Thread Steve Loughran
On 09/10/11 07:01, M. C. Srivas wrote: If you insist on HDFS, try using XFS underneath; it does a much better job than ext3 or ext4 for Hadoop in terms of how data is laid out on disk. But its memory footprint is at least twice that of ext3, so it will gobble up a lot more memory on your box.
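If you want to confirm which file system actually backs each DataNode directory before weighing XFS against ext3/ext4, a rough Python sketch along these lines works; the directory names are made up for illustration:

import os

# Hypothetical DataNode data directories (your dfs.data.dir entries).
DATA_DIRS = ["/data/1", "/data/2"]

def mounts():
    # Return (mount_point, fs_type) pairs from /proc/mounts,
    # longest mount points first so the most specific mount wins.
    entries = []
    with open("/proc/mounts") as f:
        for line in f:
            fields = line.split()
            entries.append((fields[1], fields[2]))
    return sorted(entries, key=lambda e: len(e[0]), reverse=True)

MOUNTS = mounts()

def fs_type(path):
    path = os.path.realpath(path)
    for mount_point, fstype in MOUNTS:
        if path == mount_point or path.startswith(mount_point.rstrip("/") + "/"):
            return fstype
    return "unknown"

for d in DATA_DIRS:
    print(d, "->", fs_type(d))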

Re: making file system block size bigger to improve hdfs performance ?

2011-10-08 Thread M. C. Srivas
By default, Linux file systems use a 4K block size. A block size of 4K means all I/O happens 4K at a time. Any *updates* to data smaller than 4K will result in a read-modify-write cycle on disk, i.e., if a file was extended from 1K to 2K, the fs will read in the 4K block, memcpy the region from 1K-2K into the in-memory copy, and write the full 4K back out.
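To make the read-modify-write cycle concrete, here is an illustrative Python sketch of what effectively happens when a file grows from 1K to 2K inside one 4K block; it is a toy model run against an ordinary file, not actual file-system code:

BLOCK_SIZE = 4096   # typical default block size for ext3/ext4/XFS

def rewrite_within_block(device, block_no, offset_in_block, new_data):
    # Read the whole 4K block, splice the new 1K into the in-memory copy,
    # and write the full 4K back, even though only 1K actually changed.
    device.seek(block_no * BLOCK_SIZE)
    block = bytearray(device.read(BLOCK_SIZE))
    block[offset_in_block:offset_in_block + len(new_data)] = new_data   # the "memcpy"
    device.seek(block_no * BLOCK_SIZE)
    device.write(bytes(block))

# Toy usage: a regular file stands in for the block device.
with open("toy.img", "wb") as f:
    f.write(b"\x00" * BLOCK_SIZE)          # one empty 4K block
with open("toy.img", "r+b") as dev:
    rewrite_within_block(dev, block_no=0, offset_in_block=1024, new_data=b"x" * 1024)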

Re: making file system block size bigger to improve hdfs performance ?

2011-10-03 Thread Ted Dunning
The MapR system allocates files with 8K blocks internally, so I doubt that any improvement you see with a larger block size on HDFS is going to matter much, and it could seriously confuse your underlying file system. The performance advantage for MapR has more to do with a better file system design.

Re: making file system block size bigger to improve hdfs performance ?

2011-10-02 Thread Niels Basjes
Have you tried it to see what difference it makes? -- Kind regards, Niels Basjes (Sent from mobile) On 3 Oct 2011 07:06, "Jinsong Hu" wrote:

making file system block size bigger to improve hdfs performance ?

2011-10-02 Thread Jinsong Hu
Hi, there: I just thought of an idea. When we format the disk, the block size is usually 1K to 4K. For HDFS, the block size is usually 64M. I wonder: if we change the raw file system's block size to something significantly bigger, say, 1M or 8M, will that improve disk I/O performance for Hadoop's workload?
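For anyone wanting to look at both numbers side by side, here is a minimal sketch that prints the raw file system block size for a data directory and the configured HDFS block size; the data directory and the hdfs-site.xml path are assumptions for illustration:

import os
import xml.etree.ElementTree as ET

DATA_DIR = "/data/1"                           # hypothetical local data directory
HDFS_SITE = "/etc/hadoop/conf/hdfs-site.xml"   # adjust to your install

# Raw file system block size (what mkfs chose, typically 1K-4K).
st = os.statvfs(DATA_DIR)
print("raw fs block size:", st.f_bsize, "bytes")

# HDFS block size: dfs.block.size, default 64 MB in Hadoop 0.20/1.x.
block_size = 64 * 1024 * 1024
tree = ET.parse(HDFS_SITE)
for prop in tree.getroot().findall("property"):
    if prop.findtext("name") == "dfs.block.size":
        block_size = int(prop.findtext("value"))
print("hdfs block size:  ", block_size, "bytes")

As the other replies in this thread note, the HDFS block size dwarfs the raw block size by several orders of magnitude, so tuning the latter is unlikely to change streaming throughput much.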