Did you try the XFS 'allocsize' mount option (for example, allocsize=8m)?
It reduces fragmentation when many files are written concurrently.
It's more complicated to set up, but using separate partitions for temp
space versus HDFS also has an effect.  XFS isn't as good with the temp space.
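
As a rough sketch of what I mean, something along these lines in
/etc/fstab.  The device names and mount points are made up, and putting
the temp partition on ext4 is only my assumption about how to act on the
point above; I haven't benchmarked this exact layout.

```
# /etc/fstab sketch: /dev/sdb1, /dev/sdb2 and the mount points are hypothetical.

# HDFS data directory on XFS: the mount options from the original test,
# plus allocsize=8m.
/dev/sdb1  /data/hdfs  xfs   noatime,nodiratime,logbufs=8,allocsize=8m  0  0

# MapReduce temp/spill space on its own partition (ext4 here, as an assumption).
/dev/sdb2  /data/tmp   ext4  noatime                                     0  0
```

allocsize is a mount option, so trying it only needs a remount with the
new options, not a new mkfs.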

In short, a single test with default configurations is useful, but doesn't 
complete the picture.  Both file systems have several important tuning knobs.


On Apr 22, 2010, at 1:02 AM, stephen mulcahy wrote:

> Hi,
> 
> I've been tweaking our cluster roll-out process to refine it. While 
> doing so, I decided to check if XFS gives any performance benefit over EXT4.
> 
> As per a comment I read somewhere on the hbase wiki - XFS makes for 
> faster formatting of filesystems (it takes us 5.5 minutes to rebuild a 
> datanode from bare metal to a full Hadoop config on top of Debian 
> Squeeze using XFS) versus EXT4 (same bare metal restore takes 9 minutes).
> 
> However, TeraSort performance on a cluster of 45 of these data-nodes 
> shows XFS is slower (same configuration settings on both installs other 
> than changed filesystem), specifically,
> 
> mkfs.xfs -f -l size=64m DEV
> (mounted with noatime,nodiratime,logbufs=8)
> gives me a cluster which runs TeraSort in about 23 minutes
> 
> mkfs.ext4 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 18.5 minutes
> 
> So I'll be rolling our cluster back to EXT4, but thought the information 
> might be useful/interesting to others.
> 
> -stephen
> 
> 
> XFS config chosen from notes at 
> http://everything2.com/index.pl?node_id=1479435
> 
> -- 
> Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
> NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
> http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com
