stephen mulcahy wrote:
Hi,
I've been tweaking our cluster roll-out process to refine it. While
doing so, I decided to check if XFS gives any performance benefit over
EXT4.
A comment I read somewhere on the HBase wiki said XFS makes for faster
formatting of filesystems, and that holds here: it takes us 5.5 minutes
to rebuild a datanode from bare metal to a full Hadoop config on top of
Debian Squeeze using XFS, versus 9 minutes for the same bare metal
restore using EXT4.
However, TeraSort performance on a cluster of 45 of these datanodes
shows XFS is slower (same configuration settings on both installs, other
than the filesystem). Specifically:
mkfs.xfs -f -l size=64m DEV
(mounted with noatime,nodiratime,logbufs=8)
gives me a cluster which runs TeraSort in about 23 minutes
mkfs.ext4 -T largefile4 DEV
(mounted with noatime)
gives me a cluster which runs TeraSort in about 18.5 minutes
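For anyone wanting to repeat the comparison, here is a minimal sketch that
prints (it does not run) the format and mount commands for each setup. The
mkfs and mount options are the ones listed above; the function name
fs_commands, the device /dev/sdX1 and the mount point /mnt/data are
hypothetical placeholders, and the explicit "mount -t ... -o ..." form is my
assumption about how the options would be applied.

```shell
# Print, but do not execute, the format and mount commands for one filesystem.
# Usage: fs_commands FS DEV MNT   (FS is "xfs" or "ext4")
# DEV and MNT are placeholders supplied by the caller.
fs_commands() {
    fs="$1"; dev="$2"; mnt="$3"
    case "$fs" in
        xfs)
            echo "mkfs.xfs -f -l size=64m $dev"
            echo "mount -t xfs -o noatime,nodiratime,logbufs=8 $dev $mnt"
            ;;
        ext4)
            echo "mkfs.ext4 -T largefile4 $dev"
            echo "mount -t ext4 -o noatime $dev $mnt"
            ;;
        *)
            echo "unknown filesystem: $fs" >&2
            return 1
            ;;
    esac
}

# Hypothetical device and mount point, for illustration only.
fs_commands xfs  /dev/sdX1 /mnt/data
fs_commands ext4 /dev/sdX1 /mnt/data
```

Printing the commands first makes it easy to review them before running
anything destructive on a real disk.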
So I'll be rolling our cluster back to EXT4, but thought the information
might be useful/interesting to others.
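To put the figures above in relative terms, a quick sketch of the arithmetic
(pct_change is just a hypothetical helper name; the inputs are the approximate
timings quoted above, so treat the results as rough):

```shell
# pct_change NEW BASE: percentage change of NEW relative to BASE,
# printed with one decimal place.
pct_change() {
    awk -v new="$1" -v base="$2" \
        'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

pct_change 23 18.5   # TeraSort, XFS relative to EXT4: prints 24.3
pct_change 5.5 9     # rebuild, XFS relative to EXT4: prints -38.9
```

So on these numbers XFS rebuilds a node in roughly 39% less time, but
TeraSort takes roughly 24% longer.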
-stephen
XFS config chosen from notes at
http://everything2.com/index.pl?node_id=1479435
That's really interesting. Do you want to update the bits of the Hadoop
wiki that talk about filesystems?