Re: Hadoop performance - xfs and ext4
Ah, one more thing. With XFS there is an online defragmenter -- it runs every night on my cluster. Performance on a fresh, empty system will not match that of a used one that has become fragmented.

On Apr 22, 2010, at 1:02 AM, stephen mulcahy wrote:
> mkfs.xfs -f -l size=64m DEV
> (mounted with noatime,nodiratime,logbufs=8)
> gives me a cluster which runs TeraSort in about 23 minutes
>
> mkfs.ext4 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 18.5 minutes
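The message does not name the defragmenter or show how it is scheduled. A minimal sketch, assuming the tool meant is xfs_fsr (the XFS filesystem reorganizer) and assuming a hypothetical data device /dev/sdb1 and a hypothetical cron file, might look like this:

```text
# Read-only check of the fragmentation factor on one XFS device
# (/dev/sdb1 is a placeholder, not taken from the thread)
xfs_db -r -c frag /dev/sdb1

# /etc/cron.d/xfs_fsr (hypothetical file name and binary path):
# reorganize mounted XFS filesystems nightly at 02:00, for at most 7200 seconds
0 2 * * * root /usr/sbin/xfs_fsr -t 7200
```

Checking the fragmentation factor before a benchmark run is one way to tell whether a result comes from a fresh filesystem or an aged one.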
Re: Hadoop performance - xfs and ext4
Did you try the XFS 'allocsize' mount parameter (for example, allocsize=8m)? This will reduce fragmentation during concurrent writes.

It's more complicated, but using separate partitions for temp space versus HDFS also has an effect. XFS isn't as good with the temp space.

In short, a single test with default configurations is useful, but doesn't complete the picture. Both file systems have several important tuning knobs.

On Apr 22, 2010, at 1:02 AM, stephen mulcahy wrote:
> mkfs.xfs -f -l size=64m DEV
> (mounted with noatime,nodiratime,logbufs=8)
> gives me a cluster which runs TeraSort in about 23 minutes
>
> mkfs.ext4 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 18.5 minutes
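As an illustration of the two suggestions above (allocsize on the XFS mount, and a separate partition for temp space), an /etc/fstab fragment could look like the following. The devices, mount points, and the choice of ext4 for the temp partition are assumptions made for the sketch; only the option names come from the thread.

```text
# /etc/fstab fragment (hypothetical devices and mount points)
# HDFS block storage on XFS, with the allocsize option suggested above
/dev/sdb1  /data/hdfs  xfs   noatime,nodiratime,logbufs=8,allocsize=8m  0 0
# MapReduce temp space on its own partition, here ext4
/dev/sdb2  /data/tmp   ext4  noatime                                     0 0
```

On a Hadoop release of that era, dfs.data.dir would then point under /data/hdfs and mapred.local.dir under /data/tmp, so the two workloads no longer share one filesystem.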
Re: Hadoop performance - xfs and ext4
On Tue, May 11, 2010 at 10:39 AM, Todd Lipcon wrote:
> Yep, that's what I'd expect. noatime should be a small improvement,
> nodelalloc should be a small detriment. The thing is that delayed allocation
> has some strange cases that could theoretically cause data loss after a
> power outage, so I was interested to see if it nullified all of your
> performance gains or if it were just a small hit.
>
> -Todd

For most people, tuning the disk configuration for the NameNode is wasted time. Why? The current capacity of our Hadoop cluster is

Present Capacity: 48799678056 (101.09 TB)

Yet the NameNode data itself is tiny.

du -hs /usr/local/hadoop_root/hdfs_master
684M    /usr/local/hadoop_root/hdfs_master

Most likely the entire node table fits inside the VFS cache, so performance is not usually an issue; reliability is. The more exotic you get with this mount (EXT5, rarely used mount options), the less reliable it is going to be (IMHO), because your configuration space is not shared by that many people.

DataNodes are a different story. These are worth tuning. I suggest configuring a single datanode differently (say EXT4 with fancy options x,y,z), waiting a while to get real production load on it, then looking at some performance data to see whether this node shows any tangible difference in performance. Do not look for low-level things like bonnie saying the delete rate is +5% but the create rate is -5%. Look at the big picture: if you can't see a tangible big-picture difference like 'map jobs seem to finish 5% faster on this node', what are you doing the tuning for :) ?

I know this seems like a rather unscientific approach, but disk tuning and performance measurement are very complex because the application, the VFS cache, and available memory are the critical factors in performance.
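The "big picture" comparison described above can be sketched as a small shell helper. It assumes you have already collected map task durations in seconds, one per line, into one file for a baseline datanode and one for the tuned datanode. The file names and the collection step are hypothetical and not part of the original message.

```shell
#!/bin/sh
# mean_of FILE: print the arithmetic mean of the numbers in FILE (one per line).
mean_of() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.2f\n", sum / n }' "$1"
}

# pct_change BASE NEW: print the percentage change going from BASE to NEW.
# A negative result means NEW is smaller (tasks finished faster).
pct_change() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# Example usage with hypothetical input files:
#   pct_change "$(mean_of baseline_node.txt)" "$(mean_of tuned_node.txt)"
```

A change of a few percent in either direction is within the noise the message warns about; the helper only makes the comparison explicit, it does not say whether the difference is significant.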
Re: Hadoop performance - xfs and ext4
On Tue, May 11, 2010 at 7:33 AM, stephen mulcahy wrote:
> So I tried running our datanodes with their ext4 filesystems mounted using
> "noatime,nodelalloc" and after 6 runs of the TeraSort, it seems it runs
> SLOWER with those options by between 5-8%. The TeraGen itself seemed to run
> about 5% faster but it was only a single run so I'm not sure how reliable
> that is.

Yep, that's what I'd expect. noatime should be a small improvement, nodelalloc should be a small detriment. The thing is that delayed allocation has some strange cases that could theoretically cause data loss after a power outage, so I was interested to see if it nullified all of your performance gains or if it were just a small hit.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
Re: Hadoop performance - xfs and ext4
On 23/04/10 15:43, Todd Lipcon wrote:
> Hi Stephen,
>
> Can you try mounting ext4 with the nodelalloc option? I've seen the same
> improvement due to delayed allocation but been a little nervous about that
> option (especially in the NN where we currently follow what the kernel
> people call an antipattern for image rotation).

Hi Todd,

Sorry for the delayed response - I had to wait for another test window before trying this out.

To clarify, my namenode and secondary namenode have been using ext4 in all tests - reconfiguring the datanodes is a fast operation, the nn and 2nn less so. I figure any big performance benefit would appear on the data nodes anyway, and I can then apply it back to the nn and 2nn if testing shows any benefit in changing.

So I tried running our datanodes with their ext4 filesystems mounted using "noatime,nodelalloc" and after 6 runs of the TeraSort, it seems it runs SLOWER with those options by between 5-8%. The TeraGen itself seemed to run about 5% faster but it was only a single run so I'm not sure how reliable that is.

hth,

-stephen

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com
Re: Hadoop performance - xfs and ext4
On 4/23/2010 6:17 AM, stephen mulcahy wrote:
> Steve Loughran wrote:
>> That's really interesting. Do you want to update the bits of the Hadoop
>> wiki that talk about filesystems?
>
> I can if people think that would be useful.

Absolutely. +1

Thanks,
--Konstantin

> I'm not sure if my results are necessarily going to reflect what will
> happen on other people's systems and configs though - what's the best way
> of addressing that?
>
> Do my apache credentials work for the wiki or do I need to explicitly have
> a new account for the hadoop wiki?
>
> -stephen
Re: Hadoop performance - xfs and ext4
stephen mulcahy wrote:
> Steve Loughran wrote:
>> That's really interesting. Do you want to update the bits of the Hadoop
>> wiki that talk about filesystems?
>
> I can if people think that would be useful.
>
> I'm not sure if my results are necessarily going to reflect what will
> happen on other people's systems and configs though - what's the best way
> of addressing that?
>
> Do my apache credentials work for the wiki or do I need to explicitly have
> a new account for the hadoop wiki?

There is one login for each wiki, so you need to create a new account for the Hadoop wiki, separate from any others. Try not to use a password that is bound to something important :)

http://wiki.apache.org/hadoop/FrontPage
Re: Hadoop performance - xfs and ext4
I've done some research, and the following mount options sound optimal. Would you be interested in giving them a try?

noatime,data=writeback,barrier=0,nobh

On Fri, Apr 23, 2010 at 10:43 PM, Todd Lipcon wrote:
> Hi Stephen,
>
> Can you try mounting ext4 with the nodelalloc option? I've seen the same
> improvement due to delayed allocation but been a little nervous about that
> option (especially in the NN where we currently follow what the kernel
> people call an antipattern for image rotation).
>
> -Todd
Re: Hadoop performance - xfs and ext4
Hi Stephen,

Can you try mounting ext4 with the nodelalloc option? I've seen the same improvement due to delayed allocation but been a little nervous about that option (especially in the NN where we currently follow what the kernel people call an antipattern for image rotation).

-Todd

On Fri, Apr 23, 2010 at 6:12 AM, stephen mulcahy wrote:
> For completeness, I rebuilt one more time with ext3
>
> mkfs.ext3 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 22.5 minutes
>
> So ext4 looks like the winner, from a performance perspective, at least for
> running the TeraSort on my cluster with its specific configuration.

--
Todd Lipcon
Software Engineer, Cloudera
Re: Hadoop performance - xfs and ext4
Steve Loughran wrote:
> That's really interesting. Do you want to update the bits of the Hadoop
> wiki that talk about filesystems?

I can if people think that would be useful.

I'm not sure if my results are necessarily going to reflect what will happen on other people's systems and configs though - what's the best way of addressing that?

Do my apache credentials work for the wiki or do I need to explicitly have a new account for the hadoop wiki?

-stephen

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com
Re: Hadoop performance - xfs and ext4
Andrew Klochkov wrote:
> Hi,
>
> Just curious - did you try ext3? Can it be faster than ext4? Hadoop wiki
> suggests ext3 as it's used mostly for hadoop clusters:
>
> http://wiki.apache.org/hadoop/DiskSetup

For completeness, I rebuilt one more time with ext3

mkfs.ext3 -T largefile4 DEV
(mounted with noatime)
gives me a cluster which runs TeraSort in about 22.5 minutes

So ext4 looks like the winner, from a performance perspective, at least for running the TeraSort on my cluster with its specific configuration.

-stephen

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com
Re: Hadoop performance - xfs and ext4
stephen mulcahy wrote:
> mkfs.xfs -f -l size=64m DEV
> (mounted with noatime,nodiratime,logbufs=8)
> gives me a cluster which runs TeraSort in about 23 minutes
>
> mkfs.ext4 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 18.5 minutes
>
> So I'll be rolling our cluster back to EXT4, but thought the information
> might be useful/interesting to others.

That's really interesting. Do you want to update the bits of the Hadoop wiki that talk about filesystems?
Re: Hadoop performance - xfs and ext4
Hi,

Just curious - did you try ext3? Can it be faster than ext4? Hadoop wiki suggests ext3 as it's used mostly for hadoop clusters:

http://wiki.apache.org/hadoop/DiskSetup

On Thu, Apr 22, 2010 at 12:02 PM, stephen mulcahy wrote:
> mkfs.xfs -f -l size=64m DEV
> (mounted with noatime,nodiratime,logbufs=8)
> gives me a cluster which runs TeraSort in about 23 minutes
>
> mkfs.ext4 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 18.5 minutes

--
Andrew Klochkov
Hadoop performance - xfs and ext4
Hi,

I've been tweaking our cluster roll-out process to refine it. While doing so, I decided to check if XFS gives any performance benefit over EXT4.

As per a comment I read somewhere on the hbase wiki - XFS makes for faster formatting of filesystems (it takes us 5.5 minutes to rebuild a datanode from bare metal to a full Hadoop config on top of Debian Squeeze using XFS) versus EXT4 (same bare metal restore takes 9 minutes).

However, TeraSort performance on a cluster of 45 of these data-nodes shows XFS is slower (same configuration settings on both installs other than changed filesystem), specifically,

mkfs.xfs -f -l size=64m DEV
(mounted with noatime,nodiratime,logbufs=8)
gives me a cluster which runs TeraSort in about 23 minutes

mkfs.ext4 -T largefile4 DEV
(mounted with noatime)
gives me a cluster which runs TeraSort in about 18.5 minutes

So I'll be rolling our cluster back to EXT4, but thought the information might be useful/interesting to others.

-stephen

XFS config chosen from notes at
http://everything2.com/index.pl?node_id=1479435

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com
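To put the reported run times on a common scale, the relative slowdown against the ext4 run can be worked out with a one-line awk helper. This is only a sketch: slowdown_pct is a name invented here, the 23 and 18.5 minute figures come from this message, and the 22.5 minute ext3 figure comes from the follow-up elsewhere in the thread.

```shell
#!/bin/sh
# slowdown_pct BASELINE OTHER: how much longer OTHER took than BASELINE, in percent.
slowdown_pct() {
    awk -v base="$1" -v other="$2" 'BEGIN { printf "%.1f\n", (other - base) / base * 100 }'
}

slowdown_pct 18.5 23     # XFS vs ext4   -> 24.3
slowdown_pct 18.5 22.5   # ext3 vs ext4  -> 21.6
```

So on this cluster and configuration the XFS run took roughly 24% longer than the ext4 run, and the ext3 run roughly 22% longer. These are approximate timings of a single benchmark, so the percentages are indicative only.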