On Tue, May 11, 2010 at 7:33 AM, stephen mulcahy <stephen.mulc...@deri.org> wrote:

> On 23/04/10 15:43, Todd Lipcon wrote:
>
>> Hi Stephen,
>>
>> Can you try mounting ext4 with the nodelalloc option? I've seen the same
>> improvement due to delayed allocation but been a little nervous about that
>> option (especially in the NN where we currently follow what the kernel
>> people call an antipattern for image rotation).
>>
>
> Hi Todd,
>
> Sorry for the delayed response - I had to wait for another test window
> before trying this out.
>
> To clarify, my namenode and secondary namenode have been using ext4 in all
> tests - reconfiguring the datanodes is a fast operation, the nn and 2nn less
> so. I figure any big performance benefit would appear on the data nodes
> anyway and can then apply it back to the nn and 2nn if testing shows any
> benefits in changing.
>
> So I tried running our datanodes with their ext4 filesystems mounted using
> "noatime,nodelalloc" and after 6 runs of the TeraSort, it seems it runs
> SLOWER with those options by between 5-8%. The TeraGen itself seemed to run
> about 5% faster but it was only a single run so I'm not sure how reliable
> that is.
>

Yep, that's what I'd expect. noatime should be a small improvement,
nodelalloc should be a small detriment. The thing is that delayed allocation
has some strange cases that could theoretically cause data loss after a
power outage, so I was interested to see if it nullified all of your
performance gains or if it were just a small hit.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
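
For context on the data-loss concern: the commonly cited risky case under
delayed allocation is replacing a file by writing a new copy and renaming it
over the old one without first forcing the new contents to disk. A minimal
sketch of the safer pattern follows, in Python for brevity. It is not the
NameNode's actual image-rotation code; the function name atomic_replace and
the ".tmp-" prefix are hypothetical, and it assumes a POSIX system where a
directory can be opened and fsync'ed.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data`, syncing contents to disk before the rename.

    Hypothetical helper for illustration only (POSIX assumed).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the new contents to disk *before* the rename, so a crash
            # cannot leave the final name pointing at unwritten data.
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The extra fsync calls cost some throughput, which is the same kind of
trade-off as the nodelalloc numbers above: a small performance hit in exchange
for more predictable behaviour after a power outage.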