On 10/8/2011 3:33 PM, Wietse Venema wrote:
> Stan Hoeppner:
>> On 10/8/2011 5:17 AM, Wietse Venema wrote:
>>> Stan Hoeppner:
>>>> nicely.  On the other hand, you won't see an EXTx filesystem capable of
>>>> anywhere close to 10GB/s or greater file IO.  Here XFS doesn't break a
>>>> sweat.
>>>
>>> I recall that XFS was optimized for fast read/write with large
>>> files, while email files are small, and have a comparatively high
>>> metadata overhead (updating directories, inodes etc.). XFS is
>>> probably not optimal here.
>>>
>>>     Wietse
>>
>>
>> With modern XFS this really depends on the specific workload and custom
>> settings.  Default XFS has always been very good with large file
>> performance and has been optimized for such.  It was historically
>> hampered by write heavy metadata operations, but was sufficiently fast
>> with metadata read operations, especially at high parallelism.  The
>> 'delaylog' code introduced in 2009 has largely alleviated the metadata
>> write performance issues, and delaylog has been the default since Linux
>> 2.6.39.
>>
>> XFS is not optimized by default for the OP's specific mail workload, but
>> is almost infinitely tunable.  The OP has been given multiple options on
>> the XFS list to fix this problem.  XFS is not unsuitable for this
>> workload.  The 10GB XFS filesystem created by the OP for this workload
>> is not suitable.  Doubling the FS size or tweaking the inode layout
>> fixes the problem.
>>
>> As with most things, optimizing the defaults for some workloads may
>> yield less than optimal performance with others.  By default XFS is less
>> than optimal for a high concurrency maildir workload.  However with a
>> proper storage stack architecture and XFS optimizations it handily
>> outperforms all other filesystems.  This would be the "XFS linear
>> concatenation" setup I believe I've described here previously.
>>
>> XFS can do just about anything you want it to at any performance level
>> you need.  For the non default use cases, it simply requires knowledge,
>> planning, tweaking, testing, and tweaking again to get it there, not to
>> mention time.  Alas, the learning curve is very steep.
> 
> That's a lot of text. How about some hard numbers?
> 
>       Wietse

Maybe not the perfect example, but here's one such high-concurrency
synthetic mail server workload comparison, showing XFS with a substantial
lead over everything but JFS, over which its lead is much smaller:

http://btrfs.boxacle.net/repository/raid/history/History_Mail_server_simulation._num_threads=128.html

I don't have access to this system, so I'm unable to demonstrate the
additional performance of an XFS+linear concat setup, but the throughput
would be considerably higher still.  The 8-way LVM stripe over 17-drive
RAID0 stripes would have caused hot and cold spots among the array
spindles, as wide-stripe arrays always do with small-file random-IOPS
workloads.  A properly configured XFS+linear concat in these tests would
likely have achieved full concurrency on 128 of the 136 spindles.  I say
likely because I've not read the test code and don't know exactly how it
behaves WRT directory parallelism.
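
To put rough numbers on that claim, here's the sort of back-of-envelope
arithmetic I have in mind.  The per-spindle IOPS figure and the 50%
utilization guess for the wide stripe are assumptions for illustration
only, not measurements from the boxacle.net rig:

    # Assumed figures for illustration; not measured on the test hardware
    spindles=136              # drives in the benchmark rig
    iops_per_spindle=150      # assumed random IOPS for one 15K SAS drive
    threads=128               # the benchmark's concurrency level

    raw=$((spindles * iops_per_spindle))
    echo "raw random IOPS ceiling:          $raw"    # ~20400

    # Wide stripe: hot/cold spots leave (say) only half the spindles
    # doing useful seeks at any instant
    echo "wide stripe, ~50% spindles busy:  $((raw / 2))"    # ~10200

    # Linear concat, agcount matched to the members: each maildir is
    # pinned to one member, so 128 threads keep ~128 spindles seeking
    echo "concat, 128 of 136 spindles busy: $((threads * iops_per_spindle))"    # ~19200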

If anyone has a relatively current (within ~4 years) bare metal "lab"
box with, say, 24-32 locally attached SAS drives (the more the better),
to which I could get SSH/KVM access with pretty much free rein to
destroy anything on it and build a proper test rig, I'd be happy to run
a battery of maildir-type workload tests across the various Linux
filesystems and publish the results, focusing on getting the XFS+linear
concat numbers into public view.

If not, but someone with sufficient hardware would like to take on this
project him/herself, I'd be glad to assist in getting the XFS+linear
concat configured correctly.  Unfortunately it's not something one can
set up without already having a somewhat intimate knowledge of the XFS
allocation group architecture.  Once performance data is out there and
demand follows, I'll try to publish a how-to.
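
For anyone who wants a head start, here's a rough sketch of the kind of
layout I mean, assuming a hypothetical 8-drive box with the disks at
/dev/sd[b-i].  Device names, the AG count, and the mount options are
illustrative only; the real recipe depends on the controller, drive
count and mailbox count:

    # Four md RAID1 mirror pairs (hardware RAID1 pairs work just as well)
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd /dev/sde
    mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdf /dev/sdg
    mdadm --create /dev/md4 --level=1 --raid-devices=2 /dev/sdh /dev/sdi

    # Concatenate the pairs (no striping) with LVM
    pvcreate /dev/md1 /dev/md2 /dev/md3 /dev/md4
    vgcreate vgmail /dev/md1 /dev/md2 /dev/md3 /dev/md4
    lvcreate -n maildir -l 100%FREE vgmail   # default LVM allocation is linear

    # One allocation group per mirror pair, so each AG (and every
    # directory XFS places in it) lives on exactly one pair
    mkfs.xfs -d agcount=4 /dev/vgmail/maildir

    # inode64 spreads inodes and data across all four AGs
    mount -o noatime,inode64,logbsize=256k /dev/vgmail/maildir /var/mail

XFS rotates new directories across the AGs, so a few thousand mailboxes
spread themselves evenly over the four mirror pairs, and concurrent
delivery and retrieval keep all four pairs' heads seeking independently
instead of every operation touching a wide stripe.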

Wietse has called me out on my assertion.  The XFS allocation group
design, properly combined with a linear concat, dictates that performance
will be greater for this workload, simply based on the IO math versus
striped RAID.  Everyone who has said they use it testifies to the
increased performance, but no one has published a competitive analysis
yet.  I'd love to get such data published, as it's a great solution and
many could benefit from it--Linux users anyway, since XFS is only
available on Linux now that IRIX is dead...

-- 
Stan
