On Tue, Apr 30, 2013 at 09:09:14AM +1000, hannah commodore wrote:
> So what does this all mean? Is it a regression in Linux 3?
> 
> Or were previous versions not actually blocking while sync was called?

i think Toby indicated the source of the problem with his comment about
barriers.

from kernel 2.6.28 onwards, write barriers are turned on by default
in ext4[1]. i'm not sure what version they got added for XFS but they've
definitely been on by default for a few years now, and are known to have
a massive performance penalty for mysql and innodb[2].

similarly, LVM got full write barrier support in 2.6.33. mdadm RAID0/1
has had write barriers for several years, and RAID5/6 got them in late
2009/early 2010 IIRC.

it's only safe to turn barriers off if you disable any write-caching in
the drive OR if you have a non-volatile write cache (e.g. hardware
raid, something like bcache with an SSD, or ZFS with an SSD for ZIL),
otherwise you risk data loss and filesystem corruption in the event of a
crash or power-failure.
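
to see which mounted filesystems currently have barriers switched off,
something like the following rough python sketch could be used. it
assumes the mount options appear as "nobarrier" or "barrier=0" in
/proc/mounts-style text (the spellings ext3/ext4 and XFS have used); the
function name barriers_disabled is just made up for illustration.

```python
def barriers_disabled(mounts_text):
    """Return (mountpoint, fstype) pairs whose mount options turn barriers off.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mountpoint, fstype, options = fields[1], fields[2], fields[3]
        opts = options.split(",")
        # assumption: barriers off is spelled "nobarrier" or "barrier=0"
        if "nobarrier" in opts or "barrier=0" in opts:
            found.append((mountpoint, fstype))
    return found
```

on a linux box it could be fed the live mount table with
barriers_disabled(open("/proc/mounts").read()). an empty list only means
that neither of those two option spellings was found, not that barriers
are definitely in effect all the way down the storage stack.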


[1] http://kernelnewbies.org/Ext4#head-25c0a1275a571f7332fa196d4437c38e79f39f63

this also links to a May 2008 article on write barriers:

http://lwn.net/Articles/283161/

(see also http://lwn.net/Articles/349970/ "Ext3 and RAID: silent data
killers?" which inspired mdadm's author to add write barriers for RAID5)


[2] https://pracops.com/wiki/index.php/Write_barriers

this one links to useful info at:

http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/


craig

ps: there's also the fact that dd just isn't a very good tool for
performance benchmarking, especially with such small files.
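
if the aim is to see what each sync costs, timing individual
write+fsync cycles is probably more informative than one dd run. the
following is only a rough python sketch of that idea, not a proper
benchmark: the function name time_fsync_writes, the 16KB write size and
the count of 100 are all arbitrary choices for illustration.

```python
import os
import time


def time_fsync_writes(directory, count=100, size=16 * 1024):
    """Write `count` blocks of `size` bytes, fsync after each one,
    and return the list of per-cycle latencies in seconds."""
    payload = b"\0" * size
    path = os.path.join(directory, "fsync-test.tmp")
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to the device, as a database commit would
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)  # clean up the scratch file
    return latencies
```

run it against a directory on the filesystem under test and compare,
say, the median latency with barriers on and off. the numbers will
still depend heavily on the drive's write cache and on whatever else
the machine is doing at the time.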


pps: my suggestion would be to run mysql VMs on ZFS volumes (rather than
LVM+mdadm LVs) with 16KB record size, an SSD for L2ARC and ZIL, and
"skip-innodb_doublewrite" in my.cnf as suggested here:

https://blogs.oracle.com/realneel/entry/mysql_innodb_zfs_best_practices

and here:

http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/mysql-zfs.html

for non-VM mysql, a zfs filesystem for /var/lib/mysql with 16KB record
size, SSD caching/ZIL, and skip-innodb_doublewrite


and similar for postgresql, although the tuning recommendation there is
for 8K recordsize for pgsql zfs filesystems/volumes. also, enabling zfs
compression has been shown to improve performance with some kinds of
data and IO loads.  I don't know if anyone has done similar testing with
mysql.


-- 
craig sanders <[email protected]>
_______________________________________________
luv-main mailing list
[email protected]
http://lists.luv.asn.au/listinfo/luv-main
