Re: [zfs-discuss] ZFS fragmentation with MySQL databases

Kees Nuyt Fri, 21 Nov 2008 23:57:40 -0800

On Fri, 21 Nov 2008 17:20:48 PST, Vincent Kéravec
<[EMAIL PROTECTED]> wrote:


> I just tried ZFS on one of our slaves and got some really
> bad performance.
> 
> When I started the server yesterday, it was able to keep
> up with the main server without problems, but after two
> days of continuous running the server is crushed by I/O.
> 
> After running the DTrace script iopattern, I noticed
> that the workload is now 100% random I/O. Copying the
> database (140 GB) from one directory to another took
> more than 4 hours without any other tasks running on
> the server, and all the reads on tables that were
> updated were random... Keeping an eye on iopattern and
> zpool iostat, I saw that when the system was accessing
> files that had not been changed the disk was reading
> sequentially at more than 50 MB/s, but when reading files
> that changed often the speed dropped to 2-3 MB/s.

Good observation and analysis.
 
> The server has plenty of disk space so it should not
> have such a level of file fragmentation in such a short
> time.

My explanation would be: whenever a block within a file
changes, ZFS has to write it at another location ("copy on
write"), so the previous version isn't immediately lost.

ZFS will try to keep the new version of the block close to
the original one, but after several changes to the same
database page the blocks end up scattered, and logically
sequential I/O becomes physically random.

The original blocks will eventually be added to the freelist
and reused, so proximity can be restored, but it will never
be 100% sequential again.
The effect is larger when many snapshots are kept, because
older block versions are not freed, or when the same block
is changed very often and freelist updating has to be
postponed.
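
The effect can be illustrated with a toy simulation. This is
only a sketch under simplifying assumptions, not the real ZFS
allocator: the "disk" is a flat array of slots, every update
moves one logical block to the nearest free slot, and the old
slot is returned to the free list right away (so no snapshots
and no postponed freeing). The names simulate, disk_slots and
sequential_fraction are made up for the illustration.

```python
import random


def simulate(num_blocks=200, disk_slots=400, updates=2000, seed=0):
    """Toy copy-on-write allocator (an assumption, not ZFS itself).

    Returns a list mapping logical block number -> physical slot.
    """
    rng = random.Random(seed)
    # Freshly written file: logical block i sits in physical slot i.
    location = list(range(num_blocks))
    free = set(range(num_blocks, disk_slots))
    for _ in range(updates):
        blk = rng.randrange(num_blocks)
        old = location[blk]
        # Copy on write: never overwrite in place. Pick the free slot
        # closest to the old one, as a stand-in for "keep it close".
        new = min(free, key=lambda slot: abs(slot - old))
        free.remove(new)
        free.add(old)  # old version is released immediately
        location[blk] = new
    return location


def sequential_fraction(location):
    """Fraction of logical neighbours that are also physical neighbours."""
    hits = sum(1 for a, b in zip(location, location[1:]) if b == a + 1)
    return hits / (len(location) - 1)


if __name__ == "__main__":
    fresh = list(range(200))
    print("fresh file:   ", sequential_fraction(fresh))
    print("after updates:", sequential_fraction(simulate()))
```

A freshly written file scores 1.0; after random updates the
score drops, because logical neighbours no longer sit next to
each other on disk, even though no write ever overwrote live
data in place.
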

That is the trade-off between "always consistent" and
"fast".

> For information, I'm using Solaris 10/08 with a mirrored
> root pool on two 1 TB SATA hard disks (slow with random
> I/O). I'm using MySQL 5.0.67 with the MyISAM engine. The ZFS
> recordsize is 8k as recommended in the ZFS guide.

I would suggest enlarging the MyISAM buffers.
The InnoDB engine does copy on write within its data files,
so things might be different there. 
-- 
  (  Kees Nuyt
  )
c[_]
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
