On Thu, Mar 4, 2010 at 7:28 PM, Ian Collins <i...@ianshome.com> wrote:

> Gary Mills wrote:
>
>> We have an IMAP e-mail server running on a Solaris 10 10/09 system.
>> It uses six ZFS filesystems built on a single zpool with 14 daily
>> snapshots.  Every day at 11:56, a cron command destroys the oldest
>> snapshots and creates new ones, both recursively.  For about four
>> minutes thereafter, the load average drops and I/O to the disk devices
>> drops to almost zero.  Then the load average shoots up to about ten
>> times normal and declines back to normal over about four minutes as
>> disk activity resumes.  The statistics return to their normal state
>> about ten minutes after the cron command runs.
>>
>> Is it destroying old snapshots or creating new ones that causes this
>> dead time?  What does each of these procedures do that could affect
>> the system?  What can I do to make this less visible to users?
>>
> I have a couple of Solaris 10 boxes that do something similar (hourly
> snaps) and I've never seen any lag in creating and destroying snapshots.
>  One system with 16 filesystems takes 5 seconds to destroy the 16 oldest
> snaps and create 5 new recursive ones.  I logged load average on these boxes
> and there is a small spike on the hour, but this is down to sending the
> snaps, not creating them.
>

We've seen the behaviour Gary describes while destroying datasets
recursively (>600 GB, with 7 snapshots). It seems that, close to the end
of the destroy, the server stalls for 10-15 minutes and NFS activity
stops. With small datasets/snapshots that doesn't happen, or it is harder
to notice.
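
For reference, the rotation Gary describes boils down to something like
the sketch below (the pool name and the snapshot naming scheme are my
assumptions, not his actual setup):

  #!/bin/sh
  # Daily rotation: destroy the oldest daily snapshot and take a new
  # one, both recursively.
  POOL=tank                       # hypothetical pool name
  TODAY=`date +%Y-%m-%d`

  # Oldest daily snapshot of the pool itself, sorted by creation time.
  OLDEST=`zfs list -H -t snapshot -o name -s creation | \
      grep "^${POOL}@daily-" | head -1`

  # -r destroys/creates same-named snapshots in all descendant filesystems.
  [ -n "$OLDEST" ] && zfs destroy -r "$OLDEST"
  zfs snapshot -r "${POOL}@daily-${TODAY}"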

Does ZFS have to do something special when it's done releasing the data
blocks at the end of the destroy operation?
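
If anyone wants to watch it happen, running something like

  zpool iostat tank 1

in one window while timing the destroy in another:

  ptime zfs destroy -r tank/somedataset

should show whether disk I/O really goes quiet during the stall (again,
"tank" and the dataset name are placeholders).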

-- 
Giovanni Tirloni
sysdroid.com