On 08 March, 2010 - Miles Nordin sent me these 1,8K bytes:

> >>>>> "gm" == Gary Mills <mi...@cc.umanitoba.ca> writes:
> 
>     gm> destroys the oldest snapshots and creates new ones, both
>     gm> recursively.
> 
> I'd be curious if you try taking the same snapshots non-recursively
> instead, does the pause go away?  

According to my testing, that would give you a much longer period of
"slightly slower", but a shorter period of per-filesystem "really
slow", given recursive snapshots over lots of "independent"
filesystems.

> Because recursive snapshots are special: they're supposed to
> atomically synchronize the cut-point across all the filesystems
> involved, AIUI.  I don't see that recursive destroys should be
> anything special though.
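
To make the two variants being compared concrete, here is a minimal Python
sketch that only builds the command lines and does not run them. The pool
name "tank", the filesystem names and the snapshot label are made up for
illustration; `zfs snapshot -r` is the recursive form discussed above.

```python
def snapshot_commands(pool, filesystems, label, recursive):
    """Return the zfs command lines (as argument lists) for one snapshot round."""
    if recursive:
        # One invocation: -r snapshots the dataset and all its descendants together.
        return [["zfs", "snapshot", "-r", f"{pool}@{label}"]]
    # One invocation per filesystem: each snapshot is taken on its own.
    return [["zfs", "snapshot", f"{fs}@{label}"] for fs in filesystems]


if __name__ == "__main__":
    # Hypothetical filesystem names, for illustration only.
    homes = ["tank/home/alice", "tank/home/bob"]
    for cmd in snapshot_commands("tank", homes, "hourly-12", recursive=False):
        print(" ".join(cmd))
```

With about 700 filesystems the non-recursive variant means about 700
separate invocations, which fits the "longer but milder" slowdown
described above.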

From my experience on a homedir file server with about 700 filesystems
and ~65 snapshots on each, giving about 45k snapshots: in the
beginning, the snapshots took zero time to create. Now that we have
snapshots spanning over a year, it's not as fast.

We then turned to only doing daily snapshots (for online backups in
addition to regular backups), but they could sometimes take up to 45
minutes, with "regular NFS work" being abysmal. So we started tuning
some stuff, and doing hourly snapshots actually helped (probably by
keeping some data structures warm in ARC). That got us down to 2-3
minutes or so for a recursive snapshot. Then we tried adding 2x 4GB USB
sticks (Kingston DataTraveler Mini Slim) as metadata L2ARC, and that
seems to have pushed the snapshot times down to about 30 seconds.
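
For anyone wanting to try the same thing, a rough Python sketch of the
commands involved (built as argument lists, not executed). The pool name
and device names are hypothetical, and I am assuming "metadata L2ARC"
means cache devices added with `zpool add ... cache` plus the
`secondarycache=metadata` property on the pool's root dataset.

```python
def l2arc_metadata_commands(pool, cache_devices):
    """Return the commands to add cache devices and restrict L2ARC to metadata."""
    if not cache_devices:
        raise ValueError("need at least one cache device")
    return [
        # Add the devices to the pool as L2ARC (cache) vdevs.
        ["zpool", "add", pool, "cache", *cache_devices],
        # Assumption: only metadata should be cached in L2ARC.
        ["zfs", "set", "secondarycache=metadata", pool],
    ]


if __name__ == "__main__":
    # Hypothetical pool and device names, for illustration only.
    for cmd in l2arc_metadata_commands("tank", ["c5t0d0", "c6t0d0"]):
        print(" ".join(cmd))
```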

http://www.acc.umu.se/~stric/tmp/snaptimes.png

The y axis is mmss, so a value of 450 is 4 minutes, 50 seconds, which
means it is not entirely linear ;)  The x axis is just the snapshot
number, higher == newer. Large spikes are snapshots taken at the same
time as daily backups. In snapshots 67..100 in the picture, I removed
the L2ARC USB sticks and the times increased and started fluctuating.
I'll give it a few days and put the L2ARC back. Even cheap $10 USB
sticks can help, it seems.
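
Since the mmss values are not linear, here is a small Python sketch for
converting them to plain seconds before comparing or re-plotting. The
function name is made up for illustration.

```python
def mmss_to_seconds(value):
    """Convert an mmss-encoded number (e.g. 450 for 4 min 50 s) to seconds."""
    minutes, seconds = divmod(int(value), 100)
    if seconds >= 60:
        # The last two digits are seconds, so 60..99 cannot occur.
        raise ValueError(f"not a valid mmss value: {value}")
    return minutes * 60 + seconds


if __name__ == "__main__":
    print(mmss_to_seconds(450))  # 4 minutes 50 seconds
    print(mmss_to_seconds(30))   # 30 seconds
```

A value of 450 comes out as 290 seconds, and 30 stays 30 seconds.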

>     gm> Is it destroying old snapshots or creating new ones that
>     gm> causes this dead time?
> 
> sort of seems like you should tell us this, not the other way
> around. :)  Seriously though, isn't that easy to test?  And I'm curious
> myself too.
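
One way to test it is to time the destroy and the create steps
separately. A minimal Python sketch follows; the helper names and the
snapshot names in the comments are hypothetical, and nothing here runs
zfs unless you uncomment the example lines.

```python
import subprocess
import time


def timed(action):
    """Run a zero-argument callable and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def run(cmd):
    """Wrap a command (argument list) as a callable suitable for timed()."""
    return lambda: subprocess.run(cmd, check=True)


# Example, not executed here (hypothetical pool and snapshot names):
# destroy_s = timed(run(["zfs", "destroy", "-r", "tank@oldest"]))
# create_s = timed(run(["zfs", "snapshot", "-r", "tank@newest"]))
# print(f"destroy: {destroy_s:.1f}s  create: {create_s:.1f}s")
```

Watching NFS latency during each step would then show which of the two
lines up with the dead time.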



> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se