On 24-Feb-09, at 1:37 PM, Mattias Pantzare wrote:

> On Tue, Feb 24, 2009 at 19:18, Nicolas Williams
> <nicolas.willi...@sun.com> wrote:
>> On Mon, Feb 23, 2009 at 10:05:31AM -0800, Christopher Mera wrote:
>>> I recently read up on Scott Dickson's blog with his solution for
>>> jumpstart/flashless cloning of ZFS root filesystem boxes. I have to say
>>> that it initially looks to work out cleanly, but of course there are
>>> kinks to be worked out that deal with auto mounting filesystems mostly.
>>>
>>> The issue that I'm having is that a few days after these cloned systems
>>> are brought up and reconfigured they are crashing and svc.configd
>>> refuses to start.
>>
>> When you snapshot a ZFS filesystem you get just that -- a snapshot at
>> the filesystem level.  That does not mean you get a snapshot at the
>> _application_ level. Now, svc.configd is a daemon that keeps a SQLite2
>> database.  If you snapshot the filesystem in the middle of a SQLite2
>> transaction you won't get the behavior that you want.
>>
>> In other words: quiesce your system before you snapshot its root
>> filesystem for the purpose of replicating that root on other systems.
>
> That would be a bug in ZFS or SQLite2.
>
> A snapshot should be an atomic operation. The effect should be the
> same as a power failure in the middle of a transaction, and decent
> databases can cope with that.

In this special case, that is likely so. But Nicolas' point is salutary in general, especially in the increasingly common case of virtual machines whose disk images are on ZFS. Interacting bugs or bad configuration can produce novel failure modes.

Quiescing a system with a complex mix of applications and service layers is no simple matter either, as many readers of this list well know... :)
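
For anyone who wants to see the power-fail argument in action, here is a minimal sketch. Note the assumptions: it uses Python's bundled sqlite3 module (SQLite 3, not the SQLite2 that svc.configd uses), and a plain file copy stands in for the snapshot, which is not atomic the way a ZFS snapshot is. The function name snapshot_mid_transaction is made up for illustration. It copies the database (and its rollback journal, if one exists) while a write transaction is open, then opens the copy and lists what it sees.

```python
import os
import shutil
import sqlite3
import tempfile


def snapshot_mid_transaction():
    """Copy a SQLite database while a write transaction is open,
    then return the values visible in the copy."""
    workdir = tempfile.mkdtemp()
    live = os.path.join(workdir, "live.db")
    copy = os.path.join(workdir, "copy.db")

    # isolation_level=None puts the connection in autocommit mode,
    # so BEGIN/ROLLBACK below are issued explicitly.
    conn = sqlite3.connect(live, isolation_level=None)
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.execute("INSERT INTO t VALUES ('committed')")

    # Open a transaction and leave it uncommitted.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO t VALUES ('in-flight')")

    # Stand-in for the snapshot (an assumption, not a real ZFS snapshot):
    # copy the database file and its rollback journal, if present.
    for suffix in ("", "-journal"):
        if os.path.exists(live + suffix):
            shutil.copyfile(live + suffix, copy + suffix)

    conn.execute("ROLLBACK")
    conn.close()

    # Opening the copy is like rebooting after a power failure:
    # SQLite checks for a leftover journal before reading any data.
    check = sqlite3.connect(copy)
    rows = [r[0] for r in check.execute("SELECT v FROM t ORDER BY rowid")]
    check.close()
    shutil.rmtree(workdir)
    return rows


if __name__ == "__main__":
    print(snapshot_mid_transaction())
```

The copy should contain only the committed row. That says nothing about applications that keep state across several files or in memory, which is where the general caution applies.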

--Toby

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
