Hi list,

I'm running libvirt qemu guests on RBD, and currently taking backups by issuing a domfsfreeze, taking a snapshot, and then issuing a domfsthaw. This seems to be a common approach.
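
For concreteness, the procedure looks roughly like the sketch below. It is a minimal illustration, not my exact script: the function name, the domain name, the "pool/image" spec, and the snapshot name are placeholders, and the `run` parameter is a hypothetical hook so the sequence can be exercised without a real hypervisor. The only external commands assumed are `virsh domfsfreeze`, `rbd snap create`, and `virsh domfsthaw`.

```python
import subprocess


def snapshot_with_freeze(domain, image_spec, snap_name, run=subprocess.run):
    """Freeze guest filesystems, snapshot the RBD image, then always thaw.

    domain, image_spec ("pool/image") and snap_name are placeholders;
    run is an injectable command runner (an assumption, for testing).
    """
    try:
        # Ask the guest agent to flush and freeze the guest's filesystems.
        run(["virsh", "domfsfreeze", domain], check=True)
        # Take the snapshot while guest I/O is quiesced.
        run(["rbd", "snap", "create", f"{image_spec}@{snap_name}"], check=True)
    finally:
        # Attempt a thaw even if the freeze or snapshot failed, so a
        # half-completed freeze is not left in place by this script.
        run(["virsh", "domfsthaw", domain], check=True)
```

Putting the thaw in a `finally` block only guarantees that the thaw is attempted; it does not help when the agent itself stops responding, which is the failure described below.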

This is safe, but it has an impact: guest I/O is frozen for the duration of the snapshot, which is usually only a few seconds. Unfortunately, the freeze action doesn't seem to be very reliable. Sometimes it times out, leaving the guest in a messy state with frozen I/O (when this happens, the thaw either times out too or returns success while the filesystems stay frozen anyway). This is clearly a bug somewhere, but it makes me wonder whether the freeze is a hard requirement at all.

Are there any atomicity guarantees for RBD snapshots taken *without* freezing the filesystem? Obviously the filesystem will be dirty and will require journal recovery, but that is okay; it's equivalent to a hard shutdown/crash. But is there any chance of corruption related to the snapshot being taken in a non-atomic fashion? Filesystems and applications these days should have no trouble with hard shutdowns, as long as storage writes follow ordering guarantees (no writes getting reordered across a barrier and such).

Put another way: do RBD snapshots have ~identical atomicity guarantees to e.g. LVM snapshots?

If we can get away without the freeze, honestly I'd rather go that route. If I really need to pause I/O during the snapshot creation, I might end up resorting to pausing the whole VM (suspend/resume), which has higher impact but also probably a much lower chance of messing up (or having excess latency), since it doesn't involve the guest OS or the qemu agent at all...
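
The suspend/resume fallback would look something like the following sketch. Again this is only an illustration under assumptions: the function name, the domain name, the "pool/image" spec, and the snapshot name are placeholders, and `run` is a hypothetical hook for testing. It relies only on `virsh suspend`, `rbd snap create`, and `virsh resume`, none of which go through the guest agent.

```python
import subprocess


def snapshot_with_suspend(domain, image_spec, snap_name, run=subprocess.run):
    """Pause the whole VM, snapshot the RBD image, then always resume.

    domain, image_spec ("pool/image") and snap_name are placeholders;
    run is an injectable command runner (an assumption, for testing).
    """
    # Pause the vCPUs at the hypervisor level; no guest cooperation needed.
    run(["virsh", "suspend", domain], check=True)
    try:
        # The guest issues no new writes while paused. Its filesystems are
        # not flushed, so the snapshot is still only crash-consistent.
        run(["rbd", "snap", "create", f"{image_spec}@{snap_name}"], check=True)
    finally:
        # Resume even if the snapshot failed, so the VM is not left paused.
        run(["virsh", "resume", domain], check=True)
```

Note that this does not make the snapshot any cleaner than an unfrozen one from the filesystem's point of view; it only stops the guest from issuing writes while the snapshot is created.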

--
Hector Martin (hec...@marcansoft.com)
Public Key: https://marcan.st/marcan.asc
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
