On Thu, May 23, 2013 at 09:58:31PM +0000, Mark Trumpold wrote:
> I have a working configuration using the signal approach suggested by Stefan.
>
> 'qemu-nbd.c' is patched as follows:
>
>     do {
>         main_loop_wait(false);
> +       if (sighup_reported) {
> +           sighup_reported = false;
> +           bdrv_drain_all();
> +           bdrv_flush_all();
> +       }
>     } while (!sigterm_reported && (persistent || !nbd_started || nb_fds > 0));
>
> The driving script was patched as follows:
>
>     mount -o remount,ro /dev/nbd0
>     blockdev --flushbufs /dev/nbd0
> +   kill -HUP <qemu-nbd process id>
>
> I needed to retain 'blockdev --flushbufs' for things to work.  Seems the
> 'bdrv_flush_all' is flushing what is being missed by the blockdev flush.  I
> did not go back and retest with 'fsync' or other approaches I had tried before.

Okay, that makes sense: 'blockdev --flushbufs' is writing dirty pages to
the NBD device.  bdrv_drain_all() + bdrv_flush_all() ensures that image
file writes reach the physical disk.

One thing to be careful of is whether these operations are asynchronous.
The signal is asynchronous: you have no way of knowing when qemu-nbd has
finished flushing to the physical disk.  I didn't check blockdev(8), but
it could be the same there.

So watch out, otherwise your script is timing-dependent and the flush may
not actually have finished when you take the snapshot.

Stefan
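
To illustrate the timing concern, here is a minimal sketch of one way a
script could wait for the flush instead of assuming it is done once
'kill -HUP' returns.  Everything in it is an assumption for illustration:
fake_nbd_server is a hypothetical stand-in for the patched qemu-nbd, the
'sleep 1' stands in for bdrv_drain_all() + bdrv_flush_all(), and the
"$marker.done" file is an invented completion signal.  The patch quoted
above writes no such marker, so qemu-nbd would need a further change
before a script could rely on this.

```shell
#!/bin/sh
# Hypothetical stand-in for the patched qemu-nbd: on SIGHUP it performs a
# slow "flush" (simulated with sleep) and only then writes a marker file.
fake_nbd_server() {
    marker=$1
    trap 'sleep 1; : > "$marker.done"' HUP
    : > "$marker.ready"               # handler installed, safe to signal
    # 'wait' is interrupted by a trapped signal, so the handler runs promptly.
    while :; do sleep 1 & wait $!; done
}

# Send SIGHUP, then block until the marker appears or the timeout (in
# seconds) expires.  Returns 0 only if the marker was seen.
flush_and_wait() {
    pid=$1 marker=$2 timeout=$3
    rm -f "$marker.done"
    kill -HUP "$pid" || return 1      # returns once the signal is sent,
                                      # not once the flush has finished
    while [ "$timeout" -gt 0 ]; do
        [ -e "$marker.done" ] && return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    [ -e "$marker.done" ]
}
```

A driving script would start the server, wait for "$marker.ready", and
call 'flush_and_wait "$pid" "$marker" 10' before taking the snapshot,
treating a non-zero return as "flush not confirmed".  Polling a file is
only one option; the point is that the script needs some acknowledgement
from qemu-nbd rather than a fixed delay.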