On 17/09/2014 17:04, Stefan Hajnoczi wrote:
> On Wed, Sep 17, 2014 at 10:25 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
>> On 17/09/2014 11:06, Stefan Hajnoczi wrote:
>>> I think the fundamental problem here is that the mirror block job
>>> on the source host does not synchronize with live migration.
>>>
>>> Remember the mirror block job iterates on the dirty bitmap
>>> whenever it feels like.
>>>
>>> There is no guarantee that the mirror block job has quiesced before
>>> migration handover takes place, right?
>>
>> Libvirt does that. Migration is started only once storage mirroring
>> is out of the bulk phase, and the handover looks like:
>>
>> 1) migration completes
>>
>> 2) because the source VM is stopped, the disk has quiesced on the source
>
> But the mirror block job might still be writing out dirty blocks.

Right, but it quiesces after (3).
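[Editor's note: to make "quiesces after (3)" concrete in QMP terms, the mirror job's completion only becomes observable to a management application once QEMU emits the BLOCK_JOB_COMPLETED event. The following is a minimal Python sketch of that handshake, assuming a hypothetical QMP monitor socket at /tmp/src-qmp.sock and a hypothetical device name "drive-virtio-disk0"; it illustrates the synchronization under discussion and is not libvirt's actual code.]

  import json
  import socket

  def qmp_command(f, cmd, **args):
      """Send one QMP command and wait for its synchronous return."""
      msg = {"execute": cmd}
      if args:
          msg["arguments"] = args
      f.write(json.dumps(msg) + "\n")
      f.flush()
      while True:
          resp = json.loads(f.readline())
          if "return" in resp:
              return resp["return"]
          if "error" in resp:
              raise RuntimeError(resp["error"]["desc"])
          # anything else is an asynchronous event; a robust client
          # would buffer it here instead of dropping it

  def wait_for_event(f, name):
      """Block until the named asynchronous QMP event arrives."""
      while True:
          msg = json.loads(f.readline())
          if msg.get("event") == name:
              return msg["data"]

  sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
  sock.connect("/tmp/src-qmp.sock")   # hypothetical monitor path
  f = sock.makefile("rw")
  f.readline()                        # discard the QMP greeting
  qmp_command(f, "qmp_capabilities")  # negotiate capabilities

  # block-job-complete only *requests* that the mirror job finish and
  # pivot to the destination; it returns before the job has drained.
  qmp_command(f, "block-job-complete", device="drive-virtio-disk0")

  # The disk has quiesced only once this event has been seen.
  wait_for_event(f, "BLOCK_JOB_COMPLETED")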
>> 3) libvirt sends block-job-complete
>
> No, it sends block-job-cancel after the source QEMU's migration has
> completed. See the qemuMigrationCancelDriveMirror() call in
> src/qemu/qemu_migration.c:qemuMigrationRun().

No problem: block-job-cancel and block-job-complete are the same except
for pivoting to the destination.

>> 4) libvirt receives BLOCK_JOB_COMPLETED. The disk has now quiesced on
>> the destination as well.
>
> I don't see where this happens in the libvirt source code. Libvirt
> doesn't care about block job events for drive-mirror during migration.
>
> And that's why there could still be I/O going on (since
> block-job-cancel is asynchronous).

Oops, this would be a bug! block-job-complete and block-job-cancel are
asynchronous. CCing Michal Privoznik, who wrote the libvirt code.

Paolo

>> 5) the VM is started on the destination
>>
>> 6) the NBD server is stopped on the destination and the source VM is quit.
>>
>> It is actually a feature that storage migration is completed
>> asynchronously with respect to RAM migration. The problem is that
>> qcow2_invalidate_cache happens between (3) and (5), and it doesn't
>> like the concurrent I/O received by the NBD server.
>
> I agree that qcow2_invalidate_cache() (and any other invalidate_cache
> implementations) needs to allow concurrent I/O requests.
>
> Either I'm misreading the libvirt code or libvirt is not actually
> ensuring that the block job on the source has cancelled/completed
> before the guest is resumed on the destination. So I think there is
> still a bug, maybe Eric can verify this?
>
> Stefan
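[Editor's note: to make Stefan's concern concrete, this continuation of the sketch above shows steps (3) and (4) of the handover with the explicit wait that, per this thread, libvirt omits. It reuses the hypothetical qmp_command/wait_for_event helpers and device name from before, and is again an illustration rather than libvirt's implementation.]

  # (3) after migration completes, cancel the mirror job; like
  # block-job-complete, block-job-cancel returns immediately while the
  # job keeps writing out remaining dirty blocks in the background.
  qmp_command(f, "block-job-cancel", device="drive-virtio-disk0")

  # (4) the wait this thread says libvirt skips: cancelling a mirror
  # job that has already reached the ready phase still raises
  # BLOCK_JOB_COMPLETED once the job has drained, so block here before
  # touching the destination.
  wait_for_event(f, "BLOCK_JOB_COMPLETED")

  # Only now is it safe to start the VM on the destination (5) and to
  # stop the NBD server there and quit the source QEMU (6); otherwise
  # qcow2_invalidate_cache on the destination can race with mirror I/O
  # still arriving through the NBD server.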