On 08/05/19 01:15, Peter Xu wrote:
> Design
> ===================
> 
> I started with a naive/stupid design that I always pass all 1's to the
> KVM for a memory range to clear all the dirty bits within that memory
> range, but then I encountered guest oops - it's simply because we
> can't clear any dirty bit from QEMU if we are not _sure_ that the bit
> is dirty in the kernel.  Otherwise we might accidentally clear a bit
> that we don't even know of (e.g., the bit was clear in migration's
> dirty bitmap in QEMU) but actually that page was just being written so
> QEMU will never remember to migrate that new page again.
> 
> The new design is focused on a dirty bitmap cache within the QEMU kvm
> layer (which is per kvm memory slot).  With that we know what's dirty
> in the kernel previously (note! the kernel bitmap is still growing all
> the time so the cache will only be a subset of the realtime kernel
> bitmap but that's far enough for us) and with that we'll be sure to
> not accidentally clear unknown dirty pages.
> 
> With this method, we can also avoid race when multiple users (e.g.,
> DIRTY_MEMORY_VGA and DIRTY_MEMORY_MIGRATION) want to clear the bit for
> multiple time.  If without the kvm memory slot cached dirty bitmap we
> won't be able to know which bit has been cleared and then if we send
> the CLEAR operation upon the same bit twice (or more) we can still
> face the same issue to clear something accidentally while we
> shouldn't.
> 
> Summary: we really need to be careful on what bit to clear otherwise
> we can face anything after the migration completes.  And I hope this
> series has considered all about this.

The disadvantage of this is that you won't clear in the kernel those
dirty bits that come from other sources (e.g. vhost or
address_space_map).  This can lead to double-copying of pages.

Migration already makes a local copy in rb->bmap, and
memory_region_snapshot_and_clear_dirty can also do the clear.  Would it
be possible to invoke the clear using rb->bmap instead of the KVMSlot's
new bitmap?

Thanks,

Paolo

> Besides the new KVM cache layer and the new ioctl support, this series
> introduced the memory_region_clear_dirty_bitmap() in the memory API
> layer to allow clearing dirty bits of a specific memory range within
> the memory region.


Reply via email to