Hi Peter,

Thanks for having a look at it.

On 2020/12/8 23:16, Peter Xu wrote:
Hi, Zenghui,

On Tue, Dec 08, 2020 at 07:40:13PM +0800, Zenghui Yu wrote:
The kernel KVM_CLEAR_DIRTY_LOG interface has align requirement on both the
start and the size of the given range of pages. We have been careful to
handle the unaligned cases when performing CLEAR on one slot. But it seems
that we forget to take the unaligned *size* case into account when
preparing bitmap for the interface, and we may end up clearing dirty status
for pages outside of [start, start + size).

Thanks for the patch, though my understanding is that this is not a bug.

Please have a look at kvm_memslot_init_dirty_bitmap() where we'll allocate the
dirty bitmap to be aligned to 8 bytes (assuming that's the possible max of the
value sizeof(unsigned long)).  That exactly covers 64 pages.

So here as long as start_delta==0 (so the value of "bmap_npages - size / psize"
won't really matter a lot, imho), then we'll definitely have KVMSlot.dirty_bmap
long enough to cover the range we'd like to clear.

I agree.  But actually I'm not saying that KVMSlot.dirty_bmap is not
long enough.  What I was having in mind is something like:

    // psize = qemu_real_host_page_size;
    // slot.start_addr = 0;
    // slot.memory_size = 64 * psize;

    kvm_log_clear_one_slot(slot, as, 0 * psize, 32 * psize);   --> [1]
    kvm_log_clear_one_slot(slot, as, 32 * psize, 32 * psize);  --> [2]

So the @size is not aligned with 64 pages.  Before this patch, we'll
clear dirty status for all pages(0-63) through [1].  It looks to me that
this violates the caller's expectation since we only want to clear
pages(0-31).

As I said, I don't think this will happen in practice -- the migration
code should always provide us with a 64-page aligned section (right?).
I'm just thinking about the correctness of the specific algorithm used
by kvm_log_clear_one_slot().

Or maybe I had missed some other points obvious ;-) ?


Thanks,
Zenghui

Note that the size of KVMSlot.dirty_bmap can be bigger than the actually size
of the kvm memslot, however since kvm_memslot_init_dirty_bitmap() initialized
it to all zero so the extra bits will always be zero for the whole lifecycle of
the vm/bitmap.

Thanks,


Signed-off-by: Zenghui Yu <yuzeng...@huawei.com>
---
  accel/kvm/kvm-all.c | 7 +++++--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index bed2455ca5..05d323ba1f 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -747,7 +747,7 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, 
uint64_t start,
      assert(bmap_start % BITS_PER_LONG == 0);
      /* We should never do log_clear before log_sync */
      assert(mem->dirty_bmap);
-    if (start_delta) {
+    if (start_delta || bmap_npages - size / psize) {
          /* Slow path - we need to manipulate a temp bitmap */
          bmap_clear = bitmap_new(bmap_npages);
          bitmap_copy_with_src_offset(bmap_clear, mem->dirty_bmap,
@@ -760,7 +760,10 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, 
uint64_t start,
          bitmap_clear(bmap_clear, 0, start_delta);
          d.dirty_bitmap = bmap_clear;
      } else {
-        /* Fast path - start address aligns well with BITS_PER_LONG */
+        /*
+         * Fast path - both start and size align well with BITS_PER_LONG
+         * (or the end of memory slot)
+         */
          d.dirty_bitmap = mem->dirty_bmap + BIT_WORD(bmap_start);
      }
--
2.19.1




Reply via email to