On 03/26/2015 06:29 PM, Juan Quintela wrote:
> Wen Congyang <we...@cn.fujitsu.com> wrote:
>> On 03/25/2015 05:50 PM, Juan Quintela wrote:
>>> zhanghailiang <zhang.zhanghaili...@huawei.com> wrote:
>>>> Hi all,
>>>>
>>>> We found that, sometimes, the content of the VM's memory is
>>>> inconsistent between the source side and the destination side
>>>> when we check it just after finishing migration, but before the VM
>>>> continues to run.
>>>>
>>>> We use a patch like the one below to find this issue (you can find it
>>>> in the attachment). Steps to reproduce:
>>>>
>>>> (1) Compile QEMU:
>>>> ./configure --target-list=x86_64-softmmu --extra-ldflags="-lssl" && make
>>>>
>>>> (2) Command and output:
>>>> SRC: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu
>>>> qemu64,-kvmclock -netdev tap,id=hn0 -device
>>>> virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive
>>>> file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe
>>>> -device
>>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>>> -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet
>>>> -monitor stdio
>>>
>>> Could you try to reproduce:
>>> - without vhost
>>> - without virtio-net
>>> - cache=unsafe is going to give you trouble, but trouble should only
>>>   happen after migration of pages has finished.
>>
>> If I use an IDE disk, it doesn't happen.
>> Even if I use virtio-net with vhost=on, it still doesn't happen. I guess
>> it is because I migrate the guest while it is booting. The virtio net
>> device is not used in this case.
>
> Kevin, Stefan, Michael, any great idea?
The following patch can fix this problem (vhost=off):

From ebc024702dd3147e0cbdfd173c599103dc87796c Mon Sep 17 00:00:00 2001
From: Wen Congyang <we...@cn.fujitsu.com>
Date: Thu, 2 Apr 2015 16:28:17 +0800
Subject: [PATCH] fix qiov size

Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 hw/block/virtio-blk.c          | 15 +++++++++++++++
 include/hw/virtio/virtio-blk.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 000c38d..13967bc 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -33,6 +33,7 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
     VirtIOBlockReq *req = g_slice_new(VirtIOBlockReq);
     req->dev = s;
     req->qiov.size = 0;
+    req->size = 0;
     req->next = NULL;
     req->mr_next = NULL;
     return req;
@@ -97,12 +98,20 @@ static void virtio_blk_rw_complete(void *opaque, int ret)
              * external iovec. It was allocated in submit_merged_requests
              * to be able to merge requests. */
             qemu_iovec_destroy(&req->qiov);
+
+            /* Restore qiov->size here */
+            req->qiov.size = req->size;
         }

         if (ret) {
             int p = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
             bool is_read = !(p & VIRTIO_BLK_T_OUT);
             if (virtio_blk_handle_rw_error(req, -ret, is_read)) {
+                /*
+                 * FIXME:
+                 * The memory may be dirtied on read failure, it will
+                 * break live migration.
+                 */
                 continue;
             }
         }
@@ -323,6 +332,12 @@ static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb,
         struct iovec *tmp_iov = qiov->iov;
         int tmp_niov = qiov->niov;

+        /*
+         * Save old qiov->size, which will used in
+         * virtio_blk_complete_request()
+         */
+        mrb->reqs[start]->size = qiov->size;
+
         /* mrb->reqs[start]->qiov was initialized from external so we can't
          * modifiy it here. We need to initialize it locally and then add the
          * external iovecs.
          */

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index b3ffcd9..7d47310 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -67,6 +67,7 @@ typedef struct VirtIOBlockReq {
     struct virtio_blk_inhdr *in;
     struct virtio_blk_outhdr out;
     QEMUIOVector qiov;
+    size_t size;
     struct VirtIOBlockReq *next;
     struct VirtIOBlockReq *mr_next;
     BlockAcctCookie acct;
-- 
2.1.0

PS: I haven't checked whether virtio-scsi, virtio-net, etc. have a similar
problem. With vhost=on, we can also reproduce this problem.

>
> Thanks, Juan.
>
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> What kind of load were you having when reproducing this issue?
>>> Just to confirm, you have been able to reproduce this without COLO
>>> patches, right?
>>>
>>>> (qemu) migrate tcp:192.168.3.8:3004
>>>> before saving ram complete
>>>> ff703f6889ab8701e4e040872d079a28
>>>> md_host : after saving ram complete
>>>> ff703f6889ab8701e4e040872d079a28
>>>>
>>>> DST: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu
>>>> qemu64,-kvmclock -netdev tap,id=hn0,vhost=on -device
>>>> virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive
>>>> file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe
>>>> -device
>>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>>> -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet
>>>> -monitor stdio -incoming tcp:0:3004
>>>> (qemu) QEMU_VM_SECTION_END, after loading ram
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>> md_host : after loading all vmstate
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>> md_host : after cpu_synchronize_all_post_init
>>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>>
>>>> This happens occasionally, and it is easier to reproduce when the
>>>> migration command is issued during the VM's startup time.
>>>
>>> OK, a couple of things. Memory doesn't have to be exactly identical.
>>> Virtio devices in particular do funny things on "post-load". There
>>> aren't guarantees for that as far as I know; we should end up with an
>>> equivalent device state in memory.
>>>
>>>> We have done further tests and found that some pages have been
>>>> dirtied but their corresponding migration_bitmap bits are not set.
>>>> We can't figure out which modules of QEMU have missed setting the
>>>> bitmap when dirtying the VM's pages; it is very difficult for us to
>>>> trace all the actions that dirty the VM's pages.
>>>
>>> This seems to point to a bug in one of the devices.
>>>
>>>> Actually, the first time we found this problem was in the COLO FT
>>>> development, and it triggered some strange issues in the VM which
>>>> all pointed to inconsistency of the VM's memory. (We have tried
>>>> saving all of the VM's memory to the slave side every time we do a
>>>> checkpoint in COLO FT, and then everything is OK.)
>>>>
>>>> Is it OK for some pages not to be transferred to the destination
>>>> during migration? Or is it a bug?
>>>
>>> Pages transferred should be the same; it is after device state
>>> transmission that things could change.
>>>
>>>> This issue has blocked our COLO development... :(
>>>>
>>>> Any help will be greatly appreciated!
>>>
>>> Later, Juan.