Wen Congyang <we...@cn.fujitsu.com> wrote:
> On 03/27/2015 04:56 PM, Stefan Hajnoczi wrote:
>> On Thu, Mar 26, 2015 at 11:29:43AM +0100, Juan Quintela wrote:
>>> Wen Congyang <we...@cn.fujitsu.com> wrote:
>>>> On 03/25/2015 05:50 PM, Juan Quintela wrote:
>>>>> zhanghailiang <zhang.zhanghaili...@huawei.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> We found that, sometimes, the content of VM's memory is
>>>>>> inconsistent between Source side and Destination side
>>>>>> when we check it just after finishing migration but before VM
>>>>>> continue to Run.
>>>>>>
>>>>>> We use a patch like below to find this issue, you can find it from
>>>>>> affix,
>>>>>> and Steps to reproduce:
>>>>>>
>>>>>> (1) Compile QEMU:
>>>>>> ./configure --target-list=x86_64-softmmu --extra-ldflags="-lssl" &&
>>>>>> make
>>>>>>
>>>>>> (2) Command and output:
>>>>>> SRC: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu
>>>>>> qemu64,-kvmclock -netdev tap,id=hn0 -device
>>>>>> virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive
>>>>>> file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe
>>>>>> -device
>>>>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>>>>> -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet
>>>>>> -monitor stdio
>>>>>
>>>>> Could you try to reproduce:
>>>>> - without vhost
>>>>> - without virtio-net
>>>>> - cache=unsafe is going to give you trouble, but trouble should only
>>>>> happen after migration of pages have finished.
>>>>
>>>> If I use ide disk, it doesn't happen.
>>>> Even if I use virtio-net with vhost=on, it still doesn't happen. I guess
>>>> it is because I migrate the guest when it is booting. The virtio net
>>>> device is not used in this case.
>>>
>>> Kevin, Stefan, Michael, any great idea?
>>
>> You must use -drive cache=none if you want to use live migration. It
>> should not directly affect memory during migration though.
>
> Otherwise, what will happen? If the user doesn't use cache=none, and
> tries to use live migration, qemu doesn't output any message or trigger
> an event to notify the user.

The problem here is what your shared storage is. Some clustered
filesystems get this right and can run without cache=none. But neither
NFS, iSCSI nor FC (FC by my understanding, I am not sure) can, and those
are the most commonly used ones. So qemu doesn't know what FS/storage
type the user has, and it can't raise a real error.

Later, Juan.

>
> Thanks
> Wen Congyang
>
>>
>>>>>> We have done further test and found that some pages has been
>>>>>> dirtied but its corresponding migration_bitmap is not set.
>>>>>> We can't figure out which modules of QEMU has missed setting bitmap
>>>>>> when dirty page of VM,
>>>>>> it is very difficult for us to trace all the actions of dirtying VM's
>>>>>> pages.
>>>>>
>>>>> This seems to point to a bug in one of the devices.
>>
>> I think you'll need to track down which pages are different. If you are
>> lucky, their contents will reveal what the page is used for.
>>
>> Stefan
>>
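
As a rough illustration of Stefan's suggestion to track down which pages differ, here is a minimal sketch. It assumes raw dumps of guest RAM from the source and the destination have already been saved to two files; how those dumps are produced is outside this sketch. The function name differing_pages and the 4096-byte page size are assumptions made for illustration, not anything taken from the QEMU tree or from the patch mentioned in the thread.

```python
PAGE_SIZE = 4096  # assumed guest page size (x86_64 small pages)


def differing_pages(src_path, dst_path, page_size=PAGE_SIZE):
    """Compare two raw memory dumps page by page.

    Returns a list of (page_index, byte_offset) tuples, one for every
    page whose contents differ between the two files. A page that is
    present in only one file (files of unequal length) also counts as
    a mismatch.
    """
    mismatches = []
    with open(src_path, "rb") as src, open(dst_path, "rb") as dst:
        index = 0
        while True:
            src_page = src.read(page_size)
            dst_page = dst.read(page_size)
            if not src_page and not dst_page:
                break  # both files exhausted
            if src_page != dst_page:
                mismatches.append((index, index * page_size))
            index += 1
    return mismatches
```

Calling differing_pages("src.mem", "dst.mem") with two hypothetical dump files yields the offsets of the mismatching pages, which can then be inspected by hand to guess what each page is used for.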