Block live migration without pausing VM
Hi all,

I want to implement live migration of a highly available VM, but I cannot use shared storage. The -b option to the migrate command already copies the block device (a locally stored raw file), which is exactly what I want. This worked in my experiments, but the VM is unreachable (CPU halted?) while the block device is copied, which is unacceptable for my use case.

Is there a way to copy the block device while the VM keeps running (copy-on-read or similar)? The LiveBlockMigration page in the QEMU wiki [1] mentions that some of this is already implemented, but I cannot find any of it in the latest qemu-kvm. Any pointers?

Thanks for your suggestions.

[1] http://wiki.qemu.org/Features/LiveBlockMigration

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
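For context, a minimal sketch of the block-migration setup being described. The -b flag to the monitor's migrate command copies block devices along with guest RAM; the hostnames, image path, and port below are placeholders, not values from this thread:

```shell
# Hypothetical block-migration setup (hostname/port/path are placeholders).
DEST_HOST=backup-host
DEST_PORT=4444

# On the destination, start an identical qemu-kvm waiting for the migration:
#   qemu-kvm -hda /var/lib/vm/guest.raw -m 1024 -incoming tcp:0:${DEST_PORT}
#
# On the source, in the QEMU monitor, -d detaches and -b copies block devices:
#   (qemu) migrate -d -b tcp:${DEST_HOST}:${DEST_PORT}
echo "migrate -d -b tcp:${DEST_HOST}:${DEST_PORT}"
```

The complaint in the post is that during the -b block copy the guest CPU appears halted, so the VM drops off the network for the whole copy.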
Re: strange guest slowness after some time
Tomasz Chmielewski wrote:
> Felix Leimbach schrieb:
>> BTW, what CPU do you have? One dual-core Opteron 2212.
>> Note: I will upgrade to two Shanghai quad-cores in 2 weeks and test with those as well.
>
> processor  : 1
> vendor_id  : AuthenticAMD
> cpu family : 15
> model      : 65
> model name : Dual-Core AMD Opteron(tm) Processor 2212
> stepping   : 2
> cpu MHz    : 1994.996
> cache size : 1024 KB
>
> It's exactly the same CPU I have.

Interesting: for two months now I have been running on two Shanghai quad-cores instead, and the problem is definitely gone. The rest of the hardware, as well as the whole software stack, remained unchanged. That should confirm what we assumed already.
Re: strange guest slowness after some time
Tomasz Chmielewski wrote:
> Avi Kivity schrieb:
>> Tomasz Chmielewski wrote:
>>> After a week or so, network in one guest got slow with kvm-84 and no cpufreq.
>>
>> This is virtio, right? What about e1000? (I realize it takes a week to reproduce, but maybe you have some more experience.)
>
> Yes, all affected guests had virtio, probably because I didn't have many guests with an e1000 interface. After a guest gets slow, I will stop it and add another interface, e1000. If it gets slow again, I'll check whether the e1000 interface is slow as well. Will keep you updated.

I see similar behavior: after a week, the network in one of my guests stops responding entirely. Only guests using virtio networking get hit; both Windows and Linux guests are affected. My guests in production use e1000 and have never been hit. While that could be a coincidence, it seems very unlikely: out of 3 virtio guests, 2 have been hit, one repeatedly; out of 3 e1000 guests, none has ever been hit. Observed with kvm-83 and kvm-84, with the host running the in-kernel KVM code (Linux 2.6.25.7).
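For readers reproducing the virtio-vs-e1000 comparison above: the NIC model is selected on the qemu-kvm command line, so switching an affected guest between the two is a one-word change. This is a hedged sketch, not a command from the thread; the image path, memory size, and tap setup are placeholders:

```shell
# Placeholder qemu-kvm invocation; only the model= value is the point here.
MODEL=virtio   # the model the hangs were reported with; use e1000 to compare
CMDLINE="qemu-kvm -hda guest.img -m 512 -net nic,model=${MODEL} -net tap"
echo "$CMDLINE"
```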
Re: strange guest slowness after some time
Tomasz Chmielewski wrote:
> Avi Kivity schrieb:
>> Might it be that some counter overflowed? What are the packet counts on long-running guests?
>
> I don't think so. I just made both ifconfig counters (TX, RX) of the virtio interfaces overflow several times, and everything is still as fast as it should be.

I had overflows on the counters as well (32-bit guests) without any problem. Here is the current ifconfig output of a machine which suffered from the problem before:

eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
          inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
          TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)

> (Output of ifconfig, even on an unaffected e1000 guest, might help.)

Currently I have e1000 only on Windows guests. Is there a way to gather the relevant statistics there too?
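As a back-of-envelope check on the overflow theory (assuming unsigned 32-bit counters, plausible for these 32-bit guests): the byte counters wrap long before the packet counters do, which is why the 2.7 GiB TX byte figure above is near wrapping while the packet counts are nowhere close. Requires 64-bit shell arithmetic (any modern bash):

```shell
# 32-bit counters wrap at 2^32 = 4294967296.
WRAP=$((1 << 32))
RX_PACKETS=3542104           # values taken from the ifconfig output above
TX_BYTES=2907586796          # the 2.7 GiB TX byte counter
echo "counter wraps at: $WRAP"
echo "bytes until TX byte counter wraps: $((WRAP - TX_BYTES))"
echo "packets until RX packet counter wraps: $((WRAP - RX_PACKETS))"
```

So the TX byte counter is within about 1.3 GiB of wrapping, while the packet counters have well over 99% of their range left, matching Avi's remark later in the thread that the byte counters are "not so interesting".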
Re: strange guest slowness after some time
Avi Kivity wrote:
> Does idle=poll help things? It can cause TSC breakage similar to cpufreq.

On the host, right? I can't test that, as I cannot reboot the server. Is TSC breakage still something to watch out for after I've upgraded to the Shanghai quad-cores?
Re: strange guest slowness after some time
Avi Kivity wrote:
> Felix Leimbach wrote:
>> eth0      Link encap:Ethernet  HWaddr 52:54:00:74:01:01
>>           inet addr:10.75.13.1  Bcast:10.75.255.255  Mask:255.255.0.0
>>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>           RX packets:3542104 errors:0 dropped:0 overruns:0 frame:0
>>           TX packets:412546 errors:0 dropped:0 overruns:0 carrier:0
>>           collisions:0 txqueuelen:1000
>>           RX bytes:682285568 (650.6 MiB)  TX bytes:2907586796 (2.7 GiB)
>
> Packet counters are well within 32-bit limits; byte counters are not so interesting.

Ah right, I checked the byte counters only. Testing packet counter overflow now (takes a while).

> Do you experience the slowdown on Windows guests?

Both Linux and Windows Server 2003, all 32-bit. But for me it is not a slowdown but a complete loss of network in the guest: it can't be pinged anymore. There might be a slowdown period before that, though; I've heard hints in that direction from users.
Re: strange guest slowness after some time
Tomasz Chmielewski wrote:
> Tomasz Chmielewski schrieb:
>> Avi Kivity schrieb:
>>> Packet counters are well within 32-bit limits; byte counters are not so interesting.
>>
>> Ah OK, I only overflowed the byte counters. A packet counter overflow will take much longer. It's one of those very rare cases where setting a very small MTU is useful...
>
> OK, another bug found. Set your MTU to 100. On two hosts, do:
>
> HOST1_MTU1500# dd if=/dev/zero | ssh mana...@host2 dd of=/dev/null
> HOST2_MTU100#  dd if=/dev/zero | ssh mana...@host1 dd of=/dev/null
>
> HOST2 with MTU 100 will crash after 10-15 minutes (with the packet counter still not overflowed).

Interesting. What were the packet counters at crash time (roughly)?

My currently running test is:

Guest 1 (Linux), MTU 150:
# cat /dev/zero | nc guest2ip

Guest 2 (Windows 2003 Server), MTU 1500:
# nc -l -p NUL

My packet counters are currently at 63 million without a problem - yet.
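As a rough sanity check on the small-MTU trick: assuming full-MTU packets (a simplification; real packets vary in size), wrapping a 32-bit packet counter needs on the order of 2^32 * MTU bytes of traffic, so shrinking the MTU shortens the test proportionally. Requires 64-bit shell arithmetic:

```shell
# Bytes of payload needed for 2^32 MTU-sized packets, at small vs normal MTU.
WRAP=$((1 << 32))
for MTU in 100 1500; do
  BYTES=$((WRAP * MTU))
  echo "MTU ${MTU}: $((BYTES / (1 << 30))) GiB to wrap the packet counter"
done
```

At MTU 100 the counter wraps after roughly 400 GiB instead of about 6 TiB at MTU 1500, which is why the overflow test above becomes feasible at all.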
Re: KVM-74 and network timeout?
Sean Mahrt wrote:
> I've noticed the guest with a lot of disk I/O (commercial detection) after a while has a lot of NFS timeouts... virtio or e1000 give me the same result.

I noticed exactly the same problem after moving from kvm-64 on a 2.6.25.3 host to kvm-74 on a 2.6.26.3 host. Adding to your observations:
- CIFS shares are affected as well: under heavy traffic I get timeouts from the server, see [1]
- ne2k_pci and rtl8139 guests seem to be affected as well

> Now the really bad part: I'm getting pings on the order of ms, like 20-100ms, on a bridged connection... and NFS is going crazy...

My pings also increased from 0.1ms to 16ms when the physical interface of the bridge was maxed out. I don't know whether transferring from VM to VM would also trigger that.

> I'm using SMP on the guests (and the host), and 2.6.25 on the guests...

My guests were UP and mostly Windows Server 2003, plus one Gentoo guest on 2.6.26, so I think the culprit is elsewhere.

> Where should I start looking? Is this a kvm-74 issue? Bump to kvm-75?

You might try kvm-64, which is rock-solid for me when paired with a 2.6.25 KVM kernel module.

[1]
Sep  6 17:42:58 [EMAIL PROTECTED] CIFS VFS: server not responding
Sep  6 17:42:58 [EMAIL PROTECTED] CIFS VFS: No response to cmd 46 mid 30836
Sep  6 17:42:58 [EMAIL PROTECTED] CIFS VFS: Send error in read = -11
Sep  6 17:51:28 [EMAIL PROTECTED] CIFS VFS: server not responding
Sep  6 17:51:28 [EMAIL PROTECTED] CIFS VFS: No response for cmd 50 mid 30850