Apologies for the top post, as my mobile device doesn't allow anything else. I have not set the maximum permissible migration time, but that default certainly points to a possible solution. As for the write semantics, it was a straight dd from disk to /dev/shm, so I can't speak for the kernel, but naively I would think it may be contiguous space.
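If raising that limit is indeed the fix, my understanding is it can be set either from the QEMU monitor or, while a migration is in flight, through libvirt (if the libvirt version is new enough to expose it). A rough sketch only -- the domain name "myvm" and the 0.5 s / 500 ms values are placeholders I haven't verified on this setup:

    # QEMU monitor: raise the tolerated downtime (value is in seconds)
    (qemu) migrate_set_downtime 0.5

    # libvirt equivalent, applied to a migration that is already running
    # (value is in milliseconds)
    virsh migrate-setmaxdowntime myvm 500

At the 8-9Gbps we see on the wire, a 500 ms window corresponds to roughly 500 MB of outstanding dirty pages, which should comfortably cover a 50MB/s dirty rate.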
On 22.11.11 18:09 ext Pierre Riteau wrote:

On 22 nov. 2011, at 14:04, Oliver Hookins wrote:

> On Tue, Nov 22, 2011 at 10:31:58AM +0100, ext Juan Quintela wrote:
>> Oliver Hookins <oliver.hook...@nokia.com> wrote:
>>> On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
>>>> Takuya Yoshikawa <yoshikawa.tak...@oss.ntt.co.jp> wrote:
>>>>> Adding qemu-devel ML to CC.
>>>>>
>>>>> Your question should have been sent to the qemu-devel ML because the
>>>>> logic is implemented in QEMU, not KVM.
>>>>>
>>>>> (2011/11/11 1:35), Oliver Hookins wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am performing some benchmarks on KVM migration on two different
>>>>>> types of VM. One has 4GB RAM and the other 32GB. More or less idle,
>>>>>> the 4GB VM takes about 20 seconds to migrate on our hardware while
>>>>>> the 32GB VM takes about a minute.
>>>>>>
>>>>>> With a reasonable amount of memory activity going on (in the hundreds
>>>>>> of MB per second) the 32GB VM takes 3.5 minutes to migrate, but the
>>>>>> 4GB VM never completes. Intuitively this tells me there is some
>>>>>> watermarking of dirty pages going on that is not particularly
>>>>>> efficient when the dirty pages ratio is high compared to total
>>>>>> memory, but I may be completely incorrect.
>>>>>
>>>>> You can change the ratio IIRC.
>>>>> Hopefully, someone who knows QEMU well will tell you better ways.
>>>>>
>>>>> Takuya
>>>>>
>>>>>> Could anybody fill me in on what might be going on here? We're using
>>>>>> libvirt 0.8.2 and kvm-83-224.el5.centos.1
>>>>
>>>> This is a pretty old qemu/kvm code base.
>>>> In principle, it makes no sense that migration finishes with 32GB RAM
>>>> but never with 4GB RAM (intuitively, if anything, it should be the
>>>> other way around).
>>>>
>>>> Do you have an easy test that makes the problem easily reproducible?
>>>> Have you tried upstream qemu.git? (There have been some improvements
>>>> in that department.)
>>>
>>> I've just tried the qemu-kvm 0.14.1 tag, which seems to be the latest
>>> that builds on my platform. For some strange reason migrations always
>>> seem to fail in one direction with "Unknown savevm section or instance
>>> 'hpet' 0" messages.
>>
>> What is your platform? This seems like you are running with hpet on one
>> side, but without it on the other. What command line are you using?
>
> Yes, my mistake. We were also testing later kernels and my test machines
> managed to get out of sync. One had support for the hpet clocksource but
> the other one didn't.
>
>>> This seems to point to different migration protocols on either end, but
>>> they are both running the same version of qemu-kvm I built. Does this
>>> ring any bells for anyone?
>>
>> Command line mismatch. But, what is your platform?
>
> CentOS 5.6. Now running the VMs through qemu-kvm 0.14.1, unloaded
> migrations take about half the time, but with memory I/O load neither VM
> ever completes the migration. In practical terms I'm writing about 50MB/s
> into memory and we have a 10Gbps network (and I've seen real speeds of up
> to 8-9Gbps on the wire), so there should be enough capacity to sync up
> the dirty pages.
>
> So now the 32GB and 4GB VMs have matching behaviour (which makes more
> sense), but I'm not any closer to figuring out what is going on.

Did you modify the max downtime? The default is 30 ms.
At 8 Gbps, this only allows about 30 MB of data to be sent on the wire.

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/
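One way to check whether the downtime threshold is what blocks convergence is to watch the remaining RAM while the migration runs; if it settles around the size of the dirty working set instead of shrinking towards zero, the guest is never paused for the final pass. Roughly, and assuming these are available in this qemu-kvm/libvirt combination ("myvm" again being a placeholder):

    # QEMU monitor: shows transferred / remaining / total RAM for the
    # migration in progress
    (qemu) info migrate

    # libvirt view of the same job, while the migration is active
    virsh domjobinfo myvm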