Re: [Qemu-devel] [PATCH v6 00/11] rdma: migration support

Chegu Vinod Thu, 09 May 2013 15:21:34 -0700

On 5/9/2013 10:20 AM, Michael R. Hines wrote:

Comments inline. FYI: please CC mrhi...@us.ibm.com,
because it helps me know when to scroll through the bazillion qemu-devel emails.
I have things separated out into folders and rules, but a direct CC is better =)


Sure will do.

On 05/03/2013 07:28 PM, Chegu Vinod wrote:
Hi Michael,
I picked up the qemu bits from your github branch and gave it a try. (BTW the setup I was given temporary access to has a pair of MLX's IB QDR cards connected back to back via QSFP cables.)
I observed a couple of things and wanted to share them. Perhaps you are aware of them already, or perhaps they are unrelated to your specific changes? (Note: I still haven't finished the review of your changes.)
a) x-rdma-pin-all off case
It seems to work only sometimes and fails at other times. Here is an example...
(qemu) rdma: Accepting rdma connection...
rdma: Memory pin all: disabled
rdma: verbs context after listen: 0x555556757d50
rdma: dest_connect Source GID: fe80::2:c903:9:53a5, Dest GID: fe80::2:c903:9:5855
rdma: Accepted migration
qemu-system-x86_64: VQ 1 size 0x100 Guest index 0x4d2 inconsistent with Host index 0x4ec: delta 0xffe6
qemu: warning: error while loading state for instance 0x0 of device 'virtio-net'
load of migration failed
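
The delta in the error above is consistent with 16-bit wraparound arithmetic on the two indices. A minimal sketch, assuming the delta is the guest index minus the host index modulo 2^16 and that the load is rejected when the delta exceeds the queue size (the function name is hypothetical and this is not QEMU's code):

```python
def vq_delta(guest_index: int, host_index: int) -> int:
    """Difference between two virtqueue indices as an unsigned 16-bit value."""
    return (guest_index - host_index) & 0xFFFF


# Values taken from the error message above.
VQ_SIZE = 0x100
delta = vq_delta(0x4D2, 0x4EC)

print(hex(delta))        # 0xffe6
print(delta > VQ_SIZE)   # True: the delta is far larger than the queue size
```

Under that assumption the host-side index is 0x1a (26) entries ahead of the index read from guest memory, which would point at stale guest memory on the destination rather than a problem in virtio-net itself.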
Can you give me more details about the configuration of your VM?

The guest is a 10-VCPU/128GB ...and nothing really that fancy with respect to storage or networking.

Hosted on a large Westmere-EX box (target is a similarly configured Westmere-EX system). There is a shared SAN disk between the two hosts. Both hosts have the 3.9-rc7 kernel that I got at that time from the kvm.git tree. The guest was also running the same kernel.


Since I was just trying it out I was not running any workload either.

On the source host, the qemu command line was:


/usr/local/bin/qemu-system-x86_64 \
-enable-kvm \
-cpu host \
-name vm1 \
-m 131072 -smp 10,sockets=1,cores=10,threads=1 \
-mem-path /dev/hugepages \
-chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait \
-drive file=/dev/libvirt_lvm3/vm1,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-monitor stdio \
-net nic,model=virtio,macaddr=52:54:00:71:01:01,netdev=nic-0 \
-netdev tap,id=nic-0,ifname=tap0,script=no,downscript=no,vhost=on \
-vnc :4

On the destination host the command line was the same as the above with the following additional arg...


-incoming x-rdma:<static private ipaddr of the IB>:<port #>

b) x-rdma-pin-all on case :
The guest is not resuming on the target host, i.e. the source host's qemu states that migration is complete but the guest is not responsive anymore... (it doesn't seem to have crashed, but it's stuck somewhere). Have you seen this behavior before? Any tips on how I could extract additional info?
Is the QEMU monitor still responsive?


They were responsive.

Can you capture a screenshot of the guest's console to see if there is a panic?


No panic on the guest's console :(

What kind of storage is attached to the VM?


Simple virtio disk hosted on a SAN disk (see the qemu command line).

Besides the list of noted restrictions/issues around having to pin all of guest memory....if the pinning is done as part of starting the migration, it ends up taking a noticeably long time for larger guests. I wonder whether that should be counted as part of the total migration time?
That's a good question: The pin-all option should not be slowing down your VM too much, as the VM should still be running before the migration_thread() actually kicks in and starts the migration.

Well, I had hoped that it would not have any serious impact, but it ended up freezing the guest...

I need more information on the configuration of your VM, guest operating system, architecture and so forth.......


Please see above.

And, as before, whether QEMU is unresponsive or whether it's the guest that has panicked.......

The guest just freezes...it doesn't panic while this pinning is in progress (i.e. after I set the capability and start the migration). After the pinning completes the guest continues to run and the migration continues...till it "completes" (as per the source host's qemu)...but I never see it resume on the target host.

Also the act of pinning all the memory seems to "freeze" the guest, e.g. for larger enterprise sized guests (say 128GB and higher) the guest is "frozen" for anywhere from nearly a minute (~50 seconds) to multiple minutes as the guest size increases...which imo kind of defeats the purpose of live guest migration.
That's bad =) There must be a bug somewhere........ the largest VM I can create on my hardware is ~16GB - so let me give that a try and try to track down the problem.
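
To put the reported freeze in perspective, here is a rough back-of-the-envelope calculation using only the numbers from this thread (a 131072 MiB guest frozen for about 50 seconds). The page sizes are assumptions: 4 KiB is the usual x86 base page and 2048 KiB a common hugepage size, and the thread does not say which hugepage size backs /dev/hugepages. The helper name is hypothetical.

```python
def pin_stats(mem_mib: int, freeze_s: float, page_kib: int) -> dict:
    """Rough pinning rate implied by a guest size and an observed freeze time."""
    pages = mem_mib * 1024 // page_kib
    return {
        "pages": pages,
        "gib_per_s": (mem_mib / 1024) / freeze_s,
        "us_per_page": freeze_s / pages * 1e6,
    }


# 131072 MiB guest (from -m 131072), ~50 s freeze (as reported above).
for page_kib in (4, 2048):
    s = pin_stats(131072, 50.0, page_kib)
    print(f"{page_kib} KiB pages: {s['pages']} pages, "
          f"{s['gib_per_s']:.2f} GiB/s, {s['us_per_page']:.2f} us/page")
```

Either way the implied rate is about 2.56 GiB/s, so a guest several times larger would be expected to freeze for minutes if the cost scales linearly with memory size.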

Ok. Perhaps running a simple test inside the guest can help observe any scheduling delays, even when you are attempting to pin a 16GB guest?
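
One possible shape for such a test, as a minimal sketch rather than an established tool: sleep in short intervals inside the guest and record every wake-up that arrives much later than requested. The function name, the default interval and the threshold are all assumptions chosen for illustration.

```python
import time


def find_stalls(interval_s=0.01, duration_s=5.0, threshold_s=0.1,
                clock=time.monotonic, sleep=time.sleep):
    """Sleep in short intervals and record wake-ups that arrive late.

    Returns a list of (seconds_since_start, seconds_late) tuples, one for
    each wake-up that overshot the requested interval by more than
    threshold_s.
    """
    stalls = []
    start = prev = clock()
    while prev - start < duration_s:
        sleep(interval_s)
        now = clock()
        overshoot = (now - prev) - interval_s
        if overshoot > threshold_s:
            stalls.append((now - start, overshoot))
        prev = now
    return stalls


# Example use inside the guest while the migration is started on the host:
#   for when, late in find_stalls(duration_s=120.0):
#       print(f"t={when:.2f}s: woke up {late * 1000:.0f} ms late")
```

A guest that is frozen for the duration of the pinning should show up as a single very large overshoot, whereas ordinary scheduling jitter should stay below the threshold.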

I would like to hear if you have already thought about any other alternatives to address this issue. For example, would it be better to pin all of the guest's memory as part of starting the guest itself? Yes, there are restrictions when we do pinning...but it can help with performance.
For such a large VM, I would definitely recommend pinning because I'm assuming you have enough processors or a large enough application to actually *use* that much memory, which would suggest that even after the bulk phase round of the migration has completed, your VM is probably going to remain pretty busy.
It's just a matter of me tracking down what's causing the freeze and fixing it........ I'll look into it right now on my machine.

Ok

---
BTW, a different (yet sort of related) topic... recently a patch went into upstream that provided an option to qemu to mlock all of guest memory:
https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03947.html .
I had no idea.......very interesting.
but when attempting to do the mlock for larger guests a lot of time is spent bringing each page into cache and clearing/zeroing it, etc.
https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg04161.html
Wow, I didn't know that either. Perhaps this is what is causing the entire QEMU process and its threads to seize up.
It may be necessary to run the pinning command *outside* of QEMU's I/O lock in a separate thread if it's really that much overhead.

I'm not really sure if the BQL is causing the freeze...but in general pinning all memory while the guest is running is perhaps not the best choice for large enterprise class guests...i.e. it's better to do it as part of the start of the guest.


Thanks a lot for pointing this out.........

BTW, a good thing to try out is to see if we can mlock the memory of a large guest (i.e. on the source and target qemu's) and migrate the guest using basic TCP over a regular 10Gig NIC.


Thanks,
Vinod

----
Note: The basic tcp based live guest migration in the same qemu version still works fine on the same hosts over a pair of non-RDMA 10Gb NICs connected back-to-back.
Acknowledged.
