Re: [Qemu-devel] [RFC PATCH RDMA support v4: 03/10] more verbose documentation of the RDMA transport

Michael S. Tsirkin Wed, 20 Mar 2013 23:12:04 -0700

On Tue, Mar 19, 2013 at 01:49:34PM -0400, Michael R. Hines wrote:
> I also did a test using RDMA + cgroup, and the kernel killed my QEMU :)
> 
> So, infiniband is not smart enough to know how to avoid pinning a
> zero page, I guess.
> 
> - Michael
> 
> On 03/19/2013 01:14 PM, Paolo Bonzini wrote:
> >On 19/03/2013 18:09, Michael R. Hines wrote:
> >>Allowing QEMU to swap due to a cgroup limit during migration is a viable
> >>overcommit option?
> >>
> >>I'm trying to keep an open mind, but that would kill the migration
> >>time.....
> >Would it swap?  Doesn't the kernel back all zero pages with a single
> >copy-on-write page?  If that still accounts towards cgroup limits, it
> >would be a bug.
> >
> >Old kernels do not have a shared zero hugepage, and that includes some
> >distro kernels.  Perhaps that's the problem.
> >
> >Paolo
> >


It really shouldn't break COW if you don't request LOCAL_WRITE.
I think it's a kernel bug, and apparently it has been in the code since the
first version: the get_user_pages parameters are swapped.

I'll send a patch. If it's applied, you should also
change your code from

+                                IBV_ACCESS_LOCAL_WRITE |
+                                IBV_ACCESS_REMOTE_WRITE |
+                                IBV_ACCESS_REMOTE_READ);

to

+                                IBV_ACCESS_REMOTE_READ);

on the send side.
Then, each time we detect that a page has changed, we must make sure to
unregister and re-register it. Or, if you want to be very
smart, check whether the PFN changed and re-register only
if it did.

This will make overcommit work.
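
A rough sketch of the "re-register only if the PFN changed" idea follows.
It is not QEMU or libibverbs code: struct page_reg, page_register(),
page_unregister(), page_refresh() and fake_lookup() are hypothetical names
made up for illustration. In a real sender the register and unregister
steps would presumably be ibv_reg_mr() with only IBV_ACCESS_REMOTE_READ
and ibv_dereg_mr(), and how the PFN is obtained is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page bookkeeping; not a libibverbs structure. */
struct page_reg {
    void *addr;          /* start of the registered page */
    uint64_t pfn;        /* frame number seen at registration time */
    int registered;      /* non-zero while a registration is live */
    unsigned reg_count;  /* registrations done so far (illustration only) */
};

/* Assumed callback that reports the current PFN backing an address. */
typedef uint64_t (*pfn_lookup_fn)(void *addr);

/* Fake lookup so the sketch is self-contained; a test sets fake_pfn. */
static uint64_t fake_pfn;
static uint64_t fake_lookup(void *addr)
{
    (void)addr;
    return fake_pfn;
}

/* Stand-in for ibv_reg_mr(pd, addr, len, IBV_ACCESS_REMOTE_READ). */
static void page_register(struct page_reg *r, void *addr, pfn_lookup_fn lookup)
{
    assert(r != NULL && lookup != NULL);
    r->addr = addr;
    r->pfn = lookup(addr);
    r->registered = 1;
    r->reg_count++;
}

/* Stand-in for ibv_dereg_mr(mr). */
static void page_unregister(struct page_reg *r)
{
    assert(r != NULL);
    r->registered = 0;
}

/*
 * Call when the migration code detects that the page has changed.
 * If the page is still backed by the same frame, the existing
 * registration is kept. Otherwise (for example, COW was broken and the
 * kernel gave the page a new frame) it is dropped and redone.
 * Returns 1 if the page was re-registered, 0 if nothing was done.
 */
static int page_refresh(struct page_reg *r, pfn_lookup_fn lookup)
{
    assert(r != NULL && lookup != NULL);
    if (r->registered && lookup(r->addr) == r->pfn)
        return 0;
    if (r->registered)
        page_unregister(r);
    page_register(r, r->addr, lookup);
    return 1;
}
```

The simpler variant is to skip the PFN comparison and always
unregister and re-register a changed page, at the cost of extra
registrations.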

-- 
MST
