On Wed, Sep 04, 2024 at 11:07:44PM +0200, David Hildenbrand wrote:
> On 04.09.24 21:16, Peter Xu wrote:
> > Zhiyi reported an infinite loop issue in VFIO use case.  The cause of that
> > was a separate discussion, however during that I found a regression of
> > dirty sync slowness when profiling.
> > 
> > Each KVMMemoryListerner maintains an array of kvm memslots.  Currently it's
> > statically allocated to be the max supported by the kernel.  However after
> > Linux commit 4fc096a99e ("KVM: Raise the maximum number of user memslots"),
> > the max supported memslots reported now grows to some number large enough
> > so that it may not be wise to always statically allocate with the max
> > reported.
> > 
> > What's worse, QEMU kvm code still walks all the allocated memslots entries
> > to do any form of lookups.  It can drastically slow down all memslot
> > operations because each of such loop can run over 32K times on the new
> > kernels.
> > 
> > Fix this issue by making the memslots to be allocated dynamically.
> > 
> > Here the initial size was set to 16 because it should cover the basic VM
> > usages, so that the hope is the majority VM use case may not even need to
> > grow at all (e.g. if one starts a VM with ./qemu-system-x86_64 by default
> > it'll consume 9 memslots), however not too large to waste memory.
> > 
> > There can also be even better way to address this, but so far this is the
> > simplest and should be already better even than before we grow the max
> > supported memslots.  For example, in the case of above issue when VFIO was
> > attached on a 32GB system, there are only ~10 memslots used.  So it could
> > be good enough as of now.
> > 
> > In the above VFIO context, measurement shows that the precopy dirty sync
> > shrinked from ~86ms to ~3ms after this patch applied.  It should also apply
> > to any KVM enabled VM even without VFIO.
> > 
> > Reported-by: Zhiyi Guo <zh...@redhat.com>
> > Tested-by: Zhiyi Guo <zh...@redhat.com>
> > Signed-off-by: Peter Xu <pet...@redhat.com>
> > ---
> 
> 
> 
> >   {
> >       int i;
> > -    kml->slots = g_new0(KVMSlot, s->nr_slots_max);
> >       kml->as_id = as_id;
> > -    for (i = 0; i < s->nr_slots_max; i++) {
> > -        kml->slots[i].slot = i;
> > -    }
> > +    kvm_slots_grow(kml, KVM_MEMSLOTS_NUM_ALLOC_DEFAULT);
> 
> I would just keep the static initialization here, and add the additional
> 
>       kml->nr_slots_allocated = KVM_MEMSLOTS_NUM_ALLOC_DEFAULT;

IMHO it'll be cleaner to always allocate in the grow() so as to avoid
details on e.g. initializations of kml->slots[].slot above.

> 
> here.
> 
> Then, you can remove the parameter from kvm_slots_grow() completely and just 
> call it
> kvm_slots_double() and simplify a bit:
> 
> static bool kvm_slots_double(KVMMemoryListener *kml)
> {
>     unsigned int i, nr_slots_new, cur = kml->nr_slots_allocated;
>     KVMSlot *slots;
> 
>     nr_slots_new = MIN(cur * 2, kvm_state->nr_slots_max);
>     if (nr_slots_new == kvm_state->nr_slots_max) {
>         /* We reached the maximum */
>       return false;
>     }
> 
>     assert(kml->slots);
>     slots = g_renew(KVMSlot, kml->slots, nr_slots_new);
>     /*
>      * g_renew() doesn't initialize extended buffers, however kvm
>      * memslots require fields to be zero-initialized. E.g. pointers,
>      * memory_size field, etc.
>      */
>     memset(&slots[cur], 0x0, sizeof(slots[0]) * (nr_slots_new - cur));
> 
>     for (i = cur; i < nr_slots_new; i++) {
>         slots[i].slot = i;
>     }
> 
>     kml->slots = slots;
>     kml->nr_slots_allocated = nr_slots_new;
>     trace_kvm_slots_grow(cur, nr_slots_new);
> 
>     return true;
> }

Personally I still think it cleaner to allow setting whatever size.

We only have one place growing so far, which is pretty trivial to double
there, IMO.  I'll wait for a second opinion, or let me know if you have
strong feelings..

> 
> 
> Apart from that looks sane. On the slot freeing/allocation path, there is 
> certainly
> more optimization potential :)
> 
> I'm surprised this 32k loop wasn't found earlier.

Yes, it's in the range where it isn't too big to be discovered I guess, but
large enough to affect many things, so better fix it sooner than later.

This reminded me we should probably copy stable for this patch.  I think it
means I'll try to move this patch to the 1st patch to make Michael's life
and downstream easier.

-- 
Peter Xu


Reply via email to