Hi Peter,

On Thu, Sep 5, 2024 at 6:00 PM Peter Xu <pet...@redhat.com> wrote:
>
> On Thu, Sep 05, 2024 at 05:32:46PM +0200, Juraj Marcin wrote:
> > Hi Peter,
>
> Hi, Juraj,
>
> [...]
>
> > >  unsigned int kvm_get_max_memslots(void)
> > >  {
> > >      KVMState *s = KVM_STATE(current_accel());
> > > @@ -193,15 +247,20 @@ unsigned int kvm_get_free_memslots(void)
> > >  /* Called with KVMMemoryListener.slots_lock held */
> > >  static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
> > >  {
> > > -    KVMState *s = kvm_state;
> > >      int i;
> > >
> > > -    for (i = 0; i < s->nr_slots; i++) {
> > > +retry:
> > > +    for (i = 0; i < kml->nr_slots_allocated; i++) {
> > >          if (kml->slots[i].memory_size == 0) {
> > >              return &kml->slots[i];
> > >          }
> > >      }
> > >
> > > +    /* If no free slots, try to grow first by doubling */
> > > +    if (kvm_slots_double(kml)) {
> > > +        goto retry;
> >
> > At this point we know all previously allocated slots were used and
> > there should be a free slot just after the last used slot (at the
> > start of the region zeroed in the grow function). Wouldn't it be
> > faster to return it here right away, instead of iterating through
> > slots that should still be used again?
>
> Good question.
>
> One trivial concern is we'll then have assumption on how kvm_slots_double()
> behaves, e.g., it must not move anything around inside, and we need to know

> that it touches nr_slots_allocated so we need to cache it.  The outcome
> looks like this:
>
> ===8<===
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 020fd16ab8..7429fe87a8 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -249,9 +249,9 @@ unsigned int kvm_get_free_memslots(void)
>  /* Called with KVMMemoryListener.slots_lock held */
>  static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
>  {
> +    unsigned int n;
>      int i;
>
> -retry:
>      for (i = 0; i < kml->nr_slots_allocated; i++) {
>          if (kml->slots[i].memory_size == 0) {
>              return &kml->slots[i];
> @@ -259,8 +259,13 @@ retry:
>      }
>
>      /* If no free slots, try to grow first by doubling */
> +    n = kml->nr_slots_allocated;
>      if (kvm_slots_double(kml)) {
> -        goto retry;
> +        /*
> +         * If succeed, we must have n used slots, then followed by n free
> +         * slots.
> +         */
> +        return &kml->slots[n];
>      }
>
>      return NULL;
> ===8<===
>
> It's still good to get rid of "goto", and faster indeed.  Though I wished
> we don't need those assumptions, as cons.
>
> One thing to mention that I expect this is extremely slow path, where I
> don't expect to even be reached in major uses of QEMU, and when reached
> should be only once or limited few times per VM life cycle.  The re-walks
> here shouldn't be a perf concern IMHO, because when it's a concern we'll
> hit it much more frequently elsewhere... many other hotter paths around.
>
> So far it looks slightly more readable to me to keep the old way, but I'm
> ok either way.  What do you think?

I agree that it requires this assumption of not moving slots around,
but I think it's intuitive to assume it when it comes to
doubling/increasing the size of an array, realloc() and g_renew() also
don't shuffle existing elements.

In addition, there already is such an assumption. If slots were moved
around, pointers returned by `return &kml->slots[i];` wouldn't point
to the same slot structure after doubling.

However, I realized there's also another problem with this return
statement. g_renew() could have moved the whole array to a new
address, making all the previously returned pointers invalid. This
could be solved by either adding another layer of indirection, so the
function returns a pointer to a single slot structure that never moves
and the array contains pointers to these structures, or the slots need
to be always accessed through an up-to-date pointer to the array,
probably from another structure or through a getter function. With the
first approach, pointers in the array could shuffle, but with the
second one, the index of a slot must not change during the lifetime of
the slot, keeping the assumption correct.

>
> Thanks,
>
> --
> Peter Xu
>


-- 

Juraj Marcin


Reply via email to