On Fri, Sep 06, 2024 at 12:54:37PM +0200, Juraj Marcin wrote:
> Hi Peter,
> 
> On Thu, Sep 5, 2024 at 6:00 PM Peter Xu <pet...@redhat.com> wrote:
> >
> > On Thu, Sep 05, 2024 at 05:32:46PM +0200, Juraj Marcin wrote:
> > > Hi Peter,
> >
> > Hi, Juraj,
> >
> > [...]
> >
> > > >  unsigned int kvm_get_max_memslots(void)
> > > >  {
> > > >      KVMState *s = KVM_STATE(current_accel());
> > > > @@ -193,15 +247,20 @@ unsigned int kvm_get_free_memslots(void)
> > > >  /* Called with KVMMemoryListener.slots_lock held */
> > > >  static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
> > > >  {
> > > > -    KVMState *s = kvm_state;
> > > >      int i;
> > > >
> > > > -    for (i = 0; i < s->nr_slots; i++) {
> > > > +retry:
> > > > +    for (i = 0; i < kml->nr_slots_allocated; i++) {
> > > >          if (kml->slots[i].memory_size == 0) {
> > > >              return &kml->slots[i];
> > > >          }
> > > >      }
> > > >
> > > > +    /* If no free slots, try to grow first by doubling */
> > > > +    if (kvm_slots_double(kml)) {
> > > > +        goto retry;
> > >
> > > At this point we know all previously allocated slots were used and
> > > there should be a free slot just after the last used slot (at the
> > > start of the region zeroed in the grow function). Wouldn't it be
> > > faster to return it here right away, instead of iterating through
> > > slots that should still be used again?
> >
> > Good question.
> >
> > One trivial concern is we'll then have assumption on how kvm_slots_double()
> > behaves, e.g., it must not move anything around inside, and we need to know
> 
> > that it touches nr_slots_allocated so we need to cache it.  The outcome
> > looks like this:
> >
> > ===8<===
> > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > index 020fd16ab8..7429fe87a8 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -249,9 +249,9 @@ unsigned int kvm_get_free_memslots(void)
> >  /* Called with KVMMemoryListener.slots_lock held */
> >  static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
> >  {
> > +    unsigned int n;
> >      int i;
> >
> > -retry:
> >      for (i = 0; i < kml->nr_slots_allocated; i++) {
> >          if (kml->slots[i].memory_size == 0) {
> >              return &kml->slots[i];
> > @@ -259,8 +259,13 @@ retry:
> >      }
> >
> >      /* If no free slots, try to grow first by doubling */
> > +    n = kml->nr_slots_allocated;
> >      if (kvm_slots_double(kml)) {
> > -        goto retry;
> > +        /*
> > +         * If succeed, we must have n used slots, then followed by n free
> > +         * slots.
> > +         */
> > +        return &kml->slots[n];
> >      }
> >
> >      return NULL;
> > ===8<===
> >
> > It's still good to get rid of "goto", and faster indeed.  Though I wished
> > we don't need those assumptions, as cons.
> >
> > One thing to mention that I expect this is extremely slow path, where I
> > don't expect to even be reached in major uses of QEMU, and when reached
> > should be only once or limited few times per VM life cycle.  The re-walks
> > here shouldn't be a perf concern IMHO, because when it's a concern we'll
> > hit it much more frequently elsewhere... many other hotter paths around.
> >
> > So far it looks slightly more readable to me to keep the old way, but I'm
> > ok either way.  What do you think?
> 
> I agree that it requires this assumption of not moving slots around,
> but I think it's intuitive to assume it when it comes to
> doubling/increasing the size of an array, realloc() and g_renew() also
> don't shuffle existing elements.
> 
> In addition, there already is such an assumption. If slots were moved
> around, pointers returned by `return &kml->slots[i];` wouldn't point
> to the same slot structure after doubling.
> 
> However, I realized there's also another problem with this return
> statement. g_renew() could have moved the whole array to a new
> address, making all the previously returned pointers invalid. This
> could be solved by either adding another layer of indirection, so the
> function returns a pointer to a single slot structure that never moves
> and the array contains pointers to these structures, or the slots need
> to be always accessed through an up-to-date pointer to the array,
> probably from another structure or through a getter function. With the
> first approach, pointers in the array could shuffle, but with the
> second one, the index of a slot must not change during the lifetime of
> the slot, keeping the assumption correct.

Note that all access to kvm slots are protected by kvm_slots_lock() which
is currently a mutex (aka, non-RCU), and we can't cache slot yet so far
because we don't know whether there's concurrent update.

Thanks,

-- 
Peter Xu


Reply via email to