On Thu, Sep 05, 2024 at 05:32:46PM +0200, Juraj Marcin wrote: > Hi Peter,
Hi, Juraj, [...] > > unsigned int kvm_get_max_memslots(void) > > { > > KVMState *s = KVM_STATE(current_accel()); > > @@ -193,15 +247,20 @@ unsigned int kvm_get_free_memslots(void) > > /* Called with KVMMemoryListener.slots_lock held */ > > static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml) > > { > > - KVMState *s = kvm_state; > > int i; > > > > - for (i = 0; i < s->nr_slots; i++) { > > +retry: > > + for (i = 0; i < kml->nr_slots_allocated; i++) { > > if (kml->slots[i].memory_size == 0) { > > return &kml->slots[i]; > > } > > } > > > > + /* If no free slots, try to grow first by doubling */ > > + if (kvm_slots_double(kml)) { > > + goto retry; > > At this point we know all previously allocated slots were used and > there should be a free slot just after the last used slot (at the > start of the region zeroed in the grow function). Wouldn't it be > faster to return it here right away, instead of iterating through > slots that should still be used again? Good question. One trivial concern is we'll then have assumption on how kvm_slots_double() behaves, e.g., it must not move anything around inside, and we need to know that it touches nr_slots_allocated so we need to cache it. The outcome looks like this: ===8<=== diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 020fd16ab8..7429fe87a8 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -249,9 +249,9 @@ unsigned int kvm_get_free_memslots(void) /* Called with KVMMemoryListener.slots_lock held */ static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml) { + unsigned int n; int i; -retry: for (i = 0; i < kml->nr_slots_allocated; i++) { if (kml->slots[i].memory_size == 0) { return &kml->slots[i]; @@ -259,8 +259,13 @@ retry: } /* If no free slots, try to grow first by doubling */ + n = kml->nr_slots_allocated; if (kvm_slots_double(kml)) { - goto retry; + /* + * If succeed, we must have n used slots, then followed by n free + * slots. + */ + return &kml->slots[n]; } return NULL; ===8<=== It's still good to get rid of "goto", and faster indeed. Though I wished we don't need those assumptions, as cons. One thing to mention that I expect this is extremely slow path, where I don't expect to even be reached in major uses of QEMU, and when reached should be only once or limited few times per VM life cycle. The re-walks here shouldn't be a perf concern IMHO, because when it's a concern we'll hit it much more frequently elsewhere... many other hotter paths around. So far it looks slightly more readable to me to keep the old way, but I'm ok either way. What do you think? Thanks, -- Peter Xu