Hi, Yong, On Mon, Nov 21, 2022 at 11:26:35AM -0500, huang...@chinatelecom.cn wrote: > From: Hyman Huang(黄勇) <huang...@chinatelecom.cn> > > When tested large vcpu size vm with dirtylimit feature, Qemu crashed > due to the assertion in kvm_dirty_ring_reap_one, which assert that > vcpu's kvm_dirty_gfns has been allocated and not NULL. > > Because dirty ring reaper thread races with Qemu main thread, reaper > may reap vcpu's dirty ring buffer when main thread doesn't complete > vcpu instantiation. So add the waiting logic in reaper thread and > start to reap until vcpu instantiation is completed. > > Signed-off-by: Hyman Huang(黄勇) <huang...@chinatelecom.cn> > --- > accel/kvm/kvm-all.c | 36 ++++++++++++++++++++++++++++++++++++ > 1 file changed, 36 insertions(+) > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > index f99b0be..9457715 100644 > --- a/accel/kvm/kvm-all.c > +++ b/accel/kvm/kvm-all.c > @@ -1401,6 +1401,35 @@ out: > kvm_slots_unlock(); > } > > +/* > + * test if dirty ring has been initialized by checking if vcpu > + * has been initialized and gfns was allocated correspondlingly. > + * return true if dirty ring has been initialized, false otherwise. > + */ > +static bool kvm_vcpu_dirty_ring_initialized(void) > +{ > + CPUState *cpu; > + MachineState *ms = MACHINE(qdev_get_machine()); > + int ncpus = ms->smp.cpus; > + > + /* > + * assume vcpu has not been initilaized if generation > + * id less than number of vcpu > + */ > + if (ncpus > cpu_list_generation_id_get()) { > + return false; > + } > + > + CPU_FOREACH(cpu) { > + if (!cpu->kvm_dirty_gfns) { > + return false; > + } > + } > + > + return true; > +} > + > + > static void *kvm_dirty_ring_reaper_thread(void *data) > { > KVMState *s = data; > @@ -1410,6 +1439,13 @@ static void *kvm_dirty_ring_reaper_thread(void *data) > > trace_kvm_dirty_ring_reaper("init"); > > +retry: > + /* don't allow reaping dirty ring if ring buffer hasn't been mapped */ > + if (!kvm_vcpu_dirty_ring_initialized()) { > + sleep(1);
The sleep here is probably not necessary. Could you instead have a look at the other much simpler patch? Here: https://lore.kernel.org/qemu-devel/20220927154653.77296-1-pet...@redhat.com/ > + goto retry; > + } > + > while (true) { > r->reaper_state = KVM_DIRTY_RING_REAPER_WAIT; > trace_kvm_dirty_ring_reaper("wait"); > -- > 1.8.3.1 > > -- Peter Xu