Hi On Tue, Nov 16, 2021 at 7:22 AM Peter Xu <pet...@redhat.com> wrote: > > When finalizing the dump-guest-memory with detached mode, we'll first set dump > status to either FAIL or COMPLETE before doing the cleanup, however right > after > the dump status change it's possible that another dump-guest-memory qmp > command > is sent so both the main thread and dump thread (which is during cleanup) > could > be accessing dump state in paralell. That's racy.
parallel > > Fix it by protecting the finalizing phase of dump-guest-memory using BQL as > well. To do that, we expand the BQL from dump_cleanup() into dump_process(), > so we will take the BQL longer than before. The critical section must cover > the status switch from ACTIVE->{FAIL|COMPLETE} so as to make sure there's no > race any more. > > We can also just introduce a specific mutex to serialize the dump process, but > BQL should be enough for now so far, not to mention vm_start() in dump_cleanup > will need BQL anyway, so maybe easier to just use the same mutex. > > Reported-by: Laszlo Ersek <ler...@redhat.com> > Signed-off-by: Peter Xu <pet...@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lur...@redhat.com> > --- > dump/dump.c | 24 ++++++++++++++++++------ > 1 file changed, 18 insertions(+), 6 deletions(-) > > diff --git a/dump/dump.c b/dump/dump.c > index 662d0a62cd..196b7b8ab9 100644 > --- a/dump/dump.c > +++ b/dump/dump.c > @@ -96,13 +96,7 @@ static int dump_cleanup(DumpState *s) > g_free(s->guest_note); > s->guest_note = NULL; > if (s->resume) { > - if (s->detached) { > - qemu_mutex_lock_iothread(); > - } > vm_start(); > - if (s->detached) { > - qemu_mutex_unlock_iothread(); > - } > } > migrate_del_blocker(dump_migration_blocker); > > @@ -1873,6 +1867,11 @@ static void dump_process(DumpState *s, Error **errp) > Error *local_err = NULL; > DumpQueryResult *result = NULL; > > + /* > + * When running with detached mode, these operations are not run with > BQL. > + * It's still safe, because it's protected by setting s->state to ACTIVE, > + * so dump_in_progress() check will block yet another dump-guest-memory. > + */ > if (s->has_format && s->format == DUMP_GUEST_MEMORY_FORMAT_WIN_DMP) { > #ifdef TARGET_X86_64 > create_win_dump(s, &local_err); > @@ -1883,6 +1882,15 @@ static void dump_process(DumpState *s, Error **errp) > create_vmcore(s, &local_err); > } > > + /* > + * Serialize the finalizing of dump process using BQL to make sure no > + * concurrent access to DumpState is allowed. BQL is also required for > + * dump_cleanup as vm_start() needs it. > + */ > + if (s->detached) { > + qemu_mutex_lock_iothread(); > + } > + > /* make sure status is written after written_size updates */ > smp_wmb(); > qatomic_set(&s->status, > @@ -1898,6 +1906,10 @@ static void dump_process(DumpState *s, Error **errp) > > error_propagate(errp, local_err); > dump_cleanup(s); > + > + if (s->detached) { > + qemu_mutex_unlock_iothread(); > + } > } > > static void *dump_thread(void *data) > -- > 2.32.0 >