On Fri, Sep 15, 2017 at 04:20:26PM +0800, Fam Zheng wrote: [...]
> > > void qmp_migrate_set_cache_size(int64_t value, Error **errp) > > > { > > > MigrationState *s = migrate_get_current(); > > > diff --git a/vl.c b/vl.c > > > index fb1f05b937..abbe61f40b 100644 > > > --- a/vl.c > > > +++ b/vl.c > > > @@ -87,6 +87,7 @@ int main(int argc, char **argv) > > > #include "sysemu/blockdev.h" > > > #include "hw/block/block.h" > > > #include "migration/misc.h" > > > +#include "migration/savevm.h" > > > #include "migration/snapshot.h" > > > #include "migration/global_state.h" > > > #include "sysemu/tpm.h" > > > @@ -4799,6 +4800,8 @@ int main(int argc, char **argv, char **envp) > > > iothread_stop_all(); > > > > > > pause_all_vcpus(); > > > + migrate_cancel(); > > > > IIUC this is an async cancel, so when reach here the migration thread > > can still be alive. Then... > > > > > + qemu_savevm_state_cleanup(); > > > > ... Here calling qemu_savevm_state_cleanup() may be problematic if > > migration thread has not yet quitted. > > > > I'm thinking whether we should make migrate_fd_cancel() wait until the > > migration thread finishes (state change to CANCELLED). Then the > > migration thread will do the cleanup, and here we can avoid calling > > qemu_savevm_state_cleanup() as well. > > But if the migration thread is stuck and CANCELLED is never reached, we'll > hang > here? Maybe we can add timeout. Anyway, if it gets stuck, I do see it a migration bug, and in that case I'm not sure force return after timeout would be anything wiser. Dave/Juan may have better idea though. -- Peter Xu