> -----Original Message----- > From: Paolo Bonzini [mailto:pbonz...@redhat.com] > Sent: Tuesday, August 14, 2018 9:07 PM > To: Zhoujian (jay) <jianjay.z...@huawei.com>; Dr. David Alan Gilbert > <dgilb...@redhat.com> > Cc: qemu-devel@nongnu.org; quint...@redhat.com; wangxin (U) > <wangxinxin.w...@huawei.com> > Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > > On 14/08/2018 15:03, Zhoujian (jay) wrote: > >> -----Original Message----- > >> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > >> Sent: Tuesday, August 14, 2018 8:02 PM > >> To: Dr. David Alan Gilbert <dgilb...@redhat.com> > >> Cc: Zhoujian (jay) <jianjay.z...@huawei.com>; qemu-devel@nongnu.org; > >> quint...@redhat.com; wangxin (U) <wangxinxin.w...@huawei.com> > >> Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > >> > >> On 14/08/2018 13:52, Dr. David Alan Gilbert wrote: > >>> a) Should the watchdog expire when the VM is stopped; I think it > >>> shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so > >>> is the bug here that the watchdog being used didn't use a virtual timer? > >> > >> All watchdogs do. > >> > >>> b) If the watchdog expires just before the VM gets stopped, is > >>> there a race which could hit this? Possibly. > >> > >> Yes, I think it is a race that happens just before vm_stop, but I > >> don't understand why the "qemu_clock_enable" in pause_all_vcpus does not > prevent it. > > > > Hi Paolo, > > The sequence is like this I think > > > > | > > | <----- watchdog expired, which set reset_requested to > SHUTDOWN_CAUSE_GUEST_RESET > > | > > | <----- migration thread sets to RUN_STATE_FINISH_MIGRATE, it > will disable QEMU_CLOCK_VIRTUAL clock, > > | but it is done after the setting of reset_requested > > So the fix would be to process the reset request here? (In do_vm_stop or > pause_all_vcpus). The code is currently in main_loop_should_exit().
I think this makes sense, process the reset request after the QEMU_CLOCK_VIRTUAL disabled, will have a try. Regards, Jay Zhou > > Paolo > > > | <----- main loop thread sets to RUN_STATE_PRELAUNCH since it > detected a reset request > > | > > | <----- migration thread sets to RUN_STATE_POSTMIGRATE > > > > > > Regards, > > Jay Zhou > > > >> > >> It should be possible to write a deterministic testcase with qtest... > >> > >> Paolo