> -----Original Message-----
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Tuesday, August 14, 2018 9:07 PM
> To: Zhoujian (jay) <jianjay.z...@huawei.com>; Dr. David Alan Gilbert
> <dgilb...@redhat.com>
> Cc: qemu-devel@nongnu.org; quint...@redhat.com; wangxin (U)
> <wangxinxin.w...@huawei.com>
> Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires
> 
> On 14/08/2018 15:03, Zhoujian (jay) wrote:
> >> -----Original Message-----
> >> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> >> Sent: Tuesday, August 14, 2018 8:02 PM
> >> To: Dr. David Alan Gilbert <dgilb...@redhat.com>
> >> Cc: Zhoujian (jay) <jianjay.z...@huawei.com>; qemu-devel@nongnu.org;
> >> quint...@redhat.com; wangxin (U) <wangxinxin.w...@huawei.com>
> >> Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires
> >>
> >> On 14/08/2018 13:52, Dr. David Alan Gilbert wrote:
> >>>  a) Should the watchdog expire when the VM is stopped; I think it
> >>> shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so
> >>> is the bug here that the watchdog being used didn't use a virtual timer?
> >>
> >> All watchdogs do.
> >>
> >>>  b) If the watchdog expires just before the VM gets stopped, is
> >>> there a race which could hit this?  Possibly.
> >>
> >> Yes, I think it is a race that happens just before vm_stop, but I
> >> don't understand why the "qemu_clock_enable" in pause_all_vcpus does not
> prevent it.
> >
> > Hi Paolo,
> > The sequence is like this I think
> >
> >          |
> >          |  <-----  watchdog expired, which set reset_requested to
> SHUTDOWN_CAUSE_GUEST_RESET
> >          |
> >          |  <-----  migration thread sets to RUN_STATE_FINISH_MIGRATE, it
> will disable QEMU_CLOCK_VIRTUAL clock,
> >          |          but it is done after the setting of reset_requested
> 
> So the fix would be to process the reset request here?  (In do_vm_stop or
> pause_all_vcpus).  The code is currently in main_loop_should_exit().

I think this makes sense, process the reset request after the QEMU_CLOCK_VIRTUAL
disabled, will have a try.

Regards,
Jay Zhou

> 
> Paolo
> 
> >          |  <-----  main loop thread sets to RUN_STATE_PRELAUNCH since it
> detected a reset request
> >          |
> >          |  <-----  migration thread sets to RUN_STATE_POSTMIGRATE
> >
> >
> > Regards,
> > Jay Zhou
> >
> >>
> >> It should be possible to write a deterministic testcase with qtest...
> >>
> >> Paolo

Reply via email to