[Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-01 Thread Peter Maydell
I've been investigating a race condition where sometimes when my guest writes to a device register which triggers a qemu_system_reset_request(), it doesn't actually cause a clean reset, but instead the guest CPU continues to execute instructions. I managed to repro it under 'rr', which let me walk

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-01 Thread Alex Bennée
Peter Maydell writes:
> I've been investigating a race condition where sometimes when my
> guest writes to a device register which triggers a
> qemu_system_reset_request(), it doesn't actually cause a clean reset,
> but instead the guest CPU continues to execute instructions.
> I managed to rep

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Peter Maydell
On 1 October 2018 at 19:12, Alex Bennée wrote:
> I would have thought the reset code should be scheduled via safe async
> work to run in the vCPU context. Why should the main loop get involved
> at all here?

The reset code is much older than the safe-async support for running things in the vCPU c

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Paolo Bonzini
On 02/10/2018 10:01, Peter Maydell wrote:
> On 1 October 2018 at 19:12, Alex Bennée wrote:
>> I would have thought the reset code should be scheduled via safe async
>> work to run in the vCPU context. Why should the main loop get involved
>> at all here?
> The reset code is much older than the saf

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Peter Maydell
On 2 October 2018 at 09:58, Paolo Bonzini wrote:
>
> First, the reset code should indeed use run_on_cpu (it need not be safe
> i.e. stop-the-world; just run it in the vCPU thread). It certainly
> doesn't do this right now.

I don't understand this part. We're resetting the entire world: surely we

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Paolo Bonzini
On 02/10/2018 11:04, Peter Maydell wrote:
> On 2 October 2018 at 09:58, Paolo Bonzini wrote:
>>
>> First, the reset code should indeed use run_on_cpu (it need not be safe
>> i.e. stop-the-world; just run it in the vCPU thread). It certainly
>> doesn't do this right now.
>
> I don't understand th

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Alex Bennée
Peter Maydell writes:
> On 1 October 2018 at 19:12, Alex Bennée wrote:
>> I would have thought the reset code should be scheduled via safe async
>> work to run in the vCPU context. Why should the main loop get involved
>> at all here?
>
> The reset code is much older than the safe-async suppor

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Peter Maydell
On 2 October 2018 at 11:00, Alex Bennée wrote:
>
> Peter Maydell writes:
>
>> On 1 October 2018 at 19:12, Alex Bennée wrote:
>>> I would have thought the reset code should be scheduled via safe async
>>> work to run in the vCPU context. Why should the main loop get involved
>>> at all here?
>>
>

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Peter Maydell
On 2 October 2018 at 10:59, Paolo Bonzini wrote:
> On 02/10/2018 11:04, Peter Maydell wrote:
>> On 2 October 2018 at 09:58, Paolo Bonzini wrote:
>>>
>>> First, the reset code should indeed use run_on_cpu (it need not be safe
>>> i.e. stop-the-world; just run it in the vCPU thread). It certainly

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Paolo Bonzini
On 02/10/2018 12:34, Peter Maydell wrote:
> On 2 October 2018 at 10:59, Paolo Bonzini wrote:
>> On 02/10/2018 11:04, Peter Maydell wrote:
>>> On 2 October 2018 at 09:58, Paolo Bonzini wrote:
>>>> First, the reset code should indeed use run_on_cpu (it need not be safe
>>>> i.e. stop-the-world

Re: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()

2018-10-02 Thread Peter Maydell
On 2 October 2018 at 17:46, Paolo Bonzini wrote:
> On 02/10/2018 12:34, Peter Maydell wrote:
>> Maybe I just don't understand what you're suggesting should be
>> done via run-on-cpu. But it seems to me that the problem here
>> is that cpu_stop_current() should not call qemu_cpu_stop()
>> immediate