Re: Lockup with --accel tcg,thread=single
Paolo Bonzini writes:

> On 01/10/19 10:42, Alex Bennée wrote:
>>
>> Paolo Bonzini writes:
>>
>>> On 30/09/19 21:20, Alex Bennée wrote:
>>>> Does seem to imply the vCPU CPUState is where the queue is. That's not
>>>> to say there shouldn't be a single work queue for thread=single.
>>>
>>> Indeed it doesn't. I confused this with commit a8efa60633 ("cpus: run
>>> work items for all vCPUs if single-threaded", 2018-11-27).
>>>
>>> Are you going to make a patch to have a single work queue, or should
>>> I?
>>
>> What's the neatest way to do it? Are you thinking of just special-casing
>> queue_work_on_cpu to use first_cpu when mttcg is not enabled?
>
> Yes, I cannot think of anything better.

And I am immediately stymied by the fact that cpus-common is a common
blob, so it can't have differentiation for SOFTMMU cases in it.

Did you have a look at:

  Date: Mon, 30 Sep 2019 17:38:15 +0100
  Message-ID: <87h84tloy0@linaro.org>

Is that too ugly?

> Paolo

--
Alex Bennée
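(For reference, the special-casing Alex describes would look roughly
like the untested sketch below. It assumes queue_work_on_cpu() can see
the qemu_tcg_mttcg_enabled() predicate from cpus.c -- exactly the
visibility problem with the cpus-common "common blob" noted above --
so read it as a shape, not a patch.)

    /* Hypothetical sketch: when running round-robin (thread=single),
     * route every work item to first_cpu so there is effectively a
     * single work queue.  Untested; qemu_tcg_mttcg_enabled() is not
     * currently visible from cpus-common.c. */
    static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
    {
        if (!qemu_tcg_mttcg_enabled()) {
            cpu = first_cpu;    /* one queue for the RR thread */
        }

        qemu_mutex_lock(&cpu->work_mutex);
        if (cpu->queued_work_first == NULL) {
            cpu->queued_work_first = wi;
        } else {
            cpu->queued_work_last->next = wi;
        }
        cpu->queued_work_last = wi;
        wi->next = NULL;
        wi->done = false;
        qemu_mutex_unlock(&cpu->work_mutex);

        qemu_cpu_kick(cpu);
    }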
Re: Lockup with --accel tcg,thread=single
On Mon, 30 Sep 2019 at 18:47, Paolo Bonzini wrote:
> On 30/09/19 17:37, Peter Maydell wrote:
> > Side note -- this use of run_on_cpu() means that we now drop
> > the iothread lock within memory_region_snapshot_and_clear_dirty(),
> > which we didn't before. This means that a vCPU thread can now
> > get in and execute an access to the device registers of
> > hw/display/vga.c, updating its state in a way I suspect that the
> > device model code is not expecting... So maybe the right answer
> > here should be to come up with a fix for the race that 9458a9a1
> > addresses that doesn't require us to drop the iothread lock in
> > memory_region_snapshot_and_clear_dirty() ? Alternatively we need
> > to audit the callers and flag in the documentation that this
> > function has the unexpected side effect of briefly dropping the
> > iothread lock.
>
> Yes, this is intended. There shouldn't be side effects other than
> possibly a one-frame flash for all callers.

This seems quite tricky to audit to me -- for instance, is the code in
vga_draw_graphic() really designed for, and expecting to cope with,
races where information it reads from s->foo after the call to
memory_region_snapshot_and_clear_dirty() might have been updated by the
guest, and so not be consistent with the information it read from
s->bar before the call and cached in a local variable? I think most
people write device code assuming that while execution is within a
function in the device model, the device won't have to deal with
possible reentrancy where another thread comes in and alters the device
state underfoot.

thanks
-- PMM
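(Schematically, the hazard Peter describes is a stale cached read
across a lock drop. The sketch below uses hypothetical types and field
names, not the real VGACommonState, purely to illustrate the shape of
the race:)

    /* Hypothetical illustration only -- not the real vga.c code.
     * 'width' is read while the iothread lock is held; the snapshot
     * call drops and re-takes that lock, giving a vCPU thread a window
     * to change the device state; 'height' is then read from the *new*
     * state and may not match the cached 'width'. */
    typedef struct DemoDisplayState {
        int width;
        int height;
    } DemoDisplayState;

    /* Stand-in for memory_region_snapshot_and_clear_dirty(), which
     * (after 9458a9a1) briefly drops the iothread lock via
     * run_on_cpu(). */
    static void snapshot_briefly_dropping_lock(DemoDisplayState *s)
    {
        /* unlock; wait for vCPUs; relock -- *s may have changed */
        (void)s;
    }

    static void demo_draw(DemoDisplayState *s)
    {
        int width = s->width;               /* cached before the call */

        snapshot_briefly_dropping_lock(s);  /* guest can update s here */

        int height = s->height;             /* may no longer pair with
                                             * the cached 'width' */
        (void)width; (void)height;
    }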
Re: Lockup with --accel tcg,thread=single
On 01/10/19 10:42, Alex Bennée wrote:
>
> Paolo Bonzini writes:
>
>> On 30/09/19 21:20, Alex Bennée wrote:
>>> Does seem to imply the vCPU CPUState is where the queue is. That's not
>>> to say there shouldn't be a single work queue for thread=single.
>>
>> Indeed it doesn't. I confused this with commit a8efa60633 ("cpus: run
>> work items for all vCPUs if single-threaded", 2018-11-27).
>>
>> Are you going to make a patch to have a single work queue, or should
>> I?
>
> What's the neatest way to do it? Are you thinking of just special-casing
> queue_work_on_cpu to use first_cpu when mttcg is not enabled?

Yes, I cannot think of anything better.

Paolo
Re: Lockup with --accel tcg,thread=single
Paolo Bonzini writes:

> On 30/09/19 21:20, Alex Bennée wrote:
>> Does seem to imply the vCPU CPUState is where the queue is. That's not
>> to say there shouldn't be a single work queue for thread=single.
>
> Indeed it doesn't. I confused this with commit a8efa60633 ("cpus: run
> work items for all vCPUs if single-threaded", 2018-11-27).
>
> Are you going to make a patch to have a single work queue, or should
> I?

What's the neatest way to do it? Are you thinking of just special-casing
queue_work_on_cpu to use first_cpu when mttcg is not enabled?

--
Alex Bennée
Re: Lockup with --accel tcg,thread=single
On 30/09/19 21:20, Alex Bennée wrote:
> Does seem to imply the vCPU CPUState is where the queue is. That's not
> to say there shouldn't be a single work queue for thread=single.

Indeed it doesn't. I confused this with commit a8efa60633 ("cpus: run
work items for all vCPUs if single-threaded", 2018-11-27).

Are you going to make a patch to have a single work queue, or should I?

Paolo
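(For context, commit a8efa60633 makes the round-robin thread service
the work queues of every vCPU once it reaches its wait function, which
is why a single logical queue was assumed. A simplified sketch of the
shape of qemu_tcg_rr_wait_io_event() after that commit -- see cpus.c
for the authoritative version:)

    /* Simplified sketch after a8efa60633: once the RR thread decides
     * to wait, it services the queued work of *all* vCPUs, so a work
     * item queued on any CPUState does get run -- but only when the
     * loop actually reaches this function. */
    static void qemu_tcg_rr_wait_io_event(void)
    {
        CPUState *cpu;

        while (all_cpu_threads_idle()) {
            stop_tcg_kick_timer();
            qemu_cond_wait(first_cpu->halt_cond, &qemu_global_mutex);
        }

        start_tcg_kick_timer();

        CPU_FOREACH(cpu) {
            qemu_wait_io_event_common(cpu);  /* runs cpu's queued work */
        }
    }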
Re: Lockup with --accel tcg,thread=single
Paolo Bonzini writes:

> On 30/09/19 17:37, Peter Maydell wrote:
>> Not sure currently what the best fix is here.
>
> Since thread=single uses just one queued work list, could it be as
> simple as

Does it? I thought this was the case but:

  static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
  {
      qemu_mutex_lock(&cpu->work_mutex);
      if (cpu->queued_work_first == NULL) {
          cpu->queued_work_first = wi;
      } else {
          cpu->queued_work_last->next = wi;
      }
      cpu->queued_work_last = wi;
      wi->next = NULL;
      wi->done = false;
      qemu_mutex_unlock(&cpu->work_mutex);

      qemu_cpu_kick(cpu);
  }

Does seem to imply the vCPU CPUState is where the queue is. That's not
to say there shouldn't be a single work queue for thread=single.

> diff --git a/cpus.c b/cpus.c
> index d2c61ff..314f9aa 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1539,7 +1539,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
>              cpu = first_cpu;
>          }
>
> -        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
> +        while (cpu && !first_cpu->queued_work_first && !cpu->exit_request) {
>
>              atomic_mb_set(&tcg_current_rr_cpu, cpu);
>              current_cpu = cpu;
>
> or something like that?
>
>> Side note -- this use of run_on_cpu() means that we now drop
>> the iothread lock within memory_region_snapshot_and_clear_dirty(),
>> which we didn't before. This means that a vCPU thread can now
>> get in and execute an access to the device registers of
>> hw/display/vga.c, updating its state in a way I suspect that the
>> device model code is not expecting... So maybe the right answer
>> here should be to come up with a fix for the race that 9458a9a1
>> addresses that doesn't require us to drop the iothread lock in
>> memory_region_snapshot_and_clear_dirty() ? Alternatively we need
>> to audit the callers and flag in the documentation that this
>> function has the unexpected side effect of briefly dropping the
>> iothread lock.
>
> Yes, this is intended. There shouldn't be side effects other than
> possibly a one-frame flash for all callers.
>
> Paolo

--
Alex Bennée
Re: Lockup with --accel tcg,thread=single
On 30/09/19 17:37, Peter Maydell wrote:
> Not sure currently what the best fix is here.

Since thread=single uses just one queued work list, could it be as
simple as

diff --git a/cpus.c b/cpus.c
index d2c61ff..314f9aa 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1539,7 +1539,7 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
             cpu = first_cpu;
         }

-        while (cpu && !cpu->queued_work_first && !cpu->exit_request) {
+        while (cpu && !first_cpu->queued_work_first && !cpu->exit_request) {

             atomic_mb_set(&tcg_current_rr_cpu, cpu);
             current_cpu = cpu;

or something like that?

> Side note -- this use of run_on_cpu() means that we now drop
> the iothread lock within memory_region_snapshot_and_clear_dirty(),
> which we didn't before. This means that a vCPU thread can now
> get in and execute an access to the device registers of
> hw/display/vga.c, updating its state in a way I suspect that the
> device model code is not expecting... So maybe the right answer
> here should be to come up with a fix for the race that 9458a9a1
> addresses that doesn't require us to drop the iothread lock in
> memory_region_snapshot_and_clear_dirty() ? Alternatively we need
> to audit the callers and flag in the documentation that this
> function has the unexpected side effect of briefly dropping the
> iothread lock.

Yes, this is intended. There shouldn't be side effects other than
possibly a one-frame flash for all callers.

Paolo
Re: Lockup with --accel tcg,thread=single
Peter Maydell writes:

> On Mon, 30 Sep 2019 at 14:17, Doug Gale wrote:
>>
>> I found a lockup in single threaded TCG, with OVMF bios, 16 CPUs.
>>
>> qemu-system-x86_64 --accel tcg,thread=single -smp cpus=16 -bios
>> /usr/share/ovmf/OVMF.fd
>>
>> Using Ubuntu 18.04 LTS, default gnome desktop. There is no guest OS,
>> there is no hard drive, just the OVMF firmware locks it up. (I
>> narrowed it down to this from a much larger repro)
>>
>> Peter Maydell helped me bisect it in IRC.
>> Works fine at commit 1e8a98b53867f61
>> Fails at commit 9458a9a1df1a4c7
>>
>> MTTCG works fine all the way up to master.
>
> Thanks for this bug report. I've reproduced it and think
> I have figured out what is going on here.
>
> Commit 9458a9a1df1a4c719e245 is Paolo's commit that fixes the
> TCG-vs-dirty-bitmap race by having the thread which is
> doing a memory access wait for the vCPU thread to finish
> its current TB using a no-op run_on_cpu() operation.
>
> In the case of the hang the thread doing the memory access
> is the iothread, like this:
>
> #14 0x55c150c0a98c in run_on_cpu (cpu=0x55c153801c60,
>     func=0x55c150bbb542 <do_nothing>, data=...)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1205
> #15 0x55c150bbb58c in tcg_log_global_after_sync (listener=0x55c1538410c8)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/exec.c:2963
> #16 0x55c150c1fe18 in memory_global_after_dirty_log_sync ()
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/memory.c:2598
> #17 0x55c150c1e6b8 in memory_region_snapshot_and_clear_dirty
>     (mr=0x55c1541e82b0, addr=0, size=192, client=0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/memory.c:2106
> #18 0x55c150c76c05 in vga_draw_graphic (s=0x55c1541e82a0, full_update=0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/hw/display/vga.c:1661
> #19 0x55c150c771c4 in vga_update_display (opaque=0x55c1541e82a0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/hw/display/vga.c:1785
> #20 0x55c151052a83 in graphic_hw_update (con=0x55c1536dfaa0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:268
> #21 0x55c151091490 in gd_refresh (dcl=0x55c1549af090)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/gtk.c:484
> #22 0x55c151056571 in dpy_refresh (s=0x55c1542f9d90)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:1629
> #23 0x55c1510527f0 in gui_update (opaque=0x55c1542f9d90)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:206
> #24 0x55c1511ee67c in timerlist_run_timers (timer_list=0x55c15370c280)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:592
> #25 0x55c1511ee726 in qemu_clock_run_timers (type=QEMU_CLOCK_REALTIME)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:606
> #26 0x55c1511ee9e5 in qemu_clock_run_all_timers ()
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:692
> #27 0x55c1511ef181 in main_loop_wait (nonblocking=0)
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/util/main-loop.c:524
> #28 0x55c150db03fe in main_loop ()
>     at /home/petmay01/linaro/qemu-from-laptop/qemu/vl.c:1806
>
> run_on_cpu() adds the do_nothing function to a cpu queued-work list.
>
> However, the single-threaded TCG runloop qemu_tcg_rr_cpu_thread_fn()
> has the basic structure:
>
>    while (1) {
>        for (each vCPU) {
>            run this vCPU for a timeslice;
>        }
>        qemu_tcg_rr_wait_io_event();
>    }
>
> and it processes cpu work queues only in qemu_tcg_rr_wait_io_event().
> This is a problem, because the thing which causes us to stop running
> one vCPU when its timeslice ends and move on to the next is the
> tcg_kick_vcpu_timer -- and the iothread will never process that timer
> and kick the vcpu because it is currently blocked in run_on_cpu()!
>
> Not sure currently what the best fix is here.

We seem to be repeating ourselves, because:

  commit a8efa60633575a2ee4dbf807a71cb44d44b0e0f8
  Author:     Paolo Bonzini
  AuthorDate: Wed Nov 14 12:36:57 2018 +0100

      cpus: run work items for all vCPUs if single-threaded

However, looking at the commit I can still see we have the problem of
not advancing to the next vCPU if the kick timer (or some other event)
doesn't bring the execution to an exit. I suspect you could get this in
Linux, but it's probably sufficiently busy to ensure vCPUs are always
exiting for some reason or another.

So the options I can think of so far are:

1. Kick 'em all when not inter-vCPU

Something like this untested patch...

--8<---------------cut here---------------start------------->8---
cpus.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)

modified   cpus.c
@@ -949,8 +949,8 @@ static inline int64_t qemu_tcg_next_kick(void)
     return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD;
 }

-/* Kick the currently round-robin scheduled vCPU */
-static void qemu_cpu_kick_rr_cpu(void)
+/* Kick the currently round-robin scheduled vCPU to next */
+static void qemu_cpu_kick_rr_
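(The patch above is cut off mid-function. For context, option 1
presumably amounts to a helper that kicks every vCPU rather than only
the currently scheduled one -- a minimal, untested sketch of that idea,
not the actual patch:)

    /* Sketch of "kick 'em all": exiting every vCPU guarantees the RR
     * loop stops executing the current vCPU, advances, and eventually
     * reaches qemu_tcg_rr_wait_io_event(), where queued work items are
     * processed.  Untested illustration of the truncated patch above. */
    static void qemu_cpu_kick_rr_cpus(void)
    {
        CPUState *cpu;

        CPU_FOREACH(cpu) {
            cpu_exit(cpu);
        }
    }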
Re: Lockup with --accel tcg,thread=single
On Mon, 30 Sep 2019 at 14:17, Doug Gale wrote:
>
> I found a lockup in single threaded TCG, with OVMF bios, 16 CPUs.
>
> qemu-system-x86_64 --accel tcg,thread=single -smp cpus=16 -bios
> /usr/share/ovmf/OVMF.fd
>
> Using Ubuntu 18.04 LTS, default gnome desktop. There is no guest OS,
> there is no hard drive, just the OVMF firmware locks it up. (I
> narrowed it down to this from a much larger repro)
>
> Peter Maydell helped me bisect it in IRC.
> Works fine at commit 1e8a98b53867f61
> Fails at commit 9458a9a1df1a4c7
>
> MTTCG works fine all the way up to master.

Thanks for this bug report. I've reproduced it and think I have
figured out what is going on here.

Commit 9458a9a1df1a4c719e245 is Paolo's commit that fixes the
TCG-vs-dirty-bitmap race by having the thread which is doing a memory
access wait for the vCPU thread to finish its current TB using a no-op
run_on_cpu() operation.

In the case of the hang the thread doing the memory access is the
iothread, like this:

#14 0x55c150c0a98c in run_on_cpu (cpu=0x55c153801c60,
    func=0x55c150bbb542 <do_nothing>, data=...)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1205
#15 0x55c150bbb58c in tcg_log_global_after_sync (listener=0x55c1538410c8)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/exec.c:2963
#16 0x55c150c1fe18 in memory_global_after_dirty_log_sync ()
    at /home/petmay01/linaro/qemu-from-laptop/qemu/memory.c:2598
#17 0x55c150c1e6b8 in memory_region_snapshot_and_clear_dirty
    (mr=0x55c1541e82b0, addr=0, size=192, client=0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/memory.c:2106
#18 0x55c150c76c05 in vga_draw_graphic (s=0x55c1541e82a0, full_update=0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/hw/display/vga.c:1661
#19 0x55c150c771c4 in vga_update_display (opaque=0x55c1541e82a0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/hw/display/vga.c:1785
#20 0x55c151052a83 in graphic_hw_update (con=0x55c1536dfaa0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:268
#21 0x55c151091490 in gd_refresh (dcl=0x55c1549af090)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/gtk.c:484
#22 0x55c151056571 in dpy_refresh (s=0x55c1542f9d90)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:1629
#23 0x55c1510527f0 in gui_update (opaque=0x55c1542f9d90)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:206
#24 0x55c1511ee67c in timerlist_run_timers (timer_list=0x55c15370c280)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:592
#25 0x55c1511ee726 in qemu_clock_run_timers (type=QEMU_CLOCK_REALTIME)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:606
#26 0x55c1511ee9e5 in qemu_clock_run_all_timers ()
    at /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:692
#27 0x55c1511ef181 in main_loop_wait (nonblocking=0)
    at /home/petmay01/linaro/qemu-from-laptop/qemu/util/main-loop.c:524
#28 0x55c150db03fe in main_loop ()
    at /home/petmay01/linaro/qemu-from-laptop/qemu/vl.c:1806

run_on_cpu() adds the do_nothing function to a cpu queued-work list.

However, the single-threaded TCG runloop qemu_tcg_rr_cpu_thread_fn()
has the basic structure:

   while (1) {
       for (each vCPU) {
           run this vCPU for a timeslice;
       }
       qemu_tcg_rr_wait_io_event();
   }

and it processes cpu work queues only in qemu_tcg_rr_wait_io_event().
This is a problem, because the thing which causes us to stop running
one vCPU when its timeslice ends and move on to the next is the
tcg_kick_vcpu_timer -- and the iothread will never process that timer
and kick the vcpu because it is currently blocked in run_on_cpu()!

Not sure currently what the best fix is here.

Side note -- this use of run_on_cpu() means that we now drop the
iothread lock within memory_region_snapshot_and_clear_dirty(), which
we didn't before. This means that a vCPU thread can now get in and
execute an access to the device registers of hw/display/vga.c,
updating its state in a way I suspect that the device model code is
not expecting... So maybe the right answer here should be to come up
with a fix for the race that 9458a9a1 addresses that doesn't require
us to drop the iothread lock in
memory_region_snapshot_and_clear_dirty()? Alternatively we need to
audit the callers and flag in the documentation that this function has
the unexpected side effect of briefly dropping the iothread lock.

thanks
-- PMM
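(To make the blocking explicit: run_on_cpu() called on a remote CPU
queues a work item and then sleeps until the target thread has
executed it. A simplified sketch, abridged from do_run_on_cpu() in
cpus-common.c of this era, which run_on_cpu() wraps -- see the source
for the real thing:)

    /* Simplified: queue the item, kick the vCPU, then wait for the
     * vCPU thread to mark it done.  If the vCPU thread never reaches
     * its work-processing point (qemu_tcg_rr_wait_io_event() in the
     * RR loop), the caller -- here, the iothread -- blocks forever,
     * which is the lockup described above. */
    void run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data)
    {
        struct qemu_work_item wi;

        if (qemu_cpu_is_self(cpu)) {
            func(cpu, data);          /* already on the target thread */
            return;
        }

        wi.func = func;
        wi.data = data;
        wi.done = false;

        queue_work_on_cpu(cpu, &wi);  /* appends to cpu->queued_work_* */
        while (!atomic_mb_read(&wi.done)) {
            qemu_cond_wait(&qemu_work_cond, &qemu_global_mutex);
        }
    }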