"Nicholas Piggin" <npig...@gmail.com> writes:
> On Sat Sep 16, 2023 at 1:29 PM AEST, Richard Henderson wrote: >> From: Nicholas Piggin <npig...@gmail.com> >> >> mttcg asserts that an execution ending with EXCP_HALTED must have >> cpu->halted. However between the event or instruction that sets >> cpu->halted and requests exit and the assertion here, an >> asynchronous event could clear cpu->halted. >> >> This leads to crashes running AIX on ppc/pseries because it uses >> H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and >> H_PROD sets other cpu->halted = 0 and kicks it. >> >> H_PROD could be turned into an interrupt to wake, but several other >> places in ppc, sparc, and semihosting follow what looks like a similar >> pattern setting halted = 0 directly. So remove this assertion. >> >> Reported-by: Ivan Warren <i...@vmfacility.fr> >> Signed-off-by: Nicholas Piggin <npig...@gmail.com> >> Message-Id: <20230829010658.8252-1-npig...@gmail.com> >> [rth: Keep the case label and adjust the comment.] > > Hey Richard, > > Thanks for picking this up. > > I think EXCP_HLT and EXCP_HALTED are effectively the same, so they could > be merged after this. > > I couldn't quite decipher the intended difference between them, HLT is > "hlt instruction reached", but it does tend to go into a mode where it > is halted waiting for external event. Is there some useful difference in > semantics we should retain (and at least try to find a way to assert)? I always thought HALTED was where the system was halted (e.g. during a shutdown) but I agree its less than clear. Do both effectively end up in wait_for_io for some event to start the loop again? > > I did look at how to avoid the halted race and keep the assert, e.g., > have the CPU only modify its own halted, and external events would have > a wakeup field to set. In the end it wasn't clear that that was any > simpler and you still have races to reason about, now between the two > fields. So unless someone wants to keep both, should we merge? > > Thanks, > Nick > >> Signed-off-by: Richard Henderson <richard.hender...@linaro.org> >> --- >> accel/tcg/tcg-accel-ops-mttcg.c | 9 ++------- >> 1 file changed, 2 insertions(+), 7 deletions(-) >> >> diff --git a/accel/tcg/tcg-accel-ops-mttcg.c >> b/accel/tcg/tcg-accel-ops-mttcg.c >> index b276262007..4b0dfb4be7 100644 >> --- a/accel/tcg/tcg-accel-ops-mttcg.c >> +++ b/accel/tcg/tcg-accel-ops-mttcg.c >> @@ -100,14 +100,9 @@ static void *mttcg_cpu_thread_fn(void *arg) >> break; >> case EXCP_HALTED: >> /* >> - * during start-up the vCPU is reset and the thread is >> - * kicked several times. If we don't ensure we go back >> - * to sleep in the halted state we won't cleanly >> - * start-up when the vCPU is enabled. >> - * >> - * cpu->halted should ensure we sleep in wait_io_event >> + * Usually cpu->halted is set, but may have already been >> + * reset by another thread by the time we arrive here. >> */ >> - g_assert(cpu->halted); >> break; >> case EXCP_ATOMIC: >> qemu_mutex_unlock_iothread(); -- Alex Bennée Virtualisation Tech Lead @ Linaro