On Mon Sep 18, 2023 at 5:59 PM AEST, Alex Bennée wrote: > > "Nicholas Piggin" <npig...@gmail.com> writes: > > > On Sat Sep 16, 2023 at 1:29 PM AEST, Richard Henderson wrote: > >> From: Nicholas Piggin <npig...@gmail.com> > >> > >> mttcg asserts that an execution ending with EXCP_HALTED must have > >> cpu->halted. However between the event or instruction that sets > >> cpu->halted and requests exit and the assertion here, an > >> asynchronous event could clear cpu->halted. > >> > >> This leads to crashes running AIX on ppc/pseries because it uses > >> H_CEDE/H_PROD hcalls, where H_CEDE sets self->halted = 1 and > >> H_PROD sets other cpu->halted = 0 and kicks it. > >> > >> H_PROD could be turned into an interrupt to wake, but several other > >> places in ppc, sparc, and semihosting follow what looks like a similar > >> pattern setting halted = 0 directly. So remove this assertion. > >> > >> Reported-by: Ivan Warren <i...@vmfacility.fr> > >> Signed-off-by: Nicholas Piggin <npig...@gmail.com> > >> Message-Id: <20230829010658.8252-1-npig...@gmail.com> > >> [rth: Keep the case label and adjust the comment.] > > > > Hey Richard, > > > > Thanks for picking this up. > > > > I think EXCP_HLT and EXCP_HALTED are effectively the same, so they could > > be merged after this. > > > > I couldn't quite decipher the intended difference between them, HLT is > > "hlt instruction reached", but it does tend to go into a mode where it > > is halted waiting for external event. Is there some useful difference in > > semantics we should retain (and at least try to find a way to assert)? > > I always thought HALTED was where the system was halted (e.g. during a > shutdown) but I agree its less than clear.
Maybe that was so. I didn't manage to track down the original intention of them, but now they are not different, HALTED does just wait for event too. EXCP_HALTED did previously require the operation set ->halted = 1 before calling (the assert only breaks due to concurrent wakeup clearing it). But some ops that use EXCP_HLT also set ->halted. So nowadays halted == 1 means to check ->cpu_has_work() before running the CPU again (and otherwise wait on io event as you say). And EXCP_HLT/HALTED are both just ways to return from the cpu exec loop. One thing I'm not sure of is why you would set EXCP_HLT without setting halted. In some cases it could be a bug (e.g., avr helper_sleep()), but there are a few ops that use it after a CPU reset or shutdown which might be valid. Could call those ones something like EXCP_RESET or EXCP_REEXEC. Thanks, Nick