Hi Corinna, Thanks for reviewing.
On Mon, 23 Jun 2025 15:30:45 +0200 Corinna Vinschen wrote: > Hi Takashi, > > On Jun 23 20:51, Takashi Yano wrote: > > After the commit f305ca916ad2, some stress-ng tests fail in arm64 > > windows. There seems to be two causes for this issue. One is that > > calling SuspendThread(GetCurrentThread()) may suspend myself in > > the kernel. Branching to sigdelayed in the kernel code does not > > work as expected as the original _cygtls::interrup_now() intended. > > The other cause is, single step exception sometimes does not trigger > > exception::handle() for some reason. Therefore, register vectored > > exception handler (VEH) and use it for single step exception instead. > > > > Addresses: https://cygwin.com/pipermail/cygwin/2025-June/258332.html > > Fixes: f305ca916ad2 ("Cygwin: signal: Prevent unexpected crash on frequent > > SIGSEGV") > > Reported-by: Jeremy Drake <[email protected]> > > Reviewed-by: > > Signed-off-by: Takashi Yano <[email protected]> > > --- > > winsup/cygwin/exceptions.cc | 46 +++++++++++++++------------ > > winsup/cygwin/local_includes/cygtls.h | 1 + > > 2 files changed, 27 insertions(+), 20 deletions(-) > > > > diff --git a/winsup/cygwin/exceptions.cc b/winsup/cygwin/exceptions.cc > > index a4699b172..e5193551b 100644 > > --- a/winsup/cygwin/exceptions.cc > > +++ b/winsup/cygwin/exceptions.cc > > @@ -653,13 +653,6 @@ exception::handle (EXCEPTION_RECORD *e, exception_list > > *frame, CONTEXT *in, > > static int NO_COPY debugging = 0; > > _cygtls& me = _my_tls; > > > > - if (me.suspend_on_exception) > > - { > > - SuspendThread (GetCurrentThread ()); > > - if (e->ExceptionCode == (DWORD) STATUS_SINGLE_STEP) > > - return ExceptionContinueExecution; > > - } > > - > > if (debugging && ++debugging < 500000) > > { > > SetThreadPriority (hMainThread, THREAD_PRIORITY_NORMAL); > > @@ -923,6 +916,22 @@ sig_handle_tty_stop (int sig, siginfo_t *, void *) > > } > > } /* end extern "C" */ > > > > +static HANDLE h_veh = NULL; > > +static LONG CALLBACK > > +veh (EXCEPTION_POINTERS *ep) > > A better name would be nice. Something like singlestep_handler or something > more appropriate. Done. > > +{ > > + if (_my_tls.suspend_on_exception) > > + { > > + _my_tls.in_exception_handler = true; > > + while (_my_tls.suspend_on_exception) ; /* Don't call yield() to > > privent > > ^^^^^^^ > > prevent > > > + the thread form being suspended > > + in the kernel. */ > > + if (ep->ExceptionRecord->ExceptionCode == (DWORD) STATUS_SINGLE_STEP) > > + return EXCEPTION_CONTINUE_EXECUTION; > > + } > > + return EXCEPTION_CONTINUE_SEARCH; > > +} > > + > > bool > > _cygtls::interrupt_now (CONTEXT *cx, siginfo_t& si, void *handler, > > struct sigaction& siga) > > @@ -943,25 +952,22 @@ _cygtls::interrupt_now (CONTEXT *cx, siginfo_t& si, > > void *handler, > > by setting the trap flag (TF) before calling ResumeThread(). This > > will trigger either STATUS_SINGLE_STEP or the exception caused by > > the instruction that Rip originally pointed to. By suspending the > > - targeted thread within exception::handle(), Rip no longer points > > - to the problematic instruction, allowing safe handling of the > > - interrupt. As a result, Rip can be adjusted appropriately, and the > > - thread can resume execution without unexpected crashes. */ > > + targeted thread within the vectored exception handler veh(), Rip no > > + longer points to the problematic instruction, allowing safe handling > > + of the interrupt. As a result, Rip can be adjusted appropriately, > > + and the thread can resume execution without unexpected crashes. */ > > if (!inside_kernel (cx, true)) > > { > > + if (h_veh == NULL) > > + h_veh = AddVectoredExceptionHandler (1, veh); > > h_veh is static, but not NO_COPY. I'm pretty sure this might crash > a subsequently forked child. > > IMO it would make more sense to make h_veh a local var and call > RemoveVectoredExceptionHandler after calling SuspendThread. Done. > > cx->EFlags |= 0x100; /* Set TF (setup single step execution) */ > > SetThreadContext (*this, cx); > > suspend_on_exception = true; > > + in_exception_handler = false; > > ResumeThread (*this); > > - ULONG cnt = 0; > > - NTSTATUS status; > > - do > > - { > > - yield (); > > - status = NtQueryInformationThread (*this, ThreadSuspendCount, > > - &cnt, sizeof (cnt), NULL); > > - } > > - while (NT_SUCCESS (status) && cnt == 0); > > + while (!in_exception_handler) > > + yield (); > > We're yielding a lot, which may waste CPU time. Rather than yielding, > we should consider to use WaitOnAddress/WakeByAddress* more often. > > https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitonaddress Done. > But there's another problem I don't get. The VEH apparently > runs in the context of the single stepping thread (you're using > _my_tls in the VEH). It sets in_exception_handler to true and then > goes into a busy loop before returning the exception flag. > > But that means the following SuspendThread... > > > > + SuspendThread (*this); > > ...will suspend the thread while in the VEH... > > > GetThreadContext (*this, cx); > > suspend_on_exception = false; > > ...because suspend_on_exception is true up to here. > > How is that supposed to work? Perhaps I don't fully understand your concern. I intended to suspend the thread at the busy loop in the VEH. Then, branching to sigdelayed from there and return to the busy loop with suspend_on_exception flag of false. What is your point? -- Takashi Yano <[email protected]>
