Hi Alexander,

On Mon, Nov 10, 2025 at 09:00:01PM +0200, Alexander Lakhin wrote:
> Sorry for the delay. I've finally completed a new round of experiments and
> discovered the following:
[...]
> 12.10.2025 03:42, Thomas Munro wrote:
> > * I wonder about the special code paths for handlers that were already
> > running and happened to be in sigreturn(), or something like that,
> > which I didn't study at all, but it occurred to me that our pqsignal
> > will only block the signal itself while running a handler (since it
> > doesn't specify SA_NODEFER)... so what happens if you block all
> > signals while running each handler by changing
> > sigemptyset(&act.sa_mask) to sigfillset(&act.sa_mask)?
>
> Thank you for the suggestion!
>
> With this modification:
> @@ -137,7 +140,7 @@ pqsignal(int signo, pqsigfunc func)
>
> #if !(defined(WIN32) && defined(FRONTEND))
> act.sa_handler = func;
> - sigemptyset(&act.sa_mask);
> + sigfillset(&act.sa_mask);
> act.sa_flags = SA_RESTART;
>
> I got 100 iterations passed (12 of them hanged) without that Assert
> triggered.

But those hangs were unrelated to the assert then, right?

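
For anyone following along, here is a minimal standalone sketch of what the
sa_mask change discussed above does. It is not the pqsignal() code: the names
probe_handler() and other_signal_blocked_in_handler() are made up for
illustration, and it assumes a POSIX system and a single-threaded process. It
installs a SIGUSR1 handler with either an empty or a full sa_mask and reports
whether SIGUSR2 is blocked while that handler runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if SIGUSR2 was blocked inside it, 0 if not. */
static volatile sig_atomic_t usr2_blocked_in_handler = -1;

/* Hypothetical probe handler: inspects the signal mask in effect while it runs. */
static void
probe_handler(int signo)
{
	sigset_t	cur;

	(void) signo;
	/* A NULL "set" argument only queries the current mask. */
	if (sigprocmask(SIG_BLOCK, NULL, &cur) == 0)
		usr2_blocked_in_handler = sigismember(&cur, SIGUSR2);
}

/*
 * Install probe_handler for SIGUSR1 with an empty sa_mask (fill_mask == 0)
 * or a full one (fill_mask != 0), deliver SIGUSR1 once, and return whether
 * SIGUSR2 was blocked during the handler: 1 yes, 0 no, -1 on error.
 */
static int
other_signal_blocked_in_handler(int fill_mask)
{
	struct sigaction act;
	struct sigaction old;
	sigset_t	unblock;

	/* Start from a known state: neither signal blocked in the caller. */
	sigemptyset(&unblock);
	sigaddset(&unblock, SIGUSR1);
	sigaddset(&unblock, SIGUSR2);
	if (sigprocmask(SIG_UNBLOCK, &unblock, NULL) != 0)
		return -1;

	memset(&act, 0, sizeof(act));
	act.sa_handler = probe_handler;
	if (fill_mask)
		sigfillset(&act.sa_mask);	/* block everything during the handler */
	else
		sigemptyset(&act.sa_mask);	/* block only SIGUSR1 itself */
	act.sa_flags = SA_RESTART;

	usr2_blocked_in_handler = -1;
	if (sigaction(SIGUSR1, &act, &old) != 0)
		return -1;

	/* Single-threaded and unblocked, so the handler runs before raise() returns. */
	if (raise(SIGUSR1) != 0)
		return -1;

	/* Restore the previous disposition. */
	if (sigaction(SIGUSR1, &old, NULL) != 0)
		return -1;
	return usr2_blocked_in_handler;
}
```

With the empty mask the second signal can still interrupt the running
handler; with the full mask it stays pending until the handler returns.
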
> > * I see special code paths for SIGIO and SIGURG that I didn't try to
> > understand, but I wonder what would happen if we s/SIGURG/SIGXCPU/
>
> With sed 's/SIGURG/SIGXCPU/' -i src/backend/storage/ipc/waiteventset.c, I
> still got:
> !!!wrapper_handler[8401]| postgres_signal_arg: 28565808, PG_NSIG: 33
> TRAP: failed Assert("postgres_signal_arg < PG_NSIG"), File: "pqsignal.c",
> Line: 94, PID: 8401
> ...
> 2025-11-09 12:51:24.095 GMT postmaster[7282] LOG: client backend (PID 8401)
> was terminated by signal 6: Aborted
> 2025-11-09 12:51:24.095 GMT postmaster[7282] DETAIL: Failed process was
> running: UPDATE PKTABLE set ptest2=5 where ptest2=2;
> ---
>
> !!!wrapper_handler[21000]| postgres_signal_arg: 28545040, PG_NSIG: 33
> TRAP: failed Assert("postgres_signal_arg < PG_NSIG"), File: "pqsignal.c",
> Line: 94, PID: 21000
> ...
> 2025-11-09 13:06:59.458 GMT postmaster[20669] LOG: client backend (PID
> 21000) was terminated by signal 6: Aborted
> 2025-11-09 13:06:59.458 GMT postmaster[20669] DETAIL: Failed process was
> running: UPDATE pvactst SET i = i WHERE i < 1000;
> ---
> !!!wrapper_handler[21973]| postgres_signal_arg: 28562608, PG_NSIG: 33
> TRAP: failed Assert("postgres_signal_arg < PG_NSIG"), File: "pqsignal.c",
> Line: 94, PID: 21973
>
> 2025-11-09 14:56:23.955 GMT postmaster[20665] LOG: client backend (PID
> 21973) was terminated by signal 6: Aborted
> 2025-11-09 14:56:23.955 GMT postmaster[20665] DETAIL: Failed process was
> running: INSERT INTO pagg_tab_m SELECT i % 30, i % 40, i % 50 FROM
> generate_series(0, 2999) i;
>
> The failure rate is approximately 1 per 30 runs.

Is that the same failure rate you got before?

Michael