On Wed, Apr 15, 2026 at 6:23 PM Jacob Champion <[email protected]> wrote:
>
> On Wed, Apr 15, 2026 at 7:17 AM Andrew Dunstan <[email protected]> wrote:
> > OK, pushed. Thanks.
>
> I hit the following in the pg_basebackup tests just now, running on Linux:
>
> [08:41:21.621](0.377s) ok 196 - Walsender killed
> [09:09:11.134](1669.513s) # pump_until: timeout expired when
> searching for "(?^:background process terminated unexpectedly)" with
> stream: "pg_basebackup: error: unexpected termination of replication
> stream: FATAL: terminating connection due to administrator command
> # DETAIL: Signal sent by PID 155573, UID 1000.
> # "
> [09:09:11.134](0.000s) not ok 197 - background process exit message
> [09:09:11.134](0.000s) # Failed test 'background process exit message'
> # at src/postgres/src/bin/pg_basebackup/t/010_pg_basebackup.pl line
> 1049.
>
> But I haven't been able to reproduce since, so I don't know if this is
> a new race, or the commit just exposed one that was there before?

Hi Jacob,

the time that pg_basebackup took seems strange to me (1669 s, nearly 28 minutes?!). It was properly killed by a timeout, and the new code added the exact PID of the process that sent the signal. If you happen to see it running long again, it might make sense to find out where the time is being spent during that base backup (this test shouldn't be taking large backups). An alternative would be to check the PostgreSQL server logs of that specific failed run to see exactly where it was stuck after 08:41 (but before 09:09).

-J.
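
The log check suggested above can be sketched as a small filter that keeps only the server log lines between the two TAP timestamps. This is a minimal sketch, not part of the test suite: the function name `lines_in_window` is hypothetical, and it assumes each log entry starts with a `%m`-style timestamp such as `2026-04-15 08:41:22.000 UTC [155573] ...` and that the server log uses the same time zone as the TAP output.

```python
import re
from datetime import time

# Assumed log prefix: "YYYY-MM-DD HH:MM:SS.mmm TZ [pid] ..." (an assumption
# about log_line_prefix; adjust the pattern if the real prefix differs).
TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2} (\d{2}):(\d{2}):(\d{2})\.(\d{3})")


def lines_in_window(lines, start, end):
    """Yield log lines whose time of day falls within [start, end].

    Lines without a timestamp (continuation lines such as DETAIL or
    multi-line statements) follow the preceding timestamped line.
    """
    keep = False
    for line in lines:
        m = TS_RE.match(line)
        if m:
            h, mi, s, ms = (int(g) for g in m.groups())
            keep = start <= time(h, mi, s, ms * 1000) <= end
        if keep:
            yield line
```

Passing an open log file together with `time(8, 41, 21)` and `time(9, 9, 12)` would narrow the output to the window in which test 197 was waiting. The sketch compares only the time of day, so it would not work for a window that crosses midnight.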
