On Mon, Feb 26, 2024 at 02:01:45PM +0000, Bertrand Drouvot wrote:
> Though [1] mentioned up-thread is not pushed yet, I'm Sharing the POC patch
> now (see the attached).

I have looked at what you have here.

First, in a build where 818fefd8fd is included, this makes the test
script a lot slower.  Most of the logic is quick, but we're spending
10s or so checking that catalog_xmin has advanced.  Would it be
possible to make that faster?

A second issue is the failure mode when 818fefd8fd is reverted.  The
test gets stuck while we are waiting on the standby to catch up, until
a timeout kicks in to fail the test, and all the previous tests pass.
Would it be possible to make that more responsive?  I assume that in
the failure mode we would get an incorrect conflict_reason for
injection_inactiveslot, succeeding in checking the failure.

+	my $terminated = 0;
+	for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+	{
+		if ($node_standby->log_contains(
+			'terminating process .* to release replication slot \"injection_activeslot\"', $logstart))
+		{
+			$terminated = 1;
+			last;
+		}
+		usleep(100_000);
+	}
+	ok($terminated, 'terminating process holding the active slot is logged with injection point');

The LOG exists when we are sure that the startup process is waiting
in the injection point, so this loop could be replaced with something
like:
+	$node_standby->wait_for_event('startup', 'TerminateProcessHoldingSlot');
+	ok( $node_standby->log_contains('terminating process .* .. ', 'termin .. ');

Nit: the name of the injection point should be
terminate-process-holding-slot rather than TerminateProcessHoldingSlot,
to be consistent with the other ones.
--
Michael