Hi,

On Thu, Jan 11, 2024 at 01:00:00PM +0300, Alexander Lakhin wrote:
> Hi Bertrand,
> 
> 10.01.2024 19:32, Bertrand Drouvot wrote:
> > 
> > > is an absent message 'obsolete replication slot "row_removal_activeslot"'
> > Looking at recovery-15-vacuum_pg_class/i035_standby_logical_decoding_standby.log here:
> > 
> > Yeah, the missing message has to come from InvalidatePossiblyObsoleteSlot().
> > 
> > In case of an active slot we first call ReportSlotInvalidation() with the
> > second parameter set to true (to emit the "terminating" message), then
> > SIGTERM the active process, and then (later) we should make the other
> > ReportSlotInvalidation() call with the second parameter set to false (to
> > emit the message that we don't see here).
> > 
> > So it means InvalidatePossiblyObsoleteSlot() did not trigger the second
> > ReportSlotInvalidation() call.
> 
> I've found a way to reproduce the issue without slowing down a machine or
> running multiple tests in parallel. It's enough to add a delay that allows
> BackgroundWriterMain() to execute LogStandbySnapshot():
> @@ -692,2 +690,3 @@
>  $node_primary->safe_psql('testdb', qq[UPDATE prun SET s = 'D';]);
> +$node_primary->safe_psql('testdb', qq[SELECT pg_sleep(15);]);
>  $node_primary->safe_psql('testdb', qq[UPDATE prun SET s = 'E';]);
> 
> With this delay, I get the failure immediately:
> $ PROVE_TESTS="t/035*" TEMP_CONFIG=/tmp/extra.config make check -s -C src/test/recovery
> # +++ tap check in src/test/recovery +++
> t/035_standby_logical_decoding.pl .. 47/?
> #   Failed test 'activeslot slot invalidation is logged with on-access pruning'
> #   at t/035_standby_logical_decoding.pl line 227.

Thanks a lot for the testing!

So I think we have 2 issues here:

1) The one you mention above, related to the on-access pruning test:

I think the engine behavior is right here and that the test is racy. I'm
proposing to bypass the active slot invalidation check for this particular
test (as I don't see any "easy" way to take care of this race condition).
The active slot invalidation is already well covered in the other tests
anyway.

I'm proposing the attached
v4-0001-Fix-035_standby_logical_decoding.pl-race-conditio.patch for it.

2) The fact that sometimes we get a termination message that is not followed
by an obsolete-slot one (as discussed in [1]).
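
For illustration, the expected pair of standby log messages looks like this
(PID made up, slot name taken from your failing test; the wording is the one
emitted by ReportSlotInvalidation()):

    LOG:  terminating process 12345 to release replication slot "row_removal_activeslot"
    LOG:  invalidating obsolete replication slot "row_removal_activeslot"

In the failing runs we see the first message but never the second.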

For this one, I think that InvalidatePossiblyObsoleteSlot() is racy:

In case of an active slot we proceed in 2 steps:
 - terminate the backend holding the slot
 - report the slot as obsolete

This is racy because between the two we release the mutex on the slot, which
means the effective_xmin and effective_catalog_xmin could advance during that
time.
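
To make the race concrete, here is a simplified sketch of the
InvalidatePossiblyObsoleteSlot() loop (not the actual slot.c code: the
locking details, the other invalidation causes and the kill/wait logic are
heavily simplified):

	for (;;)
	{
		ReplicationSlotInvalidationCause conflict = RS_INVAL_NONE;

		SpinLockAcquire(&s->mutex);
		/* the xmins are re-read from the slot on every pass */
		if (TransactionIdIsValid(s->effective_xmin) &&
			TransactionIdPrecedesOrEquals(s->effective_xmin,
										  snapshotConflictHorizon))
			conflict = cause;
		active_pid = s->active_pid;
		SpinLockRelease(&s->mutex);	/* xmins may advance from here on */

		if (conflict == RS_INVAL_NONE)
			break;				/* nothing (more) to do */

		if (active_pid != 0)
		{
			/* emits the "terminating process ..." message */
			ReportSlotInvalidation(conflict, true, active_pid, ...);
			(void) kill(active_pid, SIGTERM);
			continue;			/* loop and re-check the (re-read) xmins */
		}

		/*
		 * Emits the "invalidating obsolete replication slot ..." message:
		 * never reached if the xmins advanced past snapshotConflictHorizon
		 * between the SIGTERM above and the re-check.
		 */
		ReportSlotInvalidation(conflict, false, active_pid, ...);
		break;
	}

Recording the xmins once, before the first termination, lets the re-check
keep seeing the values that justified the SIGTERM.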

I'm proposing the attached
v1-0001-Fix-race-condition-in-InvalidatePossiblyObsoleteS.patch for it.

Would it be possible to re-launch your repro (the slow one, not the
pg_sleep() one) with both patches applied and see how it goes? (Please note
that v4 replaces the v3 that you're already using in your tests.)

If it helps, I'll propose
v1-0001-Fix-race-condition-in-InvalidatePossiblyObsoleteS.patch in a
dedicated hackers thread.

[1]: https://www.postgresql.org/message-id/ZZ7GpII4bAYN%2BjT5%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 8347851b3cc42655cfd81914f0c2f5cc1d22e2b8 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <bertranddrouvot...@gmail.com>
Date: Thu, 11 Jan 2024 13:36:57 +0000
Subject: [PATCH v4] Fix 035_standby_logical_decoding.pl race conditions

We want to ensure that vacuum was able to remove dead rows (i.e., that no
other transaction was holding back the global xmin) before testing for slot
invalidation on the standby.

Also, get rid of the active slot invalidation check during the on-access
pruning test. This test is racy for active slots, and active slot
invalidation is already well covered in the other tests.
---
 .../t/035_standby_logical_decoding.pl         | 114 +++++++++---------
 1 file changed, 59 insertions(+), 55 deletions(-)
 100.0% src/test/recovery/t/

diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 8bc39a5f03..7a50187326 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -191,9 +191,9 @@ sub check_slots_conflict_reason
 
 # Drop the slots, re-create them, change hot_standby_feedback,
 # check xmin and catalog_xmin values, make slot active and reset stat.
-sub reactive_slots_change_hfs_and_wait_for_xmins
+sub recreate_slots_change_hfs_and_wait_for_xmins
 {
-	my ($previous_slot_prefix, $slot_prefix, $hsf, $invalidated) = @_;
+	my ($previous_slot_prefix, $slot_prefix, $hsf, $invalidated, $activate) = @_;
 
 	# drop the logical slots
 	drop_logical_slots($previous_slot_prefix);
@@ -203,8 +203,11 @@ sub reactive_slots_change_hfs_and_wait_for_xmins
 
 	change_hot_standby_feedback_and_wait_for_xmins($hsf, $invalidated);
 
-	$handle =
-	  make_slot_active($node_standby, $slot_prefix, 1, \$stdout, \$stderr);
+	if ($activate)
+	{
+		$handle =
+			make_slot_active($node_standby, $slot_prefix, 1, \$stdout, \$stderr);
+	}
 
 	# reset stat: easier to check for confl_active_logicalslot in pg_stat_database_conflicts
 	$node_standby->psql('testdb', q[select pg_stat_reset();]);
@@ -213,7 +216,7 @@ sub reactive_slots_change_hfs_and_wait_for_xmins
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 sub check_for_invalidation
 {
-	my ($slot_prefix, $log_start, $test_name) = @_;
+	my ($slot_prefix, $log_start, $test_name, $activated) = @_;
 
 	my $active_slot = $slot_prefix . 'activeslot';
 	my $inactive_slot = $slot_prefix . 'inactiveslot';
@@ -230,14 +233,33 @@ sub check_for_invalidation
 		"activeslot slot invalidation is logged $test_name");
 
 	# Verify that pg_stat_database_conflicts.confl_active_logicalslot has been updated
-	ok( $node_standby->poll_query_until(
-			'postgres',
-			"select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'",
-			't'),
-		'confl_active_logicalslot updated'
-	) or die "Timed out waiting confl_active_logicalslot to be updated";
+	if ($activated)
+	{
+		ok( $node_standby->poll_query_until(
+				'postgres',
+				"select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'",
+				't'),
+			'confl_active_logicalslot updated'
+		) or die "Timed out waiting confl_active_logicalslot to be updated";
+	}
 }
 
+# Launch $sql and wait for a new snapshot that has a newer horizon before
+# doing the vacuum with $vac_option on $to_vac.
+sub wait_until_vacuum_can_remove
+{
+	my ($vac_option, $sql, $to_vac) = @_;
+
+	my $xid = $node_primary->safe_psql('testdb', qq[$sql
+													select txid_current();]);
+
+	$node_primary->poll_query_until('testdb',
+		"SELECT (select txid_snapshot_xmin(txid_current_snapshot()) - $xid) > 0")
+	  or die "new snapshot does not have a newer horizon";
+
+	$node_primary->safe_psql('testdb', qq[VACUUM $vac_option verbose $to_vac;
+										  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal;]);
+}
 ########################
 # Initialize primary node
 ########################
@@ -248,6 +270,7 @@ $node_primary->append_conf(
 wal_level = 'logical'
 max_replication_slots = 4
 max_wal_senders = 4
+autovacuum = off
 });
 $node_primary->dump_info;
 $node_primary->start;
@@ -464,22 +487,17 @@ $node_subscriber->stop;
 # One way to produce recovery conflict is to create/drop a relation and
 # launch a vacuum full on pg_class with hot_standby_feedback turned off on
 # the standby.
-reactive_slots_change_hfs_and_wait_for_xmins('behaves_ok_', 'vacuum_full_',
-	0, 1);
+recreate_slots_change_hfs_and_wait_for_xmins('behaves_ok_', 'vacuum_full_',
+	0, 1, 1);
 
 # This should trigger the conflict
-$node_primary->safe_psql(
-	'testdb', qq[
-  CREATE TABLE conflict_test(x integer, y text);
-  DROP TABLE conflict_test;
-  VACUUM full pg_class;
-  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal
-]);
+wait_until_vacuum_can_remove('full', 'CREATE TABLE conflict_test(x integer, y text);
+								 DROP TABLE conflict_test;', 'pg_class');
 
 $node_primary->wait_for_replay_catchup($node_standby);
 
 # Check invalidation in the logfile and in pg_stat_database_conflicts
-check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class');
+check_for_invalidation('vacuum_full_', 1, 'with vacuum FULL on pg_class', 1);
 
 # Verify conflict_reason is 'rows_removed' in pg_replication_slots
 check_slots_conflict_reason('vacuum_full_', 'rows_removed');
@@ -546,22 +564,17 @@ my $logstart = -s $node_standby->logfile;
 
 # One way to produce recovery conflict is to create/drop a relation and
 # launch a vacuum on pg_class with hot_standby_feedback turned off on the standby.
-reactive_slots_change_hfs_and_wait_for_xmins('vacuum_full_', 'row_removal_',
-	0, 1);
+recreate_slots_change_hfs_and_wait_for_xmins('vacuum_full_', 'row_removal_',
+	0, 1, 1);
 
 # This should trigger the conflict
-$node_primary->safe_psql(
-	'testdb', qq[
-  CREATE TABLE conflict_test(x integer, y text);
-  DROP TABLE conflict_test;
-  VACUUM pg_class;
-  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal
-]);
+wait_until_vacuum_can_remove('', 'CREATE TABLE conflict_test(x integer, y text);
+							 DROP TABLE conflict_test;', 'pg_class');
 
 $node_primary->wait_for_replay_catchup($node_standby);
 
 # Check invalidation in the logfile and in pg_stat_database_conflicts
-check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class');
+check_for_invalidation('row_removal_', $logstart, 'with vacuum on pg_class', 1);
 
 # Verify conflict_reason is 'rows_removed' in pg_replication_slots
 check_slots_conflict_reason('row_removal_', 'rows_removed');
@@ -584,23 +597,18 @@ $logstart = -s $node_standby->logfile;
 
 # One way to produce recovery conflict is to create/drop a relation and
 # launch a vacuum on pg_class with hot_standby_feedback turned off on the standby.
-reactive_slots_change_hfs_and_wait_for_xmins('row_removal_',
-	'shared_row_removal_', 0, 1);
+recreate_slots_change_hfs_and_wait_for_xmins('row_removal_',
+	'shared_row_removal_', 0, 1, 1);
 
 # Trigger the conflict
-$node_primary->safe_psql(
-	'testdb', qq[
-  CREATE ROLE create_trash;
-  DROP ROLE create_trash;
-  VACUUM pg_authid;
-  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal
-]);
+wait_until_vacuum_can_remove('', 'CREATE ROLE create_trash;
+							 DROP ROLE create_trash;', 'pg_authid');
 
 $node_primary->wait_for_replay_catchup($node_standby);
 
 # Check invalidation in the logfile and in pg_stat_database_conflicts
 check_for_invalidation('shared_row_removal_', $logstart,
-	'with vacuum on pg_authid');
+	'with vacuum on pg_authid', 1);
 
 # Verify conflict_reason is 'rows_removed' in pg_replication_slots
 check_slots_conflict_reason('shared_row_removal_', 'rows_removed');
@@ -621,18 +629,14 @@ check_pg_recvlogical_stderr($handle,
 # get the position to search from in the standby logfile
 $logstart = -s $node_standby->logfile;
 
-reactive_slots_change_hfs_and_wait_for_xmins('shared_row_removal_',
-	'no_conflict_', 0, 1);
+recreate_slots_change_hfs_and_wait_for_xmins('shared_row_removal_',
+	'no_conflict_', 0, 1, 1);
 
 # This should not trigger a conflict
-$node_primary->safe_psql(
-	'testdb', qq[
-  CREATE TABLE conflict_test(x integer, y text);
-  INSERT INTO conflict_test(x,y) SELECT s, s::text FROM generate_series(1,4) s;
-  UPDATE conflict_test set x=1, y=1;
-  VACUUM conflict_test;
-  INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal
-]);
+wait_until_vacuum_can_remove('', 'CREATE TABLE conflict_test(x integer, y text);
+							 INSERT INTO conflict_test(x,y) SELECT s, s::text FROM generate_series(1,4) s;
+							 UPDATE conflict_test set x=1, y=1;', 'conflict_test');
+
 $node_primary->wait_for_replay_catchup($node_standby);
 
 # message should not be issued
@@ -679,8 +683,8 @@ $logstart = -s $node_standby->logfile;
 
 # One way to produce recovery conflict is to trigger an on-access pruning
 # on a relation marked as user_catalog_table.
-reactive_slots_change_hfs_and_wait_for_xmins('no_conflict_', 'pruning_', 0,
-	0);
+recreate_slots_change_hfs_and_wait_for_xmins('no_conflict_', 'pruning_', 0,
+	0, 0);
 
 # This should trigger the conflict
 $node_primary->safe_psql('testdb',
@@ -695,7 +699,7 @@ $node_primary->safe_psql('testdb', qq[UPDATE prun SET s = 'E';]);
 $node_primary->wait_for_replay_catchup($node_standby);
 
 # Check invalidation in the logfile and in pg_stat_database_conflicts
-check_for_invalidation('pruning_', $logstart, 'with on-access pruning');
+check_for_invalidation('pruning_', $logstart, 'with on-access pruning', 0);
 
 # Verify conflict_reason is 'rows_removed' in pg_replication_slots
 check_slots_conflict_reason('pruning_', 'rows_removed');
@@ -739,7 +743,7 @@ $node_primary->restart;
 $node_primary->wait_for_replay_catchup($node_standby);
 
 # Check invalidation in the logfile and in pg_stat_database_conflicts
-check_for_invalidation('wal_level_', $logstart, 'due to wal_level');
+check_for_invalidation('wal_level_', $logstart, 'due to wal_level', 1);
 
 # Verify conflict_reason is 'wal_level_insufficient' in pg_replication_slots
 check_slots_conflict_reason('wal_level_', 'wal_level_insufficient');
-- 
2.34.1

From 2e4d580af4a1e6b2cb000d4e9db2c42e40aa4cd2 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <bertranddrouvot...@gmail.com>
Date: Thu, 11 Jan 2024 13:57:53 +0000
Subject: [PATCH v1] Fix race condition in InvalidatePossiblyObsoleteSlot()

In case of an active slot we proceed in 2 steps:
- terminate the backend holding the slot
- report the slot as obsolete

This is racy because between the two we release the mutex on the slot, which
means the effective_xmin and effective_catalog_xmin could advance during that time.

Holding the mutex longer is not an option (given what we'd have to do while
holding it), so instead record the effective_xmin and effective_catalog_xmin
values that were in use at the time of the backend termination.
---
 src/backend/replication/slot.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
 100.0% src/backend/replication/

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 52da694c79..2e34cca0e8 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1352,6 +1352,9 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
 {
 	int			last_signaled_pid = 0;
 	bool		released_lock = false;
+	bool		terminated = false;
+	TransactionId	initial_effective_xmin = InvalidTransactionId;
+	TransactionId	initial_catalog_effective_xmin = InvalidTransactionId;
 
 	for (;;)
 	{
@@ -1396,15 +1399,20 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
 				case RS_INVAL_HORIZON:
 					if (!SlotIsLogical(s))
 						break;
+					if (!terminated)
+					{
+						initial_effective_xmin = s->effective_xmin;
+						initial_catalog_effective_xmin = s->effective_catalog_xmin;
+					}
 					/* invalid DB oid signals a shared relation */
 					if (dboid != InvalidOid && dboid != s->data.database)
 						break;
-					if (TransactionIdIsValid(s->effective_xmin) &&
-						TransactionIdPrecedesOrEquals(s->effective_xmin,
+					if (TransactionIdIsValid(initial_effective_xmin) &&
+						TransactionIdPrecedesOrEquals(initial_effective_xmin,
 													  snapshotConflictHorizon))
 						conflict = cause;
-					else if (TransactionIdIsValid(s->effective_catalog_xmin) &&
-							 TransactionIdPrecedesOrEquals(s->effective_catalog_xmin,
+					else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+							 TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
 														   snapshotConflictHorizon))
 						conflict = cause;
 					break;
@@ -1499,6 +1507,7 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
 					(void) kill(active_pid, SIGTERM);
 
 				last_signaled_pid = active_pid;
+				terminated = true;
 			}
 
 			/* Wait until the slot is released. */
-- 
2.34.1
