On Tue, 30 Jan 2024 at 17:22, Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>
> On Tuesday, January 30, 2024 11:21 AM vignesh C <vignes...@gmail.com> wrote:
> >
> > On Tue, 30 Jan 2024 at 07:24, Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com>
> > wrote:
> > >
> > > On Monday, January 29, 2024 9:22 PM vignesh C <vignes...@gmail.com>
> > wrote:
> > > >
> > > > On Fri, 26 Jan 2024 at 11:30, Alexander Lakhin <exclus...@gmail.com>
> > wrote:
> > > > >
> > > > > Hello hackers,
> > > > >
> > > > > After determining a possible cause for intermittent failures of
> > > > > the test subscription/031_column_list [1], I was wondering what
> > > > > makes another subscription test (014_binary) fail on the buildfarm:
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snakefly&dt=2024-01-22%2001%3A19%3A03
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-01-14%2018%3A19%3A20
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2023-12-21%2001%3A11%3A52
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2023-11-27%2001%3A42%3A39
> > > > >
> > > > > All those failures are caused by a timeout when waiting for a message
> > > > > expected in _subscriber.log. For example, in snakefly's case:
> > > > > [01:14:46.158](1.937s) ok 7 - check synced data on subscriber with custom type
> > > > > timed out waiting for match: (?^:ERROR: ( [A-Z0-9]+:)? incorrect binary data format)
> > > > > at /home/bf/bf-build/piculet/HEAD/pgsql/src/test/subscription/t/014_binary.pl line 269.
> > > > >
> > > > > _subscriber.log contains:
> > > > > 2023-12-21 01:14:46.215 UTC [410039] 014_binary.pl LOG:  statement:
> > > > > ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
> > > > > 2023-12-21 01:17:46.756 UTC [409999] ERROR:  could not receive
> > > > > data from WAL stream: server closed the connection unexpectedly
> > > > >          This probably means the server terminated abnormally
> > > > >          before or while processing the request.
> > > > > 2023-12-21 01:17:46.760 UTC [405057] LOG:  background worker
> > > > > "logical replication apply worker" (PID 409999) exited with exit
> > > > > code 1
> > > > > 2023-12-21 01:17:46.779 UTC [532857] LOG:  logical replication
> > > > > apply worker for subscription "tsub" has started ...
> > > > >
> > > > > While _subscriber.log from a successful test run contains:
> > > > > 2024-01-26 03:49:07.065 UTC [9726:5] 014_binary.pl LOG:  statement:
> > > > > ALTER SUBSCRIPTION tsub REFRESH PUBLICATION;
> > > > > 2024-01-26 03:49:07.075 UTC [9726:6] 014_binary.pl LOG: disconnection:
> > > > > session time: 0:00:00.014 user=postgres database=postgres
> > > > > host=[local]
> > > > > 2024-01-26 03:49:07.558 UTC [9729:1] LOG:  logical replication
> > > > > apply worker for subscription "tsub" has started
> > > > > 2024-01-26 03:49:07.563 UTC [9731:1] LOG:  logical replication
> > > > > table synchronization worker for subscription "tsub", table
> > > > > "test_mismatching_types" has started
> > > > > 2024-01-26 03:49:07.585 UTC [9731:2] ERROR:  incorrect binary data
> > > > > format
> > > > > 2024-01-26 03:49:07.585 UTC [9731:3] CONTEXT:  COPY
> > > > > test_mismatching_types, line 1, column a
> > > > >
> > > > > In this case, "logical replication apply worker for subscription
> > > > > "tsub" has started" appears just after "ALTER SUBSCRIPTION",
> > > > > not 3 minutes later.
> > > > >
> > > > > I've managed to reproduce this failure locally by running multiple
> > > > > tests in parallel, and my analysis shows that it is caused by a
> > > > > race condition when accessing variable table_states_valid inside
> > tablesync.c.
> > > > >
> > > > > tablesync.c does the following with table_states_valid:
> > > > > /*
> > > > >  * Callback from syscache invalidation.
> > > > >  */
> > > > > void
> > > > > invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue)
> > > > > {
> > > > >     table_states_valid = false;
> > > > > }
> > > > > ...
> > > > > static bool
> > > > > FetchTableStates(bool *started_tx)
> > > > > {
> > > > > ...
> > > > >     if (!table_states_valid)
> > > > >     {
> > > > > ...
> > > > >         /* Fetch all non-ready tables. */
> > > > >         rstates = GetSubscriptionRelations(MySubscription->oid, true);
> > > > > ...
> > > > >         table_states_valid = true;
> > > > >     }
> > > > >
> > > > > So, when syscache invalidation occurs inside the code block "if
> > > > > (!table_states_valid)", that invalidation is effectively ignored.
> > > > >
> > > > > In a failed case I observe the following events:
> > > > > 1. logical replication apply worker performs
> > > > >    LogicalRepApplyLoop() -> process_syncing_tables() ->
> > > > >    process_syncing_tables_for_apply() -> FetchTableStates()
> > > > >    periodically.
> > > > >
> > > > > 2. ALTER SUBSCRIPTION tsub REFRESH PUBLICATION invalidates syscache
> > > > >    for SUBSCRIPTIONRELMAP, and that leads to calling
> > > > >    invalidate_syncing_table_states().
> > > > >
> > > > > 3. If the apply worker manages to fetch no non-ready tables in
> > > > >    FetchTableStates() and ignore "table_states_valid = false" from
> > > > >    invalidate_syncing_table_states(), then it just misses the
> > > > >    invalidation event, so it keeps working without noticing non-ready
> > > > >    tables appeared as the result of ALTER SUBSCRIPTION
> > > > >    (process_syncing_tables_for_apply() skips a loop
> > > > >    "foreach(lc, table_states_not_ready) ..." until some other event
> > > > >    occurs).
> > > > >
> > > > > pg_usleep(100000) added just below GetSubscriptionRelations(...)
> > > > > proves my analysis -- without it, I need tens of iterations (with
> > > > > 50 tests running in parallel) to catch the failure, but with it, I
> > > > > get the failure on the first iteration.
> > > >
> > > > Thanks for the analysis, I was able to reproduce the issue with the
> > > > steps you had shared. I agree with your analysis. I added some logs
> > > > to verify that the invalidation was getting missed.
> > > >
> > > > I felt that this invalidation is getting ignored because we have used
> > > > a boolean variable here. How about changing it slightly so that
> > > > table_states_invalid gets incremented for every invalidation and then
> > > > decremented after getting the non-ready tables, like in the attached
> > > > patch? I was able to verify that the test passes with the attached
> > > > patch.
> > >
> > > Thanks for the patch.
> > >
> > > I am not sure if counting the number of invalidations is needed, as
> > > even if there are hundreds of invalidations outside of FetchTableStates,
> > > one FetchTableStates call should reset the count to 0 as it is checking
> > > the latest catalog.
> >
> > Another approach I was thinking of is to reset table_states_valid
> > immediately at the beginning of FetchTableStates, so any new invalidation
> > will mark the table states as invalid again and will be handled in the
> > next iteration of fetching non-ready tables, like in the attached patch.
>
> I think this is not the standard approach for cache building, because if
> any ERROR happens during the cache building, then we will end up in a
> situation where table_states_valid = true while the cache data is invalid.
> Even if we currently don't access this cache after erroring out, I think
> we'd better avoid this risk.

Here is an updated patch which changes the boolean variable to a
tri-state enum and sets the table state to valid only if no invalidations
have occurred while the list is being prepared.

Regards,
Vignesh
From cec60bbd7a6d1b66672c6969a11b89a8e18190c9 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Thu, 1 Feb 2024 09:46:40 +0530
Subject: [PATCH v3] Table sync missed for newly added tables because
 subscription relation cache invalidation was not handled in certain
 concurrent scenarios.

Table sync was missed if the invalidation of the table sync occurred
while the non-ready tables were being prepared. This occurred because
the table state was set to valid at the end of the non-ready table list
preparation, irrespective of any invalidations that occurred while
preparing the list. Fix it by changing the boolean variable to a
tri-state enum and by setting the table state to valid only if no
invalidations have occurred while the list is being prepared.
---
 src/backend/replication/logical/tablesync.c | 25 +++++++++++++++++----
 src/tools/pgindent/typedefs.list            |  1 +
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 5acab3f3e2..b38b0e3ba1 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -123,7 +123,14 @@
 #include "utils/syscache.h"
 #include "utils/usercontext.h"
 
-static bool table_states_valid = false;
+typedef enum
+{
+	SYNC_TABLE_STATE_NEEDS_REBUILD,
+	SYNC_TABLE_STATE_REBUILD_STARTED,
+	SYNC_TABLE_STATE_VALID,
+} SyncingTablesState;
+
+static SyncingTablesState table_states_valid = SYNC_TABLE_STATE_NEEDS_REBUILD;
 static List *table_states_not_ready = NIL;
 static bool FetchTableStates(bool *started_tx);
 
@@ -273,7 +280,7 @@ wait_for_worker_state_change(char expected_state)
 void
 invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue)
 {
-	table_states_valid = false;
+	table_states_valid = SYNC_TABLE_STATE_NEEDS_REBUILD;
 }
 
 /*
@@ -1568,13 +1575,15 @@ FetchTableStates(bool *started_tx)
 
 	*started_tx = false;
 
-	if (!table_states_valid)
+	if (table_states_valid != SYNC_TABLE_STATE_VALID)
 	{
 		MemoryContext oldctx;
 		List	   *rstates;
 		ListCell   *lc;
 		SubscriptionRelState *rstate;
 
+		table_states_valid = SYNC_TABLE_STATE_REBUILD_STARTED;
+
 		/* Clean the old lists. */
 		list_free_deep(table_states_not_ready);
 		table_states_not_ready = NIL;
@@ -1608,7 +1617,15 @@ FetchTableStates(bool *started_tx)
 		has_subrels = (table_states_not_ready != NIL) ||
 			HasSubscriptionRelations(MySubscription->oid);
 
-		table_states_valid = true;
+		/*
+		 * If the subscription relation cache has been invalidated since we
+		 * entered this routine, we still use and return the relations we just
+		 * finished constructing, to avoid infinite loops, but we leave the
+		 * table states marked as stale so that we'll rebuild it again on next
+		 * access. Otherwise, we mark the table states as valid.
+		 */
+		if (table_states_valid == SYNC_TABLE_STATE_REBUILD_STARTED)
+			table_states_valid = SYNC_TABLE_STATE_VALID;
 	}
 
 	return has_subrels;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 91433d439b..5a40f549f9 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2703,6 +2703,7 @@ SupportRequestSelectivity
 SupportRequestSimplify
 SupportRequestWFuncMonotonic
 Syn
+SyncingTablesState
 SyncOps
 SyncRepConfigData
 SyncRepStandbyData
-- 
2.34.1
