On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>
> Attach a new version patch which fixed an un-initialized variable issue and
> added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so
> that we can analyze the possible CFbot failures easily.

As suggested by Amit in [1], for the fix being discussed where we need
to advance the synced slot on standby, we need to skip the dbid check
in fast_forward mode in CreateDecodingContext(). We tried a few tests to
make sure that no table access happens during fast-forward mode:

1) Initially we tried skipping the database-id check in
CreateDecodingContext() only when it is called by
pg_logical_replication_slot_advance(). 'make check-world' passed on
HEAD with that change.

2) But the more generic solution is to skip the database check whenever
"fast_forward" is true. We tried that as well, and 'make check-world'
passed on HEAD.

3) Another thing Hou-San tried was to run pgbench after skipping the db
check in the fast_forward logical decoding case.
pgbench was run to generate some changes and then the logical slot was
advanced to the latest position in another database. A LOG was added
in relation_open to catch table access. We found no table access
in fast-forward logical decoding, i.e. no LOG messages for
table-open were generated during the test. The steps are given at [2].

[1]: 
https://www.postgresql.org/message-id/CAA4eK1KMiKangJa4NH_K1oFc87Y01n3rnpuwYagT59Y%3DADW8Dw%40mail.gmail.com

[2]:
--------------
1. apply the DEBUG patch (attached as .txt) which will log the
relation open and catalog cache (CatCache) access.

2. create a slot:
SELECT 'init' FROM pg_create_logical_replication_slot('logicalslot',
'test_decoding', false, false, true);

3. run pgbench to generate some data.
pgbench -i postgres
pgbench --aggregate-interval=5 --time=5 --client=10 --log --rate=1000
--latency-limit=10 --failures-detailed --max-tries=10 postgres

4. start a fresh session in a different db and advance the slot to the
latest position. There should be no relation open or CatCache log
between the LOG "starting logical decoding for slot .." and LOG
"decoding over".
SELECT pg_replication_slot_advance('logicalslot', pg_current_wal_lsn());
--------------
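
The check in step 4 can also be done mechanically rather than by eye. Below is
a minimal sketch, assuming the server log is available as a plain text file.
The function name check_decoding_window and the log file path are hypothetical
and not part of the patch; the matched strings are the messages named in
step 4 and added by the attached DEBUG patch. It counts relation-open and
CatCache log lines that appear between "starting logical decoding for slot"
and "decoding over".

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): count table/catalog access
# log lines that fall inside a slot-advance window in a server log file.
# Usage: check_decoding_window <server-log-file>
check_decoding_window() {
    awk '
        # Window opens at the standard decoding start message.
        /starting logical decoding for slot/ { inside = 1; next }
        # Window closes at the message added by the DEBUG patch.
        /decoding over/                      { inside = 0; next }
        # relation_open also matches try_relation_open.
        inside && /relation_open|SearchCatCacheInternal/ { hits++ }
        # Print 0 rather than an empty string when nothing matched.
        END { print hits + 0 }
    ' "$1"
}
```

For example, `check_decoding_window /path/to/postgresql.log` (path is an
assumption) should print 0 if no table or catalog access happened during the
advance.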

thanks
Shveta
From 5386894faa14c0de9854e0eee9679f8eea775f65 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.f...@fujitsu.com>
Date: Fri, 29 Mar 2024 11:46:36 +0800
Subject: [PATCH] debug log

---
 src/backend/access/common/relation.c | 2 ++
 src/backend/replication/slotfuncs.c  | 1 +
 src/backend/utils/cache/catcache.c   | 1 +
 3 files changed, 4 insertions(+)

diff --git a/src/backend/access/common/relation.c b/src/backend/access/common/relation.c
index d8a313a2c9..40718fc47e 100644
--- a/src/backend/access/common/relation.c
+++ b/src/backend/access/common/relation.c
@@ -50,6 +50,7 @@ relation_open(Oid relationId, LOCKMODE lockmode)
 
        Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES);
 
+       elog(LOG, "relation_open");
        /* Get the lock before trying to open the relcache entry */
        if (lockmode != NoLock)
                LockRelationOid(relationId, lockmode);
@@ -91,6 +92,7 @@ try_relation_open(Oid relationId, LOCKMODE lockmode)
 
        Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES);
 
+       elog(LOG, "try_relation_open");
        /* Get the lock first */
        if (lockmode != NoLock)
                LockRelationOid(relationId, lockmode);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index ef5081784c..564b36fc45 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -608,6 +608,7 @@ pg_logical_replication_slot_advance(XLogRecPtr moveto,
 
                /* free context, call shutdown callback */
                FreeDecodingContext(ctx);
+               elog(LOG, "decoding over");
 
                InvalidateSystemCaches();
        }
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 569f51cb33..e19c586697 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -1328,6 +1328,7 @@ SearchCatCacheInternal(CatCache *cache,
 
        Assert(cache->cc_nkeys == nkeys);
 
+       elog(LOG, "SearchCatCacheInternal");
        /*
         * one-time startup overhead for each cache
         */
-- 
2.31.1
