On Fri, Jun 7, 2024 at 7:57 AM Zhijie Hou (Fujitsu)
wrote:
>
> Thanks for the comments! Here is the V6 patch that addressed these.
>
I have pushed this after making minor changes in the wording. I have
also changed one of the queries in docs to ignore the NULL slot_name
values.
--
With Rega
On Thursday, June 6, 2024 12:21 PM Peter Smith
>
> Hi, here are some review comments for the docs patch v5-0001.
Thanks for the comments! Here is the V6 patch that addressed these.
Best Regards,
Hou zj
v6-0001-Document-the-steps-to-check-if-the-standby-is-rea.patch
Description: v6-0001-D
Hi, here are some review comments for the docs patch v5-0001.
Apart from these it LGTM.
==
doc/src/sgml/logical-replication.sgml
1.
+
+ On the subscriber node, use the following SQL to identify which slots
+ should be synced to the standby that we plan to promote. This query will
On Wednesday, June 5, 2024 2:32 PM Peter Smith wrote:
> Hi. Here are some minor review comments for the docs patch v4-0001.
Thanks for the comments!
> The SGML file wrapping can be fixed to fill up to 80 cols for some of the
> paragraphs.
Unlike comments in C code, I think we don't force the
Hi. Here are some minor review comments for the docs patch v4-0001.
==
doc/src/sgml/logical-replication.sgml
1. General
The SGML file wrapping can be fixed to fill up to 80 cols for some of
the paragraphs.
~~~
2.
+ standby is promoted. They can continue subscribing to publications
now on
On Wed, Jun 5, 2024 at 7:52 AM Zhijie Hou (Fujitsu)
wrote:
>
> Attach the V4 doc patch which addressed Peter and Bertrand's comments.
>
Few comments:
1.
+ On the subscriber node, use the following SQL to identify
+ which slots should be synced to the standby that we plan to promote.
On Thursday, May 23, 2024 1:34 PM Peter Smith wrote:
Thanks for the comments. I addressed most of the comments except the
following one which I am not sure:
> 5b.
> Patch says "on the subscriber node", but isn't that the simplest case?
> e.g. maybe there are multiple nodes having subscriptions f
On Wednesday, May 8, 2024 5:21 PM Bertrand Drouvot
wrote:
> A few comments:
Thanks for the comments!
> 2 ===
>
> +test_sub=# SELECT
> + array_agg(slotname) AS slots
> + FROM
> + ((
> + SELECT r.srsubid AS subid, CONCAT('pg_', srsubid, '_sync_',
>
Here are some review comments for the docs patch v3-0001.
==
Commit message
1.
This patch adds detailed documentation for the slot sync feature
including examples to guide users on how to verify that all slots have
been successfully synchronized to the standby server and how to
confirm whethe
Hi,
On Mon, Apr 29, 2024 at 11:58:09AM +, Zhijie Hou (Fujitsu) wrote:
> On Monday, April 29, 2024 5:11 PM shveta malik wrote:
> >
> > On Mon, Apr 29, 2024 at 11:38 AM shveta malik
> > wrote:
> > >
> > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu)
> > > wrote:
> > > >
> > > > On F
On Mon, Apr 29, 2024 at 5:28 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Monday, April 29, 2024 5:11 PM shveta malik wrote:
> >
> > On Mon, Apr 29, 2024 at 11:38 AM shveta malik
> > wrote:
> > >
> > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu)
> > > wrote:
> > > >
> > > > On Friday, March
On Monday, April 29, 2024 5:11 PM shveta malik wrote:
>
> On Mon, Apr 29, 2024 at 11:38 AM shveta malik
> wrote:
> >
> > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu)
> > wrote:
> > >
> > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot
> wrote:
> > > >
> > > > Hi,
> > > >
> > > >
On Mon, Apr 29, 2024 at 11:38 AM shveta malik wrote:
>
> On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot
> > wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote:
> > >
On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu)
wrote:
>
> On Friday, March 15, 2024 10:45 PM Bertrand Drouvot
> wrote:
> >
> > Hi,
> >
> > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote:
> > > Hi,
> > >
> > > Since the standby_slot_names patch has been committed, I a
On Friday, March 15, 2024 10:45 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote:
> > Hi,
> >
> > Since the standby_slot_names patch has been committed, I am attaching
> > the last doc patch for review.
> >
>
> Thanks!
>
> 1 ===
>
>
On Friday, April 12, 2024 11:31 AM Amit Kapila wrote:
>
> On Thu, Apr 11, 2024 at 5:04 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Thursday, April 11, 2024 12:11 PM Amit Kapila
> wrote:
> >
> > >
> > > 2.
> > > - if (remote_slot->restart_lsn < slot->data.restart_lsn)
> > > + if (remote_slot->co
On Thu, Apr 11, 2024 at 5:04 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Thursday, April 11, 2024 12:11 PM Amit Kapila
> wrote:
>
> >
> > 2.
> > - if (remote_slot->restart_lsn < slot->data.restart_lsn)
> > + if (remote_slot->confirmed_lsn < slot->data.confirmed_flush)
> > elog(ERROR,
> > "cannot s
On Thursday, April 11, 2024 12:11 PM Amit Kapila
wrote:
>
> On Wed, Apr 10, 2024 at 5:28 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Thursday, April 4, 2024 5:37 PM Amit Kapila
> wrote:
> > >
> > > BTW, while thinking on this one, I
> > > noticed that in the function LogicalConfirmReceivedLoca
On Wed, Apr 10, 2024 at 5:28 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Thursday, April 4, 2024 5:37 PM Amit Kapila
> wrote:
> >
> > BTW, while thinking on this one, I
> > noticed that in the function LogicalConfirmReceivedLocation(), we first
> > update the disk copy, see comment [1] and then i
On Thursday, April 4, 2024 5:37 PM Amit Kapila wrote:
>
> BTW, while thinking on this one, I
> noticed that in the function LogicalConfirmReceivedLocation(), we first update
> the disk copy, see comment [1] and then in-memory whereas the same is not
> true in update_local_synced_slot() for the
On Thursday, April 4, 2024 4:25 PM Masahiko Sawada
wrote:
Hi,
> On Wed, Apr 3, 2024 at 7:06 PM Amit Kapila
> wrote:
> >
> > On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila
> wrote:
> > >
> > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
> > > wrote:
> > >
> > > > I quickly looked at v8, and
On Mon, Apr 8, 2024 at 7:01 PM Zhijie Hou (Fujitsu)
wrote:
>
> Thanks for pushing.
>
> I checked the BF status, and noticed one BF failure, which I think is related
> to a miss in the test code.
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-04-08%2012%3A04%3A27
>
> Fro
On Mon, Apr 8, 2024 at 9:49 PM Andres Freund wrote:
>
> On 2024-04-08 16:01:41 +0530, Amit Kapila wrote:
> > Pushed.
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-04-08%2012%3A04%3A27
>
> This unfortunately is a commit after
>
Right, and thanks for the report. Hou-San
Hi,
On 2024-04-08 16:01:41 +0530, Amit Kapila wrote:
> Pushed.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&dt=2024-04-08%2012%3A04%3A27
This unfortunately is a commit after
commit 6f3d8d5e7cc
Author: Amit Kapila
Date: 2024-04-08 13:21:55 +0530
Fix the intermittent buil
On Monday, April 8, 2024 6:32 PM Amit Kapila wrote:
>
> On Mon, Apr 8, 2024 at 12:19 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Saturday, April 6, 2024 12:43 PM Amit Kapila
> wrote:
> > > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot
> > > wrote:
> > >
> > > Yeah, that could be the first st
On Mon, Apr 8, 2024 at 12:19 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Saturday, April 6, 2024 12:43 PM Amit Kapila
> wrote:
> > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot
> > wrote:
> >
> > Yeah, that could be the first step. We can probably add an injection point
> > to control the bgwr
On Saturday, April 6, 2024 12:43 PM Amit Kapila wrote:
> On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot
> wrote:
> >
> > On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote:
> > > On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila
> wrote:
> > > Thinking more on this, it doesn't seem related to
On Sun, Apr 7, 2024 at 3:06 AM Andres Freund wrote:
>
> On 2024-04-06 10:58:32 +0530, Amit Kapila wrote:
> > On Sat, Apr 6, 2024 at 10:13 AM Amit Kapila wrote:
> > >
> >
> > There are still a few pending issues to be fixed in this feature but
> > otherwise, we have committed all the main patches,
Hi,
On 2024-04-06 10:58:32 +0530, Amit Kapila wrote:
> On Sat, Apr 6, 2024 at 10:13 AM Amit Kapila wrote:
> >
>
> There are still a few pending issues to be fixed in this feature but
> otherwise, we have committed all the main patches, so I marked the CF
> entry corresponding to this work as com
Hi,
On Sat, Apr 06, 2024 at 10:13:00AM +0530, Amit Kapila wrote:
> On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot
> wrote:
>
> I think the new LSN can be visible only when the corresponding WAL is
> written by XLogWrite(). I don't know what in XLogSetAsyncXactLSN() can
> make it visible. In you
On Sat, Apr 6, 2024 at 10:13 AM Amit Kapila wrote:
>
There are still a few pending issues to be fixed in this feature but
otherwise, we have committed all the main patches, so I marked the CF
entry corresponding to this work as committed.
--
With Regards,
Amit Kapila.
On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot
wrote:
>
> On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote:
> > On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila wrote:
> > Thinking more on this, it doesn't seem related to
> > c9920a9068eac2e6c8fb34988d18c0b42b9bf811 as that commit doesn't c
Hi,
On Fri, Apr 05, 2024 at 02:35:42PM +, Bertrand Drouvot wrote:
> I think that maybe as a first step we should move the "elog(DEBUG2," message
> as proposed above to help debugging (that could help to confirm the above
> theory).
If you agree and think that makes sense, please find attac
Hi,
On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote:
> On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila wrote:
> Thinking more on this, it doesn't seem related to
> c9920a9068eac2e6c8fb34988d18c0b42b9bf811 as that commit doesn't change
> any locking or something like that which impacts writ
On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila wrote:
>
> On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
> >
> > There is an intermittent BF failure observed at [1] after this commit
> > (2ec005b).
> >
>
> Thanks for analyzing and providing the patch. I'll look into it. There
> is another BF fai
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
>
> There is an intermittent BF failure observed at [1] after this commit
> (2ec005b).
>
Thanks for analyzing and providing the patch. I'll look into it. There
is another BF failure [1] which I have analyzed. The main reason for
failure is the f
On Fri, Apr 5, 2024 at 4:31 PM Bertrand Drouvot
wrote:
>
> BTW, I just realized that the LSN I used in my example in the
> LSN_FORMAT_ARGS()
> are not the right ones.
Noted. Thanks.
Please find v3 with the comments addressed.
thanks
Shveta
v3-0001-Correct-sanity-check-to-compare-confirmed_ls
Hi,
On Fri, Apr 05, 2024 at 04:09:01PM +0530, shveta malik wrote:
> On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot
> wrote:
> >
> > What about something like?
> >
> > ereport(LOG,
> > errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs
> > from remote slot",
> > re
On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot
wrote:
>
> What about something like?
>
> ereport(LOG,
> errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs from
> remote slot",
> remote_slot->name),
> errdetail("Remote slot has LSN %X/%X but local slot has L
Hi,
On Fri, Apr 05, 2024 at 09:43:35AM +0530, shveta malik wrote:
> On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot
> wrote:
> >
> > Hi,
> >
> > On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote:
> > > On Thu, Apr 4, 2024 at 2:59 PM shveta malik
> > > wrote:
> > 2 ===
> >
> > +
On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote:
> > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
> > >
> > >
> > > Prior to commit 2ec005b, this check was okay, as we did not expect
> > > restart_lsn of the syn
Hi,
On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote:
> On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
> >
> >
> > Prior to commit 2ec005b, this check was okay, as we did not expect
> > restart_lsn of the synced slot to be ahead of remote since we were
> > directly copying the lsn
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote:
>
>
> Prior to commit 2ec005b, this check was okay, as we did not expect
> restart_lsn of the synced slot to be ahead of remote since we were
> directly copying the lsns. But now when we use 'advance' to do logical
> decoding on standby, there is
On Thu, Apr 4, 2024 at 1:55 PM Masahiko Sawada wrote:
>
> While testing this change, I realized that it could happen that the
> server logs are flooded with the following logical decoding logs that
> are written every 200 ms:
>
> 2024-04-04 16:15:19.270 JST [3838739] LOG: starting logical decodin
On Wed, Apr 3, 2024 at 3:36 PM Amit Kapila wrote:
>
> On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote:
> >
> > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
> > wrote:
> >
> > > I quickly looked at v8, and have a nit, rest all looks good.
> > >
> > > +if (DecodingContextReady(ctx) &
On Wed, Apr 3, 2024 at 7:06 PM Amit Kapila wrote:
>
> On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote:
> >
> > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
> > wrote:
> >
> > > I quickly looked at v8, and have a nit, rest all looks good.
> > >
> > > +if (DecodingContextReady(ctx) &
On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote:
>
> On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
> wrote:
>
> > I quickly looked at v8, and have a nit, rest all looks good.
> >
> > +if (DecodingContextReady(ctx) && found_consistent_snapshot)
> > +*found_consistent_snaps
On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy
wrote:
>
> On Wed, Apr 3, 2024 at 9:04 AM Amit Kapila wrote:
> >
> > > I'd just rename LogicalSlotAdvanceAndCheckSnapState(XLogRecPtr
> > > moveto, bool *found_consistent_snapshot) to
> > > pg_logical_replication_slot_advance(XLogRecPtr moveto, boo
On Wed, Apr 3, 2024 at 9:04 AM Amit Kapila wrote:
>
> > I'd just rename LogicalSlotAdvanceAndCheckSnapState(XLogRecPtr
> > moveto, bool *found_consistent_snapshot) to
> > pg_logical_replication_slot_advance(XLogRecPtr moveto, bool
> > *found_consistent_snapshot) and use it. If others don't like th
On Tue, Apr 2, 2024 at 7:42 PM Bharath Rupireddy
wrote:
>
> On Tue, Apr 2, 2024 at 7:25 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> > > 1. Can we just remove pg_logical_replication_slot_advance and use
> > > LogicalSlotAdvanceAndCheckSnapState instead? If worried about the
> > > function naming, Logic
On Tue, Apr 2, 2024 at 7:25 PM Zhijie Hou (Fujitsu)
wrote:
>
> > 1. Can we just remove pg_logical_replication_slot_advance and use
> > LogicalSlotAdvanceAndCheckSnapState instead? If worried about the
> > function naming, LogicalSlotAdvanceAndCheckSnapState can be renamed to
> > pg_logical_replica
On Tuesday, April 2, 2024 8:49 PM Bharath Rupireddy
wrote:
>
> On Tue, Apr 2, 2024 at 2:11 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> > CFbot[1] complained about one query result's order in the tap-test, so I am
> > attaching a V7 patch set which fixed this. There are no changes in 0001.
> >
> > [1
On Tue, Apr 2, 2024 at 2:11 PM Zhijie Hou (Fujitsu)
wrote:
>
> CFbot[1] complained about one query result's order in the tap-test, so I am
> attaching a V7 patch set which fixed this. There are no changes in 0001.
>
> [1] https://cirrus-ci.com/task/6375962162495488
Thanks. Here are some comments:
Hi,
On Tue, Apr 02, 2024 at 02:19:30PM +0530, Amit Kapila wrote:
> On Tue, Apr 2, 2024 at 1:54 PM Bertrand Drouvot
> wrote:
> > What about adding a "wait" injection point in LogStandbySnapshot() to
> > prevent checkpointer/bgwriter to log a standby snapshot? Something among those
> > lines:
On Tue, Apr 2, 2024 at 1:54 PM Bertrand Drouvot
wrote:
>
> On Tue, Apr 02, 2024 at 07:20:46AM +, Zhijie Hou (Fujitsu) wrote:
> > I added one test in 040_standby_failover_slots_sync.pl in 0002 patch, which
> > can reproduce the data loss issue consistently on my machine.
>
> Thanks!
>
> >
On Tuesday, April 2, 2024 3:21 PM Zhijie Hou (Fujitsu)
wrote:
> On Tuesday, April 2, 2024 8:35 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Monday, April 1, 2024 7:30 PM Amit Kapila
> > wrote:
> > >
> > > On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu)
> > >
> > > wrote:
> > > >
> > > > On
Hi,
On Tue, Apr 02, 2024 at 07:20:46AM +, Zhijie Hou (Fujitsu) wrote:
> I added one test in 040_standby_failover_slots_sync.pl in 0002 patch, which
> can reproduce the data loss issue consistently on my machine.
Thanks!
> It may not reproduce
> in some rare cases if concurrent xl_running_
On Tuesday, April 2, 2024 8:35 AM Zhijie Hou (Fujitsu)
wrote:
>
> On Monday, April 1, 2024 7:30 PM Amit Kapila
> wrote:
> >
> > On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu)
> >
> > wrote:
> > >
> > > On Friday, March 29, 2024 2:50 PM Amit Kapila
> > >
> > wrote:
> > > >
> > >
> > > >
>
Hi,
On Tue, Apr 02, 2024 at 04:24:49AM +, Zhijie Hou (Fujitsu) wrote:
> On Monday, April 1, 2024 9:28 PM Bertrand Drouvot
> wrote:
> >
> > On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote:
> > > On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot
> >
> > >
> > > > 2 ===
> > > >
> >
On Mon, Apr 1, 2024 at 5:05 PM Amit Kapila wrote:
>
> > 2 ===
> >
> > + {
> > + if (SnapBuildSnapshotExists(remote_slot->restart_lsn))
> > + {
> >
> > That could call SnapBuildSnapshotExists() multiple times for the same
> > "restart_lsn" (for example in case of m
On Tuesday, April 2, 2024 8:43 AM Bharath Rupireddy
wrote:
>
> On Mon, Apr 1, 2024 at 11:36 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> > Attach the V4 patch which includes the optimization to skip the
> > decoding if the snapshot at the syncing restart_lsn is already
> > serialized. It can avoid mo
On Mon, Apr 1, 2024 at 6:58 PM Bertrand Drouvot
wrote:
>
> On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote:
> > On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot
> > wrote:
> > > Then there is no need to call WaitForStandbyConfirmation() as it could go
> > > until the RecoveryInP
On Mon, Apr 1, 2024 at 11:36 AM Zhijie Hou (Fujitsu)
wrote:
>
> Attach the V4 patch which includes the optimization to skip the decoding if
> the snapshot at the syncing restart_lsn is already serialized. It can avoid
> most of the duplicate decoding in my test, and I am doing some more tests l
On Monday, April 1, 2024 7:30 PM Amit Kapila wrote:
>
> On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> > On Friday, March 29, 2024 2:50 PM Amit Kapila
> wrote:
> > >
> >
> > >
> > >
> > > 2.
> > > +extern XLogRecPtr pg_logical_replication_slot_advance(XLogRecPtr
> moveto,
>
Hi,
On Mon, Apr 01, 2024 at 05:04:53PM +0530, Amit Kapila wrote:
> On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot
> wrote:
> > Then there is no need to call WaitForStandbyConfirmation() as it could go
> > until the RecoveryInProgress() in StandbySlotsHaveCaughtup() for nothing (as we
> > a
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote:
>
> After this step and before the next, did you ensure that the slot sync
> has synced the latest confirmed_flush/restart LSNs? You can query:
> "select slot_name,restart_lsn, confirmed_flush_lsn from
> pg_replication_slots;" to ensure the same o
On Mon, Apr 1, 2024 at 2:51 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Mon, Apr 01, 2024 at 06:05:34AM +, Zhijie Hou (Fujitsu) wrote:
> > On Monday, April 1, 2024 8:56 AM Zhijie Hou (Fujitsu)
> > wrote:
> > Attach the V4 patch which includes the optimization to skip the decoding if
> > the sn
On Mon, Apr 1, 2024 at 6:26 AM Zhijie Hou (Fujitsu)
wrote:
>
> On Friday, March 29, 2024 2:50 PM Amit Kapila wrote:
> >
>
> >
> >
> > 2.
> > +extern XLogRecPtr pg_logical_replication_slot_advance(XLogRecPtr moveto,
> > + bool *found_consistent_point);
> > +
> >
> > This API looks a bit awkward
Did performance test on optimization patch
(v2-0001-optimize-the-slot-advancement.patch). Please find the
results:
Setup:
- One primary node with 100 failover-enabled logical slots
- 20 DBs, each having 5 failover-enabled logical replication slots
- One physical standby node with 'sync_replica
Hi,
On Mon, Apr 01, 2024 at 06:05:34AM +, Zhijie Hou (Fujitsu) wrote:
> On Monday, April 1, 2024 8:56 AM Zhijie Hou (Fujitsu)
> wrote:
> Attach the V4 patch which includes the optimization to skip the decoding if
> the snapshot at the syncing restart_lsn is already serialized. It can avoid
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote:
>
> On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy
> wrote:
> >
> > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
> > wrote:
> > >
> > > [2] The steps to reproduce the data miss issue on a primary->standby
> > > setup:
> >
> > I'm tr
On Monday, April 1, 2024 8:56 AM Zhijie Hou (Fujitsu)
wrote:
>
> On Friday, March 29, 2024 2:50 PM Amit Kapila
> wrote:
> >
> > On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu)
> >
> > wrote:
> > >
> > >
> > > Attach a new version patch which fixed an un-initialized variable
> > > issue an
On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy
wrote:
>
> On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> > [2] The steps to reproduce the data miss issue on a primary->standby setup:
>
> I'm trying to reproduce the problem with [1], but I can see the
> changes after the s
On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
wrote:
>
> [2] The steps to reproduce the data miss issue on a primary->standby setup:
I'm trying to reproduce the problem with [1], but I can see the
changes after the standby is promoted. Am I missing anything here?
ubuntu:~/postgres/pg17/b
On Friday, March 29, 2024 2:50 PM Amit Kapila wrote:
>
> On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> >
> > Attach a new version patch which fixed an un-initialized variable
> > issue and added some comments.
> >
>
> The other approach to fix this issue could be that the
Hi,
On Fri, Mar 29, 2024 at 02:35:22PM +0530, Amit Kapila wrote:
> On Fri, Mar 29, 2024 at 1:08 PM Bertrand Drouvot
> wrote:
> >
> > On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote:
> > > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot
> > > wrote:
> > > >
> > > > Hi,
>
On Fri, Mar 29, 2024 at 1:08 PM Bertrand Drouvot
wrote:
>
> On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote:
> > On Friday, March 29, 2024 2:48 PM Bertrand Drouvot
> > wrote:
> > >
> > > Hi,
> > >
> > > On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote:
>
On Fri, Mar 29, 2024 at 9:34 AM Hayato Kuroda (Fujitsu)
wrote:
>
> Thanks for updating the patch! Here is a comment for it.
>
> ```
> +/*
> + * By advancing the restart_lsn, confirmed_lsn, and xmin using
> + * fast-forward logical decoding, we can verify whether a consisten
Hi,
On Fri, Mar 29, 2024 at 07:23:11AM +, Zhijie Hou (Fujitsu) wrote:
> On Friday, March 29, 2024 2:48 PM Bertrand Drouvot
> wrote:
> >
> > Hi,
> >
> > On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote:
> > > Attach a new version patch which fixed an un-initialized varia
On Friday, March 29, 2024 2:48 PM Bertrand Drouvot
wrote:
>
> Hi,
>
> On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote:
> > Attach a new version patch which fixed an un-initialized variable
> > issue and added some comments. Also, temporarily enable DEBUG2 for the
> > 040 ta
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu)
wrote:
>
>
> Attach a new version patch which fixed an un-initialized variable issue and
> added some comments.
>
The other approach to fix this issue could be that the slotsync worker
get the serialized snapshot using pg_read_binary_file() cor
Hi,
On Fri, Mar 29, 2024 at 01:06:15AM +, Zhijie Hou (Fujitsu) wrote:
> Attach a new version patch which fixed an un-initialized variable issue and
> added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so
> that we can analyze the possible CFbot failures easily.
>
Th
Dear Hou,
Thanks for updating the patch! Here is a comment for it.
```
+/*
+ * By advancing the restart_lsn, confirmed_lsn, and xmin using
+ * fast-forward logical decoding, we can verify whether a consistent
+ * snapshot can be built. This process also involves sa
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu)
wrote:
>
> Attach a new version patch which fixed an un-initialized variable issue and
> added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so
> that we can analyze the possible CFbot failures easily.
As suggested by A
On Thursday, March 28, 2024 10:02 PM Zhijie Hou (Fujitsu)
wrote:
>
> On Thursday, March 28, 2024 7:32 PM Amit Kapila
> wrote:
> >
> > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
> >
> > wrote:
> > >
> > > When analyzing one BF error[1], we find an issue of slotsync: Since
> > > we do
On Thursday, March 28, 2024 7:32 PM Amit Kapila wrote:
>
> On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
> wrote:
> >
> > When analyzing one BF error[1], we find an issue of slotsync: Since we
> > don't perform logical decoding for the synced slots when syncing the
> > lsn/xmin of slot,
Hi,
On Thu, Mar 28, 2024 at 05:05:35PM +0530, Amit Kapila wrote:
> On Thu, Mar 28, 2024 at 3:34 PM Bertrand Drouvot
> wrote:
> >
> > On Thu, Mar 28, 2024 at 04:38:19AM +, Zhijie Hou (Fujitsu) wrote:
> >
> > > To fix this, we could use the fast forward logical decoding to advance
> > > the sy
On Thu, Mar 28, 2024 at 3:34 PM Bertrand Drouvot
wrote:
>
> On Thu, Mar 28, 2024 at 04:38:19AM +, Zhijie Hou (Fujitsu) wrote:
>
> > To fix this, we could use the fast forward logical decoding to advance the
> > synced slot's lsn/xmin when syncing these values instead of directly updating
On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu)
wrote:
>
> When analyzing one BF error[1], we find an issue of slotsync: Since we don't
> perform logical decoding for the synced slots when syncing the lsn/xmin of
> slot, no logical snapshots will be serialized to disk. So, when user starts t
Hi,
On Thu, Mar 28, 2024 at 04:38:19AM +, Zhijie Hou (Fujitsu) wrote:
> Hi,
>
> When analyzing one BF error[1], we find an issue of slotsync: Since we don't
> perform logical decoding for the synced slots when syncing the lsn/xmin of
> slot, no logical snapshots will be serialized to disk. So
Hi,
When analyzing one BF error[1], we find an issue of slotsync: Since we don't
perform logical decoding for the synced slots when syncing the lsn/xmin of
slot, no logical snapshots will be serialized to disk. So, when user starts to
use these synced slots after promotion, it needs to re-build th
Hi,
On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote:
> Hi,
>
> Since the standby_slot_names patch has been committed, I am attaching the last
> doc patch for review.
>
Thanks!
1 ===
+ continue subscribing to publications now on the new primary server without
+ any dat
Hi,
Since the standby_slot_names patch has been committed, I am attaching the last
doc patch for review.
Best Regards,
Hou zj
v109-0001-Document-the-steps-to-check-if-the-standby-is-r.patch
Description: v109-0001-Document-the-steps-to-check-if-the-standby-is-r.patch
On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian wrote:
>
>> Pushed with minor modifications. I'll keep an eye on BF.
>>
>> BTW, one thing that we should try to evaluate a bit more is the
>> traversal of slots in StandbySlotsHaveCaughtup() where we verify if
>> all the slots mentioned in standby_slot_n
On Fri, Mar 8, 2024 at 2:33 PM Amit Kapila wrote:
> On Thu, Mar 7, 2024 at 12:00 PM Zhijie Hou (Fujitsu)
> wrote:
> >
> >
> > Attach the V108 patch set which addressed above and Peter's comments.
> > I also removed the check for "*" in guc check hook.
> >
>
>
> Pushed with minor modifications. I
On Thu, Mar 7, 2024 at 12:00 PM Zhijie Hou (Fujitsu)
wrote:
>
>
> Attach the V108 patch set which addressed above and Peter's comments.
> I also removed the check for "*" in guc check hook.
>
Pushed with minor modifications. I'll keep an eye on BF.
BTW, one thing that we should try to evaluate
On Thursday, March 7, 2024 12:46 PM Amit Kapila wrote:
>
> On Thu, Mar 7, 2024 at 7:35 AM Peter Smith
> wrote:
> >
> > Here are some review comments for v107-0001
> >
> > ==
> > src/backend/replication/slot.c
> >
> > 1.
> > +/*
> > + * Struct for the configuration of standby_slot_names.
> >
On Thursday, March 7, 2024 10:05 AM Peter Smith wrote:
>
> Here are some review comments for v107-0001
Thanks for the comments.
>
> ==
> src/backend/replication/slot.c
>
> 1.
> +/*
> + * Struct for the configuration of standby_slot_names.
> + *
> + * Note: this must be a flat representati
On Thu, Mar 7, 2024 at 8:37 AM shveta malik wrote:
>
I thought about whether we can make standby_slot_names as USERSET
instead of SIGHUP and it doesn't sound like a good idea as that can
lead to inconsistent standby replicas even after configuring the
correct value of standby_slot_names. One can