RE: Synchronizing slots from primary to standby

2024-02-06 Thread Zhijie Hou (Fujitsu)
On Tuesday, February 6, 2024 3:39 PM Masahiko Sawada  
wrote:
> 
> On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  wrote:
> >
> > On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada
>  wrote:
> > >
> > > ---
> > > Since Two processes (e.g. the slotsync worker and
> > > pg_sync_replication_slots()) concurrently fetch and update the slot
> > > information, there is a race condition where slot's
> > > confirmed_flush_lsn goes backward.
> > >
> >
> > Right, this is possible, though there shouldn't be a problem because
> > anyway, slotsync is an async process. Till we hold restart_lsn, the
> > required WAL won't be removed. Having said that, I can think of two
> > ways to avoid it: (a) We can have some flag in shared memory using
> > which we can detect whether any other process is doing slot
> > syncronization and then either error out at that time or simply wait
> > or may take nowait kind of parameter from user to decide what to do?
> > If this is feasible, we can simply error out for the first version and
> > extend it later if we see any use cases for the same (b) similar to
> > restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> > error, this is good for now but in future we may still have another
> > similar issue, so I would prefer (a) among these but I am fine if you
> > prefer (b) or have some other ideas like just note down in comments
> > that this is a harmless case and can happen only very rarely.
> 
> Thank you for sharing the ideas. I would prefer (a). For (b), the same issue 
> still
> happens for other fields.

Attached is the V79 patch, which includes the following changes. (Note that only
0001 is sent in this version; we will send the later patches after rebasing.)

1. Addressed all the comments from Amit [1], all the comments from Peter [2], and
   some of the comments from Sawada-san [3].
2. Used a flag in shared memory to restrict concurrent slot synchronization.
3. Added more TAP tests for the pg_sync_replication_slots() function (a minimal
   usage sketch follows this list).
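For reference, a minimal SQL-level sketch of how the new function is expected to be
exercised on a standby. This is not taken from the TAP tests; the 'failover' and
'synced' columns of pg_replication_slots are assumptions based on the patch's commit
message quoted later in this thread:

-- On the standby, after configuring primary_conninfo (including dbname),
-- primary_slot_name and hot_standby_feedback:
SELECT pg_sync_replication_slots();

-- Then inspect the synchronized slots:
SELECT slot_name, failover, synced FROM pg_replication_slots;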

[1] 
https://www.postgresql.org/message-id/CAA4eK1KGHT9S-Bst_G1CUNQvRep%3DipMs5aTBNRQFVi6TogbJ9w%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAHut%2BPtyoRf3adoLoTrbL6momzkhXAFKz656Vv9YRu4cp%3D6Yig%40mail.gmail.com
[3] 
https://www.postgresql.org/message-id/CAD21AoCEkcTaPb%2BGdOhSQE49_mKJG6D64quHcioJGx6RCqMv%2BQ%40mail.gmail.com

Best Regards,
Hou zj


v79-0001-Add-a-slot-synchronization-function.patch


Re: Synchronizing slots from primary to standby

2024-02-06 Thread Amit Kapila
On Tue, Feb 6, 2024 at 12:08 PM Peter Smith  wrote:
>
> Hi, I took another high-level look at all the funtion names of the
> slotsync.c file.
>
>
> Below are some suggestions (some are unchanged); probably there are
> better ideas for names but my point is that the current names could be
> improved:
>
> CURRENT SUGGESTION
...
> check_sync_slot_on_remote check_local_synced_slot_exists_on_remote
>

I think none of these names states the purpose of the function. I
suggest changing it to local_sync_slot_required() and returning false
if the local_slot either doesn't exist in remote_slot_list or is
invalidated.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Amit Kapila
On Tue, Feb 6, 2024 at 3:57 PM Dilip Kumar  wrote:
>
> On Tue, Feb 6, 2024 at 3:41 PM Amit Kapila  wrote:
> >
> > On Tue, Feb 6, 2024 at 3:23 PM Dilip Kumar  wrote:
> > >
> > > On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada 
> > > > >  wrote:
> > > > > >
> > > > > > ---
> > > > > > Since Two processes (e.g. the slotsync worker and
> > > > > > pg_sync_replication_slots()) concurrently fetch and update the slot
> > > > > > information, there is a race condition where slot's
> > > > > > confirmed_flush_lsn goes backward.
> > > > > >
> > > > >
> > > > > Right, this is possible, though there shouldn't be a problem because
> > > > > anyway, slotsync is an async process. Till we hold restart_lsn, the
> > > > > required WAL won't be removed. Having said that, I can think of two
> > > > > ways to avoid it: (a) We can have some flag in shared memory using
> > > > > which we can detect whether any other process is doing slot
> > > > > syncronization and then either error out at that time or simply wait
> > > > > or may take nowait kind of parameter from user to decide what to do?
> > > > > If this is feasible, we can simply error out for the first version and
> > > > > extend it later if we see any use cases for the same (b) similar to
> > > > > restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> > > > > error, this is good for now but in future we may still have another
> > > > > similar issue, so I would prefer (a) among these but I am fine if you
> > > > > prefer (b) or have some other ideas like just note down in comments
> > > > > that this is a harmless case and can happen only very rarely.
> > > >
> > > > Thank you for sharing the ideas. I would prefer (a). For (b), the same
> > > > issue still happens for other fields.
> > >
> > > I agree that (a) looks better.  On a separate note, while looking at
> > > this API pg_sync_replication_slots(PG_FUNCTION_ARGS) shouldn't there
> > > be an optional parameter to give one slot or multiple slots or all
> > > slots as default, that will give better control to the user no?
> > >
> >
> > As of now, we want to give functionality similar to slotsync worker
> > with a difference that users can use this new function for planned
> > switchovers. So, syncing all failover slots by default. I think if
> > there is a use case to selectively sync some of the failover slots
> > then we can probably extend this function and slotsync worker as well.
> > Normally, if the primary goes down due to whatever reason users would
> > want to restart the replication for all the defined publications via
> > existing failover slots. Why would anyone want to do it partially?
>
> If we consider the usability of such a function (I mean as it is
> implemented now, without any argument) one use case could be that if
> the slot sync worker is not keeping up or at some point in time the
> user doesn't want to wait for the worker to do this instead user can
> do it by himself.
>

Possibly, but I was imagining that it would be used for planned
switchover cases and also for testing the core sync slot functionality
in our TAP tests.

> So now if we have such a functionality then it would be even better to
> extend it to selectively sync the slot.  For example, if there is some
> issue in syncing all slots, maybe some bug or taking a long time to
> sync because there are a lot of slots but if the user needs to quickly
> failover and he/she is interested in only a couple of slots then such
> a option could be helpful. no?
>

I see your point, but I am not sure how useful it is in the field. I am
fine with it if others also think such a parameter would be useful, and
in any case we can extend it after v1 is done.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Amit Kapila
On Tue, Feb 6, 2024 at 3:33 PM Amit Kapila  wrote:
>
> On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada  wrote:
> >
> > On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  wrote:
> > >
> > > I think users can refer to LOGs to see if it has changed since the
> > > first time it was configured. I tried by existing parameter and see
> > > the following in LOG:
> > > LOG:  received SIGHUP, reloading configuration files
> > > 2024-02-06 11:38:59.069 IST [9240] LOG:  parameter "autovacuum" changed 
> > > to "on"
> > >
> > > If the user can't confirm then it is better to follow the steps
> > > mentioned in the patch. Do you want something else to be written in
> > > docs for this? If so, what?
> >
> > IIUC even if a wrong slot name is specified to standby_slot_names or
> > even standby_slot_names is empty, the standby server might not be
> > lagging behind the subscribers depending on the timing. But when
> > checking it the next time, the standby server might lag behind the
> > subscribers. So what I wanted to know is how the user can confirm if a
> > failover-enabled subscription is ensured not to go in front of
> > failover-candidate standbys (i.e., standbys using the slots listed in
> > standby_slot_names).
> >
>
> But isn't the same explained by two steps ((a) Firstly, on the
> subscriber node check the last replayed WAL. (b) Next, on the standby
> server check that the last-received WAL location is ahead of the
> replayed WAL location on the subscriber identified above.) in the
> latest *_0004 patch.
>

Additionally, I would like to add that users can run the queries
mentioned in the doc after the primary has failed and before promoting
the standby. If they want to do that while both primary and standby are
available, the value of 'standby_slot_names' on the primary should be
consulted. Aren't those two sufficient to ensure there won't be false
positives?
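For example, the check could be a rough equivalent of the following sketch, which
uses generic catalog views and functions; the exact queries recommended in the
*_0004 patch's documentation may differ:

-- (a) On the subscriber: the remote LSN up to which changes have been applied.
SELECT external_id, remote_lsn FROM pg_replication_origin_status;

-- (b) On the standby: the last WAL location received from the primary.
SELECT pg_last_wal_receive_lsn();

-- The standby is a safe failover candidate if its last-received LSN is at or
-- ahead of the subscriber's applied remote_lsn.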

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Amit Kapila
On Tue, Feb 6, 2024 at 9:35 AM Peter Smith  wrote:
>
> ==
> GENERAL
>
> 1.
> Should the "Chapter 30 Logical Replication" at least have another
> section that mentions the feature of slot synchronization so the
> information about it is easier to find? It doesn't need to say much --
> just give a reference to the other sections where it is explained
> already.
>

We can think of adding something like a Failover/Switchover section, but
we can do that at the end, once we have the worker patch and the other
work, rather than with the first patch.

>
> 6.
> +  
> +role="func_table_entry">
> +
> + pg_sync_replication_slots
>
> Currently, this is in section "9.27.6 Replication Management
> Functions", but I wondered if it should also have some mention in the
> "9.27.4. Recovery Control Functions" section.
>

I feel this is more suited to "Replication Management Functions"
because the other section talks about functions used during recovery
whereas we won't do anything for slotsync during recovery.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Dilip Kumar
On Tue, Feb 6, 2024 at 3:41 PM Amit Kapila  wrote:
>
> On Tue, Feb 6, 2024 at 3:23 PM Dilip Kumar  wrote:
> >
> > On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada  
> > wrote:
> > >
> > > On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada  
> > > > wrote:
> > > > >
> > > > > ---
> > > > > Since Two processes (e.g. the slotsync worker and
> > > > > pg_sync_replication_slots()) concurrently fetch and update the slot
> > > > > information, there is a race condition where slot's
> > > > > confirmed_flush_lsn goes backward.
> > > > >
> > > >
> > > > Right, this is possible, though there shouldn't be a problem because
> > > > anyway, slotsync is an async process. Till we hold restart_lsn, the
> > > > required WAL won't be removed. Having said that, I can think of two
> > > > ways to avoid it: (a) We can have some flag in shared memory using
> > > > which we can detect whether any other process is doing slot
> > > > syncronization and then either error out at that time or simply wait
> > > > or may take nowait kind of parameter from user to decide what to do?
> > > > If this is feasible, we can simply error out for the first version and
> > > > extend it later if we see any use cases for the same (b) similar to
> > > > restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> > > > error, this is good for now but in future we may still have another
> > > > similar issue, so I would prefer (a) among these but I am fine if you
> > > > prefer (b) or have some other ideas like just note down in comments
> > > > that this is a harmless case and can happen only very rarely.
> > >
> > > Thank you for sharing the ideas. I would prefer (a). For (b), the same
> > > issue still happens for other fields.
> >
> > I agree that (a) looks better.  On a separate note, while looking at
> > this API pg_sync_replication_slots(PG_FUNCTION_ARGS) shouldn't there
> > be an optional parameter to give one slot or multiple slots or all
> > slots as default, that will give better control to the user no?
> >
>
> As of now, we want to give functionality similar to slotsync worker
> with a difference that users can use this new function for planned
> switchovers. So, syncing all failover slots by default. I think if
> there is a use case to selectively sync some of the failover slots
> then we can probably extend this function and slotsync worker as well.
> Normally, if the primary goes down due to whatever reason users would
> want to restart the replication for all the defined publications via
> existing failover slots. Why would anyone want to do it partially?

If we consider the usability of such a function (I mean as it is
implemented now, without any argument), one use case could be that the
slot sync worker is not keeping up, or that at some point the user
doesn't want to wait for the worker to do this and instead wants to do
it themselves.

So if we have such functionality, it would be even better to extend it
to selectively sync slots. For example, if there is some issue in
syncing all slots (maybe some bug, or it takes a long time because there
are a lot of slots) but the user needs to fail over quickly and is
interested in only a couple of slots, then such an option could be
helpful, no?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Amit Kapila
On Tue, Feb 6, 2024 at 3:23 PM Dilip Kumar  wrote:
>
> On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada  wrote:
> >
> > On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  wrote:
> > >
> > > On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > ---
> > > > Since Two processes (e.g. the slotsync worker and
> > > > pg_sync_replication_slots()) concurrently fetch and update the slot
> > > > information, there is a race condition where slot's
> > > > confirmed_flush_lsn goes backward.
> > > >
> > >
> > > Right, this is possible, though there shouldn't be a problem because
> > > anyway, slotsync is an async process. Till we hold restart_lsn, the
> > > required WAL won't be removed. Having said that, I can think of two
> > > ways to avoid it: (a) We can have some flag in shared memory using
> > > which we can detect whether any other process is doing slot
> > > syncronization and then either error out at that time or simply wait
> > > or may take nowait kind of parameter from user to decide what to do?
> > > If this is feasible, we can simply error out for the first version and
> > > extend it later if we see any use cases for the same (b) similar to
> > > restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> > > error, this is good for now but in future we may still have another
> > > similar issue, so I would prefer (a) among these but I am fine if you
> > > prefer (b) or have some other ideas like just note down in comments
> > > that this is a harmless case and can happen only very rarely.
> >
> > Thank you for sharing the ideas. I would prefer (a). For (b), the same
> > issue still happens for other fields.
>
> I agree that (a) looks better.  On a separate note, while looking at
> this API pg_sync_replication_slots(PG_FUNCTION_ARGS) shouldn't there
> be an optional parameter to give one slot or multiple slots or all
> slots as default, that will give better control to the user no?
>

As of now, we want to provide functionality similar to the slotsync
worker, with the difference that users can use this new function for
planned switchovers; so it syncs all failover slots by default. If there
is a use case to selectively sync some of the failover slots, then we
can probably extend this function, and the slotsync worker as well.
Normally, if the primary goes down for whatever reason, users would want
to restart replication for all the defined publications via the existing
failover slots. Why would anyone want to do it partially?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Amit Kapila
On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada  wrote:
>
> On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  wrote:
> >
> > I think users can refer to LOGs to see if it has changed since the
> > first time it was configured. I tried by existing parameter and see
> > the following in LOG:
> > LOG:  received SIGHUP, reloading configuration files
> > 2024-02-06 11:38:59.069 IST [9240] LOG:  parameter "autovacuum" changed to 
> > "on"
> >
> > If the user can't confirm then it is better to follow the steps
> > mentioned in the patch. Do you want something else to be written in
> > docs for this? If so, what?
>
> IIUC even if a wrong slot name is specified to standby_slot_names or
> even standby_slot_names is empty, the standby server might not be
> lagging behind the subscribers depending on the timing. But when
> checking it the next time, the standby server might lag behind the
> subscribers. So what I wanted to know is how the user can confirm if a
> failover-enabled subscription is ensured not to go in front of
> failover-candidate standbys (i.e., standbys using the slots listed in
> standby_slot_names).
>

But isn't the same explained by the two steps ((a) first, on the
subscriber node, check the last replayed WAL; (b) next, on the standby
server, check that the last-received WAL location is ahead of the
replayed WAL location on the subscriber identified above) in the
latest *_0004 patch?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-06 Thread Dilip Kumar
On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada  wrote:
>
> On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  wrote:
> >
> > On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada  
> > wrote:
> > >
> > > ---
> > > Since Two processes (e.g. the slotsync worker and
> > > pg_sync_replication_slots()) concurrently fetch and update the slot
> > > information, there is a race condition where slot's
> > > confirmed_flush_lsn goes backward.
> > >
> >
> > Right, this is possible, though there shouldn't be a problem because
> > anyway, slotsync is an async process. Till we hold restart_lsn, the
> > required WAL won't be removed. Having said that, I can think of two
> > ways to avoid it: (a) We can have some flag in shared memory using
> > which we can detect whether any other process is doing slot
> > syncronization and then either error out at that time or simply wait
> > or may take nowait kind of parameter from user to decide what to do?
> > If this is feasible, we can simply error out for the first version and
> > extend it later if we see any use cases for the same (b) similar to
> > restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> > error, this is good for now but in future we may still have another
> > similar issue, so I would prefer (a) among these but I am fine if you
> > prefer (b) or have some other ideas like just note down in comments
> > that this is a harmless case and can happen only very rarely.
>
> Thank you for sharing the ideas. I would prefer (a). For (b), the same
> issue still happens for other fields.

I agree that (a) looks better.  On a separate note, while looking at
this API, pg_sync_replication_slots(PG_FUNCTION_ARGS): shouldn't there
be an optional parameter to specify one slot, multiple slots, or all
slots as the default? That would give better control to the user, no?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Masahiko Sawada
On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila  wrote:
>
> On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada  wrote:
> >
> > ---
> > Since Two processes (e.g. the slotsync worker and
> > pg_sync_replication_slots()) concurrently fetch and update the slot
> > information, there is a race condition where slot's
> > confirmed_flush_lsn goes backward.
> >
>
> Right, this is possible, though there shouldn't be a problem because
> anyway, slotsync is an async process. Till we hold restart_lsn, the
> required WAL won't be removed. Having said that, I can think of two
> ways to avoid it: (a) We can have some flag in shared memory using
> which we can detect whether any other process is doing slot
> syncronization and then either error out at that time or simply wait
> or may take nowait kind of parameter from user to decide what to do?
> If this is feasible, we can simply error out for the first version and
> extend it later if we see any use cases for the same (b) similar to
> restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> error, this is good for now but in future we may still have another
> similar issue, so I would prefer (a) among these but I am fine if you
> prefer (b) or have some other ideas like just note down in comments
> that this is a harmless case and can happen only very rarely.

Thank you for sharing the ideas. I would prefer (a). For (b), the same
issue still happens for other fields.

>
> >
> > ---
> > + It is recommended that subscriptions are first disabled before 
> > promoting
> > f+ the standby and are enabled back after altering the connection 
> > string.
> >
> > I think it's better to describe the reason why it's recommended to
> > disable subscriptions before the standby promotion.
> >
>
> Agreed. The reason I see for this is that if we don't disable the
> subscription before promotion and changing the connection string there
> is a chance that the old primary comes back and the subscriber can
> have some additional data, though the chances of same are less.
>
> > ---
> > +/* Slot sync worker objects */
> > +extern PGDLLIMPORT char *PrimaryConnInfo;
> > +extern PGDLLIMPORT char *PrimarySlotName;
> >
> > These two variables are declared also in xlogrecovery.h. Is it
> > intentional? If so, I think it's better to write comments.
> >
> > ---
> > Global functions and variables used by the slotsync worker are
> > declared in logicalworker.h and worker_internal.h. But is it really
> > okay to make a dependency between the slotsync worker and logical
> > replication workers? IIUC the slotsync worker is conceptually a
> > separate feature from the logical replication. I think the slotsync
> > worker can have its own header file.
> >
>
> +1.
>
> >
> > ---
> > + Confirm that the standby server is not lagging behind the subscribers.
> > + This step can be skipped if
> > +  > linkend="guc-standby-slot-names">standby_slot_names
> > + has been correctly configured.
> >
> > How can the user confirm if standby_slot_names is correctly configured?
> >
>
> I think users can refer to LOGs to see if it has changed since the
> first time it was configured. I tried by existing parameter and see
> the following in LOG:
> LOG:  received SIGHUP, reloading configuration files
> 2024-02-06 11:38:59.069 IST [9240] LOG:  parameter "autovacuum" changed to 
> "on"
>
> If the user can't confirm then it is better to follow the steps
> mentioned in the patch. Do you want something else to be written in
> docs for this? If so, what?

IIUC even if a wrong slot name is specified in standby_slot_names, or
even if standby_slot_names is empty, the standby server might not be
lagging behind the subscribers, depending on the timing. But when
checking it the next time, the standby server might lag behind the
subscribers. So what I wanted to know is how the user can confirm that a
failover-enabled subscription is guaranteed not to get ahead of the
failover-candidate standbys (i.e., standbys using the slots listed in
standby_slot_names).

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Bertrand Drouvot
Hi,

On Tue, Feb 06, 2024 at 03:19:11AM +, Zhijie Hou (Fujitsu) wrote:
> On Friday, February 2, 2024 2:03 PM Bertrand Drouvot 
>  wrote:
> > 
> > Hi,
> > 
> > On Thu, Feb 01, 2024 at 05:29:15PM +0530, shveta malik wrote:
> > > Attached v75 patch-set. Changes are:
> > >
> > > 1) Re-arranged the patches:
> > > 1.1) 'libpqrc' related changes (from v74-001 and v74-004) are
> > > separated out in v75-001 as those are independent changes.
> > > 1.2) 'Add logical slot sync capability', 'Slot sync worker as special
> > > process' and 'App-name changes' are now merged to single patch which
> > > makes v75-002.
> > > 1.3) 'Wait for physical Standby confirmation' and 'Failover Validation
> > > Document' patches are maintained as is (v75-003 and v75-004 now).
> > 
> > Thanks!
> > 
> > I only looked at the commit message for v75-0002 and see that it has changed
> > since the comment done in [1], but it still does not look correct to me.
> > 
> > "
> > If a logical slot on the primary is valid but is invalidated on the 
> > standby, then
> > that slot is dropped and recreated on the standby in next sync-cycle 
> > provided
> > the slot still exists on the primary server. It is okay to recreate such 
> > slots as long
> > as these are not consumable on the standby (which is the case currently). 
> > This
> > situation may occur due to the following reasons:
> > - The max_slot_wal_keep_size on the standby is insufficient to retain WAL
> >   records from the restart_lsn of the slot.
> > - primary_slot_name is temporarily reset to null and the physical slot is
> >   removed.
> > - The primary changes wal_level to a level lower than logical.
> > "
> > 
> > If a logical decoding slot "still exists on the primary server" then the 
> > primary
> > can not change the wal_level to lower than logical, one would get something
> > like:
> > 
> > "FATAL:  logical replication slot "logical_slot" exists, but wal_level < 
> > logical"
> > 
> > and then slots won't get invalidated on the standby. I've the feeling that 
> > the
> > wal_level conflict part may need to be explained separately? (I think it's 
> > not
> > possible that they end up being re-created on the standby for this conflict,
> > they will be simply removed as it would mean the counterpart one on the
> > primary does not exist anymore).
> 
> This is possible in some extreme cases, because the slot is synced
> asynchronously.
> 
> For example: If on the primary the wal_level is changed to 'replica'

It means that all the logical slots have been dropped on the primary (if not,
it's not possible to change it to a level < logical).

> and then
> changed back to 'logical', the standby would receive two XLOG_PARAMETER_CHANGE
> wals. And before the standby replay these wals, user can create a failover 
> slot

And now it is re-created.

So the slot has been dropped and recreated on the primary, so it's kind
of expected that it is also dropped and re-created on the standby
(whether it is invalidated or not).

> Although I think it doesn't seem a real world case, so I am not sure is it 
> worth
> separate explanation.

Yeah, I don't think your example is worth a separate explanation, also
because it's expected to see the slot being dropped / re-created anyway
(see above).

That said, I still think the commit message needs some re-wording; what
about the following?

=
If a logical slot on the primary is valid but is invalidated on the standby,
then that slot is dropped and can be recreated on the standby in next
pg_sync_replication_slots() call provided the slot still exists on the primary
server. It is okay to recreate such slots as long as these are not consumable
on the standby (which is the case currently). This situation may occur due to
the following reasons:

- The max_slot_wal_keep_size on the standby is insufficient to retain WAL
  records from the restart_lsn of the slot.
- primary_slot_name is temporarily reset to null and the physical slot is
  removed.

Changing the primary wal_level to a level lower than logical is only possible
if the logical slots are removed on the primary, so it's expected to see
the slots being removed on the standby too (and re-created if they are
re-created on the primary).
=

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Peter Smith
Hi, I took another high-level look at all the function names in the
slotsync.c file.

==
src/backend/replication/logical/slotsync.c

+static bool
+local_slot_update(RemoteSlot * remote_slot, Oid remote_dbid)

+static List *
+get_local_synced_slots(void)

+static bool
+check_sync_slot_on_remote(ReplicationSlot *local_slot, List *remote_slots,

+static void
+drop_obsolete_slots(List *remote_slot_list)

+static void
+reserve_wal_for_slot(XLogRecPtr restart_lsn)

+static bool
+update_and_persist_slot(RemoteSlot * remote_slot, Oid remote_dbid)

+static bool
+synchronize_one_slot(RemoteSlot * remote_slot, Oid remote_dbid)
+get_slot_invalidation_cause(char *conflict_reason)

+static bool
+synchronize_slots(WalReceiverConn *wrconn)

+static void
+validate_primary_slot(WalReceiverConn *wrconn, int slot_invalid_elevel)

+static bool
+validate_slotsync_params(int elevel)

+bool
+IsSyncingReplicationSlots(void)

+Datum
+pg_sync_replication_slots(PG_FUNCTION_ARGS)

~~~

There seems to be some muddling of names here:
- "local" versus ? and "remote" versus "primary"; or sometimes the
function does not give an indication.
- "sync_slot" versus "synced_slot" versus nothing
- "check" versus "validate"
- etc.

Below are some suggestions (some are unchanged); probably there are
better ideas for names but my point is that the current names could be
improved:

CURRENT                       SUGGESTION
get_local_synced_slots        get_local_synced_slots
check_sync_slot_on_remote     check_local_synced_slot_exists_on_remote
drop_obsolete_slots           drop_local_synced_slots
reserve_wal_for_slot          reserve_wal_for_local_slot
local_slot_update             update_local_synced_slot
update_and_persist_slot       update_and_persist_local_synced_slot
get_slot_invalidation_cause   get_slot_conflict_reason
synchronize_slots             synchronize_remote_slots_to_local
synchronize_one_slot          synchronize_remote_slot_to_local
validate_primary_slot         check_remote_synced_slot_exists
validate_slotsync_params      check_local_config
IsSyncingReplicationSlots     IsSyncingReplicationSlots
pg_sync_replication_slots     pg_sync_replication_slots

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Amit Kapila
On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada  wrote:
>
> ---
> Since Two processes (e.g. the slotsync worker and
> pg_sync_replication_slots()) concurrently fetch and update the slot
> information, there is a race condition where slot's
> confirmed_flush_lsn goes backward.
>

Right, this is possible, though there shouldn't be a problem because
slotsync is an async process anyway. As long as we hold restart_lsn, the
required WAL won't be removed. Having said that, I can think of two ways
to avoid it: (a) we can have some flag in shared memory with which we
can detect whether any other process is doing slot synchronization, and
then either error out at that time, or simply wait, or perhaps take a
nowait kind of parameter from the user to decide what to do; if this is
feasible, we can simply error out for the first version and extend it
later if we see any use cases for it. (b) similar to restart_lsn, raise
an error if confirmed_flush_lsn is getting moved back; this is good for
now, but in the future we may still have another similar issue. So I
would prefer (a) among these, but I am fine if you prefer (b) or have
some other ideas, like just noting down in comments that this is a
harmless case and can happen only very rarely.

>
> ---
> + It is recommended that subscriptions are first disabled before promoting
> + the standby and are enabled back after altering the connection string.
>
> I think it's better to describe the reason why it's recommended to
> disable subscriptions before the standby promotion.
>

Agreed. The reason I see for this is that if we don't disable the
subscription before promoting the standby and changing the connection
string, there is a chance that the old primary comes back and the
subscriber ends up with some additional data, though the chances of that
are low.
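In SQL terms, the recommended order is roughly the following sketch (the
subscription name and the connection string are placeholders):

-- On the subscriber, before promoting the standby:
ALTER SUBSCRIPTION sub1 DISABLE;

-- Promote the standby, e.g. by running on it:
SELECT pg_promote();

-- On the subscriber, point the subscription at the new primary and resume:
ALTER SUBSCRIPTION sub1 CONNECTION 'host=new_primary dbname=postgres';
ALTER SUBSCRIPTION sub1 ENABLE;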

> ---
> +/* Slot sync worker objects */
> +extern PGDLLIMPORT char *PrimaryConnInfo;
> +extern PGDLLIMPORT char *PrimarySlotName;
>
> These two variables are declared also in xlogrecovery.h. Is it
> intentional? If so, I think it's better to write comments.
>
> ---
> Global functions and variables used by the slotsync worker are
> declared in logicalworker.h and worker_internal.h. But is it really
> okay to make a dependency between the slotsync worker and logical
> replication workers? IIUC the slotsync worker is conceptually a
> separate feature from the logical replication. I think the slotsync
> worker can have its own header file.
>

+1.

>
> ---
> + Confirm that the standby server is not lagging behind the subscribers.
> + This step can be skipped if
> +  linkend="guc-standby-slot-names">standby_slot_names
> + has been correctly configured.
>
> How can the user confirm if standby_slot_names is correctly configured?
>

I think users can refer to the LOGs to see if it has changed since the
first time it was configured. I tried this with an existing parameter
and saw the following in the LOG:
LOG:  received SIGHUP, reloading configuration files
2024-02-06 11:38:59.069 IST [9240] LOG:  parameter "autovacuum" changed to "on"

If the user can't confirm, then it is better to follow the steps
mentioned in the patch. Do you want something else to be written in the
docs for this? If so, what?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Peter Smith
Hi,

Previously ([1] #19 and #22) I had suggested that some of the
conflict_reason code could be split off and pushed first as a
prerequisite/preparatory/independent patch.

At the time, my suggestion was premature because there was still a lot
of code under development.

But now the affected code is in the first patch (0001), and there are
already other precedents of slot-sync preparatory patches getting
pushed.

So I thought to resurrect this splitting suggestion, as perhaps now is
the right time to do it.

Details are below:

==

IMO the new #defines for the conflict_reason stuff plus where they get
used can be pushed as an independent patch.

Specifically, this stuff:

~~~

From src/include/replication/slot.h:

+/*
+ * The possible values for 'conflict_reason' returned in
+ * pg_get_replication_slots.
+ */
+#define SLOT_INVAL_WAL_REMOVED_TEXT "wal_removed"
+#define SLOT_INVAL_HORIZON_TEXT "rows_removed"
+#define SLOT_INVAL_WAL_LEVEL_TEXT   "wal_level_insufficient"
+

~~~

From src/backend/replication/logical/slotsync.c:

Also, IMO this function should live in slot.c; although slotsync.c
might be the only caller, this is not really a slot-sync-specific
function.

+/*
+ * Maps the pg_replication_slots.conflict_reason text value to
+ * ReplicationSlotInvalidationCause enum value
+ */
+static ReplicationSlotInvalidationCause
+get_slot_invalidation_cause(char *conflict_reason)
+{
+ Assert(conflict_reason);
+
+ if (strcmp(conflict_reason, SLOT_INVAL_WAL_REMOVED_TEXT) == 0)
+ return RS_INVAL_WAL_REMOVED;
+ else if (strcmp(conflict_reason, SLOT_INVAL_HORIZON_TEXT) == 0)
+ return RS_INVAL_HORIZON;
+ else if (strcmp(conflict_reason, SLOT_INVAL_WAL_LEVEL_TEXT) == 0)
+ return RS_INVAL_WAL_LEVEL;
+ else
+ Assert(0);
+
+ /* Keep compiler quiet */
+ return RS_INVAL_NONE;
+}

~~~

From src/backend/replication/slotfuncs.c:


  case RS_INVAL_WAL_REMOVED:
- values[i++] = CStringGetTextDatum("wal_removed");
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_REMOVED_TEXT);
  break;

  case RS_INVAL_HORIZON:
- values[i++] = CStringGetTextDatum("rows_removed");
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_HORIZON_TEXT);
  break;

  case RS_INVAL_WAL_LEVEL:
- values[i++] = CStringGetTextDatum("wal_level_insufficient");
+ values[i++] = CStringGetTextDatum(SLOT_INVAL_WAL_LEVEL_TEXT);
  break;

~~~

Thoughts?

==
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtJAAPghc4GPt0k%3DjeMz1qu4H7mnaDifOHsVsMqi-qOLA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Peter Smith
Here are some review comments for v78-0001

==
GENERAL

1.
Should the "Chapter 30 Logical Replication" at least have another
section that mentions the feature of slot synchronization so the
information about it is easier to find? It doesn't need to say much --
just give a reference to the other sections where it is explained
already.

==
Commit Message

2.
A new 'synced' flag is introduced for replication slots, indicating whether the
slot has been synchronized from the primary server. On a standby, synced slots
cannot be dropped or consumed, and any attempt to perform logical decoding on
them will result in an error.

~

It doesn't say *where* this new 'synced' flag is.
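As an aside, the behaviour described in the quoted commit-message paragraph
corresponds to SQL-level attempts like the following failing on the standby
(a sketch only; the slot name is a placeholder and the exact error text is
not shown in this thread):

-- Per the commit message, both of these are expected to raise an error for a
-- synced slot on the standby:
SELECT pg_drop_replication_slot('failover_slot');
SELECT * FROM pg_logical_slot_get_changes('failover_slot', NULL, NULL);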

~~~

3.
The logical replication slots on the primary can be synchronized to the hot
standby by enabling the failover option during slot creation and calling
pg_sync_replication_slots() function on the standby. For the synchronization to
work, it is mandatory to have a physical replication slot between the primary
and the standby, hot_standby_feedback must be enabled on the standby and a
valid dbname must be specified in primary_conninfo.

~

3a.
"by enabling the failover option during slot creation" -- Should you
elaborate more about that part by mentioning the failover parameter of
the create slot API, or the "failover" option of the CREATE
SUBSCRIPTION?

~

3b.
I find it easier to read if the GUC parameters are quoted, but YMMV.

/hot_standby_feedback/'hot_standby_feedback'/

/primary_conninfo/'primary_conninfo'/

~~~

4.
If a logical slot on the primary is valid but is invalidated on the standby,
then that slot is dropped and can be recreated on the standby in next
pg_sync_replication_slots() call provided the slot still exists on the primary
server. It is okay to recreate such slots as long as these are not consumable
on the standby (which is the case currently). This situation may occur due to
the following reasons:
- The max_slot_wal_keep_size on the standby is insufficient to retain WAL
  records from the restart_lsn of the slot.
- primary_slot_name is temporarily reset to null and the physical slot is
  removed.
- The primary changes wal_level to a level lower than logical.

~

4a.
/and can be recreated/but will be recreated/

~

4b.
(As before, I would quote the GUCs for easier readability)

/max_slot_wal_keep_size/'max_slot_wal_keep_size'/

/primary_slot_name/'primary_slot_name'/

/wal_level/'wal_level'/

==
doc/src/sgml/config.sgml

5.
+ 
+  To synchronize replication slots (see
+  ),
+  it is also necessary to specify a valid dbname
+  in the primary_conninfo string. This will only be
+  used for slot synchronization. It is ignored for streaming.
  

Somehow, I thought the below wording is slightly better (and it also
matches the linked section title). YMMV.

/To synchronize replication slots/For replication slot synchronization/

==
src/sgml/func.sgml

6.
+  
+   
+
+ pg_sync_replication_slots

Currently, this is in section "9.27.6 Replication Management
Functions", but I wondered if it should also have some mention in the
"9.27.4. Recovery Control Functions" section.

==
doc/src/sgml/logicaldecoding.sgml

7.
+Replication Slot Synchronization
+
+ A logical replication slot on the primary can be synchronized to the hot
+ standby by enabling the failover option during slot
+ creation and calling pg_sync_replication_slots
+ on the standby. For the synchronization
+ to work, it is mandatory to have a physical replication slot between the
+ primary and the standby, and
+ hot_standby_feedback
+ must be enabled on the standby. It is also necessary to specify a valid
+ dbname in the
+ primary_conninfo.
+

7a.
Should you elaborate more about the "enabling the failover option
during slot creation" part by mentioning the failover parameter of the
create slot API, or the "failover" option of the CREATE SUBSCRIPTION?

~

7b.
I think it will be better to include a link to the
pg_sync_replication_slots function.
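To make the quoted procedure concrete, a minimal hypothetical sequence could look
like the following. The slot, publication, subscription, and connection names are
placeholders, this is not taken from the patch, and only the CREATE SUBSCRIPTION
form of enabling failover is shown (not the slot-creation API mentioned in 7a):

-- On the primary: a physical slot used by the standby.
SELECT pg_create_physical_replication_slot('sb1_slot');

-- On the subscriber: a failover-enabled subscription, which creates a
-- failover-enabled logical slot on the primary.
CREATE SUBSCRIPTION sub1 CONNECTION 'host=primary dbname=postgres'
    PUBLICATION pub1 WITH (failover = true);

-- On the standby (primary_conninfo must include dbname and
-- hot_standby_feedback must be on):
SELECT pg_sync_replication_slots();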

~~~

8.
+
+ To resume logical replication after failover from the synced logical
+ slots, the subscription's 'conninfo' must be altered to point to the
+ new primary server. This is done using
+ ALTER
SUBSCRIPTION ... CONNECTION.
+ It is recommended that subscriptions are first disabled before promoting
+ the standby and are enabled back after altering the connection string.
+

/and are enabled back/and are re-enabled/

==
src/backend/replication/logical/slotsync.c

9.
+ * This file contains the code for slot synchronization on a physical standby
+ * to fetch logical failover slots information from the primary server, create
+ * the slots on the standby and synchronize them periodically.

IIUC there is no "periodically" logic in this patch 0001 anymore
because that is now in a later patch, so this part of the comment
may need adjustment.

~~~


RE: Synchronizing slots from primary to standby

2024-02-05 Thread Zhijie Hou (Fujitsu)
On Friday, February 2, 2024 2:03 PM Bertrand Drouvot 
 wrote:
> 
> Hi,
> 
> On Thu, Feb 01, 2024 at 05:29:15PM +0530, shveta malik wrote:
> > Attached v75 patch-set. Changes are:
> >
> > 1) Re-arranged the patches:
> > 1.1) 'libpqrc' related changes (from v74-001 and v74-004) are
> > separated out in v75-001 as those are independent changes.
> > 1.2) 'Add logical slot sync capability', 'Slot sync worker as special
> > process' and 'App-name changes' are now merged to single patch which
> > makes v75-002.
> > 1.3) 'Wait for physical Standby confirmation' and 'Failover Validation
> > Document' patches are maintained as is (v75-003 and v75-004 now).
> 
> Thanks!
> 
> I only looked at the commit message for v75-0002 and see that it has changed
> since the comment done in [1], but it still does not look correct to me.
> 
> "
> If a logical slot on the primary is valid but is invalidated on the standby, 
> then
> that slot is dropped and recreated on the standby in next sync-cycle provided
> the slot still exists on the primary server. It is okay to recreate such 
> slots as long
> as these are not consumable on the standby (which is the case currently). This
> situation may occur due to the following reasons:
> - The max_slot_wal_keep_size on the standby is insufficient to retain WAL
>   records from the restart_lsn of the slot.
> - primary_slot_name is temporarily reset to null and the physical slot is
>   removed.
> - The primary changes wal_level to a level lower than logical.
> "
> 
> If a logical decoding slot "still exists on the primary server" then the 
> primary
> can not change the wal_level to lower than logical, one would get something
> like:
> 
> "FATAL:  logical replication slot "logical_slot" exists, but wal_level < 
> logical"
> 
> and then slots won't get invalidated on the standby. I've the feeling that the
> wal_level conflict part may need to be explained separately? (I think it's not
> possible that they end up being re-created on the standby for this conflict,
> they will be simply removed as it would mean the counterpart one on the
> primary does not exist anymore).

This is possible in some extreme cases, because the slot is synced
asynchronously.

For example: if on the primary the wal_level is changed to 'replica' and then
changed back to 'logical', the standby would receive two XLOG_PARAMETER_CHANGE WAL
records. Before the standby replays those records, the user can create a failover
slot on the primary because the wal_level is logical there, and if the slotsync
worker has synced the slot before the startup process replays the
XLOG_PARAMETER_CHANGE, then when that record is replayed, the just-synced slot
will be invalidated.

That said, it doesn't seem like a real-world case, so I am not sure it is worth a
separate explanation.

Best Regards,
Hou zj




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Masahiko Sawada
On Mon, Feb 5, 2024 at 8:26 PM shveta malik  wrote:
>
> On Mon, Feb 5, 2024 at 10:57 AM Ajin Cherian  wrote:
> >
> > Just noticed that doc/src/sgml/config.sgml still refers to enable_synclot 
> > instead of sync_replication_slots:
> >
> > The standbys corresponding to the physical replication slots in
> > standby_slot_names must configure
> > enable_syncslot = true so they can receive
> >  failover logical slots changes from the primary.
>
> Thanks Ajin for pointing this out. Here are v78 patches, corrected there.
>
> Other changes are:
>
> 1)  Rebased the patches as the v77-001 is now pushed.
> 2)  Enabled executing pg_sync_replication_slots() on cascading-standby.
> 3)  Rearranged the code around parameter validity checks. Changed
> function names and changed the way how dbname is extracted as
> suggested by Amit offlist.
> 4)  Rearranged the code around check_primary_info(). Removed output args.
> 5)  Few other trivial changes.
>

Thank you for updating the patch! Here are some comments:

---
Since two processes (e.g. the slotsync worker and
pg_sync_replication_slots()) concurrently fetch and update the slot
information, there is a race condition where a slot's
confirmed_flush_lsn goes backward. We have the following check, but
it doesn't prevent the slot's confirmed_flush_lsn from moving backward
if the restart_lsn doesn't change:

/*
 * Sanity check: As long as the invalidations are handled
 * appropriately as above, this should never happen.
 */
if (remote_slot->restart_lsn < slot->data.restart_lsn)
elog(ERROR,
 "cannot synchronize local slot \"%s\" LSN(%X/%X)"
 " to remote slot's LSN(%X/%X) as synchronization"
 " would move it backwards", remote_slot->name,
 LSN_FORMAT_ARGS(slot->data.restart_lsn),
 LSN_FORMAT_ARGS(remote_slot->restart_lsn));

---
+ It is recommended that subscriptions are first disabled before promoting
+ the standby and are enabled back after altering the connection string.

I think it's better to describe the reason why it's recommended to
disable subscriptions before the standby promotion.

---
+/* Slot sync worker objects */
+extern PGDLLIMPORT char *PrimaryConnInfo;
+extern PGDLLIMPORT char *PrimarySlotName;

These two variables are also declared in xlogrecovery.h. Is that
intentional? If so, I think it's better to add a comment.

---
Global functions and variables used by the slotsync worker are
declared in logicalworker.h and worker_internal.h. But is it really
okay to make a dependency between the slotsync worker and logical
replication workers? IIUC the slotsync worker is conceptually a
separate feature from logical replication. I think the slotsync
worker can have its own header file.

---
+   SELECT r.srsubid AS subid, CONCAT('pg_' || srsubid ||
'_sync_' || srrelid || '_' || ctl.system_identifier) AS slotname

and

+   SELECT (CASE WHEN r.srsubstate = 'f' THEN
pg_replication_origin_progress(CONCAT('pg_' || r.srsubid || '_' ||
r.srrelid), false)

If we use the CONCAT function, we can replace '||' with ','.
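i.e., the two quoted fragments could become something like:

CONCAT('pg_', srsubid, '_sync_', srrelid, '_', ctl.system_identifier) AS slotname

pg_replication_origin_progress(CONCAT('pg_', r.srsubid, '_', r.srrelid), false)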

---
+ Confirm that the standby server is not lagging behind the subscribers.
+ This step can be skipped if
+ standby_slot_names
+ has been correctly configured.

How can the user confirm if standby_slot_names is correctly configured?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-05 Thread shveta malik
On Mon, Feb 5, 2024 at 4:36 PM Amit Kapila  wrote:
>
> I have pushed the first patch. Next, a few comments on 0002 are as follows:

Thanks for the feedback, Amit. Some of these are addressed in v78; the rest
will be addressed in the next version.

> 1.
> +static bool
> +validate_parameters_and_get_dbname(char **dbname, int elevel)
>
> For 0002, we don't need dbname as out parameter. Also, we can rename
> the function to validate_slotsync_params() or something like that.
> Also, for 0003, we don't need to get the dbname from
> wait_for_valid_params_and_get_dbname(), instead there could be a
> common function that can be invoked from validate_slotsync_params()
> and caller of wait function that caches the value of dbname.
>
> The other parameter elevel is also not required for 0002.
>
> 2.
> + /*
> + * Make sure that concerned WAL is received and flushed before syncing
> + * slot to target lsn received from the primary server.
> + */
> + latestFlushPtr = GetStandbyFlushRecPtr(NULL);
> + if (remote_slot->confirmed_lsn > latestFlushPtr)
> + {
> + /*
> + * Can get here only if GUC 'standby_slot_names' on the primary server
> + * was not configured correctly.
> + */
> + ereport(LOG,
> + errmsg("skipping slot synchronization as the received slot sync"
> +" LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
> +LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> +remote_slot->name,
> +LSN_FORMAT_ARGS(latestFlushPtr)));
> +
> + return false;
>
> In the case of a function invocation, this should be an ERROR. We can
> move the comment related to 'standby_slot_names' to a later patch
> where that GUC is introduced. See, if there are other LOGs in the
> patch that needs to be converted to ERROR.
>
> 3. The function pg_sync_replication_slots() should be in file
> slotfuncs.c and common functionality between this function and
> slotsync worker can be exposed via a function in slotsync.c.
>
> 4.
> /*
> + * Using the specified primary server connection, check whether we are
> + * cascading standby and validates primary_slot_name for
> + * non-cascading-standbys.
> + */
> + check_primary_info(wrconn, _cascading_standby,
> +_slot_invalid, ERROR);
> +
> + if (am_cascading_standby)
> + ereport(ERROR,
> + errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot synchronize replication slots to a cascading standby"));
>
> primary_slot_invalid is not used in this patch. I think we can allow
> the function can be executed on cascading_standby as well because this
> will be used for the planned switchover.
>
> 5. I don't see any problem with allowing concurrent processes trying
> to sync the same slot at the same time as each process will acquire
> the slot and only one process can acquire the slot at a time, the
> other will get an ERROR.
>
> --
> With Regards,
> Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-05 Thread Amit Kapila
On Mon, Feb 5, 2024 at 7:59 AM Zhijie Hou (Fujitsu)
 wrote:
>

I have pushed the first patch. Next, a few comments on 0002 are as follows:
1.
+static bool
+validate_parameters_and_get_dbname(char **dbname, int elevel)

For 0002, we don't need dbname as an out parameter. Also, we can rename
the function to validate_slotsync_params() or something like that.
Also, for 0003, we don't need to get the dbname from
wait_for_valid_params_and_get_dbname(); instead, there could be a
common function that can be invoked from validate_slotsync_params()
and from the caller of the wait function, which caches the value of dbname.

The other parameter elevel is also not required for 0002.

2.
+ /*
+ * Make sure that concerned WAL is received and flushed before syncing
+ * slot to target lsn received from the primary server.
+ */
+ latestFlushPtr = GetStandbyFlushRecPtr(NULL);
+ if (remote_slot->confirmed_lsn > latestFlushPtr)
+ {
+ /*
+ * Can get here only if GUC 'standby_slot_names' on the primary server
+ * was not configured correctly.
+ */
+ ereport(LOG,
+ errmsg("skipping slot synchronization as the received slot sync"
+" LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
+LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
+remote_slot->name,
+LSN_FORMAT_ARGS(latestFlushPtr)));
+
+ return false;

In the case of a function invocation, this should be an ERROR. We can
move the comment related to 'standby_slot_names' to a later patch
where that GUC is introduced. See if there are other LOGs in the
patch that need to be converted to ERROR.

3. The function pg_sync_replication_slots() should be in the file
slotfuncs.c, and the common functionality between this function and the
slotsync worker can be exposed via a function in slotsync.c.

4.
/*
+ * Using the specified primary server connection, check whether we are
+ * cascading standby and validates primary_slot_name for
+ * non-cascading-standbys.
+ */
+ check_primary_info(wrconn, &am_cascading_standby,
+    &primary_slot_invalid, ERROR);
+
+ if (am_cascading_standby)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot synchronize replication slots to a cascading standby"));

primary_slot_invalid is not used in this patch. I think we can allow
the function to be executed on a cascading standby as well because this
will be used for the planned switchover.

5. I don't see any problem with allowing concurrent processes to try
to sync the same slot at the same time: each process will acquire
the slot, and since only one process can acquire the slot at a time,
the other will get an ERROR.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-04 Thread Ajin Cherian
On Mon, Feb 5, 2024 at 1:29 PM Zhijie Hou (Fujitsu) 
wrote:

> On Monday, February 5, 2024 10:17 AM Zhijie Hou (Fujitsu) <
> houzj.f...@fujitsu.com> wrote:
>
> There was one miss in the doc that cause CFbot failure,
> attach the correct version V77_2 here. There are no code changes compared
> to V77 version.
>
> Best Regards,
> Hou zj
>

Just noticed that doc/src/sgml/config.sgml still refers to enable_syncslot
instead of sync_replication_slots:

The standbys corresponding to the physical replication slots in
standby_slot_names must configure
enable_syncslot = true so they can receive
 failover logical slots changes from the primary.

regards,
Ajin Cherian
Fujitsu Australia


RE: Synchronizing slots from primary to standby

2024-02-04 Thread Zhijie Hou (Fujitsu)
On Thursday, February 1, 2024 12:20 PM Amit Kapila  
wrote:
> 
> On Thu, Feb 1, 2024 at 8:15 AM Euler Taveira  wrote:
> >
> > On Mon, Jan 29, 2024, at 10:17 AM, Zhijie Hou (Fujitsu) wrote:
> >
> > Attach the V72-0001 which addressed above comments, other patches will
> be
> > rebased and posted after pushing first patch. Thanks Shveta for helping
> address
> > the comments.
> >
> >
> > While working on another patch I noticed a new NOTICE message:
> >
> > NOTICE:  changed the failover state of replication slot "foo" on publisher 
> > to
> false
> >
> > I wasn't paying much attention to this thread then I start reading the 2
> > patches that was recently committed. The message above surprises me
> because
> > pg_createsubscriber starts to emit this message. The reason is that it 
> > doesn't
> > create the replication slot during the CREATE SUBSCRIPTION. Instead, it
> creates
> > the replication slot with failover = false and no such option is informed
> > during CREATE SUBSCRIPTION which means it uses the default value (failover
> =
> > false). I expect that I don't see any message because it is *not* changing 
> > the
> > behavior. I was wrong. It doesn't check the failover state on publisher, it
> > just executes walrcv_alter_slot() and emits a message.
> >
> > IMO if we are changing an outstanding property on node A from node B,
> node B
> > already knows (or might know) about that behavior change (because it is
> sending
> > the command), however, node A doesn't (unless log_replication_commands
> = on --
> > it is not the default).
> >
> > Do we really need this message as NOTICE?
> >
> 
> The reason for adding this NOTICE was to keep it similar to other
> Notice messages in these commands like create/drop slot. However, here
> the difference is we may not have altered the slot as the property is
> already the same as we want to set on the publisher. So, I am not sure
> whether we should follow the existing behavior or just get rid of it.
> And then do we remove similar NOTICE in AlterSubscription() as well?
> Normally, I think NOTICE intends to let users know if we did anything
> with slots while executing subscription commands. Does anyone else
> have an opinion on this point?
> 
> A related point, I think we can avoid setting the 'failover' property
> in ReplicationSlotAlter() if it is not changed, the advantage is we
> will avoid saving slots. OTOH, this won't be a frequent operation so
> we can leave it as it is as well.

Here is a patch to remove the NOTICE and improve ReplicationSlotAlter().
The patch also includes a few cleanups based on Peter's feedback.

Best Regards,
Hou zj


v2-0001-clean-up-for-776621a5.patch


Re: Synchronizing slots from primary to standby

2024-02-04 Thread Peter Smith
On Fri, Feb 2, 2024 at 11:18 PM shveta malik  wrote:
>
> On Fri, Feb 2, 2024 at 12:25 PM Peter Smith  wrote:
> >
> > Here are some review comments for v750002.
>
> Thanks for the feedback Peter. Addressed all in v76 except one.
>
> > (this is a WIP but this is what I found so far...)
>
> > I wonder if it is better to log all the problems in one go instead of
> > making users stumble onto them one at a time after fixing one and then
> > hitting the next problem. e.g. just set some variable "all_ok =
> > false;" each time instead of all the "return false;"
> >
> > Then at the end of the function just "return all_ok;"
>
> If we do this way, then we need to find a way to combine the msgs as
> well, otherwise the same msg will be repeated multiple times. For the
> concerned functionality (which needs one time config effort by user),
> I feel the existing way looks okay. We may consider optimizing it if
> we get more comments here.
>

I don't think combining messages is necessary; I considered these all to
be different (not the same msg repeated multiple times) since they all
have different errhints.

I felt a user would only know to make a configuration correction when
they are informed something is wrong, so my review point was we could
tell them all the wrong things up-front so then those can all be fixed
with a "one time config effort by user".

Otherwise, if multiple settings (e.g. from the list below) have wrong
values, I imagined the user will fix the first reported one, then the
next bad config will be reported, then the user will fix that one,
then the next bad config will be reported, then the user will fix that
one, and so on. It just seemed potentially/unnecessarily painful.

- errhint("\"%s\" must be defined.", "primary_slot_name"));
- errhint("\"%s\" must be enabled.", "hot_standby_feedback"));
- errhint("\"wal_level\" must be >= logical."));
- errhint("\"%s\" must be defined.", "primary_conninfo"));
- errhint("'dbname' must be specified in \"%s\".", "primary_conninfo"));

~

Anyway, I just wanted to explain my review comment some more because
maybe my reason wasn't clear the first time. Whatever your decision
is, it is fine by me.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-02-02 Thread shveta malik
On Fri, Feb 2, 2024 at 12:25 PM Peter Smith  wrote:
>
> Here are some review comments for v750002.

Thanks for the feedback Peter. Addressed all in v76 except one.

> (this is a WIP but this is what I found so far...)

> I wonder if it is better to log all the problems in one go instead of
> making users stumble onto them one at a time after fixing one and then
> hitting the next problem. e.g. just set some variable "all_ok =
> false;" each time instead of all the "return false;"
>
> Then at the end of the function just "return all_ok;"

If we do this way, then we need to find a way to combine the msgs as
well, otherwise the same msg will be repeated multiple times. For the
concerned functionality (which needs one time config effort by user),
I feel the existing way looks okay. We may consider optimizing it if
we get more comments here.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-02-02 Thread Bertrand Drouvot
Hi,

On Fri, Feb 02, 2024 at 12:25:30PM +0530, Amit Kapila wrote:
> On Thu, Feb 1, 2024 at 5:29 PM shveta malik  wrote:
> >
> > On Thu, Feb 1, 2024 at 2:35 PM Amit Kapila  wrote:
> > >
> > > Agreed, and I am fine with merging 0001, 0002, and 0004 as suggested
> > > by you though I have a few minor comments on 0002 and 0004. I was
> > > thinking about what will be a logical way to split the slot sync
> > > worker patch (combined result of 0001, 0002, and 0004), and one idea
> > > occurred to me is that we can have the first patch as
> > > synchronize_slots() API and the functionality required to implement
> > > that API then the second patch would be a slot sync worker which uses
> > > that API to synchronize slots and does all the required validations.
> > > Any thoughts?
> >
> > If we shift 'synchronize_slots()' to the first patch but there is no
> > caller of it, we may have a compiler warning for the same. The only
> > way it can be done is if we temporarily add SQL function on standby
> > which uses 'synchronize_slots()'. This SQL function can then be
> > removed in later patches where we actually have a caller for
> > 'synchronize_slots'.
> >
> 
> Can such a SQL function say pg_synchronize_slots() which can sync all
> slots that have a failover flag set be useful in general apart from
> just writing tests for this new API? I am thinking maybe users want
> more control over when to sync the slots and write their bgworker or
> simply do it just before shutdown once (sort of planned switchover) or
> at some other pre-defined times.

Big +1 for having this kind of function in the user's hands (as the standby's
slots may be lagging behind during a switchover, for example). As for the name,
I think it would make sense to add "replication" or "repl", something like
pg_sync_replication_slots()? (that would be aligned with
pg_create_logical_replication_slot() and friends).

> BTW, we also have
> pg_log_standby_snapshot() which otherwise would be done periodically
> by background processes.
> 
> >
> > 1) Re-arranged the patches:
> > 1.1) 'libpqrc' related changes (from v74-001 and v74-004) are
> > separated out in v75-001 as those are independent changes.
> 
> Bertrand, Sawada-San, and others, do you see a problem with such a
> split? Can we go ahead with v75_0001 separately after fixing the open
> comments?

I think that makes sense, especially if we're also creating a user-callable
function to sync the slot(s) on demand.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-02 Thread Amit Kapila
On Fri, Feb 2, 2024 at 10:53 AM Bertrand Drouvot
 wrote:
>
> On Thu, Feb 01, 2024 at 04:12:43PM +0530, Amit Kapila wrote:
> > On Thu, Jan 25, 2024 at 11:26 AM Bertrand Drouvot
> >
> > I again thought on this point and feel that even if we start to sync
> > say physical slots their purpose would also be to allow
> > failover/switchover, otherwise, there is no use of syncing the slots.
>
> Yeah, I think this is a good point.
>
> > So, by that theory, we can just go for naming it as
> > sync_failover_slots or simply sync_slots with values 'off' and 'on'.
> > Now, if these are used for switchover then there is an argument that
> > adding 'failover' in the GUC name could be confusing but I feel
> > 'failover' is used widely enough that it shouldn't be a problem for
> > users to understand, otherwise, we can go with simple name like
> > sync_slots as well.
> >
>
> I agree and "on"/"off" looks enough to me now. As far the GUC name I've the
> feeling that "replication" should be part of it, and think that 
> sync_replication_slots
> is fine. The reason behind is that "sync_slots" could be confusing if in the
> future other kind of "slot" (other than replication ones) are added in the 
> engine.
>

+1 for sync_replication_slots with values as 'on'/'off'.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Amit Kapila
On Thu, Feb 1, 2024 at 5:29 PM shveta malik  wrote:
>
> On Thu, Feb 1, 2024 at 2:35 PM Amit Kapila  wrote:
> >
> > Agreed, and I am fine with merging 0001, 0002, and 0004 as suggested
> > by you though I have a few minor comments on 0002 and 0004. I was
> > thinking about what will be a logical way to split the slot sync
> > worker patch (combined result of 0001, 0002, and 0004), and one idea
> > occurred to me is that we can have the first patch as
> > > synchronize_slots() API and the functionality required to implement
> > that API then the second patch would be a slot sync worker which uses
> > that API to synchronize slots and does all the required validations.
> > Any thoughts?
>
> If we shift 'synchronize_slots()' to the first patch but there is no
> caller of it, we may have a compiler warning for the same. The only
> way it can be done is if we temporarily add SQL function on standby
> which uses 'synchronize_slots()'. This SQL function can then be
> removed in later patches where we actually have a caller for
> 'synchronize_slots'.
>

Can such a SQL function say pg_synchronize_slots() which can sync all
slots that have a failover flag set be useful in general apart from
just writing tests for this new API? I am thinking maybe users want
more control over when to sync the slots and write their bgworker or
simply do it just before shutdown once (sort of planned switchover) or
at some other pre-defined times. BTW, we also have
pg_log_standby_snapshot() which otherwise would be done periodically
by background processes.

>
> 1) Re-arranged the patches:
> 1.1) 'libpqrc' related changes (from v74-001 and v74-004) are
> separated out in v75-001 as those are independent changes.

Bertrand, Sawada-San, and others, do you see a problem with such a
split? Can we go ahead with v75_0001 separately after fixing the open
comments?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Peter Smith
Here are some review comments for v750002.

(this is a WIP but this is what I found so far...)

==
doc/src/sgml/protocol.sgml

1.
> > 2. ALTER_REPLICATION_SLOT slot_name ( option [, ...] ) #
> >
> >   
> > -  If true, the slot is enabled to be synced to the standbys.
> > +  If true, the slot is enabled to be synced to the standbys
> > +  so that logical replication can be resumed after failover.
> >   
> >
> > This also should have the sentence "The default is false.", e.g. the
> > same as the same option in CREATE_REPLICATION_SLOT says.
>
> I have not added this. I feel the default value related details should
> be present in the 'CREATE' part, it is not meaningful for the "ALTER"
> part. ALTER does not have any defaults, it just modifies the options
> given by the user.

You are correct. My mistake.

==
src/backend/postmaster/bgworker.c

2.
 #include "replication/logicalworker.h"
+#include "replication/worker_internal.h"
 #include "storage/dsm.h"

Is this change needed when the rest of the code is removed?

==
src/backend/replication/logical/slotsync.c

3. synchronize_one_slot

> > 3.
> > + /*
> > + * Make sure that concerned WAL is received and flushed before syncing
> > + * slot to target lsn received from the primary server.
> > + *
> > + * This check will never pass if on the primary server, user has
> > + * configured standby_slot_names GUC correctly, otherwise this can hit
> > + * frequently.
> > + */
> > + latestFlushPtr = GetStandbyFlushRecPtr(NULL);
> > + if (remote_slot->confirmed_lsn > latestFlushPtr)
> >
> > BEFORE
> > This check will never pass if on the primary server, user has
> > configured standby_slot_names GUC correctly, otherwise this can hit
> > frequently.
> >
> > SUGGESTION (simpler way to say the same thing?)
> > This will always be the case unless the standby_slot_names GUC is not
> > correctly configured on the primary server.
>
> It is not true. It will not hit this condition "always" but has higher
> chances to hit it when standby_slot_names is not configured. I think
> you meant 'unless the standby_slot_names GUC is correctly configured'.
> I feel the current comment gives clear info (less confusing) and thus
> I have not changed it for the time being. I can consider if I get more
> comments there.

Hmm. I meant what I wrote. The "This" of my suggested text refers to
the previous sentence in the comment (not about "hitting" ?? your
condition).

TBH, regardless of the wording you choose, I think it will be much
clearer to move the comment to be inside the if.

SUGGESTION
/*
 * Make sure that concerned WAL is received and flushed before syncing
 * slot to target lsn received from the primary server.
 */
latestFlushPtr = GetStandbyFlushRecPtr(NULL);
if (remote_slot->confirmed_lsn > latestFlushPtr)
{
  /*
   * Can get here only if the GUC 'standby_slot_names' on the primary
   * server was not configured correctly.
   */
...
}

~~~

4.
+static bool
+validate_parameters_and_get_dbname(char **dbname)
+{
+ /*
+ * A physical replication slot(primary_slot_name) is required on the
+ * primary to ensure that the rows needed by the standby are not removed
+ * after restarting, so that the synchronized slot on the standby will not
+ * be invalidated.
+ */
+ if (PrimarySlotName == NULL || *PrimarySlotName == '\0')
+ {
+ ereport(LOG,
+ /* translator: %s is a GUC variable name */
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("bad configuration for slot synchronization"),
+ errhint("\"%s\" must be defined.", "primary_slot_name"));
+ return false;
+ }
+
+ /*
+ * hot_standby_feedback must be enabled to cooperate with the physical
+ * replication slot, which allows informing the primary about the xmin and
+ * catalog_xmin values on the standby.
+ */
+ if (!hot_standby_feedback)
+ {
+ ereport(LOG,
+ /* translator: %s is a GUC variable name */
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("bad configuration for slot synchronization"),
+ errhint("\"%s\" must be enabled.", "hot_standby_feedback"));
+ return false;
+ }
+
+ /*
+ * Logical decoding requires wal_level >= logical and we currently only
+ * synchronize logical slots.
+ */
+ if (wal_level < WAL_LEVEL_LOGICAL)
+ {
+ ereport(LOG,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("bad configuration for slot synchronization"),
+ errhint("\"wal_level\" must be >= logical."));
+ return false;
+ }
+
+ /*
+ * The primary_conninfo is required to make connection to primary for
+ * getting slots information.
+ */
+ if (PrimaryConnInfo == NULL || *PrimaryConnInfo == '\0')
+ {
+ ereport(LOG,
+ /* translator: %s is a GUC variable name */
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("bad configuration for slot synchronization"),
+ errhint("\"%s\" must be defined.", "primary_conninfo"));
+ return false;
+ }
+
+ /*
+ * The slot sync worker needs a database connection for walrcv_exec to
+ * work.
+ */
+ *dbname = walrcv_get_dbname_from_conninfo(PrimaryConnInfo);
+ if (*dbname == 

Re: Synchronizing slots from primary to standby

2024-02-01 Thread Masahiko Sawada
On Fri, Feb 2, 2024 at 1:58 PM Amit Kapila  wrote:
>
> On Fri, Feb 2, 2024 at 6:46 AM Masahiko Sawada  wrote:
> >
> > On Thu, Feb 1, 2024 at 12:51 PM Amit Kapila  wrote:
> > >
> > >
> > > > BTW I've tested the following switch/fail-back scenario but it seems
> > > > not to work fine. Am I missing something?
> > > >
> > > > Setup:
> > > > node1 is the primary, node2 is the physical standby for node1, and
> > > > node3 is the subscriber connecting to node1.
> > > >
> > > > Steps:
> > > > 1. [node1]: create a table and a publication for the table.
> > > > 2. [node2]: set enable_syncslot = on and start (to receive WALs from 
> > > > node1).
> > > > 3. [node3]: create a subscription with failover = true for the 
> > > > publication.
> > > > 4. [node2]: promote to the new standby.
> > > > 5. [node3]: alter subscription to connect the new primary, node2.
> > > > 6. [node1]: stop, set enable_syncslot = on (and other required
> > > > parameters), then start as a new standby.
> > > >
> > > > Then I got the error "exiting from slot synchronization because same
> > > > name slot "test_sub" already exists on the standby".
> > > >
> > > > The logical replication slot that was created on the old primary
> > > > (node1) has been synchronized to the old standby (node2). Therefore on
> > > > node2, the slot's "synced" field is true. However, once node1 starts
> > > > as the new standby with slot synchronization, the slotsync worker
> > > > cannot synchronize the slot because the slot's "synced" field on the
> > > > primary is false.
> > > >
> > >
> > > Yeah, we avoided doing anything in this case because the user could
> > > have manually created another slot with the same name on standby.
> > > Unlike WAL slots can be modified on standby as we allow decoding on
> > > standby, so we can't allow to overwrite the existing slots. We won't
> > > be able to distinguish whether the existing slot was a slot that the
> > > user wants to sync with primary or a slot created on standby to
> > > perform decoding. I think in this case user first needs to drop the
> > > slot on new standby.
> >
> > Yes, but if we do a switch-back further (i.e. in above case, node1
> > backs to the primary again and node becomes the standby again), the
> > user doesn't need to remove failover slots since they are already
> > marked as "synced".
>
> But, I think in this case node-2's timeline will be ahead of node-1,
> so will we be able to make node-2 follow node-1 again without any
> additional steps? One thing is not clear to me after promotion the
> timeline changes in WAL, so the locations in slots will be as per new
> timelines, after that will it be safe to sync slots from the new
> primary to old-primary?

In order for node-1 to go back to the primary again, it needs to be
promoted. That is, the node-1's timeline increments and node-2 follows
node-1.

>
> In general, I think after failover, we recommend running pg_rewind if
> the old primary has to follow the new primary to account for
> divergence in WAL. So, not sure we can safely start syncing slots in
> old-primary from new-primary, consider that in the new primary, the
> same name slot may have dropped/re-created multiple times.

Right. And I missed the point that all replication slots are removed
after pg_rewind. It would not be a problem in a failover case. But
probably we still need to consider a switchover case (i.e. switching roles
with clean shutdowns) since it doesn't require running pg_rewind?

> We can
> probably reset all the fields of the existing slot the first time
> syncing for an existing slot or do something like that but I think it
> would be better to just re-create the slot.
>
> >
>  I wonder if we could do something automatically to
> > reduce the user's operation.
>
> One possibility is that we forcefully drop/re-create the slot or
> directly overwrite the slot contents but that would probably be better
> done via some GUC or slot-level parameter. I feel we should leave this
> for another day, for the first version, we can document that an error
> will occur if the same name slots on standby exist, so users need to
> ensure that there shouldn't be an existing same name slots on standby
> before sync.
>

Hmm, I'm afraid it might not be user-friendly. But probably we can
leave it for now as it's not impossible.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Amit Kapila
On Fri, Feb 2, 2024 at 9:50 AM Peter Smith  wrote:
>
> Here are some review comments for v750001.
>
> ~~~
>
> 2.
> This patch also implements a new API libpqrcv_get_dbname_from_conninfo()
> to extract database name from the given connection-info
>
> ~
>
> /extract database name/the extract database name/
>

I think it should be "..extract the database name.."

> ==
> .../libpqwalreceiver/libpqwalreceiver.c
>

>
> 4. libpqrcv_connect
>
>  /*
> - * Establish the connection to the primary server for XLOG streaming
> + * Establish the connection to the primary server.
> + *
> + * The connection established could be either a replication one or
> + * a non-replication one based on input argument 'replication'. And further
> + * if it is a replication connection, it could be either logical or physical
> + * based on input argument 'logical'.
>
> That first comment ("could be either a replication one or...") seemed
> a bit meaningless (e.g. it like saying "this boolean argument can be
> true or false") because it doesn't describe what is the meaning of a
> "replication connection" versus what is a "non-replication
> connection".
>

The replication connection is a term already used in the code and
docs. For example, see the error message: "pg_hba.conf rejects
replication connection for host ..". It means that for communication
the connection would use replication protocol instead of the normal
(one used by queries) protocol. The other possibility could be to
individually explain each parameter but I think that is not what we
follow in this or related functions. I feel we can use a simple
comment like: "This API can be used for both replication and regular
connections."

> ~~~
>
> 5.
> /* We can not have logical without replication */
> Assert(replication || !logical);
>
> if (replication)
> {
> keys[++i] = "replication";
> vals[i] = logical ? "database" : "true";
>
> if (!logical)
> {
> /*
> * The database name is ignored by the server in replication mode,
> * but specify "replication" for .pgpass lookup.
> */
> keys[++i] = "dbname";
> vals[i] = "replication";
> }
> }
>
> keys[++i] = "fallback_application_name";
> vals[i] = appname;
> if (logical)
> {
> ...
> }
>
> ~
>
> The Assert already says we cannot be 'logical' if not 'replication',
> therefore IMO it seemed strange that the code was not refactored to
> bring that 2nd "if (logical)" code to within the scope of the "if
> (replication)".
>
> e.g. Can't you do something like this:
>
> Assert(replication || !logical);
>
> if (replication)
> {
>   ...
>   if (logical)
>   {
> ...
>   }
>   else
>   {
> ...
>   }
> }
> keys[++i] = "fallback_application_name";
> vals[i] = appname;
>

+1.

> ~~~
>
> 6. libpqrcv_get_dbname_from_conninfo
>
> + for (PQconninfoOption *opt = opts; opt->keyword != NULL; ++opt)
> + {
> + /*
> + * If multiple dbnames are specified, then the last one will be
> + * returned
> + */
> + if (strcmp(opt->keyword, "dbname") == 0 && opt->val &&
> + opt->val[0] != '\0')
> + dbname = pstrdup(opt->val);
> + }
>
> Should you also pfree the old dbname instead of gathering a bunch of
> strdups if there happened to be multiple dbnames specified ?
>
> SUGGESTION
> if (strcmp(opt->keyword, "dbname") == 0 && opt->val && *opt->val)
> {
>if (dbname)
> pfree(dbname);
>dbname = pstrdup(opt->val);
> }
>

Makes sense. And shouldn't we also call PQconninfoFree(opts) at the end of
libpqrcv_get_dbname_from_conninfo(), similar to libpqrcv_check_conninfo()?
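
Putting both points together, a hedged sketch of how the function could end
up looking (simplified; error handling reduced to a single ereport, so not
the exact patch code):

/* Sketch: return the last dbname from the conninfo, or NULL if none. */
static char *
libpqrcv_get_dbname_from_conninfo(const char *connInfo)
{
	PQconninfoOption *opts;
	char	   *dbname = NULL;
	char	   *err = NULL;

	opts = PQconninfoParse(connInfo, &err);
	if (opts == NULL)
		ereport(ERROR,
				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				errmsg("invalid connection string syntax: %s", err));

	for (PQconninfoOption *opt = opts; opt->keyword != NULL; ++opt)
	{
		/* If multiple dbnames are specified, the last one wins. */
		if (strcmp(opt->keyword, "dbname") == 0 && opt->val && *opt->val)
		{
			if (dbname)
				pfree(dbname);
			dbname = pstrdup(opt->val);
		}
	}

	PQconninfoFree(opts);
	return dbname;
}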

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Bertrand Drouvot
Hi,

On Thu, Feb 01, 2024 at 05:29:15PM +0530, shveta malik wrote:
> Attached v75 patch-set. Changes are:
> 
> 1) Re-arranged the patches:
> 1.1) 'libpqrc' related changes (from v74-001 and v74-004) are
> separated out in v75-001 as those are independent changes.
> 1.2) 'Add logical slot sync capability', 'Slot sync worker as special
> process' and 'App-name changes' are now merged to single patch which
> makes v75-002.
> 1.3) 'Wait for physical Standby confirmation' and 'Failover Validation
> Document' patches are maintained as is (v75-003 and v75-004 now).

Thanks!

I only looked at the commit message for v75-0002 and see that it has changed
since the comment made in [1], but it still does not look correct to me.

"
If a logical slot on the primary is valid but is invalidated on the standby,
then that slot is dropped and recreated on the standby in next sync-cycle
provided the slot still exists on the primary server. It is okay to recreate
such slots as long as these are not consumable on the standby (which is the
case currently). This situation may occur due to the following reasons:
- The max_slot_wal_keep_size on the standby is insufficient to retain WAL
  records from the restart_lsn of the slot.
- primary_slot_name is temporarily reset to null and the physical slot is
  removed.
- The primary changes wal_level to a level lower than logical.
"

If a logical decoding slot "still exists on the primary server" then the primary
can not change the wal_level to lower than logical, one would get something 
like:

"FATAL:  logical replication slot "logical_slot" exists, but wal_level < 
logical"

and then slots won't get invalidated on the standby. I have the feeling that the
wal_level conflict part may need to be explained separately? (I think it's not
possible that they end up being re-created on the standby for this conflict;
they will simply be removed, as it would mean the counterpart slot on the
primary does not exist anymore).

[1]: 
https://www.postgresql.org/message-id/ZYWdSIeAMQQcLmVT%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Bertrand Drouvot
Hi,

On Thu, Feb 01, 2024 at 04:12:43PM +0530, Amit Kapila wrote:
> On Thu, Jan 25, 2024 at 11:26 AM Bertrand Drouvot
>  wrote:
> >
> > On Wed, Jan 24, 2024 at 04:09:15PM +0530, shveta malik wrote:
> > > On Wed, Jan 24, 2024 at 2:38 PM Bertrand Drouvot
> > >  wrote:
> > > >
> > > > I also see Sawada-San's point and I'd vote for 
> > > > "sync_replication_slots". Then for
> > > > the current feature I think "failover" and "on" should be the values to 
> > > > turn the
> > > > feature on (assuming "on" would mean "all kind of supported slots").
> > >
> > > Even if others agree and we change this GUC name to
> > > "sync_replication_slots", I feel we should keep the values as "on" and
> > > "off" currently, where "on" would mean 'sync failover slots' (docs can
> > > state that clearly).
> >
> > I gave more thoughts on it and I think the values should only be "failover" 
> > or
> > "off".
> >
> > The reason is that if we allow "on" and change the "on" behavior in future
> > versions (to support more than failover slots) then that would change the 
> > behavior
> > for the ones that used "on".
> >
> 
> I again thought on this point and feel that even if we start to sync
> say physical slots their purpose would also be to allow
> failover/switchover, otherwise, there is no use of syncing the slots.

Yeah, I think this is a good point.

> So, by that theory, we can just go for naming it as
> sync_failover_slots or simply sync_slots with values 'off' and 'on'.
> Now, if these are used for switchover then there is an argument that
> adding 'failover' in the GUC name could be confusing but I feel
> 'failover' is used widely enough that it shouldn't be a problem for
> users to understand, otherwise, we can go with simple name like
> sync_slots as well.
> 

I agree and "on"/"off" looks enough to me now. As far the GUC name I've the
feeling that "replication" should be part of it, and think that 
sync_replication_slots
is fine. The reason behind is that "sync_slots" could be confusing if in the 
future other kind of "slot" (other than replication ones) are added in the 
engine.

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Amit Kapila
On Fri, Feb 2, 2024 at 6:46 AM Masahiko Sawada  wrote:
>
> On Thu, Feb 1, 2024 at 12:51 PM Amit Kapila  wrote:
> >
> >
> > > BTW I've tested the following switch/fail-back scenario but it seems
> > > not to work fine. Am I missing something?
> > >
> > > Setup:
> > > node1 is the primary, node2 is the physical standby for node1, and
> > > node3 is the subscriber connecting to node1.
> > >
> > > Steps:
> > > 1. [node1]: create a table and a publication for the table.
> > > 2. [node2]: set enable_syncslot = on and start (to receive WALs from 
> > > node1).
> > > 3. [node3]: create a subscription with failover = true for the 
> > > publication.
> > > 4. [node2]: promote to the new standby.
> > > 5. [node3]: alter subscription to connect the new primary, node2.
> > > 6. [node1]: stop, set enable_syncslot = on (and other required
> > > parameters), then start as a new standby.
> > >
> > > Then I got the error "exiting from slot synchronization because same
> > > name slot "test_sub" already exists on the standby".
> > >
> > > The logical replication slot that was created on the old primary
> > > (node1) has been synchronized to the old standby (node2). Therefore on
> > > node2, the slot's "synced" field is true. However, once node1 starts
> > > as the new standby with slot synchronization, the slotsync worker
> > > cannot synchronize the slot because the slot's "synced" field on the
> > > primary is false.
> > >
> >
> > Yeah, we avoided doing anything in this case because the user could
> > have manually created another slot with the same name on standby.
> > Unlike WAL slots can be modified on standby as we allow decoding on
> > standby, so we can't allow to overwrite the existing slots. We won't
> > be able to distinguish whether the existing slot was a slot that the
> > user wants to sync with primary or a slot created on standby to
> > perform decoding. I think in this case user first needs to drop the
> > slot on new standby.
>
> Yes, but if we do a switch-back further (i.e. in above case, node1
> backs to the primary again and node becomes the standby again), the
> user doesn't need to remove failover slots since they are already
> marked as "synced".

But I think in this case node-2's timeline will be ahead of node-1's,
so will we be able to make node-2 follow node-1 again without any
additional steps? One thing is not clear to me: after promotion the
timeline in WAL changes, so the locations in slots will be as per the new
timeline; after that, will it be safe to sync slots from the new
primary to the old primary?

In general, I think after failover we recommend running pg_rewind if
the old primary has to follow the new primary, to account for
divergence in WAL. So, I am not sure we can safely start syncing slots on
the old primary from the new primary, considering that on the new primary
the same-name slot may have been dropped/re-created multiple times. We
could probably reset all the fields of the existing slot the first time we
sync an existing slot, or do something like that, but I think it would be
better to just re-create the slot.

>
> I wonder if we could do something automatically to
> reduce the user's operation.

One possibility is that we forcefully drop/re-create the slot or
directly overwrite the slot contents, but that would probably be better
done via some GUC or slot-level parameter. I feel we should leave this
for another day; for the first version, we can document that an error
will occur if a slot with the same name exists on the standby, so users
need to ensure that no such same-name slot exists on the standby
before the sync.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Peter Smith
Here are some review comments for v750001.

==
Commit message

1.
This patch provides support for non-replication connection
in libpqrcv_connect().

~

1a.
/connection/connections/

~

1b.
Maybe there needs to be a few more sentences just to describe what you
mean by "non-replication connection".

~

1c.
IIUC although the 'replication' parameter is added, in this patch
AFAICT every call to the connect function is still passing that
argument as true. If that's correct, probably this patch comment
should emphasise that this patch doesn't change any functionality at
all but is just preparation for later patches which *will* pass false
for the replication arg.

~~~

2.
This patch also implements a new API libpqrcv_get_dbname_from_conninfo()
to extract database name from the given connection-info

~

/extract database name/the extract database name/

==
.../libpqwalreceiver/libpqwalreceiver.c

3.
+ * Apart from walreceiver, the libpq-specific routines here are now being used
+ * by logical replication worker as well.

/worker/workers/

~~~

4. libpqrcv_connect

 /*
- * Establish the connection to the primary server for XLOG streaming
+ * Establish the connection to the primary server.
+ *
+ * The connection established could be either a replication one or
+ * a non-replication one based on input argument 'replication'. And further
+ * if it is a replication connection, it could be either logical or physical
+ * based on input argument 'logical'.

That first comment ("could be either a replication one or...") seemed
a bit meaningless (e.g. it like saying "this boolean argument can be
true or false") because it doesn't describe what is the meaning of a
"replication connection" versus what is a "non-replication
connection".

~~~

5.
/* We can not have logical without replication */
Assert(replication || !logical);

if (replication)
{
keys[++i] = "replication";
vals[i] = logical ? "database" : "true";

if (!logical)
{
/*
* The database name is ignored by the server in replication mode,
* but specify "replication" for .pgpass lookup.
*/
keys[++i] = "dbname";
vals[i] = "replication";
}
}

keys[++i] = "fallback_application_name";
vals[i] = appname;
if (logical)
{
...
}

~

The Assert already says we cannot be 'logical' if not 'replication',
therefore IMO it seemed strange that the code was not refactored to
bring that 2nd "if (logical)" code to within the scope of the "if
(replication)".

e.g. Can't you do something like this:

Assert(replication || !logical);

if (replication)
{
  ...
  if (logical)
  {
...
  }
  else
  {
...
  }
}
keys[++i] = "fallback_application_name";
vals[i] = appname;

~~~

6. libpqrcv_get_dbname_from_conninfo

+ for (PQconninfoOption *opt = opts; opt->keyword != NULL; ++opt)
+ {
+ /*
+ * If multiple dbnames are specified, then the last one will be
+ * returned
+ */
+ if (strcmp(opt->keyword, "dbname") == 0 && opt->val &&
+ opt->val[0] != '\0')
+ dbname = pstrdup(opt->val);
+ }

Should you also pfree the old dbname instead of gathering a bunch of
strdups if there happened to be multiple dbnames specified ?

SUGGESTION
if (strcmp(opt->keyword, "dbname") == 0 && opt->val && *opt->val)
{
   if (dbname)
pfree(dbname);
   dbname = pstrdup(opt->val);
}

==
src/include/replication/walreceiver.h

7.
/*
 * walrcv_connect_fn
 *
 * Establish connection to a cluster.  'logical' is true if the
 * connection is logical, and false if the connection is physical.
 * 'appname' is a name associated to the connection, to use for example
 * with fallback_application_name or application_name.  Returns the
 * details about the connection established, as defined by
 * WalReceiverConn for each WAL receiver module.  On error, NULL is
 * returned with 'err' including the error generated.
 */
typedef WalReceiverConn *(*walrcv_connect_fn) (const char *conninfo,
   bool replication,
   bool logical,
   bool must_use_password,
   const char *appname,
   char **err);

~

The comment is missing any description of the new parameter 'replication'.

~~~

8.
+/*
+ * walrcv_get_dbname_from_conninfo_fn
+ *
+ * Returns the dbid from the primary_conninfo
+ */
+typedef char *(*walrcv_get_dbname_from_conninfo_fn) (const char *conninfo);
+

/dbid/database name/

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Masahiko Sawada
On Thu, Feb 1, 2024 at 12:51 PM Amit Kapila  wrote:
>
> On Wed, Jan 31, 2024 at 9:20 PM Masahiko Sawada  wrote:
> >
> > On Wed, Jan 31, 2024 at 7:42 PM Amit Kapila  wrote:
> > >
> > >
> > > Considering my previous where we don't want to restart for a required
> > > parameter change, isn't it better to avoid repeated restart (say when
> > > the user gave an invalid dbname)? BTW, I think this restart interval
> > > is added based on your previous complaint [1].
> >
> > I think it's useful that the slotsync worker restarts immediately when
> > a required parameter is changed but waits to restart when it exits
> > with an error. IIUC the apply worker does so; if it restarts due to a
> > subscription parameter change, it resets the last-start time so that
> > the launcher will restart it without waiting.
> >
>
> Agreed, this idea sounds good to me.
>
> > >
> > > >
> > > > ---
> > > > When I dropped a database on the primary that has a failover slot, I
> > > > got the following logs on the standby:
> > > >
> > > > 2024-01-31 17:25:21.750 JST [1103933] FATAL:  replication slot "s" is
> > > > active for PID 1103935
> > > > 2024-01-31 17:25:21.750 JST [1103933] CONTEXT:  WAL redo at 0/3020D20
> > > > for Database/DROP: dir 1663/16384
> > > > 2024-01-31 17:25:21.751 JST [1103930] LOG:  startup process (PID
> > > > 1103933) exited with exit code 1
> > > >
> > > > It seems that because the slotsync worker created the slot on the
> > > > standby, the slot's active_pid is still valid.
> > > >
> > >
> > > But we release the slot after sync. And we do take a shared lock on
> > > the database to make the startup process wait for slotsync. There is
> > > one gap which is that we don't reset active_pid for temp slots in
> > > ReplicationSlotRelease(), so for temp slots such an error can occur
> > > but OTOH, we immediately make the slot persistent after sync. As per
> > > my understanding, it is only possible to get this error if the initial
> > > sync doesn't happen and the slot remains temporary. Is that your case?
> > > How did reproduce this?
> >
> > I created a failover slot manually on the primary and dropped the
> > database where the failover slot is created. So this would not happen
> > in normal cases.
> >
>
> Right, it won't happen in normal cases (say for walsender). This can
> happen in some cases even without this patch as noted in comments just
> above active_pid check in ReplicationSlotsDropDBSlots(). Now, we need
> to think whether we should just update the comments above active_pid
> check to explain this case or try to engineer some solution for this
> not-so-common case. I guess if we want a solution we need to stop
> slotsync worker temporarily till the drop database WAL is applied or
> something like that.
>
> > BTW I've tested the following switch/fail-back scenario but it seems
> > not to work fine. Am I missing something?
> >
> > Setup:
> > node1 is the primary, node2 is the physical standby for node1, and
> > node3 is the subscriber connecting to node1.
> >
> > Steps:
> > 1. [node1]: create a table and a publication for the table.
> > 2. [node2]: set enable_syncslot = on and start (to receive WALs from node1).
> > 3. [node3]: create a subscription with failover = true for the publication.
> > 4. [node2]: promote to the new standby.
> > 5. [node3]: alter subscription to connect the new primary, node2.
> > 6. [node1]: stop, set enable_syncslot = on (and other required
> > parameters), then start as a new standby.
> >
> > Then I got the error "exiting from slot synchronization because same
> > name slot "test_sub" already exists on the standby".
> >
> > The logical replication slot that was created on the old primary
> > (node1) has been synchronized to the old standby (node2). Therefore on
> > node2, the slot's "synced" field is true. However, once node1 starts
> > as the new standby with slot synchronization, the slotsync worker
> > cannot synchronize the slot because the slot's "synced" field on the
> > primary is false.
> >
>
> Yeah, we avoided doing anything in this case because the user could
> have manually created another slot with the same name on standby.
> Unlike WAL slots can be modified on standby as we allow decoding on
> standby, so we can't allow to overwrite the existing slots. We won't
> be able to distinguish whether the existing slot was a slot that the
> user wants to sync with primary or a slot created on standby to
> perform decoding. I think in this case user first needs to drop the
> slot on new standby.

Yes, but if we do a switch-back further (i.e. in the above case, node1
goes back to being the primary again and node2 becomes the standby again),
the user doesn't need to remove failover slots since they are already
marked as "synced". I wonder if we could do something automatically to
reduce the user's operation. Also, if we support the slot synchronization
feature on a cascading standby in the future, this operation will
have to be changed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Synchronizing slots from primary to standby

2024-02-01 Thread shveta malik
On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada  wrote:
>
> ---
> +static char *
> +wait_for_valid_params_and_get_dbname(void)
> +{
> +   char   *dbname;
> +   int rc;
> +
> +   /* Sanity check. */
> +   Assert(enable_syncslot);
> +
> +   for (;;)
> +   {
> +   if (validate_parameters_and_get_dbname())
> +   break;
> +   ereport(LOG, errmsg("skipping slot synchronization"));
> +
> +   ProcessSlotSyncInterrupts(NULL);
>
> When reading this function, I expected that the slotsync worker would
> resume working once the parameters became valid, but it was not
> correct. For example, if I changed hot_standby_feedback from off to
> on, the slotsync worker reads the config file, exits, and then
> restarts. Given that the slotsync worker ends up exiting on parameter
> changes anyway, why do we want to have it wait for parameters to
> become valid? IIUC even if the slotsync worker exits when a parameter
> is not valid, it restarts at some intervals.

Thanks for the feedback. Changed this functionality in v75. Now we do
not exit in wait_for_valid_params_and_get_dbname() on GUC change. We
re-validate the new values and, if they are found valid, carry on with
slot-syncing; otherwise we continue waiting.
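
In other words, the loop now looks roughly like this (names simplified;
ProcessSlotSyncInterrupts reloads the config on SIGHUP rather than exiting):

/* Sketch of the v75 behaviour, not the exact patch code. */
for (;;)
{
	if (validate_parameters_and_get_dbname(&dbname))
		break;			/* all GUCs are valid: start syncing slots */

	ereport(LOG, errmsg("skipping slot synchronization"));

	/*
	 * Wait; on a config reload the new values are picked up and the
	 * loop re-validates them instead of exiting the worker.
	 */
	ProcessSlotSyncInterrupts(NULL);
}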

> ---
> +bool
> +SlotSyncWorkerCanRestart(void)
> +{
> +#define SLOTSYNC_RESTART_INTERVAL_SEC 10
> +
>
> IIUC depending on how busy the postmaster is and the timing, the user
> could wait for 1 min to re-launch the slotsync worker. But I think the
> user might want to re-launch the slotsync worker more quickly for
> example when the slotsync worker restarts due to parameter changes.
> IIUC SloSyncWorkerCanRestart() doesn't consider the fact that the
> slotsync worker previously exited with 0 or 1.

Modified this in v75. As you suggested in [1], we reset
last_start_time on GUC change before proc_exit, so that the postmaster
restarts the worker immediately without waiting.
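
A rough sketch of that idea (the helper names below are hypothetical, not
the actual patch symbols):

/* Hypothetical sketch: exit on a relevant GUC change without incurring
 * the usual restart delay. */
if (ConfigReloadPending)
{
	ConfigReloadPending = false;
	ProcessConfigFile(PGC_SIGHUP);

	if (slotsync_relevant_guc_changed())	/* hypothetical helper */
	{
		/*
		 * Clear the recorded last-start time so that the postmaster
		 * relaunches the worker immediately instead of waiting for
		 * SLOTSYNC_RESTART_INTERVAL_SEC to elapse.
		 */
		SlotSyncWorkerResetStartTime();		/* hypothetical helper */
		proc_exit(0);
	}
}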

> ---
> +   /* We are a normal standby */
> +   valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
> +   Assert(!isnull);
>
> What do you mean by "normal standby"?
>
> ---
> +   appendStringInfo(,
> +"SELECT pg_is_in_recovery(), count(*) = 1"
> +" FROM pg_replication_slots"
> +" WHERE slot_type='physical' AND slot_name=%s",
> +quote_literal_cstr(PrimarySlotName));
>
> I think we need to make "pg_replication_slots" schema-qualified.

Modified.
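
For reference, the schema-qualified form would look roughly like this (the
StringInfo variable name is assumed, since it was lost in the quoted hunk
above):

appendStringInfo(&cmd,
				 "SELECT pg_is_in_recovery(), count(*) = 1"
				 " FROM pg_catalog.pg_replication_slots"
				 " WHERE slot_type='physical' AND slot_name=%s",
				 quote_literal_cstr(PrimarySlotName));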

> ---
> +   errdetail("The primary server slot \"%s\" specified by"
> + " \"%s\" is not valid.",
> + PrimarySlotName, "primary_slot_name"));
>
> and
>
> +   errmsg("slot sync worker will shutdown because"
> +  " %s is disabled", "enable_syncslot"));
>
> It's better to write it in one line for better greppability.

Modified.

[1]: 
https://www.postgresql.org/message-id/CAD21AoAv6FwZ6UPNTj6%3D7A%2B3O2m4utzfL8ZGS6X1EGexikG66A%40mail.gmail.com

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-02-01 Thread shveta malik
On Thu, Feb 1, 2024 at 11:21 AM Peter Smith  wrote:
>
> Here are some review comments for v740001.

Thanks Peter for the feedback.

> ==
> src/sgml/logicaldecoding.sgml
>
> 1.
> +   
> +Replication Slot Synchronization
> +
> + A logical replication slot on the primary can be synchronized to the hot
> + standby by enabling the failover option during slot
> + creation and setting
> +  linkend="guc-enable-syncslot">enable_syncslot
> + on the standby. For the synchronization
> + to work, it is mandatory to have a physical replication slot between the
> + primary and the standby, and
> +  linkend="guc-hot-standby-feedback">hot_standby_feedback
> + must be enabled on the standby. It is also necessary to specify a valid
> + dbname in the
> +  linkend="guc-primary-conninfo">primary_conninfo
> + string, which is used for slot synchronization and is ignored
> for streaming.
> +
>
> IMO we don't need to repeat that last part ", which is used for slot
> synchronization and is ignored for streaming." because that is a
> detail about the primary_conninfo GUC, and the same information is
> already described in that GUC section.

Modified in v75.

> ==
>
> 2. ALTER_REPLICATION_SLOT slot_name ( option [, ...] ) #
>
>   
> -  If true, the slot is enabled to be synced to the standbys.
> +  If true, the slot is enabled to be synced to the standbys
> +  so that logical replication can be resumed after failover.
>   
>
> This also should have the sentence "The default is false.", e.g. the
> same as the same option in CREATE_REPLICATION_SLOT says.

I have not added this. I feel the default-value-related details should
be present in the 'CREATE' part; they are not meaningful for the "ALTER"
part. ALTER does not have any defaults; it just modifies the options
given by the user.

> ==
> synchronize_one_slot
>
> 3.
> + /*
> + * Make sure that concerned WAL is received and flushed before syncing
> + * slot to target lsn received from the primary server.
> + *
> + * This check will never pass if on the primary server, user has
> + * configured standby_slot_names GUC correctly, otherwise this can hit
> + * frequently.
> + */
> + latestFlushPtr = GetStandbyFlushRecPtr(NULL);
> + if (remote_slot->confirmed_lsn > latestFlushPtr)
>
> BEFORE
> This check will never pass if on the primary server, user has
> configured standby_slot_names GUC correctly, otherwise this can hit
> frequently.
>
> SUGGESTION (simpler way to say the same thing?)
> This will always be the case unless the standby_slot_names GUC is not
> correctly configured on the primary server.

That is not true. It will not hit this condition "always" but has a higher
chance of hitting it when standby_slot_names is not configured. I think
you meant 'unless the standby_slot_names GUC is correctly configured'.
I feel the current comment gives clearer (less confusing) information and
thus I have not changed it for the time being. I can reconsider if I get
more comments there.

> 4.
> + /* User created slot with the same name exists, raise ERROR. */
>
> /User created/User-created/

Modified.

> ~~~
>
> 5. synchronize_slots, and also drop_obsolete_slots
>
> + /*
> + * Use shared lock to prevent a conflict with
> + * ReplicationSlotsDropDBSlots(), trying to drop the same slot while
> + * drop-database operation.
> + */
>
> (same code comment is in a couple of places)
>
> SUGGESTION (while -> during, etc.)
>
> Use a shared lock to prevent conflicting with
> ReplicationSlotsDropDBSlots() trying to drop the same slot during a
> drop-database operation.

Modified.

> ~~~
>
> 6. validate_parameters_and_get_dbname
>
> strcmp() just for the empty string "" might be overkill.
>
> 6a.
> + if (PrimarySlotName == NULL || strcmp(PrimarySlotName, "") == 0)
>
> SUGGESTION
> if (PrimarySlotName == NULL || *PrimarySlotName == '\0')
>
> ~~
>
> 6b.
> + if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0)
>
> SUGGESTION
> if (PrimaryConnInfo == NULL || *PrimaryConnInfo == '\0')

Modified.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Amit Kapila
On Wed, Jan 31, 2024 at 10:40 AM Peter Smith  wrote:
>
> On Wed, Jan 31, 2024 at 2:18 PM Amit Kapila  wrote:
> >
> > >
> > > I think is correct to say all those *other* properties (create_slot,
> > > enabled, copy_data) are forced to false because those otherwise have
> > > default true values.
> > >
> >
> > So, won't when connect=false, the user has to explicitly provide such
> > values (create_slot, enabled, etc.) as false? If so, is using 'force'
> > strictly correct?
>
> Perhaps the original docs text could be worded differently; I think
> the word "force" here just meant setting connection=false
> forces/causes/makes those other options behave "as if" they had been
> set to false without the user explicitly doing anything to them.
>

Okay, I see your point. Let's remove the 'failover' from this part of
the sentence.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Amit Kapila
On Thu, Jan 25, 2024 at 11:26 AM Bertrand Drouvot
 wrote:
>
> On Wed, Jan 24, 2024 at 04:09:15PM +0530, shveta malik wrote:
> > On Wed, Jan 24, 2024 at 2:38 PM Bertrand Drouvot
> >  wrote:
> > >
> > > I also see Sawada-San's point and I'd vote for "sync_replication_slots". 
> > > Then for
> > > the current feature I think "failover" and "on" should be the values to 
> > > turn the
> > > feature on (assuming "on" would mean "all kind of supported slots").
> >
> > Even if others agree and we change this GUC name to
> > "sync_replication_slots", I feel we should keep the values as "on" and
> > "off" currently, where "on" would mean 'sync failover slots' (docs can
> > state that clearly).
>
> I gave more thoughts on it and I think the values should only be "failover" or
> "off".
>
> The reason is that if we allow "on" and change the "on" behavior in future
> versions (to support more than failover slots) then that would change the 
> behavior
> for the ones that used "on".
>

I again thought on this point and feel that even if we start to sync
say physical slots their purpose would also be to allow
failover/switchover, otherwise, there is no use of syncing the slots.
So, by that theory, we can just go for naming it as
sync_failover_slots or simply sync_slots with values 'off' and 'on'.
Now, if these are used for switchover then there is an argument that
adding 'failover' in the GUC name could be confusing but I feel
'failover' is used widely enough that it shouldn't be a problem for
users to understand, otherwise, we can go with simple name like
sync_slots as well.

Thoughts?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-02-01 Thread Amit Kapila
On Wed, Jan 31, 2024 at 9:20 PM Masahiko Sawada  wrote:
>
> On Wed, Jan 31, 2024 at 7:42 PM Amit Kapila  wrote:
> >
> > On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada  
> > wrote:
> > >
> > > Thank you for updating the patches. As for the slotsync worker patch,
> > > is there any reason why 0001, 0002, and 0004 patches are still
> > > separated?
> > >
> >
> > No specific reason, it could be easier to review those parts.
>
> Okay, I think we can merge 0001 and 0002 at least as we don't need
> bgworker codes.
>

Agreed, and I am fine with merging 0001, 0002, and 0004 as suggested
by you, though I have a few minor comments on 0002 and 0004. I was
thinking about what would be a logical way to split the slot sync
worker patch (the combined result of 0001, 0002, and 0004), and one idea
that occurred to me is that we can have the first patch as the
synchronize_slots() API and the functionality required to implement
that API; then the second patch would be a slot sync worker which uses
that API to synchronize slots and does all the required validations.
Any thoughts?

A few minor comments on 0002 and 0004:

1. The comments above HandleChildCrash() should mention the slot sync worker.

2.
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -42,6 +42,7 @@
 #include "replication/slot.h"
 #include "replication/syncrep.h"
 #include "replication/walsender.h"
+#include "replication/logicalworker.h"

...
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -43,6 +43,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/postmaster.h"
 #include "replication/slot.h"
+#include "replication/logicalworker.h"

These new includes don't appear to be in alphabetical order.

3.
+ /* We can not have logical without replication */
+ if (!replication)
+ Assert(!logical);

I think we can cover both these conditions via a single Assert, e.g.:
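
/* We cannot have logical without replication */
Assert(replication || !logical);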

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-31 Thread Peter Smith
Here are some review comments for v740001.

==
src/sgml/logicaldecoding.sgml

1.
+   
+Replication Slot Synchronization
+
+ A logical replication slot on the primary can be synchronized to the hot
+ standby by enabling the failover option during slot
+ creation and setting
+ enable_syncslot
+ on the standby. For the synchronization
+ to work, it is mandatory to have a physical replication slot between the
+ primary and the standby, and
+ hot_standby_feedback
+ must be enabled on the standby. It is also necessary to specify a valid
+ dbname in the
+ primary_conninfo
+ string, which is used for slot synchronization and is ignored
for streaming.
+

IMO we don't need to repeat that last part ", which is used for slot
synchronization and is ignored for streaming." because that is a
detail about the primary_conninfo GUC, and the same information is
already described in that GUC section.

==

2. ALTER_REPLICATION_SLOT slot_name ( option [, ...] ) #

  
-  If true, the slot is enabled to be synced to the standbys.
+  If true, the slot is enabled to be synced to the standbys
+  so that logical replication can be resumed after failover.
  

This also should have the sentence "The default is false.", e.g. the
same as the same option in CREATE_REPLICATION_SLOT says.

==
synchronize_one_slot

3.
+ /*
+ * Make sure that concerned WAL is received and flushed before syncing
+ * slot to target lsn received from the primary server.
+ *
+ * This check will never pass if on the primary server, user has
+ * configured standby_slot_names GUC correctly, otherwise this can hit
+ * frequently.
+ */
+ latestFlushPtr = GetStandbyFlushRecPtr(NULL);
+ if (remote_slot->confirmed_lsn > latestFlushPtr)

BEFORE
This check will never pass if on the primary server, user has
configured standby_slot_names GUC correctly, otherwise this can hit
frequently.

SUGGESTION (simpler way to say the same thing?)
This will always be the case unless the standby_slot_names GUC is not
correctly configured on the primary server.

~~~

4.
+ /* User created slot with the same name exists, raise ERROR. */

/User created/User-created/

~~~

5. synchronize_slots, and also drop_obsolete_slots

+ /*
+ * Use shared lock to prevent a conflict with
+ * ReplicationSlotsDropDBSlots(), trying to drop the same slot while
+ * drop-database operation.
+ */

(same code comment is in a couple of places)

SUGGESTION (while -> during, etc.)

Use a shared lock to prevent conflicting with
ReplicationSlotsDropDBSlots() trying to drop the same slot during a
drop-database operation.

~~~

6. validate_parameters_and_get_dbname

strcmp() just for the empty string "" might be overkill.

6a.
+ if (PrimarySlotName == NULL || strcmp(PrimarySlotName, "") == 0)

SUGGESTION
if (PrimarySlotName == NULL || *PrimarySlotName == '\0')

~~

6b.
+ if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0)

SUGGESTION
if (PrimaryConnInfo == NULL || *PrimaryConnInfo == '\0')

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-01-31 Thread Ajin Cherian
On Tue, Jan 30, 2024 at 11:53 PM shveta malik 
wrote:

> On Tue, Jan 30, 2024 at 4:06 PM shveta malik 
> wrote:
> >
> > PFA v73-0001 which addresses the above comments. Other patches will be
> > rebased and posted after pushing this one.
>
> Since v73-0001 is pushed, PFA  rest of the patches. Changes are:
>
> 1) Rebased the patches.
> 2) Ran pg_indent on all.
> 3) patch001: Updated logicaldecoding.sgml for dbname requirement in
> primary_conninfo for slot-synchronization.
>
> thanks
> Shveta
>

Just to test the behaviour, I modified the code to make the failover flag
default to "true" while creating a subscription and ran the regression tests.
I only saw the expected errors.

1. Make check in the postgres root folder - all failures are because of
differences when listing subscriptions, as the failover flag is now enabled.
The regress diff is attached.

2. Make check in src/test/subscription - no failures. All tests successful.
Files=34, Tests=457, 81 wallclock secs ( 0.14 usr  0.05 sys +  9.53 cusr
13.00 csys = 22.72 CPU)
Result: PASS

3. Make check in src/test/recovery - 3 failures.

Test Summary Report
---
t/027_stream_regress.pl (Wstat: 256 Tests: 6 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
t/035_standby_logical_decoding.pl   (Wstat: 7424 Tests: 8 Failed: 0)
  Non-zero exit status: 29
  Parse errors: No plan found in TAP output
t/050_standby_failover_slots_sync.pl (Wstat: 7424 Tests: 5 Failed: 0)
  Non-zero exit status: 29
  Parse errors: No plan found in TAP output

3a. Analysis of t/027_stream_regress.pl - 027 fails with the same issue
as "make check" in the postgres root folder (for which I attached the diffs).
027 is about running the standard regression tests with streaming
replication. Since the regression tests fail because listing subscriptions
now shows failover enabled, 027 also fails in the same way with streaming
replication.

3b. Analysis of t/035_standby_logical_decoding.pl - In this test case, they
attempt to create a subscription from the subscriber to the standby
##
# Test that we can subscribe on the standby with the publication # created
on the primary.
##

Now, this fails because creating a subscription on the standby with
failover enabled results in an error. I see the following error in the log:
2024-01-28 23:51:30.425 EST [23332] tap_sub STATEMENT:
 CREATE_REPLICATION_SLOT "tap_sub" LOGICAL pgoutput (FAILOVER, SNAPSHOT
'nothing')
2024-01-28 23:51:30.425 EST [23332] tap_sub ERROR:  cannot create
replication slot with failover enabled on the standby

I discussed this with Shveta and she agreed that this is the expected
behaviour, as we don't support failover to cascading standbys yet.
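
To make the failure mode concrete, the user-level reproduction is roughly the
following (connection string and object names are invented for illustration;
the ERROR text is the one from the log above):

    -- on the subscriber, with the connection pointing at the standby
    CREATE SUBSCRIPTION tap_sub CONNECTION 'host=standby_host dbname=postgres'
        PUBLICATION tap_pub WITH (failover = true);
    ERROR:  cannot create replication slot with failover enabled on the standby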

3c. Analysis of t/050_standby_failover_slots_sync.pl - This is a new test
case created for this patch. It creates a subscription without failover
enabled to make sure that a subscription with failover disabled does not
depend on sync on the standby, but this fails because we have failover
enabled by default.

In summary, I don't think these issues are actual bugs, but rather an
expected behaviour change.

regards,
Ajin Cherian
Fujitsu Australia


Re: Synchronizing slots from primary to standby

2024-01-31 Thread Amit Kapila
On Thu, Feb 1, 2024 at 8:15 AM Euler Taveira  wrote:
>
> On Mon, Jan 29, 2024, at 10:17 AM, Zhijie Hou (Fujitsu) wrote:
>
> Attach the V72-0001 which addressed above comments, other patches will be
> rebased and posted after pushing first patch. Thanks Shveta for helping 
> address
> the comments.
>
>
> While working on another patch I noticed a new NOTICE message:
>
> NOTICE:  changed the failover state of replication slot "foo" on publisher to 
> false
>
> I wasn't paying much attention to this thread; then I started reading the 2
> patches that were recently committed. The message above surprises me because
> pg_createsubscriber starts to emit this message. The reason is that it doesn't
> create the replication slot during the CREATE SUBSCRIPTION. Instead, it 
> creates
> the replication slot with failover = false and no such option is informed
> during CREATE SUBSCRIPTION which means it uses the default value (failover =
> false). I expect that I don't see any message because it is *not* changing the
> behavior. I was wrong. It doesn't check the failover state on publisher, it
> just executes walrcv_alter_slot() and emits a message.
>
> IMO if we are changing an outstanding property on node A from node B, node B
> already knows (or might know) about that behavior change (because it is 
> sending
> the command), however, node A doesn't (unless log_replication_commands = on --
> it is not the default).
>
> Do we really need this message as NOTICE?
>

The reason for adding this NOTICE was to keep it similar to other
Notice messages in these commands like create/drop slot. However, here
the difference is we may not have altered the slot as the property is
already the same as we want to set on the publisher. So, I am not sure
whether we should follow the existing behavior or just get rid of it.
And then do we remove similar NOTICE in AlterSubscription() as well?
Normally, I think NOTICE intends to let users know if we did anything
with slots while executing subscription commands. Does anyone else
have an opinion on this point?
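
For context, the existing NOTICEs being referred to look like this (the
subscription name is invented and the exact wording is from memory, so it may
differ slightly):

    DROP SUBSCRIPTION mysub;
    NOTICE:  dropped replication slot "mysub" on publisher
    DROP SUBSCRIPTION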

A related point, I think we can avoid setting the 'failover' property
in ReplicationSlotAlter() if it is not changed, the advantage is we
will avoid saving slots. OTOH, this won't be a frequent operation so
we can leave it as it is as well.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-31 Thread Amit Kapila
On Wed, Jan 31, 2024 at 9:20 PM Masahiko Sawada  wrote:
>
> On Wed, Jan 31, 2024 at 7:42 PM Amit Kapila  wrote:
> >
> >
> > Considering my previous point where we don't want to restart for a required
> > parameter change, isn't it better to avoid repeated restart (say when
> > the user gave an invalid dbname)? BTW, I think this restart interval
> > is added based on your previous complaint [1].
>
> I think it's useful that the slotsync worker restarts immediately when
> a required parameter is changed but waits to restart when it exits
> with an error. IIUC the apply worker does so; if it restarts due to a
> subscription parameter change, it resets the last-start time so that
> the launcher will restart it without waiting.
>

Agreed, this idea sounds good to me.

> >
> > >
> > > ---
> > > When I dropped a database on the primary that has a failover slot, I
> > > got the following logs on the standby:
> > >
> > > 2024-01-31 17:25:21.750 JST [1103933] FATAL:  replication slot "s" is
> > > active for PID 1103935
> > > 2024-01-31 17:25:21.750 JST [1103933] CONTEXT:  WAL redo at 0/3020D20
> > > for Database/DROP: dir 1663/16384
> > > 2024-01-31 17:25:21.751 JST [1103930] LOG:  startup process (PID
> > > 1103933) exited with exit code 1
> > >
> > > It seems that because the slotsync worker created the slot on the
> > > standby, the slot's active_pid is still valid.
> > >
> >
> > But we release the slot after sync. And we do take a shared lock on
> > the database to make the startup process wait for slotsync. There is
> > one gap which is that we don't reset active_pid for temp slots in
> > ReplicationSlotRelease(), so for temp slots such an error can occur
> > but OTOH, we immediately make the slot persistent after sync. As per
> > my understanding, it is only possible to get this error if the initial
> > sync doesn't happen and the slot remains temporary. Is that your case?
> > How did you reproduce this?
>
> I created a failover slot manually on the primary and dropped the
> database where the failover slot is created. So this would not happen
> in normal cases.
>

Right, it won't happen in normal cases (say for walsender). This can
happen in some cases even without this patch as noted in comments just
above active_pid check in ReplicationSlotsDropDBSlots(). Now, we need
to think whether we should just update the comments above active_pid
check to explain this case or try to engineer some solution for this
not-so-common case. I guess if we want a solution we need to stop
slotsync worker temporarily till the drop database WAL is applied or
something like that.

> BTW I've tested the following switch/fail-back scenario but it seems
> not to work fine. Am I missing something?
>
> Setup:
> node1 is the primary, node2 is the physical standby for node1, and
> node3 is the subscriber connecting to node1.
>
> Steps:
> 1. [node1]: create a table and a publication for the table.
> 2. [node2]: set enable_syncslot = on and start (to receive WALs from node1).
> 3. [node3]: create a subscription with failover = true for the publication.
> 4. [node2]: promote it to become the new primary.
> 5. [node3]: alter subscription to connect the new primary, node2.
> 6. [node1]: stop, set enable_syncslot = on (and other required
> parameters), then start as a new standby.
>
> Then I got the error "exiting from slot synchronization because same
> name slot "test_sub" already exists on the standby".
>
> The logical replication slot that was created on the old primary
> (node1) has been synchronized to the old standby (node2). Therefore on
> node2, the slot's "synced" field is true. However, once node1 starts
> as the new standby with slot synchronization, the slotsync worker
> cannot synchronize the slot because the slot's "synced" field on the
> primary is false.
>

Yeah, we avoided doing anything in this case because the user could
have manually created another slot with the same name on standby.
Unlike WAL, slots can be modified on the standby because we allow decoding on
the standby, so we can't allow overwriting the existing slots. We won't
be able to distinguish whether the existing slot was a slot that the
user wants to sync with primary or a slot created on standby to
perform decoding. I think in this case user first needs to drop the
slot on new standby. We probably need to document it as well unless we
decide to do something else. What do you think?
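
Concretely, before relying on slot synchronization on the old primary (node1
in the scenario above), the leftover slot would have to be removed there
first, e.g. (slot name taken from the scenario above):

    -- on node1, now acting as a standby of node2
    SELECT pg_drop_replication_slot('test_sub');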

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-31 Thread Euler Taveira
On Mon, Jan 29, 2024, at 10:17 AM, Zhijie Hou (Fujitsu) wrote:
> Attach the V72-0001 which addressed above comments, other patches will be
> rebased and posted after pushing first patch. Thanks Shveta for helping 
> address
> the comments.

While working on another patch I noticed a new NOTICE message:

NOTICE:  changed the failover state of replication slot "foo" on publisher to 
false

I wasn't paying much attention to this thread; then I started reading the 2
patches that were recently committed. The message above surprises me because
pg_createsubscriber starts to emit this message. The reason is that it doesn't
create the replication slot during the CREATE SUBSCRIPTION. Instead, it creates
the replication slot with failover = false and no such option is informed
during CREATE SUBSCRIPTION which means it uses the default value (failover =
false). I expect that I don't see any message because it is *not* changing the
behavior. I was wrong. It doesn't check the failover state on publisher, it
just executes walrcv_alter_slot() and emits a message.

IMO if we are changing an outstanding property on node A from node B, node B
already knows (or might know) about that behavior change (because it is sending
the command), however, node A doesn't (unless log_replication_commands = on --
it is not the default).

Do we really need this message as NOTICE? I would set it to DEBUG1 if it is
worth or even remove it (if we consider there are other ways to obtain the same
information).


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Synchronizing slots from primary to standby

2024-01-31 Thread Masahiko Sawada
On Wed, Jan 31, 2024 at 7:42 PM Amit Kapila  wrote:
>
> On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada  wrote:
> >
> > Thank you for updating the patches. As for the slotsync worker patch,
> > is there any reason why 0001, 0002, and 0004 patches are still
> > separated?
> >
>
> No specific reason, it could be easier to review those parts.

Okay, I think we can merge 0001 and 0002 at least as we don't need
bgworker codes.

>
> >
> > Beside, here are some comments on v74 0001, 0002, and 0004 patches:
> >
> > ---
> > +static char *
> > +wait_for_valid_params_and_get_dbname(void)
> > +{
> > +   char   *dbname;
> > +   int rc;
> > +
> > +   /* Sanity check. */
> > +   Assert(enable_syncslot);
> > +
> > +   for (;;)
> > +   {
> > +   if (validate_parameters_and_get_dbname())
> > +   break;
> > +   ereport(LOG, errmsg("skipping slot synchronization"));
> > +
> > +   ProcessSlotSyncInterrupts(NULL);
> >
> > When reading this function, I expected that the slotsync worker would
> > resume working once the parameters became valid, but it was not
> > correct. For example, if I changed hot_standby_feedback from off to
> > on, the slotsync worker reads the config file, exits, and then
> > restarts. Given that the slotsync worker ends up exiting on parameter
> > changes anyway, why do we want to have it wait for parameters to
> > become valid?
> >
>
> Right, the reason for waiting is to avoid repeated re-start of
> slotsync worker if the required parameter is not changed. To follow
> that, I think we should simply continue when the required parameter is
> changed and is valid. But, I think during actual slotsync, if
> connection_info is changed then there is no option but to restart.

Agreed.

> >
> > ---
> > +bool
> > +SlotSyncWorkerCanRestart(void)
> > +{
> > +#define SLOTSYNC_RESTART_INTERVAL_SEC 10
> > +
> >
> > IIUC depending on how busy the postmaster is and the timing, the user
> > could wait for 1 min to re-launch the slotsync worker. But I think the
> > user might want to re-launch the slotsync worker more quickly for
> > example when the slotsync worker restarts due to parameter changes.
> > IIUC SlotSyncWorkerCanRestart() doesn't consider the fact that the
> > slotsync worker previously exited with 0 or 1.
> >
>
> Considering my previous point where we don't want to restart for a required
> parameter change, isn't it better to avoid repeated restart (say when
> the user gave an invalid dbname)? BTW, I think this restart interval
> is added based on your previous complaint [1].

I think it's useful that the slotsync worker restarts immediately when
a required parameter is changed but waits to restart when it exits
with an error. IIUC the apply worker does so; if it restarts due to a
subscription parameter change, it resets the last-start time so that
the launcher will restart it without waiting. But if it exits with an
error, the launcher waits for wal_retrieve_retry_interval. I don't
think the slotsync worker must follow this behavior but I feel it's
useful behavior.

>
> >
> > ---
> > When I dropped a database on the primary that has a failover slot, I
> > got the following logs on the standby:
> >
> > 2024-01-31 17:25:21.750 JST [1103933] FATAL:  replication slot "s" is
> > active for PID 1103935
> > 2024-01-31 17:25:21.750 JST [1103933] CONTEXT:  WAL redo at 0/3020D20
> > for Database/DROP: dir 1663/16384
> > 2024-01-31 17:25:21.751 JST [1103930] LOG:  startup process (PID
> > 1103933) exited with exit code 1
> >
> > It seems that because the slotsync worker created the slot on the
> > standby, the slot's active_pid is still valid.
> >
>
> But we release the slot after sync. And we do take a shared lock on
> the database to make the startup process wait for slotsync. There is
> one gap which is that we don't reset active_pid for temp slots in
> ReplicationSlotRelease(), so for temp slots such an error can occur
> but OTOH, we immediately make the slot persistent after sync. As per
> my understanding, it is only possible to get this error if the initial
> sync doesn't happen and the slot remains temporary. Is that your case?
> How did you reproduce this?

I created a failover slot manually on the primary and dropped the
database where the failover slot is created. So this would not happen
in normal cases.

BTW I've tested the following switch/fail-back scenario but it seems
not to work fine. Am I missing something?

Setup:
node1 is the primary, node2 is the physical standby for node1, and
node3 is the subscriber connecting to node1.

Steps:
1. [node1]: create a table and a publication for the table.
2. [node2]: set enable_syncslot = on and start (to receive WALs from node1).
3. [node3]: create a subscription with failover = true for the publication.
4. [node2]: promote it to become the new primary.
5. [node3]: alter subscription to connect the new primary, node2.
6. [node1]: stop, set enable_syncslot = on (and other required
parameters), then start as a new standby.
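
For reference, the scenario above maps roughly to the following (connection
strings, settings and object names are illustrative only):

    -- node1 (primary):
    CREATE TABLE t (a int);
    CREATE PUBLICATION pub FOR TABLE t;
    -- node3 (subscriber):
    CREATE SUBSCRIPTION test_sub CONNECTION 'host=node1 dbname=postgres'
        PUBLICATION pub WITH (failover = true);
    -- node2: pg_ctl promote            (node2 becomes the new primary)
    -- node3:
    ALTER SUBSCRIPTION test_sub CONNECTION 'host=node2 dbname=postgres';
    -- node1: restart with enable_syncslot = on, hot_standby_feedback = on and
    --        primary_conninfo (including dbname) pointing at node2, as a standby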

Then I got the error 

Re: Synchronizing slots from primary to standby

2024-01-31 Thread Amit Kapila
On Wed, Jan 31, 2024 at 2:02 PM Masahiko Sawada  wrote:
>
> Thank you for updating the patches. As for the slotsync worker patch,
> is there any reason why 0001, 0002, and 0004 patches are still
> separated?
>

No specific reason, it could be easier to review those parts.

>
> Beside, here are some comments on v74 0001, 0002, and 0004 patches:
>
> ---
> +static char *
> +wait_for_valid_params_and_get_dbname(void)
> +{
> +   char   *dbname;
> +   int rc;
> +
> +   /* Sanity check. */
> +   Assert(enable_syncslot);
> +
> +   for (;;)
> +   {
> +   if (validate_parameters_and_get_dbname())
> +   break;
> +   ereport(LOG, errmsg("skipping slot synchronization"));
> +
> +   ProcessSlotSyncInterrupts(NULL);
>
> When reading this function, I expected that the slotsync worker would
> resume working once the parameters became valid, but it was not
> correct. For example, if I changed hot_standby_feedback from off to
> on, the slotsync worker reads the config file, exits, and then
> restarts. Given that the slotsync worker ends up exiting on parameter
> changes anyway, why do we want to have it wait for parameters to
> become valid?
>

Right, the reason for waiting is to avoid repeated re-start of
slotsync worker if the required parameter is not changed. To follow
that, I think we should simply continue when the required parameter is
changed and is valid. But, I think during actual slotsync, if
connection_info is changed then there is no option but to restart.

>
> ---
> +bool
> +SlotSyncWorkerCanRestart(void)
> +{
> +#define SLOTSYNC_RESTART_INTERVAL_SEC 10
> +
>
> IIUC depending on how busy the postmaster is and the timing, the user
> could wait for 1 min to re-launch the slotsync worker. But I think the
> user might want to re-launch the slotsync worker more quickly for
> example when the slotsync worker restarts due to parameter changes.
> IIUC SlotSyncWorkerCanRestart() doesn't consider the fact that the
> slotsync worker previously exited with 0 or 1.
>

Considering my previous point where we don't want to restart for a required
parameter change, isn't it better to avoid repeated restart (say when
the user gave an invalid dbname)? BTW, I think this restart interval
is added based on your previous complaint [1].

>
> ---
> When I dropped a database on the primary that has a failover slot, I
> got the following logs on the standby:
>
> 2024-01-31 17:25:21.750 JST [1103933] FATAL:  replication slot "s" is
> active for PID 1103935
> 2024-01-31 17:25:21.750 JST [1103933] CONTEXT:  WAL redo at 0/3020D20
> for Database/DROP: dir 1663/16384
> 2024-01-31 17:25:21.751 JST [1103930] LOG:  startup process (PID
> 1103933) exited with exit code 1
>
> It seems that because the slotsync worker created the slot on the
> standby, the slot's active_pid is still valid.
>

But we release the slot after sync. And we do take a shared lock on
the database to make the startup process wait for slotsync. There is
one gap which is that we don't reset active_pid for temp slots in
ReplicationSlotRelease(), so for temp slots such an error can occur
but OTOH, we immediately make the slot persistent after sync. As per
my understanding, it is only possible to get this error if the initial
sync doesn't happen and the slot remains temporary. Is that your case?
How did you reproduce this?

> That is why the startup
> process could not drop the slot.
>

[1] - 
https://www.postgresql.org/message-id/CAD21AoApGoTZu7D_7%3DbVYQqKnj%2BPZ2Rz%2Bnc8Ky1HPQMS_XL6%2BA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-31 Thread Masahiko Sawada
On Tue, Jan 30, 2024 at 9:53 PM shveta malik  wrote:
>
> On Tue, Jan 30, 2024 at 4:06 PM shveta malik  wrote:
> >
> > PFA v73-0001 which addresses the above comments. Other patches will be
> > rebased and posted after pushing this one.
>
> Since v73-0001 is pushed, PFA  rest of the patches. Changes are:
>
> 1) Rebased the patches.
> 2) Ran pg_indent on all.
> 3) patch001: Updated logicaldecoding.sgml for dbname requirement in
> primary_conninfo for slot-synchronization.
>

Thank you for updating the patches. As for the slotsync worker patch,
is there any reason why 0001, 0002, and 0004 patches are still
separated?

Beside, here are some comments on v74 0001, 0002, and 0004 patches:

---
+static char *
+wait_for_valid_params_and_get_dbname(void)
+{
+   char   *dbname;
+   int rc;
+
+   /* Sanity check. */
+   Assert(enable_syncslot);
+
+   for (;;)
+   {
+   if (validate_parameters_and_get_dbname())
+   break;
+   ereport(LOG, errmsg("skipping slot synchronization"));
+
+   ProcessSlotSyncInterrupts(NULL);

When reading this function, I expected that the slotsync worker would
resume working once the parameters became valid, but it was not
correct. For example, if I changed hot_standby_feedback from off to
on, the slotsync worker reads the config file, exits, and then
restarts. Given that the slotsync worker ends up exiting on parameter
changes anyway, why do we want to have it wait for parameters to
become valid? IIUC even if the slotsync worker exits when a parameter
is not valid, it restarts at some intervals.

---
+bool
+SlotSyncWorkerCanRestart(void)
+{
+#define SLOTSYNC_RESTART_INTERVAL_SEC 10
+

IIUC depending on how busy the postmaster is and the timing, the user
could wait for 1 min to re-launch the slotsync worker. But I think the
user might want to re-launch the slotsync worker more quickly for
example when the slotsync worker restarts due to parameter changes.
IIUC SlotSyncWorkerCanRestart() doesn't consider the fact that the
slotsync worker previously exited with 0 or 1.

---
+   /* We are a normal standby */
+   valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
+   Assert(!isnull);

What do you mean by "normal standby"?

---
+   appendStringInfo(,
+"SELECT pg_is_in_recovery(), count(*) = 1"
+" FROM pg_replication_slots"
+" WHERE slot_type='physical' AND slot_name=%s",
+quote_literal_cstr(PrimarySlotName));

I think we need to make "pg_replication_slots" schema-qualified.
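
i.e. something like (the slot name is a placeholder):

    SELECT pg_is_in_recovery(), count(*) = 1
      FROM pg_catalog.pg_replication_slots
     WHERE slot_type = 'physical' AND slot_name = 'primary_slot_1';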

---
+   errdetail("The primary server slot \"%s\" specified by"
+ " \"%s\" is not valid.",
+ PrimarySlotName, "primary_slot_name"));

and

+   errmsg("slot sync worker will shutdown because"
+  " %s is disabled", "enable_syncslot"));

It's better to write it in one line for better greppability.

---
When I dropped a database on the primary that has a failover slot, I
got the following logs on the standby:

2024-01-31 17:25:21.750 JST [1103933] FATAL:  replication slot "s" is
active for PID 1103935
2024-01-31 17:25:21.750 JST [1103933] CONTEXT:  WAL redo at 0/3020D20
for Database/DROP: dir 1663/16384
2024-01-31 17:25:21.751 JST [1103930] LOG:  startup process (PID
1103933) exited with exit code 1

It seems that because the slotsync worker created the slot on the
standby, the slot's active_pid is still valid. That is why the startup
process could not drop the slot.

Regards,


--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




RE: Synchronizing slots from primary to standby

2024-01-30 Thread Zhijie Hou (Fujitsu)
On Wednesday, January 31, 2024 9:57 AM Peter Smith  
wrote:
> 
> Hi,
> 
> I saw that v73-0001 was pushed, but it included some last-minute
> changes that I did not get a chance to check yesterday.
> 
> Here are some review comments for the new parts of that patch.
> 
> ==
> doc/src/sgml/ref/create_subscription.sgml
> 
> 1.
> connect (boolean)
> 
> Specifies whether the CREATE SUBSCRIPTION command should connect
> to the publisher at all. The default is true. Setting this to false
> will force the values of create_slot, enabled, copy_data, and failover
> to false. (You cannot combine setting connect to false with setting
> create_slot, enabled, copy_data, or failover to true.)
> 
> ~
> 
> I don't think the first part "Setting this to false will force the
> values ... failover to false." is strictly correct.
> 
> I think is correct to say all those *other* properties (create_slot,
> enabled, copy_data) are forced to false because those otherwise have
> default true values. But the 'failover' has default false, so it
> cannot get force-changed at all because you can't set connect to false
> when failover is true as the second part ("You cannot combine...")
> explains.
> 
> IMO remove 'failover' from that first sentence.
> 
> 
> 3.
> dump can be restored without requiring network access to the remote
> servers.  It is then up to the user to reactivate the subscriptions in a
> suitable way.  If the involved hosts have changed, the connection
> -   information might have to be changed.  It might also be appropriate to
> +   information might have to be changed.  If the subscription needs to
> +   be enabled for
> +   failover,
> +   then same needs to be done by executing
> +   
> +   ALTER SUBSCRIPTION ... SET(failover = true)
> +   after the slot has been created.  It might also be appropriate to
> 
> "then same needs to be done" (English?)
> 
> BEFORE
> If the subscription needs to be enabled for failover, then same needs
> to be done by executing ALTER SUBSCRIPTION ... SET(failover = true)
> after the slot has been created.
> 
> SUGGESTION
> If the subscription needs to be enabled for failover, execute ALTER
> SUBSCRIPTION ... SET(failover = true) after the slot has been created.
> 
> ==
> src/backend/commands/subscriptioncmds.c
> 
> 4.
>  #define SUBOPT_RUN_AS_OWNER 0x1000
> -#define SUBOPT_LSN 0x2000
> -#define SUBOPT_ORIGIN 0x4000
> +#define SUBOPT_FAILOVER 0x2000
> +#define SUBOPT_LSN 0x4000
> +#define SUBOPT_ORIGIN 0x8000
> +
> 
> A spurious blank line was added.
> 

Here is a small patch to address comments 3 and 4.
The discussion for comment 1 is still going on, so we can
update the patch once it's concluded.

Best Regards,
Hou zj



0001-clean-up-for-776621a5.patch
Description: 0001-clean-up-for-776621a5.patch


Re: Synchronizing slots from primary to standby

2024-01-30 Thread Peter Smith
On Wed, Jan 31, 2024 at 2:18 PM Amit Kapila  wrote:
>
> On Wed, Jan 31, 2024 at 7:27 AM Peter Smith  wrote:
> >
> > I saw that v73-0001 was pushed, but it included some last-minute
> > changes that I did not get a chance to check yesterday.
> >
> > Here are some review comments for the new parts of that patch.
> >
> > ==
> > doc/src/sgml/ref/create_subscription.sgml
> >
> > 1.
> > connect (boolean)
> >
> > Specifies whether the CREATE SUBSCRIPTION command should connect
> > to the publisher at all. The default is true. Setting this to false
> > will force the values of create_slot, enabled, copy_data, and failover
> > to false. (You cannot combine setting connect to false with setting
> > create_slot, enabled, copy_data, or failover to true.)
> >
> > ~
> >
> > I don't think the first part "Setting this to false will force the
> > values ... failover to false." is strictly correct.
> >
> > I think is correct to say all those *other* properties (create_slot,
> > enabled, copy_data) are forced to false because those otherwise have
> > default true values.
> >
>
> So, when connect=false, won't the user have to explicitly provide such
> values (create_slot, enabled, etc.) as false? If so, is using 'force'
> strictly correct?

Perhaps the original docs text could be worded differently; I think
the word "force" here just meant setting connection=false
forces/causes/makes those other options behave "as if" they had been
set to false without the user explicitly doing anything to them. The
point is they are made to behave *differently* to their normal
defaults.

So,
connect=false ==> this actually sets enabled=false (you can see this
with \dRs+), which is different to the default setting of 'enabled'
connect=false ==> will not create a slot (because there is no
connection), which is different to the default behaviour for
'create-slot'
connect=false ==> will not copy tables (because there is no
connection), which is different to the default behaviour for
'copy_data'

OTOH, failover is different
connect=false ==> failover is not possible (because there is no
connection), but the 'failover' default is false anyway, so no change
to the default behaviour for this one.
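
For example (publication/subscription names are invented; the WARNING/HINT
text is the existing one quoted elsewhere in this thread):

    CREATE SUBSCRIPTION mysub CONNECTION 'dbname=postgres' PUBLICATION mypub
        WITH (connect = false);
    WARNING:  subscription was created, but is not connected
    HINT:  To initiate replication, you must manually create the replication
    slot, enable the subscription, and refresh the subscription.
    CREATE SUBSCRIPTION
    -- \dRs+ now shows the subscription with Enabled = f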

>
> > ~~~
> >
> > 2.
> >   
> >Since no connection is made when this option is
> >false, no tables are subscribed. To initiate
> >replication, you must manually create the replication slot, 
> > enable
> > -  the subscription, and refresh the subscription. See
> > +  the failover if required, enable the subscription, and refresh 
> > the
> > +  subscription. See
> > > linkend="logical-replication-subscription-examples-deferred-slot"/>
> >for examples.
> >   
> >
> > IMO "see the failover if required" is very vague. See what failover?
> >
>
> AFAICS, the committed docs says: "To initiate replication, you must
> manually create the replication slot, enable the failover if required,
> ...". I am not sure what you are referring to.
>

My mistake. I was misreading the patch code without applying the
patch. Sorry for the noise.

> >
> > ==
> > src/bin/pg_upgrade/t/004_subscription.pl
> >
> > 5.
> > -# The subscription's running status should be preserved. Old subscription
> > -# regress_sub1 should be enabled and old subscription regress_sub2 should 
> > be
> > -# disabled.
> > +# The subscription's running status and failover option should be 
> > preserved.
> > +# Old subscription regress_sub1 should have enabled and failover as true 
> > while
> > +# old subscription regress_sub2 should have enabled and failover as false.
> >  $result =
> >$new_sub->safe_psql('postgres',
> > - "SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
> > -is( $result, qq(regress_sub1|t
> > -regress_sub2|f),
> > + "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER
> > BY subname");
> > +is( $result, qq(regress_sub1|t|t
> > +regress_sub2|f|f),
> >   "check that the subscription's running status are preserved");
> >
> > ~
> >
> > Calling those "old subscriptions" seems misleading. Aren't these the
> > new/upgraded subscriptions being checked here?
> >
>
> Again the quoted wording is not introduced by this patch. But, I see
> your point and it is better if you can start a separate thread for it.
>

OK. I created a separate thread for this [1]

==
[1] 
https://www.postgresql.org/message-id/cahut+pu1uslphrysptacy1k_q-ddsrxnfhmj_2u1nfqbc1y...@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Synchronizing slots from primary to standby

2024-01-30 Thread Amit Kapila
On Wed, Jan 31, 2024 at 7:27 AM Peter Smith  wrote:
>
> I saw that v73-0001 was pushed, but it included some last-minute
> changes that I did not get a chance to check yesterday.
>
> Here are some review comments for the new parts of that patch.
>
> ==
> doc/src/sgml/ref/create_subscription.sgml
>
> 1.
> connect (boolean)
>
> Specifies whether the CREATE SUBSCRIPTION command should connect
> to the publisher at all. The default is true. Setting this to false
> will force the values of create_slot, enabled, copy_data, and failover
> to false. (You cannot combine setting connect to false with setting
> create_slot, enabled, copy_data, or failover to true.)
>
> ~
>
> I don't think the first part "Setting this to false will force the
> values ... failover to false." is strictly correct.
>
> I think is correct to say all those *other* properties (create_slot,
> enabled, copy_data) are forced to false because those otherwise have
> default true values.
>

So, when connect=false, won't the user have to explicitly provide such
values (create_slot, enabled, etc.) as false? If so, is using 'force'
strictly correct?

> ~~~
>
> 2.
>   
>Since no connection is made when this option is
>false, no tables are subscribed. To initiate
>replication, you must manually create the replication slot, enable
> -  the subscription, and refresh the subscription. See
> +  the failover if required, enable the subscription, and refresh the
> +  subscription. See
> linkend="logical-replication-subscription-examples-deferred-slot"/>
>for examples.
>   
>
> IMO "see the failover if required" is very vague. See what failover?
>

AFAICS, the committed docs says: "To initiate replication, you must
manually create the replication slot, enable the failover if required,
...". I am not sure what you are referring to.

>
> ==
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 5.
> -# The subscription's running status should be preserved. Old subscription
> -# regress_sub1 should be enabled and old subscription regress_sub2 should be
> -# disabled.
> +# The subscription's running status and failover option should be preserved.
> +# Old subscription regress_sub1 should have enabled and failover as true 
> while
> +# old subscription regress_sub2 should have enabled and failover as false.
>  $result =
>$new_sub->safe_psql('postgres',
> - "SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
> -is( $result, qq(regress_sub1|t
> -regress_sub2|f),
> + "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER
> BY subname");
> +is( $result, qq(regress_sub1|t|t
> +regress_sub2|f|f),
>   "check that the subscription's running status are preserved");
>
> ~
>
> Calling those "old subscriptions" seems misleading. Aren't these the
> new/upgraded subscriptions being checked here?
>

Again the quoted wording is not introduced by this patch. But, I see
your point and it is better if you can start a separate thread for it.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-30 Thread Peter Smith
Hi,

I saw that v73-0001 was pushed, but it included some last-minute
changes that I did not get a chance to check yesterday.

Here are some review comments for the new parts of that patch.

==
doc/src/sgml/ref/create_subscription.sgml

1.
connect (boolean)

Specifies whether the CREATE SUBSCRIPTION command should connect
to the publisher at all. The default is true. Setting this to false
will force the values of create_slot, enabled, copy_data, and failover
to false. (You cannot combine setting connect to false with setting
create_slot, enabled, copy_data, or failover to true.)

~

I don't think the first part "Setting this to false will force the
values ... failover to false." is strictly correct.

I think it is correct to say all those *other* properties (create_slot,
enabled, copy_data) are forced to false because those otherwise have
default true values. But the 'failover' has default false, so it
cannot get force-changed at all because you can't set connect to false
when failover is true as the second part ("You cannot combine...")
explains.

IMO remove 'failover' from that first sentence.

~~~

2.
  
   Since no connection is made when this option is
   false, no tables are subscribed. To initiate
   replication, you must manually create the replication slot, enable
-  the subscription, and refresh the subscription. See
+  the failover if required, enable the subscription, and refresh the
+  subscription. See
   
   for examples.
  

IMO "see the failover if required" is very vague. See what failover?
The slot property failover, or the subscription option failover? And
"see" it for what purpose?

I think the intention was probably to say something like "ensure the
manually created slot has the same matching failover property value as
the subscriber failover option", but that is not clear from the
current text.

==
doc/src/sgml/ref/pg_dump.sgml

3.
dump can be restored without requiring network access to the remote
servers.  It is then up to the user to reactivate the subscriptions in a
suitable way.  If the involved hosts have changed, the connection
-   information might have to be changed.  It might also be appropriate to
+   information might have to be changed.  If the subscription needs to
+   be enabled for
+   failover,
+   then same needs to be done by executing
+   
+   ALTER SUBSCRIPTION ... SET(failover = true)
+   after the slot has been created.  It might also be appropriate to

"then same needs to be done" (English?)

BEFORE
If the subscription needs to be enabled for failover, then same needs
to be done by executing ALTER SUBSCRIPTION ... SET(failover = true)
after the slot has been created.

SUGGESTION
If the subscription needs to be enabled for failover, execute ALTER
SUBSCRIPTION ... SET(failover = true) after the slot has been created.
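
So the documented reactivation sequence would look roughly like this (object
names are invented; the slot has to be created on the publisher first):

    -- on the publisher:
    SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');
    -- on the subscriber:
    ALTER SUBSCRIPTION mysub SET (failover = true);
    ALTER SUBSCRIPTION mysub ENABLE;
    ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;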

==
src/backend/commands/subscriptioncmds.c

4.
 #define SUBOPT_RUN_AS_OWNER 0x1000
-#define SUBOPT_LSN 0x2000
-#define SUBOPT_ORIGIN 0x4000
+#define SUBOPT_FAILOVER 0x2000
+#define SUBOPT_LSN 0x4000
+#define SUBOPT_ORIGIN 0x8000
+

A spurious blank line was added.

==
src/bin/pg_upgrade/t/004_subscription.pl

5.
-# The subscription's running status should be preserved. Old subscription
-# regress_sub1 should be enabled and old subscription regress_sub2 should be
-# disabled.
+# The subscription's running status and failover option should be preserved.
+# Old subscription regress_sub1 should have enabled and failover as true while
+# old subscription regress_sub2 should have enabled and failover as false.
 $result =
   $new_sub->safe_psql('postgres',
- "SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
-is( $result, qq(regress_sub1|t
-regress_sub2|f),
+ "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER
BY subname");
+is( $result, qq(regress_sub1|t|t
+regress_sub2|f|f),
  "check that the subscription's running status are preserved");

~

Calling those "old subscriptions" seems misleading. Aren't these the
new/upgraded subscriptions being checked here?

Should the comment be more like:

# The subscription's running status and failover option should be preserved.
# Upgraded regress_sub1 should still have enabled and failover as true.
# Upgraded regress_sub2 should still have enabled and failover as false.

==
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Synchronizing slots from primary to standby

2024-01-30 Thread Bertrand Drouvot
Hi,

On Mon, Jan 29, 2024 at 09:15:57AM +, Bertrand Drouvot wrote:
> Hi,
> 
> On Mon, Jan 29, 2024 at 02:35:52PM +0530, Amit Kapila wrote:
> > I think it is better to create a separate patch for two_phase after
> > this patch gets committed.
> 
> Yeah, makes sense, will do, thanks!

It's done in [1].

[1]: 
https://www.postgresql.org/message-id/ZbkYrLPhH%2BRxpZlW%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-30 Thread shveta malik
On Tue, Jan 30, 2024 at 11:31 AM Amit Kapila  wrote:
>
> In this regard, I feel we don't need to dump/restore the 'FAILOVER'
> option non-binary upgrade paths similar to the 'ENABLE' option. For
> binary upgrade, if the failover option is enabled, then we can enable
> it using Alter Subscription SET (failover=true). Let's add one test
> corresponding to this behavior in
> postgresql\src\bin\pg_upgrade\t\004_subscription.

Changed pg_dump behaviour as suggested and added additional test.

> Additionally, we need to update the pg_dump docs for the 'failover'
> option. See "When dumping logical replication subscriptions, .." [1].
> I think we also need to update the connect option docs in CREATE
> SUBSCRIPTION [2].

Updated docs.

> [1] - https://www.postgresql.org/docs/devel/app-pgdump.html
> [2] - https://www.postgresql.org/docs/devel/sql-createsubscription.html

PFA v73-0001 which addresses the above comments. Other patches will be
rebased and posted after pushing this one. Thanks Hou-San for adding
pg_upgrade test for failover.

thanks
Shveta


v73-0001-Add-a-failover-option-to-subscriptions.patch
Description: Binary data


Re: Synchronizing slots from primary to standby

2024-01-29 Thread Amit Kapila
On Mon, Jan 29, 2024 at 6:47 PM Zhijie Hou (Fujitsu)
 wrote:
>
> On Monday, January 29, 2024 7:30 PM Amit Kapila  
> wrote:
> >
>
> > ===
> > 1.
> > parse_subscription_options()
> > {
> > ...
> > /*
> > * We've been explicitly asked to not connect, that requires some
> > * additional processing.
> > */
> > if (!opts->connect && IsSet(supported_opts, SUBOPT_CONNECT)) {
> >
> > Here, along with other options, we need an explicit check for failover, so 
> > that if
> > connect=false and failover=true, the statement should give error. I was
> > expecting the below statement to fail but it passed with WARNING.
> > postgres=# create subscription sub2 connection 'dbname=postgres'
> > publication pub2 with(connect=false, failover=true);
> > WARNING:  subscription was created, but is not connected
> > HINT:  To initiate replication, you must manually create the replication 
> > slot,
> > enable the subscription, and refresh the subscription.
> > CREATE SUBSCRIPTION
>
> Added.
>

In this regard, I feel we don't need to dump/restore the 'FAILOVER'
option in non-binary upgrade paths, similar to the 'ENABLE' option. For
binary upgrade, if the failover option is enabled, then we can enable
it using Alter Subscription SET (failover=true). Let's add one test
corresponding to this behavior in
postgresql\src\bin\pg_upgrade\t\004_subscription.

Additionally, we need to update the pg_dump docs for the 'failover'
option. See "When dumping logical replication subscriptions, .." [1].
I think we also need to update the connect option docs in CREATE
SUBSCRIPTION [2].

[1] - https://www.postgresql.org/docs/devel/app-pgdump.html
[2] - https://www.postgresql.org/docs/devel/sql-createsubscription.html

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-29 Thread Amit Kapila
On Tue, Jan 30, 2024 at 7:29 AM Peter Smith  wrote:
>
> Here are some review comments for v72-0001
>
> ==
> doc/src/sgml/ref/alter_subscription.sgml
>
> 1.
> +  parameter value of the subscription. Otherwise, the slot on the
> +  publisher may behave differently from what subscription's
> +   linkend="sql-createsubscription-params-with-failover">failover
> +  option says. The slot on the publisher could either be
> +  synced to the standbys even when the subscription's
> +   linkend="sql-createsubscription-params-with-failover">failover
> +  option is disabled or could be disabled for sync
> +  even when the subscription's
> +   linkend="sql-createsubscription-params-with-failover">failover
> +  option is enabled.
> + 
>
> It is a bit wordy to keep saying "disabled/enabled"
>
> BEFORE
> The slot on the publisher could either be synced to the standbys even
> when the subscription's failover option is disabled or could be
> disabled for sync even when the subscription's failover option is
> enabled.
>
> SUGGESTION
> The slot on the publisher could be synced to the standbys even when
> the subscription's failover = false or may not be syncing even when
> the subscription's failover = true.
>

I think it is a matter of personal preference because I find the
existing wording in the patch easier to follow. So, I would like to
retain that as it is.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-29 Thread Peter Smith
Here are some review comments for v72-0001

==
doc/src/sgml/ref/alter_subscription.sgml

1.
+  parameter value of the subscription. Otherwise, the slot on the
+  publisher may behave differently from what subscription's
+  failover
+  option says. The slot on the publisher could either be
+  synced to the standbys even when the subscription's
+  failover
+  option is disabled or could be disabled for sync
+  even when the subscription's
+  failover
+  option is enabled.
+ 

It is a bit wordy to keep saying "disabled/enabled"

BEFORE
The slot on the publisher could either be synced to the standbys even
when the subscription's failover option is disabled or could be
disabled for sync even when the subscription's failover option is
enabled.

SUGGESTION
The slot on the publisher could be synced to the standbys even when
the subscription's failover = false or may not be syncing even when
the subscription's failover = true.

==
.../t/040_standby_failover_slots_sync.pl

2.
+# Enable subscription
+$subscriber1->safe_psql('postgres',
+ "ALTER SUBSCRIPTION regress_mysub1 ENABLE");
+
+# Disable failover for enabled subscription
+my ($result, $stdout, $stderr) = $subscriber1->psql('postgres',
+ "ALTER SUBSCRIPTION regress_mysub1 SET (failover = false)");
+ok( $stderr =~ /ERROR:  cannot set failover for enabled subscription/,
+ "altering failover is not allowed for enabled subscription");
+

Currently, those tests are under scope the big comment:

+##
+# Test that changing the failover property of a subscription updates the
+# corresponding failover property of the slot.
+##

But that comment is not quite relevant to these tests. So, add another
one just for these:

SUGGESTION:
##
# Test that cannot modify the failover option for enabled subscriptions.
##

==
Kind Regards,
Peter Smith.
Fujitsu Australia




RE: Synchronizing slots from primary to standby

2024-01-29 Thread Zhijie Hou (Fujitsu)
On Monday, January 29, 2024 9:17 PM Zhijie Hou (Fujitsu) 
 wrote:
> 
> On Monday, January 29, 2024 7:30 PM Amit Kapila 
> wrote:
> >
> > On Mon, Jan 29, 2024 at 3:11 PM shveta malik 
> > wrote:
> > >
> > > PFA v71 patch set with above changes.
> > >
> >
> > Few comments on 0001
> 
> Thanks for the comments.
> 
> > ===
> > 1.
> > parse_subscription_options()
> > {
> > ...
> > /*
> > * We've been explicitly asked to not connect, that requires some
> > * additional processing.
> > */
> > if (!opts->connect && IsSet(supported_opts, SUBOPT_CONNECT)) {
> >
> > Here, along with other options, we need an explicit check for
> > failover, so that if connect=false and failover=true, the statement
> > should give error. I was expecting the below statement to fail but it passed
> with WARNING.
> > postgres=# create subscription sub2 connection 'dbname=postgres'
> > publication pub2 with(connect=false, failover=true);
> > WARNING:  subscription was created, but is not connected
> > HINT:  To initiate replication, you must manually create the
> > replication slot, enable the subscription, and refresh the subscription.
> > CREATE SUBSCRIPTION
> 
> Added.
> 
> >
> > 2.
> > @@ -148,6 +153,10 @@ typedef struct Subscription
> >   List*publications; /* List of publication names to subscribe to */
> >   char*origin; /* Only publish data originating from the
> >   * specified origin */
> > + bool failover; /* True if the associated replication slots
> > + * (i.e. the main slot and the table sync
> > + * slots) in the upstream database are enabled
> > + * to be synchronized to the standbys. */
> >  } Subscription;
> >
> > Let's add this new field immediately after "bool runasowner;" as is
> > done for other boolean members. This will help avoid increasing the
> > size of the structure due to alignment when we add any new pointer
> > field in the future. Also, that would be consistent with what we do for 
> > other
> new boolean members.
> 
> Moved this field as suggested.
> 
> Attach the V72-0001 which addressed above comments, other patches will be
> rebased and posted after pushing first patch. Thanks Shveta for helping
> address the comments.

Apart from above comments. The new V72 patch also includes the followings 
changes.

1. Moved the test 'altering failover for enabled sub' to the tap-test where most
of the alter-sub behaviors are tested.

2. Rename the tap-test from 050_standby_failover_slots_sync.pl to
040_standby_failover_slots_sync.pl (the big number 050 was used to avoid
conflict with other newly committed tests). Also added the test to
meson.build, which was missed.

Best Regards,
Hou zj


RE: Synchronizing slots from primary to standby

2024-01-29 Thread Zhijie Hou (Fujitsu)
On Monday, January 29, 2024 7:30 PM Amit Kapila  wrote:
> 
> On Mon, Jan 29, 2024 at 3:11 PM shveta malik 
> wrote:
> >
> > PFA v71 patch set with above changes.
> >
> 
> Few comments on 0001

Thanks for the comments.

> ===
> 1.
> parse_subscription_options()
> {
> ...
> /*
> * We've been explicitly asked to not connect, that requires some
> * additional processing.
> */
> if (!opts->connect && IsSet(supported_opts, SUBOPT_CONNECT)) {
> 
> Here, along with other options, we need an explicit check for failover, so 
> that if
> connect=false and failover=true, the statement should give error. I was
> expecting the below statement to fail but it passed with WARNING.
> postgres=# create subscription sub2 connection 'dbname=postgres'
> publication pub2 with(connect=false, failover=true);
> WARNING:  subscription was created, but is not connected
> HINT:  To initiate replication, you must manually create the replication slot,
> enable the subscription, and refresh the subscription.
> CREATE SUBSCRIPTION

Added.

> 
> 2.
> @@ -148,6 +153,10 @@ typedef struct Subscription
>   List*publications; /* List of publication names to subscribe to */
>   char*origin; /* Only publish data originating from the
>   * specified origin */
> + bool failover; /* True if the associated replication slots
> + * (i.e. the main slot and the table sync
> + * slots) in the upstream database are enabled
> + * to be synchronized to the standbys. */
>  } Subscription;
> 
> Let's add this new field immediately after "bool runasowner;" as is done for
> other boolean members. This will help avoid increasing the size of the 
> structure
> due to alignment when we add any new pointer field in the future. Also, that
> would be consistent with what we do for other new boolean members.

Moved this field as suggested.

Attach the V72-0001 which addressed above comments, other patches will be
rebased and posted after pushing first patch. Thanks Shveta for helping address
the comments.

Best Regards,
Hou zj



v72-0001-Add-a-failover-option-to-subscriptions.patch
Description: v72-0001-Add-a-failover-option-to-subscriptions.patch


Re: Synchronizing slots from primary to standby

2024-01-29 Thread Amit Kapila
On Mon, Jan 29, 2024 at 3:11 PM shveta malik  wrote:
>
> PFA v71 patch set with above changes.
>

Few comments on 0001
===
1.
parse_subscription_options()
{
...
/*
* We've been explicitly asked to not connect, that requires some
* additional processing.
*/
if (!opts->connect && IsSet(supported_opts, SUBOPT_CONNECT))
{

Here, along with other options, we need an explicit check for
failover, so that if connect=false and failover=true, the statement
should give error. I was expecting the below statement to fail but it
passed with WARNING.
postgres=# create subscription sub2 connection 'dbname=postgres'
publication pub2 with(connect=false, failover=true);
WARNING:  subscription was created, but is not connected
HINT:  To initiate replication, you must manually create the
replication slot, enable the subscription, and refresh the
subscription.
CREATE SUBSCRIPTION
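
i.e. the expectation is that, after adding the check, the same statement
fails up front, presumably with a message along the lines of the existing
mutually-exclusive-option errors:

    CREATE SUBSCRIPTION sub2 CONNECTION 'dbname=postgres' PUBLICATION pub2
        WITH (connect = false, failover = true);
    ERROR:  connect = false and failover = true are mutually exclusive options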

2.
@@ -148,6 +153,10 @@ typedef struct Subscription
  List*publications; /* List of publication names to subscribe to */
  char*origin; /* Only publish data originating from the
  * specified origin */
+ bool failover; /* True if the associated replication slots
+ * (i.e. the main slot and the table sync
+ * slots) in the upstream database are enabled
+ * to be synchronized to the standbys. */
 } Subscription;

Let's add this new field immediately after "bool runasowner;" as is
done for other boolean members. This will help avoid increasing the
size of the structure due to alignment when we add any new pointer
field in the future. Also, that would be consistent with what we do
for other new boolean members.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-29 Thread Bertrand Drouvot
Hi,

On Mon, Jan 29, 2024 at 02:35:52PM +0530, Amit Kapila wrote:
> On Mon, Jan 29, 2024 at 2:22 PM Bertrand Drouvot
>  wrote:
> > Looking at 0001:
> >
> > +  When altering the
> > +   > linkend="sql-createsubscription-params-with-slot-name">slot_name,
> > +  the failover property value of the named slot may 
> > differ from the
> > +   > linkend="sql-createsubscription-params-with-failover">failover
> > +  parameter specified in the subscription. When creating the slot,
> > +  ensure the slot failover property matches the
> > +   > linkend="sql-createsubscription-params-with-failover">failover
> > +  parameter value of the subscription. Otherwise, the slot on the 
> > publisher may
> > +  not be enabled to be synced to standbys.
> >
> > Not related to this patch series but while at it shouldn't we also add a few
> > words about two_phase too? (I mean ensure the slot property matchs the
> > subscription one).
> >
> > Or would it be better to create a dedicated patch (outside of this thread) 
> > for
> > the "two_phase" remark? (If so I can take care of it).
> >
> 
> I think it is better to create a separate patch for two_phase after
> this patch gets committed.

Yeah, makes sense, will do, thanks!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-29 Thread Amit Kapila
On Mon, Jan 29, 2024 at 2:22 PM Bertrand Drouvot
 wrote:
>
> On Mon, Jan 29, 2024 at 10:24:11AM +0530, shveta malik wrote:
> > On Sat, Jan 27, 2024 at 12:02 PM Zhijie Hou (Fujitsu)
> >  wrote:
> > >
> > > Attach the V70 patch set which addressed above comments and Bertrand's 
> > > comments in [1]
> > >
> >
> > Since v70-0001 is pushed, rebased and attached v70_2 patches.  There
> > are no new changes.
>
> Thanks!
>
> Looking at 0001:
>
> +  When altering the
> +   linkend="sql-createsubscription-params-with-slot-name">slot_name,
> +  the failover property value of the named slot may 
> differ from the
> +   linkend="sql-createsubscription-params-with-failover">failover
> +  parameter specified in the subscription. When creating the slot,
> +  ensure the slot failover property matches the
> +   linkend="sql-createsubscription-params-with-failover">failover
> +  parameter value of the subscription. Otherwise, the slot on the 
> publisher may
> +  not be enabled to be synced to standbys.
>
> Not related to this patch series but while at it shouldn't we also add a few
> words about two_phase too? (I mean ensure the slot property matchs the
> subscription one).
>
> Or would it be better to create a dedicated patch (outside of this thread) for
> the "two_phase" remark? (If so I can take care of it).
>

I think it is better to create a separate patch for two_phase after
this patch gets committed.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-29 Thread Bertrand Drouvot
Hi,

On Mon, Jan 29, 2024 at 10:24:11AM +0530, shveta malik wrote:
> On Sat, Jan 27, 2024 at 12:02 PM Zhijie Hou (Fujitsu)
>  wrote:
> >
> > Attach the V70 patch set which addressed above comments and Bertrand's 
> > comments in [1]
> >
> 
> Since v70-0001 is pushed, rebased and attached v70_2 patches.  There
> are no new changes.

Thanks!

Looking at 0001:

+  When altering the
+  slot_name,
+  the failover property value of the named slot may 
differ from the
+  failover
+  parameter specified in the subscription. When creating the slot,
+  ensure the slot failover property matches the
+  failover
+  parameter value of the subscription. Otherwise, the slot on the 
publisher may
+  not be enabled to be synced to standbys.

Not related to this patch series but while at it shouldn't we also add a few
words about two_phase too? (I mean ensure the slot property matches the
subscription one).

Or would it be better to create a dedicated patch (outside of this thread) for
the "two_phase" remark? (If so I can take care of it).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-28 Thread Amit Kapila
On Mon, Jan 29, 2024 at 9:21 AM Peter Smith  wrote:
>
> Here are some review comments for v70-0001.
>
> ==
> doc/src/sgml/protocol.sgml
>
> 1.
> Related to this, please also review my other patch to the same docs
> page protocol.sgml [1].
>

We can check that separately.

> ==
> src/backend/replication/logical/tablesync.c
>
> 2.
>   walrcv_create_slot(LogRepWorkerWalRcvConn,
>  slotname, false /* permanent */ , false /* two_phase */ ,
> +false,
>  CRS_USE_SNAPSHOT, origin_startpos);
>
> I know it was previously mentioned in this thread that inline
> parameter comments are unnecessary, but here they are already in the
> existing code so shouldn't we do the same?
>

I think it is better to remove even the existing ones, as those often
make the code difficult to read.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-28 Thread Peter Smith
Here are some review comments for v70-0001.

==
doc/src/sgml/protocol.sgml

1.
Related to this, please also review my other patch to the same docs
page protocol.sgml [1].

==
src/backend/replication/logical/tablesync.c

2.
  walrcv_create_slot(LogRepWorkerWalRcvConn,
 slotname, false /* permanent */ , false /* two_phase */ ,
+false,
 CRS_USE_SNAPSHOT, origin_startpos);

I know it was previously mentioned in this thread that inline
parameter comments are unnecessary, but here they are already in the
existing code so shouldn't we do the same?

==
src/backend/replication/repl_gram.y

3.
+/* ALTER_REPLICATION_SLOT slot options */
+alter_replication_slot:
+ K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')'
+ {
+ AlterReplicationSlotCmd *cmd;
+ cmd = makeNode(AlterReplicationSlotCmd);
+ cmd->slotname = $2;
+ cmd->options = $4;
+ $$ = (Node *) cmd;
+ }
+ ;
+

IMO write that comment with parentheses, so it matches the code.

SUGGESTION
ALTER_REPLICATION_SLOT slot ( options )

==
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtDWSmW8uiRJF1LfGQJikmo7V2jdysLuRmtsanNZc7fNw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-01-26 Thread Amit Kapila
On Thu, Jan 25, 2024 at 6:42 PM Zhijie Hou (Fujitsu)
 wrote:
>
> Here is the V69 patch set which includes the following changes.
>
> V69-0001, V69-0002
>

Few minor comments on v69-0001
1. In libpqrcv_create_slot(), I see we are using two types of syntaxes
based on 'use_new_options_syntax' (aka server_version >= 15) whereas
this new 'failover' option doesn't follow that. What is the reason for
this? I thought it is because older versions anyway won't support
this option. However, I guess we should follow the syntax of the old
server and let it error out. BTW, did you test this patch with old
server versions (say < 15 and >=15) by directly using replication
commands, if so, what is the behavior of same?

2.
  }
-
+ if (failover)
+ appendStringInfoString(&cmd, "FAILOVER, ");

Spurious line removal. Also, to follow a coding pattern similar to
nearby code, let's have one empty line after handling of failover.

3.
+/* ALTER_REPLICATION_SLOT slot */
+alter_replication_slot:
+ K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')'

I think it would be better if we follow the create style by specifying
the syntax in comments, as that can make the code easier to understand
after any future extensions to this command. See
create_replication_slot:
/* CREATE_REPLICATION_SLOT slot [TEMPORARY] PHYSICAL [options] */
K_CREATE_REPLICATION_SLOT IDENT opt_temporary K_PHYSICAL create_slot_options

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-26 Thread Bertrand Drouvot
Hi,

On Thu, Jan 25, 2024 at 01:11:50PM +, Zhijie Hou (Fujitsu) wrote:
> Here is the V69 patch set which includes the following changes.
> 
> V69-0001, V69-0002
> 1) Addressed Bertrand's comments[1].

Thanks!

V69-0001 LGTM.

As far V69-0002 I just have one more last remark:

+    */
+   if (sub->enabled)
+       ereport(ERROR,
+               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                errmsg("cannot set %s for enabled subscription",
+                       "failover")));

Worth adding a test for it in 050_standby_failover_slots_sync.pl? (I had a quick
look and it does not seem to be covered.)

Remarks,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-25 Thread Bertrand Drouvot
Hi,

On Thu, Jan 25, 2024 at 03:54:45PM +0530, Amit Kapila wrote:
> On Thu, Jan 25, 2024 at 1:25 PM Bertrand Drouvot
>  wrote:
> >
> > On Thu, Jan 25, 2024 at 02:57:30AM +, Zhijie Hou (Fujitsu) wrote:
> > >
> > > Agreed. I split the original 0001 patch into 3 patches as suggested.
> > > Here is the V68 patch set.
> 
> Thanks, I have pushed 0001.
> 
> >
> > Thanks!
> >
> > Some comments.
> >
> > Looking at 0002:
> >
> > 1 ===
> >
> > +  The following options are supported:
> >
> > What about "The following option is supported"? (as currently only the 
> > "FAILOVER"
> > is)
> >
> > 2 ===
> >
> > What about adding some TAP tests too? (I can see that 
> > ALTER_REPLICATION_SLOT test
> > is added in v68-0004 but I think having some in 0002 would make sense too).
> >
> 
> The subscription tests in v68-0003 will test this functionality. The
> one advantage of adding separate tests for this is that if in the
> future we extend this replication command, it could be convenient to
> test various options. However, the same could be said about existing
> replication commands as well.

I initially did check for "START_REPLICATION" and I saw it's part of 
006_logical_decoding.pl (but did not check all the "REPLICATION" commands).

That said, it's more of a nit and I think it's fine to have the test in
v68-0004 (as it is currently done) plus the ones in v68-0003.

> But is it worth having extra tests which
> will be anyway covered in the next commit in a few days?
> 
> I understand that it is a good idea and makes one comfortable to have
> tests for each separate commit but OTOH, in the longer term it will
> just be adding more test time without achieving much benefit. I think
> we can tell explicitly in the commit message of this patch that the
> subsequent commit will cover the tests for this functionality

Yeah, I think that's enough (at least someone reading the commit message and the
diff changes without following this dedicated thread closely would know the lack
of tests is not an oversight).

> One minor comment on 0002:
> +  so that logical replication can be resumed after failover.
> + 
> 
> Can we move this and similar comments or doc changes to the later 0004
> patch where we are syncing the slots?

Sure.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-25 Thread shveta malik
On Thu, Jan 25, 2024 at 10:39 AM Peter Smith  wrote:

> 2. synchronize_one_slot
>
> + /*
> + * Sanity check: Make sure that concerned WAL is received and flushed
> + * before syncing slot to target lsn received from the primary server.
> + *
> + * This check should never pass as on the primary server, we have waited
> + * for the standby's confirmation before updating the logical slot.
> + */
> + latestFlushPtr = GetStandbyFlushRecPtr(NULL);
> + if (remote_slot->confirmed_lsn > latestFlushPtr)
> + {
> + ereport(LOG,
> + errmsg("skipping slot synchronization as the received slot sync"
> +" LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
> +LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> +remote_slot->name,
> +LSN_FORMAT_ARGS(latestFlushPtr)));
> +
> + return false;
> + }
>
> Previously in v65 this was an elog, but now it is an ereport. But
> since this is a sanity check condition that "should never pass" wasn't
> the elog the more appropriate choice?

We realized that this scenario can be hit frequently when the user has
not set standby_slot_names on the primary, and thus ereport makes more
sense. But I agree that the comment is misleading. We will adjust it
in the next version.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-01-25 Thread Amit Kapila
On Thu, Jan 25, 2024 at 1:25 PM Bertrand Drouvot
 wrote:
>
> On Thu, Jan 25, 2024 at 02:57:30AM +, Zhijie Hou (Fujitsu) wrote:
> >
> > Agreed. I split the original 0001 patch into 3 patches as suggested.
> > Here is the V68 patch set.

Thanks, I have pushed 0001.

>
> Thanks!
>
> Some comments.
>
> Looking at 0002:
>
> 1 ===
>
> +  The following options are supported:
>
> What about "The following option is supported"? (as currently only the 
> "FAILOVER"
> is)
>
> 2 ===
>
> What about adding some TAP tests too? (I can see that ALTER_REPLICATION_SLOT 
> test
> is added in v68-0004 but I think having some in 0002 would make sense too).
>

The subscription tests in v68-0003 will test this functionality. The
one advantage of adding separate tests for this is that if in the
future we extend this replication command, it could be convenient to
test various options. However, the same could be said about existing
replication commands as well. But is it worth having extra tests which
will be anyway covered in the next commit in a few days?

I understand that it is a good idea and makes one comfortable to have
tests for each separate commit but OTOH, in the longer term it will
just be adding more test time without achieving much benefit. I think
we can tell explicitly in the commit message of this patch that the
subsequent commit will cover the tests for this functionality.

One minor comment on 0002:
+  so that logical replication can be resumed after failover.
+ 

Can we move this and similar comments or doc changes to the later 0004
patch where we are syncing the slots?


-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Bertrand Drouvot
Hi,

On Thu, Jan 25, 2024 at 02:57:30AM +, Zhijie Hou (Fujitsu) wrote:
> On Wednesday, January 24, 2024 6:31 PM Amit Kapila  
> wrote:
> > 
> > On Tue, Jan 23, 2024 at 5:13 PM shveta malik  wrote:
> > >
> > > Thanks Ajin for testing the patch. PFA v66 which fixes this issue.
> > >
> > 
> > I think we should try to commit the patch as all of the design concerns are
> > resolved now. To achieve that, can we split the failover setting patch into 
> > the
> > following: (a) setting failover property via SQL commands and display it in
> > pg_replication_slots (b) replication protocol command (c) failover property 
> > via
> > subscription commands?
> > 
> > It will make each patch smaller and it would be easier to detect any 
> > problem in
> > the same after commit.
> 
> Agreed. I split the original 0001 patch into 3 patches as suggested.
> Here is the V68 patch set.

Thanks!

Some comments.

Looking at 0002:

1 ===

+  The following options are supported:

What about "The following option is supported"? (as currently only the 
"FAILOVER"
is)

2 ===

What about adding some TAP tests too? (I can see that ALTER_REPLICATION_SLOT 
test
is added in v68-0004 but I think having some in 0002 would make sense too).

Looking at 0003:

1 ===

+  parameter specified in the subscription. When creating the slot,
+  ensure the slot failover property matches the
+  failover
+  parameter value of the subscription.

What about explaining what would be the consequence of not doing so?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-24 Thread shveta malik
On Thu, Jan 25, 2024 at 9:13 AM Amit Kapila  wrote:
>
> > 3) Removed syncing 'failover' on standby from remote_slot. The
> > 'failover' field will be false for synced slots. Since we do not
> > support sync to cascading standbys yet, thus failover=true was
> > misleading and unused there.
> >
>
> But what will happen after the standby is promoted? After promotion,
> ideally, it should have failover enabled, so that the slots can be
> synced. Also, note that corresponding subscriptions still have the
> failover flag enabled. I think we should copy the 'failover' option
> for the synced slots.

Yes, right, missed this point earlier. I will make the change in the
next version.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Bertrand Drouvot
Hi,

On Wed, Jan 24, 2024 at 04:09:15PM +0530, shveta malik wrote:
> On Wed, Jan 24, 2024 at 2:38 PM Bertrand Drouvot
>  wrote:
> >
> > I also see Sawada-San's point and I'd vote for "sync_replication_slots". 
> > Then for
> > the current feature I think "failover" and "on" should be the values to 
> > turn the
> > feature on (assuming "on" would mean "all kind of supported slots").
> 
> Even if others agree and we change this GUC name to
> "sync_replication_slots", I feel we should keep the values as "on" and
> "off" currently, where "on" would mean 'sync failover slots' (docs can
> state that clearly).

I gave more thought to it and I think the values should only be "failover" or
"off".

The reason is that if we allow "on" and change the "on" behavior in future
versions (to support more than failover slots), that would change the behavior
for users who had set "on".

It's true that we can mention it in the docs, but there is still the risk of
users not reading the doc (that's why I think it would be good if we can put
this extra "safety" in the code too).

>  I do not think we should support sync of "all
> kinds of supported slots" in the first version. Maybe we can think
> about it for future versions.

Yeah I think the same (I was mentioning the future "on" behavior up-thread).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Peter Smith
Here are some review comments for v67-0002.

==
src/backend/replication/logical/slotsync.c

1.
+/* The sleep time (ms) between slot-sync cycles varies dynamically
+ * (within a MIN/MAX range) according to slot activity. See
+ * wait_for_slot_activity() for details.
+ */
+#define MIN_WORKER_NAPTIME_MS  200
+#define MAX_WORKER_NAPTIME_MS  30000 /* 30s */
+
+static long sleep_ms = MIN_WORKER_NAPTIME_MS;

In my previous review for this, I meant for there to be no whitespace
between the #defines and the static long sleep_ms, so that the prior comment
clearly belongs to all 3 lines.

~~~

2. synchronize_one_slot

+ /*
+ * Sanity check: Make sure that concerned WAL is received and flushed
+ * before syncing slot to target lsn received from the primary server.
+ *
+ * This check should never pass as on the primary server, we have waited
+ * for the standby's confirmation before updating the logical slot.
+ */
+ latestFlushPtr = GetStandbyFlushRecPtr(NULL);
+ if (remote_slot->confirmed_lsn > latestFlushPtr)
+ {
+ ereport(LOG,
+ errmsg("skipping slot synchronization as the received slot sync"
+" LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
+LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
+remote_slot->name,
+LSN_FORMAT_ARGS(latestFlushPtr)));
+
+ return false;
+ }

Previously in v65 this was an elog, but now it is an ereport. But
since this is a sanity check condition that "should never pass" wasn't
the elog the more appropriate choice?

~~~

3. synchronize_one_slot

+ /*
+ * We don't want any complicated code while holding a spinlock, so do
+ * namestrcpy() and get_database_oid() outside.
+ */
+ namestrcpy(_name, remote_slot->plugin);
+ dbid = get_database_oid(remote_slot->database, false);

IMO just simplify the whole comment, here and for the other similar
comment in local_slot_update().

SUGGESTION
/* Avoid expensive operations while holding a spinlock. */

~~~

4. synchronize_slots

+ /* Construct the remote_slot tuple and synchronize each slot locally */
+ tupslot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+ while (tuplestore_gettupleslot(res->tuplestore, true, false, tupslot))
+ {
+ bool isnull;
+ RemoteSlot *remote_slot = palloc0(sizeof(RemoteSlot));
+ Datum d;
+
+ remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, 1, &isnull));
+ Assert(!isnull);
+
+ remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, 2, &isnull));
+ Assert(!isnull);
+
+ /*
+ * It is possible to get null values for LSN and Xmin if slot is
+ * invalidated on the primary server, so handle accordingly.
+ */
+ d = slot_getattr(tupslot, 3, &isnull);
+ remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr :
+ DatumGetLSN(d);
+
+ d = slot_getattr(tupslot, 4, &isnull);
+ remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d);
+
+ d = slot_getattr(tupslot, 5, &isnull);
+ remote_slot->catalog_xmin = isnull ? InvalidTransactionId :
+ DatumGetTransactionId(d);
+
+ remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, 6, &isnull));
+ Assert(!isnull);
+
+ remote_slot->database = TextDatumGetCString(slot_getattr(tupslot,
+ 7, &isnull));
+ Assert(!isnull);
+
+ d = slot_getattr(tupslot, 8, &isnull);
+ remote_slot->invalidated = isnull ? RS_INVAL_NONE :
+ get_slot_invalidation_cause(TextDatumGetCString(d));

Would it be better to get rid of the hardwired column numbers and then
be able to use the SLOTSYNC_COLUMN_COUNT already defined as a sanity
check?

SUGGESTION
int col = 0;
...
remote_slot->name = TextDatumGetCString(slot_getattr(tupslot, ++col, &isnull));
...
remote_slot->plugin = TextDatumGetCString(slot_getattr(tupslot, ++col,
&isnull));
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->confirmed_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d);
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->restart_lsn = isnull ? InvalidXLogRecPtr : DatumGetLSN(d);
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->catalog_xmin = isnull ? InvalidTransactionId :
DatumGetTransactionId(d);
...
remote_slot->two_phase = DatumGetBool(slot_getattr(tupslot, ++col, &isnull));
...
remote_slot->database = TextDatumGetCString(slot_getattr(tupslot,
++col, &isnull));
...
d = slot_getattr(tupslot, ++col, &isnull);
remote_slot->invalidated = isnull ? RS_INVAL_NONE :
get_slot_invalidation_cause(TextDatumGetCString(d));

/* Sanity check */
Assert(col == SLOTSYNC_COLUMN_COUNT);

~~~

5.
+static char *
+validate_parameters_and_get_dbname(void)
+{
+ char    *dbname;

These are configuration issues, so probably all these ereports could
also set errcode(ERRCODE_INVALID_PARAMETER_VALUE).
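
For illustration only (reusing one of the configuration messages discussed in
this thread; the final wording is up to the patch), adding the errcode would
look roughly like:

	ereport(ERROR,
			errcode(ERRCODE_INVALID_PARAMETER_VALUE),
			/* translator: %s is a GUC variable name */
			errmsg("exiting from slot synchronization due to bad configuration"),
			errhint("\"%s\" must be defined.", "primary_slot_name"));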

==
Kind Regards,
Peter Smith.
Fujitsu Australia




RE: Synchronizing slots from primary to standby

2024-01-24 Thread Zhijie Hou (Fujitsu)
On Wednesday, January 24, 2024 1:11 PM Masahiko Sawada  
wrote:
> Here are random comments on slotsyncworker.c (v66):

Thanks for the comments:

> 
> ---
> +   elog(ERROR,
> +"cannot synchronize local slot \"%s\" LSN(%X/%X)"
> +" to remote slot's LSN(%X/%X) as synchronization"
> +" would move it backwards", remote_slot->name,
> 
> Many error messages in slotsync.c are splitted into several lines, but I 
> think it
> would reduce the greppability when the user looks for the error message in the
> source code.

Thanks for the suggestion! We combined most of the messages in the new version
of the patch. Some messages, including the above one, were kept split because
they are too long (> 120 columns including the indent) to fit on the screen,
so I feel it's better to keep them split.

Best Regards,
Hou zj


Re: Synchronizing slots from primary to standby

2024-01-24 Thread Amit Kapila
On Wed, Jan 24, 2024 at 5:17 PM shveta malik  wrote:
>
> PFA v67. Note that the GUC (enable_syncslot) name is unchanged. Once
> we have final agreement on the name, we can make the change in the
> next version.
>
> Changes in v67 are:
>
> 1) Addressed comments by Peter given in [1].
> 2) Addressed comments by Swada-San given in [2].
> 3) Removed syncing 'failover' on standby from remote_slot. The
> 'failover' field will be false for synced slots. Since we do not
> support sync to cascading standbys yet, thus failover=true was
> misleading and unused there.
>

But what will happen after the standby is promoted? After promotion,
ideally, it should have failover enabled, so that the slots can be
synced. Also, note that corresponding subscriptions still have the
failover flag enabled. I think we should copy the 'failover' option
for the synced slots.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Peter Smith
Here are some review comments for the patch v67-0001.

==
1.
There are a couple of places checking for failover usage on a standby.

+ if (RecoveryInProgress() && failover)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot enable failover for a replication slot"
+" created on the standby"));

and

+ if (RecoveryInProgress() && failover)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot enable failover for a replication slot"
+" on the standby"));

IMO the conditions should be written the other way around (failover &&
RecoveryInProgress()) to avoid the unnecessary function call, since the
'failover' flag is probably false (the default) in most cases anyway.
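
i.e. something like this (sketch only, keeping the existing message):

	if (failover && RecoveryInProgress())
		ereport(ERROR,
				errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				errmsg("cannot enable failover for a replication slot"
					   " created on the standby"));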

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-01-24 Thread shveta malik
On Wed, Jan 24, 2024 at 2:38 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Wed, Jan 24, 2024 at 01:51:54PM +0530, Amit Kapila wrote:
> > On Wed, Jan 24, 2024 at 11:24 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Wed, Jan 24, 2024 at 2:43 PM Amit Kapila  
> > > wrote:
> > > >
> > > > >
> > > > > +/* GUC variable */
> > > > > +bool   enable_syncslot = false;
> > > > >
> > > > > Is enable_syncslot a really good name? We use "enable" prefix only for
> > > > > planner parameters such as enable_seqscan, and it seems to me that
> > > > > "slot" is not specific. Other candidates are:
> > > > >
> > > > > * synchronize_replication_slots = on|off
> > > > > * synchronize_failover_slots = on|off
> > > > >
> > > >
> > > > I would prefer the second one. Would it be better to just say
> > > > sync_failover_slots?
> > >
> > > Works for me. But if we want to extend this option for non-failover
> > > slots as well in the future, synchronize_replication_slots (or
> > > sync_replication_slots) seems better. We can extend it by having an
> > > enum later. For example, the values can be on, off, or failover etc.
> > >
> >
> > I see your point. Let us see if others have any suggestions on this.
>
> I also see Sawada-San's point and I'd vote for "sync_replication_slots". Then 
> for
> the current feature I think "failover" and "on" should be the values to turn 
> the
> feature on (assuming "on" would mean "all kind of supported slots").

Even if others agree and we change this GUC name to
"sync_replication_slots", I feel we should keep the values as "on" and
"off" currently, where "on" would mean 'sync failover slots' (docs can
state that clearly).  I do not think we should support sync of "all
kinds of supported slots" in the first version. Maybe we can think
about it for future versions.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Amit Kapila
On Tue, Jan 23, 2024 at 5:13 PM shveta malik  wrote:
>
> Thanks Ajin for testing the patch. PFA v66 which fixes this issue.
>

I think we should try to commit the patch as all of the design
concerns are resolved now. To achieve that, can we split the failover
setting patch into the following: (a) setting failover property via
SQL commands and display it in pg_replication_slots (b) replication
protocol command (c) failover property via subscription commands?

It will make each patch smaller and it would be easier to detect any
problem in the same after commit.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Bertrand Drouvot
Hi,

On Wed, Jan 24, 2024 at 01:51:54PM +0530, Amit Kapila wrote:
> On Wed, Jan 24, 2024 at 11:24 AM Masahiko Sawada  
> wrote:
> >
> > On Wed, Jan 24, 2024 at 2:43 PM Amit Kapila  wrote:
> > >
> > > >
> > > > +/* GUC variable */
> > > > +bool   enable_syncslot = false;
> > > >
> > > > Is enable_syncslot a really good name? We use "enable" prefix only for
> > > > planner parameters such as enable_seqscan, and it seems to me that
> > > > "slot" is not specific. Other candidates are:
> > > >
> > > > * synchronize_replication_slots = on|off
> > > > * synchronize_failover_slots = on|off
> > > >
> > >
> > > I would prefer the second one. Would it be better to just say
> > > sync_failover_slots?
> >
> > Works for me. But if we want to extend this option for non-failover
> > slots as well in the future, synchronize_replication_slots (or
> > sync_replication_slots) seems better. We can extend it by having an
> > enum later. For example, the values can be on, off, or failover etc.
> >
> 
> I see your point. Let us see if others have any suggestions on this.

I also see Sawada-San's point and I'd vote for "sync_replication_slots". Then
for the current feature I think "failover" and "on" should be the values to turn
the feature on (assuming "on" would mean "all kinds of supported slots").

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-24 Thread Amit Kapila
On Wed, Jan 24, 2024 at 11:24 AM Masahiko Sawada  wrote:
>
> On Wed, Jan 24, 2024 at 2:43 PM Amit Kapila  wrote:
> >
> > >
> > > +/* GUC variable */
> > > +bool   enable_syncslot = false;
> > >
> > > Is enable_syncslot a really good name? We use "enable" prefix only for
> > > planner parameters such as enable_seqscan, and it seems to me that
> > > "slot" is not specific. Other candidates are:
> > >
> > > * synchronize_replication_slots = on|off
> > > * synchronize_failover_slots = on|off
> > >
> >
> > I would prefer the second one. Would it be better to just say
> > sync_failover_slots?
>
> Works for me. But if we want to extend this option for non-failover
> slots as well in the future, synchronize_replication_slots (or
> sync_replication_slots) seems better. We can extend it by having an
> enum later. For example, the values can be on, off, or failover etc.
>

I see your point. Let us see if others have any suggestions on this.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-23 Thread Masahiko Sawada
On Wed, Jan 24, 2024 at 2:43 PM Amit Kapila  wrote:
>
> On Wed, Jan 24, 2024 at 10:41 AM Masahiko Sawada  
> wrote:
> >
> > On Mon, Jan 22, 2024 at 3:58 PM Amit Kapila  wrote:
> > >
> > > Can we think of using GetStandbyFlushRecPtr()? We probably need to
> > > expose this function, if this works for the required purpose.
> >
> > GetStandbyFlushRecPtr() seems good. But do we really want to raise an
> > ERROR in this case? IIUC this case could happen often when the slot
> > used by the standby is not listed in standby_slot_names.
> >
>
> or it can be due to some bug in the code as well.
>
> > I think we
> > can just skip such a slot to synchronize and check it the next time.
> >
>
> How about logging the message and then skipping the sync step? This
> will at least make users aware that they could be missing to set
> standby_slot_names.

+1

>
> > Here are random comments on slotsyncworker.c (v66):
> >
> > +/* GUC variable */
> > +bool   enable_syncslot = false;
> >
> > Is enable_syncslot a really good name? We use "enable" prefix only for
> > planner parameters such as enable_seqscan, and it seems to me that
> > "slot" is not specific. Other candidates are:
> >
> > * synchronize_replication_slots = on|off
> > * synchronize_failover_slots = on|off
> >
>
> I would prefer the second one. Would it be better to just say
> sync_failover_slots?

Works for me. But if we want to extend this option for non-failover
slots as well in the future, synchronize_replication_slots (or
sync_replication_slots) seems better. We can extend it by having an
enum later. For example, the values can be on, off, or failover etc.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-23 Thread Amit Kapila
On Wed, Jan 24, 2024 at 10:41 AM Masahiko Sawada  wrote:
>
> On Mon, Jan 22, 2024 at 3:58 PM Amit Kapila  wrote:
> >
> > Can we think of using GetStandbyFlushRecPtr()? We probably need to
> > expose this function, if this works for the required purpose.
>
> GetStandbyFlushRecPtr() seems good. But do we really want to raise an
> ERROR in this case? IIUC this case could happen often when the slot
> used by the standby is not listed in standby_slot_names.
>

or it can be due to some bug in the code as well.

> I think we
> can just skip such a slot to synchronize and check it the next time.
>

How about logging the message and then skipping the sync step? This
will at least make users aware that they may have forgotten to set
standby_slot_names.

> Here are random comments on slotsyncworker.c (v66):
>
> +/* GUC variable */
> +bool   enable_syncslot = false;
>
> Is enable_syncslot a really good name? We use "enable" prefix only for
> planner parameters such as enable_seqscan, and it seems to me that
> "slot" is not specific. Other candidates are:
>
> * synchronize_replication_slots = on|off
> * synchronize_failover_slots = on|off
>

I would prefer the second one. Would it be better to just say
sync_failover_slots?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-23 Thread Masahiko Sawada
On Mon, Jan 22, 2024 at 3:58 PM Amit Kapila  wrote:
>
> On Fri, Jan 19, 2024 at 3:55 PM shveta malik  wrote:
> >
> > On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada  
> > wrote:
> > >
> > >
> > > Thank you for updating the patch. I have some comments:
> > >
> > > ---
> > > +latestWalEnd = GetWalRcvLatestWalEnd();
> > > +if (remote_slot->confirmed_lsn > latestWalEnd)
> > > +{
> > > +elog(ERROR, "exiting from slot synchronization as the
> > > received slot sync"
> > > + " LSN %X/%X for slot \"%s\" is ahead of the
> > > standby position %X/%X",
> > > + LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> > > + remote_slot->name,
> > > + LSN_FORMAT_ARGS(latestWalEnd));
> > > +}
> > >
> > > IIUC GetWalRcvLatestWalEnd () returns walrcv->latestWalEnd, which is
> > > typically the primary server's flush position and doesn't mean the LSN
> > > where the walreceiver received/flushed up to.
> >
> > yes. I think it makes more sense to use something which actually tells
> > flushed-position. I gave it a try by replacing GetWalRcvLatestWalEnd()
> > with GetWalRcvFlushRecPtr() but I see a problem here. Lets say I have
> > enabled the slot-sync feature in a running standby, in that case we
> > are all good (flushedUpto is the same as actual flush-position
> > indicated by LogstreamResult.Flush). But if I restart standby, then I
> > observed that the startup process sets flushedUpto to some value 'x'
> > (see [1]) while when the wal-receiver starts, it sets
> > 'LogstreamResult.Flush' to another value (see [2]) which is always
> > greater than 'x'. And we do not update flushedUpto with the
> > 'LogstreamResult.Flush' value in walreceiver until we actually do an
> > operation on primary. Performing a data change on primary sends WALs
> > to standby which then hits XLogWalRcvFlush() and updates flushedUpto
> > same as LogstreamResult.Flush. Until then we have a situation where
> > slots received on standby are ahead of flushedUpto and thus slotsync
> > worker keeps one erroring out. I am yet to find out why flushedUpto is
> > set to a lower value than 'LogstreamResult.Flush' at the start of
> > standby.  Or maybe am I using the wrong function
> > GetWalRcvFlushRecPtr() and should be using something else instead?
> >
>
> Can we think of using GetStandbyFlushRecPtr()? We probably need to
> expose this function, if this works for the required purpose.

GetStandbyFlushRecPtr() seems good. But do we really want to raise an
ERROR in this case? IIUC this case could happen often when the slot
used by the standby is not listed in standby_slot_names. I think we
can just skip synchronizing such a slot and check it again next time.

Here are random comments on slotsyncworker.c (v66):

---
The postmaster relaunches the slotsync worker without any interval between
attempts. So if the connection string in primary_conninfo is not correct,
many errors are emitted.

---
+/* GUC variable */
+bool   enable_syncslot = false;

Is enable_syncslot a really good name? We use "enable" prefix only for
planner parameters such as enable_seqscan, and it seems to me that
"slot" is not specific. Other candidates are:

* synchronize_replication_slots = on|off
* synchronize_failover_slots = on|off

---
+   elog(ERROR,
+"cannot synchronize local slot \"%s\" LSN(%X/%X)"
+" to remote slot's LSN(%X/%X) as synchronization"
+" would move it backwards", remote_slot->name,

Many error messages in slotsync.c are split into several lines, but
I think it would reduce the greppability when the user looks for the
error message in the source code.

---
+   SpinLockAcquire(&slot->mutex);
+   slot->data.database = get_database_oid(remote_slot->database, false);
+   namestrcpy(&slot->data.plugin, remote_slot->plugin);

We should not access syscaches while holding a spinlock.

---
+   SpinLockAcquire(&slot->mutex);
+   slot->data.database = get_database_oid(remote_slot->database, false);
+   namestrcpy(&slot->data.plugin, remote_slot->plugin);
+   SpinLockRelease(&slot->mutex);

Similarly, it's better to avoid calling namestrcpy() while holding a
spinlock, as we do in CreateInitDecodingContext().
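
A minimal sketch of the kind of restructuring meant here (variable names are
illustrative): do the catalog lookup and the name copy into locals first, and
only do plain assignments under the spinlock:

	NameData	plugin_name;
	Oid			dbid;

	namestrcpy(&plugin_name, remote_slot->plugin);
	dbid = get_database_oid(remote_slot->database, false);

	SpinLockAcquire(&slot->mutex);
	slot->data.database = dbid;
	slot->data.plugin = plugin_name;	/* struct assignment of NameData */
	SpinLockRelease(&slot->mutex);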

---
+   SpinLockAcquire(&SlotSyncWorker->mutex);
+
+   SlotSyncWorker->stopSignaled = true;
+
+   if (SlotSyncWorker->pid == InvalidPid)
+   {
+   SpinLockRelease(&SlotSyncWorker->mutex);
+   return;
+   }
+
+   kill(SlotSyncWorker->pid, SIGINT);
+
+   SpinLockRelease(&SlotSyncWorker->mutex);

It's better to avoid making a system call while holding a spinlock.
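
For example (sketch only), the pid could be captured under the lock and the
signal sent after releasing it:

	pid_t		worker_pid;

	SpinLockAcquire(&SlotSyncWorker->mutex);
	SlotSyncWorker->stopSignaled = true;
	worker_pid = SlotSyncWorker->pid;
	SpinLockRelease(&SlotSyncWorker->mutex);

	if (worker_pid != InvalidPid)
		kill(worker_pid, SIGINT);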

---
+   BackgroundWorkerUnblockSignals();

I think it's no longer necessary.

---
+   ereport(LOG,
+   /* translator: %s is a GUC variable name */
+   errmsg("bad configuration for slot synchronization"),
+   errhint("\"wal_level\" must be >= logical."));

There is no '%s' in errmsg string.

---
+/*
+ * Cleanup function for logical 

Re: Synchronizing slots from primary to standby

2024-01-23 Thread Amit Kapila
On Wed, Jan 24, 2024 at 8:52 AM Peter Smith  wrote:
>
> Here are some comments for patch v66-0001.
>
> ==
> doc/src/sgml/catalogs.sgml
>
> 1.
> +  
> +   If true, the associated replication slots (i.e. the main slot and the
> +   table sync slots) in the upstream database are enabled to be
> +   synchronized to the physical standbys
> +  
>
> /physical standbys/physical standby/
>
> I wondered if it is better just to say singular "standby" instead of
> "standbys" in places like this; e.g. plural might imply cascading for
> some readers.
>

I don't think it is confusing as we use it in a similar way elsewhere in the
docs. We can probably avoid using "physical" in places similar to the above,
as that is implied.

>
>
> ==
> src/backend/replication/slotfuncs.c
>
> 11. create_physical_replication_slot
>
>   /* acquire replication slot, this will check for conflicting names */
>   ReplicationSlotCreate(name, false,
> -   temporary ? RS_TEMPORARY : RS_PERSISTENT, false);
> +   temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
> +   false);
>
> Having an inline comment might be helpful here instead of passing 
> "false,false"
>
> SUGGESTION
> ReplicationSlotCreate(name, false,
>  temporary ? RS_TEMPORARY : RS_PERSISTENT, false,
>  false /* failover */);
>

I don't think we follow the practice of using inline comments. I feel that
sometimes makes code difficult to read, especially when we have
multiple such parameters.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-23 Thread Peter Smith
Here are some comments for patch v66-0001.

==
doc/src/sgml/catalogs.sgml

1.
+  
+   If true, the associated replication slots (i.e. the main slot and the
+   table sync slots) in the upstream database are enabled to be
+   synchronized to the physical standbys
+  

/physical standbys/physical standby/

I wondered if it is better just to say singular "standby" instead of
"standbys" in places like this; e.g. plural might imply cascading for
some readers.

There are a number of examples like this, so I've repeated the same
comment multiple times below. If you disagree, please just ignore all
of them.

==
doc/src/sgml/func.sgml

2.
 that the decoding of prepared transactions is enabled for this
-slot. A call to this function has the same effect as the replication
-protocol command CREATE_REPLICATION_SLOT ...
LOGICAL.
+slot. The optional fifth parameter,
+failover, when set to true,
+specifies that this slot is enabled to be synced to the
+physical standbys so that logical replication can be resumed
+after failover. A call to this function has the same effect as
+the replication protocol command
+CREATE_REPLICATION_SLOT ... LOGICAL.


(same as above)

/physical standbys/physical standby/

Also, I don't see anything else on this page using plural "standbys".

==
doc/src/sgml/protocol.sgml

3. CREATE_REPLICATION_SLOT

+   
+FAILOVER [ boolean ]
+
+ 
+  If true, the slot is enabled to be synced to the physical
+  standbys so that logical replication can be resumed after failover.
+  The default is false.
+ 
+
+   

(same as above)

/physical standbys/physical standby/

~~~

4. ALTER_REPLICATION_SLOT

+  
+   
+FAILOVER [ boolean ]
+
+ 
+  If true, the slot is enabled to be synced to the physical
+  standbys so that logical replication can be resumed after failover.
+ 
+
+   
+  

(same as above)

/physical standbys/physical standby/

==
doc/src/sgml/ref/create_subscription.sgml

5.
+   
+failover (boolean)
+
+ 
+  Specifies whether the replication slots associated with the
subscription
+  are enabled to be synced to the physical standbys so that logical
+  replication can be resumed from the new primary after failover.
+  The default is false.
+ 
+
+   

(same as above)

/physical standbys/physical standby/

==
doc/src/sgml/system-views.sgml

6.
+
+ 
+  
+   failover bool
+  
+  
+   True if this is a logical slot enabled to be synced to the physical
+   standbys so that logical replication can be resumed from the new primary
+   after failover. Always false for physical slots.
+  
+ 

(same as above)

/physical standbys/physical standby/

==
src/backend/commands/subscriptioncmds.c

7.
+ if (IsSet(opts.specified_opts, SUBOPT_FAILOVER))
+ {
+ if (!sub->slotname)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set failover for a subscription that does not have a
slot name")));
+
+ /*
+ * Do not allow changing the failover state if the
+ * subscription is enabled. This is because the failover
+ * state of the slot on the publisher cannot be modified if
+ * the slot is currently acquired by the apply worker.
+ */
+ if (sub->enabled)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set %s for enabled subscription",
+ "failover")));
+
+ values[Anum_pg_subscription_subfailover - 1] =
+ BoolGetDatum(opts.failover);
+ replaces[Anum_pg_subscription_subfailover - 1] = true;
+ }

The first message is not consistent with the second. The "failover"
option maybe should be extracted so it won't be translated.

SUGGESTION
errmsg("cannot set %s for a subscription that does not have a slot
name", "failover")

~~~

8. AlterSubscription

+ if (!wrconn)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONNECTION_FAILURE),
+ errmsg("could not connect to the publisher: %s", err)));
+

Need to keep an eye on the patch proposed by Nisha [1] for messages
similar to this one, so in case that gets pushed this code should be
changed appropriately.

==
src/backend/replication/slot.c

9.
  * during getting changes, if the two_phase option is enabled it can skip
  * prepare because by that time start decoding point has been moved. So the
  * user will only get commit prepared.
+ * failover: If enabled, allows the slot to be synced to physical standbys so
+ * that logical replication can be resumed after failover.
  */

(same as earlier)

/physical standbys/physical standby/

~~~

10.
+ /*
+ * Do not allow users to alter slots to enable failover on the standby
+ * as we do not support sync to the cascading standby.
+ */
+ if (RecoveryInProgress() && failover)
+ 

Re: Synchronizing slots from primary to standby

2024-01-23 Thread shveta malik
On Tue, Jan 23, 2024 at 9:45 AM Peter Smith  wrote:
>
> Here are some review comments for v65-0002

Thanks Peter for the feedback. I have addressed these in v66.

>
> 4. GetStandbyFlushRecPtr
>
>  /*
> - * Returns the latest point in WAL that has been safely flushed to disk, and
> - * can be sent to the standby. This should only be called when in recovery,
> - * ie. we're streaming to a cascaded standby.
> + * Returns the latest point in WAL that has been safely flushed to disk.
> + * This should only be called when in recovery.
> + *
>
> Since it says "This should only be called when in recovery", should
> there also be a check for that (e.g. RecoveryInProgress) in the added
> Assert?

Since 'am_cascading_walsender' and 'IsLogicalSlotSyncWorker' make
sense only during recovery, I think an explicit check for
'RecoveryInProgress' is not needed here. But I can add it if others also
think it is needed.

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-01-23 Thread Ajin Cherian
On Mon, Jan 22, 2024 at 10:30 PM shveta malik 
wrote:
>
> On Mon, Jan 22, 2024 at 3:10 PM Amit Kapila 
wrote:
> >
> > minor comments on the patch:
> > ===
>
> PFA v65 addressing the comments.
>
> Addressed comments by Peter in [1], comments by Hou-San in [2],
> comments by Amit in [3] and [4]
>
> TODO:
> Analyze the issue reported by Swada-san in [5] (pt 2)
> Disallow subscription creation on standby with failover=true (as we do
> not support sync on cascading standbys)
>
> [1]:
https://www.postgresql.org/message-id/CAHut%2BPt5Pk_xJkb54oahR%2Bf9oawgfnmbpewvkZPgnRhoJ3gkYg%40mail.gmail.com
> [2]:
https://www.postgresql.org/message-id/OS0PR01MB57160C7184E17C6765AAE38294752%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> [3]:
https://www.postgresql.org/message-id/CAA4eK1JPB-zpGYTbVOP5Qp26tNQPMjDuYzNZ%2Ba9RFiN5nE1tEA%40mail.gmail.com
> [4]:
https://www.postgresql.org/message-id/CAA4eK1Jhy1-bsu6vc0%3DNja7aw5-EK_%3D101pnnuM3ATqTA8%2B%3DSg%40mail.gmail.com
> [5]:
https://www.postgresql.org/message-id/CAD21AoBgzONdt3o5mzbQ4MtqAE%3DWseiXUOq0LMqne-nWGjZBsA%40mail.gmail.com
>
>
I was doing some testing on this. What I noticed is that creating a
subscription with failover enabled takes a lot longer than creating a
subscription with failover disabled. The setup has the primary configured with
standby_slot_names, and that standby has enable_syncslot turned on.

Publisher has one publication, no tables.
subscriber:
postgres=# \timing
Timing is on.
postgres=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
host=localhost port=6972' PUBLICATION pub with (failover = true);
NOTICE:  created replication slot "sub" on publisher
CREATE SUBSCRIPTION
Time: 10011.829 ms (00:10.012)

== drop the sub

postgres=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
host=localhost port=6972' PUBLICATION pub with (failover = false);
NOTICE:  created replication slot "sub" on publisher
CREATE SUBSCRIPTION
Time: 46.317 ms

With failover=true, it takes 10011 ms while failover=false takes 46 ms.

I don't see a similar delay when creating a slot on the primary with
pg_create_logical_replication_slot() with the failover flag enabled.

Then on primary:
postgres=# SELECT 'init' FROM
pg_create_logical_replication_slot('lsub2_slot', 'pgoutput', false, false,
true);
?column?
--
init
(1 row)

Time: 36.125 ms
postgres=# SELECT 'init' FROM
pg_create_logical_replication_slot('lsub1_slot', 'pgoutput', false, false,
false);
?column?
--
init
(1 row)

Time: 53.981 ms

regards,
Ajin Cherian
Fujitsu Australia


Re: Synchronizing slots from primary to standby

2024-01-22 Thread Amit Kapila
On Mon, Jan 22, 2024 at 8:42 PM Masahiko Sawada  wrote:
>
> On Mon, Jan 22, 2024 at 9:26 PM Amit Kapila  wrote:
> >
> > >
> > > Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a
> > > non-replication connection and to execute SQL query. But neither of
> > > them are relevant with replication.
> > >
> >
> > But we are already using libpqwalreceiver to execute SQL queries via
> > tablesync worker.
>
> IIUC tablesync workers do both SQL queries and replication commands. I
> think the slotsync worker is the first background process who does
> only SQL queries in a non-replication command ( using
> libpqwalreceiver).
>

Yes, I agree, but so far we haven't seen any problem with that.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-22 Thread Peter Smith
Here are some review comments for v65-0002

==
0. General - GUCs in messages

I think it would be better for the GUC names to all be quoted. It's
not a rule (yet), but OTOH it seems to be the consensus most people
want. See [1].

This might impact the following messages:

0.1
+ ereport(ERROR,
+ errmsg("could not fetch primary_slot_name \"%s\" info from the"
+" primary server: %s", PrimarySlotName, res->err));

SUGGESTION
errmsg("could not fetch primary server slot \"%s\" info from the
primary server: %s", ...)
errhint("Check if \"primary_slot_name\" is configured correctly.");

~~~

0.2
+ if (!tuplestore_gettupleslot(res->tuplestore, true, false, tupslot))
+ elog(ERROR, "failed to fetch primary_slot_name tuple");

SUGGESTION
elog(ERROR, "failed to fetch tuple for the primary server slot
specified by \"primary_slot_name\"");

~~~

0.3
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ /* translator: second %s is a GUC variable name */
+ errdetail("The primary server slot \"%s\" specified by %s is not valid.",
+   PrimarySlotName, "primary_slot_name"));

/specified by %s/specified by \"%s\"/

~~~

0.4
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("%s must be defined.", "primary_slot_name"));

/%s must be defined./\"%s\" must be defined./

~~~

0.5
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("%s must be enabled.", "hot_standby_feedback"));

/%s must be enabled./\"%s\" must be enabled./

~~~

0.6
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("wal_level must be >= logical."));

errhint("\"wal_level\" must be >= logical."))

~~~

0.7
+ if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0)
+ ereport(ERROR,
+ /* translator: %s is a GUC variable name */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("%s must be defined.", "primary_conninfo"));

/%s must be defined./\"%s\" must be defined./

~~~

0.8
+ ereport(ERROR,
+
+ /*
+ * translator: 'dbname' is a specific option; %s is a GUC variable
+ * name
+ */
+ errmsg("exiting from slot synchronization due to bad configuration"),
+ errhint("'dbname' must be specified in %s.", "primary_conninfo"));

/must be specified in %s./must be specified in \"%s\"./

~~~

0.9
+ ereport(LOG,
+ errmsg("skipping slot synchronization"),
+ errdetail("enable_syncslot is disabled."));

errdetail("\"enable_syncslot\" is disabled."));

==
src/backend/replication/logical/slotsync.c

1.
+/* Min and Max sleep time for slot sync worker */
+#define MIN_WORKER_NAPTIME_MS  200
+#define MAX_WORKER_NAPTIME_MS  30000 /* 30s */
+
+/*
+ * Sleep time in ms between slot-sync cycles.
+ * See wait_for_slot_activity() for how we adjust this
+ */
+static long sleep_ms = MIN_WORKER_NAPTIME_MS;

These all belong together, so I think they should share a combined comment like:

SUGGESTION
The sleep time (ms) between slot-sync cycles varies dynamically
(within a MIN/MAX range) according to slot activity. See
wait_for_slot_activity() for details.

~~~

2. update_and_persist_slot

+ /* First time slot update, the function must return true */
+ if(!local_slot_update(remote_slot))
+ elog(ERROR, "failed to update slot");

Missing whitespace after 'if'

~~~

3. synchronize_one_slot

+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("exiting from slot synchronization because same"
+" name slot \"%s\" already exists on standby",
+remote_slot->name),
+ errdetail("A user-created slot with the same name as"
+   " failover slot already exists on the standby."));


3a.
/on standby/on the standby/

~

3b.
Now the errmsg is changed, the errdetail doesn't seem so useful. Isn't
it repeating pretty much the same information as in the errmsg?

==
src/backend/replication/walsender.c

4. GetStandbyFlushRecPtr

 /*
- * Returns the latest point in WAL that has been safely flushed to disk, and
- * can be sent to the standby. This should only be called when in recovery,
- * ie. we're streaming to a cascaded standby.
+ * Returns the latest point in WAL that has been safely flushed to disk.
+ * This should only be called when in recovery.
+ *

Since it says "This should only be called when in recovery", should
there also be a check for that (e.g. RecoveryInProgress) in the added
Assert?

==
src/include/replication/walreceiver.h

5.
 typedef char *(*walrcv_identify_system_fn) (WalReceiverConn *conn,
  TimeLineID *primary_tli);
+/*
+ * walrcv_get_dbname_from_conninfo_fn
+ *
+ * Returns the dbid from the primary_conninfo
+ */
+typedef char *(*walrcv_get_dbname_from_conninfo_fn) (const char *conninfo);

It looks like a blank line that previously existed has been lost.


Re: Synchronizing slots from primary to standby

2024-01-22 Thread Masahiko Sawada
On Mon, Jan 22, 2024 at 9:26 PM Amit Kapila  wrote:
>
> On Mon, Jan 22, 2024 at 5:28 PM Masahiko Sawada  wrote:
> >
> > On Sat, Jan 20, 2024 at 7:44 PM Amit Kapila  wrote:
> > >
> > > On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar  
> > > wrote:
> > > >
> > > > On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Wed, Jan 17, 2024 at 4:00 PM shveta malik  
> > > > > wrote:
> > > > > >
> > > > >
> > > > > I had some off-list discussions with Sawada-San, Hou-San, and Shveta
> > > > > on the topic of extending replication commands instead of using the
> > > > > current model where we fetch the required slot information via SQL
> > > > > using a database connection. I would like to summarize the discussion
> > > > > and would like to know the thoughts of others on this topic.
> > > > >
> > > > > In the current patch, we launch the slotsync worker on physical
> > > > > standby which connects to the specified database (currently we let
> > > > > users specify the required dbname in primary_conninfo) on the primary.
> > > > > It then fetches the required information for failover marked slots
> > > > > from the primary and also does some primitive checks on the upstream
> > > > > node via SQL (the additional checks are like whether the upstream node
> > > > > has a specified physical slot or whether the upstream node is a
> > > > > primary node or a standby node). To fetch the required information it
> > > > > uses a libpqwalreciever API which is mostly apt for this purpose as it
> > > > > supports SQL execution but for this patch, we don't need a replication
> > > > > connection, so we extend the libpqwalreciever connect API.
> > > >
> > > > What sort of extension we have done to 'libpqwalreciever'? Is it
> > > > something like by default this supports replication connections so we
> > > > have done an extension to the API so that we can provide an option
> > > > whether to create a replication connection or a normal connection?
> > > >
> > >
> > > Yeah and in the future there could be more as well. The other function
> > > added walrcv_get_dbname_from_conninfo doesn't appear to be a problem
> > > either for now.
> > >
> > > > > Now, the concerns related to this could be that users would probably
> > > > > need to change existing mechanisms/tools to update priamry_conninfo
> >
> > I'm concerned about this. In fact, a primary_conninfo value generated
> > by pg_basebackup does not work with enable_syncslot.
> >
>
> Right, but if we want can't we extend pg_basebackup to do that? It is
> just that I am not sure that it is a good idea to extend pg_basebackup
> in the first version.

Okay.

>
> > > > > and one of the alternatives proposed is to have an additional GUC like
> > > > > slot_sync_dbname. Users won't be able to drop the database this worker
> > > > > is connected to aka whatever is specified in slot_sync_dbname but as
> > > > > the user herself sets up the configuration it shouldn't be a big deal.
> > > >
> > > > Yeah for this purpose users may use template1 or so which they
> > > > generally don't plan to drop.
> > > >
> > >
> > > Using template1 has other problems like users won't be able to create
> > > a new database. See [2] (point number 2.2)
> > >
> > > >
> > > >  So in case the user wants to drop that
> > > > database user needs to turn off the slot syncing option and then it
> > > > can be done?
> > > >
> > >
> > > Right.
> >
> > If the user wants to continue using slot syncing, they need to switch
> > the database to connect. Which requires modifying primary_conninfo and
> > reloading the configuration file. Which further leads to restarting
> > the physical replication. If they use synchronous replication, it
> > means the application temporarily stops during that.
> >
>
> Yes, that would be an inconvenience but the point is we don't expect
> this to change often.
>
> > >
> > > > > Then we also discussed whether extending libpqwalreceiver's connect
> > > > > API is a good idea and whether we need to further extend it in the
> > > > > future. As far as I can see, slotsync worker's primary requirement is
> > > > > to execute SQL queries which the current API is sufficient, and don't
> > > > > see something that needs any drastic change in this API. Note that
> > > > > tablesync worker that executes SQL also uses these APIs, so we may
> > > > > need something in the future for either of those. Then finally we need
> > > > > a slotsync worker to also connect to a database to use SQL and fetch
> > > > > results.
> > > >
> > > > While looking into the patch v64-0002 I could not exactly point out
> > > > what sort of extensions are there in libpqwalreceiver.c, I just saw
> > > > one extra API for fetching the dbname from connection info?
> > > >
> > >
> > > Right, the worry was that we may need it in the future.
> >
> > Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a
> > non-replication connection and to execute SQL query. But neither of
> > them are 

Re: Synchronizing slots from primary to standby

2024-01-22 Thread Amit Kapila
On Mon, Jan 22, 2024 at 5:28 PM Masahiko Sawada  wrote:
>
> On Sat, Jan 20, 2024 at 7:44 PM Amit Kapila  wrote:
> >
> > On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar  wrote:
> > >
> > > On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Wed, Jan 17, 2024 at 4:00 PM shveta malik  
> > > > wrote:
> > > > >
> > > >
> > > > I had some off-list discussions with Sawada-San, Hou-San, and Shveta
> > > > on the topic of extending replication commands instead of using the
> > > > current model where we fetch the required slot information via SQL
> > > > using a database connection. I would like to summarize the discussion
> > > > and would like to know the thoughts of others on this topic.
> > > >
> > > > In the current patch, we launch the slotsync worker on physical
> > > > standby which connects to the specified database (currently we let
> > > > users specify the required dbname in primary_conninfo) on the primary.
> > > > It then fetches the required information for failover marked slots
> > > > from the primary and also does some primitive checks on the upstream
> > > > node via SQL (the additional checks are like whether the upstream node
> > > > has a specified physical slot or whether the upstream node is a
> > > > primary node or a standby node). To fetch the required information it
> > > > uses a libpqwalreciever API which is mostly apt for this purpose as it
> > > > supports SQL execution but for this patch, we don't need a replication
> > > > connection, so we extend the libpqwalreciever connect API.
> > >
> > > What sort of extension we have done to 'libpqwalreciever'? Is it
> > > something like by default this supports replication connections so we
> > > have done an extension to the API so that we can provide an option
> > > whether to create a replication connection or a normal connection?
> > >
> >
> > Yeah and in the future there could be more as well. The other function
> > added walrcv_get_dbname_from_conninfo doesn't appear to be a problem
> > either for now.
> >
> > > > Now, the concerns related to this could be that users would probably
> > > > need to change existing mechanisms/tools to update priamry_conninfo
>
> I'm concerned about this. In fact, a primary_conninfo value generated
> by pg_basebackup does not work with enable_syncslot.
>

Right, but if we want can't we extend pg_basebackup to do that? It is
just that I am not sure that it is a good idea to extend pg_basebackup
in the first version.

> > > > and one of the alternatives proposed is to have an additional GUC like
> > > > slot_sync_dbname. Users won't be able to drop the database this worker
> > > > is connected to aka whatever is specified in slot_sync_dbname but as
> > > > the user herself sets up the configuration it shouldn't be a big deal.
> > >
> > > Yeah for this purpose users may use template1 or so which they
> > > generally don't plan to drop.
> > >
> >
> > Using template1 has other problems like users won't be able to create
> > a new database. See [2] (point number 2.2)
> >
> > >
> > >  So in case the user wants to drop that
> > > database user needs to turn off the slot syncing option and then it
> > > can be done?
> > >
> >
> > Right.
>
> If the user wants to continue using slot syncing, they need to switch
> the database to connect. Which requires modifying primary_conninfo and
> reloading the configuration file. Which further leads to restarting
> the physical replication. If they use synchronous replication, it
> means the application temporarily stops during that.
>

Yes, that would be an inconvenience but the point is we don't expect
this to change often.

> >
> > > > Then we also discussed whether extending libpqwalreceiver's connect
> > > > API is a good idea and whether we need to further extend it in the
> > > > future. As far as I can see, slotsync worker's primary requirement is
> > > > to execute SQL queries which the current API is sufficient, and don't
> > > > see something that needs any drastic change in this API. Note that
> > > > tablesync worker that executes SQL also uses these APIs, so we may
> > > > need something in the future for either of those. Then finally we need
> > > > a slotsync worker to also connect to a database to use SQL and fetch
> > > > results.
> > >
> > > While looking into the patch v64-0002 I could not exactly point out
> > > what sort of extensions are there in libpqwalreceiver.c, I just saw
> > > one extra API for fetching the dbname from connection info?
> > >
> >
> > Right, the worry was that we may need it in the future.
>
> Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a
> non-replication connection and to execute SQL queries. But neither of
> these is relevant to replication.
>

But we are already using libpqwalreceiver to execute SQL queries via
tablesync worker.

 I'm a bit concerned that when we
> need to extend the slotsync feature in the future we will end up
> extending libpqwalreceiver, even if the new feature is 

Re: Synchronizing slots from primary to standby

2024-01-22 Thread Masahiko Sawada
On Sat, Jan 20, 2024 at 7:44 PM Amit Kapila  wrote:
>
> On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar  wrote:
> >
> > On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila  wrote:
> > >
> > > On Wed, Jan 17, 2024 at 4:00 PM shveta malik  
> > > wrote:
> > > >
> > >
> > > I had some off-list discussions with Sawada-San, Hou-San, and Shveta
> > > on the topic of extending replication commands instead of using the
> > > current model where we fetch the required slot information via SQL
> > > using a database connection. I would like to summarize the discussion
> > > and would like to know the thoughts of others on this topic.
> > >
> > > In the current patch, we launch the slotsync worker on physical
> > > standby which connects to the specified database (currently we let
> > > users specify the required dbname in primary_conninfo) on the primary.
> > > It then fetches the required information for failover marked slots
> > > from the primary and also does some primitive checks on the upstream
> > > node via SQL (the additional checks are like whether the upstream node
> > > has a specified physical slot or whether the upstream node is a
> > > primary node or a standby node). To fetch the required information it
> > > uses the libpqwalreceiver API which is mostly apt for this purpose as it
> > > supports SQL execution but for this patch, we don't need a replication
> > > connection, so we extend the libpqwalreceiver connect API.
> >
> > What sort of extension have we done to 'libpqwalreceiver'? Is it
> > something like: by default it supports replication connections, so we
> > have extended the API to provide an option for
> > whether to create a replication connection or a normal connection?
> >
>
> Yeah, and in the future there could be more as well. The other function
> added, walrcv_get_dbname_from_conninfo, doesn't appear to be a problem
> either for now.
>
> > > Now, the concerns related to this could be that users would probably
> > > need to change existing mechanisms/tools to update primary_conninfo

I'm concerned about this. In fact, a primary_conninfo value generated
by pg_basebackup does not work with enable_syncslot.

> > > and one of the alternatives proposed is to have an additional GUC like
> > > slot_sync_dbname. Users won't be able to drop the database this worker
> > > is connected to aka whatever is specified in slot_sync_dbname but as
> > > the user herself sets up the configuration it shouldn't be a big deal.
> >
> > Yeah for this purpose users may use template1 or so which they
> > generally don't plan to drop.
> >
>
> Using template1 has other problems like users won't be able to create
> a new database. See [2] (point number 2.2)
>
> >
> >  So in case the user wants to drop that
> > database, the user needs to turn off the slot syncing option first, and then it
> > can be done?
> >
>
> Right.

If the user wants to continue using slot syncing, they need to switch
the database to connect to, which requires modifying primary_conninfo and
reloading the configuration file. That in turn leads to restarting
the physical replication, and if they use synchronous replication, it
means the application temporarily stalls during that period.

>
> > > Then we also discussed whether extending libpqwalreceiver's connect
> > > API is a good idea and whether we need to further extend it in the
> > > future. As far as I can see, the slotsync worker's primary requirement is
> > > to execute SQL queries, for which the current API is sufficient, and I don't
> > > see something that needs any drastic change in this API. Note that
> > > tablesync worker that executes SQL also uses these APIs, so we may
> > > need something in the future for either of those. Then finally we need
> > > a slotsync worker to also connect to a database to use SQL and fetch
> > > results.
> >
> > While looking into the patch v64-0002 I could not exactly point out
> > what sort of extensions are there in libpqwalreceiver.c, I just saw
> > one extra API for fetching the dbname from connection info?
> >
>
> Right, the worry was that we may need it in the future.

Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a
non-replication connection and to execute SQL queries. But neither of
these is relevant to replication. I'm a bit concerned that when we
need to extend the slotsync feature in the future we will end up
extending libpqwalreceiver, even if the new feature is not
relevant to replication either.

>
> > > Now, let us consider if we extend the replication commands like
> > > READ_REPLICATION_SLOT and or introduce a new set of replication
> > > commands to fetch the required information then we don't need a DB
> > > connection with primary or a connection in slotsync worker. As per my
> > > current understanding, it is quite doable but I think we will slowly
> > > go in the direction of making replication commands something like SQL
> > > because today we need to extend it to fetch all slots info that have
> > > failover marked as true, the 

Re: Synchronizing slots from primary to standby

2024-01-22 Thread shveta malik
On Fri, Jan 19, 2024 at 11:48 AM Peter Smith  wrote:
>
> Here are some review comments for patch v63-0003.

Thanks Peter. I have addressed all in v65.

>
> 4b.
> It was a bit different when there were ERRORs, but now they are LOGs;
> somehow it seems wrong for this function to say what the *caller* will
> do. Maybe you can rewrite all the errmsg so they don't say "skipping"
> but just say "bad configuration for slot synchronization"
>
> If valid is false then you can LOG "skipping" at the caller...

I have made this change, but now in the log file we see three lines like
below; does that seem apt? Or was the earlier version better, where we got the
info in two lines?

[34416] LOG:  bad configuration for slot synchronization
[34416] HINT:  hot_standby_feedback must be enabled.
[34416] LOG:  skipping slot synchronization
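
For what it's worth, a minimal sketch of the calls that would produce the
above output (message text as in the excerpt; the exact call sites are an
assumption, not from the patch):

/* in the validity check -- emits the LOG + HINT pair */
ereport(LOG,
        (errmsg("bad configuration for slot synchronization"),
         errhint("hot_standby_feedback must be enabled.")));

/* in the caller, once the check has returned false */
ereport(LOG,
        (errmsg("skipping slot synchronization")));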

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-01-22 Thread shveta malik
On Mon, Jan 22, 2024 at 12:28 PM Amit Kapila  wrote:
>
> On Fri, Jan 19, 2024 at 3:55 PM shveta malik  wrote:
> >
> > On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada  
> > wrote:
> > >
> > >
> > > Thank you for updating the patch. I have some comments:
> > >
> > > ---
> > > +    latestWalEnd = GetWalRcvLatestWalEnd();
> > > +    if (remote_slot->confirmed_lsn > latestWalEnd)
> > > +    {
> > > +        elog(ERROR, "exiting from slot synchronization as the received slot sync"
> > > +             " LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
> > > +             LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> > > +             remote_slot->name,
> > > +             LSN_FORMAT_ARGS(latestWalEnd));
> > > +    }
> > >
> > > IIUC GetWalRcvLatestWalEnd() returns walrcv->latestWalEnd, which is
> > > typically the primary server's flush position, not the LSN up to which
> > > the walreceiver has received/flushed.
> >
> > yes. I think it makes more sense to use something which actually tells
> > the flushed position. I gave it a try by replacing GetWalRcvLatestWalEnd()
> > with GetWalRcvFlushRecPtr() but I see a problem here. Let's say I have
> > enabled the slot-sync feature in a running standby, in that case we
> > are all good (flushedUpto is the same as actual flush-position
> > indicated by LogstreamResult.Flush). But if I restart standby, then I
> > observed that the startup process sets flushedUpto to some value 'x'
> > (see [1]) while when the wal-receiver starts, it sets
> > 'LogstreamResult.Flush' to another value (see [2]) which is always
> > greater than 'x'. And we do not update flushedUpto with the
> > 'LogstreamResult.Flush' value in walreceiver until we actually do an
> > operation on primary. Performing a data change on primary sends WALs
> > to standby which then hits XLogWalRcvFlush() and updates flushedUpto
> > same as LogstreamResult.Flush. Until then we have a situation where
> > slots received on standby are ahead of flushedUpto and thus slotsync
> > worker keeps on erroring out. I am yet to find out why flushedUpto is
> > set to a lower value than 'LogstreamResult.Flush' at the start of
> > standby.  Or maybe I am using the wrong function
> > GetWalRcvFlushRecPtr() and should be using something else instead?
> >
>
> Can we think of using GetStandbyFlushRecPtr()? We probably need to
> expose this function, if this works for the required purpose.

I think we can. For the record, the problem with using flushedUpto
(or GetWalRcvFlushRecPtr()) directly is that it is not set to the
latest flushed position immediately after startup.  It points to some
prior location (perhaps a segment or page start) after startup until
some data is flushed next, which then updates it to the latest flushed
position, so we cannot use it directly.  GetStandbyFlushRecPtr(),
OTOH, takes care of this, i.e. it returns the correct flushed location at any
point in time. I have changed v65 to use this one.
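
For illustration, the check could then take roughly this shape (a sketch only;
it assumes GetStandbyFlushRecPtr() is exported with its walsender.c signature
and reuses the field names from the snippet quoted above):

XLogRecPtr  standby_flush;
TimeLineID  tli;

standby_flush = GetStandbyFlushRecPtr(&tli);
if (remote_slot->confirmed_lsn > standby_flush)
    elog(ERROR, "skipping sync of slot \"%s\": confirmed_flush %X/%X is ahead of the standby flush position %X/%X",
         remote_slot->name,
         LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
         LSN_FORMAT_ARGS(standby_flush));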

thanks
Shveta




Re: Synchronizing slots from primary to standby

2024-01-22 Thread Amit Kapila
On Mon, Jan 22, 2024 at 1:11 PM Bertrand Drouvot
 wrote:
>
> Hi,
>

Thanks for sharing the feedback.

>
> > Then we also discussed whether extending libpqwalreceiver's connect
> > API is a good idea and whether we need to further extend it in the
> > future. As far as I can see, slotsync worker's primary requirement is
> > to execute SQL queries which the current API is sufficient, and don't
> > see something that needs any drastic change in this API. Note that
> > tablesync worker that executes SQL also uses these APIs, so we may
> > need something in the future for either of those. Then finally we need
> > a slotsync worker to also connect to a database to use SQL and fetch
> > results.
> >
>
> On my side, the nit-level concerns about using libpqrcv_connect /
> walrcv_connect are:
>
> - cosmetic: the "rcv" does not really align with the sync slot worker
>

But note that the same API is used by the apply worker as well. One
can think of this as a connection used to receive WAL or slot info.

minor comments on the patch:
===
1.
+ /* First time slot update, the function must return true */
+ Assert(local_slot_update(remote_slot));

Isn't moving this code into an Assert in update_and_persist_slot() wrong?
Won't it make this function call a no-op in non-assert builds?
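
A sketch of the usual way to keep the call while still asserting its result
(the variable name is illustrative):

bool        slot_updated PG_USED_FOR_ASSERTS_ONLY;

/* the update still runs in non-assert builds; only the check drops out */
slot_updated = local_slot_update(remote_slot);
Assert(slot_updated);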

2.
+ ereport(LOG,
+ errmsg("newly locally created slot \"%s\" is sync-ready now",

I think even without 'locally' in the above LOG message, it is clear.

3.
+/*
+ * walrcv_get_dbinfo_for_failover_slots_fn
+ *
+ * Run LIST_DBID_FOR_FAILOVER_SLOTS on primary server to get the
+ * list of unique DBIDs for failover logical slots
+ */
+typedef List *(*walrcv_get_dbinfo_for_failover_slots_fn) (WalReceiverConn *conn);

This looks like a leftover from the previous version of the patch.

-- 
With Regards,
Amit Kapila.




RE: Synchronizing slots from primary to standby

2024-01-22 Thread Zhijie Hou (Fujitsu)
On Monday, January 22, 2024 11:36 AM shveta malik  
wrote:

Hi,

> On Fri, Jan 19, 2024 at 4:18 PM shveta malik  wrote:
> >
> > PFA v64.
> 
> V64 fails to apply to HEAD due to a recent commit. Rebased it. PFA v64_2. It 
> has
> no new changes.

I noticed few things while analyzing the patch.

1.
sleep_ms = Min(sleep_ms * 2, MAX_WORKER_NAPTIME_MS);

The initial value of sleep_ms is 0 (the default value for a static variable), which
means it will never be advanced by this expression. We should initialize sleep_ms
to a positive number.
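
A minimal sketch of the suggested fix (the starting value and macro name are
illustrative, not from the patch):

#define MIN_WORKER_NAPTIME_MS   200

static long sleep_ms = MIN_WORKER_NAPTIME_MS;   /* positive, so doubling can advance */

/* inside the worker loop, back off up to the existing cap */
sleep_ms = Min(sleep_ms * 2, MAX_WORKER_NAPTIME_MS);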

2.
/* Wait a bit, we don't expect to have to wait long */
rc = WaitLatch(MyLatch,
               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
               10L, WAIT_EVENT_BGWORKER_SHUTDOWN);

The slotsync worker is not a bgworker anymore after the 0003 patch, so a new wait
event is needed, I think.

3.
slot->effective_catalog_xmin = xmin_horizon;

The assignment is also needed in local_slot_update() to make 
ReplicationSlotsComputeRequiredXmin work.
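
A sketch of what that could look like inside local_slot_update() (field and
function names as used elsewhere in the thread; the locking detail is an
assumption):

SpinLockAcquire(&slot->mutex);
slot->data.catalog_xmin = remote_slot->catalog_xmin;
slot->effective_catalog_xmin = remote_slot->catalog_xmin;
SpinLockRelease(&slot->mutex);

/* so the new xmin is taken into account right away */
ReplicationSlotsComputeRequiredXmin(false);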

Best Regards,
Hou zj


Re: Synchronizing slots from primary to standby

2024-01-21 Thread Bertrand Drouvot
Hi,

On Fri, Jan 19, 2024 at 05:23:53PM +0530, Amit Kapila wrote:
> On Wed, Jan 17, 2024 at 4:00 PM shveta malik  wrote:
> >
> 
> Now, the concerns related to this could be that users would probably
> need to change existing mechanisms/tools to update primary_conninfo

Yeah, for the ones that want the sync slot feature.

> and one of the alternatives proposed is to have an additional GUC like
> slot_sync_dbname. Users won't be able to drop the database this worker
> is connected to aka whatever is specified in slot_sync_dbname but as
> the user herself sets up the configuration it shouldn't be a big deal.

Same point of view here.

> Then we also discussed whether extending libpqwalreceiver's connect
> API is a good idea and whether we need to further extend it in the
> future. As far as I can see, the slotsync worker's primary requirement is
> to execute SQL queries, for which the current API is sufficient, and I don't
> see something that needs any drastic change in this API. Note that
> tablesync worker that executes SQL also uses these APIs, so we may
> need something in the future for either of those. Then finally we need
> a slotsync worker to also connect to a database to use SQL and fetch
> results.
>

On my side, the nit-level concerns about using libpqrcv_connect / walrcv_connect
are:

- cosmetic: the "rcv" does not really align with the sync slot worker
- we're using a WalReceiverConn, while a PGconn should suffice. From what I can
see the "overhead" is (1 byte + 7 bytes hole + 8 bytes). I don't think that's a
big deal even if we switch to a multi sync slot worker design later on.

Those have already been discussed in [1] and I'm fine with them.

> Now, let us consider if we extend the replication commands like
> READ_REPLICATION_SLOT and or introduce a new set of replication
> commands to fetch the required information then we don't need a DB
> connection with primary or a connection in slotsync worker. As per my
> current understanding, it is quite doable but I think we will slowly
> go in the direction of making replication commands something like SQL
> because today we need to extend it to fetch all slots info that have
> failover marked as true, the existence of a particular replication,
> etc. Then tomorrow, if we want to extend this work to have multiple
> slotsync workers say workers perdb then we have to extend the
> replication command to fetch per-database failover marked slots. To
> me, it sounds more like we are slowly adding SQL-like features to
> replication commands.

Agree. Also it seems to me that extending the replication commands is more like
a one-way door change.

> Apart from this when we are reading per-db replication slots without
> connecting to a database, we probably need some additional protection
> mechanism so that the database won't get dropped.
> 
> Considering all this it seems that for now probably extending
> replication commands can simplify a few things like mentioned above
> but using SQL's with db-connection is more extendable.

I'd vote for using a SQL db-connection (like we are doing currently).
It seems more extendable and more of a two-way door (as compared to extending the
replication commands): I think it still gives us the flexibility to switch to
extending the replication commands if we want to in the future.

[1]: 
https://www.postgresql.org/message-id/ZZe6sok7IWmhKReU%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-01-21 Thread Amit Kapila
On Fri, Jan 19, 2024 at 3:55 PM shveta malik  wrote:
>
> On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada  
> wrote:
> >
> >
> > Thank you for updating the patch. I have some comments:
> >
> > ---
> > +    latestWalEnd = GetWalRcvLatestWalEnd();
> > +    if (remote_slot->confirmed_lsn > latestWalEnd)
> > +    {
> > +        elog(ERROR, "exiting from slot synchronization as the received slot sync"
> > +             " LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
> > +             LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> > +             remote_slot->name,
> > +             LSN_FORMAT_ARGS(latestWalEnd));
> > +    }
> >
> > IIUC GetWalRcvLatestWalEnd() returns walrcv->latestWalEnd, which is
> > typically the primary server's flush position, not the LSN up to which
> > the walreceiver has received/flushed.
>
> yes. I think it makes more sense to use something which actually tells
> the flushed position. I gave it a try by replacing GetWalRcvLatestWalEnd()
> with GetWalRcvFlushRecPtr() but I see a problem here. Let's say I have
> enabled the slot-sync feature in a running standby, in that case we
> are all good (flushedUpto is the same as actual flush-position
> indicated by LogstreamResult.Flush). But if I restart standby, then I
> observed that the startup process sets flushedUpto to some value 'x'
> (see [1]) while when the wal-receiver starts, it sets
> 'LogstreamResult.Flush' to another value (see [2]) which is always
> greater than 'x'. And we do not update flushedUpto with the
> 'LogstreamResult.Flush' value in walreceiver until we actually do an
> operation on primary. Performing a data change on primary sends WALs
> to standby which then hits XLogWalRcvFlush() and updates flushedUpto
> same as LogstreamResult.Flush. Until then we have a situation where
> slots received on standby are ahead of flushedUpto and thus slotsync
> worker keeps on erroring out. I am yet to find out why flushedUpto is
> set to a lower value than 'LogstreamResult.Flush' at the start of
> standby.  Or maybe I am using the wrong function
> GetWalRcvFlushRecPtr() and should be using something else instead?
>

Can we think of using GetStandbyFlushRecPtr()? We probably need to
expose this function, if this works for the required purpose.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-21 Thread Amit Kapila
On Fri, Jan 19, 2024 at 5:23 PM Amit Kapila  wrote:
>
> On Wed, Jan 17, 2024 at 4:00 PM shveta malik  wrote:
> >
>
> I had some off-list discussions with Sawada-San, Hou-San, and Shveta
> on the topic of extending replication commands instead of using the
> current model where we fetch the required slot information via SQL
> using a database connection. I would like to summarize the discussion
> and would like to know the thoughts of others on this topic.
>
> In the current patch, we launch the slotsync worker on physical
> standby which connects to the specified database (currently we let
> users specify the required dbname in primary_conninfo) on the primary.
> It then fetches the required information for failover marked slots
> from the primary and also does some primitive checks on the upstream
> node via SQL (the additional checks are like whether the upstream node
> has a specified physical slot or whether the upstream node is a
> primary node or a standby node). To fetch the required information it
> uses the libpqwalreceiver API which is mostly apt for this purpose as it
> supports SQL execution but for this patch, we don't need a replication
> connection, so we extend the libpqwalreceiver connect API.
>
> Now, the concerns related to this could be that users would probably
> need to change existing mechanisms/tools to update primary_conninfo
> and one of the alternatives proposed is to have an additional GUC like
> slot_sync_dbname. Users won't be able to drop the database this worker
> is connected to aka whatever is specified in slot_sync_dbname but as
> the user herself sets up the configuration it shouldn't be a big deal.
> Then we also discussed whether extending libpqwalreceiver's connect
> API is a good idea and whether we need to further extend it in the
> future. As far as I can see, the slotsync worker's primary requirement is
> to execute SQL queries, for which the current API is sufficient, and I don't
> see something that needs any drastic change in this API. Note that
> tablesync worker that executes SQL also uses these APIs, so we may
> need something in the future for either of those. Then finally we need
> a slotsync worker to also connect to a database to use SQL and fetch
> results.
>
> Now, let us consider if we extend the replication commands like
> READ_REPLICATION_SLOT and or introduce a new set of replication
> commands to fetch the required information then we don't need a DB
> connection with primary or a connection in slotsync worker. As per my
> current understanding, it is quite doable but I think we will slowly
> go in the direction of making replication commands something like SQL
> because today we need to extend it to fetch all slots info that have
> failover marked as true, the existence of a particular replication,
> etc. Then tomorrow, if we want to extend this work to have multiple
> slotsync workers say workers perdb then we have to extend the
> replication command to fetch per-database failover marked slots. To
> me, it sounds more like we are slowly adding SQL-like features to
> replication commands.
>
> Apart from this when we are reading per-db replication slots without
> connecting to a database, we probably need some additional protection
> mechanism so that the database won't get dropped.
>
> Considering all this it seems that for now probably extending
> replication commands can simplify a few things like mentioned above
> but using SQL's with db-connection is more extendable.
>
> Thoughts?
>

Bertrand, and others, do you have an opinion on this matter?

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-20 Thread Amit Kapila
On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar  wrote:
>
> On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila  wrote:
> >
> > On Wed, Jan 17, 2024 at 4:00 PM shveta malik  wrote:
> > >
> >
> > I had some off-list discussions with Sawada-San, Hou-San, and Shveta
> > on the topic of extending replication commands instead of using the
> > current model where we fetch the required slot information via SQL
> > using a database connection. I would like to summarize the discussion
> > and would like to know the thoughts of others on this topic.
> >
> > In the current patch, we launch the slotsync worker on physical
> > standby which connects to the specified database (currently we let
> > users specify the required dbname in primary_conninfo) on the primary.
> > It then fetches the required information for failover marked slots
> > from the primary and also does some primitive checks on the upstream
> > node via SQL (the additional checks are like whether the upstream node
> > has a specified physical slot or whether the upstream node is a
> > primary node or a standby node). To fetch the required information it
> > uses the libpqwalreceiver API which is mostly apt for this purpose as it
> > supports SQL execution but for this patch, we don't need a replication
> > connection, so we extend the libpqwalreceiver connect API.
>
> What sort of extension have we done to 'libpqwalreceiver'? Is it
> something like: by default it supports replication connections, so we
> have extended the API to provide an option for
> whether to create a replication connection or a normal connection?
>

Yeah, and in the future there could be more as well. The other function
added, walrcv_get_dbname_from_conninfo, doesn't appear to be a problem
either for now.

> > Now, the concerns related to this could be that users would probably
> > need to change existing mechanisms/tools to update primary_conninfo
> > and one of the alternatives proposed is to have an additional GUC like
> > slot_sync_dbname. Users won't be able to drop the database this worker
> > is connected to aka whatever is specified in slot_sync_dbname but as
> > the user herself sets up the configuration it shouldn't be a big deal.
>
> Yeah for this purpose users may use template1 or so which they
> generally don't plan to drop.
>

Using template1 has other problems like users won't be able to create
a new database. See [2] (point number 2.2)

>
>  So in case the user wants to drop that
> database, the user needs to turn off the slot syncing option first, and then it
> can be done?
>

Right.

> > Then we also discussed whether extending libpqwalreceiver's connect
> > API is a good idea and whether we need to further extend it in the
> > future. As far as I can see, the slotsync worker's primary requirement is
> > to execute SQL queries, for which the current API is sufficient, and I don't
> > see something that needs any drastic change in this API. Note that
> > tablesync worker that executes SQL also uses these APIs, so we may
> > need something in the future for either of those. Then finally we need
> > a slotsync worker to also connect to a database to use SQL and fetch
> > results.
>
> While looking into the patch v64-0002 I could not exactly point out
> what sort of extensions are there in libpqwalreceiver.c, I just saw
> one extra API for fetching the dbname from connection info?
>

Right, the worry was that we may need it in the future.

> > Now, let us consider if we extend the replication commands like
> > READ_REPLICATION_SLOT and or introduce a new set of replication
> > commands to fetch the required information then we don't need a DB
> > connection with primary or a connection in slotsync worker. As per my
> > current understanding, it is quite doable but I think we will slowly
> > go in the direction of making replication commands something like SQL
> > because today we need to extend it to fetch all slots info that have
> > failover marked as true, the existence of a particular replication,
> > etc. Then tomorrow, if we want to extend this work to have multiple
> > slotsync workers say workers perdb then we have to extend the
> > replication command to fetch per-database failover marked slots. To
> > me, it sounds more like we are slowly adding SQL-like features to
> > replication commands.
> >
> > Apart from this when we are reading per-db replication slots without
> > connecting to a database, we probably need some additional protection
> > mechanism so that the database won't get dropped.
>
> Something like locking the database only while fetching the slots?
>

Possible, but can we lock the database from an auxiliary process?

> > Considering all this it seems that for now probably extending
> > replication commands can simplify a few things like mentioned above
> > but using SQL's with db-connection is more extendable.
>
> Even I have similar thoughts.
>

Thanks.

[1] - 
