On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila <[email protected]> wrote: > > On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy > <[email protected]> wrote: > > > > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <[email protected]> wrote: > > > > > > But that's not quite compatible with what Alvaro [2] had written long > > > back ("... the only acquisitions that would log messages are those in > > > StartReplication and StartLogicalReplication."). > > > > > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is > > > called by more places than you care to log from. > > > > I refreshed my thoughts for this patch and I think it's enough if > > walsenders alone log messages when slots become active and inactive. > > How about something like the attached v11 patch? > > > > + * This function is currently used only in walsender. > + */ > +void > +ReplicationSlotAcquireAndLog(const char *name, bool nowait) > > BTW, is the reason for using it only in walsender is that it is a > background process and it is not very apparent whether the slot is > created, and for foreground processes, it is a bit clear when the > command is executed. > > Can you please tell me the use case of this additional message?
Replication slot acquisitions and releases by backends say when
running pg_replication_slot_advance or pg_logical_slot_get_changes or
pg_drop_replication_slot or pg_create_{physical,
logical}_replication_slot are transient unlike walsenders which
comparatively hold slots for longer durations. Therefore, I've added
them only for walsenders. These messages help to know the lifetime of
a replication slot - one can know how long a streaming standby or
logical subscriber is down, IOW, how long a replication slot is
inactive in production. For instance, the time between released and
acquired slots in the below messages is the inactive replication slot
duration.
2023-11-13 11:06:34.338 UTC [470262] LOG: acquired physical
replication slot "sb_repl_slot"
2023-11-13 11:06:34.338 UTC [470262] STATEMENT: START_REPLICATION
SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
2023-11-13 11:09:24.918 UTC [470262] LOG: released physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] LOG: acquired physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] STATEMENT: START_REPLICATION
SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
> If so, the other alternative is to either use a
> parameter to the existing function or directly use am_walsender flag
> to distinguish when to print the message in acquire/release calls.
Done that way. PSA v12.
> A few other minor comments:
> 1.
> + Causes each replication command and related activity to be logged in
> + the server log.
>
> Can we be bit more specific by changing to something like: "Causes
> each replication command and slot acquisition/release to be logged in
> the server log."
Done.
> 2.
> + ereport(log_replication_commands ? LOG : DEBUG1,
> + (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
>
> It seems PID and process name is quite unlike what we print in other
> similar messages. For example, see below messages when we start
> replication via pub/sub :
>
> We can get the PID from the log line as for other logs and I don't see
> the process name printed anywhere else.
There was a comment upthread to have PID printed, but I agree to be
consistent and changed the messages to be: acquired physical/logical
replication slot "foo" and released physical/logical replication slot
"foo".
PSA v12 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
v12-0001-Emit-messages-when-replication-slots-become-acti.patch
Description: Binary data
