Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-21 Thread Amit Kapila
On Sat, Nov 18, 2023 at 4:54 PM Amit Kapila  wrote:
>
> On Thu, Nov 16, 2023 at 6:09 PM Bharath Rupireddy
>  wrote:
> > PSA v18 patch.
> >
>
> LGTM. I'll push this early next week unless there are further
> suggestions or comments.
>

Pushed.

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-18 Thread Amit Kapila
On Thu, Nov 16, 2023 at 6:09 PM Bharath Rupireddy
 wrote:
>
> On Thu, Nov 16, 2023 at 4:01 PM Alvaro Herrera  
> wrote:
> >
> > On 2023-Nov-16, Peter Smith wrote:
> >
> > > I searched HEAD code and did not find any "translator:" comments for
> > > just ordinary slot name substitutions like these; AFAICT that comment
> > > is not necessary anymore.
> >
> > True.  Lose that.
>
> Removed.
>
> > The rationale I have is to consider whether a translator looking at the
> > original message message in isolation is going to understand what the %s
> > means.  If it's possible to tell what it is without having to go read
> > the source code that leads to the message, then you don't need a
> > "translator:" comment.  Otherwise you do.
>
> Agree. I think it's easy for one to guess the slot name follows "
> replication slot %s", so I removed the translator message.
>
> > > SUGGESTION (#1a and #1b)
> > >
> > > ereport(log_replication_commands ? LOG : DEBUG1,
> > > errmsg(SlotIsLogical(s)
> > >? "acquired logical replication slot \"%s\""
> > >: "acquired physical replication slot \"%s\"",
> > >NameStr(s->data.name)));
> >
> > The bad thing about this is that gettext() is not going to pick up these
> > strings into the translation catalog.  You could fix that by adding
> > gettext_noop() calls around them:
> >
> >  ereport(log_replication_commands ? LOG : DEBUG1,
> >  errmsg(SlotIsLogical(s)
> > ? gettext_noop("acquired logical replication slot \"%s\"")
> > : gettext_noop("acquired physical replication slot \"%s\""),
> > NameStr(s->data.name)));
> >
> > but at that point it's not clear that it's really better than putting
> > the ternary in the outer scope.
>
> I retained wrapping messages in errmsg("...").
>
> > You can verify this by doing "make update-po" and then searching for the
> > messages in postgres.pot.
>
> Translation gives me [1] with v18 patch
>
> PSA v18 patch.
>

LGTM. I'll push this early next week unless there are further
suggestions or comments.

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-16 Thread Bharath Rupireddy
On Thu, Nov 16, 2023 at 4:01 PM Alvaro Herrera  wrote:
>
> On 2023-Nov-16, Peter Smith wrote:
>
> > I searched HEAD code and did not find any "translator:" comments for
> > just ordinary slot name substitutions like these; AFAICT that comment
> > is not necessary anymore.
>
> True.  Lose that.

Removed.

> The rationale I have is to consider whether a translator looking at the
> original message message in isolation is going to understand what the %s
> means.  If it's possible to tell what it is without having to go read
> the source code that leads to the message, then you don't need a
> "translator:" comment.  Otherwise you do.

Agree. I think it's easy for one to guess the slot name follows "
replication slot %s", so I removed the translator message.

> > SUGGESTION (#1a and #1b)
> >
> > ereport(log_replication_commands ? LOG : DEBUG1,
> > errmsg(SlotIsLogical(s)
> >? "acquired logical replication slot \"%s\""
> >: "acquired physical replication slot \"%s\"",
> >NameStr(s->data.name)));
>
> The bad thing about this is that gettext() is not going to pick up these
> strings into the translation catalog.  You could fix that by adding
> gettext_noop() calls around them:
>
>  ereport(log_replication_commands ? LOG : DEBUG1,
>  errmsg(SlotIsLogical(s)
> ? gettext_noop("acquired logical replication slot \"%s\"")
> : gettext_noop("acquired physical replication slot \"%s\""),
> NameStr(s->data.name)));
>
> but at that point it's not clear that it's really better than putting
> the ternary in the outer scope.

I retained wrapping messages in errmsg("...").

> You can verify this by doing "make update-po" and then searching for the
> messages in postgres.pot.

Translation gives me [1] with v18 patch

PSA v18 patch.

[1]
#: replication/slot.c:545
#, c-format
msgid "acquired logical replication slot \"%s\""
msgstr ""

#: replication/slot.c:547
#, c-format
msgid "acquired physical replication slot \"%s\""
msgstr ""

#: replication/slot.c:622
#, c-format
msgid "released logical replication slot \"%s\""
msgstr ""

#: replication/slot.c:624
#, c-format
msgid "released physical replication slot \"%s\""
msgstr ""

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v18-0001-Log-messages-for-replication-slot-acquisition-an.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-16 Thread Alvaro Herrera
On 2023-Nov-16, Peter Smith wrote:

> I searched HEAD code and did not find any "translator:" comments for
> just ordinary slot name substitutions like these; AFAICT that comment
> is not necessary anymore.

True.  Lose that.

The rationale I have is to consider whether a translator looking at the
original message message in isolation is going to understand what the %s
means.  If it's possible to tell what it is without having to go read
the source code that leads to the message, then you don't need a
"translator:" comment.  Otherwise you do.

You also need to assume the translator is not stupid, but that seems an
OK assumption.

> SUGGESTION (#1a and #1b)
> 
> ereport(log_replication_commands ? LOG : DEBUG1,
> errmsg(SlotIsLogical(s)
>? "acquired logical replication slot \"%s\""
>: "acquired physical replication slot \"%s\"",
>NameStr(s->data.name)));

The bad thing about this is that gettext() is not going to pick up these
strings into the translation catalog.  You could fix that by adding
gettext_noop() calls around them:

 ereport(log_replication_commands ? LOG : DEBUG1,
 errmsg(SlotIsLogical(s)
? gettext_noop("acquired logical replication slot \"%s\"")
: gettext_noop("acquired physical replication slot \"%s\""),
NameStr(s->data.name)));

but at that point it's not clear that it's really better than putting
the ternary in the outer scope.

You can verify this by doing "make update-po" and then searching for the
messages in postgres.pot.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Cada quien es cada cual y baja las escaleras como quiere" (JMSerrat)




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Peter Smith
On Thu, Nov 16, 2023 at 12:18 PM Amit Kapila  wrote:
>
> On Thu, Nov 16, 2023 at 3:48 AM Peter Smith  wrote:
> >
> > ~
> >
> > SUGGESTION (#1a and #1b)
> >
> > ereport(log_replication_commands ? LOG : DEBUG1,
> > errmsg(SlotIsLogical(s)
> >? "acquired logical replication slot \"%s\""
> >: "acquired physical replication slot \"%s\"",
> >NameStr(s->data.name)));
> >
> > ~~~
> >
>
> Personally, I prefer the way Bharath had in his patch. Did you see any
> preferred way in the existing code?

Not really. I think the errmsg combined with ternary is not so common.
I couldn't find many examples, so I wouldn't try to claim anything is
a "preferred" way

There are some existing examples, like Bharath had:

ereport(NOTICE,
(errcode(ERRCODE_DUPLICATE_OBJECT),
 collencoding == -1
 ? errmsg("collation \"%s\" already exists, skipping",
  collname)
 : errmsg("collation \"%s\" for encoding \"%s\" already
exists, skipping",
  collname, pg_encoding_to_char(collencoding;

OTOH, when there are different numbers of substitution parameters in
each of the errmsg like that, you don't have much choice but to do it
that way.

I am fine with whatever is chosen -- I was only offering an alternative.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Peter Smith
On Thu, Nov 16, 2023 at 12:36 PM Amit Kapila  wrote:
>
> On Wed, Nov 15, 2023 at 3:58 PM Alvaro Herrera  
> wrote:
> >
> > Translation-wise, this doesn't work, because you're building a string.
> > There's no reason to think that the words "logical" and "physical"
> > should stay untranslated; the message would make no sense, or at least
> > would be very ugly.
> >
> > You should do something like
> >
> > if (am_walsender)
> > {
> > ereport(log_replication_commands ? LOG : DEBUG1,
> > SlotIsLogical(s) ? errmsg("acquired logical replication 
> > slot \"%s\"", NameStr(s->data.name)) :
> > errmsg("acquired physical replication slot \"%s\"", 
> > NameStr(s->data.name)));
> > }
> >
> > (Obviously, lose the "translator:" comments since they are unnecessary)
> >
> >
> > If you really want to keep the "logical"/"physical" word untranslated,
> > you need to split it out of the sentence somehow.  But it would be
> > really horrible IMO.  Like
> >
> > errmsg("acquired replication slot \"%s\" of type \"%s\"",
> >NameStr(s->data.name), SlotIsLogical(s) ? "logical" : "physical")
> >
>
> Thanks for the suggestion. I would like to clarify on this a bit. What
> do exactly mean by splitting out of the sentence? For example, in one
> of the existing messages:
>
> ereport(LOG,
> /* translator: %s is SIGKILL or SIGABRT */
> (errmsg("issuing %s to recalcitrant children",
> send_abort_for_kill ? "SIGABRT" : "SIGKILL")));
>
> Do here words SIGABRT/SIGKILL remain untranslated due to the
> translator's comment? I thought this was similar to the message being
> proposed but seems like this message construction follows translation
> rules better.
>

IIUC, that example is different because "SIGABRT" / "SIGKILL" are not
real words, so you don't want the translator to attempt to translate
them.You want them to appear in the message as-is.

OTOH in this patch "logical" and "physical" are just normal English
words that should be translated as part of the original message.
e.g. like in these similar messages:
- msgid "database \"%s\" is used by an active logical replication slot"
- msgstr "la base de données « %s » est utilisée par un slot de
réplication logique actif"

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Amit Kapila
On Wed, Nov 15, 2023 at 3:58 PM Alvaro Herrera  wrote:
>
> Translation-wise, this doesn't work, because you're building a string.
> There's no reason to think that the words "logical" and "physical"
> should stay untranslated; the message would make no sense, or at least
> would be very ugly.
>
> You should do something like
>
> if (am_walsender)
> {
> ereport(log_replication_commands ? LOG : DEBUG1,
> SlotIsLogical(s) ? errmsg("acquired logical replication slot 
> \"%s\"", NameStr(s->data.name)) :
> errmsg("acquired physical replication slot \"%s\"", 
> NameStr(s->data.name)));
> }
>
> (Obviously, lose the "translator:" comments since they are unnecessary)
>
>
> If you really want to keep the "logical"/"physical" word untranslated,
> you need to split it out of the sentence somehow.  But it would be
> really horrible IMO.  Like
>
> errmsg("acquired replication slot \"%s\" of type \"%s\"",
>NameStr(s->data.name), SlotIsLogical(s) ? "logical" : "physical")
>

Thanks for the suggestion. I would like to clarify on this a bit. What
do exactly mean by splitting out of the sentence? For example, in one
of the existing messages:

ereport(LOG,
/* translator: %s is SIGKILL or SIGABRT */
(errmsg("issuing %s to recalcitrant children",
send_abort_for_kill ? "SIGABRT" : "SIGKILL")));

Do here words SIGABRT/SIGKILL remain untranslated due to the
translator's comment? I thought this was similar to the message being
proposed but seems like this message construction follows translation
rules better.

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Amit Kapila
On Thu, Nov 16, 2023 at 3:48 AM Peter Smith  wrote:
>
> ~
>
> SUGGESTION (#1a and #1b)
>
> ereport(log_replication_commands ? LOG : DEBUG1,
> errmsg(SlotIsLogical(s)
>? "acquired logical replication slot \"%s\""
>: "acquired physical replication slot \"%s\"",
>NameStr(s->data.name)));
>
> ~~~
>

Personally, I prefer the way Bharath had in his patch. Did you see any
preferred way in the existing code?

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Peter Smith
Some minor comments for v17-0001.

==

1.
+   ereport(log_replication_commands ? LOG : DEBUG1,
+   SlotIsLogical(s)
+   /* translator: %s is name of the replication slot */
+   ? errmsg("acquired logical replication slot \"%s\"",
+NameStr(s->data.name))
+   : errmsg("acquired physical replication slot \"%s\"",
+NameStr(s->data.name)));

1a.
FWIW, if the ternary was inside the errmsg, there would be less code
duplication.

~

1b.
I searched HEAD code and did not find any "translator:" comments for
just ordinary slot name substitutions like these; AFAICT that comment
is not necessary anymore.

~

SUGGESTION (#1a and #1b)

ereport(log_replication_commands ? LOG : DEBUG1,
errmsg(SlotIsLogical(s)
   ? "acquired logical replication slot \"%s\""
   : "acquired physical replication slot \"%s\"",
   NameStr(s->data.name)));

~~~

2.
Ditto for the other ereport.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Bharath Rupireddy
On Wed, Nov 15, 2023 at 3:58 PM Alvaro Herrera  wrote:
>
> Translation-wise, this doesn't work, because you're building a string.
> There's no reason to think that the words "logical" and "physical"
> should stay untranslated; the message would make no sense, or at least
> would be very ugly.
>
> You should do something like
>
> if (am_walsender)
> {
> ereport(log_replication_commands ? LOG : DEBUG1,
> SlotIsLogical(s) ? errmsg("acquired logical replication slot 
> \"%s\"", NameStr(s->data.name)) :
> errmsg("acquired physical replication slot \"%s\"", 
> NameStr(s->data.name)));
> }

This seems better, so done that way.

> (Obviously, lose the "translator:" comments since they are unnecessary)

The translator message now indicates that the remaining %s denotes the
replication slot name.

PSA v17 patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v17-0001-Log-messages-for-replication-slot-acquisition-an.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Alvaro Herrera
Translation-wise, this doesn't work, because you're building a string.
There's no reason to think that the words "logical" and "physical"
should stay untranslated; the message would make no sense, or at least
would be very ugly.

You should do something like

if (am_walsender)
{
ereport(log_replication_commands ? LOG : DEBUG1,
SlotIsLogical(s) ? errmsg("acquired logical replication slot 
\"%s\"", NameStr(s->data.name)) :
errmsg("acquired physical replication slot \"%s\"", 
NameStr(s->data.name)));
}

(Obviously, lose the "translator:" comments since they are unnecessary)


If you really want to keep the "logical"/"physical" word untranslated,
you need to split it out of the sentence somehow.  But it would be
really horrible IMO.  Like

errmsg("acquired replication slot \"%s\" of type \"%s\"",
   NameStr(s->data.name), SlotIsLogical(s) ? "logical" : "physical")

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-15 Thread Amit Kapila
On Wed, Nov 15, 2023 at 11:00 AM Bharath Rupireddy
 wrote:
>
> PSA v15 patch.
>

The patch looks good to me. I have slightly modified the translator
message and commit message in the attached. I'll push this tomorrow
unless there are any comments.

-- 
With Regards,
Amit Kapila.


v16-0001-Log-messages-for-replication-slot-acquisition-an.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-14 Thread Peter Smith
Patch v15-0001 LGTM.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-14 Thread Bharath Rupireddy
On Wed, Nov 15, 2023 at 9:49 AM Amit Kapila  wrote:
>
> On Wed, Nov 15, 2023 at 4:40 AM Peter Smith  wrote:
> >
> > Thanks for addressing my previous comments. Patch v14-0001 looks good
> > to me, except I have one question:
> >
> > The patch uses errmsg_internal() for the logging, but I noticed the
> > only other code using GUC 'log_replication_commands' has errmsg()
> > instead of errmsg_internal(). Isn't it better to be consistent with
> > the existing code?
> >
>
> I agree that we should errmsg here. If we read the description of
> errmsg_internal() [1], it is recommended to be used for "cannot
> happen" cases where we don't want to spend translation effort which is
> not the case here.

I chose not to translate the newly added messages as they are only
written to server logs not sent to the client. However, I've changed
to errmsg, after looking at the errmsg_internal docs.

> Also, similar to the below message, we should add a
> comment for a translator.
>
> ereport(LOG,
> /* translator: %s is SIGKILL or SIGABRT */
> (errmsg("issuing %s to recalcitrant children",
> send_abort_for_kill ? "SIGABRT" : "SIGKILL")));

Added.

> Another minor comment:
> +Causes each replication command and walsender
> +process replication slot acquisition/release to be logged in the 
> server
> +log.
>
> Isn't it better to use process's instead of process in the above sentence?

Changed.

PSA v15 patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v15-0001-Log-messages-for-replication-slot-acquisition-an.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-14 Thread Amit Kapila
On Wed, Nov 15, 2023 at 4:40 AM Peter Smith  wrote:
>
> Thanks for addressing my previous comments. Patch v14-0001 looks good
> to me, except I have one question:
>
> The patch uses errmsg_internal() for the logging, but I noticed the
> only other code using GUC 'log_replication_commands' has errmsg()
> instead of errmsg_internal(). Isn't it better to be consistent with
> the existing code?
>

I agree that we should errmsg here. If we read the description of
errmsg_internal() [1], it is recommended to be used for "cannot
happen" cases where we don't want to spend translation effort which is
not the case here. Also, similar to the below message, we should add a
comment for a translator.

ereport(LOG,
/* translator: %s is SIGKILL or SIGABRT */
(errmsg("issuing %s to recalcitrant children",
send_abort_for_kill ? "SIGABRT" : "SIGKILL")));

Another minor comment:
+Causes each replication command and walsender
+process replication slot acquisition/release to be logged in the server
+log.

Isn't it better to use process's instead of process in the above sentence?

[1] -https://www.postgresql.org/docs/devel/error-message-reporting.html

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-14 Thread Peter Smith
Hi,

Thanks for addressing my previous comments. Patch v14-0001 looks good
to me, except I have one question:

The patch uses errmsg_internal() for the logging, but I noticed the
only other code using GUC 'log_replication_commands' has errmsg()
instead of errmsg_internal(). Isn't it better to be consistent with
the existing code?

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-13 Thread Bharath Rupireddy
On Tue, Nov 14, 2023 at 4:50 AM Peter Smith  wrote:
>
> Here are some review comments for patch v13-0001.

Thanks.

> ==
> doc/src/sgml/config.sgml
>
> 1.
>
> Should that also mention about walsender?
>
> e.g.
> "and slot acquisition/release" ==> "and walsender
> slot acquisition/release"

Changed.

> 2a.
> Instead of calling SlotIsLogical() and then again calling
> SlotIsPhysical(), it might be better to assign this one time to a
> local variable.
>
> 2b.
> IMO it is better to continue using variable 's' here instead of
> 'MyReplicationSlot'. Code is not only shorter but is also consistent
> with the rest of the function which never uses MyReplicationSlot, even
> in the places where it could have.
>
> SUGGESTION (for #2a and #2b)
> is_logical = SlotIsLogical(s);
> if (is_logical)
>   pgstat_acquire_replslot(s);
>
> if (am_walsender)
>   ereport(log_replication_commands ? LOG : DEBUG1,
> errmsg_internal("acquired %s replication slot \"%s\"",
>   is_logical ? "logical" : "physical", NameStr(s->data.name)));

Use of a separate variable isn't good IMO, I used SlotIsLogical(s); directly.

> 3a.
> Notice 'MyReplicationSlot' is already assigned to the local 'slot'
> variable, so IMO it is better if this new code also uses that 'slot'
> variable for consistency with the rest of the function.
>
> 3b.
> Consider flipping the flag to be 'is_logical' instead of
> 'is_physical', so the ereport substitution will match the other
> ReplicationSlotAcquirecode suggested above (#2a).
>
> SUGGESTION (For #3a and #3b)
> if (am_walsender)
> {
>   slotname = pstrdup(NameStr(slot->data.name));
>   is_logical = SlotIsLogical(slot);
> }

Done.

PSA v14 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 87d04d040fc048af2065f34c6c5887c84d4e15da Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 14 Nov 2023 06:53:02 +
Subject: [PATCH v14] Log messages for replication slot acquisition and release

This commit adds log messages (at LOG level when
log_replication_commands is set, otherwise at DEBUG1 level) when
walsenders acquire and release replication slots. These messages
help to know the lifetime of a replication slot - one can know how
long a streaming standby or logical subscriber or replication slot
consumer is down. In other words, one can know how long a
replication slot is inactive - the time between released and
acquired slot messages is the inactive replication slot duration.
These messages will be extremely useful on production servers to
debug and analyze inactive replication slot issues.

Note that these messages are emitted only for walsenders when
replication slot is acquired and released, but not for backends.
This is because walsenders are the ones that typically hold
replication slots for longer durations unlike backends which hold
replication slots for executing some replication related
functions.
---
 doc/src/sgml/config.sgml   |  7 ---
 src/backend/replication/slot.c | 24 
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fc35a46e5e..d87320141a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7510,9 +7510,10 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
   

-Causes each replication command to be logged in the server log.
-See  for more information about
-replication command. The default value is off.
+Causes each replication command and walsender
+process replication slot acquisition/release to be logged in the server
+log. See  for more information
+about replication command. The default value is off.
 Only superusers and users with the appropriate SET
 privilege can change this setting.

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 781aa43cc4..5d6ba0aef1 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -537,6 +537,12 @@ retry:
 	 */
 	if (SlotIsLogical(s))
 		pgstat_acquire_replslot(s);
+
+	if (am_walsender)
+		ereport(log_replication_commands ? LOG : DEBUG1,
+errmsg_internal("acquired %s replication slot \"%s\"",
+SlotIsLogical(s) ? "logical" : "physical",
+NameStr(s->data.name)));
 }
 
 /*
@@ -549,9 +555,17 @@ void
 ReplicationSlotRelease(void)
 {
 	ReplicationSlot *slot = MyReplicationSlot;
+	char	   *slotname = NULL;	/* keep compiler quiet */
+	bool		is_logical = false; /* keep compiler quiet */
 
 	Assert(slot != NULL && slot->active_pid != 0);
 
+	if (am_walsender)
+	{
+		slotname = pstrdup(NameStr(slot->data.name));
+		is_logical = SlotIsLogical(slot);
+	}
+
 	if (slot->data.persistency == RS_EPHEMERAL)
 	{
 		/*
@@ -596,6 +610,16 @@ ReplicationSlotRelease(void)
 	MyProc->statusFlags &= ~PROC_IN_LOGICAL_DECODING;
 	ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
 	

Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-13 Thread Peter Smith
Here are some review comments for patch v13-0001.

==
doc/src/sgml/config.sgml

1.

-Causes each replication command to be logged in the server log.
-See  for more information about
-replication command. The default value is off.
-Only superusers and users with the appropriate SET
-privilege can change this setting.
+Causes each replication command and slot acquisition/release to be
+logged in the server log. See 
+for more information about replication command. The default value is
+off. Only superusers and users with the appropriate
+SET privilege can change this setting.


Should that also mention about walsender?

e.g.
"and slot acquisition/release" ==> "and walsender
slot acquisition/release"

==
src/backend/replication/slot.c

2. ReplicationSlotAcquire

  if (SlotIsLogical(s))
  pgstat_acquire_replslot(s);
+
+ if (am_walsender)
+ ereport(log_replication_commands ? LOG : DEBUG1,
+ errmsg_internal("acquired %s replication slot \"%s\"",
+ SlotIsPhysical(MyReplicationSlot) ? "physical" : "logical",
+ NameStr(MyReplicationSlot->data.name)));

2a.
Instead of calling SlotIsLogical() and then again calling
SlotIsPhysical(), it might be better to assign this one time to a
local variable.

~

2b.
IMO it is better to continue using variable 's' here instead of
'MyReplicationSlot'. Code is not only shorter but is also consistent
with the rest of the function which never uses MyReplicationSlot, even
in the places where it could have.

~

SUGGESTION (for #2a and #2b)
is_logical = SlotIsLogical(s);
if (is_logical)
  pgstat_acquire_replslot(s);

if (am_walsender)
  ereport(log_replication_commands ? LOG : DEBUG1,
errmsg_internal("acquired %s replication slot \"%s\"",
  is_logical ? "logical" : "physical", NameStr(s->data.name)));

~~~

3. ReplicationSlotRelease

 ReplicationSlotRelease(void)
 {
  ReplicationSlot *slot = MyReplicationSlot;
+ char*slotname = NULL; /* keep compiler quiet */
+ bool is_physical = false; /* keep compiler quiet */

  Assert(slot != NULL && slot->active_pid != 0);

+ if (am_walsender)
+ {
+ slotname = pstrdup(NameStr(MyReplicationSlot->data.name));
+ is_physical = SlotIsPhysical(MyReplicationSlot);
+ }
+

3a.
Notice 'MyReplicationSlot' is already assigned to the local 'slot'
variable, so IMO it is better if this new code also uses that 'slot'
variable for consistency with the rest of the function.

~

3b.
Consider flipping the flag to be 'is_logical' instead of
'is_physical', so the ereport substitution will match the other
ReplicationSlotAcquirecode suggested above (#2a).

~

SUGGESTION (For #3a and #3b)
if (am_walsender)
{
  slotname = pstrdup(NameStr(slot->data.name));
  is_logical = SlotIsLogical(slot);
}

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-13 Thread Bharath Rupireddy
On Mon, Nov 13, 2023 at 5:43 PM Bharath Rupireddy
 wrote:
>
> On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila  wrote:
> >
> > On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
> >  wrote:
> > >
> > > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith  wrote:
> > > >
> > > > But that's not quite compatible with what Alvaro [2] had written long
> > > > back ("... the only acquisitions that would log messages are those in
> > > > StartReplication and StartLogicalReplication.").
> > > >
> > > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > > > called by more places than you care to log from.
> > >
> > > I refreshed my thoughts for this patch and I think it's enough if
> > > walsenders alone log messages when slots become active and inactive.
> > > How about something like the attached v11 patch?
> > >
> >
> > + * This function is currently used only in walsender.
> > + */
> > +void
> > +ReplicationSlotAcquireAndLog(const char *name, bool nowait)
> >
> > BTW, is the reason for using it only in walsender is that it is a
> > background process and it is not very apparent whether the slot is
> > created, and for foreground processes, it is a bit clear when the
> > command is executed.
> >
> > Can you please tell me the use case of this additional message?
>
> Replication slot acquisitions and releases by backends say when
> running pg_replication_slot_advance or pg_logical_slot_get_changes or
> pg_drop_replication_slot or pg_create_{physical,
> logical}_replication_slot are transient unlike walsenders which
> comparatively hold slots for longer durations. Therefore, I've added
> them only for walsenders. These messages help to know the lifetime of
> a replication slot - one can know how long a streaming standby or
> logical subscriber is down, IOW, how long a replication slot is
> inactive in production. For instance, the time between released and
> acquired slots in the below messages is the inactive replication slot
> duration.
>
> 2023-11-13 11:06:34.338 UTC [470262] LOG:  acquired physical
> replication slot "sb_repl_slot"
> 2023-11-13 11:06:34.338 UTC [470262] STATEMENT:  START_REPLICATION
> SLOT "sb_repl_slot" 0/300 TIMELINE 1
> 2023-11-13 11:09:24.918 UTC [470262] LOG:  released physical
> replication slot "sb_repl_slot"
> 2023-11-13 12:01:40.530 UTC [470967] LOG:  acquired physical
> replication slot "sb_repl_slot"
> 2023-11-13 12:01:40.530 UTC [470967] STATEMENT:  START_REPLICATION
> SLOT "sb_repl_slot" 0/300 TIMELINE 1
>
> > If so, the other alternative is to either use a
> > parameter to the existing function or directly use am_walsender flag
> > to distinguish when to print the message in acquire/release calls.
>
> Done that way. PSA v12.
>
> > A few other minor comments:
> > 1.
> > +Causes each replication command and related activity to be logged 
> > in
> > +the server log.
> >
> > Can we be bit more specific by changing to something like: "Causes
> > each replication command and slot acquisition/release to be logged in
> > the server log."
>
> Done.
>
> > 2.
> > + ereport(log_replication_commands ? LOG : DEBUG1,
> > + (errmsg("walsender process with PID %d acquired %s replication slot 
> > \"%s\"",
> >
> > It seems PID and process name is quite unlike what we print in other
> > similar messages. For example, see below messages when we start
> > replication via pub/sub :
> >
> > We can get the PID from the log line as for other logs and I don't see
> > the process name printed anywhere else.
>
> There was a comment upthread to have PID printed, but I agree to be
> consistent and changed the messages to be: acquired physical/logical
> replication slot "foo" and released physical/logical replication slot
> "foo".
>
> PSA v12 patch.

Compiler isn't happy with v12
https://cirrus-ci.com/task/5543061376204800?logs=gcc_warning#L405. PSA
v13 patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v13-0001-Log-messages-for-replication-slot-acquisition-an.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-13 Thread Bharath Rupireddy
On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila  wrote:
>
> On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
>  wrote:
> >
> > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith  wrote:
> > >
> > > But that's not quite compatible with what Alvaro [2] had written long
> > > back ("... the only acquisitions that would log messages are those in
> > > StartReplication and StartLogicalReplication.").
> > >
> > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > > called by more places than you care to log from.
> >
> > I refreshed my thoughts for this patch and I think it's enough if
> > walsenders alone log messages when slots become active and inactive.
> > How about something like the attached v11 patch?
> >
>
> + * This function is currently used only in walsender.
> + */
> +void
> +ReplicationSlotAcquireAndLog(const char *name, bool nowait)
>
> BTW, is the reason for using it only in walsender is that it is a
> background process and it is not very apparent whether the slot is
> created, and for foreground processes, it is a bit clear when the
> command is executed.
>
> Can you please tell me the use case of this additional message?

Replication slot acquisitions and releases by backends say when
running pg_replication_slot_advance or pg_logical_slot_get_changes or
pg_drop_replication_slot or pg_create_{physical,
logical}_replication_slot are transient unlike walsenders which
comparatively hold slots for longer durations. Therefore, I've added
them only for walsenders. These messages help to know the lifetime of
a replication slot - one can know how long a streaming standby or
logical subscriber is down, IOW, how long a replication slot is
inactive in production. For instance, the time between released and
acquired slots in the below messages is the inactive replication slot
duration.

2023-11-13 11:06:34.338 UTC [470262] LOG:  acquired physical
replication slot "sb_repl_slot"
2023-11-13 11:06:34.338 UTC [470262] STATEMENT:  START_REPLICATION
SLOT "sb_repl_slot" 0/300 TIMELINE 1
2023-11-13 11:09:24.918 UTC [470262] LOG:  released physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] LOG:  acquired physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] STATEMENT:  START_REPLICATION
SLOT "sb_repl_slot" 0/300 TIMELINE 1

> If so, the other alternative is to either use a
> parameter to the existing function or directly use am_walsender flag
> to distinguish when to print the message in acquire/release calls.

Done that way. PSA v12.

> A few other minor comments:
> 1.
> +Causes each replication command and related activity to be logged in
> +the server log.
>
> Can we be bit more specific by changing to something like: "Causes
> each replication command and slot acquisition/release to be logged in
> the server log."

Done.

> 2.
> + ereport(log_replication_commands ? LOG : DEBUG1,
> + (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
>
> It seems PID and process name is quite unlike what we print in other
> similar messages. For example, see below messages when we start
> replication via pub/sub :
>
> We can get the PID from the log line as for other logs and I don't see
> the process name printed anywhere else.

There was a comment upthread to have PID printed, but I agree to be
consistent and changed the messages to be: acquired physical/logical
replication slot "foo" and released physical/logical replication slot
"foo".

PSA v12 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v12-0001-Emit-messages-when-replication-slots-become-acti.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-05 Thread Amit Kapila
On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
 wrote:
>
> On Thu, Nov 2, 2023 at 7:19 AM Peter Smith  wrote:
> >
> > But that's not quite compatible with what Alvaro [2] had written long
> > back ("... the only acquisitions that would log messages are those in
> > StartReplication and StartLogicalReplication.").
> >
> > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > called by more places than you care to log from.
>
> I refreshed my thoughts for this patch and I think it's enough if
> walsenders alone log messages when slots become active and inactive.
> How about something like the attached v11 patch?
>

+ * This function is currently used only in walsender.
+ */
+void
+ReplicationSlotAcquireAndLog(const char *name, bool nowait)

BTW, is the reason for using it only in walsender is that it is a
background process and it is not very apparent whether the slot is
created, and for foreground processes, it is a bit clear when the
command is executed. If so, the other alternative is to either use a
parameter to the existing function or directly use am_walsender flag
to distinguish when to print the message in acquire/release calls.

Can you please tell me the use case of this additional message?

A few other minor comments:
1.
+Causes each replication command and related activity to be logged in
+the server log.

Can we be bit more specific by changing to something like: "Causes
each replication command and slot acquisition/release to be logged in
the server log."

2.
+ ereport(log_replication_commands ? LOG : DEBUG1,
+ (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",

It seems PID and process name is quite unlike what we print in other
similar messages. For example, see below messages when we start
replication via pub/sub :

2023-11-06 08:41:57.867 IST [24400] LOG:  received replication
command: CREATE_REPLICATION_SLOT "sub1" LOGICAL pgoutput (SNAPSHOT
'nothing')
2023-11-06 08:41:57.867 IST [24400] STATEMENT:
CREATE_REPLICATION_SLOT "sub1" LOGICAL pgoutput (SNAPSHOT 'nothing')
...
...
2023-11-06 08:41:57.993 IST [22332] LOG:  walsender process with PID
22332 acquired logical replication slot "sub1"
2023-11-06 08:41:57.993 IST [22332] STATEMENT:  START_REPLICATION SLOT
"sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names
'"pub1"')
...
...
2023-11-06 08:41:58.015 IST [22332] LOG:  starting logical decoding
for slot "sub1"
2023-11-06 08:41:58.015 IST [22332] DETAIL:  Streaming transactions
committing after 0/1522730, reading WAL from 0/15226F8.
2023-11-06 08:41:58.015 IST [22332] STATEMENT:  START_REPLICATION SLOT
"sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names
'"pub1"')

We can get the PID from the log line as for other logs and I don't see
the process name printed anywhere else.

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-04 Thread Bharath Rupireddy
On Thu, Nov 2, 2023 at 7:19 AM Peter Smith  wrote:
>
> But that's not quite compatible with what Alvaro [2] had written long
> back ("... the only acquisitions that would log messages are those in
> StartReplication and StartLogicalReplication.").
>
> In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> called by more places than you care to log from.

I refreshed my thoughts for this patch and I think it's enough if
walsenders alone log messages when slots become active and inactive.
How about something like the attached v11 patch?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From eed7c96d9cab98a576728792c2cb111ae61f930f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 4 Nov 2023 21:59:27 +
Subject: [PATCH v11] Emit messages when replication slots become active and
 inactive

These logs will be extremely useful on production servers to debug
and analyze inactive replication slot issues.
---
 doc/src/sgml/config.sgml| 10 +++---
 src/backend/replication/slot.c  | 54 -
 src/backend/replication/walsender.c | 10 +++---
 src/include/replication/slot.h  |  2 ++
 4 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bd70ff2e4b..6d14a94cf4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7510,11 +7510,11 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
   

-Causes each replication command to be logged in the server log.
-See  for more information about
-replication command. The default value is off.
-Only superusers and users with the appropriate SET
-privilege can change this setting.
+Causes each replication command and related activity to be logged in
+the server log. See  for more
+information about replication command. The default value is
+off. Only superusers and users with the appropriate
+SET privilege can change this setting.

   
  
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 99823df3c7..dfb564b04f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -181,7 +181,12 @@ ReplicationSlotShmemExit(int code, Datum arg)
 {
 	/* Make sure active replication slots are released */
 	if (MyReplicationSlot != NULL)
-		ReplicationSlotRelease();
+	{
+		if (am_walsender)
+			ReplicationSlotReleaseAndLog();
+		else
+			ReplicationSlotRelease();
+	}
 
 	/* Also cleanup all the temporary slots. */
 	ReplicationSlotCleanup();
@@ -539,6 +544,26 @@ retry:
 		pgstat_acquire_replslot(s);
 }
 
+/*
+ * A wrapper around ReplicationSlotAcquire() but logs a message when
+ * appropriate.
+ *
+ * This function is currently used only in walsender.
+ */
+void
+ReplicationSlotAcquireAndLog(const char *name, bool nowait)
+{
+	Assert(am_walsender);
+
+	ReplicationSlotAcquire(name, nowait);
+
+	ereport(log_replication_commands ? LOG : DEBUG1,
+			(errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
+	MyProcPid,
+	SlotIsPhysical(MyReplicationSlot) ? "physical" : "logical",
+	NameStr(MyReplicationSlot->data.name;
+}
+
 /*
  * Release the replication slot that this backend considers to own.
  *
@@ -598,6 +623,33 @@ ReplicationSlotRelease(void)
 	LWLockRelease(ProcArrayLock);
 }
 
+/*
+ * A wrapper around ReplicationSlotAcquire() but logs a message when
+ * appropriate.
+ *
+ * This function is currently used only in walsender.
+ */
+void
+ReplicationSlotReleaseAndLog(void)
+{
+	char	   *slotname;
+	bool		is_physical;
+
+	Assert(am_walsender);
+
+	slotname = pstrdup(NameStr(MyReplicationSlot->data.name));
+	is_physical = SlotIsPhysical(MyReplicationSlot);
+
+	ReplicationSlotRelease();
+
+	ereport(log_replication_commands ? LOG : DEBUG1,
+			(errmsg("walsender process with PID %d released %s replication slot \"%s\"",
+	MyProcPid,
+	is_physical ? "physical" : "logical", slotname)));
+
+	pfree(slotname);
+}
+
 /*
  * Cleanup all temporary slots created in current session.
  */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e250b0567e..2a1c5056c0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -322,7 +322,7 @@ WalSndErrorCleanup(void)
 		wal_segment_close(xlogreader);
 
 	if (MyReplicationSlot != NULL)
-		ReplicationSlotRelease();
+		ReplicationSlotReleaseAndLog();
 
 	ReplicationSlotCleanup();
 
@@ -697,7 +697,7 @@ StartReplication(StartReplicationCmd *cmd)
 
 	if (cmd->slotname)
 	{
-		ReplicationSlotAcquire(cmd->slotname, true);
+		ReplicationSlotAcquireAndLog(cmd->slotname, true);
 		if (SlotIsLogical(MyReplicationSlot))
 			ereport(ERROR,
 	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
@@ -843,7 +843,7 @@ StartReplication(StartReplicationCmd *cmd)
 	}
 
 	if (cmd->slotname)

Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-11-01 Thread Peter Smith
On Wed, Nov 1, 2023 at 2:03 PM Amit Kapila  wrote:
>
> On Mon, Mar 27, 2023 at 11:08 AM Bharath Rupireddy
>  wrote:
> >
> > The v9 patch was failing because I was using MyReplicationSlot after
> > it got reset by slot release, attached v10 patch fixes it.
> >
>
> + *
> + * Note: use LogReplicationSlotAcquire() if needed, to log a message after
> + * acquiring the replication slot.
>   */
>  void
>  ReplicationSlotAcquire(const char *name, bool nowait)
> @@ -542,6 +554,9 @@ retry:
>
> When does it need to be logged? For example, recently, we added one
> more slot acquisition/release call in
> binary_upgrade_logical_slot_has_caught_up(); it is not clear from the
> comments whether we need to LOG it or not. I guess at some place like
> atop LogReplicationSlotAcquire() we should document in a bit more
> specific way as to when is this expected to be called.
>

I agree. Just saying "if needed" in those function comments doesn't
help with knowing how to judge when logging is needed or not.

~

Looking back at the thread history it seems the function comment was
added after Euler [1] suggested ("... you should add a comment at the
top of ReplicationSlotAcquire() and ReplicationSlotRelease() functions
saying that LogReplicationSlotAquired() and
LogReplicationSlotReleased() functions should be called respectively
after it.")

But that's not quite compatible with what Alvaro [2] had written long
back ("... the only acquisitions that would log messages are those in
StartReplication and StartLogicalReplication.").

In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
called by more places than you care to log from.

~

Adding a better explanatory comment than "if needed" will be good, and
maybe that is all that is necessary. I'm not sure.

OTOH, if you have to explain that logging is only wanted for a couple
of scenarios, then it raises some doubts about the usefulness of
having a common function in the first place. I had the same doubts
back in March [3] ("I am not sure for the *current* code if the
encapsulation is worth the trouble or not.").

==
[1] Euler - 
https://www.postgresql.org/message-id/c42d5634-ca9b-49a7-85cd-9eff9feb33b4%40app.fastmail.com
[2] Alvaro - 
https://www.postgresql.org/message-id/202204291032.qfvyuqxkjnjw%40alvherre.pgsql
[3] Peter - 
https://www.postgresql.org/message-id/CAHut%2BPu6Knwooc_NckMxszGrAJnytgpMadtoJ-OA-SFWT%2BGFYw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Austalia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-10-31 Thread Amit Kapila
On Mon, Mar 27, 2023 at 11:08 AM Bharath Rupireddy
 wrote:
>
> The v9 patch was failing because I was using MyReplicationSlot after
> it got reset by slot release, attached v10 patch fixes it.
>

+ *
+ * Note: use LogReplicationSlotAcquire() if needed, to log a message after
+ * acquiring the replication slot.
  */
 void
 ReplicationSlotAcquire(const char *name, bool nowait)
@@ -542,6 +554,9 @@ retry:

When does it need to be logged? For example, recently, we added one
more slot acquisition/release call in
binary_upgrade_logical_slot_has_caught_up(); it is not clear from the
comments whether we need to LOG it or not. I guess at some place like
atop LogReplicationSlotAcquire() we should document in a bit more
specific way as to when is this expected to be called.

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-26 Thread Bharath Rupireddy
On Fri, Mar 24, 2023 at 3:05 PM Alvaro Herrera  wrote:
>
> On 2023-Mar-23, Greg Stark wrote:
>
> > On Thu, 23 Mar 2023 at 23:30, Bharath Rupireddy
> >  wrote:
> > >
> > > > + ereport(log_replication_commands ? LOG : DEBUG3,
> > > > + (errmsg("acquired physical replication slot \"%s\"",
> > > > + slotname)));
> >
> > So this is just a bit of bike-shedding but I don't feel like these log
> > messages really meet the standard we set for our logging. Like what
> > did the acquiring? What does "acquired" actually mean for a
> > replication slot? Is there not any meta information about the
> > acquisition that can give more context to the reader to make this
> > message more meaningful?
> >
> > I would expect a log message like this to say, I dunno, something like
> > "physical replication slot \"%s\" acquired by streaming TCP connection
> > to 192.168.0.1:999 at LSN ... with xxxMB of logs to read"
>
> Hmm, I don't disagree with your argument in principle, but I think this
> proposal is going too far.  I think stating the PID is more than
> sufficient.

Do you mean to have something like "physical/logical replication slot
\"%s\" is released/acquired by PID %d", MyProcPid? If yes, the
log_line_prefix already contains PID right? Or do we want to cover the
cases when someone changes log_line_prefix to not contain PID?

> And I don't think we need this patch to go great lengths to
> explain what acquisition is, either; I mean, maybe that's a good thing
> to have, but then that's a different patch.
>
> > I even would be wondering if the other end shouldn't also be logging a
> > corresponding log and we shouldn't be going out of our way to ensure
> > there's enough information to match them up and presenting them in a
> > way that makes that easy.
>
> Hmm, you should be able to match things using the connection
> information.  I don't think the slot acquisition operation in itself is
> that important.

Yeah, the intention of the patch is to track the patterns of slot
acquisitions and releases to aid analysis. Of course, this information
alone may not help but when matched with others in the logs, it will.

The v9 patch was failing because I was using MyReplicationSlot after
it got reset by slot release, attached v10 patch fixes it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 626dc27e81142cddee656c0140f4570e952e728a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 24 Mar 2023 06:11:52 +
Subject: [PATCH v10] Add LOG messages when replication slots become active and
 inactive

These logs will be extremely useful on production servers to debug
and analyze inactive replication slot issues.
---
 doc/src/sgml/config.sgml|  7 ++--
 src/backend/replication/slot.c  | 51 +
 src/backend/replication/walsender.c | 17 ++
 src/backend/utils/misc/guc_tables.c |  2 +-
 src/include/replication/slot.h  |  2 ++
 5 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 481f93cea1..f9b0d3c9a9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7593,9 +7593,10 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
   

-Causes each replication command to be logged in the server log.
-See  for more information about
-replication command. The default value is off.
+Causes each replication command and related activity to be logged in
+the server log. See  for more
+information about replication command. The default value is
+off.
 Only superusers and users with the appropriate SET
 privilege can change this setting.

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 2293c0c6fc..2caa8a3532 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -180,7 +180,16 @@ ReplicationSlotShmemExit(int code, Datum arg)
 {
 	/* Make sure active replication slots are released */
 	if (MyReplicationSlot != NULL)
+	{
+		bool	is_physical;
+		char	*slotname;
+
+		is_physical = SlotIsPhysical(MyReplicationSlot);
+		slotname = pstrdup(NameStr(MyReplicationSlot->data.name));
 		ReplicationSlotRelease();
+		LogReplicationSlotRelease(is_physical, slotname);
+		pfree(slotname);
+	}
 
 	/* Also cleanup all the temporary slots. */
 	ReplicationSlotCleanup();
@@ -445,6 +454,9 @@ ReplicationSlotName(int index, Name name)
  *
  * An error is raised if nowait is true and the slot is currently in use. If
  * nowait is false, we sleep until the slot is released by the owning process.
+ *
+ * Note: use LogReplicationSlotAcquire() if needed, to log a message after
+ * acquiring the replication slot.
  */
 void
 ReplicationSlotAcquire(const char *name, bool nowait)
@@ -542,6 +554,9 @@ retry:
  *
  * This or another backend can re-acquire the slot later.
  * Resources this slot requires will be 

Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-24 Thread Alvaro Herrera
On 2023-Mar-23, Greg Stark wrote:

> On Thu, 23 Mar 2023 at 23:30, Bharath Rupireddy
>  wrote:
> >
> > > + ereport(log_replication_commands ? LOG : DEBUG3,
> > > + (errmsg("acquired physical replication slot \"%s\"",
> > > + slotname)));
> 
> So this is just a bit of bike-shedding but I don't feel like these log
> messages really meet the standard we set for our logging. Like what
> did the acquiring? What does "acquired" actually mean for a
> replication slot? Is there not any meta information about the
> acquisition that can give more context to the reader to make this
> message more meaningful?
> 
> I would expect a log message like this to say, I dunno, something like
> "physical replication slot \"%s\" acquired by streaming TCP connection
> to 192.168.0.1:999 at LSN ... with xxxMB of logs to read"

Hmm, I don't disagree with your argument in principle, but I think this
proposal is going too far.  I think stating the PID is more than
sufficient.  And I don't think we need this patch to go great lengths to
explain what acquisition is, either; I mean, maybe that's a good thing
to have, but then that's a different patch.

> I even would be wondering if the other end shouldn't also be logging a
> corresponding log and we shouldn't be going out of our way to ensure
> there's enough information to match them up and presenting them in a
> way that makes that easy.

Hmm, you should be able to match things using the connection
information.  I don't think the slot acquisition operation in itself is
that important.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Use it up, wear it out, make it do, or do without"




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Greg Stark
On Thu, 23 Mar 2023 at 23:30, Bharath Rupireddy
 wrote:
>
> > + ereport(log_replication_commands ? LOG : DEBUG3,
> > + (errmsg("acquired physical replication slot \"%s\"",
> > + slotname)));

So this is just a bit of bike-shedding but I don't feel like these log
messages really meet the standard we set for our logging. Like what
did the acquiring? What does "acquired" actually mean for a
replication slot? Is there not any meta information about the
acquisition that can give more context to the reader to make this
message more meaningful?

I would expect a log message like this to say, I dunno, something like
"physical replication slot \"%s\" acquired by streaming TCP connection
to 192.168.0.1:999 at LSN ... with xxxMB of logs to read"

I even would be wondering if the other end shouldn't also be logging a
corresponding log and we shouldn't be going out of our way to ensure
there's enough information to match them up and presenting them in a
way that makes that easy.

-- 
greg




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Bharath Rupireddy
On Fri, Mar 24, 2023 at 3:11 AM Euler Taveira  wrote:
>
> If you are adding separate functions as suggested, you should add a comment at
> the top of ReplicationSlotAcquire() and ReplicationSlotRelease() functions
> saying that LogReplicationSlotAquired() and LogReplicationSlotReleased()
> functions should be called  respectively after it.

Done.

> My suggestion is that the functions should have the same name with a "Log"
> prefix. On of them has a typo "Aquired" in its name. Hence,
> LogReplicationSlotAcquire() and LogReplicationSlotRelease() as names. It is
> easier to find if someone is grepping by the origin function.

Done.

> I prefer a sentence that includes a verb.
>
>   physical replication slot \"%s\" is acquired
>   logical replication slot \"%s\" is released

Hm, changed for now. But I'll leave it to the committer's discretion.

> Isn't the PID important for this use case? If so, of course, you can rely on
> log_line_prefix (%p) but if the PID is crucial for an investigation then it
> should also be included in the message.

On Fri, Mar 24, 2023 at 3:10 AM Peter Smith  wrote:
>
> Patch v8 applied OK, and builds/renders the HTML docs OK, and passes
> the regression and subscription TAP tests OK.
>
> Here are some minor comments:
>
> 1.
> + ereport(log_replication_commands ? LOG : DEBUG3,
> + (errmsg("acquired physical replication slot \"%s\"",
> + slotname)));
>
> AFAIK those extra parentheses wrapping the "errmsg" part are not necessary.

Done

> 2.
> extern void LogReplicationSlotAquired(bool is_physical, char *slotname);
> extern void LogReplicationSlotReleased(bool is_physical, char *slotname);
>
> The "char *slotname" params of those helper functions should probably
> be declared and defined as "const char *slotname".

Done.

> Otherwise, from a code review perspective the patch v8 LGTM.

Thanks. Please have a look at the v9 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v9-0001-Add-LOG-messages-when-replication-slots-become-ac.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Euler Taveira
On Thu, Mar 23, 2023, at 2:33 PM, Bharath Rupireddy wrote:
> On Thu, Mar 23, 2023 at 3:37 PM Alvaro Herrera  
> wrote:
> >
> > On 2023-Mar-22, Amit Kapila wrote:
> >
> > > I see that you have modified the patch to address the comments from
> > > Alvaro. Personally, I feel it would be better to add such a message at
> > > a centralized location instead of spreading these in different callers
> > > of slot acquire/release functionality to avoid getting these missed in
> > > the new callers in the future. However, if Alvaro and others think
> > > that the current style is better then we should go ahead and do it
> > > that way. I hope that we should be able to decide on this and get it
> > > into PG16. Anyone else would like to weigh in here?
> >
> > I like Peter Smith's suggestion downthread.
> 
> +1. Please review the attached v8 patch further.
If you are adding separate functions as suggested, you should add a comment at
the top of ReplicationSlotAcquire() and ReplicationSlotRelease() functions
saying that LogReplicationSlotAquired() and LogReplicationSlotReleased()
functions should be called  respectively after it.

My suggestion is that the functions should have the same name with a "Log"
prefix. On of them has a typo "Aquired" in its name. Hence,
LogReplicationSlotAcquire() and LogReplicationSlotRelease() as names. It is
easier to find if someone is grepping by the origin function.

I prefer a sentence that includes a verb.

  physical replication slot \"%s\" is acquired
  logical replication slot \"%s\" is released

Isn't the PID important for this use case? If so, of course, you can rely on
log_line_prefix (%p) but if the PID is crucial for an investigation then it
should also be included in the message.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Peter Smith
On Fri, Mar 24, 2023 at 4:33 AM Bharath Rupireddy
 wrote:
>
> On Thu, Mar 23, 2023 at 3:37 PM Alvaro Herrera  
> wrote:
> >
> > On 2023-Mar-22, Amit Kapila wrote:
> >
> > > I see that you have modified the patch to address the comments from
> > > Alvaro. Personally, I feel it would be better to add such a message at
> > > a centralized location instead of spreading these in different callers
> > > of slot acquire/release functionality to avoid getting these missed in
> > > the new callers in the future. However, if Alvaro and others think
> > > that the current style is better then we should go ahead and do it
> > > that way. I hope that we should be able to decide on this and get it
> > > into PG16. Anyone else would like to weigh in here?
> >
> > I like Peter Smith's suggestion downthread.
>
> +1. Please review the attached v8 patch further.
>

Patch v8 applied OK, and builds/renders the HTML docs OK, and passes
the regression and subscription TAP tests OK.

(Note - I didn't do any additional manual testing, and I've assumed it
to be covering all the necessary acquire/related logging that you
wanted).

~~

Here are some minor comments:

1.
+ ereport(log_replication_commands ? LOG : DEBUG3,
+ (errmsg("acquired physical replication slot \"%s\"",
+ slotname)));

AFAIK those extra parentheses wrapping the "errmsg" part are not necessary.

~~

2.
extern void LogReplicationSlotAquired(bool is_physical, char *slotname);
extern void LogReplicationSlotReleased(bool is_physical, char *slotname);

The "char *slotname" params of those helper functions should probably
be declared and defined as "const char *slotname".

~~

Otherwise, from a code review perspective the patch v8 LGTM.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Bharath Rupireddy
On Thu, Mar 23, 2023 at 3:37 PM Alvaro Herrera  wrote:
>
> On 2023-Mar-22, Amit Kapila wrote:
>
> > I see that you have modified the patch to address the comments from
> > Alvaro. Personally, I feel it would be better to add such a message at
> > a centralized location instead of spreading these in different callers
> > of slot acquire/release functionality to avoid getting these missed in
> > the new callers in the future. However, if Alvaro and others think
> > that the current style is better then we should go ahead and do it
> > that way. I hope that we should be able to decide on this and get it
> > into PG16. Anyone else would like to weigh in here?
>
> I like Peter Smith's suggestion downthread.

+1. Please review the attached v8 patch further.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v8-0001-Add-LOG-messages-when-replication-slots-become-ac.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Alvaro Herrera
On 2023-Mar-22, Amit Kapila wrote:

> I see that you have modified the patch to address the comments from
> Alvaro. Personally, I feel it would be better to add such a message at
> a centralized location instead of spreading these in different callers
> of slot acquire/release functionality to avoid getting these missed in
> the new callers in the future. However, if Alvaro and others think
> that the current style is better then we should go ahead and do it
> that way. I hope that we should be able to decide on this and get it
> into PG16. Anyone else would like to weigh in here?

I like Peter Smith's suggestion downthread.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-23 Thread Peter Smith
Hi, I had a quick look at the v7 patch.

You might consider to encapsulate some of this logic in new functions like:

void
LogReplicationSlotAquired(bool is_physical, char *slotname)
{
loglevel = log_replication_commands ? LOG : DEBUG3;

if (is_physical)
ereport(loglevel,
(errmsg("acquired physical replication slot \"%s\"", slotname)));
else
ereport(loglevel,
(errmsg("acquired logical replication slot \"%s\"", slotname)));
}

void
LogReplicationSlotReleased(bool is_physical, char *slotname)
{
loglevel = log_replication_commands ? LOG : DEBUG3;

if (is_physical)
ereport(loglevel,
(errmsg("released physical replication slot \"%s\"", slotname)));
else
ereport(loglevel,
(errmsg("released logical replication slot \"%s\"", slotname)));
}

~~

THEN

ReplicationSlotShmemExit and WalSndErrorCleanup can call it like:
if (MyReplicationSlot != NULL)
{
bool is_physical = SlotIsPhysical(MyReplicationSlot);
char *slotname = pstrdup(NameStr(MyReplicationSlot->data.name));

ReplicationSlotRelease();
LogReplicationSlotReleased(is_physical, slotname);
}

~

StartlReplication can call like:
LogReplicationSlotAquired(true, cmd->slotname);
...
LogReplicationSlotReleased(true, cmd->slotname);

~

StartLogicalReplication can call like:
LogReplicationSlotAquired(false, cmd->slotname);
...
LogReplicationSlotReleased(false, cmd->slotname);


~~~

TBH, I am not sure for the *current* code if the encapsulation is
worth the trouble or not. But maybe at least it helps message
consistency and will make it easier if future callers are needed. I
guess those functions could also give you some central point to
comment the intent of this logging? Feel free to take or leave this
code suggestion.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2023-03-22 Thread Amit Kapila
On Thu, Sep 15, 2022 at 10:39 AM Bharath Rupireddy
 wrote:
>
> On Mon, Jul 25, 2022 at 6:31 PM Bharath Rupireddy
>  wrote:
> >
> > Here's the v6 patch, a much simpler one - no changes to any of the
> > existing function APIs. Please see the sample logs at [1]. There's a
> > bit of duplicate code in the v6 patch, if the overall approach looks
> > okay, I can remove that too in the next version of the patch.
>
> I modified the log_replication_commands description in guc_tables.c.
> Please review the v7 patch further.
>

I see that you have modified the patch to address the comments from
Alvaro. Personally, I feel it would be better to add such a message at
a centralized location instead of spreading these in different callers
of slot acquire/release functionality to avoid getting these missed in
the new callers in the future. However, if Alvaro and others think
that the current style is better then we should go ahead and do it
that way. I hope that we should be able to decide on this and get it
into PG16. Anyone else would like to weigh in here?

-- 
With Regards,
Amit Kapila.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2022-09-14 Thread Bharath Rupireddy
On Mon, Jul 25, 2022 at 6:31 PM Bharath Rupireddy
 wrote:
>
> Here's the v6 patch, a much simpler one - no changes to any of the
> existing function APIs. Please see the sample logs at [1]. There's a
> bit of duplicate code in the v6 patch, if the overall approach looks
> okay, I can remove that too in the next version of the patch.

I modified the log_replication_commands description in guc_tables.c.
Please review the v7 patch further.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v7-0001-Add-LOG-messages-when-replication-slots-become-ac.patch
Description: Binary data


Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2022-07-25 Thread Bharath Rupireddy
On Thu, May 5, 2022 at 2:57 PM Bharath Rupireddy
 wrote:
>
> On Fri, Apr 29, 2022 at 4:02 PM Alvaro Herrera  
> wrote:
> >
> > If I read the code right, this patch emits logs when
> > pg_logical_slot_get_changes and pg_replication_slot_advance SQL
> > functions are called.  Is this desirable/useful, for the use case you
> > stated at start of thread?  I think it is most likely pointless.  If you
> > get rid of those, then the only acquisitions that would log messages are
> > those in StartReplication and StartLogicalReplication.  So I wonder if
> > it would be better to leave the API of ReplicationSlotAcquire() alone,
> > and instead make StartReplication and StartLogicalReplication
> > responsible for those messages.
> >
> > I didn't look at the release-side messages you're adding, but I suppose
> > it should be symmetrical.

Here's the v6 patch, a much simpler one - no changes to any of the
existing function APIs. Please see the sample logs at [1]. There's a
bit of duplicate code in the v6 patch, if the overall approach looks
okay, I can remove that too in the next version of the patch.

Thoughts?

[1]
2022-07-25 12:30:14.847 UTC [152873] LOG:  acquired physical
replication slot "foo"
2022-07-25 12:30:20.878 UTC [152873] LOG:  released physical
replication slot "foo"

2022-07-25 12:49:18.023 UTC [168738] LOG:  acquired logical
replication slot "bar"
2022-07-25 12:49:28.105 UTC [168738] LOG:  released logical
replication slot "bar"

Regards,
Bharath Rupireddy.
From c57cfd8dd27eaf9e0c17c47769cb24b8ab961116 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 25 Jul 2022 12:51:26 +
Subject: [PATCH v6] Add LOG messages when replication slots become active and
 inactive

These logs will be extremely useful on production servers to debug
and analyze inactive replication slot issues.
---
 doc/src/sgml/config.sgml|  7 +++---
 src/backend/replication/slot.c  | 21 
 src/backend/replication/walsender.c | 39 +
 3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e2d728e0c4..8bf3c2ab53 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7461,9 +7461,10 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
   

-Causes each replication command to be logged in the server log.
-See  for more information about
-replication command. The default value is off.
+Causes each replication command and related activity to be logged in
+the server log. See  for more
+information about replication command. The default value is
+off.
 Only superusers and users with the appropriate SET
 privilege can change this setting.

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..43563bec7e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -179,8 +179,29 @@ ReplicationSlotShmemExit(int code, Datum arg)
 {
 	/* Make sure active replication slots are released */
 	if (MyReplicationSlot != NULL)
+	{
+		bool is_physical;
+		char slotname[NAMEDATALEN] = {0};
+
+		is_physical = SlotIsPhysical(MyReplicationSlot);
+		strcpy(slotname, NameStr(MyReplicationSlot->data.name));
+
 		ReplicationSlotRelease();
 
+		if (is_physical)
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+	(errmsg("released physical replication slot \"%s\"",
+			slotname)));
+		}
+		else
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+	(errmsg("released logical replication slot \"%s\"",
+			slotname)));
+		}
+	}
+
 	/* Also cleanup all the temporary slots. */
 	ReplicationSlotCleanup();
 }
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 3a86786cc3..ebd34316e2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -322,8 +322,29 @@ WalSndErrorCleanup(void)
 		wal_segment_close(xlogreader);
 
 	if (MyReplicationSlot != NULL)
+	{
+		bool is_physical;
+		char slotname[NAMEDATALEN] = {0};
+
+		is_physical = SlotIsPhysical(MyReplicationSlot);
+		strcpy(slotname, NameStr(MyReplicationSlot->data.name));
+
 		ReplicationSlotRelease();
 
+		if (is_physical)
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+	(errmsg("released physical replication slot \"%s\"",
+			slotname)));
+		}
+		else
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+	(errmsg("released logical replication slot \"%s\"",
+			slotname)));
+		}
+	}
+
 	ReplicationSlotCleanup();
 
 	replication_active = false;
@@ -704,6 +725,10 @@ StartReplication(StartReplicationCmd *cmd)
 	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 	 errmsg("cannot use a logical replication slot for physical replication")));
 
+		ereport(log_replication_commands ? LOG : DEBUG3,
+(errmsg("acquired physical replication slot \"%s\"",
+		cmd->slotname)));
+
 	

Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2022-05-05 Thread Bharath Rupireddy
On Fri, Apr 29, 2022 at 4:02 PM Alvaro Herrera  wrote:
>
> If I read the code right, this patch emits logs when
> pg_logical_slot_get_changes and pg_replication_slot_advance SQL
> functions are called.  Is this desirable/useful, for the use case you
> stated at start of thread?  I think it is most likely pointless.  If you
> get rid of those, then the only acquisitions that would log messages are
> those in StartReplication and StartLogicalReplication.  So I wonder if
> it would be better to leave the API of ReplicationSlotAcquire() alone,
> and instead make StartReplication and StartLogicalReplication
> responsible for those messages.
>
> I didn't look at the release-side messages you're adding, but I suppose
> it should be symmetrical.

Adding the messages right after ReplicationSlotAcquire in the
StartReplication and StartLogicalReplication looks okay, but, if we
just add "released replication slot \"%s\" for logical/physical
replication". in StartReplication and StartLogicalReplication right
after ReplicationSlotRelease, how about ReplicationSlotRelease in
WalSndErrorCleanup and ReplicationSlotShmemExit?

The whole idea is to get to know when and how much time a slot was
inactive/unused when log_replication_commands is set to true.

We can think of adding a timestamp column to on-disk replication slot
data but that will be too much.

Regards,
Bharath Rupireddy.




Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

2022-04-29 Thread Alvaro Herrera
If I read the code right, this patch emits logs when
pg_logical_slot_get_changes and pg_replication_slot_advance SQL
functions are called.  Is this desirable/useful, for the use case you
stated at start of thread?  I think it is most likely pointless.  If you
get rid of those, then the only acquisitions that would log messages are
those in StartReplication and StartLogicalReplication.  So I wonder if
it would be better to leave the API of ReplicationSlotAcquire() alone,
and instead make StartReplication and StartLogicalReplication
responsible for those messages.

I didn't look at the release-side messages you're adding, but I suppose
it should be symmetrical.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)