Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Peter Eisentraut

On 13.07.23 11:29, Alvaro Herrera wrote:

On 2023-Jul-13, Peter Eisentraut wrote:


I suppose we could just say "logical replication worker" in all cases. That
should be enough precision for the purpose of these messages.


Agreed.  IMO the user doesn't care about specifics.


Ok, committed.





Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Alvaro Herrera
On 2023-Jul-13, Peter Eisentraut wrote:

> I suppose we could just say "logical replication worker" in all cases. That
> should be enough precision for the purpose of these messages.

Agreed.  IMO the user doesn't care about specifics.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Masahiko Sawada
On Thu, Jul 13, 2023 at 4:07 PM Peter Eisentraut  wrote:
>
> On 13.07.23 06:59, Peter Smith wrote:
> > On Wed, Jul 12, 2023 at 9:35 PM Peter Eisentraut  
> > wrote:
> >>
> >> On 21.06.23 09:18, Alvaro Herrera wrote:
> >>> That is a terrible pattern in relatively new code.  Let's get rid of it
> >>> entirely rather than continue to propagate it.
> >>>
>  So, I don't think it is fair to say that these format strings are OK
>  for the existing HEAD code, but not OK for the patch code, when they
>  are both the same.
> >>>
> >>> Agreed.  Let's remove them all.
> >>
> >> This is an open issue for PG16 translation.  I propose the attached
> >> patch to fix this.  Mostly, this just reverts to the previous wordings.
> >> (I don't think for these messages the difference between "apply worker"
> >> and "parallel apply worker" is all that interesting to explode the
> >> number of messages.  AFAICT, the table sync worker case wasn't even
> >> used, since callers always handled it separately.)
> >
> > I thought the get_worker_name function was reachable by tablesync workers 
> > also.
> >
> > Since ApplyWorkerMain is a common entry point for both leader apply
> > workers and tablesync workers, any logs in that code path could
> > potentially be from either kind of worker. At least, when the function
> > was first introduced (around patch v43-0001? [1]) it was also
> > replacing some tablesync logs.
>
> I suppose we could just say "logical replication worker" in all cases.
> That should be enough precision for the purpose of these messages.

+1

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Peter Eisentraut

On 13.07.23 06:59, Peter Smith wrote:

On Wed, Jul 12, 2023 at 9:35 PM Peter Eisentraut  wrote:


On 21.06.23 09:18, Alvaro Herrera wrote:

That is a terrible pattern in relatively new code.  Let's get rid of it
entirely rather than continue to propagate it.


So, I don't think it is fair to say that these format strings are OK
for the existing HEAD code, but not OK for the patch code, when they
are both the same.


Agreed.  Let's remove them all.


This is an open issue for PG16 translation.  I propose the attached
patch to fix this.  Mostly, this just reverts to the previous wordings.
(I don't think for these messages the difference between "apply worker"
and "parallel apply worker" is all that interesting to explode the
number of messages.  AFAICT, the table sync worker case wasn't even
used, since callers always handled it separately.)


I thought the get_worker_name function was reachable by tablesync workers also.

Since ApplyWorkerMain is a common entry point for both leader apply
workers and tablesync workers, any logs in that code path could
potentially be from either kind of worker. At least, when the function
was first introduced (around patch v43-0001? [1]) it was also
replacing some tablesync logs.


I suppose we could just say "logical replication worker" in all cases. 
That should be enough precision for the purpose of these messages.






Re: Consistent coding for the naming of LR workers

2023-07-12 Thread Peter Smith
On Wed, Jul 12, 2023 at 9:35 PM Peter Eisentraut  wrote:
>
> On 21.06.23 09:18, Alvaro Herrera wrote:
> > That is a terrible pattern in relatively new code.  Let's get rid of it
> > entirely rather than continue to propagate it.
> >
> >> So, I don't think it is fair to say that these format strings are OK
> >> for the existing HEAD code, but not OK for the patch code, when they
> >> are both the same.
> >
> > Agreed.  Let's remove them all.
>
> This is an open issue for PG16 translation.  I propose the attached
> patch to fix this.  Mostly, this just reverts to the previous wordings.
> (I don't think for these messages the difference between "apply worker"
> and "parallel apply worker" is all that interesting to explode the
> number of messages.  AFAICT, the table sync worker case wasn't even
> used, since callers always handled it separately.)

I thought the get_worker_name function was reachable by tablesync workers also.

Since ApplyWorkerMain is a common entry point for both leader apply
workers and tablesync workers, any logs in that code path could
potentially be from either kind of worker. At least, when the function
was first introduced (around patch v43-0001? [1]) it was also
replacing some tablesync logs.

--
[1] v43-0001 - 
https://www.postgresql.org/message-id/OS0PR01MB5716E366874B253B58FC0A23943C9%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Consistent coding for the naming of LR workers

2023-07-12 Thread Peter Eisentraut

On 21.06.23 09:18, Alvaro Herrera wrote:

That is a terrible pattern in relatively new code.  Let's get rid of it
entirely rather than continue to propagate it.


So, I don't think it is fair to say that these format strings are OK
for the existing HEAD code, but not OK for the patch code, when they
are both the same.


Agreed.  Let's remove them all.


This is an open issue for PG16 translation.  I propose the attached 
patch to fix this.  Mostly, this just reverts to the previous wordings. 
(I don't think for these messages the difference between "apply worker" 
and "parallel apply worker" is all that interesting to explode the 
number of messages.  AFAICT, the table sync worker case wasn't even 
used, since callers always handled it separately.)
From 6a1558629f97d83c8b11f204b39aceffc94dee8f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 12 Jul 2023 13:31:08 +0200
Subject: [PATCH] Fix untranslatable error message assembly

Discussion: 
https://www.postgresql.org/message-id/flat/CAHut%2BPt1xwATviPGjjtJy5L631SGf3qjV9XUCmxLu16cHamfgg%40mail.gmail.com
---
 src/backend/replication/logical/worker.c | 44 +++-
 1 file changed, 12 insertions(+), 32 deletions(-)

diff --git a/src/backend/replication/logical/worker.c 
b/src/backend/replication/logical/worker.c
index 0ee764d68f..95b36ced13 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -435,20 +435,6 @@ static inline void reset_apply_error_context_info(void);
 static TransApplyAction get_transaction_apply_action(TransactionId xid,

 ParallelApplyWorkerInfo **winfo);
 
-/*
- * Return the name of the logical replication worker.
- */
-static const char *
-get_worker_name(void)
-{
-   if (am_tablesync_worker())
-   return _("logical replication table synchronization worker");
-   else if (am_parallel_apply_worker())
-   return _("logical replication parallel apply worker");
-   else
-   return _("logical replication apply worker");
-}
-
 /*
  * Form the origin name for the subscription.
  *
@@ -3904,9 +3890,8 @@ maybe_reread_subscription(void)
if (!newsub)
{
ereport(LOG,
-   /* translator: first %s is the name of logical replication 
worker */
-   (errmsg("%s for subscription \"%s\" will stop 
because the subscription was removed",
-   get_worker_name(), 
MySubscription->name)));
+   (errmsg("logical replication apply worker for 
subscription \"%s\" will stop because the subscription was removed",
+   MySubscription->name)));
 
/* Ensure we remove no-longer-useful entry for worker's start 
time */
if (!am_tablesync_worker() && !am_parallel_apply_worker())
@@ -3918,9 +3903,8 @@ maybe_reread_subscription(void)
if (!newsub->enabled)
{
ereport(LOG,
-   /* translator: first %s is the name of logical replication 
worker */
-   (errmsg("%s for subscription \"%s\" will stop 
because the subscription was disabled",
-   get_worker_name(), 
MySubscription->name)));
+   (errmsg("logical replication apply worker for 
subscription \"%s\" will stop because the subscription was disabled",
+   MySubscription->name)));
 
apply_worker_exit();
}
@@ -3954,9 +3938,8 @@ maybe_reread_subscription(void)
MySubscription->name)));
else
ereport(LOG,
-   /* translator: first %s is the name of logical 
replication worker */
-   (errmsg("%s for subscription \"%s\" 
will restart because of a parameter change",
-   get_worker_name(), 
MySubscription->name)));
+   (errmsg("logical replication apply 
worker for subscription \"%s\" will restart because of a parameter change",
+   MySubscription->name)));
 
apply_worker_exit();
}
@@ -4478,9 +4461,8 @@ InitializeApplyWorker(void)
if (!MySubscription)
{
ereport(LOG,
-   /* translator: %s is the name of logical replication worker */
-   (errmsg("%s for subscription %u will not start 
because the subscription was removed during startup",
-   get_worker_name(), 
MyLogicalRepWorker->subid)));
+   (errmsg("logical replication apply worker for 
subscri

Re: Consistent coding for the naming of LR workers

2023-06-21 Thread Alvaro Herrera
On 2023-Jun-21, Peter Smith wrote:

> Except, please note that there are already multiple message format
> strings in the HEAD code that look like "%s for subscription ...",
> that are using the get_worker_name() function for the name
> substitution.
> 
> e.g.
> - "%s for subscription \"%s\" will stop because the subscription was removed"
> - "%s for subscription \"%s\" will stop because the subscription was disabled"
> - "%s for subscription \"%s\" will restart because of a parameter change"
> - "%s for subscription %u will not start because the subscription was
> removed during startup"
> - "%s for subscription \"%s\" will not start because the subscription
> was disabled during startup"
> - "%s for subscription \"%s\" has started"

That is a terrible pattern in relatively new code.  Let's get rid of it
entirely rather than continue to propagate it.

> So, I don't think it is fair to say that these format strings are OK
> for the existing HEAD code, but not OK for the patch code, when they
> are both the same.

Agreed.  Let's remove them all.

BTW this is documented:
https://www.postgresql.org/docs/15/nls-programmer.html#NLS-GUIDELINES

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I suspect most samba developers are already technically insane...
Of course, since many of them are Australians, you can't tell." (L. Torvalds)




Re: Consistent coding for the naming of LR workers

2023-06-20 Thread Peter Smith
Re:  Alvaro's comment [1] "From a translation standpoint, this doesn't
seem good".

Except, please note that there are already multiple message format
strings in the HEAD code that look like "%s for subscription ...",
that are using the get_worker_name() function for the name
substitution.

e.g.
- "%s for subscription \"%s\" will stop because the subscription was removed"
- "%s for subscription \"%s\" will stop because the subscription was disabled"
- "%s for subscription \"%s\" will restart because of a parameter change"
- "%s for subscription %u will not start because the subscription was
removed during startup"
- "%s for subscription \"%s\" will not start because the subscription
was disabled during startup"
- "%s for subscription \"%s\" has started"

And many of my patch changes will result in a format string which has
exactly that same pattern:

e.g.
- "%s for subscription \"%s\" has finished"
- "%s for subscription \"%s\", table \"%s\" has finished"
- "%s for subscription \"%s\" will restart so that two_phase can be
enabledworker"
- "%s for subscription \"%s\" will stop"
- "%s for subscription \"%s\" will stop because of a parameter change"
- "%s for subscription \"%s\", table \"%s\" has started"

So, I don't think it is fair to say that these format strings are OK
for the existing HEAD code, but not OK for the patch code, when they
are both the same.

~~

OTOH,  you are correct there are some more problematic messages (see
below - one of these you cited) that are not using the same pattern:

e.g.
- "lost connection to the %s"
- "%s exited due to error"
- "unrecognized message type received %s: %c (message length %d bytes)"
- "lost connection to the %s"
- "%s will serialize the remaining changes of remote transaction %u to a file"
- "lost connection to the %s"
- "defining savepoint %s in %s"
- "rolling back to savepoint %s in %s"

IMO it will be an improvement for all-round consistency if those also
get changed to use the similar pattern:

e.g. "lost connection to the %s" --> "%s for subscription \"%s",
cannot be contacted"
e.g. "defining savepoint %s in %s" --> "%s for subscription \"%s",
defining savepoint %s"
e.g. "rolling back to savepoint %s in %s" --> "%s for subscription
\"%s", rolling back to savepoint %s"
etc.

Thoughts?

--

Re: Horiguchi-san's comment [2] "... instead, it makes grepping difficult."

Sorry, I didn't really understand how this patch makes grepping more
difficult. e.g. If you are grepping for any message about "table
synchronization worker" then you are currently hoping and relying that
there there are no differences in the wording of all the existing
messages. If something instead says "tablesync worker" you will
accidentally overlook it.

OTOH, this patch eliminates the guesswork and luck. In the example,
grepping for LR_WORKER_NAME_TABLESYNC will give you all the messages
you were looking for.

--
[1] Alvaro review comments -
https://www.postgresql.org/message-id/20230615103759.bkkv226czkcnuork%40alvherre.pgsql
[2] Horiguchi-san review comments -
https://www.postgresql.org/message-id/20230616.104327.1878440413098623268.horikyota.ntt%40gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Consistent coding for the naming of LR workers

2023-06-15 Thread Kyotaro Horiguchi
At Thu, 15 Jun 2023 12:42:33 +1000, Peter Smith  wrote 
in 
> It is better to have a *single* point where these worker names are
> defined, so then all output uses identical LR worker nomenclature.
> 
> PSA a small patch to modify the code accordingly. This is not intended
> to be a functional change - just a code cleanup.
> 
> Thoughts?

I generally like this direction when it actually decreases the number
of translatable messages without making grepping on the tree
excessively difficult. However, in this case, the patch doesn't seems
to reduce the translatable messages; instead, it makes grepping
difficult.

Addition to that, I'm inclined to concur with Alvaro regarding the
gramattical aspect.

Consequently, I'd prefer to leave these alone.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Consistent coding for the naming of LR workers

2023-06-15 Thread Alvaro Herrera
On 2023-Jun-15, Peter Smith wrote:

> PSA a small patch to modify the code accordingly. This is not intended
> to be a functional change - just a code cleanup.

>From a translation standpoint, this doesn't seem good.  Consider this
proposed message:
  "lost connection to the %s"

It's not possible to translate "to the" correctly to Spanish because it
depends on the grammatical gender of the %s.  At present this is not an
actual problem because all bgworker types have the same gender, but it
goes counter translability good practices.  Also, other languages may
have different considerations.

You could use a generic term and then add a colon-separated or a quoted
indicator for its type:
 lost connection to logical replication worker of type "parallel apply"
 lost connection to logical replication worker: "parallel apply worker"
 lost connection to logical replication worker: type "parallel apply worker"

and then make the type string (and nothing else in that message) be a
%s.  But I don't think this looks very good.

I'd leave this alone, except if there are any actual inconsistencies in
which case they should be fixed specifically.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)




Re: Consistent coding for the naming of LR workers

2023-06-15 Thread Amit Kapila
On Thu, Jun 15, 2023 at 8:13 AM Peter Smith  wrote:
>
> There are different types of Logical Replication workers -- e.g.
> tablesync workers, apply workers, and parallel apply workers.
>
> The logging and errors often name these worker types, but during a
> recent code review, I noticed some inconsistency in the way this is
> done:
> a) there is a common function get_worker_name() to return the name for
> the worker type,  -- OR --
> b) the worker name is just hardcoded in the message/error
>
> I think it is not ideal to cut/paste the same hardwired strings over
> and over. IMO it just introduces an unnecessary risk of subtle naming
> differences creeping in.
>
> ~~
>
> It is better to have a *single* point where these worker names are
> defined, so then all output uses identical LR worker nomenclature.
>

+1. I think makes error strings in the code look a bit shorter.  I
think it is better to park the patch for the next CF to avoid
forgetting about it.


-- 
With Regards,
Amit Kapila.