Re: Autogenerate some wait events code and documentation

2023-07-09 Thread Drouvot, Bertrand




On 7/10/23 7:20 AM, Michael Paquier wrote:

On Mon, Jul 10, 2023 at 07:05:30AM +0200, Drouvot, Bertrand wrote:

Yeah there is one in generate-wait_event_types.pl. I was wondering
to add one in wait_event_names.txt too (as this is the place where
no wait events would be added if any).


Hmm.  Something like that could be done, for instance:

  #   src/backend/utils/activity/wait_event_types.h
-#  typedef enum definitions for wait events.
+#  typedef enum definitions for wait events, generated from the first
+#  field.


Yeah, it looks a good place for it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Autogenerate some wait events code and documentation

2023-07-09 Thread Michael Paquier
On Mon, Jul 10, 2023 at 07:05:30AM +0200, Drouvot, Bertrand wrote:
> Yeah there is one in generate-wait_event_types.pl. I was wondering
> to add one in wait_event_names.txt too (as this is the place where
> no wait events would be added if any).

Hmm.  Something like that could be done, for instance:

 #   src/backend/utils/activity/wait_event_types.h
-#  typedef enum definitions for wait events.
+#  typedef enum definitions for wait events, generated from the first
+#  field.
--
Michael


signature.asc
Description: PGP signature


Re: Add hint message for check_log_destination()

2023-07-09 Thread Michael Paquier
On Mon, Jul 10, 2023 at 02:07:09PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 10 Jul 2023 09:04:42 +0800, Japin Li  wrote in 
>> Sorry for the late reply!  I'm not sure.  How can I know whether it is 
>> translatable?

Per the documentation:
https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES

Now, if you want to look at the shape of the messages, you could also
run something like a `make init-po` and look at the messages generated
in a .pot file.

> Honestly, I'm not sold on the idea that we need to exhaust ourselves
> providing an exhaustive list of usable keywords for users here.  I
> believe that it is unlikely that these keywords will be used in
> different combinations each time without looking at the
> documentation. On top of that, consider "csvlog" as an example, -- it
> doesn't work as expected if logging_collector is off. Although this is
> documented, we don't give any warnings at startup. This seems like a
> bigger issue than the unusable keywords. (I don't mean to suggest to
> fix this, as usual.)
> 
> In short, I think a simple message like '"xxx" cannot be used in this
> build' should suffice for keywords defined but unusable, and we should
> stick with "unknown" for the undefined ones.

Which is roughly what the existing GUC_check_errdetail() does as well,
but you indeed lose a bit of context because the option wanted is not
built.  I am not convinced that there is something to change here.
--
Michael


signature.asc
Description: PGP signature


Re: Add hint message for check_log_destination()

2023-07-09 Thread Kyotaro Horiguchi
At Mon, 10 Jul 2023 09:04:42 +0800, Japin Li  wrote in 
> 
> On Sat, 08 Jul 2023 at 12:48, Michael Paquier  wrote:
> > On Fri, Jul 07, 2023 at 07:23:47PM +0800, Japin Li wrote:
> >> +  appendStringInfoString(, "\"stderr\"");
> >> +#ifdef HAVE_SYSLOG
> >> +  appendStringInfoString(, ", \"syslog\"");
> >> +#endif
> >> +#ifdef WIN32
> >> +  appendStringInfoString(, ", \"eventlog\"");
> >> +#endif
> >> +  appendStringInfoString(, ", \"csvlog\", and 
> >> \"jsonlog\"");
> >
> > Hmm.  Is that OK as a translatable string?
> 
> 
> Sorry for the late reply!  I'm not sure.  How can I know whether it is 
> translatable?

At the very least, we can't generate comma-separated lists
programatically because punctuation marks vary across languages.

One potential approach could involve defining the message for every
potential combination, in full length.

Honestly, I'm not sold on the idea that we need to exhaust ourselves
providing an exhaustive list of usable keywords for users here.  I
believe that it is unlikely that these keywords will be used in
different combinations each time without looking at the
documentation. On top of that, consider "csvlog" as an example, -- it
doesn't work as expected if logging_collector is off. Although this is
documented, we don't give any warnings at startup. This seems like a
bigger issue than the unusable keywords. (I don't mean to suggest to
fix this, as usual.)

In short, I think a simple message like '"xxx" cannot be used in this
build' should suffice for keywords defined but unusable, and we should
stick with "unknown" for the undefined ones.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Autogenerate some wait events code and documentation

2023-07-09 Thread Drouvot, Bertrand

Hi,

On 7/9/23 9:36 AM, Michael Paquier wrote:

On Sun, Jul 09, 2023 at 09:15:34AM +0200, Drouvot, Bertrand wrote:

I also noticed that you now provide the culprit line in case of parsing
failure (thanks for that).


Yes, that's mentioned in the commit message I quickly wrote in 0002.


  #
-#   "C symbol in enums" "format in the system views" "description in the docs"
+#   "format in the system views" "description in the docs"

Should we add a note here about the impact of the "format in the system views" 
on
the auto generated enum? (aka how it is generated based on its format)?


There is one, 


Yeah there is one in generate-wait_event_types.pl. I was wondering
to add one in wait_event_names.txt too (as this is the place where
no wait events would be added if any).


but now that I look at it WAIT_EVENT repeated twice does
not look great, so this could use "FooBarName" or equivalent:


+1 for "FooBarName"

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: pg_recvlogical prints bogus error when interrupted

2023-07-09 Thread Michael Paquier
On Thu, Jul 06, 2023 at 10:29:10AM -0500, Tristan Partin wrote:
> On Thu Apr 27, 2023 at 12:54 AM CDT, Bharath Rupireddy wrote:
>> Why do we need both time_to_abort and ready_to_exit?
> 
> I am trying to understand why we need both as well. Maybe I am missing
> something important :).

As StreamLogicalLog() states once it leaves its main loop because
time_to_abort has been switched to true, we want a clean exit.  I
think that this patch is just a more complicated way to avoid doing
twice the operations done by prepareToTerminate().  So how about
moving the prepareToTerminate() call outside the main streaming loop
and call it when time_to_abort is true?  Then, I would suggest to
change the keepalive argument of prepareToTerminate() to an enum able
to handle three values to log the reason why the tool is stopping: the
end of WAL, an interruption or a keepalive when logging.  There are
two of them now, but we want a third mode for the signals.

> > /* It is not unexepected termination error when Ctrl-C'ed. */
> 
> My only other comment is that it would be nice to have the word "an"
> before unexpected.

s/unexepected/unexpected/.  Still, it seems to me that we don't need
this comment.
--
Michael


signature.asc
Description: PGP signature


Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

2023-07-09 Thread Amit Kapila
On Mon, Jul 10, 2023 at 7:55 AM Peter Smith  wrote:
>
> On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila  wrote:
> >
> > On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada  
> > wrote:
> > >
> > > I prefer the first suggestion. I've attached the updated patch.
> > >
> >
> > This looks mostly good to me but I think it would be better if we can
> > also add the information that the leftmost index column must be a
> > non-expression. So, how about: "Candidate indexes must be btree,
> > non-partial, and the leftmost index column must be a non-expression
> > and reference to a published table column (i.e. cannot consist of only
> > expressions)."?
>
> That part in parentheses ought to say "the index ..." because it is
> referring to the full INDEX, not to the leftmost column. I think this
> was missed when Sawada-san took my previous suggestion [1].
>
> IMO it doesn't sound right to say the "index column must be a
> non-expression". It is already a non-expression because it is a
> column. So I think it would be better to refer to this as an INDEX
> "field" instead of an INDEX column. Note that "field" is the same
> terminology used in the docs for CREATE INDEX [2].
>

I thought it would be better to be explicit for this case but I am
fine if Sawada-San and you prefer some other way to document it.

-- 
With Regards,
Amit Kapila.




Re: Add more sanity checks around callers of changeDependencyFor()

2023-07-09 Thread Michael Paquier
On Fri, Jul 07, 2023 at 06:12:48PM +, Akshat Jaimini wrote:
> I believe it is ready to be committed!

Okay, thanks.  Please note that I have backpatched the bug and added
the checks for the callers of changeDependencyFor() on HEAD on top of
the bugfix.
--
Michael


signature.asc
Description: PGP signature


Re: Synchronizing slots from primary to standby

2023-07-09 Thread Amit Kapila
On Sun, Jul 9, 2023 at 1:01 PM Bharath Rupireddy
 wrote:
>
> On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila  wrote:
> >
> >
> > 1. Can you please try to explain the functionality of the overall
> > patch somewhere in the form of comments and or commit message?
>
> IIUC, there are 2 core ideas of the feature:
>
> 1) It will never let the logical replication subscribers go ahead of
> physical replication standbys specified in standby_slot_names. It
> implements this by delaying decoding of commit records on the
> walsenders corresponding to logical replication subscribers on the
> primary until all the specified standbys confirm receiving the commit
> LSN.
>
> 2) The physical replication standbys will synchronize data of the
> specified logical replication slots (in synchronize_slot_names) from
> the primary, creating the logical replication slots if necessary.
> Since the logical replication subscribers will never go out of
> physical replication standbys, the standbys can safely synchronize the
> slots and keep the data necessary for subscribers to connect to it and
> work seamlessly even after a failover.
>
> If my understanding is right,
>

This matches my understanding as well.

> I have few thoughts here:
>
> 1. All the logical walsenders are delayed on the primary - per
> wait_for_standby_confirmation() despite the user being interested in
> only a few of them via synchronize_slot_names. Shouldn't the delay be
> for just the slots specified in synchronize_slot_names?
> 2. I think we can split the patch like this - 0001 can be the logical
> walsenders delaying decoding on the primary unless standbys confirm,
> 0002 standby synchronizing the logical slots.
>

Agreed with the above two points.

> 3. I think we need to change the GUC standby_slot_names to better
> reflect what it is used for - wait_for_replication_slot_names or
> wait_for_
>

I feel at this stage we can focus on getting the design and
implementation correct. We can improve GUC names later once we are
confident that the functionality is correct.

> 4. It allows specifying logical slots in standby_slot_names, meaning,
> it can disallow logical slots getting ahead of other logical slots
> specified in standby_slot_names. Should we allow this case with the
> thinking that if there's anyone using logical replication for failover
> (well, will anybody do that in production?).
>

I think on the contrary we should prohibit this case. We can always
extend this functionality later.

> 5. Similar to above, it allows specifying physical slots in
> synchronize_slot_names. Should we disallow?
>

We should prohibit that as well.

-- 
With Regards,
Amit Kapila.




RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

2023-07-09 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> > Yes, I agree, it is (and was before my patch as well) un-documented
> limitation of REPLICA IDENTITY FULL.
> > And, as far as I can see, my patch actually didn't have any impact on the
> limitation. The unsupported
> > cases are still unsupported, but now the same error is thrown in a slightly
> different place.
> > I think that is a minor limitation, but maybe should be listed [1]?
> > >
> 
> +1.
> 
> >
> > Yes, your modification did not touch the restriction. It has existed before 
> > the
> > commit. I (or my colleague) will post the patch to add the description, 
> > maybe
> > after [1] is committed.
> >
> 
> I don't think there is any dependency between the two. So, feel free
> to post the patch separately.

Okay, thank you for your confirmation. I have started the fork thread [1].

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB58662174ED62648E0D611194F530A%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL

2023-07-09 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

This is a fork thread from [1]. While analyzing codes I noticed that UPDATE and
DELETE cannot be replicated when REPLICA IDENTITY is FULL and the table has 
datatype
which does not have the operator class of Btree. I thnk this restriction is not
documented but should be. PSA the patch to add that. Thought?

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB586687A51AB511E5A7F7D3E6F526A%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



add_description.patch
Description: add_description.patch


Re: POC, WIP: OR-clause support for indexes

2023-07-09 Thread Andrey Lepikhov

On 7/7/2023 15:20, Alena Rybakina wrote:


because we will provide similar manipulation in this:

foreach(l, gentry->consts)
{
       Node       *rexpr = (Node *) lfirst(l);

       rexpr = coerce_to_common_type(pstate, rexpr,
                                             scalar_type,
                                             "IN");
      aexprs = lappend(aexprs, rexpr);
}

I'm not sure that it should be replaced.
In attachment - a bit more corrections to the patch.
The most important change - or_list contains already transformed 
expression subtree. So, I think we don't need to free it at all.


--
regards,
Andrey Lepikhov
Postgres Professional
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 961ca3e482..f0fd63f05c 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -112,9 +112,6 @@ transformBoolExprOr(ParseState *pstate, BoolExpr *expr_orig)
List   *or_list = NIL;
List   *groups_list = NIL;
ListCell   *lc;
-   Node   *result;
-   BoolExpr   *expr;
-   boolchange_apply = false;
 
/* If this is not an 'OR' expression, skip the transformation */
if (expr_orig->boolop != OR_EXPR ||
@@ -131,16 +128,7 @@ transformBoolExprOr(ParseState *pstate, BoolExpr 
*expr_orig)
OrClauseGroupEntry *gentry;
boolconst_is_left = true;
 
-   /*
-* The first step: checking that the expression consists only 
of equality.
-* We can only do this here, while arg is still row data type, 
namely A_Expr.
-* After applying transformExprRecurce, we already know that it 
will be OpExr type,
-* but checking the expression for equality is already becoming 
impossible for us.
-* Sometimes we have the chance to devide expression into the 
groups on
-* equality and inequality. This is possible if all list items 
are not at the
-* same level of a single BoolExpr expression, otherwise all of 
them cannot be converted.
-*/
-
+   /* At first, transform the arg and evaluate constant 
expressions. */
orqual = transformExprRecurse(pstate, (Node *) arg);
orqual = coerce_to_boolean(pstate, orqual, "OR");
orqual = eval_const_expressions(NULL, orqual);
@@ -151,7 +139,13 @@ transformBoolExprOr(ParseState *pstate, BoolExpr 
*expr_orig)
continue;
}
 
-   /* Detect constant side of the clause */
+   /*
+* Detect the constant side of the clause. Recall non-constant
+* expression can be made not only with Vars, but also with 
Params,
+* which is not bonded with any relation. Thus, we detect the 
const
+* side - if another side is constant too, the orqual couldn't 
be
+* an OpExpr.
+*/
if (IsA(get_leftop(orqual), Const))
const_is_left = true;
else if (IsA(get_rightop(orqual), Const))
@@ -175,26 +169,14 @@ transformBoolExprOr(ParseState *pstate, BoolExpr 
*expr_orig)
}
 
/*
-* The next step: transform all the inputs, and detect whether
-* any contain Vars.
-*/
-   if (!contain_vars_of_level(orqual, 0))
-   {
-   /* Again, it's not the expr we can transform */
-   or_list = lappend(or_list, orqual);
-   continue;
-   }
-
-   ;
-
-   /*
-   * At this point we definitely have a transformable 
clause.
-   * Classify it and add into specific group of clauses, 
or create new
-   * group.
-   * TODO: to manage complexity in the case of many 
different clauses
-   * (X1=C1) OR (X2=C2 OR) ... (XN = CN) we could invent 
something
-   * like a hash table (htab key ???).
-   */
+   * At this point we definitely have a transformable clause.
+   * Classify it and add into specific group of clauses, or create 
new
+   * group.
+   * TODO: to manage complexity in the case of many different 
clauses
+   * (X1=C1) OR (X2=C2 OR) ... (XN = CN) we could invent something
+   * like a hash table. But also we believe, that the case of many
+   * different variable sides is very rare.
+   */
foreach(lc_groups, groups_list)
{
OrClauseGroupEntry *v = (OrClauseGroupEntry *) 
lfirst(lc_groups);
@@ -220,20 +202,18 @@ 

Re: check_strxfrm_bug()

2023-07-09 Thread Thomas Munro
On Sun, Jul 9, 2023 at 6:35 PM Thomas Munro  wrote:
> On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut  wrote:
> > So I don't think this code is correct.  AFAICT, there is nothing right
> > now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
> > the intention?
>
> Yes, that was my intention.  Windows actually doesn't have them.

Thinking about that some more...  Its _XXX implementations don't deal
with UTF-8 the way Unix-based developers would expect, and are
therefore just portability hazards, aren't they?  What would we gain
by restoring the advertisement that they are available?  Perhaps we
should go the other way completely and remove the relevant #defines
from win32_port.h, and fully confine knowledge of them to pg_locale.c?
 It knows how to deal with that.  Here is a patch trying this idea
out, with as slightly longer explanation.
From 1c04d781195266b6bc88d8a5a584a64aac07f613 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 10 Jul 2023 13:41:27 +1200
Subject: [PATCH] Don't expose Windows' mbstowcs_l() and wcstombs_l().

Windows has similar functions with leading underscores, and we provided
the rename via a macro in win32_port.h.  In fact its functions are not
always good replacements for the Unix functions, since they can't deal
with UTF-8.  They are only currently used by pg_locale.c, which is
careful to redirect to other Windows routines for UTF-8.  Given that
portability hazard, it seem unlikely to be a good idea to encourage any
other code to think of these functions as being available outside
pg_locale.c.  Any code that thinks it wants these functions probably
wants our wchar2char() or char2wchar() routines instead, or it won't
actually work on Windows in UTF-8 databases.

Furthermore, some major libc implementations including glibc don't have
them (they only have the standard variants without _l), so external code
is very unlikely to require them to exist.

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 0eb764e897..61daa8190b 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -154,36 +154,41 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc,
 		 UErrorCode *status);
 #endif
 
-#ifndef WIN32
 /*
  * POSIX doesn't define _l-variants of these functions, but several systems
- * have them.  We provide our own replacements here.  For Windows, we have
- * macros in win32_port.h.
+ * have them.  We provide our own replacements here.
  */
 #ifndef HAVE_MBSTOWCS_L
 static size_t
 mbstowcs_l(wchar_t *dest, const char *src, size_t n, locale_t loc)
 {
+#ifdef WIN32
+	return _mbstowcs_l(dest, src, n, loc);
+#else
 	size_t		result;
 	locale_t	save_locale = uselocale(loc);
 
 	result = mbstowcs(dest, src, n);
 	uselocale(save_locale);
 	return result;
+#endif
 }
 #endif
 #ifndef HAVE_WCSTOMBS_L
 static size_t
 wcstombs_l(char *dest, const wchar_t *src, size_t n, locale_t loc)
 {
+#ifdef WIN32
+	return _wcstombs_l(dest, src, n, loc);
+#else
 	size_t		result;
 	locale_t	save_locale = uselocale(loc);
 
 	result = wcstombs(dest, src, n);
 	uselocale(save_locale);
 	return result;
-}
 #endif
+}
 #endif
 
 /*
diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index b957d5c598..0e6d608330 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -455,8 +455,6 @@ extern int	_pglstat64(const char *name, struct stat *buf);
 #define strcoll_l _strcoll_l
 #define strxfrm_l _strxfrm_l
 #define wcscoll_l _wcscoll_l
-#define wcstombs_l _wcstombs_l
-#define mbstowcs_l _mbstowcs_l
 
 /*
  * Versions of libintl >= 0.18? try to replace setlocale() with a macro
-- 
2.41.0



RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-07-09 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

Hi, I did a performance testing for v16 patch set.
Results show that patches significantly improves the performance in most cases.

# Method

Following tests were done 10 times per condition, and compared by median.
do_one_test.sh was used for the testing.

1.  Create tables on publisher
2.  Insert initial data on publisher
3.  Create tables on subscriber
4.  Create a replication slot (mysub_slot) on publisher
5.  Create a publication on publisher
6.  Create tables on subscriber
--- timer on ---
7.  Create subscription with pre-existing replication slot (mysub_slot)
8.  Wait until all srsubstate in pg_subscription_rel becomes 'r'
--- timer off ---

# Tested sources

I used three types of sources

* HEAD (f863d82)
* HEAD + 0001 + 0002
* HEAD + 0001 + 0002 + 0003

# Tested conditions

Following parameters were changed during the measurement.

### table size

* empty
* around 10kB

### number of tables

* 10
* 100
* 1000
* 2000

### max_sync_workers_per_subscription

* 2
* 4
* 8
* 16

## Results

Please see the attached image file. Each cell shows the improvement percentage 
of
measurement comapred with HEAD, HEAD + 0001 + 0002, and HEAD + 0001 + 0002 + 
0003.

According to the measurement, we can say following things:

* In any cases the performance was improved from the HEAD.
* The improvement became more significantly if number of synced tables were 
increased.
* 0003 basically improved performance from first two patches
* Increasing workers could sometimes lead to lesser performance due to 
contention.
  This was occurred when the number of tables were small. Moreover, this was 
not only happen by patchset - it happened even if we used HEAD.
  Detailed analysis will be done later.

Mored deital, please see the excel file. It contains all the results of 
measurement.

## Detailed configuration

* Powerful machine was used:
 - Number of CPU: 120
 - Memory: 755 GB

* Both publisher and subscriber were on the same machine.
* Following GUC settings were used for both pub/sub:

```
wal_level = logical
shared_buffers = 40GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off
max_wal_senders = 200
max_replication_slots = 200
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



perftest_result.xlsx
Description: perftest_result.xlsx


do_one_test.sh
Description: do_one_test.sh


Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

2023-07-09 Thread Peter Smith
On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila  wrote:
>
> On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada  wrote:
> >
> > I prefer the first suggestion. I've attached the updated patch.
> >
>
> This looks mostly good to me but I think it would be better if we can
> also add the information that the leftmost index column must be a
> non-expression. So, how about: "Candidate indexes must be btree,
> non-partial, and the leftmost index column must be a non-expression
> and reference to a published table column (i.e. cannot consist of only
> expressions)."?

That part in parentheses ought to say "the index ..." because it is
referring to the full INDEX, not to the leftmost column. I think this
was missed when Sawada-san took my previous suggestion [1].

IMO it doesn't sound right to say the "index column must be a
non-expression". It is already a non-expression because it is a
column. So I think it would be better to refer to this as an INDEX
"field" instead of an INDEX column. Note that "field" is the same
terminology used in the docs for CREATE INDEX [2].

SUGGESTION
Candidate indexes must be btree, non-partial, and the leftmost index
field must be a column that references a published table column (i.e.
the index cannot consist of only expressions).



What happened to the earlier idea of removing/merging the redundant
(?) function IsIndexOnlyOnExpression()?
- Something wrong with that?
- Chose not to do it?
- Will do it raised in another thread?

--
[1] my review v2 -
https://www.postgresql.org/message-id/CAHut%2BPsFdTZJ7DG1jyu7BpA_1d4hwEd-Q%2BmQAPWcj1ZLD_X5Dw%40mail.gmail.com
[2] create index - https://www.postgresql.org/docs/current/sql-createindex.html

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Add hint message for check_log_destination()

2023-07-09 Thread Japin Li


On Sat, 08 Jul 2023 at 12:48, Michael Paquier  wrote:
> On Fri, Jul 07, 2023 at 07:23:47PM +0800, Japin Li wrote:
>> +appendStringInfoString(, "\"stderr\"");
>> +#ifdef HAVE_SYSLOG
>> +appendStringInfoString(, ", \"syslog\"");
>> +#endif
>> +#ifdef WIN32
>> +appendStringInfoString(, ", \"eventlog\"");
>> +#endif
>> +appendStringInfoString(, ", \"csvlog\", and 
>> \"jsonlog\"");
>
> Hmm.  Is that OK as a translatable string?


Sorry for the late reply!  I'm not sure.  How can I know whether it is 
translatable?

-- 
Regrads,
Japin Li.




Re: Preventing non-superusers from altering session authorization

2023-07-09 Thread Joseph Koshakow
On Sun, Jul 9, 2023 at 1:03 PM Joseph Koshakow  wrote:

>> * Only a superuser may set auth ID to something other than himself

> Is "auth ID" the right term here? Maybe something like "Only a
> superuser may set their session authorization/ID to something other
> than their authenticated ID."

>>   But we set the GUC variable
>> * is_superuser to indicate whether the *current* session userid is a
>> * superuser.

> Just a small correction here, I believe the is_superuser GUC is meant
> to indicate whether the current user id is a superuser, not the current
> session user id. We only update is_superuser in SetSessionAuthorization
> because we are also updating the current user id in SetSessionUserId.

I just realized that you moved this comment from
SetSessionAuthorization. I think we should leave the part about setting
the GUC variable is_superuser on top of SetSessionAuthorization since
that's where we actually set the GUC.

Thanks,
Joe Koshakow


Re: Cleaning up threading code

2023-07-09 Thread Thomas Munro
Thanks all for the reviews.  I pushed the first two small patches.
Here is a new version of the main --disable-thread-safety removal
part, with these changes:

* fixed LDAP_LIBS_FE snafu reported by Peter E
* defined ENABLE_THREAD_SAFETY 1 in c.h, for the benefit of extensions
* defined ENABLE_THREAD_SAFETY 1 ecpg_config.h, for the benefit of ECPG clients
* added Heikki's doc change as a separate patch (with minor xml snafu fixed)

I'll follow up with a new version of the later pg_threads.h proposal
and responses to feedback on that after getting this basic clean-up
work in.  Any other thoughts on this part?
From 18975f437e6b8e7c36da99238a90e5efa68434b9 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 9 Jun 2023 14:07:26 +1200
Subject: [PATCH v2 1/2] Remove --disable-thread-safety and related code.

All supported computers have either POSIX or Windows threads, and we no
longer have any automated testing of --disable-thread-safety.  We define
a vestigial ENABLE_THREAD_SAFETY macro to 1 in c.h and ecpg_config.h for
the benefit of external code that might be testing for that, but we no
longer test it anywhere in PostgreSQL code, and associated dead code
paths are removed.

The Meson and perl-based Windows build scripts never had an equivalent
build option.

Reviewed-by: Andres Freund 
Reviewed-by: Peter Eisentraut 
Reviewed-by: Heikki Linnakangas 
Discussion: https://postgr.es/m/CA%2BhUKGLtmexrpMtxBRLCVePqV_dtWG-ZsEbyPrYc%2BNBB2TkNsw%40mail.gmail.com
---
 configure | 54 ++
 configure.ac  | 26 ++-
 doc/src/sgml/installation.sgml| 13 
 meson.build   | 11 ---
 src/Makefile.global.in|  1 -
 src/bin/pgbench/pgbench.c | 22 +-
 src/include/c.h   |  7 ++
 src/include/pg_config.h.in|  4 --
 src/interfaces/ecpg/ecpglib/connect.c | 40 ---
 src/interfaces/ecpg/ecpglib/descriptor.c  | 10 ---
 src/interfaces/ecpg/ecpglib/ecpglib_extern.h  |  2 -
 src/interfaces/ecpg/ecpglib/execute.c |  2 -
 src/interfaces/ecpg/ecpglib/memory.c  |  7 --
 src/interfaces/ecpg/ecpglib/misc.c| 47 
 .../ecpg/include/ecpg-pthread-win32.h |  3 -
 src/interfaces/ecpg/include/ecpg_config.h.in  |  5 +-
 src/interfaces/ecpg/include/ecpglib.h |  2 -
 src/interfaces/ecpg/include/meson.build   |  3 +-
 .../ecpg/test/expected/thread-alloc.c | 43 +--
 .../ecpg/test/expected/thread-descriptor.c| 22 +++---
 .../ecpg/test/expected/thread-prep.c  | 71 ---
 .../ecpg/test/expected/thread-thread.c| 63 +++-
 .../test/expected/thread-thread_implicit.c| 63 +++-
 src/interfaces/ecpg/test/thread/alloc.pgc |  9 ---
 .../ecpg/test/thread/descriptor.pgc   |  8 +--
 src/interfaces/ecpg/test/thread/prep.pgc  |  9 ---
 src/interfaces/ecpg/test/thread/thread.pgc|  9 ---
 .../ecpg/test/thread/thread_implicit.pgc  |  9 ---
 src/interfaces/libpq/Makefile |  5 +-
 src/interfaces/libpq/fe-connect.c |  4 --
 src/interfaces/libpq/fe-exec.c|  4 --
 src/interfaces/libpq/fe-print.c   | 13 +---
 src/interfaces/libpq/fe-secure-openssl.c  | 17 +
 src/interfaces/libpq/fe-secure.c  | 26 +--
 src/interfaces/libpq/legacy-pqsignal.c|  4 +-
 src/interfaces/libpq/libpq-int.h  |  9 +--
 src/makefiles/meson.build |  1 -
 src/tools/msvc/Solution.pm|  3 +-
 src/tools/msvc/ecpg_regression.proj   |  2 +-
 39 files changed, 144 insertions(+), 509 deletions(-)

diff --git a/configure b/configure
index 4a8f652129..2e518c8007 100755
--- a/configure
+++ b/configure
@@ -722,7 +722,6 @@ with_tcl
 ICU_LIBS
 ICU_CFLAGS
 with_icu
-enable_thread_safety
 INCLUDES
 autodepend
 PKG_CONFIG_LIBDIR
@@ -848,7 +847,6 @@ with_CC
 with_llvm
 enable_depend
 enable_cassert
-enable_thread_safety
 with_icu
 with_tcl
 with_tclconfig
@@ -1536,7 +1534,6 @@ Optional Features:
   --enable-tap-tests  enable TAP tests (requires Perl and IPC::Run)
   --enable-depend turn on automatic dependency tracking
   --enable-cassertenable assertion checks (for debugging)
-  --disable-thread-safety disable thread-safety in client libraries
   --disable-largefile omit support for large files
 
 Optional Packages:
@@ -8338,43 +8335,6 @@ $as_echo "$as_me: WARNING: *** Library directory $dir does not exist." >&2;}
 done
 IFS=$ac_save_IFS
 
-#
-# Enable thread-safe client libraries
-#
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking allow thread-safe client libraries" >&5
-$as_echo_n "checking allow thread-safe client libraries... " >&6; }
-
-
-# Check whether --enable-thread-safety was given.
-if test "${enable_thread_safety+set}" = set; then :
-  

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

2023-07-09 Thread Tomas Vondra
On 7/9/23 20:05, Heikki Linnakangas wrote:
> On 09/07/2023 19:16, Tomas Vondra wrote:
>> On 7/8/23 23:57, Heikki Linnakangas wrote:
>>> The new preprocess support function feels a bit too inflexible to me.
>>> True, you can store whatever you want in the ScanKey that it returns,
>>> but since that's the case, why not just make it void * ? It seems that
>>> the constraint here was that you need to pass a ScanKey to the
>>> consistent function, because the consistent function's signature is what
>>> it is. But we can change the signature, if we introduce a new support
>>> amproc number for it.
>>
>> Now sure I follow - what should be made (void *)? Oh, you mean not
>> passing the preprocessed array as a scan key at all, and instead passing
>> it as a new (void*) parameter to the (new) consistent function?
> 
> Right.
> 
>>> It would be unpleasant to force all BRIN opclasses to immediately
>>> implement the searcharray-logic. If we don't want to do that, we need to
>>> implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
>>> be pretty easy, right? Just call the regular consistent function for
>>> every element in the array.
>>
>> True, although the question is how many out-of-core opclasses are there.
>> My impression is the number is pretty close to 0, in which case we're
>> making ourselves to jump through all kinds of hoops, making the code
>> more complex, with almost no benefit in the end.
> 
> Perhaps. How many of the opclasses can do something smart with
> SEARCHARRAY? If the answer is "all" or "almost all", then it seems
> reasonable to just require them all to handle it. If the answer is
> "some", then it would still be nice to provide a naive default
> implementation in the AM itself. Otherwise all the opclasses need to
> include a bunch of boilerplate to support SEARCHARRAY.
> 

For the built-in, I think all can do something smart.

For external, hard to say - my guess is they could do something
interesting too.


>>> If an opclass wants to provide a faster/better implementation, it can
>>> provide a new consistent support procedure that supports that. Let's
>>> assign a new amproc number for new-style consistent function, which must
>>> support SK_SEARCHARRAY, and pass it some scratch space where it can
>>> cache whatever per-scankey data. Because it gets a new amproc number, we
>>> can define the arguments as we wish. We can pass a pointer to the
>>> per-scankey scratch space as a new argument, for example.
>>>
>>> We did this backwards-compatibility dance with the 3/4-argument variants
>>> of the current consistent functions. But I think assigning a whole new
>>> procedure number is better than looking at the number of arguments.
>>
>> I actually somewhat hate the 3/4-argument dance, and I'm opposed to
>> doing that sort of thing again. First, I'm not quite convinced it's
>> worth the effort to jump through this hoop (I recall this being one of
>> the headaches when fixing the BRIN NULL handling). Second, it can only
>> be done once - imagine we now need to add a new optional parameter.
>> Presumably, we'd need to keep the existing 3/4 variants, and add new 4/5
>> variants. At which point 4 is ambiguous.
> 
> My point is that we should assign a new amproc number to distinguish the
> new variant, instead of looking at the number of arguments. That way
> it's not ambiguous, and you can define whatever arguments you want for
> the new variant.
> 

Right, I agree.

> Yet another idea is to introduce a new amproc for a consistent function
> that *only* handles the SEARCHARRAY case, and keep the old consistent
> function as it is for the scalars. So every opclass would need to
> implement the current consistent function, just like today. But if an
> opclass wants to support SEARCHARRAY, it could optionally also provide
> an "consistent_array" function.
> 

Hmm, we probably need to do something like this anyway. Because even
with the scratch space in BrinDesc, we still need to track whether the
opclass can process arrays or not. And if it can't we need to emulate it
in brin.c.


>> Yes, my previous message was mostly about backwards compatibility, and
>> this may seem a bit like an argument against it. But that message was
>> more a question "If we do this, is it actually backwards compatible the
>> way we want/need?")
>>
>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
>> doing it that way and report back in a couple days.
> 
> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
> used the preprocess function to pre-calculate the scankey's hash, even
> for scalars. You could use the scratch space in BrinDesc for that,
> before doing anything with SEARCHARRAYs.
> 

Yeah, that's a good idea.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Support logical replication of DDLs

2023-07-09 Thread Zheng Li
On Tue, Jun 27, 2023 at 6:16 AM vignesh C  wrote:

> While development, below are some of the challenges we faced:

> 1. Almost all the members of the AlterTableType enum will have to be 
> annotated.
> 2. Complex functionalities which require access to catalog tables
> cannot be auto generated, custom functions should be written in this
> case.
> 3. Some commands might have completely custom code(no auto generation)
> and in the alter/drop table case we will have hybrid implementation
> both auto generated and custom implementation.

Thanks for providing the PoC for auto generation of the deparser code!

I think this is the main difference between the deparser code and outfuncs.c.
There is no need for catalog access in outfuncs.c, which makes code generation
simpler for outfuncs.c and harder for the deparser. The hybrid
implementation of the deparser doesn't seem
to make it more maintainable, it's probably more confusing. Is it possible to
automate the code with catalog access? There may be common patterns in it also.

Regards,
Zane




Re: DecodeInterval fixes

2023-07-09 Thread reid . thompson
On Sat, 2023-07-08 at 13:18 -0400, Joseph Koshakow wrote:
> Jacob Champion  writes:
> > Hi Joe, here's a partial review:
> 
> Thanks so much for the review!
> 
> > I'm new to this code, but I agree that the use of `type` and the
> > lookahead are not particularly obvious/intuitive. At the very
> > least,
> > they'd need some more explanation in the code. Your boolean flag
> > idea
> > sounds reasonable, though.
> 
> I've updated the patch with the boolean flag idea. I think it's a
> bit cleaner and more readable.
> 
> > > There is one more problem I noticed, but didn't fix. We allow
> > > multiple
> > > "@" to be sprinkled anywhere in the input, even though the docs
> > > [0]
> > > only allow it to appear at the beginning of the input.
> > 
> > (No particular opinion on this.)
> 
> I looked into this a bit. The reason this works is because the date
> time lexer filters out all punctuation. That's what allows us to
> parse
> things like `SELECT date 'January 8, 1999';`. It's probably not worth
> trying to be smarter about what punctuation we allow where, at least
> for now. Maybe in the future we can exclude "@" from the punctuation
> that get's filtered out.
> 
> > It looks like this patch needs a rebase for the CI, too, but there
> > are
> > no conflicts.
> 
> The attached patch is rebased against master.
> 
> Thanks,
> Joe Koshakow

Apologies, I'm posting a little behind the curve here. My initial
thoughts on the original patch mirrored Jacob's re 1 and 2 - that it looked
good, did we need to consider the modified ecpg copy (which has been
answered by Tom). I didn't have have any strong thought re 3) or the '@'. 

The updated patch looks good to me. Seems a little clearer/cleaner.

Thanks,
Reid








Re: RFC: pg_stat_logmsg

2023-07-09 Thread Joe Conway

On 7/7/23 01:38, Gurjeet Singh wrote:

On Thu, Jul 6, 2023 at 12:37 AM Masahiko Sawada  wrote:

On Sat, Jul 1, 2023 at 8:57 AM Joe Conway  wrote:
>
> The basic idea is to mirror how pg_stat_statements works, except the
> logged messages keyed by filename, lineno, and elevel are saved with a
> aggregate count. The format string is displayed (similar to a query
> jumble) for context, along with function name and sqlerrcode.
>
>
> Part of the thinking is that people with fleets of postgres instances
> can use this to scan for various errors that they care about.
> Additionally it would be useful to look for "should not happen" errors.



I'm concerned that displaying the format string could not be helpful
in some cases. For example, when raising an ERROR while reading WAL
records, we typically write the error message stored in
record->errormsg_buf:

in ReadRecord():
if (errormsg)
ereport(emode_for_corrupt_record(emode, xlogreader->EndRecPtr),
(errmsg_internal("%s", errormsg) /* already
translated */ ));

In this case, the error message stored in pg_stat_logmsg is just '%s'.
The filename and line number columns might help identify the actual
error but it requires users to read the source code and cannot know
the actual error message.


I believe that the number of such error messages, the ones with very
little descriptive content, is very low in Postgres code. These kinds
of messages are not prevalent, and must be used when code hits obscure
conditions, like seen in your example above. These are the kinds of
errors that Joe is referring to as "should not happen". In these
cases, even if the error message was descriptive, the user will very
likely have to dive deep into code to find out the real cause.


Agreed. As an example, after running `make installcheck`

8<-
select sum(count) from pg_stat_logmsg;
 sum
--
 6005
(1 row)

select message,sum(count)
from pg_stat_logmsg
where length(message) < 30
  and elevel in ('ERROR','FATAL','PANIC')
  and message like '%\%s%' escape '\'
group by message
order by length(message);
message| sum
---+-
 %s| 107
 "%s" is a view|   9
 "%s" is a table   |   4
 %s is a procedure |   1
 invalid size: "%s"|  13
 %s at or near "%s"| 131
 %s at end of input|   3
...
8<-

I only see three message formats there that are generic enough to be of 
concern (the first and last two shown -- FWIW I did not see any more of 
them as the fmt string gets longer)


So out of 6005 log events, 241 fit this concern.

Perhaps given the small number of message format strings affected, it 
would make sense to special case those, but I am not sure it is worth 
the effort, at least for version 1.



I feel that the unique combination of file name and line number is
useful information, even in cases where the format string not very
descriptive. So I believe the extension's behaviour in this regard is
acceptable.

In cases where the extension's output is not descriptive enough, the
user can use the filename:lineno pair to look for fully formed error
messages in the actual log files; they may have to make appropriate
changes to log_* parameters, though.


Right


If we still feel strongly about getting the actual message for these
cases, perhaps we can develop a heuristic to identify such messages,
and use either full or a prefix of the 'message' field, instead of
'message_id' field. The heuristic could be: strlen(message_id) is too
short, or that message_id is all/overwhelmingly format specifiers
(e.g. '%s: %d').


Based on the above analysis (though granted, not all inclusive), it 
seems like just special casing the specific message format strings of 
interest would work.



The core benefit of this extension is to make it easy for the user to
discover error messages affecting their workload. The user may be
interested in knowing the most common log messages in their server
logs, so that they can work on reducing those errors or warnings. Or
they may be interested in knowing when their workload hits
unexpected/serious error messages, even if they're infrequent. The
data exposed by pg_stat_logmsg view would serve as a starting point,
but I'm guessing it would not be sufficient for their investigation.


Yes, exactly.

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: [PATCH] Remove unnecessary unbind in LDAP search+bind mode

2023-07-09 Thread Anatoly Zaretsky
On Sun, Jul 9, 2023 at 9:57 AM Peter Eisentraut 
wrote:

> committed
>
Thanks!

-- 
Best regards,
Anatoly Zaretsky


Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

2023-07-09 Thread Heikki Linnakangas

On 09/07/2023 19:16, Tomas Vondra wrote:

On 7/8/23 23:57, Heikki Linnakangas wrote:

The new preprocess support function feels a bit too inflexible to me.
True, you can store whatever you want in the ScanKey that it returns,
but since that's the case, why not just make it void * ? It seems that
the constraint here was that you need to pass a ScanKey to the
consistent function, because the consistent function's signature is what
it is. But we can change the signature, if we introduce a new support
amproc number for it.


Now sure I follow - what should be made (void *)? Oh, you mean not
passing the preprocessed array as a scan key at all, and instead passing
it as a new (void*) parameter to the (new) consistent function?


Right.


It would be unpleasant to force all BRIN opclasses to immediately
implement the searcharray-logic. If we don't want to do that, we need to
implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
be pretty easy, right? Just call the regular consistent function for
every element in the array.


True, although the question is how many out-of-core opclasses are there.
My impression is the number is pretty close to 0, in which case we're
making ourselves to jump through all kinds of hoops, making the code
more complex, with almost no benefit in the end.


Perhaps. How many of the opclasses can do something smart with 
SEARCHARRAY? If the answer is "all" or "almost all", then it seems 
reasonable to just require them all to handle it. If the answer is 
"some", then it would still be nice to provide a naive default 
implementation in the AM itself. Otherwise all the opclasses need to 
include a bunch of boilerplate to support SEARCHARRAY.



If an opclass wants to provide a faster/better implementation, it can
provide a new consistent support procedure that supports that. Let's
assign a new amproc number for new-style consistent function, which must
support SK_SEARCHARRAY, and pass it some scratch space where it can
cache whatever per-scankey data. Because it gets a new amproc number, we
can define the arguments as we wish. We can pass a pointer to the
per-scankey scratch space as a new argument, for example.

We did this backwards-compatibility dance with the 3/4-argument variants
of the current consistent functions. But I think assigning a whole new
procedure number is better than looking at the number of arguments.


I actually somewhat hate the 3/4-argument dance, and I'm opposed to
doing that sort of thing again. First, I'm not quite convinced it's
worth the effort to jump through this hoop (I recall this being one of
the headaches when fixing the BRIN NULL handling). Second, it can only
be done once - imagine we now need to add a new optional parameter.
Presumably, we'd need to keep the existing 3/4 variants, and add new 4/5
variants. At which point 4 is ambiguous.


My point is that we should assign a new amproc number to distinguish the 
new variant, instead of looking at the number of arguments. That way 
it's not ambiguous, and you can define whatever arguments you want for 
the new variant.


Yet another idea is to introduce a new amproc for a consistent function 
that *only* handles the SEARCHARRAY case, and keep the old consistent 
function as it is for the scalars. So every opclass would need to 
implement the current consistent function, just like today. But if an 
opclass wants to support SEARCHARRAY, it could optionally also provide 
an "consistent_array" function.



Yes, my previous message was mostly about backwards compatibility, and
this may seem a bit like an argument against it. But that message was
more a question "If we do this, is it actually backwards compatible the
way we want/need?")

Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
doing it that way and report back in a couple days.


Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you 
used the preprocess function to pre-calculate the scankey's hash, even 
for scalars. You could use the scratch space in BrinDesc for that, 
before doing anything with SEARCHARRAYs.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: DecodeInterval fixes

2023-07-09 Thread Joseph Koshakow
On Sat, Jul 8, 2023 at 5:06 PM Gurjeet Singh  wrote:

> I feel the staleness/deficiencies you mention above are not
> captured in the TODO wiki page. It'd be nice if these were documented,
> so that newcomers to the community can pick up work that they feel is
> an easy lift for them.

I think that's a good idea. I've definitely been confused by this in
previous patches I've submitted.


I've broken up this patch into three logical commits and attached them.
None of the actual code has changed.

Thanks,
Joe Koshakow
From b3fe06934b40489d1b4b157677f1292bc698c7da Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 9 Jul 2023 13:12:16 -0400
Subject: [PATCH 1/3] Remove dead code in DecodeInterval

This commit removes dead code for handling unit type RESERVE. There
used to be a unit called "invalid" that was of type RESERVE. At some
point that unit was removed and there were no more units of type
RESERVE. Therefore, the code for RESERVE unit handling is unreachable.
---
 src/backend/utils/adt/datetime.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 5d8d583ddc..2a5dddc43f 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3582,11 +3582,6 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 		type = uval;
 		break;
 
-	case RESERV:
-		tmask = (DTK_DATE_M | DTK_TIME_M);
-		*dtype = uval;
-		break;
-
 	default:
 		return DTERR_BAD_FORMAT;
 }
-- 
2.34.1

From 6271c5fcca30de0982b4b6073b49c1cea6c7391b Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 9 Jul 2023 13:17:08 -0400
Subject: [PATCH 2/3] Fix Interval 'ago' parsing

This commit Restrict the unit "ago" to only appear at the end of the
interval. According to the docs [0], this is the only valid place to
put it, but we allowed it multiple times at any point in the input.

[0] https://www.postgresql.org/docs/15/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
---
 src/backend/utils/adt/datetime.c   | 6 ++
 src/test/regress/expected/interval.out | 9 +
 src/test/regress/sql/interval.sql  | 4 
 3 files changed, 19 insertions(+)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 2a5dddc43f..9d09381328 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3578,6 +3578,12 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 		break;
 
 	case AGO:
+		/*
+		 * 'ago' is only allowed to appear at the end of the
+		 * interval.
+		 */
+		if (i != nf - 1)
+			return DTERR_BAD_FORMAT;
 		is_before = true;
 		type = uval;
 		break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index 28b71d9681..42062f947f 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -1787,3 +1787,12 @@ SELECT extract(epoch from interval '10 days');
  864000.00
 (1 row)
 
+-- test that ago can only appear once at the end of the interval.
+SELECT INTERVAL '42 days 2 seconds ago ago';
+ERROR:  invalid input syntax for type interval: "42 days 2 seconds ago ago"
+LINE 1: SELECT INTERVAL '42 days 2 seconds ago ago';
+^
+SELECT INTERVAL '2 minutes ago 5 days';
+ERROR:  invalid input syntax for type interval: "2 minutes ago 5 days"
+LINE 1: SELECT INTERVAL '2 minutes ago 5 days';
+^
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 56feda1a3d..8fd2e7f41e 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -582,3 +582,7 @@ SELECT f1,
 
 -- internal overflow test case
 SELECT extract(epoch from interval '10 days');
+
+-- test that ago can only appear once at the end of the interval.
+SELECT INTERVAL '42 days 2 seconds ago ago';
+SELECT INTERVAL '2 minutes ago 5 days';
-- 
2.34.1

From 2ffb81e95031b43955fdba784356fc54659775e2 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 9 Jul 2023 13:21:23 -0400
Subject: [PATCH 3/3] Fix Interval unit parsing

This commit will error when the user has multiple consecutive units or
a unit without an accompanying value.
---
 src/backend/utils/adt/datetime.c   | 12 
 src/test/regress/expected/interval.out |  9 +
 src/test/regress/sql/interval.sql  |  4 
 3 files changed, 25 insertions(+)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 9d09381328..edf22f458e 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3278,6 +3278,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 {
 	bool		force_negative = false;
 	bool		is_before = false;
+	bool		parsing_unit_val = false;
 	char	   *cp;
 	int			fmask = 0,
 tmask,
@@ -3336,6 +3337,7 @@ DecodeInterval(char **field, int *ftype, int nf, 

Re: Preventing non-superusers from altering session authorization

2023-07-09 Thread Joseph Koshakow
On Sun, Jul 9, 2023 at 12:47 AM Nathan Bossart 
wrote:

> I think we should split this into two patches: one to move the permission
> check to check_session_authorization() and another for the behavior
change.
> I've attached an attempt at the first one (that borrows heavily from your
> latest patch).  AFAICT the only reason that the permission check lives in
> SetSessionAuthorization() is because AuthenticatedUserIsSuperuser is
static
> to miscinit.c and doesn't have an accessor function.  I added one, but it
> would probably just be removed by the following patch.  WDYT?

I think that's a good idea. We could even keep around the accessor
function as a good place to bundle the calls to
Assert(OidIsValid(AuthenticatedUserId))
and
superuser_arg(AuthenticatedUserId)

> * Only a superuser may set auth ID to something other than himself

Is "auth ID" the right term here? Maybe something like "Only a
superuser may set their session authorization/ID to something other
than their authenticated ID."

>   But we set the GUC variable
> * is_superuser to indicate whether the *current* session userid is a
> * superuser.

Just a small correction here, I believe the is_superuser GUC is meant
to indicate whether the current user id is a superuser, not the current
session user id. We only update is_superuser in SetSessionAuthorization
because we are also updating the current user id in SetSessionUserId.
For example,

test=# CREATE ROLE r1 SUPERUSER;
CREATE ROLE
test=# CREATE ROLE r2;
CREATE ROLE
test=# SET SESSION AUTHORIZATION r1;
SET
test=# SET ROLE r2;
SET
test=> SELECT session_user, current_user;
 session_user | current_user
--+--
 r1   | r2
(1 row)

test=> SHOW is_superuser;
 is_superuser
--
 off
(1 row)

Which has also made me realize that the comment on is_superuser in
guc_tables.c is incorrect:

> /* Not for general use --- used by SET SESSION AUTHORIZATION */

Additionally the C variable name for is_superuser is fairly misleading:

> session_auth_is_superuser

The documentation for this GUC in show.sgml is correct:

> True if the current role has superuser privileges.

As an aside, I'm starting to think we should consider removing this
GUC. It sometimes reports an incorrect value [0], and potentially is
not used internally for anything.

I've rebased my changes over your patch and attached them both.

[0]
https://www.postgresql.org/message-id/CAAvxfHcxH-hLndty6CRThGXL1hLsgCn%2BE3QuG_4Qi7GxrHmgKg%40mail.gmail.com
From 2e1689b5fe384d675043beb9df8eff49a0ff436e Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 9 Jul 2023 12:58:41 -0400
Subject: [PATCH 2/2] Prevent non-superusers from altering session auth

Previously, if a user connected with as a role that had the superuser
attribute, then they could always execute a SET SESSION AUTHORIZATION
statement for the duration of their session. Even if the role was
altered to set superuser to false, the user was still allowed to
execute SET SESSION AUTHORIZATION. This allowed them to set their
session role to some other superuser and effectively regain the
superuser privileges. They could even reset their own superuser
attribute to true.

This commit alters the privilege checks for SET SESSION AUTHORIZATION
such that a user can only execute it if the role they connected with is
currently a superuser. This prevents users from regaining superuser
privileges after it has been revoked.
---
 doc/src/sgml/ref/set_session_auth.sgml |  2 +-
 src/backend/utils/init/miscinit.c  | 19 +--
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/set_session_auth.sgml b/doc/src/sgml/ref/set_session_auth.sgml
index f8fcafc194..94adab2468 100644
--- a/doc/src/sgml/ref/set_session_auth.sgml
+++ b/doc/src/sgml/ref/set_session_auth.sgml
@@ -51,7 +51,7 @@ RESET SESSION AUTHORIZATION
 
   
The session user identifier can be changed only if the initial session
-   user (the authenticated user) had the
+   user (the authenticated user) has the
superuser privilege.  Otherwise, the command is accepted only if it
specifies the authenticated user name.
   
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index f5548a0f47..1aa393f9fd 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -467,7 +467,7 @@ ChangeToDataDir(void)
  * AuthenticatedUserId is determined at connection start and never changes.
  *
  * SessionUserId is initially the same as AuthenticatedUserId, but can be
- * changed by SET SESSION AUTHORIZATION (if AuthenticatedUserIsSuperuser).
+ * changed by SET SESSION AUTHORIZATION (if AuthenticatedUserId is a superuser).
  * This is the ID reported by the SESSION_USER SQL function.
  *
  * OuterUserId is the current user ID in effect at the "outer level" (outside
@@ -492,8 +492,7 @@ static Oid	OuterUserId = InvalidOid;
 static Oid	

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

2023-07-09 Thread Tomas Vondra
On 7/8/23 23:57, Heikki Linnakangas wrote:
> On 02/07/2023 19:09, Tomas Vondra wrote:
>> Here's an updated version of the patch series.
>>
>> I've polished and pushed the first three patches with cleanup, tests to
>> improve test coverage and so on. I chose not to backpatch those - I
>> planned to do that to make future backpatches simpler, but the changes
>> ended up less disruptive than expected.
>>
>> The remaining patches are just about adding SK_SEARCHARRAY to BRIN.
>>
>> 0001 - adds the optional preprocess procedure, calls it from brinrescan
>>
>> 0002 to 0005 - adds the support to the existing BRIN opclasses
> 
> Could you implement this completely in the consistent-function, by
> caching the sorted array in fn_extra, without adding the new preprocess
> procedure? On first call, when fn_extra == NULL, sort the array and
> stash it in fn_extra.
> 
> I don't think that works, because fn_extra isn't reset when the scan
> keys change on rescan. We could reset it, and document that you can use
> fn_extra for per-scankey caching. There's some precedence for not
> resetting it though, see commit d22a09dc70f. But we could provide an
> opaque per-scankey scratch space like that somewhere else. In BrinDesc,
> perhaps.
> 

Hmm, yeah. BrinDesc seems like a good place for such scratch space ...

And it's seem to alleviate most of the compatibility issues, because
it'd make the preprocessing a responsibility of the consistent function,
instead of doing it in a separate optional procedure (and having to deal
with cases when it does not exist). If would not even need a separate
procnum.

> The new preprocess support function feels a bit too inflexible to me.
> True, you can store whatever you want in the ScanKey that it returns,
> but since that's the case, why not just make it void * ?It seems that
> the constraint here was that you need to pass a ScanKey to the
> consistent function, because the consistent function's signature is what
> it is. But we can change the signature, if we introduce a new support
> amproc number for it.
> 

Now sure I follow - what should be made (void *)? Oh, you mean not
passing the preprocessed array as a scan key at all, and instead passing
it as a new (void*) parameter to the (new) consistent function?

Yeah, I was trying to stick to the existing signature of the consistent
function, to minimize the necessary changes.

>> The main open question I have is what exactly does it mean that the
>> procedure is optional. In particular, should it be supported to have a
>> BRIN opclass without the "preprocess" procedure but using the other
>> built-in support procedures?
>>
>> For example, imagine you have a custom BRIN opclass in an extension (for
>> a custom data type or something). This does not need to implement any
>> procedures, it can just call the existing built-in ones. Of course, this
>> won't get the "preprocess" procedure automatically.
>>
>> Should we support such opclasses or should we force the extension to be
>> updated by adding a preprocess procedure? I'd say "optional" means we
>> should support (otherwise it'd not really optional).
>>
>> The reason why this matters is that "amsearcharray" is AM-level flag,
>> but the support procedure is defined by the opclass. So the consistent
>> function needs to handle SK_SEARCHARRAY keys both with and without
>> preprocessing.
>>
>> That's mostly what I did for all existing BRIN opclasses (it's a bit
>> confusing that opclass may refer to both the "generic" minmax or the
>> opclass defined for a particular data type). All the opclasses now
>> handle three cases:
>>
>> 1) scalar keys (just like before, with amsearcharray=fase)
>>
>> 2) array keys with preprocessing (sorted array, array of hashes, ...)
>>
>> 3) array keys without preprocessing (for compatibility with old
>>     opclasses missing the optional preprocess procedure)
>>
>> The current code is a bit ugly, because it duplicates a bunch of code,
>> because the option (3) mostly does (1) in a loop. I'm confident this can
>> be reduced by refactoring and reusing some of the "shared" code.
>>
>> The question is if my interpretation of what "optional" procedure means
>> is reasonable. Thoughts?
>>
>> The other thing is how to test this "compatibility" code. I assume we
>> want to have the procedure for all built-in opclasses, so that won't
>> exercise it. I did test it by temporarily removing the procedure from a
>> couple pg_amproc.dat entries. I guess creating a custom opclass in the
>> regression test is the right solution.
> 
> It would be unpleasant to force all BRIN opclasses to immediately
> implement the searcharray-logic. If we don't want to do that, we need to
> implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
> be pretty easy, right? Just call the regular consistent function for
> every element in the array.
> 

True, although the question is how many out-of-core opclasses are there.
My impression is the number is pretty close to 0, in which 

Re: psql: Add role's membership options to the \du+ command

2023-07-09 Thread Pavel Luzanov

On 08.07.2023 20:07, Tom Lane wrote:

1. I was thinking in terms of dropping the "Member of" column entirely
in \du and \dg.  It doesn't tell you enough, and the output of those
commands is often too wide already.


I understood it that way that the dropping "Member of" column will be 
done as part of another work for v17. [1]

But I'm ready to do it now.


2. You have describeRoleGrants() set up to localize "ADMIN", "INHERIT",
and "SET".  Since those are SQL keywords, our usual practice is to not
localize them; this'd simplify the code.


The reason is that \du has translatable all attributes of the role, 
including NOINHERIT.

I decided to make a new command the same way.
But I'm ready to leave them untranslatable, if no objections.


3. Not sure about use of LEFT JOIN in the query.  That will mean we
get a row out even for roles that have no grants, which seems like
clutter.  The LEFT JOINs to r and g are fine, but I suggest changing
the first join to a plain join.


It was David's suggestion:

On 24.06.2023 18:57, David G. Johnston wrote:
On Sat, Jun 24, 2023 at 8:11 AM Pavel Luzanov 
 wrote:


* The new meta-command will not show all roles. It will only show the
roles included in other roles.
To show all roles you need to add an outer join between pg_roles and
pg_auth_members.
But all columns except "role" will be left blank. Is it worth
doing this?


I'm inclined to want this.  I would be good when specifying a role to 
filter upon that all rows that do exist matching that filter end up in 
the output regardless if they are standalone or not.


Personally, I tend to think that left join is not needed. No memberships 
- nothing shown.


So, I accepted all three suggestions. I will wait for other opinions and
plan to implement discussed changes within a few days.

1. https://www.postgresql.org/message-id/4133242.1687481416%40sss.pgh.pa.us

--
Pavel Luzanov
Postgres Professional:https://postgrespro.com


Re: Autogenerate some wait events code and documentation

2023-07-09 Thread Michael Paquier
On Sun, Jul 09, 2023 at 09:15:34AM +0200, Drouvot, Bertrand wrote:
> I also noticed that you now provide the culprit line in case of parsing
> failure (thanks for that).

Yes, that's mentioned in the commit message I quickly wrote in 0002.

>  #
> -#   "C symbol in enums" "format in the system views" "description in the 
> docs"
> +#   "format in the system views" "description in the docs"
> 
> Should we add a note here about the impact of the "format in the system 
> views" on
> the auto generated enum? (aka how it is generated based on its format)?

There is one, but now that I look at it WAIT_EVENT repeated twice does
not look great, so this could use "FooBarName" or equivalent:
+   # Generate the element name for the enums based on the
+   # description.  Camelcase strings like "WaitEventName"
+   # are converted to WAIT_EVENT_WAIT_EVENT_NAME.
--
Michael


signature.asc
Description: PGP signature


Re: Synchronizing slots from primary to standby

2023-07-09 Thread Bharath Rupireddy
On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila  wrote:
>
> On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand
>  wrote:
> >
> > Please find attached V5 (a rebase of V4 posted up-thread).
> >
> > In addition to the "rebasing" work, the TAP test adds a test about conflict 
> > handling (logical slot invalidation)
> > relying on the work done in the "Minimal logical decoding on standby" patch 
> > series.
> >
> > I did not look more at the patch (than what's was needed for the rebase) 
> > but plan to do so.
> >
>
> Are you still planning to continue working on this? Some miscellaneous
> comments while going through this patch are as follows?
>
> 1. Can you please try to explain the functionality of the overall
> patch somewhere in the form of comments and or commit message?

IIUC, there are 2 core ideas of the feature:

1) It will never let the logical replication subscribers go ahead of
physical replication standbys specified in standby_slot_names. It
implements this by delaying decoding of commit records on the
walsenders corresponding to logical replication subscribers on the
primary until all the specified standbys confirm receiving the commit
LSN.

2) The physical replication standbys will synchronize data of the
specified logical replication slots (in synchronize_slot_names) from
the primary, creating the logical replication slots if necessary.
Since the logical replication subscribers will never go out of
physical replication standbys, the standbys can safely synchronize the
slots and keep the data necessary for subscribers to connect to it and
work seamlessly even after a failover.

If my understanding is right, I have few thoughts here:

1. All the logical walsenders are delayed on the primary - per
wait_for_standby_confirmation() despite the user being interested in
only a few of them via synchronize_slot_names. Shouldn't the delay be
for just the slots specified in synchronize_slot_names?
2. I think we can split the patch like this - 0001 can be the logical
walsenders delaying decoding on the primary unless standbys confirm,
0002 standby synchronizing the logical slots.
3. I think we need to change the GUC standby_slot_names to better
reflect what it is used for - wait_for_replication_slot_names or
wait_for_
4. It allows specifying logical slots in standby_slot_names, meaning,
it can disallow logical slots getting ahead of other logical slots
specified in standby_slot_names. Should we allow this case with the
thinking that if there's anyone using logical replication for failover
(well, will anybody do that in production?).
5. Similar to above, it allows specifying physical slots in
synchronize_slot_names. Should we disallow?

I'm attaching the v6 patch, a rebased version of v5.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v6-0001-Synchronize-logical-replication-slots-from-primar.patch
Description: Binary data


Re: Fix last unitialized memory warning

2023-07-09 Thread Peter Eisentraut

On 06.07.23 15:41, Tristan Partin wrote:

On Thu Jul 6, 2023 at 3:21 AM CDT, Peter Eisentraut wrote:

On 05.07.23 23:06, Tristan Partin wrote:

Thanks for following up. My system is Fedora 38. I can confirm this is
still happening on master.

$ gcc --version
gcc (GCC) 13.1.1 20230614 (Red Hat 13.1.1-4)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ meson setup build --buildtype=release


This buildtype turns on -O3 warnings.  We have usually opted against
chasing warnings in -O3 level because there are often some
false-positive uninitialized variable warnings with every new compiler.

Note that we have set the default build type to debugoptimized, for that
reason.


Good to know, thanks.

Regarding the original patch, do you think it is good to be applied?


That patch looks reasonable.  But I can't actually reproduce the 
warning, even with gcc-13.  I do get the warning from plpgsql.  Can you 
show the warning you are seeing?






Re: Autogenerate some wait events code and documentation

2023-07-09 Thread Drouvot, Bertrand

Hi,

On 7/9/23 6:32 AM, Michael Paquier wrote:

On Fri, Jul 07, 2023 at 01:49:24PM +0900, Michael Paquier wrote:

Hmm.  If we go down this road I would make the choice of simplicity
and remove entirely a column, then, generating the snakecase from the
camelcase or vice-versa (say like a $string =~ s/([a-z]+)/$1_/g;),
even if it means having slightly incompatible strings showing to the
users. And I'd rather minimize the number of exceptions we need to
handle in this automation (aka no exception rules for some keywords
like "SSL" or "WAL", etc.).


After pondering more about that, the attached patch set does exactly
that.


Thanks!


 Patch 0001 includes an update of the wait event names so as
these are more consistent with the enum elements generated.  With this
change, users can apply lower() or upper() across monitoring queries
and still get the same results as before.  An exception was the
message queue events, which the enums used "MQ" but the event names
used "MessageQueue", but this concerns only four lines of code in the
backend.  The newly-generated enum elements match with the existing
ones, except for MQ.




Patch 0002 introduces a set of simplifications for the format of
wait_event_names.txt:
- Removal of the first column for the enums.
- Removal of the quotes for the event name.  We have a single keyword
for these, so that's kind of annoying to cope with that for new
entries.
- Build of the enum elements using the event names, by applying a
rebuild as simple as that:
+   $waiteventenumname =~ s/([a-z])([A-Z])/$1_$2/g;
+   $waiteventenumname = uc($waiteventenumname);

Thoughts?


That's great and it does simplify the wait_event_names.txt format (and the
impact on "MQ" does not seem like a big deal).

I also noticed that you now provide the culprit line in case of parsing
failure (thanks for that).

 #
-#   "C symbol in enums" "format in the system views" "description in the docs"
+#   "format in the system views" "description in the docs"

Should we add a note here about the impact of the "format in the system views" 
on
the auto generated enum? (aka how it is generated based on its format)?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Remove unnecessary unbind in LDAP search+bind mode

2023-07-09 Thread Peter Eisentraut

On 03.07.23 11:53, Peter Eisentraut wrote:

On 23.03.23 02:45, Anatoly Zaretsky wrote:

Comments in src/backend/libpq/auth.c [1] say:
(after successfully finding the final DN to check the user-supplied 
password against)

/* Unbind and disconnect from the LDAP server */
and later
/*
* Need to re-initialize the LDAP connection, so that we can bind to
* it with a different username.
*/

But the protocol actually permits multiple subsequent authentications 
("binds" in LDAP parlance) over a single connection [2].
Moreover, inspection of the code revision history of mod_authnz_ldap, 
pam_ldap, Bugzilla, and MediaWiki LDAP authentication plugin, shows 
that they've been doing this bind-after-search over the same LDAP 
connection for ~20 years without any evidence of interoperability 
troubles.


So, it seems like the whole connection re-initialization thing was 
just a confusion caused by this very unfortunate "historical" 
naming, and can be safely removed, thus saving quite a few 
network round-trips, especially for the case of ldaps/starttls.


Your reasoning and your patch look correct to me.


committed





Re: check_strxfrm_bug()

2023-07-09 Thread Thomas Munro
On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut  wrote:
> So I don't think this code is correct.  AFAICT, there is nothing right
> now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
> the intention?

Yes, that was my intention.  Windows actually doesn't have them.  The
autoconf/MinGW test result was telling the truth.  Somehow I had to
make the three build systems agree on this.  Either by strong-arming
all three of them to emit a hard-coded claim that it does, or by
removing the test that produces a different answer in different build
systems.  I will happily do it the other way if you insist, which
would involve restoring the meson.build and Solultion.pm kludges you
quoted, but I'd also have to add a compatible kludge to configure.ac.
It doesn't seem like an improvement to me but I don't feel strongly
about it.  In the end, Solution.pm and configure.ac will be vaporised
by lasers, so we'll be left with 0 or 1 special cases.  I don't care
much, but I like 0, it's nice and round.




Re: check_strxfrm_bug()

2023-07-09 Thread Peter Eisentraut

On 07.07.23 22:30, Thomas Munro wrote:

Thinking about how to bring this all into "normal" form -- where
HAVE_XXX means "system defines XXX", not "system defines XXX || we
define a replacement"


HAVE_XXX means "code can use XXX", doesn't matter how it got there (it 
could also be a libpgport replacement).


So I don't think this code is correct.  AFAICT, there is nothing right 
now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that 
the intention?


I think these changes would need to be reverted to make this correct:

-# MSVC has replacements defined in src/include/port/win32_port.h.
-if cc.get_id() == 'msvc'
-  cdata.set('HAVE_WCSTOMBS_L', 1)
-  cdata.set('HAVE_MBSTOWCS_L', 1)
-endif

-   HAVE_MBSTOWCS_L => 1,
+   HAVE_MBSTOWCS_L => undef,

-   HAVE_WCSTOMBS_L => 1,
+   HAVE_WCSTOMBS_L => undef,