Re: "type with xxxx does not exist" when doing ExecMemoize()

2024-02-27 Thread Andrei Lepikhov

On 28/2/2024 13:53, Tender Wang wrote:
The attached patch is a new version based on v3 (not including Andrei's
test case). There is no need to call datumCopy when isnull is true.

I have not added a new runtime MemoryContext so far and continue to use
mstate->tableContext; I'm not sure whether the memory used by the probeslot
will affect mstate->mem_limit.
Maybe adding a new MemoryContext is better. I think I should spend a
little time learning nodeMemoize.c more deeply.
I am curious about your reasons to stay with tableContext. In terms of 
memory allocation, Richard's approach looks better.
Also, you don't need to initialize tts_values[i] at all if tts_isnull[i] is
set to true.
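To make that point concrete, here is a toy sketch in standalone C (not PostgreSQL code; ProbeSlot and fill_probe_slot are invented stand-ins for TupleTableSlot fields and prepare_probe_slot): when a key is NULL there is no datum to copy, so both the "datumCopy" and the value initialization are skipped and the null flag alone carries the information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct ProbeSlot
{
	char	   *tts_values[4];	/* stands in for Datum values */
	bool		tts_isnull[4];
} ProbeSlot;

static void
fill_probe_slot(ProbeSlot *slot, const char **src, int nkeys)
{
	for (int i = 0; i < nkeys; i++)
	{
		slot->tts_isnull[i] = (src[i] == NULL);
		if (!slot->tts_isnull[i])
		{
			/* the "datumCopy": only done for non-null keys */
			char	   *copy = malloc(strlen(src[i]) + 1);

			strcpy(copy, src[i]);
			slot->tts_values[i] = copy;
		}
		/* for NULL keys, tts_values[i] is deliberately left untouched */
	}
}
```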


--
regards,
Andrei Lepikhov
Postgres Professional





Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi,

On Wed, Feb 28, 2024 at 06:48:37AM +, Zhijie Hou (Fujitsu) wrote:
> On Wednesday, February 28, 2024 2:38 PM Bertrand Drouvot 
>  wrote:
> > On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote:
> > > On Mon, Feb 26, 2024 at 9:13 AM shveta malik 
> > wrote:
> > > >
> > > > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > > > I think to set secure search path for remote connection, the
> > > > > > standard approach could be to extend the code in
> > > > > > libpqrcv_connect[1], so that we don't need to schema qualify all the
> > operators in the queries.
> > > > > >
> > > > > > And for local connection, I agree it's also needed to add a
> > > > > > SetConfigOption("search_path", "" call in the slotsync worker.
> > > > > >
> > > > > > [1]
> > > > > > libpqrcv_connect
> > > > > > ...
> > > > > >   if (logical)
> > > > > > ...
> > > > > >   res = libpqrcv_PQexec(conn->streamConn,
> > > > > >
> > > > > > ALWAYS_SECURE_SEARCH_PATH_SQL);
> > > > > >
> > > > >
> > > > > Agree, something like in the attached? (it's .txt to not disturb the 
> > > > > CF bot).
> > > >
> > > > Thanks for the patch, changes look good. I have incorporated it in the
> > > > patch which addresses the rest of your comments in [1]. I have
> > > > attached the patch as .txt
> > > >
> > >
> > > Few comments:
> > > ===
> > > 1.
> > > - if (logical)
> > > + if (logical || !replication)
> > >   {
> > >
> > > Can we add a comment about connection types that require
> > > ALWAYS_SECURE_SEARCH_PATH_SQL?
> > 
> > Yeah, will do.
> > 
> > >
> > > 2.
> > > Can we add a test case to demonstrate that the '=' operator can be
> > > hijacked to do different things when the slotsync worker didn't use
> > > ALWAYS_SECURE_SEARCH_PATH_SQL?
> > 
> > I don't think that's good to create a test to show how to hijack an operator
> > within a background worker.
> > 
> > I had a quick look and did not find existing tests in this area around
> > ALWAYS_SECURE_SEARCH_PATH_SQL / search_path and background worker.
> 
> I think a similar commit 11da970 has added a test for the search_path, e.g.

Oh right, thanks for sharing!

But do we think it's worth showing how to hijack an operator within a background
worker "just" to verify that the search_path works as expected?

I don't think it's worth it, but I will do it if others have different opinions.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2024-02-27 Thread shveta malik
On Wed, Feb 28, 2024 at 8:49 AM Amit Kapila  wrote:
>
>
> Few comments:

Thanks for the feedback.

> ===
> 1.
> - if (logical)
> + if (logical || !replication)
>   {
>
> Can we add a comment about connection types that require
> ALWAYS_SECURE_SEARCH_PATH_SQL?
>
> 2.
> Can we add a test case to demonstrate that the '=' operator can be
> hijacked to do different things when the slotsync worker didn't use
> ALWAYS_SECURE_SEARCH_PATH_SQL?
>

Here is the patch with new test added and improved comments.

thanks
Shveta


v2-0001-Fixups-for-commit-93db6cbda0.patch
Description: Binary data


Re: "type with xxxx does not exist" when doing ExecMemoize()

2024-02-27 Thread Tender Wang
The attached patch is a new version based on v3 (not including Andrei's
test case). There is no need to call datumCopy when isnull is true.

I have not added a new runtime MemoryContext so far and continue to use
mstate->tableContext; I'm not sure whether the memory used by the probeslot
will affect mstate->mem_limit.
Maybe adding a new MemoryContext is better. I think I should spend a little
time learning nodeMemoize.c more deeply.

Andrei Lepikhov  于2024年2月26日周一 20:29写道:

> On 26/2/2024 18:34, Richard Guo wrote:
> >
> > On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov wrote:
> >
> > On 26/2/2024 12:44, Tender Wang wrote:
> >  > Make sense. I found MemoizeState already has a MemoryContext, so
> > I used it.
> >  > I update the patch.
> > This approach is better for me. In the next version of this patch, I
> > included a test case. I am still unsure about the context chosen and
> > the
> > stability of the test case. Richard, you recently fixed some Memoize
> > issues, could you look at this problem and patch?
> >
> >
> > I looked at this issue a bit.  It seems to me what happens is that at
> > first the memory areas referenced by probeslot->tts_values[] are
> > allocated in the per tuple context (see prepare_probe_slot).  And then
> > in MemoizeHash_hash, after we've calculated the hashkey, we will reset
> > the per tuple context.  However, later in MemoizeHash_equal, we still
> > need to reference the values in probeslot->tts_values[], which have been
> > cleared.
> Agree
> >
> > Actually the context would always be reset in MemoizeHash_equal, for
> > both binary and logical mode.  So I kind of wonder if it's necessary to
> > reset the context in MemoizeHash_hash.
> I can only offer one thought against this solution: what if we have a
> lot of unique hash values, maybe all of them? In that case, we still
> have the kind of 'leak' David fixed in commit 0b053e78b5.
> Also, I have a segfault report from one client. As far as I can see, it
> was caused by an overly long text column in the table slot: the key
> value stored in the Memoize hash table was corrupted, and the most
> plausible cause is this bug. Should we add a test for this bug, and
> what do you think about the one proposed in v3?
>
> --
> regards,
> Andrei Lepikhov
> Postgres Professional
>
>
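Richard's lifetime analysis above can be modeled outside PostgreSQL with a toy sketch (standalone C; the bump allocator and names are invented stand-ins for MemoryContext machinery): values built in a short-lived "per tuple" arena must not be referenced after that arena is reset, so the key is copied into a longer-lived arena first. The reset poisons memory the way CLOBBER_FREED_MEMORY builds do, so a stale reference would be visible.

```c
#include <assert.h>
#include <string.h>

typedef struct Arena
{
	char		buf[256];
	size_t		used;
} Arena;

static void *
arena_alloc(Arena *a, size_t n)
{
	void	   *p = a->buf + a->used;

	a->used += n;
	return p;
}

static void
arena_reset(Arena *a)
{
	memset(a->buf, 0x7f, a->used);	/* make stale references visible */
	a->used = 0;
}

/* The fix: copy the key out of per_tuple before per_tuple is reset. */
static char *
build_key(Arena *per_tuple, Arena *table_ctx, const char *src)
{
	char	   *tmp = arena_alloc(per_tuple, strlen(src) + 1);
	char	   *kept;

	strcpy(tmp, src);			/* built in the per-tuple context */
	kept = arena_alloc(table_ctx, strlen(tmp) + 1);
	strcpy(kept, tmp);			/* the "datumCopy" into the longer-lived context */
	arena_reset(per_tuple);		/* what MemoizeHash_hash does */
	return kept;				/* still valid after the reset */
}
```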

-- 
Tender Wang
OpenPie:  https://en.openpie.com/


v4-0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch
Description: Binary data


RE: Synchronizing slots from primary to standby

2024-02-27 Thread Zhijie Hou (Fujitsu)
On Wednesday, February 28, 2024 2:38 PM Bertrand Drouvot 
 wrote:
> On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote:
> > On Mon, Feb 26, 2024 at 9:13 AM shveta malik 
> wrote:
> > >
> > > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot
> > >  wrote:
> > > >
> > > > Hi,
> > > > > I think to set secure search path for remote connection, the
> > > > > standard approach could be to extend the code in
> > > > > libpqrcv_connect[1], so that we don't need to schema qualify all the
> operators in the queries.
> > > > >
> > > > > And for local connection, I agree it's also needed to add a
> > > > > SetConfigOption("search_path", "" call in the slotsync worker.
> > > > >
> > > > > [1]
> > > > > libpqrcv_connect
> > > > > ...
> > > > >   if (logical)
> > > > > ...
> > > > >   res = libpqrcv_PQexec(conn->streamConn,
> > > > >
> > > > > ALWAYS_SECURE_SEARCH_PATH_SQL);
> > > > >
> > > >
> > > > Agree, something like in the attached? (it's .txt to not disturb the CF 
> > > > bot).
> > >
> > > Thanks for the patch, changes look good. I have incorporated it in the
> > > patch which addresses the rest of your comments in [1]. I have
> > > attached the patch as .txt
> > >
> >
> > Few comments:
> > ===
> > 1.
> > - if (logical)
> > + if (logical || !replication)
> >   {
> >
> > Can we add a comment about connection types that require
> > ALWAYS_SECURE_SEARCH_PATH_SQL?
> 
> Yeah, will do.
> 
> >
> > 2.
> > Can we add a test case to demonstrate that the '=' operator can be
> > hijacked to do different things when the slotsync worker didn't use
> > ALWAYS_SECURE_SEARCH_PATH_SQL?
> 
> I don't think that's good to create a test to show how to hijack an operator
> within a background worker.
> 
> I had a quick look and did not find existing tests in this area around
> ALWAYS_SECURE_SEARCH_PATH_SQL / search_path and background worker.

I think a similar commit 11da970 has added a test for the search_path, e.g.

# Create some preexisting content on publisher
$node_publisher->safe_psql(
'postgres',
"CREATE FUNCTION public.pg_get_replica_identity_index(int)
 RETURNS regclass LANGUAGE sql AS 'SELECT 1/0'");# shall not call

Best Regards,
Hou zj


Re: Add new error_action COPY ON_ERROR "log"

2024-02-27 Thread Bharath Rupireddy
On Mon, Feb 26, 2024 at 5:47 PM torikoshia  wrote:
>
> >
> > It looks good to me.
>
> Here are comments on the v2 patch.

Thanks for looking at it.

> +   if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
> +   {
> +   ereport(NOTICE,
>
> I think cstate->opts.on_error is not COPY_ON_ERROR_STOP here, since if
> it is COPY_ON_ERROR_STOP, InputFunctionCallSafe() should already have
> errored out.
>
> Should it be something like "Assert(cstate->opts.on_error !=
> COPY_ON_ERROR_STOP)"?

Nice catch. When COPY_ON_ERROR_STOP is specified, we use ereport's
soft error mechanism. An assertion seems a good choice to validate the
state is what we expect. Done that way.
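The review point can be sketched in standalone C (not the actual patch; the enum and function names are invented): if control can reach the NOTICE only when on_error is not COPY_ON_ERROR_STOP, because the STOP case already errored out in InputFunctionCallSafe(), then a runtime branch is dead code, while an assertion both documents and verifies the invariant in assert-enabled builds.

```c
#include <assert.h>

enum copy_on_error
{
	COPY_ON_ERROR_STOP,
	COPY_ON_ERROR_IGNORE
};

static int
report_skipped_rows(enum copy_on_error on_error, int skipped)
{
	/* was: if (on_error != COPY_ON_ERROR_STOP) { ereport(NOTICE, ...); } */
	assert(on_error != COPY_ON_ERROR_STOP);
	return skipped;				/* stands in for the NOTICE payload */
}
```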

> Should below manual also be updated?
>
> > A NOTICE message containing the ignored row count is emitted at the end
> > of the COPY FROM if at least one row was discarded.

Changed.

PSA v3 patch with the above review comments addressed.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v3-0001-Add-detailed-info-when-COPY-skips-soft-errors.patch
Description: Binary data


Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi,

On Wed, Feb 28, 2024 at 08:49:19AM +0530, Amit Kapila wrote:
> On Mon, Feb 26, 2024 at 9:13 AM shveta malik  wrote:
> >
> > On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot
> >  wrote:
> > >
> > > Hi,
> > > > I think to set secure search path for remote connection, the standard 
> > > > approach
> > > > could be to extend the code in libpqrcv_connect[1], so that we don't 
> > > > need to schema
> > > > qualify all the operators in the queries.
> > > >
> > > > And for local connection, I agree it's also needed to add a
> > > > SetConfigOption("search_path", "" call in the slotsync worker.
> > > >
> > > > [1]
> > > > libpqrcv_connect
> > > > ...
> > > >   if (logical)
> > > > ...
> > > >   res = libpqrcv_PQexec(conn->streamConn,
> > > > 
> > > > ALWAYS_SECURE_SEARCH_PATH_SQL);
> > > >
> > >
> > > Agree, something like in the attached? (it's .txt to not disturb the CF 
> > > bot).
> >
> > Thanks for the patch, changes look good. I have incorporated it in the
> > patch which addresses the rest of your comments in [1]. I have
> > attached the patch as .txt
> >
> 
> Few comments:
> ===
> 1.
> - if (logical)
> + if (logical || !replication)
>   {
> 
> Can we add a comment about connection types that require
> ALWAYS_SECURE_SEARCH_PATH_SQL?

Yeah, will do.

> 
> 2.
> Can we add a test case to demonstrate that the '=' operator can be
> hijacked to do different things when the slotsync worker didn't use
> ALWAYS_SECURE_SEARCH_PATH_SQL?

I don't think that's good to create a test to show how to hijack an operator
within a background worker.

I had a quick look and did not find existing tests in this area around
ALWAYS_SECURE_SEARCH_PATH_SQL / search_path and background worker.

Such a test would:

- "just" ensure that search_path works as expected
- show how to hijack an operator within a background worker

Based on the above I don't think that such a test is worth it.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi,

On Wed, Feb 28, 2024 at 01:26:46PM +0900, Michael Paquier wrote:
> On Tue, Feb 27, 2024 at 01:39:59PM +, Bertrand Drouvot wrote:
> > So, I'm ok with the new helper too.
> 
> If both of you feel strongly about that, I'm OK with introducing
> something like that.

Thanks!

>  Now, a routine should be only about waiting on
> pg_stat_activity to report something, as for the logs we already have
> log_contains().

Agree.

> And a test may want one check, but unlikely both.
> Even if both are wanted, it's just a matter of using log_contains()
> and the new routine that does pg_stat_activity lookups.

Yeah, that's also my point of view.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: RFC: Logging plan of the running query

2024-02-27 Thread Robert Haas
On Mon, Feb 26, 2024 at 5:31 PM torikoshia  wrote:
> It would be nice if there was a place accessed once every few seconds or
> so..

I think this comment earlier from Andres deserves close attention:

# If we went with something like that approach, I think we'd have to do something
# like redirecting node->ExecProcNode to a wrapper, presumably from within a
# CFI. That wrapper could then implement the explain support, without slowing
# down the normal execution path.

If this is correctly implemented, the overhead in the case where the
feature isn't used should be essentially zero, I believe.
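A minimal sketch of the wrapper technique Andres describes (standalone C, not the PostgreSQL executor; all names are invented): the hot path always calls through the node's function pointer, and when plan logging is requested (e.g. from within a CHECK_FOR_INTERRUPTS), the pointer is redirected to a one-shot wrapper that does the extra work and then restores the original, so the normal path pays nothing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node Node;
typedef int (*ExecProcNodeFn) (Node *node);

struct Node
{
	ExecProcNodeFn ExecProcNode;		/* called for every tuple */
	ExecProcNodeFn ExecProcNodeReal;	/* saved original */
	int			logged;					/* did the wrapper fire? */
};

static int
exec_real(Node *node)
{
	return 42;					/* stands in for producing a tuple */
}

static int
exec_wrapper(Node *node)
{
	node->logged++;				/* stands in for the EXPLAIN support */
	node->ExecProcNode = node->ExecProcNodeReal;	/* one-shot: restore */
	return node->ExecProcNode(node);
}

/* What an interrupt-time hook would do when logging is requested */
static void
request_plan_logging(Node *node)
{
	node->ExecProcNodeReal = node->ExecProcNode;
	node->ExecProcNode = exec_wrapper;
}
```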

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Improve eviction algorithm in ReorderBuffer

2024-02-27 Thread Amit Kapila
On Mon, Feb 26, 2024 at 7:54 PM Masahiko Sawada  wrote:
>

A few comments on 0003:
===
1.
+/*
+ * Threshold of the total number of top-level and sub transactions
that controls
+ * whether we switch the memory track state. While the MAINTAIN_HEAP state is
+ * effective when there are many transactions being decoded, in many systems
+ * there is generally no need to use it as long as all transactions
being decoded
+ * are top-level transactions. Therefore, we use MaxConnections as
the threshold
+ * so we can prevent switch to the state unless we use subtransactions.
+ */
+#define REORDER_BUFFER_MEM_TRACK_THRESHOLD MaxConnections

The comment seems to imply that MAINTAIN_HEAP is useful for a large
number of transactions, but ReorderBufferLargestTXN() switches to this
state even when there is one transaction. So, basically, we use the
binary_heap technique to get the largest transaction even when we have
only one, but we don't maintain that heap unless
REORDER_BUFFER_MEM_TRACK_THRESHOLD transactions are in progress. This
means there is some additional work (building and resetting the heap
each time we pick the largest xact) when we have fewer transactions in
the system, but that may not impact us because of other costs involved,
like serializing all the changes. I think we can try to stress test
this by setting debug_logical_replication_streaming to 'immediate' to
see if the new mechanism has any overhead.

2. Can we somehow measure the additional memory that will be consumed
by each backend/walsender to maintain transactions? I think this also
won't be accounted for in logical_decoding_work_mem, so if it is large,
there could be more complaints about us not honoring
logical_decoding_work_mem.

3.
@@ -3707,11 +3817,14 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
ReorderBufferTXN *txn)

  ReorderBufferSerializeChange(rb, txn, fd, change);
  dlist_delete(&change->node);
- ReorderBufferReturnChange(rb, change, true);
+ ReorderBufferReturnChange(rb, change, false);

  spilled++;
  }

+ /* Update the memory counter */
+ ReorderBufferChangeMemoryUpdate(rb, NULL, txn, false, txn->size);

In ReorderBufferSerializeTXN(), we already use a size variable for
txn->size, we can probably use that for the sake of consistency.
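The strategy switch discussed in comment 1 can be modeled with a toy tracker (standalone C, not the patch; a real implementation would maintain a binary heap, whereas here the "heap" is only a cached index of the current maximum, and the threshold stands in for MaxConnections): below the threshold, finding the largest transaction is a cheap linear scan, and only above it does maintaining an ordered structure pay off.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_TRACK_THRESHOLD 4	/* stands in for MaxConnections */

typedef struct TxnTracker
{
	size_t		sizes[64];		/* per-transaction memory usage */
	int			ntxns;
	int			max_idx;		/* maintained only in "heap" mode */
	int			heap_mode;
} TxnTracker;

static void
txn_add(TxnTracker *t, size_t size)
{
	t->sizes[t->ntxns++] = size;
	if (!t->heap_mode && t->ntxns > MEM_TRACK_THRESHOLD)
	{
		/* switch state: build the ordered structure once */
		t->max_idx = 0;
		for (int i = 1; i < t->ntxns; i++)
			if (t->sizes[i] > t->sizes[t->max_idx])
				t->max_idx = i;
		t->heap_mode = 1;
	}
	else if (t->heap_mode && size > t->sizes[t->max_idx])
		t->max_idx = t->ntxns - 1;
}

static size_t
txn_largest(const TxnTracker *t)
{
	if (t->heap_mode)
		return t->sizes[t->max_idx];

	/* few transactions: a linear scan is cheap */
	size_t		best = 0;

	for (int i = 0; i < t->ntxns; i++)
		if (t->sizes[i] > best)
			best = t->sizes[i];
	return best;
}
```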

-- 
With Regards,
Amit Kapila.




Re: Injection points: some tools to wait and wake

2024-02-27 Thread Andrey M. Borodin



> On 28 Feb 2024, at 09:26, Michael Paquier  wrote:
> 
> Now, a routine should be only about waiting on
> pg_stat_activity to report something

BTW, if we had an SQL function such as injection_point_await(name), we could
use it in our isolation tests as well as our TAP tests :)
However, this might well be a future improvement; if we had a generic "await"
Perl function, we wouldn't need to re-use new SQL code in so many places.

Thanks!


Best regards, Andrey Borodin.



Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Jelte Fennema-Nio
On Wed, 28 Feb 2024 at 04:59, Michael Paquier  wrote:
> Cool.  I have applied 0004 and most of 0002.  Attached is what
> remains, where I am wondering if it would be cleaner to do these bits
> together (did not look at the whole, yet).

Feel free to squash them if you prefer that. I think now that patch
0002 only includes encoding changes, I feel 50-50 about grouping them
together. I mainly kept them separate, because 0002 were simple search
+ replaces and 0003 required code changes. That's still the case, but
now I can also see the argument for grouping them together since that
would clean up all the encoding arrays in one single commit (without a
ton of other arrays also being modified).
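What the designated-initializer conversion buys can be shown with a standalone illustration (not the actual patch; the demo enum and table are invented): each entry is keyed by its enum value, so WARNING comments demanding that several arrays stay in the same order as the enum become unnecessary, because reordering the entries in the source can no longer misalign the table.

```c
#include <assert.h>
#include <string.h>

enum demo_enc
{
	DEMO_SQL_ASCII = 0,
	DEMO_EUC_JP,
	DEMO_UTF8,
	DEMO_LAST
};

static const char *const demo_enc_names[DEMO_LAST] =
{
	/* listed out of enum order on purpose; the result is still correct */
	[DEMO_UTF8] = "UTF8",
	[DEMO_SQL_ASCII] = "SQL_ASCII",
	[DEMO_EUC_JP] = "EUC_JP",
};
```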




Re: Injection points: some tools to wait and wake

2024-02-27 Thread Michael Paquier
On Tue, Feb 27, 2024 at 01:39:59PM +, Bertrand Drouvot wrote:
> So, I'm ok with the new helper too.

If both of you feel strongly about that, I'm OK with introducing
something like that.  Now, a routine should be only about waiting on
pg_stat_activity to report something, as for the logs we already have
log_contains().  And a test may want one check, but unlikely both.
Even if both are wanted, it's just a matter of using log_contains()
and the new routine that does pg_stat_activity lookups.
--
Michael


signature.asc
Description: PGP signature


Re: POC, WIP: OR-clause support for indexes

2024-02-27 Thread Andrei Lepikhov

On 26/2/2024 11:10, Alena Rybakina wrote:

On 24.02.2024 14:28, jian he wrote:

Hi.
I wrote the first draft patch of the documentation.
it's under the section: Planner Method Configuration 
(runtime-config-query.html)

but this feature's main meat is in src/backend/parser/parse_expr.c
so it may be slightly inconsistent, as mentioned by others.

You can further furnish it.


Thank you for your work. I found a few spelling mistakes - I fixed this 
and added some information about generating a partial index plan. I 
attached it.

Thanks Jian and Alena,
It is a good start for the documentation. But I think the runtime-config
section needs only a condensed description of the feature underlying the
GUC; the explanations in this section look a bit awkward.
Having looked through the documentation for a better place for a
detailed explanation, I found array.sgml as a candidate. Also, we have
the parser's short overview section. I'm unsure about the best place,
but either is better than the server config section.

What's more, there are some weak points in the documentation:
1. We choose constant and variable parts of an expression and don't 
require the constant to be on the right side.
2. We should describe the second part of the feature, where the 
optimiser can split an array to fit the optimal BitmapOr scan path.


--
regards,
Andrei Lepikhov
Postgres Professional





Re: Bytea PL/Perl transform

2024-02-27 Thread Pavel Stehule
On Tue, Feb 27, 2024 at 21:03, Alexander Korotkov wrote:

> Hi!
>
> On Tue, Jan 30, 2024 at 8:46 PM Pavel Stehule 
> wrote:
> > I marked this patch as ready for committer.
>
> The last version of the patch still provides transform for builtin
> type in a separate extension.  As discussed upthread such transforms
> don't need separate extensions, and could be provided as part of
> upgrades of existing extensions.  There is probably no consensus yet
> on what to do with existing extensions like jsonb_plperl and
> jsonb_plpython, but we clearly shouldn't spread such cases.
>

+1

Pavel


> --
> Regards,
> Alexander Korotkov
>


Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Michael Paquier
On Wed, Feb 28, 2024 at 09:41:42AM +0800, Japin Li wrote:
> On Wed, 28 Feb 2024 at 00:06, Jelte Fennema-Nio  wrote:
>> Attached is v5 of the patchset that also includes this change (with
>> you listed as author).
> 
> Thanks for updating the patch!

Cool.  I have applied 0004 and most of 0002.  Attached is what
remains, where I am wondering if it would be cleaner to do these bits
together (did not look at the whole, yet).

> It looks good to me except there is an outdated comment.

Yep, I've updated that in the attached for now.
--
Michael
From e42d47cf5854932d0bfcf713ec47a9d5ca060c4f Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 21 Feb 2024 15:34:38 +0100
Subject: [PATCH v6 2/3] Use designated initializer syntax to improve
 readability

Use C99 designated initializer syntax for array elements of various
arrays. This is less verbose, more readable and less prone to typos or
ordering mistakes.
Akin to 74a730631065 and cc150596341e.
---
 src/include/mb/pg_wchar.h |   6 +-
 src/common/encnames.c | 155 +++---
 src/common/wchar.c|  85 +++--
 3 files changed, 120 insertions(+), 126 deletions(-)

diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index 1d521bea24..9248daab3a 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -224,11 +224,7 @@ typedef unsigned int pg_wchar;
 /*
  * PostgreSQL encoding identifiers
  *
- * WARNING: the order of this enum must be same as order of entries
- *			in the pg_enc2name_tbl[] array (in src/common/encnames.c), and
- *			in the pg_wchar_table[] array (in src/common/wchar.c)!
- *
- *			If you add some encoding don't forget to check
+ * WARNING: If you add some encoding don't forget to check
  *			PG_ENCODING_BE_LAST macro.
  *
  * PG_SQL_ASCII is default encoding and must be = 0.
diff --git a/src/common/encnames.c b/src/common/encnames.c
index 710b747f6b..dba6bd2c9e 100644
--- a/src/common/encnames.c
+++ b/src/common/encnames.c
@@ -297,7 +297,6 @@ static const pg_encname pg_encname_tbl[] =
 
 /* --
  * These are "official" encoding names.
- * XXX must be sorted by the same order as enum pg_enc (in mb/pg_wchar.h)
  * --
  */
 #ifndef WIN32
@@ -308,48 +307,48 @@ static const pg_encname pg_encname_tbl[] =
 
 const pg_enc2name pg_enc2name_tbl[] =
 {
-	DEF_ENC2NAME(SQL_ASCII, 0),
-	DEF_ENC2NAME(EUC_JP, 20932),
-	DEF_ENC2NAME(EUC_CN, 20936),
-	DEF_ENC2NAME(EUC_KR, 51949),
-	DEF_ENC2NAME(EUC_TW, 0),
-	DEF_ENC2NAME(EUC_JIS_2004, 20932),
-	DEF_ENC2NAME(UTF8, 65001),
-	DEF_ENC2NAME(MULE_INTERNAL, 0),
-	DEF_ENC2NAME(LATIN1, 28591),
-	DEF_ENC2NAME(LATIN2, 28592),
-	DEF_ENC2NAME(LATIN3, 28593),
-	DEF_ENC2NAME(LATIN4, 28594),
-	DEF_ENC2NAME(LATIN5, 28599),
-	DEF_ENC2NAME(LATIN6, 0),
-	DEF_ENC2NAME(LATIN7, 0),
-	DEF_ENC2NAME(LATIN8, 0),
-	DEF_ENC2NAME(LATIN9, 28605),
-	DEF_ENC2NAME(LATIN10, 0),
-	DEF_ENC2NAME(WIN1256, 1256),
-	DEF_ENC2NAME(WIN1258, 1258),
-	DEF_ENC2NAME(WIN866, 866),
-	DEF_ENC2NAME(WIN874, 874),
-	DEF_ENC2NAME(KOI8R, 20866),
-	DEF_ENC2NAME(WIN1251, 1251),
-	DEF_ENC2NAME(WIN1252, 1252),
-	DEF_ENC2NAME(ISO_8859_5, 28595),
-	DEF_ENC2NAME(ISO_8859_6, 28596),
-	DEF_ENC2NAME(ISO_8859_7, 28597),
-	DEF_ENC2NAME(ISO_8859_8, 28598),
-	DEF_ENC2NAME(WIN1250, 1250),
-	DEF_ENC2NAME(WIN1253, 1253),
-	DEF_ENC2NAME(WIN1254, 1254),
-	DEF_ENC2NAME(WIN1255, 1255),
-	DEF_ENC2NAME(WIN1257, 1257),
-	DEF_ENC2NAME(KOI8U, 21866),
-	DEF_ENC2NAME(SJIS, 932),
-	DEF_ENC2NAME(BIG5, 950),
-	DEF_ENC2NAME(GBK, 936),
-	DEF_ENC2NAME(UHC, 949),
-	DEF_ENC2NAME(GB18030, 54936),
-	DEF_ENC2NAME(JOHAB, 0),
-	DEF_ENC2NAME(SHIFT_JIS_2004, 932)
+	[PG_SQL_ASCII] = DEF_ENC2NAME(SQL_ASCII, 0),
+	[PG_EUC_JP] = DEF_ENC2NAME(EUC_JP, 20932),
+	[PG_EUC_CN] = DEF_ENC2NAME(EUC_CN, 20936),
+	[PG_EUC_KR] = DEF_ENC2NAME(EUC_KR, 51949),
+	[PG_EUC_TW] = DEF_ENC2NAME(EUC_TW, 0),
+	[PG_EUC_JIS_2004] = DEF_ENC2NAME(EUC_JIS_2004, 20932),
+	[PG_UTF8] = DEF_ENC2NAME(UTF8, 65001),
+	[PG_MULE_INTERNAL] = DEF_ENC2NAME(MULE_INTERNAL, 0),
+	[PG_LATIN1] = DEF_ENC2NAME(LATIN1, 28591),
+	[PG_LATIN2] = DEF_ENC2NAME(LATIN2, 28592),
+	[PG_LATIN3] = DEF_ENC2NAME(LATIN3, 28593),
+	[PG_LATIN4] = DEF_ENC2NAME(LATIN4, 28594),
+	[PG_LATIN5] = DEF_ENC2NAME(LATIN5, 28599),
+	[PG_LATIN6] = DEF_ENC2NAME(LATIN6, 0),
+	[PG_LATIN7] = DEF_ENC2NAME(LATIN7, 0),
+	[PG_LATIN8] = DEF_ENC2NAME(LATIN8, 0),
+	[PG_LATIN9] = DEF_ENC2NAME(LATIN9, 28605),
+	[PG_LATIN10] = DEF_ENC2NAME(LATIN10, 0),
+	[PG_WIN1256] = DEF_ENC2NAME(WIN1256, 1256),
+	[PG_WIN1258] = DEF_ENC2NAME(WIN1258, 1258),
+	[PG_WIN866] = DEF_ENC2NAME(WIN866, 866),
+	[PG_WIN874] = DEF_ENC2NAME(WIN874, 874),
+	[PG_KOI8R] = DEF_ENC2NAME(KOI8R, 20866),
+	[PG_WIN1251] = DEF_ENC2NAME(WIN1251, 1251),
+	[PG_WIN1252] = DEF_ENC2NAME(WIN1252, 1252),
+	[PG_ISO_8859_5] = DEF_ENC2NAME(ISO_8859_5, 28595),
+	[PG_ISO_8859_6] = DEF_ENC2NAME(ISO_8859_6, 28596),
+	[PG_ISO_8859_7] = DEF_ENC2NAME(ISO_8859_7, 28597),
+	[PG_ISO_8859_8] = DEF_ENC2NAME(ISO_8859_8, 28598),
+	[PG_WIN1250] = 

Re: Relation bulk write facility

2024-02-27 Thread Noah Misch
On Wed, Feb 28, 2024 at 12:24:01AM +0400, Heikki Linnakangas wrote:
> Here's a patch to fully remove AIX support.

> Subject: [PATCH 1/1] Remove AIX support
> 
> There isn't a lot of user demand for AIX support, no one has stepped
> up to the plate to properly maintain it, so it's best to remove it

Regardless of how someone were to step up to maintain it, we'd be telling them
such contributions have negative value and must stop.  We're expelling AIX due
to low demand, compiler bugs, its ABI, and its shlib symbol export needs.

> altogether. AIX is still supported for stable versions.
> 
> The acute issue that triggered this decision was that after commit
> 8af2565248, the AIX buildfarm members have been hitting this
> assertion:
> 
> TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, 
> buffer)"), File: "md.c", Line: 472, PID: 2949728
> 
> Apparently the "pg_attribute_aligned(a)" attribute doesn't work on AIX
> (linker?) for values larger than PG_IO_ALIGN_SIZE.

No; see https://postgr.es/m/20240225194322.a5%40rfd.leadboat.com




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Dilip Kumar
On Tue, Feb 27, 2024 at 11:41 PM Alvaro Herrera 
wrote:

> On 2024-Feb-27, Alvaro Herrera wrote:
>
> > Here's the complete set, with these two names using the singular.
>
> BTW one thing I had not noticed is that before this patch we have
> minimum shmem size that's lower than the lowest you can go with the new
> code.
>
> This means Postgres may no longer start under extremely tight memory
> restrictions (and of course it will use more memory even when idle or with
> small databases).  I wonder to what extent we should make an effort to relax
> that.  For small, largely inactive servers, this is just memory we use
> for no good reason.  However, anything we do here will impact
> performance on the high end, because as Andrey says this will add
> calculations and jumps where there are none today.
>
>
I was just comparing the minimum memory required for SLRU when the system
is minimally configured; correct me if I am wrong.

SLRU                        unpatched   patched
commit_timestamp_buffers         4         16
subtransaction_buffers          32         16
transaction_buffers              4         16
multixact_offset_buffers         8         16
multixact_member_buffers        16         16
notify_buffers                   8         16
serializable_buffers            16         16
------------------------------------------------
total buffers                   88        112

so that is < 200kB of extra memory on a minimally configured system; IMHO
this should not matter.
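As a back-of-the-envelope check of the figures above, assuming the default 8kB SLRU page size (BLCKSZ): the patched minimum adds 112 - 88 = 24 buffers, i.e. 24 * 8192 = 196608 bytes (192kB), consistent with the "< 200kB" claim.

```c
#include <assert.h>

enum
{
	ASSUMED_SLRU_PAGE_SIZE = 8192	/* default BLCKSZ; an assumption here */
};

static long
extra_slru_bytes(void)
{
	int			unpatched = 4 + 32 + 4 + 8 + 16 + 8 + 16;	/* 88 buffers */
	int			patched = 7 * 16;	/* 112 buffers */

	return (long) (patched - unpatched) * ASSUMED_SLRU_PAGE_SIZE;
}
```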

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: The const expression evaluation routine should always return a copy

2024-02-27 Thread Andrei Lepikhov

On 28/2/2024 04:19, Tom Lane wrote:

Andrei Lepikhov  writes:

IMO, the routine eval_const_expressions_mutator contains some stale code:


I'd like to push back against the idea that eval_const_expressions
is expected to return a freshly-copied tree.  Its API specification
contains no such claim.  It's true that it appears to do that
everywhere but here, but I think that's an implementation detail
that callers had better not depend on.  It's not hard to imagine
someone trying to improve its performance by avoiding unnecessary
copying.
Thanks for the explanation. I was just such an extension developer: I
looked into the internals of eval_const_expressions, found the call to a
'_mutator' function, and made exactly that mistake :).
I agree that freeing the returned node doesn't provide a sensible
benefit, because the underlying bushy tree was copied during the
mutation. What's more, it makes even less sense with the selectivity
context coming shortly (I hope).
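Tom's API point can be sketched with a toy mutator (standalone C, not PostgreSQL code; the Expr type and const_fold are invented): a tree mutator may return its input unchanged for nodes it cannot simplify, so a caller must not assume the result is a freshly-copied tree it owns, since freeing it could free the caller's original node.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Expr
{
	int			is_const;
	int			value;
} Expr;

static Expr *
const_fold(Expr *node)
{
	if (node->is_const)
		return node;			/* already simple: returned as-is, no copy */

	Expr	   *folded = malloc(sizeof(Expr));

	folded->is_const = 1;
	folded->value = node->value;	/* stands in for actual evaluation */
	return folded;
}
```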


--
regards,
Andrei Lepikhov
Postgres Professional





Re: Synchronizing slots from primary to standby

2024-02-27 Thread Amit Kapila
On Mon, Feb 26, 2024 at 9:13 AM shveta malik  wrote:
>
> On Fri, Feb 23, 2024 at 7:41 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> > > I think to set secure search path for remote connection, the standard 
> > > approach
> > > could be to extend the code in libpqrcv_connect[1], so that we don't need 
> > > to schema
> > > qualify all the operators in the queries.
> > >
> > > And for local connection, I agree it's also needed to add a
> > > SetConfigOption("search_path", "" call in the slotsync worker.
> > >
> > > [1]
> > > libpqrcv_connect
> > > ...
> > >   if (logical)
> > > ...
> > >   res = libpqrcv_PQexec(conn->streamConn,
> > > 
> > > ALWAYS_SECURE_SEARCH_PATH_SQL);
> > >
> >
> > Agree, something like in the attached? (it's .txt to not disturb the CF 
> > bot).
>
> Thanks for the patch, changes look good. I have incorporated it in the
> patch which addresses the rest of your comments in [1]. I have
> attached the patch as .txt
>

Few comments:
===
1.
- if (logical)
+ if (logical || !replication)
  {

Can we add a comment about connection types that require
ALWAYS_SECURE_SEARCH_PATH_SQL?

2.
Can we add a test case to demonstrate that the '=' operator can be
hijacked to do different things when the slotsync worker didn't use
ALWAYS_SECURE_SEARCH_PATH_SQL?

-- 
With Regards,
Amit Kapila.




RE: Synchronizing slots from primary to standby

2024-02-27 Thread Zhijie Hou (Fujitsu)
On Tuesday, February 27, 2024 3:18 PM Peter Smith  wrote:

> 
> Here are some review comments for v99-0001

Thanks for the comments!

> Commit Message
> 
> 1.
> A new parameter named standby_slot_names is introduced.
> 
> Maybe quote the GUC names here to make it more readable.

Added.

> 
> ~~
> 
> 2.
> Additionally, The SQL functions pg_logical_slot_get_changes and
> pg_replication_slot_advance are modified to wait for the replication slots
> mentioned in standby_slot_names to catch up before returning the changes to
> the user.
> 
> ~
> 
> 2a.
> "pg_replication_slot_advance" is a typo? Did you mean
> pg_logical_replication_slot_advance?

pg_logical_replication_slot_advance is not a user-visible function, so
pg_replication_slot_advance is correct.

> 
> ~
> 
> 2b.
> The "before returning the changes to the user" seems like it is referring 
> only to
> the first function.
> 
> Maybe needs slight rewording like:
> /before returning the changes to the user./ before returning./

Changed.

> 
> ==
> doc/src/sgml/config.sgml
> 
> 3. standby_slot_names
> 
> +   
> +List of physical slots guarantees that logical replication slots with
> +failover enabled do not consume changes until those changes
> are received
> +and flushed to corresponding physical standbys. If a logical
> replication
> +connection is meant to switch to a physical standby after the
> standby is
> +promoted, the physical replication slot for the standby
> should be listed
> +here. Note that logical replication will not proceed if the slots
> +specified in the standby_slot_names do not exist or are invalidated.
> +   
> 
> The wording doesn't seem right. IMO this should be worded much like how this
> GUC is described in guc_tables.c
> 
> e.g something a bit like:
> 
> Lists the streaming replication standby server slot names that logical WAL
> sender processes will wait for. Logical WAL sender processes will send
> decoded changes to plugins only after the specified replication slots confirm
> receiving WAL. This guarantees that logical replication slots with failover
> enabled do not consume changes until those changes are received and flushed
> to corresponding physical standbys...

Changed.

> 
> ==
> doc/src/sgml/logicaldecoding.sgml
> 
> 4. Section 48.2.3 Replication Slot Synchronization
> 
> + It's also highly recommended that the said physical replication slot
> + is named in
> + <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>
> + list on the primary, to prevent the subscriber from consuming changes
> + faster than the hot standby. But once we configure it, then
> certain latency
> + is expected in sending changes to logical subscribers due to wait on
> + physical replication slots in
> + <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>
> + 
> 
> 4a.
> /It's also highly/It is highly/
> 
> ~
> 
> 4b.
> 
> BEFORE
> But once we configure it, then certain latency is expected in sending changes
> to logical subscribers due to wait on physical replication slots in <link linkend="guc-standby-slot-names"><varname>standby_slot_names</varname></link>
> 
> SUGGESTION
> Even when correctly configured, some latency is expected when sending
> changes to logical subscribers due to the waiting on slots named in
> standby_slot_names.

Changed.

> 
> ==
> .../replication/logical/logicalfuncs.c
> 
> 5. pg_logical_slot_get_changes_guts
> 
> + if (XLogRecPtrIsInvalid(upto_lsn))
> + wait_for_wal_lsn = end_of_wal;
> + else
> + wait_for_wal_lsn = Min(upto_lsn, end_of_wal);
> +
> + /*
> + * Wait for specified streaming replication standby servers (if any)
> + * to confirm receipt of WAL up to wait_for_wal_lsn.
> + */
> + WaitForStandbyConfirmation(wait_for_wal_lsn);
> 
> Perhaps those statements all belong together with the comment up-front. e.g.
> 
> + /*
> + * Wait for specified streaming replication standby servers (if any)
> + * to confirm receipt of WAL up to wait_for_wal_lsn.
> + */
> + if (XLogRecPtrIsInvalid(upto_lsn))
> + wait_for_wal_lsn = end_of_wal;
> + else
> + wait_for_wal_lsn = Min(upto_lsn, end_of_wal);
> + WaitForStandbyConfirmation(wait_for_wal_lsn);

Changed.

> 
> ==
> src/backend/replication/logical/slotsync.c
> 
> ==
> src/backend/replication/slot.c
> 
> 6.
> +static bool
> +validate_standby_slots(char **newval)
> +{
> + char       *rawname;
> + List       *elemlist;
> + ListCell   *lc;
> + bool        ok;
> +
> + /* Need a modifiable copy of string */
> + rawname = pstrdup(*newval);
> +
> + /* Verify syntax and parse string into a list of identifiers */
> + ok = SplitIdentifierString(rawname, ',', &elemlist);
> +
> + if (!ok)
> + {
> + GUC_check_errdetail("List syntax is invalid.");
> + }
> +
> + /*
> + * If the replication slots' data have been initialized, verify if the
> + * specified slots exist and are logical slots.
> + */
> + else if (ReplicationSlotCtl)
> + {
> + foreach(lc, elemlist)
> 
> 6a.
> So, if the ReplicationSlotCtl is 

Re: Logging parallel worker draught

2024-02-27 Thread Andrew Dunstan



On 2024-02-27 Tu 05:03, Benoit Lobréau wrote:



On 2/25/24 23:32, Peter Smith wrote:

Also, I don't understand how the word "draught" (aka "draft") makes
sense here -- I assume the intended word was "drought" (???).


yes, that was the intent, sorry about that. English is not my native 
language and I was convinced the spelling was correct.



Both are English words spelled correctly, but with very different 
meanings. (Drought is definitely the one you want here.) This reminds me 
of the Errata section of Sellars and Yeatman's classic "history" work 
"1066 And All That":


"For 'pheasant' read 'peasant' throughout."


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Japin Li


On Wed, 28 Feb 2024 at 00:06, Jelte Fennema-Nio  wrote:
> On Tue, 27 Feb 2024 at 16:04, Japin Li  wrote:
>> I see the config_group_names[] needs null-terminated because of help_config,
>> however, I didn't find the reference in help_config.c.  Is this comment
>> outdated?
>
> Yeah, you're correct. That comment has been outdated for more than 20
> years. The commit that made it unnecessary to null-terminate the array
> was 9d77708d83ee.
>
> Attached is v5 of the patchset that also includes this change (with
> you listed as author).

Thanks for updating the patch!

It looks good to me except there is an outdated comment.

diff --git a/src/common/encnames.c b/src/common/encnames.c
index bd012fe3a0..dba6bd2c9e 100644
--- a/src/common/encnames.c
+++ b/src/common/encnames.c
@@ -297,7 +297,6 @@ static const pg_encname pg_encname_tbl[] =

 /* --
  * These are "official" encoding names.
- * XXX must be sorted by the same order as enum pg_enc (in mb/pg_wchar.h)
  * --
  */
 #ifndef WIN32




Re: MAINTAIN privilege -- what do we need to un-revert it?

2024-02-27 Thread Jeff Davis
New version attached.

Do we need a documentation update here? If so, where would be a good
place?

On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote:
> Why is this change needed?  Is the idea to make amcheck follow the
> same
> rules as maintenance commands to encourage folks to set up index
> functions
> correctly?

amcheck is calling functions it defined, so in order to find those
functions it needs to set the right search path.


> 
> What is the purpose of this [bootstrap-related] change?

DefineIndex() is called during bootstrap, and it's also a maintenance
command. I tried to handle the bootstrapping case, but I think it's
best to just guard it with a conditional. Done.

I also added Assert(!IsBootstrapProcessingMode()) in
assign_search_path().

> > +   SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH,
> > PGC_USERSET,
> > +   PGC_S_SESSION);
> 
> I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new
> value
> for these.
> > 

Did you have a particular concern about PGC_S_SESSION?

If it's less than PGC_S_SESSION, it won't work, because the caller's
SET command will override it, and the same manipulation is possible.

And I don't think we want it higher than PGC_S_SESSION, otherwise the
function can't set its own search_path, if needed.

> > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp"
> 
> Is including pg_temp actually safe?  I worry that a user could use
> their
> temporary schema to inject objects that would take the place of
> non-schema-qualified stuff in functions.

pg_temp cannot (currently) be excluded. If it is omitted from the
string, it will be placed *first* in the search_path, which is more
dangerous.

pg_temp does not take part in function or operator resolution, which
makes it safer than it first appears. There are potentially some risks
around tables, but it's not typical to access a table in a function
called as part of an index expression.

If we determine that pg_temp is actually unsafe to include, we need to
do something like what I proposed here:

https://www.postgresql.org/message-id/a6865db287596c9c6ea12bdd9de87216cb5e7902.ca...@j-davis.com

before this change.

Regards,
Jeff Davis

From d628e1b4e2632a7f853e2c1f016758569eaee14e Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Fri, 16 Feb 2024 14:04:23 -0800
Subject: [PATCH v5] Fix search_path to a safe value during maintenance
 operations.

While executing maintenance operations (ANALYZE, CLUSTER, REFRESH
MATERIALIZED VIEW, REINDEX, or VACUUM), set search_path to
'pg_catalog, pg_temp' to prevent inconsistent behavior.

Functions that are used for functional indexes, in index expressions,
or in materialized views and depend on a different search path must be
declared with CREATE FUNCTION ... SET search_path='...'.

This change was previously committed as 05e1737351, then reverted in
commit 2fcc7ee7af because it was too late in the cycle.

Preparation for the MAINTAIN privilege, which was previously reverted
due to search_path manipulation hazards.

Discussion: https://postgr.es/m/d4ccaf3658cb3c281ec88c851a09733cd9482f22.ca...@j-davis.com
Discussion: https://postgr.es/m/E1q7j7Y-000z1H-Hr%40gemulon.postgresql.org
Discussion: https://postgr.es/m/e44327179e5c9015c8dda67351c04da552066017.camel%40j-davis.com
Reviewed-by: Greg Stark
Reviewed-by: Nathan Bossart
---
 contrib/amcheck/t/004_verify_nbtree_unique.pl | 33 +++
 contrib/amcheck/verify_nbtree.c   |  2 ++
 src/backend/access/brin/brin.c|  2 ++
 src/backend/catalog/index.c   |  9 +
 src/backend/catalog/namespace.c   |  3 ++
 src/backend/commands/analyze.c|  2 ++
 src/backend/commands/cluster.c|  2 ++
 src/backend/commands/indexcmds.c  |  8 +
 src/backend/commands/matview.c|  2 ++
 src/backend/commands/vacuum.c |  2 ++
 src/bin/scripts/t/100_vacuumdb.pl |  4 ---
 src/include/utils/guc.h   |  6 
 .../test_oat_hooks/expected/alter_table.out   |  2 ++
 .../expected/test_oat_hooks.out   |  4 +++
 src/test/regress/expected/matview.out |  4 ++-
 src/test/regress/expected/privileges.out  | 12 +++
 src/test/regress/expected/vacuum.out  |  2 +-
 src/test/regress/sql/matview.sql  |  4 ++-
 src/test/regress/sql/privileges.sql   |  8 ++---
 src/test/regress/sql/vacuum.sql   |  2 +-
 20 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/contrib/amcheck/t/004_verify_nbtree_unique.pl b/contrib/amcheck/t/004_verify_nbtree_unique.pl
index 3f474a158a..4b704e6815 100644
--- a/contrib/amcheck/t/004_verify_nbtree_unique.pl
+++ b/contrib/amcheck/t/004_verify_nbtree_unique.pl
@@ -20,8 +20,11 @@ $node->safe_psql(
 	'postgres', q(
 	CREATE EXTENSION amcheck;
 
+	CREATE SCHEMA test_amcheck;
+	SET search_path = test_amcheck;
+
 	CREATE FUNCTION ok_cmp (int4, 

Re: session username in default psql prompt?

2024-02-27 Thread Kori Lane
Here's a patch for this.

- Kori

session-user-prompt.patch
Description: Binary data
On May 27, 2023, at 8:52 AM, Andrew Dunstan  wrote:
  


  
I don't recall if this has come up before. I'm sometimes mildly annoyed when
I get on a new system and find the username missing in my psql prompt. Or if
a customer shows me a screen and I have to ask "which user is this". If
we're dealing with several roles it can get confusing.

My usual .psqlrc has

    \set PROMPT1 '%n@%~%R%x%# '

So my suggestion is that we prepend '%n@' to the default psql PROMPT1 (and
maybe PROMPT2). I realize it's not exactly earth-shattering, but I think
it's a bit more user-friendly.

(Would be a good beginner project, the code change would be tiny but there
are lots of docs / examples that we might want to change if we did it.)

cheers

andrew
  
--
Andrew Dunstan
EDB: https://www.enterprisedb.com

  



Re: Potential issue in ecpg-informix decimal converting functions

2024-02-27 Thread Michael Paquier
On Tue, Feb 27, 2024 at 09:24:25AM +0100, Daniel Gustafsson wrote:
> I have it on my TODO for the upcoming CF.

Okay, thanks.
--
Michael


signature.asc
Description: PGP signature


Re: Relation bulk write facility

2024-02-27 Thread Thomas Munro
On Wed, Feb 28, 2024 at 9:24 AM Heikki Linnakangas  wrote:
> Here's a patch to fully remove AIX support.

--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -3401,7 +3401,7 @@ export MANPATH
   
PostgreSQL can be expected to work on current
versions of these operating systems: Linux, Windows,
-   FreeBSD, OpenBSD, NetBSD, DragonFlyBSD, macOS, AIX, Solaris, and illumos.
+   FreeBSD, OpenBSD, NetBSD, DragonFlyBSD, macOS, Solaris, and illumos.

There is also a little roll-of-honour of operating systems we used to
support, just a couple of paragraphs down, where AIX should appear.




Re: The const expression evaluation routine should always return a copy

2024-02-27 Thread Tom Lane
Andrei Lepikhov  writes:
> IMO, the routine eval_const_expressions_mutator contains some stale code:

I'd like to push back against the idea that eval_const_expressions
is expected to return a freshly-copied tree.  Its API specification
contains no such claim.  It's true that it appears to do that
everywhere but here, but I think that's an implementation detail
that callers had better not depend on.  It's not hard to imagine
someone trying to improve its performance by avoiding unnecessary
copying.

Also, your proposed patch introduces a great deal of schizophrenia,
because SubPlan has substructure.  What's the point of making a copy
of the SubPlan node itself, if the testexpr and args aren't copied?
But we shouldn't modify those, because as the comment states, it's
a bit late to be doing so.

I agree that the comment claiming we can't get here is outdated,
but I'm unexcited about changing the code's behavior.

regards, tom lane




Re: Remove --with-CC autoconf option

2024-02-27 Thread Daniel Gustafsson
> On 27 Feb 2024, at 22:03, Heikki Linnakangas  wrote:

> Any objections to removing the ./configure --with-CC option? It's been 
> deprecated since commit cb292206c5 from July 2000:

None, and removing it will chip away further at getting autoconf and meson
fully in sync so +1.

--
Daniel Gustafsson





Remove --with-CC autoconf option

2024-02-27 Thread Heikki Linnakangas
Any objections to removing the ./configure --with-CC option? It's been 
deprecated since commit cb292206c5 from July 2000:



# For historical reasons you can also use --with-CC to specify the C compiler
# to use, although the standard way to do this is to set the CC environment
# variable.
PGAC_ARG_REQ(with, CC, [CMD], [set compiler (deprecated)], [CC=$with_CC])


--
Heikki Linnakangas
Neon (https://neon.tech)




Re: Relation bulk write facility

2024-02-27 Thread Andres Freund
Hi,

On 2024-02-27 15:45:45 -0500, Tom Lane wrote:
> Heikki Linnakangas  writes:
> With AIX out of the picture, lapwing will be the only remaining
> animal testing MAXALIGN less than 8.  That seems like a single
> point of failure ... should we spin up another couple 32-bit
> animals?  I had supposed that my faithful old PPC animal mamba
> was helping to check this, but I see that under NetBSD it's
> joined the ALIGNOF_DOUBLE==8 crowd.

I can set up a i386 animal, albeit on an amd64 kernel. But I don't think the
latter matters.

Greetings,

Andres Freund




RE: Popcount optimization using AVX512

2024-02-27 Thread Amonson, Paul D
Andres,

After consulting some Intel internal experts on MSVC, the linking issue as it 
stood could not be resolved. Instead, I created an MSVC-only workaround. This 
adds one extra function call on the Windows builds (the linker resolves a real 
function just fine, but not a function pointer of the same name). This extra 
latency does not exist on any of the other platforms. I also believe I 
addressed all issues raised in the previous reviews. The new 
pg_popcnt_x86_64_accel.c file is now the ONLY file compiled with the AVX512 
compiler flags. I added support for the MSVC compiler flag as well. Both meson 
and autoconf are updated with the new refactor.

I am attaching the new patch.

Paul

-Original Message-
From: Amonson, Paul D  
Sent: Monday, February 26, 2024 9:57 AM
To: Amonson, Paul D ; Andres Freund 

Cc: Alvaro Herrera ; Shankaran, Akash 
; Nathan Bossart ; Noah 
Misch ; Tom Lane ; Matthias van de Meent 
; pgsql-hackers@lists.postgresql.org
Subject: RE: Popcount optimization using AVX512

Hello again,

This is now a blocking issue. I can find no reason for the failing behavior of 
the MSVC build. All other platforms build fine in CI, including the Mac. Since 
the master branch builds, I assume I changed something critical to linking, but 
I can't figure out what that would be. Can someone with Windows/MSVC experience 
help me?

* Code: https://github.com/paul-amonson/postgresql/tree/popcnt_patch
* CI build: https://cirrus-ci.com/task/4927666021728256

Thanks,
Paul

-Original Message-
From: Amonson, Paul D 
Sent: Wednesday, February 21, 2024 9:36 AM
To: Andres Freund 
Cc: Alvaro Herrera ; Shankaran, Akash 
; Nathan Bossart ; Noah 
Misch ; Tom Lane ; Matthias van de Meent 
; pgsql-hackers@lists.postgresql.org
Subject: RE: Popcount optimization using AVX512

Hi,

I am encountering a problem that I don't think I understand. I cannot get the 
MSVC build to link in CI. I added 2 files to the build, but the linker is 
complaining that a symbol from the original pg_bitutils.c file is missing 
(specifically 'pg_popcount'). To my knowledge my changes did not alter linking 
for the offending file, and I see pg_bitutils.c being compiled in all 3 libs in 
the build. All other builds are compiling.

Any help on this issue would be greatly appreciated.

My fork is at https://github.com/paul-amonson/postgresql/tree/popcnt_patch and 
the CI build is at https://cirrus-ci.com/task/4927666021728256.

Thanks,
Paul

-Original Message-
From: Andres Freund 
Sent: Monday, February 12, 2024 12:37 PM
To: Amonson, Paul D 
Cc: Alvaro Herrera ; Shankaran, Akash 
; Nathan Bossart ; Noah 
Misch ; Tom Lane ; Matthias van de Meent 
; pgsql-hackers@lists.postgresql.org
Subject: Re: Popcount optimization using AVX512

Hi,

On 2024-02-12 20:14:06 +, Amonson, Paul D wrote:
> > > +# Check for header immintrin.h
> > ...
> > Do these all actually have to link?  Invoking the linker is slow.
> > I think you might be able to just use cc.has_header_symbol().
>
> I took this to mean the last of the 3 new blocks.

Yep.


> > Does this work with msvc?
>
> I think it will work but I have no way to validate it. I propose we remove 
> the AVX-512 popcount feature from MSVC builds. Sound ok?

CI [1] would be able to test at least building, including via cfbot, which is
automatically run for each commitfest entry - you can see prior runs at [2].
They run on Zen 3 epyc instances, so unfortunately runtime won't be tested.  If 
you look at [3], you can see that currently it doesn't seem to be considered 
supported at configure time:

...
[00:23:48.480] Checking if "__get_cpuid" : links: NO
[00:23:48.480] Checking if "__cpuid" : links: YES
...
[00:23:48.492] Checking if "x86_64: popcntq instruction" compiles: NO ...

Unfortunately CI currently is configured to not upload the build logs if the 
build succeeds, so we don't have enough details to see why.


> > This will build all of pgport with the avx flags, which wouldn't be 
> > correct, I think? The compiler might inject automatic uses of avx512 in 
> > places, which would cause problems, no?
>
> This will take me some time to learn how to do this in meson. Any 
> pointers here would be helpful.

Should be fairly simple, add it to the replace_funcs_pos and add the relevant 
cflags to pgport_cflags, similar to how it's done for crc.


> > While you don't do the same for make, isn't even just using the avx512 for 
> > all of pg_bitutils.c broken for exactly that reson? That's why the existing 
> > code builds the files for various crc variants as their own file.
>
> I don't think it's broken; nothing else in pg_bitutils.c will make use of
> AVX-512

You can't really guarantee that compiler auto-vectorization won't decide to do 
so, no? I wouldn't call it likely, but it's also hard to be sure it won't 
happen at some point.


> If splitting still makes sense, I propose splitting into 3 files:  
> pg_bitutils.c (entry point +sw popcnt implementation), pg_popcnt_choose.c 
> (CPUID and 

Re: Relation bulk write facility

2024-02-27 Thread Tom Lane
Heikki Linnakangas  writes:
> What do y'all think of adding a check for 
> ALIGNOF_DOUBLE==MAXIMUM_ALIGNOF to configure.ac and meson.build? It's 
> not a requirement today, but I believe AIX was the only platform where 
> that was not true. With AIX gone, that combination won't be tested, and 
> we will probably break it sooner or later.

+1, and then probably revert the whole test addition of 79b716cfb7a.

I did a quick scrape of the buildfarm, and identified these as the
only animals reporting ALIGNOF_DOUBLE less than 8:

$ grep 'alignment of double' alignments  | grep -v ' 8$'
 hornet| 2024-02-22 16:26:16 | checking alignment of double... 4
 lapwing   | 2024-02-27 12:40:15 | checking alignment of double... (cached) 4
 mandrill  | 2024-02-19 01:03:47 | checking alignment of double... 4
 sungazer  | 2024-02-21 00:22:48 | checking alignment of double... 4
 tern  | 2024-02-22 13:25:12 | checking alignment of double... 4

With AIX out of the picture, lapwing will be the only remaining
animal testing MAXALIGN less than 8.  That seems like a single
point of failure ... should we spin up another couple 32-bit
animals?  I had supposed that my faithful old PPC animal mamba
was helping to check this, but I see that under NetBSD it's
joined the ALIGNOF_DOUBLE==8 crowd.

regards, tom lane




Re: Relation bulk write facility

2024-02-27 Thread Heikki Linnakangas

On 26/02/2024 06:18, Michael Paquier wrote:

On Mon, Feb 26, 2024 at 09:42:03AM +0530, Robert Haas wrote:

On Mon, Feb 26, 2024 at 1:21 AM Tom Lane  wrote:

So, we now need to strip the remnants of AIX support from the code and
docs?  I don't see that much of it, but it's misleading to leave it
there.

(BTW, I still want to nuke the remaining snippets of HPPA support.
I don't think it does anybody any good to make it look like that's
still expected to work.)


+1 for removing things that don't work (or that we think probably don't work).


Seeing this stuff eat developer time because of the debugging of weird
issues while having a very limited impact for end-users is sad, so +1
for a cleanup of any remnants if this disappears.


Here's a patch to fully remove AIX support.

One small issue that warrants some discussion (in sanity_check.sql):


--- When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
+-- When MAXIMUM_ALIGNOF==8 but ALIGNOF_DOUBLE==4, the C ABI may impose 8-byte alignment
 -- some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
 -- catalog C struct layout matches catalog tuple layout, arrange for the tuple
 -- offset of each fixed-width, attalign='d' catalog column to be divisible by 8
 -- unconditionally.  Keep such columns before the first NameData column of the
 -- catalog, since packagers can override NAMEDATALEN to an odd number.
+-- (XXX: I'm not sure if any of the supported platforms have MAXIMUM_ALIGNOF==8 and
+-- ALIGNOF_DOUBLE==4.  Perhaps we should just require that
+-- ALIGNOF_DOUBLE==MAXIMUM_ALIGNOF)


What do y'all think of adding a check for 
ALIGNOF_DOUBLE==MAXIMUM_ALIGNOF to configure.ac and meson.build? It's 
not a requirement today, but I believe AIX was the only platform where 
that was not true. With AIX gone, that combination won't be tested, and 
we will probably break it sooner or later.


--
Heikki Linnakangas
Neon (https://neon.tech)
From 59a66f507365ff9cb8c462d8206be285e3e2632d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Wed, 28 Feb 2024 00:22:23 +0400
Subject: [PATCH 1/1] Remove AIX support

There isn't a lot of user demand for AIX support, no one has stepped
up to the plate to properly maintain it, so it's best to remove it
altogether. AIX is still supported for stable versions.

The acute issue that triggered this decision was that after commit
8af2565248, the AIX buildfarm members have been hitting this
assertion:

TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 2949728

Apparently the "pg_attribute_aligned(a)" attribute doesn't work on AIX
(linker?) for values larger than PG_IO_ALIGN_SIZE. That could be
worked around, but we decided to just drop the AIX support instead.

Discussion: https://www.postgresql.org/message-id/20240224172345.32%40rfd.leadboat.com
---
 Makefile  |   2 +
 config/c-compiler.m4  |   2 +-
 configure | 301 +-
 configure.ac  |  34 +-
 doc/src/sgml/dfunc.sgml   |  19 --
 doc/src/sgml/installation.sgml| 119 +--
 doc/src/sgml/runtime.sgml |  23 --
 meson.build   |  27 +-
 src/Makefile.shlib|  30 --
 src/backend/Makefile  |  20 --
 src/backend/meson.build   |  15 -
 src/backend/port/aix/mkldexport.sh|  61 
 src/backend/utils/error/elog.c|   2 -
 src/backend/utils/misc/ps_status.c|   4 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |   3 -
 src/bin/pg_verifybackup/t/008_untar.pl|   3 -
 src/bin/pg_verifybackup/t/010_client_untar.pl |   3 -
 src/include/c.h   |  18 +-
 src/include/port/aix.h|  14 -
 src/include/port/atomics.h|   6 +-
 src/include/storage/s_lock.h  |  31 +-
 src/interfaces/libpq/Makefile |   2 +-
 src/interfaces/libpq/meson.build  |   5 +-
 src/makefiles/Makefile.aix|  39 ---
 src/port/README   |   2 +-
 src/port/strerror.c   |   2 -
 src/template/aix  |  25 --
 src/test/regress/Makefile |   5 -
 src/test/regress/expected/sanity_check.out|   5 +-
 src/test/regress/sql/sanity_check.sql |   5 +-
 src/tools/gen_export.pl   |  11 +-
 src/tools/pginclude/cpluspluscheck|   1 -
 src/tools/pginclude/headerscheck  |   1 -
 33 files changed, 52 insertions(+), 788 deletions(-)
 delete mode 100755 src/backend/port/aix/mkldexport.sh
 delete mode 100644 src/include/port/aix.h
 delete mode 100644 src/makefiles/Makefile.aix
 delete mode 100644 src/template/aix

diff --git 

Re: Bytea PL/Perl transform

2024-02-27 Thread Alexander Korotkov
Hi!

On Tue, Jan 30, 2024 at 8:46 PM Pavel Stehule  wrote:
> I marked this patch as ready for committer.

The last version of the patch still provides a transform for a builtin
type in a separate extension.  As discussed upthread, such transforms
don't need separate extensions and could be provided as part of
upgrades of existing extensions.  There is probably no consensus yet
on what to do with existing extensions like jsonb_plperl and
jsonb_plpython, but we clearly shouldn't spread such cases.

--
Regards,
Alexander Korotkov




Wrong description in server_ca.config and client_ca.config

2024-02-27 Thread David Zhang

Hi Hackers,

The current descriptions in server_ca.config and client_ca.config are 
not entirely accurate. For example, one of the descriptions in 
server_ca.config states, "This certificate is used to sign server 
certificates. It is self-signed." However, the server_ca.crt and 
client_ca.crt are actually signed by the root_ca.crt, which is the only 
self-signed certificate. Therefore, it would be more accurate to change 
it to "This certificate is used to sign server certificates. It is an 
Intermediate CA."


Attached is a patch attempting to fix the description issue.

Best regards,

David
From ddc07447152331c09daecf0202178cfe77a817a9 Mon Sep 17 00:00:00 2001
From: David Zhang 
Date: Tue, 27 Feb 2024 10:06:18 -0800
Subject: [PATCH] correct description for server_ca and client_ca

---
 src/test/ssl/conf/client_ca.config | 8 +---
 src/test/ssl/conf/server_ca.config | 8 +---
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/test/ssl/conf/client_ca.config 
b/src/test/ssl/conf/client_ca.config
index 5990f06000..08365aac95 100644
--- a/src/test/ssl/conf/client_ca.config
+++ b/src/test/ssl/conf/client_ca.config
@@ -1,7 +1,9 @@
-# An OpenSSL format CSR config file for creating the client root certificate.
-# This configuration file is also used when operating the CA.
+# An OpenSSL format CSR config file for creating the client Intermediate
+# Certificate Authority. This configuration file is also used when operating
+# the CA.
 #
-# This certificate is used to sign client certificates. It is self-signed.
+# This certificate is used to sign client certificates. It is an Intermediate
+# CA.
 
 [ req ]
 distinguished_name = req_distinguished_name
diff --git a/src/test/ssl/conf/server_ca.config 
b/src/test/ssl/conf/server_ca.config
index 496aaba29f..15f8d1590f 100644
--- a/src/test/ssl/conf/server_ca.config
+++ b/src/test/ssl/conf/server_ca.config
@@ -1,7 +1,9 @@
-# An OpenSSL format CSR config file for creating the server root certificate.
-# This configuration file is also used when operating the CA.
+# An OpenSSL format CSR config file for creating the server Intermediate
+# Certificate Authority. This configuration file is also used when operating
+# the CA.
 #
-# This certificate is used to sign server certificates. It is self-signed.
+# This certificate is used to sign server certificates. It is an Intermediate
+# CA.
 
 [ req ]
 distinguished_name = req_distinguished_name
-- 
2.34.1



Re: [PoC] Federated Authn/z with OAUTHBEARER

2024-02-27 Thread Jacob Champion
On Tue, Feb 27, 2024 at 11:20 AM Jacob Champion
 wrote:
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1].

It looks like my patchset has been eaten by a malware scanner:

550 Message content failed content scanning
(Sanesecurity.Foxhole.Mail_gz.UNOFFICIAL)

Was there a recent change to the lists? Is anyone able to see what the
actual error was so I don't do it again?

Thanks,
--Jacob




Re: Allow non-superuser to cancel superuser tasks.

2024-02-27 Thread Nathan Bossart
On Tue, Feb 27, 2024 at 11:59:00PM +0500, Kirill Reshke wrote:
> Do we need to test the pg_cancel_backend vs autovacuum case at all?
> I think we do. Would it be better to split work into 2 patches: first one
> with tests against current logic, and second
> one with some changes/enhancements which allows to cancel running autovac
> to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)?

If we need to add tests for pg_signal_backend, I think it's reasonable to
keep those in a separate patch from pg_signal_autovacuum.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Fix for edge case in date_bin() function

2024-02-27 Thread Tom Lane
I wrote:
> Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
> attempt to account for this that is probably redundant given the
> additional condition.  Also, can we avoid computing tm_diff %
> stride_usecs twice?  Maybe the compiler is smart enough to remove the
> common subexpression, but it's a mighty expensive computation if not.

I think we could do it like this:

    tm_diff = timestamp - origin;
    tm_modulo = tm_diff % stride_usecs;
    tm_delta = tm_diff - tm_modulo;
    /* We want to round towards -infinity when tm_diff is negative */
    if (tm_modulo < 0)
        tm_delta -= stride_usecs;

Excluding tm_modulo == 0 from the correction is what fixes the
problem.

> I'm also itching a bit over whether there are integer-overflow
> hazards here.  Maybe the range of timestamp is constrained enough
> that there aren't, but I didn't look hard.

Hmm, it's not.  For instance this triggers the overflow check in
timestamp_mi:

regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC';
ERROR:  interval out of range
regression=# \errverbose 
ERROR:  22008: interval out of range
LOCATION:  timestamp_mi, timestamp.c:2832

So we ought to guard the subtraction that produces tm_diff similarly.
I suspect it's also possible to overflow int64 while computing
stride_usecs.

regards, tom lane




Re: Allow non-superuser to cancel superuser tasks.

2024-02-27 Thread Nathan Bossart
On Tue, Feb 27, 2024 at 01:22:31AM +0500, Kirill Reshke wrote:
> Also, tap tests for functionality added. I'm not sure where to place them,
> so I placed them in a separate directory in `src/test/`
> Seems that regression tests for this feature are not possible, am i right?

It might be difficult to create reliable tests for pg_signal_autovacuum.
If we can, it would probably be easiest to do with a TAP test.

> Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
> Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or
> should this role have such little scope...

-1.  I don't see why giving a role privileges of pg_signal_autovacuum
should also give them the ability to signal all other non-superuser
backends.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Allow non-superuser to cancel superuser tasks.

2024-02-27 Thread Kirill Reshke
On Tue, 27 Feb 2024 at 01:22, Kirill Reshke  wrote:

>
>
> On Mon, 26 Feb 2024 at 20:10, Nathan Bossart 
> wrote:
>
>> On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:
>> > I see 2 possible ways to implement this. The first one is to have a
>> > hook in pg_signal_backend, and define a hook in an extension which
>> > can do the thing. The second one is to have a predefined role,
>> > something like a `pg_signal_autovacuum` role which can signal a
>> > running autovacuum to cancel. But I don't see how we can handle a
>> > specific `application_name` with this solution.
>>
>> pg_signal_autovacuum seems useful given commit 3a9b18b.
>>
>> --
>> Nathan Bossart
>> Amazon Web Services: https://aws.amazon.com
>
>
> Thank you for your response.
> Please find a patch attached.
>
> In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid
> from unused_oids script output.
> Also, TAP tests for the functionality were added. I'm not sure where to place
> them, so I placed them in a separate directory in `src/test/`.
> It seems that regression tests for this feature are not possible, am I right?
> Also, I was thinking about pg_signal_autovacuum vs. pg_signal_backend.
> Should pg_signal_autovacuum have the power of pg_signal_backend (implicitly)?
> Or should this role have such a narrow scope...
>
> Have a little thought on this, will share.
Do we need to test the pg_cancel_backend vs. autovacuum case at all?
I think we do. Would it be better to split the work into two patches: a first
one with tests against the current logic, and a second
one with changes/enhancements that allow a non-superuser to cancel a running
autovacuum (via a `pg_signal_autovacuum` role or some other mechanism)?


Re: libpq: PQfnumber overload for not null-terminated strings

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 15:49, Ivan Trofimov  wrote:
> Caching the result of PQfnumber could be done, but would result in
> somewhat of a mess at the call site.

To be clear I meant your wrapper around libpq could internally cache
this, then the call sites of users of your wrapper would not need to
be changed. i.e. your Result could contain a cache of
columnname->columnumber mapping that you know because of previous
calls to PQfnumber on the same Result.
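The caching idea could look roughly like this inside the wrapper; the
ResultCache struct and the lookup callback are hypothetical stand-ins (a real
wrapper would consult PQfnumber() on a cache miss):

```c
#include <assert.h>
#include <string.h>

#define CACHE_MAX 32

/* Per-result cache of column-name -> column-number mappings */
typedef struct ResultCache
{
	int			nentries;
	const char *names[CACHE_MAX];
	int			columns[CACHE_MAX];
} ResultCache;

static int
cached_fnumber(ResultCache *cache, const char *name,
			   int (*lookup) (const char *name))
{
	for (int i = 0; i < cache->nentries; i++)
	{
		if (strcmp(cache->names[i], name) == 0)
			return cache->columns[i];	/* hit: no re-scan of the result */
	}

	/* miss: do the real lookup (PQfnumber in practice) and remember it */
	int			col = lookup(name);

	if (cache->nentries < CACHE_MAX)
	{
		cache->names[cache->nentries] = name;
		cache->columns[cache->nentries] = col;
		cache->nentries++;
	}
	return col;
}
```

The point is that repeated lookups of the same column name on the same result
pay the linear scan of the result descriptor only once.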

> I like your idea of 'PQfnumberRaw': initially i was only concerned about
> a null-terminated string requirement affecting my interfaces (because
> users complained about that to me,
> https://github.com/userver-framework/userver/issues/494), but I think
> PQfnumberRaw could solve both problems (PQfnumber being a bottleneck
> when called repeatedly and a null-terminated string requirement)
> simultaneously.

Feel free to send a patch for this.




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Alvaro Herrera
On 2024-Feb-27, Alvaro Herrera wrote:

> Here's the complete set, with these two names using the singular.

BTW one thing I had not noticed is that before this patch we had a
minimum shmem size lower than the lowest you can go with the new code.

This means Postgres may no longer start under extremely tight memory
restrictions (and of course it uses more memory even when idle or with
small databases).  I wonder to what extent we should make an effort to
relax that.  For small, largely inactive servers, this is just memory we
use for no good reason.  However, anything we do here will impact
performance on the high end, because as Andrey says this will add
calculations and jumps where there are none today.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"We’ve narrowed the problem down to the customer’s pants being in a situation
 of vigorous combustion" (Robert Haas, Postgres expert extraordinaire)




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Andrey M. Borodin



> On 27 Feb 2024, at 22:33, Alvaro Herrera  wrote:
> 
> 

These patches look amazing!


Best regards, Andrey Borodin.



Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Alvaro Herrera
Here's the complete set, with these two names using the singular.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"
>From 225b2403f7bb9990656d18c8339c452fcd6822c5 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 27 Feb 2024 16:56:00 +0100
Subject: [PATCH v21 1/2] Rename SLRU elements in pg_stat_slru

The new names are intended to match an upcoming patch that adds a few
GUCs to configure the SLRU buffer sizes.

Discussion: https://postgr.es/m/202402261616.dlriae7b6emv@alvherre.pgsql
---
 doc/src/sgml/monitoring.sgml| 14 
 src/backend/access/transam/clog.c   |  2 +-
 src/backend/access/transam/commit_ts.c  |  2 +-
 src/backend/access/transam/multixact.c  |  4 +--
 src/backend/access/transam/subtrans.c   |  2 +-
 src/backend/commands/async.c|  2 +-
 src/backend/storage/lmgr/predicate.c|  2 +-
 src/include/utils/pgstat_internal.h | 14 
 src/test/isolation/expected/stats.out   | 44 -
 src/test/isolation/expected/stats_1.out | 44 -
 src/test/isolation/specs/stats.spec |  4 +--
 src/test/regress/expected/stats.out | 14 
 src/test/regress/sql/stats.sql  | 14 
 13 files changed, 81 insertions(+), 81 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 5cf9363ac8..9d73d8c1bb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4853,13 +4853,13 @@ description | Waiting for a newly initialized WAL file to reach durable storage
 NULL or is not specified, all the counters shown in
 the pg_stat_slru view for all SLRU caches are
 reset. The argument can be one of
-CommitTs,
-MultiXactMember,
-MultiXactOffset,
-Notify,
-Serial,
-Subtrans, or
-Xact
+commit_timestamp,
+multixact_member,
+multixact_offset,
+notify,
+serializable,
+subtransaction, or
+transaction
 to reset the counters for only that entry.
 If the argument is other (or indeed, any
 unrecognized name), then the counters for all other SLRU caches, such
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 97f7434da3..34f079cbb1 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -706,7 +706,7 @@ void
 CLOGShmemInit(void)
 {
 	XactCtl->PagePrecedes = CLOGPagePrecedes;
-	SimpleLruInit(XactCtl, "Xact", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
+	SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
   XactSLRULock, "pg_xact", LWTRANCHE_XACT_BUFFER,
   SYNC_HANDLER_CLOG, false);
 	SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 6bfe60343e..d965db89c7 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -529,7 +529,7 @@ CommitTsShmemInit(void)
 	bool		found;
 
 	CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
-	SimpleLruInit(CommitTsCtl, "CommitTs", CommitTsShmemBuffers(), 0,
+	SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
   CommitTsSLRULock, "pg_commit_ts",
   LWTRANCHE_COMMITTS_BUFFER,
   SYNC_HANDLER_COMMIT_TS,
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index febc429f72..64040d330e 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1851,14 +1851,14 @@ MultiXactShmemInit(void)
 	MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
 
 	SimpleLruInit(MultiXactOffsetCtl,
-  "MultiXactOffset", NUM_MULTIXACTOFFSET_BUFFERS, 0,
+  "multixact_offset", NUM_MULTIXACTOFFSET_BUFFERS, 0,
   MultiXactOffsetSLRULock, "pg_multixact/offsets",
   LWTRANCHE_MULTIXACTOFFSET_BUFFER,
   SYNC_HANDLER_MULTIXACT_OFFSET,
   false);
 	SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
 	SimpleLruInit(MultiXactMemberCtl,
-  "MultiXactMember", NUM_MULTIXACTMEMBER_BUFFERS, 0,
+  "multixact_member", NUM_MULTIXACTMEMBER_BUFFERS, 0,
   MultiXactMemberSLRULock, "pg_multixact/members",
   LWTRANCHE_MULTIXACTMEMBER_BUFFER,
   SYNC_HANDLER_MULTIXACT_MEMBER,
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index b2ed82ac56..6aa47af43e 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -200,7 +200,7 @@ void
 SUBTRANSShmemInit(void)
 {
 	SubTransCtl->PagePrecedes = SubTransPagePrecedes;
-	SimpleLruInit(SubTransCtl, "Subtrans", NUM_SUBTRANS_BUFFERS, 0,
+	SimpleLruInit(SubTransCtl, "subtransaction", NUM_SUBTRANS_BUFFERS, 0,
   SubtransSLRULock, "pg_subtrans",
   

Re: Functions to return random numbers in a given range

2024-02-27 Thread Dean Rasheed
On Sat, 24 Feb 2024 at 17:10, Tomas Vondra
 wrote:
>
> Hi Dean,
>
> I did a quick review and a little bit of testing on the patch today. I
> think it's a good/useful idea, and I think the code is ready to go (the
> code is certainly much cleaner than anything I'd written ...).
>

Thanks for reviewing!

> I do have one minor comments regarding the docs - it refers to "random
> functions" in a couple places, which sounds to me as if it was talking
> about some functions arbitrarily taken from some list, although it
> clearly means "functions generating random numbers". (I realize this
> might be just due to me not being native speaker.)
>

Yes, I think you're right, that wording was a bit clumsy. Attached is
an update that's hopefully a bit better.

> Did you think about adding more functions generating either other types
> of data distributions (now we have uniform and normal), or random data
> for other data types (I often need random strings, for example)?
>
> Of course, I'm not saying this patch needs to do that. But perhaps it
> might affect how we name stuff to make it "extensible".
>

I don't have any plans to add more random functions, but I did think
about it from that perspective. Currently we have "random" and
"random_normal", so the natural extension would be
"random_${distribution}" for other data distributions, with "uniform"
as the default distribution, if omitted.

For different result datatypes, it ought to be mostly possible to
determine the result type from the arguments. There might be some
exceptions, like maybe "random_bytes(length)" to generate a byte
array, but I think that would be OK.

Regards,
Dean
From b1a63ecce667377435dc16fc262509bff2355b29 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Fri, 25 Aug 2023 10:42:38 +0100
Subject: [PATCH v3] Add random-number-in-range functions.

This adds 3 functions:

random(min int, max int) returns int
random(min bigint, max bigint) returns bigint
random(min numeric, max numeric) returns numeric

Each returns a random number in the range [min, max].

In the numeric case, the result scale is Max(scale(min), scale(max)).
---
 doc/src/sgml/func.sgml|  43 ++-
 src/backend/utils/adt/Makefile|   1 +
 src/backend/utils/adt/float.c |  95 --
 src/backend/utils/adt/meson.build |   1 +
 src/backend/utils/adt/numeric.c   | 219 +
 src/backend/utils/adt/pseudorandomfuncs.c | 185 +++
 src/common/pg_prng.c  |  36 +++
 src/include/catalog/pg_proc.dat   |  12 +
 src/include/common/pg_prng.h  |   1 +
 src/include/utils/numeric.h   |   4 +
 src/test/regress/expected/random.out  | 360 ++
 src/test/regress/sql/random.sql   | 164 ++
 12 files changed, 1021 insertions(+), 100 deletions(-)
 create mode 100644 src/backend/utils/adt/pseudorandomfuncs.c

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e5fa82c161..e39e569fb6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -1862,6 +1862,39 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in

   
 
+  
+   
+
+ random
+
+random ( min integer, max integer )
+integer
+   
+   
+random ( min bigint, max bigint )
+bigint
+   
+   
+random ( min numeric, max numeric )
+numeric
+   
+   
+Returns a random value in the range
+min <= x <= max.
+For type numeric, the result will have the same number of
+fractional decimal digits as min or
+max, whichever has more.
+   
+   
+random(1, 10)
+7
+   
+   
+random(-0.499, 0.499)
+0.347
+   
+  
+
   

 
@@ -1906,19 +1939,19 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in

 
   
-   The random() function uses a deterministic
-   pseudo-random number generator.
+   The random() and random_normal()
+   functions listed in  use a
+   deterministic pseudo-random number generator.
It is fast but not suitable for cryptographic
applications; see the  module for a more
secure alternative.
If setseed() is called, the series of results of
-   subsequent random() calls in the current session
+   subsequent calls to these functions in the current session
can be repeated by re-issuing setseed() with the same
argument.
Without any prior setseed() call in the same
-   session, the first random() call obtains a seed
+   session, the first call to any of these functions obtains a seed
from a platform-dependent source of random bits.
-   These remarks hold equally for random_normal().
   
 
   
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 199eae525d..610ccf2f79 100644
--- a/src/backend/utils/adt/Makefile
+++ 

Re: Fix for edge case in date_bin() function

2024-02-27 Thread Tom Lane
Moaaz Assali  writes:
> The date_bin() function has a bug where it returns an incorrect binned date
> when both of the following are true:
> 1) the origin timestamp is before the source timestamp
> 2) the origin timestamp is exactly equivalent to some valid binned date in
> the set of binned dates that date_bin() can return given a specific stride
> and source timestamp.

Hmm, yeah.  The "stride_usecs > 1" test seems like it's a partial
attempt to account for this that is probably redundant given the
additional condition.  Also, can we avoid computing tm_diff %
stride_usecs twice?  Maybe the compiler is smart enough to remove the
common subexpression, but it's a mighty expensive computation if not.

I'm also itching a bit over whether there are integer-overflow
hazards here.  Maybe the range of timestamp is constrained enough
that there aren't, but I didn't look hard.

Also, whatever we do here, surely timestamptz_bin() has the
same problem(s).

regards, tom lane




Re: Support a wildcard in backtrace_functions

2024-02-27 Thread Jelte Fennema-Nio
On Mon, 12 Feb 2024 at 14:27, Jelte Fennema-Nio  wrote:
> Would a backtrace_functions_level GUC that would default to ERROR be
> acceptable in this case?

I implemented it this way in the attached patchset.
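The matching loop with wildcard support can be sketched in isolation (a
simplified stand-in for matches_backtrace_functions(); as in elog.c, the list
is a sequence of NUL-terminated names ended by an empty string):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool
matches_function_list(const char *list, const char *funcname)
{
	const char *p = list;

	for (;;)
	{
		if (*p == '\0')			/* end of list */
			return false;
		if (strcmp("*", p) == 0)
			return true;		/* wildcard matches every function */
		if (strcmp(funcname, p) == 0)
			return true;
		p += strlen(p) + 1;		/* advance past this name and its NUL */
	}
}
```

So a backtrace_functions setting of '*' makes every log entry (at or above the
configured level) eligible for a backtrace, while explicit names behave as
before.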
From 71e2c1f1fa903ecfce4a79ff5981d0d754d134a2 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 20 Dec 2023 11:41:18 +0100
Subject: [PATCH v3 2/2] Add wildcard support to backtrace_functions GUC

With this change, setting backtrace_functions to '*' will start logging
backtraces for all errors (or, more precisely, for all log messages).
---
 doc/src/sgml/config.sgml   | 5 +
 src/backend/utils/error/elog.c | 8 +---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ba5dbf9f096..a59d8e1b263 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11136,6 +11136,11 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
 code.

 
+   
+A single * character is interpreted as a wildcard and
+will cause all errors in the log to contain backtraces.
+   
+

 Backtrace support is not available on all platforms, and the quality
 of the backtraces depends on compilation options.
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8364125f44a..923e00e766e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -843,6 +843,8 @@ matches_backtrace_functions(const char *funcname)
 		if (*p == '\0')			/* end of backtrace_function_list */
 			break;
 
+		if (strcmp("*", p) == 0)
+			return true;
 		if (strcmp(funcname, p) == 0)
 			return true;
 		p += strlen(p) + 1;
@@ -2135,14 +2137,14 @@ check_backtrace_functions(char **newval, void **extra, GucSource source)
 	int			j;
 
 	/*
-	 * Allow characters that can be C identifiers and commas as separators, as
-	 * well as some whitespace for readability.
+	 * Allow characters that can be C identifiers, commas as separators, the
+	 * wildcard symbol, as well as some whitespace for readability.
 	 */
 	validlen = strspn(*newval,
 	  "0123456789_"
 	  "abcdefghijklmnopqrstuvwxyz"
 	  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-	  ", \n\t");
+	  ",* \n\t");
 	if (validlen != newvallen)
 	{
 		GUC_check_errdetail("Invalid character");
-- 
2.34.1

From 4ffbc6b51a711bd266f15c02611d894b080a3e11 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Tue, 27 Feb 2024 17:31:24 +0100
Subject: [PATCH v3 1/2] Add backtrace_functions_min_level

Before this change a stacktrace would be attached to all log messages
in a function that matched backtrace_functions. But in most cases people
only care about the stacktraces for messages with an ERROR level. This
changes the default to only log stacktraces for ERROR messages, while
keeping open the option of logging stacktraces for other log levels
too by having users configure the new backtrace_functions_min_level GUC.
---
 doc/src/sgml/config.sgml| 37 +
 src/backend/utils/error/elog.c  |  1 +
 src/backend/utils/misc/guc_tables.c | 12 ++
 src/include/utils/guc.h |  1 +
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 36a2a5ce431..ba5dbf9f096 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11128,10 +11128,12 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   

 This parameter contains a comma-separated list of C function names.
-If an error is raised and the name of the internal C function where
-the error happens matches a value in the list, then a backtrace is
-written to the server log together with the error message.  This can
-be used to debug specific areas of the source code.
+If a log entry is raised with a level higher than
+ and the name of the
+internal C function where the error happens matches a value in the
+list, then a backtrace is written to the server log together with the
+error message. This can be used to debug specific areas of the source
+code.

 

@@ -11146,6 +11148,33 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
+ 
+  backtrace_functions_min_level (string)
+  
+backtrace_functions_min_level configuration parameter
+  
+  
+  
+   
+Controls which message
+levels cause stack traces to be written to the log, for log
+entries in C functions that match the configured
+.
+   
+
+   
+Backtrace support is not available on all platforms, and the quality
+of the backtraces depends on compilation options.
+   
+
+   
+Only superusers and users with the appropriate SET
+privilege can change this setting.
+   
+  
+ 
+
+

Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Alvaro Herrera
On 2024-Feb-27, Andrey M. Borodin wrote:

> Sorry for the late reply, I have one nit. Are you sure that
> multixact_members and multixact_offsets are plural, while transaction
> and commit_timestamp are singular?
> Maybe multixact_members and multixact_offset? Because there are many
> members and one offset for a given multixact? Users certainly do not
> care, though...

I made myself the same question actually, and thought about putting them
both in the singular.  I only backed off because I noticed that the
directories themselves are in plural (an old mistake of mine, evidently). 
Maybe we should follow that instinct and use the singular for these.

If we do that, we can rename the directories to also appear in singular
when/if the patch to add standard page headers to the SLRUs lands --
which is going to need code to rewrite the files during pg_upgrade
anyway, so the rename is not going to be a big deal.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Crear es tan difícil como ser libre" (Elsa Triolet)




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Andrey M. Borodin



> On 27 Feb 2024, at 21:03, Alvaro Herrera  wrote:
> 
> On 2024-Feb-27, Dilip Kumar wrote:
> 
>>> static const char *const slru_names[] = {
>>>"commit_timestamp",
>>>"multixact_members",
>>>"multixact_offsets",
>>>"notify",
>>>"serializable",
>>>"subtransaction",
>>>"transaction",
>>>"other" /* has to be last
>>> */
>>> };
> 
> Here's a patch for the renaming part.

Sorry for the late reply, I have one nit. Are you sure that multixact_members 
and multixact_offsets are plural, while transaction and commit_timestamp are 
singular?
Maybe multixact_members and multixact_offset? Because there are many members 
and one offset for a given multixact? Users certainly do not care, though...


Best regards, Andrey Borodin.



Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 16:04, Japin Li  wrote:
> I see the config_group_names[] needs to be null-terminated because of help_config,
> however, I didn't find the reference in help_config.c.  Is this comment
> outdated?

Yeah, you're correct. That comment has been outdated for more than 20
years. The commit that made it unnecessary to null-terminate the array
was 9d77708d83ee.

Attached is v5 of the patchset that also includes this change (with
you listed as author).
From 945fae648d0a396e9f99413c8e31542ced9b17ba Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 21 Feb 2024 15:34:38 +0100
Subject: [PATCH v5 2/4] Use designated initializer syntax to improve
 readability

Use C99 designated initializer syntax for array elements of various
arrays. This is less verbose, more readable and less prone to typos or
ordering mistakes.
Akin to 74a730631065 and cc150596341e.
---
 src/backend/parser/parser.c |  12 +-
 src/backend/utils/misc/guc_tables.c | 187 +++-
 src/bin/pg_dump/pg_dump_sort.c  |  94 +++---
 src/common/encnames.c   | 154 +++
 src/common/relpath.c|   8 +-
 src/common/wchar.c  |  85 +++--
 src/include/mb/pg_wchar.h   |   6 +-
 7 files changed, 248 insertions(+), 298 deletions(-)

diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 9ec628ecbdf..3a1fa91c1b6 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -56,12 +56,12 @@ raw_parser(const char *str, RawParseMode mode)
 	{
 		/* this array is indexed by RawParseMode enum */
 		static const int mode_token[] = {
-			0,	/* RAW_PARSE_DEFAULT */
-			MODE_TYPE_NAME,		/* RAW_PARSE_TYPE_NAME */
-			MODE_PLPGSQL_EXPR,	/* RAW_PARSE_PLPGSQL_EXPR */
-			MODE_PLPGSQL_ASSIGN1,	/* RAW_PARSE_PLPGSQL_ASSIGN1 */
-			MODE_PLPGSQL_ASSIGN2,	/* RAW_PARSE_PLPGSQL_ASSIGN2 */
-			MODE_PLPGSQL_ASSIGN3	/* RAW_PARSE_PLPGSQL_ASSIGN3 */
+			[RAW_PARSE_DEFAULT] = 0,
+			[RAW_PARSE_TYPE_NAME] = MODE_TYPE_NAME,
+			[RAW_PARSE_PLPGSQL_EXPR] = MODE_PLPGSQL_EXPR,
+			[RAW_PARSE_PLPGSQL_ASSIGN1] = MODE_PLPGSQL_ASSIGN1,
+			[RAW_PARSE_PLPGSQL_ASSIGN2] = MODE_PLPGSQL_ASSIGN2,
+			[RAW_PARSE_PLPGSQL_ASSIGN3] = MODE_PLPGSQL_ASSIGN3,
 		};
 
 		yyextra.have_lookahead = true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 527a2b27340..a63ea042edf 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -627,13 +627,13 @@ bool		in_hot_standby_guc;
  */
 const char *const GucContext_Names[] =
 {
-	 /* PGC_INTERNAL */ "internal",
-	 /* PGC_POSTMASTER */ "postmaster",
-	 /* PGC_SIGHUP */ "sighup",
-	 /* PGC_SU_BACKEND */ "superuser-backend",
-	 /* PGC_BACKEND */ "backend",
-	 /* PGC_SUSET */ "superuser",
-	 /* PGC_USERSET */ "user"
+	[PGC_INTERNAL] = "internal",
+	[PGC_POSTMASTER] = "postmaster",
+	[PGC_SIGHUP] = "sighup",
+	[PGC_SU_BACKEND] = "superuser-backend",
+	[PGC_BACKEND] = "backend",
+	[PGC_SUSET] = "superuser",
+	[PGC_USERSET] = "user",
 };
 
 StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1),
@@ -646,20 +646,20 @@ StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1),
  */
 const char *const GucSource_Names[] =
 {
-	 /* PGC_S_DEFAULT */ "default",
-	 /* PGC_S_DYNAMIC_DEFAULT */ "default",
-	 /* PGC_S_ENV_VAR */ "environment variable",
-	 /* PGC_S_FILE */ "configuration file",
-	 /* PGC_S_ARGV */ "command line",
-	 /* PGC_S_GLOBAL */ "global",
-	 /* PGC_S_DATABASE */ "database",
-	 /* PGC_S_USER */ "user",
-	 /* PGC_S_DATABASE_USER */ "database user",
-	 /* PGC_S_CLIENT */ "client",
-	 /* PGC_S_OVERRIDE */ "override",
-	 /* PGC_S_INTERACTIVE */ "interactive",
-	 /* PGC_S_TEST */ "test",
-	 /* PGC_S_SESSION */ "session"
+	[PGC_S_DEFAULT] = "default",
+	[PGC_S_DYNAMIC_DEFAULT] = "default",
+	[PGC_S_ENV_VAR] = "environment variable",
+	[PGC_S_FILE] = "configuration file",
+	[PGC_S_ARGV] = "command line",
+	[PGC_S_GLOBAL] = "global",
+	[PGC_S_DATABASE] = "database",
+	[PGC_S_USER] = "user",
+	[PGC_S_DATABASE_USER] = "database user",
+	[PGC_S_CLIENT] = "client",
+	[PGC_S_OVERRIDE] = "override",
+	[PGC_S_INTERACTIVE] = "interactive",
+	[PGC_S_TEST] = "test",
+	[PGC_S_SESSION] = "session",
 };
 
 StaticAssertDecl(lengthof(GucSource_Names) == (PGC_S_SESSION + 1),
@@ -670,96 +670,51 @@ StaticAssertDecl(lengthof(GucSource_Names) == (PGC_S_SESSION + 1),
  */
 const char *const config_group_names[] =
 {
-	/* UNGROUPED */
-	gettext_noop("Ungrouped"),
-	/* FILE_LOCATIONS */
-	gettext_noop("File Locations"),
-	/* CONN_AUTH_SETTINGS */
-	gettext_noop("Connections and Authentication / Connection Settings"),
-	/* CONN_AUTH_TCP */
-	gettext_noop("Connections and Authentication / TCP Settings"),
-	/* CONN_AUTH_AUTH */
-	gettext_noop("Connections and Authentication / Authentication"),
-	/* CONN_AUTH_SSL */
-	gettext_noop("Connections and Authentication / SSL"),
-	/* RESOURCES_MEM */
-	gettext_noop("Resource Usage / 

Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Alvaro Herrera
On 2024-Feb-27, Dilip Kumar wrote:

> > static const char *const slru_names[] = {
> > "commit_timestamp",
> > "multixact_members",
> > "multixact_offsets",
> > "notify",
> > "serializable",
> > "subtransaction",
> > "transaction",
> > "other" /* has to be last
> > */
> > };

Here's a patch for the renaming part.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No nos atrevemos a muchas cosas porque son difíciles,
pero son difíciles porque no nos atrevemos a hacerlas" (Séneca)
>From 91741984cbd77f88e39b6fac8e8c7dc622d2ccf4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 27 Feb 2024 16:56:00 +0100
Subject: [PATCH] Rename SLRU elements in pg_stat_slru

The new names are intended to match an upcoming patch that adds a few
GUCs to configure the SLRU buffer sizes.

Discussion: https://postgr.es/m/202402261616.dlriae7b6emv@alvherre.pgsql
---
 doc/src/sgml/monitoring.sgml| 14 
 src/backend/access/transam/clog.c   |  2 +-
 src/backend/access/transam/commit_ts.c  |  2 +-
 src/backend/access/transam/multixact.c  |  4 +--
 src/backend/access/transam/subtrans.c   |  2 +-
 src/backend/commands/async.c|  2 +-
 src/backend/storage/lmgr/predicate.c|  2 +-
 src/include/utils/pgstat_internal.h | 14 
 src/test/isolation/expected/stats.out   | 44 -
 src/test/isolation/expected/stats_1.out | 44 -
 src/test/isolation/specs/stats.spec |  4 +--
 src/test/regress/expected/stats.out | 14 
 src/test/regress/sql/stats.sql  | 14 
 13 files changed, 81 insertions(+), 81 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 5cf9363ac8..7d92e68572 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4853,13 +4853,13 @@ description | Waiting for a newly initialized WAL file to reach durable storage
 NULL or is not specified, all the counters shown in
 the pg_stat_slru view for all SLRU caches are
 reset. The argument can be one of
-CommitTs,
-MultiXactMember,
-MultiXactOffset,
-Notify,
-Serial,
-Subtrans, or
-Xact
+commit_timestamp,
+multixact_members,
+multixact_offsets,
+notify,
+serializable,
+subtransaction, or
+transaction
 to reset the counters for only that entry.
 If the argument is other (or indeed, any
 unrecognized name), then the counters for all other SLRU caches, such
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 97f7434da3..34f079cbb1 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -706,7 +706,7 @@ void
 CLOGShmemInit(void)
 {
 	XactCtl->PagePrecedes = CLOGPagePrecedes;
-	SimpleLruInit(XactCtl, "Xact", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
+	SimpleLruInit(XactCtl, "transaction", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
   XactSLRULock, "pg_xact", LWTRANCHE_XACT_BUFFER,
   SYNC_HANDLER_CLOG, false);
 	SlruPagePrecedesUnitTests(XactCtl, CLOG_XACTS_PER_PAGE);
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 6bfe60343e..d965db89c7 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -529,7 +529,7 @@ CommitTsShmemInit(void)
 	bool		found;
 
 	CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
-	SimpleLruInit(CommitTsCtl, "CommitTs", CommitTsShmemBuffers(), 0,
+	SimpleLruInit(CommitTsCtl, "commit_timestamp", CommitTsShmemBuffers(), 0,
   CommitTsSLRULock, "pg_commit_ts",
   LWTRANCHE_COMMITTS_BUFFER,
   SYNC_HANDLER_COMMIT_TS,
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index febc429f72..f8bb83927c 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1851,14 +1851,14 @@ MultiXactShmemInit(void)
 	MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
 
 	SimpleLruInit(MultiXactOffsetCtl,
-  "MultiXactOffset", NUM_MULTIXACTOFFSET_BUFFERS, 0,
+  "multixact_offsets", NUM_MULTIXACTOFFSET_BUFFERS, 0,
   MultiXactOffsetSLRULock, "pg_multixact/offsets",
   LWTRANCHE_MULTIXACTOFFSET_BUFFER,
   SYNC_HANDLER_MULTIXACT_OFFSET,
   false);
 	SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE);
 	SimpleLruInit(MultiXactMemberCtl,
-  "MultiXactMember", NUM_MULTIXACTMEMBER_BUFFERS, 0,
+  "multixact_members", NUM_MULTIXACTMEMBER_BUFFERS, 0,
   MultiXactMemberSLRULock, "pg_multixact/members",
   LWTRANCHE_MULTIXACTMEMBER_BUFFER,
   SYNC_HANDLER_MULTIXACT_MEMBER,
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index b2ed82ac56..6aa47af43e 

Re: WIP Incremental JSON Parser

2024-02-27 Thread Jacob Champion
On Mon, Feb 26, 2024 at 9:20 PM Andrew Dunstan  wrote:
> The good news is that the parser is doing fine - this issue was due to a
> thinko on my part in the test program that got triggered by the input
> file size being an exact multiple of the chunk size. I'll have a new
> patch set later this week.

Ah, feof()! :D Good to know it's not the partial token logic.

--Jacob




Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Japin Li


On Tue, 27 Feb 2024 at 19:55, Jelte Fennema-Nio  wrote:
> On Tue, 27 Feb 2024 at 12:52, Jelte Fennema-Nio  wrote:
>> Attached is an updated patchset to also convert pg_enc2icu_tbl and
>> pg_enc2gettext_tbl. I converted pg_enc2gettext_tbl in a separate
>> commit, because it actually requires some codechanges too.
>
> Another small update to also make all arrays changed by this patch
> have a trailing comma (to avoid future diff noise).

I see the config_group_names[] needs to be null-terminated because of
help_config, however, I didn't find the reference in help_config.c.  Is this
comment outdated?  Here is a patch to remove the null terminator.

diff --git a/src/backend/utils/misc/guc_tables.c 
b/src/backend/utils/misc/guc_tables.c
index 59904fd007..df849f73fc 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -715,11 +715,9 @@ const char *const config_group_names[] =
[PRESET_OPTIONS] = gettext_noop("Preset Options"),
[CUSTOM_OPTIONS] = gettext_noop("Customized Options"),
[DEVELOPER_OPTIONS] = gettext_noop("Developer Options"),
-   /* help_config wants this array to be null-terminated */
-   NULL
 };

-StaticAssertDecl(lengthof(config_group_names) == (DEVELOPER_OPTIONS + 2),
+StaticAssertDecl(lengthof(config_group_names) == (DEVELOPER_OPTIONS + 1),
 "array length mismatch");

 /*




Re: Improve eviction algorithm in ReorderBuffer

2024-02-27 Thread vignesh C
On Mon, 26 Feb 2024 at 12:33, Masahiko Sawada  wrote:
>
> On Fri, Feb 23, 2024 at 6:24 PM vignesh C  wrote:
> >
> > On Fri, 9 Feb 2024 at 20:51, Masahiko Sawada  wrote:
> > >
> > >
> > > I think this performance regression is not acceptable. In this
> > > workload, one transaction has 10k subtransactions and the logical
> > > decoding becomes quite slow if logical_decoding_work_mem is not big
> > > enough. Therefore, it's a legitimate and common approach to increase
> > > logical_decoding_work_mem to speed up the decoding. However, with this
> > > patch, the decoding becomes slower than today. It's a bad idea in
> > > general to optimize an extreme case while sacrificing the normal (or
> > > more common) cases.
> > >
> >
> > Since this same function is used by pg_dump sorting TopoSort functions
> > also, we can just verify once if there is no performance impact with
> > large number of objects during dump sorting:
>
> Okay. I've run the pg_dump regression tests with --timer flag (note
> that pg_dump doesn't use indexed binary heap):
>
> master:
> [16:00:25] t/001_basic.pl  ok  151 ms ( 0.00 usr
> 0.00 sys +  0.09 cusr  0.06 csys =  0.15 CPU)
> [16:00:25] t/002_pg_dump.pl .. ok 10157 ms ( 0.23 usr
> 0.01 sys +  1.48 cusr  0.37 csys =  2.09 CPU)
> [16:00:36] t/003_pg_dump_with_server.pl .. ok  504 ms ( 0.00 usr
> 0.01 sys +  0.10 cusr  0.07 csys =  0.18 CPU)
> [16:00:36] t/004_pg_dump_parallel.pl . ok 1044 ms ( 0.00 usr
> 0.00 sys +  0.12 cusr  0.08 csys =  0.20 CPU)
> [16:00:37] t/005_pg_dump_filterfile.pl ... ok 2390 ms ( 0.00 usr
> 0.00 sys +  0.34 cusr  0.19 csys =  0.53 CPU)
> [16:00:40] t/010_dump_connstr.pl . ok 4813 ms ( 0.01 usr
> 0.00 sys +  2.13 cusr  0.45 csys =  2.59 CPU)
>
> patched:
> [15:59:47] t/001_basic.pl  ok  150 ms ( 0.00 usr
> 0.00 sys +  0.08 cusr  0.07 csys =  0.15 CPU)
> [15:59:47] t/002_pg_dump.pl .. ok 10057 ms ( 0.23 usr
> 0.02 sys +  1.49 cusr  0.36 csys =  2.10 CPU)
> [15:59:57] t/003_pg_dump_with_server.pl .. ok  509 ms ( 0.00 usr
> 0.00 sys +  0.09 cusr  0.08 csys =  0.17 CPU)
> [15:59:58] t/004_pg_dump_parallel.pl . ok 1048 ms ( 0.01 usr
> 0.00 sys +  0.11 cusr  0.11 csys =  0.23 CPU)
> [15:59:59] t/005_pg_dump_filterfile.pl ... ok 2398 ms ( 0.00 usr
> 0.00 sys +  0.34 cusr  0.20 csys =  0.54 CPU)
> [16:00:01] t/010_dump_connstr.pl . ok 4762 ms ( 0.01 usr
> 0.00 sys +  2.15 cusr  0.42 csys =  2.58 CPU)
>
> There is no noticeable difference between the two results.

Thanks for verifying it, I have also run in my environment and found
no noticeable difference between them:
Head:
[07:29:41] t/001_basic.pl  ok  332 ms
[07:29:41] t/002_pg_dump.pl .. ok 11029 ms
[07:29:52] t/003_pg_dump_with_server.pl .. ok  705 ms
[07:29:53] t/004_pg_dump_parallel.pl . ok 1198 ms
[07:29:54] t/005_pg_dump_filterfile.pl ... ok 2822 ms
[07:29:57] t/010_dump_connstr.pl . ok 5582 ms

With Patch:
[07:42:16] t/001_basic.pl  ok  328 ms
[07:42:17] t/002_pg_dump.pl .. ok 11044 ms
[07:42:28] t/003_pg_dump_with_server.pl .. ok  719 ms
[07:42:29] t/004_pg_dump_parallel.pl . ok 1188 ms
[07:42:30] t/005_pg_dump_filterfile.pl ... ok 2816 ms
[07:42:33] t/010_dump_connstr.pl . ok 5609 ms

Regards,
Vignesh




Re: libpq: PQfnumber overload for not null-terminated strings

2024-02-27 Thread Ivan Trofimov

However, I do think you could convert this per-row overhead in your
case to per-query overhead by caching the result of PQfnumber for each
unique C++ string_view. Afaict that should solve your performance
problem.


Absolutely, you're right.

The problem here is not that it's impossible to write it in a performant 
way, but rather that it's impossible to do so in a performant _and_ 
clean way given the convenient abstractions wrapper-libraries provide: 
here's a `Result`, which consists of `Row`s, which in turn consist of 
`Field`s.
The most natural and straightforward way to iterate over a `Result` 
would be along the lines of that loop, and people do write code like that 
because it's what they expect to just work given the abstractions (and 
it does, it's just slow).
Caching the result of PQfnumber could be done, but would result in 
somewhat of a mess on a call-site.



I like your idea of 'PQfnumberRaw': initially i was only concerned about 
a null-terminated string requirement affecting my interfaces (because 
users complained about that to me, 
https://github.com/userver-framework/userver/issues/494), but I think 
PQfnumberRaw could solve both problems (PQfnumber being a bottleneck 
when called repeatedly and a null-terminated string requirement) 
simultaneously.





Re: [PATCH] updates to docs about HOT updates for BRIN

2024-02-27 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> On 2024-Feb-26, Stephen Frost wrote:
> > Here's an updated patch which tries to improve on the wording a bit by
> > having it be a bit more consistent.  Would certainly welcome feedback on
> > it though, of course.  This is a tricky bit of language to write and
> > a complex optimization to explain.
> 
> Should we try to explain what a "summarizing" index is?  Right now
> the only way to know is to look at the source code or attach a debugger
> and see if IndexAmRoutine->amsummarizing is true.  Maybe we can just say
> "currently the only in-core summarizing index is BRIN" somewhere in the
> page.  (The patch's proposal to say "... such as BRIN" strikes me as too
> vague.)

Not sure about explaining what one is, but I'd be fine adding that
language.  I was disappointed to see that there's no way to figure out
the value of amsummarizing for an access method other than looking at
the code.  Not as part of this specific patch, but I'd generally support
having a way to get that information at the SQL level (or perhaps everything
from IndexAmRoutine?).

Attached is an updated patch which drops the 'such as' and adds a
sentence mentioning that BRIN is the only in-core summarizing index.

Thanks!

Stephen
From 131f83868b947b379e57ea3dad0acf5a4f95bca8 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 26 Feb 2024 17:17:54 -0500
Subject: [PATCH] docs: Update HOT update docs for 19d8e2308b

Commit 19d8e2308b changed when the HOT update optimization is possible
but neglected to update the Heap-Only Tuples (HOT) documentation.  This
patch updates that documentation accordingly.

Author: Elizabeth Christensen
Reviewed-By: Stephen Frost, Alvaro Herrera
Backpatch-through: 16
Discussion: https://postgr.es/m/CABoUFXRjisr58Ct_3VsFEdQx+fJeQTWTdJnM7XAp=8mubto...@mail.gmail.com
---
 doc/src/sgml/storage.sgml | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 3ea4e5526d..f2bb0283ae 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -1097,8 +1097,13 @@ data. Empty in ordinary tables.
   

 
- The update does not modify any columns referenced by the table's
- indexes, including expression and partial indexes.
+ The update only modifies columns which are not referenced by the
+ table's indexes or are only referenced by summarizing indexes and
+ does not update any columns referenced by the table's
+ non-summarizing indexes, including expression and partial
+ non-summarizing indexes.  The only summarizing index in the core
+ PostgreSQL distribution is
+ BRIN.
  


@@ -1114,7 +1119,8 @@ data. Empty in ordinary tables.
   

 
- New index entries are not needed to represent updated rows.
+ New index entries are not needed to represent updated rows, however,
+ summary indexes may still need to be updated.
 


@@ -1130,11 +1136,10 @@ data. Empty in ordinary tables.
  
 
  
-  In summary, heap-only tuple updates can only be created
-  if columns used by indexes are not updated.  You can
-  increase the likelihood of sufficient page space for
-  HOT updates by decreasing a table's fillfactor.
+  In summary, heap-only tuple updates can only be created if columns used by
+  non-summarizing indexes are not updated. You can increase the likelihood of
+  sufficient page space for HOT updates by decreasing
+  a table's fillfactor.
   If you don't, HOT updates will still happen because
   new rows will naturally migrate to new pages and existing pages with
   sufficient free space for new row versions.  The system view 

signature.asc
Description: PGP signature


Re: abi-compliance-checker

2024-02-27 Thread Alvaro Herrera
On 2024-Feb-27, Peter Geoghegan wrote:

> I have a feeling that there are going to be real problems with
> alerting, at least if it's introduced right away. I'd feel much better
> about it if there was an existing body of suppressions, that more or
> less worked as a reference of agreed upon best practices. Can we do
> that part first, rather than starting out with a blanket assumption
> that everything that happened before now must have been perfect?

Well, I was describing a possible plan, not saying that we have to
assume we've been perfect all along.  I think the first step should be
to add the tooling now (Meson rules as in Peter E's 0001 patch
upthread, or something equivalent), then figure out what suppressions we
need in the supported back branches.  This would let us build the corpus
of best practices you want, I think.

Once we have clean runs with those, we can add BF animals or whatever.
The alerts don't have to be the first step.  In fact, we can wait even
longer for the alerts.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Can we include capturing logs of pgdata/pg_upgrade_output.d/*/log in buildfarm

2024-02-27 Thread vignesh C
On Mon, 26 Feb 2024 at 10:57, Andrew Dunstan  wrote:
>
>
> On 2024-02-25 Su 11:18, vignesh C wrote:
> > On Thu, 15 Feb 2024 at 08:36, vignesh C  wrote:
> >> On Thu, 15 Feb 2024 at 07:24, Michael Paquier  wrote:
> >>> On Wed, Feb 14, 2024 at 03:51:08PM +0530, vignesh C wrote:
>  First regex is the testname_clusterinstance_data, second regex is the
>  timestamp used for pg_upgrade, third regex is for the text files
>  generated by pg_upgrade and fourth regex is for the log files
>  generated by pg_upgrade.
> 
>  Can we include these log files also in the buildfarm?
> 
>  [1] - 
>  https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2024-02-10%2007%3A03%3A10
>  [2] - 
>  https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2023-12-07%2003%3A56%3A20
> >>> Indeed, these lack some patterns.  Why not sending a pull request
> >>> around [1] to get more patterns covered?
> >> I have added a few more patterns to include the pg_upgrade generated
> >> files. The attached patch has the changes for the same.
> >> Adding Andrew also to get his thoughts on this.
> > I have added the following commitfest entry for this:
> > https://commitfest.postgresql.org/47/4850/
> >
>
> Buildfarm code patches do not belong in the Commitfest, so I have marked
> the item as rejected. You can send me patches directly or add a PR to
> the buildfarm's github repo.

Ok, I will send over the patch directly for the required things.

>
> In this case the issue on drongo was a typo, the fix for which I had
> forgotten to propagate back in December. Note that the buildfarm's
> TestUpgrade.pm module is only used for branches < 15. For branches >= 15
> we run the standard TAP test and this module does nothing.
>
> More generally, the collection of logs etc. for pg_upgrade will improve
> with the next release, which will be soon after I return from a vacation
> in about 2 weeks - experience shows that making releases just before a
> vacation is not a good idea :-)

Thanks, that will be helpful.

Regards,
Vignesh




Re: abi-compliance-checker

2024-02-27 Thread Peter Geoghegan
On Tue, Feb 27, 2024 at 9:03 AM Alvaro Herrera  wrote:
> Now, maybe a buildfarm animal is not the right tool, and instead we need
> a separate system that tests for it and emails pg-hackers when it breaks
> or whatever. That's fine with me, but it seems a pretty minor
> implementation detail.

Anything that alerts on breakage is pretty much equivalent to having a
buildfarm animal.

I have a feeling that there are going to be real problems with
alerting, at least if it's introduced right away. I'd feel much better
about it if there was an existing body of suppressions, that more or
less worked as a reference of agreed upon best practices. Can we do
that part first, rather than starting out with a blanket assumption
that everything that happened before now must have been perfect?

-- 
Peter Geoghegan




Re: Logging parallel worker draught

2024-02-27 Thread Tomas Vondra
On 2/27/24 10:55, Benoit Lobréau wrote:
> On 2/25/24 20:13, Tomas Vondra wrote:
>> 1) name of the GUC
> ...
>> 2) logging just the failures provides an incomplete view
> >> log_parallel_workers = {none | failures | all}
>> where "failures" only logs when at least one worker fails to start, and
>> "all" logs everything.
>>
>> AFAIK Sami made the same observation/argument in his last message [1].
> 
> I like the name and different levels you propose. I was initially
> thinking that the overall picture would be better summarized (an easier
> to implement) with pg_stat_statements. But with the granularity you
> propose, the choice is in the hands of the DBA, which is great and
> provides more options when we don't have full control of the installation.
> 

Good that you like the idea with multiple levels.

I agree pg_stat_statements may be an easier way to get a summary, but it
has limitations too (e.g. no built-in ability to show how the metrics
evolve over time, which is easier to reconstruct from logs). So I think
those approaches are complementary.

>> 3) node-level or query-level?
> ...
>> There's no value in having information about every individual temp file,
>> because we don't even know which node produced it. It'd be perfectly
>> sufficient to log some summary statistics (number of files, total space,
>> ...), either for the query as a whole, or perhaps per node (because
>> that's what matters for work_mem). So I don't think we should mimic this
>> just because log_temp_files does this.
> 
> I must admit that, given my (poor) technical level, I went for the
> easiest way.
> 

That's understandable, I'd do the same thing.

> If we go for the three levels you proposed, "node-level" makes even less
> sense (and has great "potential" for spam).
> 

Perhaps.

> I can see one downside to the "query-level" approach: it might be more
> difficult to to give information in cases where the query doesn't end
> normally. It's sometimes useful to have hints about what was going wrong
> before a query was cancelled or crashed, which log_temp_files kinda does.
> 

That is certainly true, but it's not a new thing, I believe. IIRC we may
not report statistics until the end of the transaction, so no progress
updates, and I'm not sure what happens if the query doesn't end correctly (e.g.
backend dies, ...). Similarly for the temporary files, we don't report
those until the temporary file gets closed, so I'm not sure you'd get a
lot of info about the progress either.

I admit I haven't tried it, and maybe I remember the details wrong.
Might be useful to experiment with this first a little bit ...

>> I don't know how difficult would it be to track/collect this information
>> for the whole query.
> 
> I am a worried this will be a little too much for me, given the time and
> the knowledge gap I have (both in C and PostgreSQL internals). I'll try
> to look anyway.
> 

I certainly understand that, and I realize it may feel daunting and even
discouraging. What I can promise is that I'm willing to help, either by
suggesting ways to approach the problems, doing reviews, and so on.
Would that be sufficient for you to continue working on this patch?

Some random thoughts/ideas about how to approach this:

- For parallel workers I think this might be as simple as adding some
counters into QueryDesc, and update those during query exec (instead of
just logging stuff directly). I'm not sure if it should be added to
"Instrumentation" or separately.

- I was thinking maybe we could just walk the execution plan and collect
the counts that way. But I'm not sure that'd work if the Gather node
happens to be executed repeatedly (in a loop). Also, not sure we'd want
to walk all plans.

- While log_temp_files is clearly out of scope for this patch, it might
be useful to think about it and how it should behave. We've already used
log_temp_files to illustrate some issues with logging the info right
away, so maybe there's something to learn here ...

- For temporary files I think it'd be more complicated, because we can
create temporary files from many different places, not just in executor,
so we can't simply update QueryDesc etc. Also, the places that log info
about temporary files (using ReportTemporaryFileUsage) only really know
about individual temporary files, so if a Sort or HashJoin creates a
1000 files, we'll get one LOG for each of those temp files. But they're
really a single "combined" file. So maybe we should introduce some sort
of "context" to group those files, and only accumulate/log the size for
the group as a whole? Maybe it'd even allow printing some info about
what the temporary file is for (e.g. tuplestore / tuplesort / ...).




regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: abi-compliance-checker

2024-02-27 Thread Alvaro Herrera
On 2024-Feb-27, Peter Geoghegan wrote:

> On Tue, Feb 27, 2024 at 8:25 AM Alvaro Herrera  
> wrote:
> > The way I see this working, is that we set up a buildfarm animal [per
> > architecture] that runs the new rules produced by the abidw option and
> > stores the result locally, so that for stable branches it can turn red
> > when an ABI-breaking change with the previous minor release of the same
> > branch is introduced.  There's no point on it ever turning red in the
> > master branch, since we're obviously not concerned with ABI changes there.
> 
> ABI stability doesn't seem like something that you can alert on.

Eh, I disagree.  Since you can add suppression rules to the tree, I'd
say it definitely is.

If you commit something and it breaks ABI, we want to know as soon as
possible -- for example suppose the ABI break occurs during a security
patch at release time; if we get an alert about it immediately, we still
have time to fix it before the mess is released.

Now, if you have an intentional ABI break, then you can let the testing
system know about it so that it won't complain.  We could for example
have src/abi/suppressions/REL_11_8.ini and
src/abi/suppressions/REL_12_3.ini files (in the respective branches)
with the _bt_pagedel() change.  You can add this file together with the
commit that introduces the change, if you know about it ahead of time,
or as a separate commit after the buildfarm animal turns red.  Or you
can fix your ABI break, if -- as is most likely -- it was unintentional.
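For illustration, such a per-branch suppression file could use libabigail's suppression-specification syntax (which abidiff reads); the path and function name below simply follow the hypothetical _bt_pagedel example from this thread:

```ini
# src/abi/suppressions/REL_11_8.ini  (hypothetical path, per the proposal)
# Intentional ABI change: _bt_pagedel() was removed in a backpatch;
# any extension still calling it was already broken.
[suppress_function]
  name = _bt_pagedel
```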

Again -- this only matters for stable branches.  We don't need to do
anything for the master branch, as it would be far too noisy if we did
that.

Now, maybe a buildfarm animal is not the right tool, and instead we need
a separate system that tests for it and emails pg-hackers when it breaks
or whatever.  That's fine with me, but it seems a pretty minor
implementation detail.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: abi-compliance-checker

2024-02-27 Thread Peter Geoghegan
On Tue, Feb 27, 2024 at 8:25 AM Alvaro Herrera  wrote:
> The way I see this working, is that we set up a buildfarm animal [per
> architecture] that runs the new rules produced by the abidw option and
> stores the result locally, so that for stable branches it can turn red
> when an ABI-breaking change with the previous minor release of the same
> branch is introduced.  There's no point in it ever turning red in the
> master branch, since we're obviously not concerned with ABI changes there.

ABI stability doesn't seem like something that you can alert on. There
are quite a few individual cases where the ABI was technically broken,
in a way that these tools will complain about. And yet it was
generally understood that these changes did not really break ABI
stability, for various high-context reasons that no tool can possibly
be expected to understand. This will at least be true under our
existing practices, or anything like them.

For example, if you look at the "Problems with Symbols, High Severity"
from the report I posted comparing REL_11_0 to REL_11_20, you'll see
that I removed _bt_pagedel() when backpatching a fix. That was
justified by the fact that any extension that was calling that
function was already hopelessly broken (I pointed this out at the
time).

Having some tooling in this area would be extremely useful. The
absolute number of false positives seems quite manageable, but the
fact is that most individual complaints that the tooling makes are
false positives. At least in some deeper sense.

-- 
Peter Geoghegan




Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi,

On Tue, Feb 27, 2024 at 04:49:03PM +0500, Andrey M. Borodin wrote:
> Hi,
> 
> > On 27 Feb 2024, at 16:08, Bertrand Drouvot  
> > wrote:
> > 
> > On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote:
> >> 
> >> But, AFAICS, the purpose is the same: wait until event happened.
> > 
> > I think it's easier to understand the tests (I mean what the purpose of the
> > injection points are) if we don't use an helper function. While the helper 
> > function would make the test easier to read / cleaner, I think it may make 
> > them
> > more difficult to understand as 'await_injection_point' would probably be 
> > too
> > generic.
> 
> For the record: I’m fine if there is no such function.
> There will be at least one implementation of this function in every single 
> test with waiting injection points.
> That’s the case where we might want something generic.
> What the specific there might be? The test can check that 
>  - conditions are logged
>  - injection point reached in specific backend (e.g. checkpointer)
> etc
> 
> I doubt that these specific checks are worth the cleanness of the test. And 
> sacrificing test readability in favour of teaching the reader what injection 
> points are sounds strange.

I'm giving a second thought and it does not have to be exclusive, for example in
this specific test we could:

1) check that the injection point is reached with the helper (querying
pg_stat_activity looks good to me), and
2) check in the log.

So even if two checks might sound like "too much", they both make sense: 1) gives
better readability and 2) gives a better understanding of the test.

So, I'm ok with the new helper too.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-02-27 Thread Давыдов Виталий

Hi Amit,

On Tuesday, February 27, 2024 16:00 MSK, Amit Kapila  
wrote:
> As we do XLogFlush() at the time of prepare then why it is not available? OR
> are you talking about this state after your idea/patch where you are trying
> to make both Prepare and Commit_prepared records async?

Right, I'm talking about my patch where async commit is implemented. There is
no such problem with reading 2PC state from unflushed WAL in vanilla, because
XLogFlush is called unconditionally, as you've described. But an attempt to
add some async behaviour leads to the problem of reading unflushed WAL. That
is why I store the 2PC state in local memory in my patch.

> It would be good if you could link those threads.

Sure, I will find and add some links to those past discussions.

Thank you!

With best regards,
Vitaly
 On Tue, Feb 27, 2024 at 4:49 PM Давыдов Виталий
 wrote:
>
> Thank you for your interest in the discussion!
>
> On Monday, February 26, 2024 16:24 MSK, Amit Kapila  
> wrote:
>
>
> I think the reason is probably that when the WAL record for prepared is 
> already flushed then what will be the idea of async commit here?
>
> I think, the idea of async commit should be applied for both transactions: 
> PREPARE and COMMIT PREPARED, which are actually two separate local 
> transactions. For both these transactions we may call XLogSetAsyncXactLSN on 
> commit instead of XLogFlush when async commit is enabled. When I use async 
> commit, I mean to apply async commit to local transactions, not to a twophase 
> (prepared) transaction itself.
>
>
> At commit prepared, it seems we read prepare's WAL record, right? If so, it 
> is not clear to me do you see a problem with a flush of commit_prepared or 
> reading WAL for prepared or both of these.
>
> The problem with reading WAL is due to async commit of PREPARE TRANSACTION 
> which saves 2PC in the WAL. At the moment of COMMIT PREPARED the WAL with 
> PREPARE TRANSACTION 2PC state may not be XLogFlush-ed yet.
>

As we do XLogFlush() at the time of prepare then why it is not
available? OR are you talking about this state after your idea/patch
where you are trying to make both Prepare and Commit_prepared records
async?

> So, PREPARE TRANSACTION should wait until its 2PC state is flushed.
>
> I did some experiments with saving 2PC state in the local memory of logical 
> replication worker and, I think, it worked and demonstrated much better 
> performance. Logical replication worker utilized up to 100% CPU. I'm just 
> concerned about possible problems with async commit for twophase transactions.
>
> To be more specific, I've attached a patch to support async commit for 
> twophase. It is not the final patch but it is presented only for discussion 
> purposes. There were some attempts to save 2PC in memory in past but it was 
> rejected.
>

It would be good if you could link those threads.

--
With Regards,
Amit Kapila.

 

 


Re: abi-compliance-checker

2024-02-27 Thread Alvaro Herrera
On 2023-Nov-01, Peter Eisentraut wrote:

> v3-0001-abidw-option.patch
> 
> This adds the abidw meson option, which produces the xml files with the ABI
> description.  With that, you can then implement a variety of workflows, such
> as the abidiff proposed in the later patches, or something rigged up via CI,
> or you can just build various versions locally and compare them.  With this
> patch, you get the files to compare built automatically and don't have to
> remember to cover all the libraries or which options to use.
> 
> I think this patch is mostly pretty straightforward and agreeable, subject
> to technical review in detail.

I like this idea and I think we should integrate it with the objective
of it becoming the workhorse of ABI-stability testing.  However, I do
not think that the subsequent patches should be part of the tree at all;
certainly not the produced .xml files in your 0004, as that would be far
too unstable and would cause a lot of pointless churn.

> TODO: documentation

Yes, please.

> TODO: Do we want a configure/make variant of this?

Not needed IMO.


The way I see this working, is that we set up a buildfarm animal [per
architecture] that runs the new rules produced by the abidw option and
stores the result locally, so that for stable branches it can turn red
when an ABI-breaking change with the previous minor release of the same
branch is introduced.  There's no point in it ever turning red in the
master branch, since we're obviously not concerned with ABI changes there.

(Perhaps we do need 0003 as an easy mechanism to run the comparison, but
I'm not sure to what extent that would be actually helpful.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-27 Thread Amit Kapila
On Tue, Feb 27, 2024 at 2:00 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> > We do append dbname=replication even in libpqrcv_connect for .pgpass
> > lookup as mentioned in comments. See below:
> > libpqrcv_connect()
> > {
> > 
> > else
> > {
> > /*
> > * The database name is ignored by the server in replication mode,
> > * but specify "replication" for .pgpass lookup.
> > */
> > keys[++i] = "dbname";
> > vals[i] = "replication";
> > }
> > ...
> > }
>
> OK. So we must add the value for the authorization, right?
> I think it should be described even in GetConnection().
>
> > I think as part of this effort we should just add dbname to
> > primary_conninfo written in postgresql.auto.conf file. As above, the
> > question is valid whether we should do it just for 17 or for prior
> > versions. Let's discuss more on that. I am not sure of the use case
> > for versions before 17 but commit cca97ce6a665 mentioned that some
> > middleware or proxies might however need to know the dbname to make
> > the correct routing decision for the connection. Does that apply here
> > as well? If so, we can do it and update the docs, otherwise, let's do
> > it just for 17.
>
> Hmm, I might be losing track of your requirements again. So, we must satisfy
> all of the below, right?
> 1) add {"dbname", "replication"} key-value pair to look up .pgpass file 
> correctly.
> 2) If the -R is given, output dbname=xxx value to be used by slotsync worker.
> 3) If the dbname is not given in the connection string, the same string as
>username must be used to keep the libpq connection rule.
> 4) No need to add dbname=replication pair
>

Point 1) and 4) seems contradictory or maybe I am missing something.

-- 
With Regards,
Amit Kapila.




Re: More new SQL/JSON item methods

2024-02-27 Thread Jeevan Chalke
On Tue, Feb 27, 2024 at 12:40 PM Andrew Dunstan  wrote:

>
> On 2024-02-02 Fr 00:31, Jeevan Chalke wrote:
>
>
>
> On Thu, Feb 1, 2024 at 11:25 AM Kyotaro Horiguchi 
> wrote:
>
>> At Thu, 1 Feb 2024 09:22:22 +0530, Jeevan Chalke <
>> jeevan.cha...@enterprisedb.com> wrote in
>> > On Thu, Feb 1, 2024 at 7:24 AM Kyotaro Horiguchi <
>> horikyota@gmail.com>
>> > wrote:
>> >
>> > > At Thu, 01 Feb 2024 10:49:57 +0900 (JST), Kyotaro Horiguchi <
>> > > horikyota@gmail.com> wrote in
>> > > > By the way, while playing with this feature, I noticed the following
>> > > > error message:
>> > > >
>> > > > > select jsonb_path_query('1.1' , '$.boolean()');
>> > > > > ERROR:  numeric argument of jsonpath item method .boolean() is
>> out of
>> > > range for type boolean
>> > > >
>> > > > The error message seems a bit off to me. For example, "argument
>> '1.1'
>> > > > is invalid for type [bB]oolean" seems more appropriate for this
>> > > > specific issue. (I'm not ceratin about our policy on the spelling of
>> > > > Boolean..)
>> > >
>> > > Or, following our general convention, it would be spelled as:
>> > >
>> > > 'invalid argument for type Boolean: "1.1"'
>> > >
>> >
>> > jsonpath way:
>>
>> Hmm. I see.
>>
>> > ERROR:  argument of jsonpath item method .boolean() is invalid for type
>> > boolean
>> >
>> > or, if we add input value, then
>> >
>> > ERROR:  argument "1.1" of jsonpath item method .boolean() is invalid for
>> > type boolean
>> >
>> > And this should work for all the error types, like out of range, not
>> valid,
>> > invalid input, etc, etc. Also, we don't need separate error messages for
>> > string input as well, which currently has the following form:
>> >
>> > "string argument of jsonpath item method .%s() is not a valid
>> > representation.."
>>
>> Agreed.
>>
>
> Attached are patches based on the discussion.
>
>
>
>
> Thanks, I combined these and pushed the result.
>

Thank you, Andrew.


>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>

-- 
Jeevan Chalke

*Principal, Manager, Product Development*



edbpostgres.com


Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-02-27 Thread Amit Kapila
On Tue, Feb 27, 2024 at 4:49 PM Давыдов Виталий
 wrote:
>
> Thank you for your interest in the discussion!
>
> On Monday, February 26, 2024 16:24 MSK, Amit Kapila  
> wrote:
>
>
> I think the reason is probably that when the WAL record for prepared is 
> already flushed then what will be the idea of async commit here?
>
> I think, the idea of async commit should be applied for both transactions: 
> PREPARE and COMMIT PREPARED, which are actually two separate local 
> transactions. For both these transactions we may call XLogSetAsyncXactLSN on 
> commit instead of XLogFlush when async commit is enabled. When I use async 
> commit, I mean to apply async commit to local transactions, not to a twophase 
> (prepared) transaction itself.
>
>
> At commit prepared, it seems we read prepare's WAL record, right? If so, it 
> is not clear to me do you see a problem with a flush of commit_prepared or 
> reading WAL for prepared or both of these.
>
> The problem with reading WAL is due to async commit of PREPARE TRANSACTION 
> which saves 2PC in the WAL. At the moment of COMMIT PREPARED the WAL with 
> PREPARE TRANSACTION 2PC state may not be XLogFlush-ed yet.
>

As we do XLogFlush() at the time of prepare then why it is not
available? OR are you talking about this state after your idea/patch
where you are trying to make both Prepare and Commit_prepared records
async?

> So, PREPARE TRANSACTION should wait until its 2PC state is flushed.
>
> I did some experiments with saving 2PC state in the local memory of logical 
> replication worker and, I think, it worked and demonstrated much better 
> performance. Logical replication worker utilized up to 100% CPU. I'm just 
> concerned about possible problems with async commit for twophase transactions.
>
> To be more specific, I've attached a patch to support async commit for 
> twophase. It is not the final patch but it is presented only for discussion 
> purposes. There were some attempts to save 2PC in memory in past but it was 
> rejected.
>

It would be good if you could link those threads.

-- 
With Regards,
Amit Kapila.




Re: bug report: some issues about pg_15_stable(8fa4a1ac61189efffb8b851ee77e1bc87360c445)

2024-02-27 Thread Dean Rasheed
On Fri, 23 Feb 2024 at 00:12, Tom Lane  wrote:
>
> So after studying this for awhile, I see that the planner is emitting
> a PlanRowMark that presumes that the UNION ALL subquery will be
> scanned as though it's a base relation; but since we've converted it
> to an appendrel, the executor just ignores that rowmark, and the wrong
> things happen.  I think the really right fix would be to teach the
> executor to honor such PlanRowMarks, by getting nodeAppend.c and
> nodeMergeAppend.c to perform EPQ row substitution.

Yes, I agree that's a much better solution, if it can be made to work,
though I have been really struggling to see how.


> the planner produces targetlists like
>
> Output: src_1.val, src_1.id, ROW(src_1.id, src_1.val)
>
> and as you can see the order of the columns doesn't match.
> I can see three ways we might attack that:
>
> 1. Persuade the planner to build output tlists that always match
> the row identity Var.
>
> 2. Change generation of the ROW() expression so that it lists only
> the values we're going to output, in the order we're going to
> output them.
>
> 3. Fix the executor to remap what it gets out of the ROW() into the
> order of the subquery tlists.  This is probably do-able but I'm
> not certain; it may be that the executor hasn't enough info.
> We might need to teach the planner to produce a mapping projection
> and attach it to the Append node, which carries some ABI risk (but
> in the past we've gotten away with adding new fields to the ends
> of plan nodes in the back branches).  Another objection is that
> adding cycles to execution rather than planning might be a poor
> tradeoff --- although if we only do the work when EPQ is invoked,
> maybe it'd be the best way.
>

Of those, option 3 feels like the best one, though I'm really not
sure. I played around with it and convinced myself that the executor
doesn't have the information it needs to make it work, but I think all
it needs is the Append node's original targetlist, as it is just
before it's rewritten by set_dummy_tlist_references(), which rewrites
the attribute numbers sequentially. In the original targetlist, all
the Vars have the right attribute numbers, so it can be used to build
the required projection (I think).

Attached is a very rough patch. It seemed better to build the
projection in the executor rather than the planner, since then the
extra work can be avoided, if EPQ is not invoked.

It seems to work (it passes the isolation tests, and I couldn't break
it in ad hoc testing), but it definitely needs tidying up, and it's
hard to be sure that it's not overlooking something.

Regards,
Dean
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
new file mode 100644
index c7059e7..ccd994c
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -275,6 +275,8 @@ ExecInitAppend(Append *node, EState *est
 	/* For parallel query, this will be overridden later. */
 	appendstate->choose_next_subplan = choose_next_subplan_locally;
 
+	appendstate->as_epq_tupdesc = NULL;
+
 	return appendstate;
 }
 
@@ -288,8 +290,107 @@ static TupleTableSlot *
 ExecAppend(PlanState *pstate)
 {
 	AppendState *node = castNode(AppendState, pstate);
+	EState	   *estate = node->ps.state;
 	TupleTableSlot *result;
 
+	if (estate->es_epq_active != NULL)
+	{
+		/*
+		 * We are inside an EvalPlanQual recheck.  If there is a relevant
+		 * rowmark for the append relation, return the test tuple if one is
+		 * available.
+		 */
+		EPQState   *epqstate = estate->es_epq_active;
+		int			scanrelid;
+
+		if (bms_get_singleton_member(castNode(Append, node->ps.plan)->apprelids,
+									 &scanrelid))
+		{
+			if (epqstate->relsubs_done[scanrelid - 1])
+			{
+/*
+ * Return empty slot, as either there is no EPQ tuple for this
+ * rel or we already returned it.
+ */
+TupleTableSlot *slot = node->ps.ps_ResultTupleSlot;
+
+return ExecClearTuple(slot);
+			}
+			else if (epqstate->relsubs_slot[scanrelid - 1] != NULL)
+			{
+/*
+ * Return replacement tuple provided by the EPQ caller.
+ */
+TupleTableSlot *slot = epqstate->relsubs_slot[scanrelid - 1];
+
+Assert(epqstate->relsubs_rowmark[scanrelid - 1] == NULL);
+
+/* Mark to remember that we shouldn't return it again */
+epqstate->relsubs_done[scanrelid - 1] = true;
+
+return slot;
+			}
+			else if (epqstate->relsubs_rowmark[scanrelid - 1] != NULL)
+			{
+/*
+ * Fetch and return replacement tuple using a non-locking
+ * rowmark.
+ */
+ExecAuxRowMark *earm = epqstate->relsubs_rowmark[scanrelid - 1];
+ExecRowMark *erm = earm->rowmark;
+Datum		datum;
+bool		isNull;
+TupleTableSlot *slot;
+
+Assert(erm->markType == ROW_MARK_COPY);
+
+datum = ExecGetJunkAttribute(epqstate->origslot,
+			 earm->wholeAttNo,
+			 &isNull);
+if (isNull)
+	return NULL;
+
+if (node->as_epq_tupdesc == NULL)
+{
+	HeapTupleHeader tuple;
+	Oid			tupType;

Re: [PATCH] updates to docs about HOT updates for BRIN

2024-02-27 Thread Alvaro Herrera
Hello,

On 2024-Feb-26, Stephen Frost wrote:

> Here's an updated patch which tries to improve on the wording a bit by
> having it be a bit more consistent.  Would certainly welcome feedback on
> it though, of course.  This is a tricky bit of language to write and
> a complex optimization to explain.

Should we try to explain what is a "summarizing" index is?  Right now
the only way to know is to look at the source code or attach a debugger
and see if IndexAmRoutine->amsummarizing is true.  Maybe we can just say
"currently the only in-core summarizing index is BRIN" somewhere in the
page.  (The patch's proposal to say "... such as BRIN" strikes me as too
vague.)

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this."   (Fotis)
  https://postgr.es/m/200606261359.k5qdxe2p004...@auth-smtp.hol.gr




Re: libpq: PQfnumber overload for not null-terminated strings

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 00:31, Ivan Trofimov  wrote:
> I see now that I failed to express myself clearly: it's not a per-query
> overhead, but rather a per-result-field one.

I'm fairly sympathetic to decreasing the overhead of any per-row
operation. And looking at the code, it doesn't surprise me that
PQfnumber shows up so big in your profile. I think it would probably
make sense to introduce a PQfnumber variant that does not do the
downcasing/quote handling (called e.g. PQfnumberRaw).

However, I do think you could convert this per-row overhead in your
case to per-query overhead by caching the result of PQfnumber for each
unique C++ string_view. Afaict that should solve your performance
problem.




Re: Synchronizing slots from primary to standby

2024-02-27 Thread Amit Kapila
On Tue, Feb 27, 2024 at 12:48 PM Peter Smith  wrote:
>
> Here are some review comments for v99-0001
>
> ==
> 0. GENERAL.
>
> +#standby_slot_names = '' # streaming replication standby server slot names 
> that
> + # logical walsender processes will wait for
>
> IMO the GUC name is too generic. There is nothing in this name to
> suggest it has anything to do with logical slot synchronization; that
> meaning is only found in the accompanying comment -- it would be
> better if the GUC name itself were more self-explanatory.
>
> e.g. Maybe like 'wal_sender_sync_standby_slot_names' or
> 'wal_sender_standby_slot_names', 'wal_sender_wait_for_standby_slots',
> or ...
>

It would be wrong and or misleading to append wal_sender to this GUC
name as this is used during SQL APIs as well. Also, adding wait sounds
more like a boolean. So, I don't see the proposed names any better
than the current one.

> ~~~
>
> 9.
> + /* Now verify if the specified slots really exist and have correct type */
> + if (!validate_standby_slots(newval))
> + return false;
>
> As in a prior comment, if ReplicationSlotCtl is NULL then it is not
> always going to do exactly what that comment says it is doing...
>

It will do what the comment says when invoked as part of the SIGHUP
signal. I think the current comment is okay.

-- 
With Regards,
Amit Kapila.




Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 08:57, Alvaro Herrera  wrote:
> What this says to me is that ObjectClass is/was a somewhat useful
> abstraction layer on top of catalog definitions.  I'm now not 100% that
> poking this hole in the abstraction (by expanding use of catalog OIDs at
> the expense of ObjectClass) was such a great idea.  Maybe we want to
> make ObjectClass *more* useful/encompassing rather than the opposite.

I agree that ObjectClass has some benefits over using the table OIDs,
but neither of the benefits you mention applies to add_object_address.
So I don't think using ObjectClass was worth the extra effort of
maintaining the object_classes array, just for add_object_address.

One improvement that I think could be worth considering is to link
ObjectClass and the table OIDs more explicitly, by actually making
their values the same:
enum ObjectClass {
OCLASS_PGCLASS = RelationRelationId,
OCLASS_PGPROC = ProcedureRelationId,
...
}

But that would effectively mean that anyone including dependency.h
would also be including all catalog headers. I'm not sure if that's
considered problematic or not. If that is problematic then it would
also be possible to reverse the relationship and have each catalog
header include dependency.h (or some other header that we move
ObjectClass to), and go about it in the following way:

/* dependency.h */
enum ObjectClass {
OCLASS_PGCLASS = 1259,
OCLASS_PGPROC = 1255,
...
}

/* pg_class.h */
CATALOG(pg_class,OCLASS_PGCLASS,RelationRelationId) BKI_BOOTSTRAP
BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO




Seeking Clarification on Logical Replication Start LSN

2024-02-27 Thread Pradeep Kumar
Dear Postgres Community,

I hope this email finds you well. I am reaching out to seek clarification
on an issue I am encountering with logical replication in PostgreSQL.

My specific question pertains to determining the appropriate LSN (Log
Sequence Number) from which to start logical replication. Allow me to
provide detailed context for better understanding:

During the process of performing a parallel pg_basebackup, I concurrently
execute DML queries. As part of the pg_basebackup command, I utilize the
option create-slot to create a replication slot. Subsequently, upon
completion of the base backup, I initiate logical replication using the
restart_lsn obtained during the execution of the pg_basebackup command. My
intention is to ensure consistency between two database clusters.

However, I am encountering errors during this process. Specifically, I
receive the following error message on the source side:

"""
2024-02-27 16:20:09.271 IST [2838457] ERROR: duplicate key value violates
unique constraint "table_15_36_pkey"
2024-02-27 16:20:09.271 IST [2838457] DETAIL: Key (col_1, col_2)=(23,
2024-02-27 15:14:24.332557) already exists.
2024-02-27 16:20:09.272 IST [2834967] LOG: background worker "logical
replication worker" (PID 2838457) exited with exit code 1
Upon analysis, it appears that the errors stem from starting the logical
replication with an incorrect LSN, one that has already been applied to the
target side, leading to duplicate key conflicts.
"""

In light of this issue, I seek guidance on determining the appropriate LSN
from which to commence logical replication.

To further clarify my problem:

1)I have a source machine and a target machine.
2) I perform a backup from the source to the target using pg_basebackup.
3) Prior to initiating the base backup, I create logical replication slots
on the source machine.
4) During the execution of pg_basebackup, DML queries are executed, and I
aim to replicate this data on the target machine.
5) My dilemma lies in determining the correct LSN to begin the logical
replication process.
Your insights and guidance on this matter would be immensely appreciated.
Thank you for your time and assistance.

Warm regards,
Pradeep


Re: Synchronizing slots from primary to standby

2024-02-27 Thread Amit Kapila
On Tue, Feb 27, 2024 at 4:07 PM Bertrand Drouvot
 wrote:
>
> On Tue, Feb 27, 2024 at 06:17:44PM +1100, Peter Smith wrote:
> > +static bool
> > +validate_standby_slots(char **newval)
> > +{
> > + char*rawname;
> > + List*elemlist;
> > + ListCell   *lc;
> > + bool ok;
> > +
> > + /* Need a modifiable copy of string */
> > + rawname = pstrdup(*newval);
> > +
> > + /* Verify syntax and parse string into a list of identifiers */
> > + ok = SplitIdentifierString(rawname, ',', &elemlist);
> > +
> > + if (!ok)
> > + {
> > + GUC_check_errdetail("List syntax is invalid.");
> > + }
> > +
> > + /*
> > + * If the replication slots' data have been initialized, verify if the
> > + * specified slots exist and are logical slots.
> > + */
> > + else if (ReplicationSlotCtl)
> > + {
> > + foreach(lc, elemlist)
> >
> > 6a.
> > So, if the ReplicationSlotCtl is NULL, it is possible to return
> > ok=true without ever checking if the slots exist or are of the correct
> > kind. I am wondering what are the ramifications of that. -- e.g.
> > assuming names are OK when maybe they aren't OK at all. AFAICT this
> > works because it relies on getting subsequent WARNINGS when calling
> > FilterStandbySlots(). If that is correct then maybe the comment here
> > can be enhanced to say so.
> >
> > Indeed, if it works like that, now I am wondering do we need this for
> > loop validation at all. e.g. it seems just a matter of timing whether
> > we get ERRORs validating the GUC here, or WARNINGS later in the
> > FilterStandbySlots. Maybe we don't need the double-checking and it is
> > enough to check in FilterStandbySlots?
>
> Good point, I have the feeling that it is enough to check in 
> FilterStandbySlots().
>

I think it is better if we get earlier in a case where the parameter
is changed and performed SIGHUP instead of waiting till we get to
logical decoding. So, there is merit in keeping these checks during
initial validation.

-- 
With Regards,
Amit Kapila.




Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 12:52, Jelte Fennema-Nio  wrote:
> Attached is an updated patchset to also convert pg_enc2icu_tbl and
> pg_enc2gettext_tbl. I converted pg_enc2gettext_tbl in a separate
commit, because it actually requires some code changes too.

Another small update to also make all arrays changed by this patch
have a trailing comma (to avoid future diff noise).
From 945fae648d0a396e9f99413c8e31542ced9b17ba Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 21 Feb 2024 15:34:38 +0100
Subject: [PATCH v4 2/3] Use designated initializer syntax to improve
 readability

Use C99 designated initializer syntax for array elements of various
arrays. This is less verbose, more readable and less prone to typos or
ordering mistakes.
Akin to 74a730631065 and cc150596341e.
---
 src/backend/parser/parser.c |  12 +-
 src/backend/utils/misc/guc_tables.c | 187 +++-
 src/bin/pg_dump/pg_dump_sort.c  |  94 +++---
 src/common/encnames.c   | 154 +++
 src/common/relpath.c|   8 +-
 src/common/wchar.c  |  85 +++--
 src/include/mb/pg_wchar.h   |   6 +-
 7 files changed, 248 insertions(+), 298 deletions(-)

diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 9ec628ecbdf..3a1fa91c1b6 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -56,12 +56,12 @@ raw_parser(const char *str, RawParseMode mode)
 	{
 		/* this array is indexed by RawParseMode enum */
 		static const int mode_token[] = {
-			0,	/* RAW_PARSE_DEFAULT */
-			MODE_TYPE_NAME,		/* RAW_PARSE_TYPE_NAME */
-			MODE_PLPGSQL_EXPR,	/* RAW_PARSE_PLPGSQL_EXPR */
-			MODE_PLPGSQL_ASSIGN1,	/* RAW_PARSE_PLPGSQL_ASSIGN1 */
-			MODE_PLPGSQL_ASSIGN2,	/* RAW_PARSE_PLPGSQL_ASSIGN2 */
-			MODE_PLPGSQL_ASSIGN3	/* RAW_PARSE_PLPGSQL_ASSIGN3 */
+			[RAW_PARSE_DEFAULT] = 0,
+			[RAW_PARSE_TYPE_NAME] = MODE_TYPE_NAME,
+			[RAW_PARSE_PLPGSQL_EXPR] = MODE_PLPGSQL_EXPR,
+			[RAW_PARSE_PLPGSQL_ASSIGN1] = MODE_PLPGSQL_ASSIGN1,
+			[RAW_PARSE_PLPGSQL_ASSIGN2] = MODE_PLPGSQL_ASSIGN2,
+			[RAW_PARSE_PLPGSQL_ASSIGN3] = MODE_PLPGSQL_ASSIGN3,
 		};
 
 		yyextra.have_lookahead = true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 527a2b27340..a63ea042edf 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -627,13 +627,13 @@ bool		in_hot_standby_guc;
  */
 const char *const GucContext_Names[] =
 {
-	 /* PGC_INTERNAL */ "internal",
-	 /* PGC_POSTMASTER */ "postmaster",
-	 /* PGC_SIGHUP */ "sighup",
-	 /* PGC_SU_BACKEND */ "superuser-backend",
-	 /* PGC_BACKEND */ "backend",
-	 /* PGC_SUSET */ "superuser",
-	 /* PGC_USERSET */ "user"
+	[PGC_INTERNAL] = "internal",
+	[PGC_POSTMASTER] = "postmaster",
+	[PGC_SIGHUP] = "sighup",
+	[PGC_SU_BACKEND] = "superuser-backend",
+	[PGC_BACKEND] = "backend",
+	[PGC_SUSET] = "superuser",
+	[PGC_USERSET] = "user",
 };
 
 StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1),
@@ -646,20 +646,20 @@ StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1),
  */
 const char *const GucSource_Names[] =
 {
-	 /* PGC_S_DEFAULT */ "default",
-	 /* PGC_S_DYNAMIC_DEFAULT */ "default",
-	 /* PGC_S_ENV_VAR */ "environment variable",
-	 /* PGC_S_FILE */ "configuration file",
-	 /* PGC_S_ARGV */ "command line",
-	 /* PGC_S_GLOBAL */ "global",
-	 /* PGC_S_DATABASE */ "database",
-	 /* PGC_S_USER */ "user",
-	 /* PGC_S_DATABASE_USER */ "database user",
-	 /* PGC_S_CLIENT */ "client",
-	 /* PGC_S_OVERRIDE */ "override",
-	 /* PGC_S_INTERACTIVE */ "interactive",
-	 /* PGC_S_TEST */ "test",
-	 /* PGC_S_SESSION */ "session"
+	[PGC_S_DEFAULT] = "default",
+	[PGC_S_DYNAMIC_DEFAULT] = "default",
+	[PGC_S_ENV_VAR] = "environment variable",
+	[PGC_S_FILE] = "configuration file",
+	[PGC_S_ARGV] = "command line",
+	[PGC_S_GLOBAL] = "global",
+	[PGC_S_DATABASE] = "database",
+	[PGC_S_USER] = "user",
+	[PGC_S_DATABASE_USER] = "database user",
+	[PGC_S_CLIENT] = "client",
+	[PGC_S_OVERRIDE] = "override",
+	[PGC_S_INTERACTIVE] = "interactive",
+	[PGC_S_TEST] = "test",
+	[PGC_S_SESSION] = "session",
 };
 
 StaticAssertDecl(lengthof(GucSource_Names) == (PGC_S_SESSION + 1),
@@ -670,96 +670,51 @@ StaticAssertDecl(lengthof(GucSource_Names) == (PGC_S_SESSION + 1),
  */
 const char *const config_group_names[] =
 {
-	/* UNGROUPED */
-	gettext_noop("Ungrouped"),
-	/* FILE_LOCATIONS */
-	gettext_noop("File Locations"),
-	/* CONN_AUTH_SETTINGS */
-	gettext_noop("Connections and Authentication / Connection Settings"),
-	/* CONN_AUTH_TCP */
-	gettext_noop("Connections and Authentication / TCP Settings"),
-	/* CONN_AUTH_AUTH */
-	gettext_noop("Connections and Authentication / Authentication"),
-	/* CONN_AUTH_SSL */
-	gettext_noop("Connections and Authentication / SSL"),
-	/* RESOURCES_MEM */
-	gettext_noop("Resource Usage / Memory"),
-	/* RESOURCES_DISK */
-	gettext_noop("Resource Usage / Disk"),
-	/* RESOURCES_KERNEL */

Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 07:25, Michael Paquier  wrote:
> About 0002, I can't help but notice pg_enc2icu_tbl and
> pg_enc2gettext_tb.  There are exceptions like MULE_INTERNAL, but is it
> possible to do better?

Attached is an updated patchset to also convert pg_enc2icu_tbl and
pg_enc2gettext_tbl. I converted pg_enc2gettext_tbl in a separate
commit, because it actually requires some code changes too.
From 4bdf8991d948a251cdade89dbacf121c64f420c7 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 21 Feb 2024 15:34:38 +0100
Subject: [PATCH v3 2/3] Use designated initializer syntax to improve
 readability

Use C99 designated initializer syntax for array elements of various
arrays. This is less verbose, more readable and less prone to typos or
ordering mistakes.
Akin to 74a730631065 and cc150596341e.
---
 src/backend/parser/parser.c |  12 +-
 src/backend/utils/misc/guc_tables.c | 187 +++-
 src/bin/pg_dump/pg_dump_sort.c  |  94 +++---
 src/common/encnames.c   | 154 +++
 src/common/relpath.c|   8 +-
 src/common/wchar.c  |  85 +++--
 src/include/mb/pg_wchar.h   |   6 +-
 7 files changed, 248 insertions(+), 298 deletions(-)

diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 9ec628ecbdf..3a1fa91c1b6 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -56,12 +56,12 @@ raw_parser(const char *str, RawParseMode mode)
 	{
 		/* this array is indexed by RawParseMode enum */
 		static const int mode_token[] = {
-			0,	/* RAW_PARSE_DEFAULT */
-			MODE_TYPE_NAME,		/* RAW_PARSE_TYPE_NAME */
-			MODE_PLPGSQL_EXPR,	/* RAW_PARSE_PLPGSQL_EXPR */
-			MODE_PLPGSQL_ASSIGN1,	/* RAW_PARSE_PLPGSQL_ASSIGN1 */
-			MODE_PLPGSQL_ASSIGN2,	/* RAW_PARSE_PLPGSQL_ASSIGN2 */
-			MODE_PLPGSQL_ASSIGN3	/* RAW_PARSE_PLPGSQL_ASSIGN3 */
+			[RAW_PARSE_DEFAULT] = 0,
+			[RAW_PARSE_TYPE_NAME] = MODE_TYPE_NAME,
+			[RAW_PARSE_PLPGSQL_EXPR] = MODE_PLPGSQL_EXPR,
+			[RAW_PARSE_PLPGSQL_ASSIGN1] = MODE_PLPGSQL_ASSIGN1,
+			[RAW_PARSE_PLPGSQL_ASSIGN2] = MODE_PLPGSQL_ASSIGN2,
+			[RAW_PARSE_PLPGSQL_ASSIGN3] = MODE_PLPGSQL_ASSIGN3,
 		};
 
 		yyextra.have_lookahead = true;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 527a2b27340..59904fd007f 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -627,13 +627,13 @@ bool		in_hot_standby_guc;
  */
 const char *const GucContext_Names[] =
 {
-	 /* PGC_INTERNAL */ "internal",
-	 /* PGC_POSTMASTER */ "postmaster",
-	 /* PGC_SIGHUP */ "sighup",
-	 /* PGC_SU_BACKEND */ "superuser-backend",
-	 /* PGC_BACKEND */ "backend",
-	 /* PGC_SUSET */ "superuser",
-	 /* PGC_USERSET */ "user"
+	[PGC_INTERNAL] = "internal",
+	[PGC_POSTMASTER] = "postmaster",
+	[PGC_SIGHUP] = "sighup",
+	[PGC_SU_BACKEND] = "superuser-backend",
+	[PGC_BACKEND] = "backend",
+	[PGC_SUSET] = "superuser",
+	[PGC_USERSET] = "user"
 };
 
 StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1),
@@ -646,20 +646,20 @@ StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1),
  */
 const char *const GucSource_Names[] =
 {
-	 /* PGC_S_DEFAULT */ "default",
-	 /* PGC_S_DYNAMIC_DEFAULT */ "default",
-	 /* PGC_S_ENV_VAR */ "environment variable",
-	 /* PGC_S_FILE */ "configuration file",
-	 /* PGC_S_ARGV */ "command line",
-	 /* PGC_S_GLOBAL */ "global",
-	 /* PGC_S_DATABASE */ "database",
-	 /* PGC_S_USER */ "user",
-	 /* PGC_S_DATABASE_USER */ "database user",
-	 /* PGC_S_CLIENT */ "client",
-	 /* PGC_S_OVERRIDE */ "override",
-	 /* PGC_S_INTERACTIVE */ "interactive",
-	 /* PGC_S_TEST */ "test",
-	 /* PGC_S_SESSION */ "session"
+	[PGC_S_DEFAULT] = "default",
+	[PGC_S_DYNAMIC_DEFAULT] = "default",
+	[PGC_S_ENV_VAR] = "environment variable",
+	[PGC_S_FILE] = "configuration file",
+	[PGC_S_ARGV] = "command line",
+	[PGC_S_GLOBAL] = "global",
+	[PGC_S_DATABASE] = "database",
+	[PGC_S_USER] = "user",
+	[PGC_S_DATABASE_USER] = "database user",
+	[PGC_S_CLIENT] = "client",
+	[PGC_S_OVERRIDE] = "override",
+	[PGC_S_INTERACTIVE] = "interactive",
+	[PGC_S_TEST] = "test",
+	[PGC_S_SESSION] = "session"
 };
 
 StaticAssertDecl(lengthof(GucSource_Names) == (PGC_S_SESSION + 1),
@@ -670,96 +670,51 @@ StaticAssertDecl(lengthof(GucSource_Names) == (PGC_S_SESSION + 1),
  */
 const char *const config_group_names[] =
 {
-	/* UNGROUPED */
-	gettext_noop("Ungrouped"),
-	/* FILE_LOCATIONS */
-	gettext_noop("File Locations"),
-	/* CONN_AUTH_SETTINGS */
-	gettext_noop("Connections and Authentication / Connection Settings"),
-	/* CONN_AUTH_TCP */
-	gettext_noop("Connections and Authentication / TCP Settings"),
-	/* CONN_AUTH_AUTH */
-	gettext_noop("Connections and Authentication / Authentication"),
-	/* CONN_AUTH_SSL */
-	gettext_noop("Connections and Authentication / SSL"),
-	/* RESOURCES_MEM */
-	gettext_noop("Resource Usage / Memory"),
-	/* RESOURCES_DISK */
-	gettext_noop("Resource Usage / Disk"),

Re: Injection points: some tools to wait and wake

2024-02-27 Thread Andrey M. Borodin
Hi,

> On 27 Feb 2024, at 16:08, Bertrand Drouvot  
> wrote:
> 
> On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote:
>> 
>> But, AFAICS, the purpose is the same: wait until event happened.
> 
> I think it's easier to understand the tests (I mean what the purpose of the
> injection points are) if we don't use an helper function. While the helper 
> function would make the test easier to read / cleaner, I think it may make 
> them
> more difficult to understand as 'await_injection_point' would probably be too
> generic.

For the record: I’m fine if there is no such function.
There will be at least one implementation of this function in every single test 
with waiting injection points.
That’s the case where we might want something generic.
What might the specifics be? The test can check that
 - conditions are logged
 - the injection point is reached in a specific backend (e.g. checkpointer)
etc.

I doubt that these specific checks are worth the cleanness of the test. And
sacrificing test readability in favour of teaching the reader what injection
points are sounds strange.


Best regards, Andrey Borodin.



Re: pread, pwrite, etc return ssize_t not int

2024-02-27 Thread Thomas Munro
Patches attached.

PS Correction to my earlier statement about POSIX: the traditional K&R
interfaces were indeed in the original POSIX.1 1988 but it was the
1990 edition (approximately coinciding with standard C) that adopted
void, size_t, const and invented ssize_t.


0001-Return-ssize_t-in-fd.c-I-O-functions.patch
Description: Binary data


0002-Fix-theoretical-overflow-in-Windows-pg_pread-pg_pwri.patch
Description: Binary data


Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-02-27 Thread Давыдов Виталий

Hi Amit,

Thank you for your interest in the discussion!

On Monday, February 26, 2024 16:24 MSK, Amit Kapila  
wrote:
 
> I think the reason is probably that when the WAL record for prepared is
> already flushed then what will be the idea of async commit here?

I think, the idea of async commit should be applied for both transactions:
PREPARE and COMMIT PREPARED, which are actually two separate local
transactions. For both these transactions we may call XLogSetAsyncXactLSN on
commit instead of XLogFlush when async commit is enabled. When I use async
commit, I mean to apply async commit to local transactions, not to a twophase
(prepared) transaction itself.

> At commit prepared, it seems we read prepare's WAL record, right? If so, it
> is not clear to me do you see a problem with a flush of commit_prepared or
> reading WAL for prepared or both of these.

The problem with reading WAL is due to async commit of PREPARE TRANSACTION
which saves 2PC in the WAL. At the moment of COMMIT PREPARED the WAL with
PREPARE TRANSACTION 2PC state may not be XLogFlush-ed yet. So, PREPARE
TRANSACTION should wait until its 2PC state is flushed.

I did some experiments with saving 2PC state in the local memory of logical 
replication worker and, I think, it worked and demonstrated much better 
performance. Logical replication worker utilized up to 100% CPU. I'm just 
concerned about possible problems with async commit for twophase transactions.

To be more specific, I've attached a patch to support async commit for 
twophase. It is not the final patch but it is presented only for discussion 
purposes. There were some attempts to save 2PC in memory in past but it was 
rejected. Now, there might be the second round to discuss it.

With best regards,
Vitaly

 
From 549f809fa122ca0842ec4bfc775afd08feee0d80 Mon Sep 17 00:00:00 2001
From: Vitaly Davydov 
Date: Tue, 27 Feb 2024 14:02:23 +0300
Subject: [PATCH] Add asynchronous commit support for 2PC

---
 src/backend/access/transam/twophase.c | 111 +-
 1 file changed, 108 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c6af8cfd7e..52f0853db8 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -109,6 +109,8 @@
 #include "utils/memutils.h"
 #include "utils/timestamp.h"
 
+#define POSTGRESQL_TWOPHASE_SUPPORT_ASYNC_COMMIT
+
 /*
  * Directory where Two-phase commit files reside within PGDATA
  */
@@ -163,6 +165,9 @@ typedef struct GlobalTransactionData
 	 */
 	XLogRecPtr	prepare_start_lsn;	/* XLOG offset of prepare record start */
 	XLogRecPtr	prepare_end_lsn;	/* XLOG offset of prepare record end */
+	void*   prepare_2pc_mem_data;
+	size_t  prepare_2pc_mem_len;
+	pid_t   prepare_2pc_proc;
 	TransactionId xid;			/* The GXACT id */
 
 	Oid			owner;			/* ID of user that executed the xact */
@@ -427,6 +432,9 @@ MarkAsPreparing(TransactionId xid, const char *gid,
 
 	MarkAsPreparingGuts(gxact, xid, gid, prepared_at, owner, databaseid);
 
+	Assert(gxact->prepare_2pc_mem_data == NULL);
+	Assert(gxact->prepare_2pc_proc == 0);
+
 	gxact->ondisk = false;
 
 	/* And insert it into the active array */
@@ -1129,6 +1137,8 @@ StartPrepare(GlobalTransaction gxact)
 	}
 }
 
+extern bool IsLogicalWorker(void);
+
 /*
  * Finish preparing state data and writing it to WAL.
  */
@@ -1167,6 +1177,37 @@ EndPrepare(GlobalTransaction gxact)
 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
  errmsg("two-phase state file maximum length exceeded")));
 
+	Assert(gxact->prepare_2pc_mem_data == NULL);
+	Assert(gxact->prepare_2pc_proc == 0);
+
+	if (IsLogicalWorker())
+	{
+		size_t			len = 0;
+		size_t			offset = 0;
+
+		for (record = records.head; record != NULL; record = record->next)
+			len += record->len;
+
+		if (len > 0)
+		{
+			MemoryContext	oldmemctx;
+
+			oldmemctx = MemoryContextSwitchTo(TopMemoryContext);
+
+			gxact->prepare_2pc_mem_data = palloc(len);
+			gxact->prepare_2pc_mem_len = len;
+			gxact->prepare_2pc_proc = getpid();
+
+			for (record = records.head; record != NULL; record = record->next)
+			{
+				memcpy((char *) gxact->prepare_2pc_mem_data + offset, record->data, record->len);
+				offset += record->len;
+			}
+
+			MemoryContextSwitchTo(oldmemctx);
+		}
+	}
+
 	/*
 	 * Now writing 2PC state data to WAL. We let the WAL's CRC protection
 	 * cover us, so no need to calculate a separate CRC.
@@ -1202,8 +1243,24 @@ EndPrepare(GlobalTransaction gxact)
    gxact->prepare_end_lsn);
 	}
 
+#if !defined(POSTGRESQL_TWOPHASE_SUPPORT_ASYNC_COMMIT)
+
 	XLogFlush(gxact->prepare_end_lsn);
 
+#else
+
+	if (synchronous_commit > SYNCHRONOUS_COMMIT_OFF)
+	{
+		/* Flush XLOG to disk */
+		XLogFlush(gxact->prepare_end_lsn);
+	}
+	else
+	{
+		XLogSetAsyncXactLSN(gxact->prepare_end_lsn);
+	}
+
+#endif
+
 	/* If we crash now, we have prepared: WAL replay will fix things */
 
 	/* Store record's start location to read that later on Commit */
@@ 

Re: clarify equalTupleDescs()

2024-02-27 Thread jian he
On Mon, Feb 12, 2024 at 7:47 PM Peter Eisentraut  wrote:
>
>
> In principle, hashRowType() could process all the fields that
> equalRowTypes() does.  But since it's only a hash function, it doesn't
> have to be perfect.  (This is also the case for the current
> hashTupleDesc().)  I'm not sure where the best tradeoff is.

That's where my confusion comes from.
hashRowType is used in record_type_typmod_hash.
record_type_typmod_hash is within assign_record_type_typmod.

in assign_record_type_typmod:

if (RecordCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
ctl.keysize = sizeof(TupleDesc); /* just the pointer */
ctl.entrysize = sizeof(RecordCacheEntry);
ctl.hash = record_type_typmod_hash;
ctl.match = record_type_typmod_compare;
RecordCacheHash = hash_create("Record information cache", 64,
  &ctl,
  HASH_ELEM | HASH_FUNCTION | HASH_COMPARE);
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
}
/*
* Find a hashtable entry for this tuple descriptor. We don't use
* HASH_ENTER yet, because if it's missing, we need to make sure that all
* the allocations succeed before we create the new entry.
*/
recentry = (RecordCacheEntry *) hash_search(RecordCacheHash,
&tupDesc,
HASH_FIND, &found);

Based on the comments in hash_create, the above hash_search call would first
use record_type_typmod_hash to locate candidate entries in the hash table,
and then use record_type_typmod_compare to compare the given tupDesc with
those candidates.

Is this how the hash_search in assign_record_type_typmod works?

equalRowTypes processes more fields than hashRowType, and hashRowType's
comments mention equalRowTypes. Maybe we should add a comment to hashRowType
explaining why hashing only natts, tdtypeid, and atttypid is fine.
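To illustrate the question above: conceptually, the hash callback only narrows the lookup to a bucket, and the compare callback decides actual equality among colliding entries. Here is a standalone sketch of that division of labor (my own toy open-addressing table, not the dynahash code itself):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy table: hash picks the starting bucket, compare confirms the match. */
typedef struct { const char *key; int value; int used; } Entry;
#define NBUCKETS 8

static uint32_t my_hash(const char *k)      /* plays the HASH_FUNCTION role */
{
    uint32_t h = 5381;
    while (*k)
        h = h * 33 + (unsigned char) *k++;
    return h;
}

static int my_compare(const char *a, const char *b) /* plays the HASH_COMPARE role */
{
    return strcmp(a, b);        /* 0 means "same key" */
}

static Entry *hash_find(Entry *table, const char *key)
{
    uint32_t start = my_hash(key) % NBUCKETS;   /* hash narrows the search... */

    for (uint32_t probe = 0; probe < NBUCKETS; probe++)
    {
        Entry *e = &table[(start + probe) % NBUCKETS];

        if (!e->used)
            return NULL;                        /* hit an empty slot: not present */
        if (my_compare(e->key, key) == 0)       /* ...compare decides equality */
            return e;
    }
    return NULL;
}

static void hash_insert(Entry *table, const char *key, int value)
{
    uint32_t i = my_hash(key) % NBUCKETS;

    while (table[i].used)                       /* linear probing on collision */
        i = (i + 1) % NBUCKETS;
    table[i].key = key;
    table[i].value = value;
    table[i].used = 1;
}
```

So an imperfect hash only costs extra compare calls on collisions; it never returns a wrong entry, which is why hashRowType can cover fewer fields than equalRowTypes.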




Re: Injection points: some tools to wait and wake

2024-02-27 Thread Bertrand Drouvot
Hi,

On Tue, Feb 27, 2024 at 11:00:10AM +0500, Andrey M. Borodin wrote:
> 
> 
> > On 27 Feb 2024, at 04:29, Michael Paquier  wrote:
> > 
> > For
> > example, the test just posted here does not rely on that:
> > https://www.postgresql.org/message-id/zdyzya4yrnapw...@ip-10-97-1-34.eu-west-3.compute.internal
> 
> Instead, that test is scanning logs
> 
> + # Note: $node_primary->wait_for_replay_catchup($node_standby) would be
> + # hanging here due to the injection point, so check the log instead.+
> + my $terminated = 0;
> + for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
> + {
> + if ($node_standby->log_contains(
> + 'terminating process .* to release replication slot 
> \"injection_activeslot\"', $logstart))
> + {
> + $terminated = 1;
> + last;
> + }
> + usleep(100_000);
> + }
> 
> But, AFAICS, the purpose is the same: wait until event happened.

I think it's easier to understand the tests (I mean, what the purpose of the
injection points is) if we don't use a helper function. While a helper 
function would make the tests easier to read / cleaner, I think it may make them
more difficult to understand, as 'await_injection_point' would probably be too
generic.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Improve readability by using designated initializers when possible

2024-02-27 Thread Jelte Fennema-Nio
On Tue, 27 Feb 2024 at 07:25, Michael Paquier  wrote:
>
> On Mon, Feb 26, 2024 at 05:00:13PM +0800, Japin Li wrote:
> > On Mon, 26 Feb 2024 at 16:41, jian he  wrote:
> >> obvious typo errors.
>
> These would cause compilation failures.  Saying that, this is a very
> nice cleanup, so I've fixed these and applied the patch after checking
> that the one-one replacements were correct.

Sorry about those search/replace mistakes. Not sure how that happened.
Thanks for committing :)




Re: Synchronizing slots from primary to standby

2024-02-27 Thread Bertrand Drouvot
Hi,

On Tue, Feb 27, 2024 at 06:17:44PM +1100, Peter Smith wrote:
> +static bool
> +validate_standby_slots(char **newval)
> +{
> + char*rawname;
> + List*elemlist;
> + ListCell   *lc;
> + bool ok;
> +
> + /* Need a modifiable copy of string */
> + rawname = pstrdup(*newval);
> +
> + /* Verify syntax and parse string into a list of identifiers */
> + ok = SplitIdentifierString(rawname, ',', &elemlist);
> +
> + if (!ok)
> + {
> + GUC_check_errdetail("List syntax is invalid.");
> + }
> +
> + /*
> + * If the replication slots' data have been initialized, verify if the
> + * specified slots exist and are logical slots.
> + */
> + else if (ReplicationSlotCtl)
> + {
> + foreach(lc, elemlist)
> 
> 6a.
> So, if the ReplicationSlotCtl is NULL, it is possible to return
> ok=true without ever checking if the slots exist or are of the correct
> kind. I am wondering what are the ramifications of that. -- e.g.
> assuming names are OK when maybe they aren't OK at all. AFAICT this
> works because it relies on getting subsequent WARNINGS when calling
> FilterStandbySlots(). If that is correct then maybe the comment here
> can be enhanced to say so.
> 
> Indeed, if it works like that, now I am wondering do we need this for
> loop validation at all. e.g. it seems just a matter of timing whether
> we get ERRORs validating the GUC here, or WARNINGS later in the
> FilterStandbySlots. Maybe we don't need the double-checking and it is
> enough to check in FilterStandbySlots?

Good point, I have the feeling that it is enough to check in 
FilterStandbySlots().

Indeed, if the value is syntactically correct, then I think its actual value
only "really" matters when logical decoding is starting/running; does checking
it provide any additional benefit "before" that?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Fix for edge case in date_bin() function

2024-02-27 Thread Moaaz Assali
Hello Daniel,

I have added a test case for this in timestamp.sql and timestamp.out, and
tests pass when using the bug fix patch in the first email.
I have attached a new patch in this email below with the new tests only
(doesn't include the bug fix).

P.S. I forgot to CC the mailing list in my previous reply. This is just a
copy of it.

Best regards,
Moaaz Assali

On Tue, Feb 27, 2024 at 12:48 PM Daniel Gustafsson  wrote:

> > On 27 Feb 2024, at 09:42, Moaaz Assali  wrote:
>
> > To account for this edge, we simply add another condition in the if
> statement to not perform the subtraction by one stride interval if the time
> difference is divisible by the stride.
>
> I only skimmed the patch, but I recommend adding a testcase to it to keep
> the
> regression from reappearing.  src/test/regress/sql/timestamp.sql might be a
> good candidate testsuite.
>
> --
> Daniel Gustafsson
>
>


v1-0002-Regression-test-in-timestamp.sql-for-date_bin.patch
Description: Binary data


Re: table inheritance versus column compression and storage settings

2024-02-27 Thread Ashutosh Bapat
On Fri, Feb 23, 2024 at 6:05 PM Ashutosh Bapat
 wrote:
>
> On Tue, Feb 20, 2024 at 3:51 PM Peter Eisentraut  wrote:
> >
> > I have reverted the patch for now (and re-opened the commitfest entry).
> > We should continue to work on this and see if we can at least try to get
> > the pg_dump test coverage suitable.
> >
>
> I have started a separate thread for dump/restore test. [1].
>
> Using that test, I found an existing bug:
> Consider
> CREATE TABLE cminh6 (f1 TEXT);
> ALTER TABLE cminh6 INHERIT cmparent1;
> f1 remains without compression even after the INHERIT, per the current code.
> But pg_dump dumps it out as
> CREATE TABLE cminh6 (f1 TEXT) INHERITS (cmparent1)
> Because of this, after restoring, cminh6::f1 inherits the compression of
> cmparent1. So before the dump cminh6::f1 has no compression, and after the
> restore it has compression.
>
> I am not sure how to fix this. We want inheritance children to have
> their own compression. So ALTER TABLE ... INHERIT ... not passing
> compression onto the child is fine. CREATE TABLE ... INHERITS ... passing
> compression onto the child being created also looks fine, since that's
> what we do with other attributes. The only solution I see is to introduce
> "none" as a special compression method to indicate "no compression"
> and store that instead of NULL in the attcompression column. That looks
> ugly.

Specifying DEFAULT as the COMPRESSION method instead of inventing "none"
works. We should do it only for INHERITed tables.

>
> Similar is the case with storage.
>

Similar to compression, for inherited tables we have to output the STORAGE
clause even if it's the default.
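For the record, that would mean pg_dump emitting something like this for the
inherited child (hypothetical dump output; the exact syntax and whether
COMPRESSION DEFAULT is accepted there is still to be worked out):

```sql
-- Hypothetical dump output: an explicit DEFAULT keeps f1 from silently
-- picking up cmparent1's compression method on restore.
CREATE TABLE cminh6 (
    f1 text COMPRESSION DEFAULT
)
INHERITS (cmparent1);
```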

-- 
Best Wishes,
Ashutosh Bapat




Re: Logging parallel worker draught

2024-02-27 Thread Benoit Lobréau




On 2/25/24 23:32, Peter Smith wrote:

Also, I don't understand how the word "draught" (aka "draft") makes
sense here -- I assume the intended word was "drought" (???).


yes, that was the intent, sorry about that. English is not my native 
language and I was convinced the spelling was correct.


--
Benoit Lobréau
Consultant
http://dalibo.com




RE: speed up a logical replica setup

2024-02-27 Thread Hayato Kuroda (Fujitsu)
Dear Euler,

> Sorry, which pg_rewind option did you mention? I cannot find.
> IIUC, -l is an only option which can accept the path, but it is not related 
> with us.

Sorry, I found [1]. I was confused with pg_resetwal, but my opinion has not
really changed. The reason why --config-file exists is that pg_rewind requires
that the target be stopped, which is different from the current
pg_createsubscriber. So I still do not like adding options.

[1]: 
https://www.postgresql.org/docs/current/app-pgrewind.html#:~:text=%2D%2Dconfig%2Dfile%3Dfilename
[2]:
```
The target server must be shut down cleanly before running pg_rewind
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-02-27 Thread Dilip Kumar
On Mon, Feb 26, 2024 at 9:46 PM Alvaro Herrera 
wrote:

> On 2024-Feb-23, Dilip Kumar wrote:
>
> +  
> +   For each SLRU area that's part of the core server,
> +   there is a configuration parameter that controls its size, with the
> suffix
> +   _buffers appended.  For historical
> +   reasons, the names are not exact matches, but Xact
> +   corresponds to transaction_buffers and the rest
> should
> +   be obvious.
> +   
> +  
>
> I think I would like to suggest renaming the GUCs to have the _slru_ bit
> in the middle:
>
> +# - SLRU Buffers (change requires restart) -
> +
> +#commit_timestamp_slru_buffers = 0  # memory for pg_commit_ts (0
> = auto)
> +#multixact_offsets_slru_buffers = 16# memory for
> pg_multixact/offsets
> +#multixact_members_slru_buffers = 32# memory for
> pg_multixact/members
> +#notify_slru_buffers = 16   # memory for pg_notify
> +#serializable_slru_buffers = 32 # memory for pg_serial
> +#subtransaction_slru_buffers = 0# memory for pg_subtrans (0 =
> auto)
> +#transaction_slru_buffers = 0   # memory for pg_xact (0 =
> auto)
>
> and the pgstat_internal.h table:
>
> static const char *const slru_names[] = {
> "commit_timestamp",
> "multixact_members",
> "multixact_offsets",
> "notify",
> "serializable",
> "subtransaction",
> "transaction",
> "other" /* has to be last
> */
> };
>
> This way they match perfectly.
>

Yeah, I think this looks fine to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: Logging parallel worker draught

2024-02-27 Thread Benoit Lobréau

On 2/25/24 20:13, Tomas Vondra wrote:
> 1) name of the GUC
...
> 2) logging just the failures provides an incomplete view
> log_parallel_workers = {none | failures | all}>
> where "failures" only logs when at least one worker fails to start, and
> "all" logs everything.
>
> AFAIK Sami made the same observation/argument in his last message [1].

I like the name and the different levels you propose. I was initially 
thinking that the overall picture would be better summarized (and easier 
to implement) with pg_stat_statements. But with the granularity you 
propose, the choice is in the hands of the DBA, which is great and 
provides more options when we don't have full control of the installation.


> 3) node-level or query-level?
...
> There's no value in having information about every individual temp file,
> because we don't even know which node produced it. It'd be perfectly
> sufficient to log some summary statistics (number of files, total space,
> ...), either for the query as a whole, or perhaps per node (because
> that's what matters for work_mem). So I don't think we should mimic this
> just because log_temp_files does this.

I must admit that, given my (poor) technical level, I went for the 
easiest way.


If we go for the three levels you proposed, "node-level" makes even less 
sense (and has great "potential" for spam).


I can see one downside to the "query-level" approach: it might be more 
difficult to give information in cases where the query doesn't end 
normally. It's sometimes useful to have hints about what was going wrong 
before a query was cancelled or crashed, which log_temp_files kinda does.


> I don't know how difficult would it be to track/collect this information
> for the whole query.

I am a bit worried this will be a little too much for me, given the time and 
the knowledge gap I have (both in C and PostgreSQL internals). I'll try 
to look anyway.


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Fix for edge case in date_bin() function

2024-02-27 Thread Daniel Gustafsson
> On 27 Feb 2024, at 09:42, Moaaz Assali  wrote:

> To account for this edge, we simply add another condition in the if statement 
> to not perform the subtraction by one stride interval if the time difference 
> is divisible by the stride.

I only skimmed the patch, but I recommend adding a testcase to it to keep the
regression from reappearing.  src/test/regress/sql/timestamp.sql might be a
good candidate testsuite.

--
Daniel Gustafsson





Fix for edge case in date_bin() function

2024-02-27 Thread Moaaz Assali
Hello,

The date_bin() function has a bug where it returns an incorrect binned date
when both of the following are true:
1) the origin timestamp is before the source timestamp
2) the origin timestamp is exactly equivalent to some valid binned date in
the set of binned dates that date_bin() can return given a specific stride
and source timestamp.

For example, consider the following function call:
date_bin('30 minutes'::interval, '2024-01-01 15:00:00'::timestamp,
'2024-01-01 17:00:00'::timestamp);

This function call will return '2024-01-01 14:30:00' instead of '2024-01-01
15:00:00', despite '2024-01-01 15:00:00' being the valid binned date for the
timestamp '2024-01-01 15:00:00'. This commit fixes that by editing the
timestamp_bin() function in the timestamp.c file.

The reason this happens is that the code in timestamp_bin() that allows for
correct date binning when source timestamp < origin timestamp subtracts one
stride in all cases. However, that is not valid when the source timestamp is
exactly equal to a valid binned date, as in the example mentioned above.

To account for this edge case, we simply add another condition to the if
statement so that it does not subtract one stride interval when the time
difference is divisible by the stride.
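For illustration, the binning arithmetic and the extra condition can be
sketched in standalone form like this (my own function name and free-standing
shape, all values in microseconds — not the actual PostgreSQL code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone sketch of the date_bin() arithmetic with the proposed fix.
 * All values are in microseconds relative to some epoch.
 */
static int64_t
bin_timestamp(int64_t stride_usecs, int64_t timestamp, int64_t origin)
{
    int64_t tm_diff = timestamp - origin;
    int64_t tm_delta = tm_diff - tm_diff % stride_usecs;

    /*
     * C's % truncates toward zero, so a negative difference that is NOT on
     * a bin boundary must be pushed one stride further back.  The added
     * "tm_diff % stride_usecs != 0" condition is the fix: without it, a
     * source timestamp sitting exactly on a bin boundary before the origin
     * was also pushed back by one stride.
     */
    if (tm_diff < 0 && tm_diff % stride_usecs != 0)
        tm_delta -= stride_usecs;

    return origin + tm_delta;
}
```

With the origin at 17:00 and the source at 15:00 (a difference of exactly
-4 strides of 30 minutes), the remainder is zero, so no extra stride is
subtracted and 15:00 is returned as expected.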

Best regards,
Moaaz Assali


v1-0001-Fix-for-edge-case-in-date_bin-function.patch
Description: Binary data


RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-27 Thread Hayato Kuroda (Fujitsu)
> PSA the patch for implementing it. It is basically same as Ian's one.
> However, this patch still cannot satisfy the condition 3).
> 
> `pg_basebackup -D data_N2 -d "user=postgres" -R`
> -> dbname would not be appeared in primary_conninfo.
> 
> This is because `if (connection_string)` case in GetConnection() explicy 
> override
> a dbname to "replication". I've tried to add a dummy entry {"dbname", NULL} 
> pair
> before the overriding, but it is no-op. Because The replacement of the dbname 
> in
> pqConnectOptions2() would be done only for the valid (=lastly specified)
> connection options.

Oh, this patch missed the straightforward case:

pg_basebackup -D data_N2 -d "user=postgres dbname=replication" -R
-> dbname would not appear in primary_conninfo.

So I think it cannot be applied as-is. Sorry for sharing the bad patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-27 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> We do append dbname=replication even in libpqrcv_connect for .pgpass
> lookup as mentioned in comments. See below:
> libpqrcv_connect()
> {
> 
> else
> {
> /*
> * The database name is ignored by the server in replication mode,
> * but specify "replication" for .pgpass lookup.
> */
> keys[++i] = "dbname";
> vals[i] = "replication";
> }
> ...
> }

OK. So we must add the value for authorization purposes, right?
I think it should be documented in GetConnection() as well.

> I think as part of this effort we should just add dbname to
> primary_conninfo written in postgresql.auto.conf file. As above, the
> question is valid whether we should do it just for 17 or for prior
> versions. Let's discuss more on that. I am not sure of the use case
> for versions before 17 but commit cca97ce6a665 mentioned that some
> middleware or proxies might however need to know the dbname to make
> the correct routing decision for the connection. Does that apply here
> as well? If so, we can do it and update the docs, otherwise, let's do
> it just for 17.

Hmm, I might have lost track of your requirements again. So, we must satisfy
all of the below, right?
1) Add the {"dbname", "replication"} key-value pair to look up the .pgpass file 
correctly.
2) If -R is given, output the dbname=xxx value to be used by the slotsync worker.
3) If the dbname is not given in the connection string, the same string as the
   username must be used, to keep the libpq connection rule.
4) No need to add dbname=replication pair 

PSA the patch implementing it. It is basically the same as Ian's one.
However, this patch still cannot satisfy condition 3):

`pg_basebackup -D data_N2 -d "user=postgres" -R`
-> dbname would not appear in primary_conninfo.

This is because the `if (connection_string)` case in GetConnection() explicitly
overrides the dbname to "replication". I've tried to add a dummy
{"dbname", NULL} pair before the overriding, but it is a no-op, because the
replacement of the dbname in pqConnectOptions2() is done only for the valid
(= lastly specified) connection options.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


pg_basebackup-write-dbname.v0002.patch
Description: pg_basebackup-write-dbname.v0002.patch


RE: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-02-27 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> When dbname is NULL or not given, it defaults to username. This
> follows the specs of the connection string. See (dbname #
> The database name. Defaults to be the same as the user name...) [1].
> Your patch breaks that specs, so I don't think it is correct.

I proposed the point because I thought pg_basebackup basically wants to do
physical replication. But if the general libpq rule takes precedence, we
should not apply my add_NULL_check.txt. Let's forget it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: Potential issue in ecpg-informix decimal converting functions

2024-02-27 Thread Daniel Gustafsson
> On 27 Feb 2024, at 06:08, Michael Paquier  wrote:
> 
> On Mon, Feb 26, 2024 at 12:28:51AM +0100, Daniel Gustafsson wrote:
>> Yeah, I think this is for HEAD only, especially given the lack of complaints
>> against backbranches.
> 
> Daniel, are you planning to look at that?  I haven't done any detailed
> lookup, but would be happy to do so it that helps.

I have it on my TODO for the upcoming CF.

--
Daniel Gustafsson