Re: Disable LLVM bitcode generation with pgxs.mk framework.

2024-03-12 Thread Peter Eisentraut

On 12.03.24 14:38, Xing Guo wrote:

When the PostgreSQL server is configured with --with-llvm, the pgxs.mk
framework will generate LLVM bitcode for extensions automatically.
Sometimes, I don't want to generate bitcode for some extensions. I can
turn off this feature by specifying with_llvm=0 in the make command.

```
make with_llvm=0
```

Would it be possible to add a new switch in the pgxs.mk framework to
allow users to disable this feature? E.g., the Makefile looks like:

```
WITH_LLVM=no
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
```


Can't you just put the very same with_llvm=0 into the makefile?
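For the archives, one way that might work is the following sketch. It relies on GNU make's `override` directive so the assignment survives the later include of Makefile.global; this is an assumption about make semantics, not documented pgxs behavior, and is untested:

```
# Hypothetical extension Makefile: disable bitcode generation.
# 'override' prevents the included Makefile.global from re-assigning
# with_llvm back to the value chosen at configure time.
override with_llvm := 0

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```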





Re: MERGE ... RETURNING

2024-03-12 Thread jian he
Hi, some minor issues:


[ WITH with_query [, ...] ]
MERGE INTO [ ONLY ] target_table_name [ * ] [ [ AS ]
target_alias ]
USING data_source ON
join_condition
when_clause [...]
[ RETURNING * | output_expression [ [ AS ]
output_name ] [, ...] ]

here the "WITH" part should have "[ RECURSIVE ]" like:
[ WITH [ RECURSIVE ] with_query [, ...] ]

+  An expression to be computed and returned by the MERGE
+  command after each row is merged.  The expression can use any columns of
+  the source or target tables, or the 
+  function to return additional information about the action executed.
+ 
should be:
+  An expression to be computed and returned by the MERGE
+  command after each row is changed.
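For context, the documented behaviour looks roughly like this (a hypothetical example with invented table and column names, assuming the patch's action-reporting function is spelled merge_action()):

```
-- Return each changed row plus the action (INSERT/UPDATE/DELETE) taken.
MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED THEN
    INSERT (id, val) VALUES (s.id, s.val)
RETURNING merge_action(), t.id, t.val;
```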


one minor issue:
add
`
table sq_target;
table sq_source;
`
before `-- RETURNING` in src/test/regress/sql/merge.sql, so we can
easily understand the tests.




Re: meson vs tarballs

2024-03-12 Thread Andrew Dunstan



On 2024-03-13 We 02:31, Andrew Dunstan wrote:


On 2024-03-13 We 02:22, Peter Eisentraut wrote:

On 13.03.24 07:11, Andrew Dunstan wrote:
I and several colleagues have just been trying to build from a 
tarball with meson.


That seems pretty awful and unfriendly and I didn't see anything 
about it in the docs.


At https://www.postgresql.org/docs/16/install-requirements.html it says:

"""
Alternatively, PostgreSQL can be built using Meson. This is currently 
experimental and only works when building from a Git checkout (not 
from a distribution tarball).

"""



Ah! Darn, I missed that. Thanks.





Of course, when release 17 comes out this had better not be the case, 
since we have removed the custom Windows build system.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Support json_errdetail in FRONTEND builds

2024-03-12 Thread Michael Paquier
On Tue, Mar 12, 2024 at 08:38:43PM -0400, Andrew Dunstan wrote:
> On 2024-03-12 Tu 14:43, Jacob Champion wrote:
>> Two notes that I wanted to point out explicitly:
>> - On the other thread, Daniel contributed a destroyStringInfo()
>> counterpart for makeStringInfo(), which is optional but seemed useful
>> to include here.
> 
> yeah, although maybe worth a different patch.

-   {
-   pfree(lex->strval->data);
-   pfree(lex->strval);
-   }
+   destroyStringInfo(lex->strval);

I've wanted that a few times, FWIW.  I would do a split, mainly for
clarity.
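For readers following along, the helper is essentially the inverse of makeStringInfo(); a minimal sketch based on the two pfree() calls it replaces above (not necessarily the exact committed code):

```
/* Free a StringInfo allocated with makeStringInfo(): first the character
 * buffer, then the StringInfoData struct itself. */
void
destroyStringInfo(StringInfo str)
{
    pfree(str->data);
    pfree(str);
}
```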

>> - After this patch, if a frontend client calls json_errdetail()
>> without calling freeJsonLexContext(), it will leak memory.
>
> Not too concerned about this:
> 
> 1. we tend to be a bit more relaxed about that in frontend programs,
> especially those not expected to run for long times and especially on error
> paths, where in many cases the program will just exit anyway.
> 
> 2. the fix is simple where it's needed.

This does not stress me much, either.  I can see that OAuth introduces
at least two calls of json_errdetail() in libpq, and that would matter
there.

 case JSON_EXPECTED_STRING:
-return psprintf(_("Expected string, but found \"%s\"."),
-extract_token(lex));
+appendStringInfo(lex->errormsg,
+ _("Expected string, but found \"%.*s\"."),
+ toklen, lex->token_start);

Maybe this could be wrapped in a macro to avoid copying around
token_start and toklen for all the error cases.
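One hypothetical shape for such a macro (the name is invented here; it assumes toklen and lex are in scope at every call site, as they are in the error cases quoted above):

```
/* Centralize the repeated "quote the current token" error pattern. */
#define json_token_error(lex, format) \
    appendStringInfo((lex)->errormsg, _(format), \
                     toklen, (lex)->token_start)

/* The case above would then become:
 *     json_token_error(lex, "Expected string, but found \"%.*s\".");
 */
```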
--
Michael




Re: meson vs tarballs

2024-03-12 Thread Andrew Dunstan



On 2024-03-13 We 02:22, Peter Eisentraut wrote:

On 13.03.24 07:11, Andrew Dunstan wrote:
I and several colleagues have just been trying to build from a 
tarball with meson.


That seems pretty awful and unfriendly and I didn't see anything 
about it in the docs.


At https://www.postgresql.org/docs/16/install-requirements.html it says:

"""
Alternatively, PostgreSQL can be built using Meson. This is currently 
experimental and only works when building from a Git checkout (not 
from a distribution tarball).

"""



Ah! Darn, I missed that. Thanks.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-03-12 Thread vignesh C
On Wed, 13 Mar 2024 at 10:12, Zhijie Hou (Fujitsu) wrote:
>
> On Wednesday, March 13, 2024 11:49 AM, vignesh C wrote:
> > On Tue, 12 Mar 2024 at 09:34, Ajin Cherian  wrote:
> > >
> > >
> > >
> > > On Tue, Mar 12, 2024 at 2:59 PM vignesh C  wrote:
> > >>
> > >>
> > >> Thanks, I have created the following Commitfest entry for this:
> > >> https://commitfest.postgresql.org/47/4816/
> > >>
> > >> Regards,
> > >> Vignesh
> > >
> > >
> > > Thanks for the patch, I have verified that the fix works well by 
> > > following the
> > steps mentioned to reproduce the problem.
> > > Reviewing the patch, it seems good and is well documented. Just one minor
> > comment I had was probably to change the name of the variable
> > table_states_valid to table_states_validity. The current name made sense 
> > when
> > it was a bool, but now that it is a tri-state enum, it doesn't fit well.
> >
> > Thanks for reviewing the patch, the attached v6 patch has the changes for 
> > the
> > same.
>
> Thanks for the patches.
>
> I saw a recent similar BF error[1] which seems related to the issue that 0001
> patch is trying to solve. i.e. The table sync worker is somehow not started
> after refreshing the publication on the subscriber. I didn't see other 
> related ERRORs in
> the log, so I think the reason is the same as the one being discussed in this
> thread, which is the table state invalidation got lost. And the 0001 patch
> looks good to me.
>
> For 0002, instead of avoiding the latch reset, is it possible to let the
> logical rep worker wake up the launcher once after attaching?

Waking up of the launch process uses the same latch that is used for
subscription creation/modification and apply worker process exit. As
the handling of this latch for subscription creation/modification and
worker process exit can be done only by ApplyLauncherMain, we will not
be able to reset the latch in WaitForReplicationWorkerAttach. I feel
waking up the launcher process might not help in this case as
currently we will not be able to differentiate between worker
attached, subscription creation/modification and apply worker process
exit.

Regards,
Vignesh




Re: meson vs tarballs

2024-03-12 Thread Peter Eisentraut

On 13.03.24 07:11, Andrew Dunstan wrote:
I and several colleagues have just been trying to build from a tarball 
with meson.


That seems pretty awful and unfriendly and I didn't see anything about 
it in the docs.


At https://www.postgresql.org/docs/16/install-requirements.html it says:

"""
Alternatively, PostgreSQL can be built using Meson. This is currently 
experimental and only works when building from a Git checkout (not from 
a distribution tarball).

"""





Re: Add basic tests for the low-level backup method.

2024-03-12 Thread Michael Paquier
On Wed, Mar 13, 2024 at 01:12:28PM +1300, David Steele wrote:
> On 2/29/24 16:55, Michael Paquier wrote:
>> +unlink("$backup_dir/postmaster.pid")
>> +or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid");
>> +unlink("$backup_dir/postmaster.opts")
>> +or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts");
>> +unlink("$backup_dir/global/pg_control")
>> +or BAIL_OUT("unable to unlink $backup_dir/global/pg_control");
>> 
>> RELCACHE_INIT_FILENAME as well?
> 
> I'm not trying to implement the full exclusion list here, just enough to get
> the test working. Since exclusions are optional according to the docs I
> don't think we need them for a valid tests.

Okay.  Fine by me at the end.

>> +# Rather than writing out backup_label, try to recover the backup without
>> +# backup_label to demonstrate that recovery will not work correctly without 
>> it,
>> +# i.e. the canary table will be missing and the cluster will be corrupt. 
>> Provide
>> +# only the WAL segment that recovery will think it needs.
>> 
>> Okay, why not.  No objections to this addition.  I am a bit surprised
>> that this is not scanning some of the logs produced by the startup
>> process for particular patterns.
> 
> Not sure what to look for here. There are no distinct messages for crash
> recovery. Perhaps there should be?

The closest thing I can think of here would be "database system was
not properly shut down; automatic recovery in progress" as we don't
have InArchiveRecovery, after checking that the canary is missing.  If
you don't like this suggestion, feel free to say so, of course :)

>> +# Save backup_label into the backup directory and recover using the 
>> primary's
>> +# archive. This time recovery will succeed and the canary table will be
>> +# present.
>> 
>> Here are well, I think that we should add some log_contains() with
>> pre-defined patterns to show that recovery has completed the way we
>> want it with a backup_label up to the end-of-backup record.
> 
> Sure, I added a check for the new log message when recovering with a
> backup_label.

+ok($node_replica->log_contains('completed backup recovery with redo LSN'),
+   'verify backup recovery performed with backup_label');

Okay for this choice.  I was thinking first about "starting backup
recovery with redo LSN", closer to the area where the backup_label is
read.
--
Michael




Re: speed up a logical replica setup

2024-03-12 Thread Euler Taveira
On Fri, Mar 8, 2024, at 6:44 AM, Hayato Kuroda (Fujitsu) wrote:
> Hmm, right. I considered the below improvements. Tomas and Euler, what
> do you think?

I'm posting a new patchset v28.

I changed the way the check function works. From a usability perspective,
it is better to test all conditions and report all errors (if any) at once.
This avoids multiple executions in dry-run mode just to figure out all of
the issues in the initial phase. I also included tests for it using
Shlok's idea [1], although I didn't use v27-0004.

Shlok [1] reported that it was failing on Windows since the socket-directory
option was added. I added a fix for it.

Tomas pointed out that the documentation [2] does not provide a good high-level
explanation of pg_createsubscriber. I expanded the Description section and
moved the prerequisites to the Notes section. The prerequisites were grouped
into target and source conditions, each in its own paragraph instead of a
list. That seems more in line with the style of some other applications.

As I said in a previous email [3], I removed the retain option.


[1] 
https://www.postgresql.org/message-id/canhcyeu4q3dwh9ax9bpojcm4ebbhyfenegoaz8xfgyjmcpz...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/6423dfeb-a729-45d3-b71e-7bf1b3adb0c9%40enterprisedb.com
[3] 
https://www.postgresql.org/message-id/e390e35e-508e-4eb8-92e4-e6b066407a41%40app.fastmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


v28-0001-pg_createsubscriber-creates-a-new-logical-replic.patch.gz
Description: application/gzip


v28-0002-Use-last-replication-slot-position-as-replicatio.patch.gz
Description: application/gzip


v28-0003-port-replace-int-with-string.patch.gz
Description: application/gzip


meson vs tarballs

2024-03-12 Thread Andrew Dunstan



I and several colleagues have just been trying to build from a tarball 
with meson.



`meson setup build .` results in this:


[...]

Message: checking for file conflicts between source and build directory

meson.build:2963:2: ERROR: Problem encountered:
Non-clean source code directory detected.

To build with meson the source tree may not have an in-place, ./configure
style, build configured. You can have both meson and ./configure style builds
for the same source tree by building out-of-source / VPATH with configure.
Alternatively use a separate check out for meson based builds.

Conflicting files in source directory:
[huge list of files]

The conflicting files need to be removed, either by removing the files listed
above, or by running configure and then make maintainer-clean.




That seems pretty awful and unfriendly and I didn't see anything about 
it in the docs.



cheers


andrew




--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value

2024-03-12 Thread Yugo NAGATA
On Tue, 12 Mar 2024 22:07:17 -0500
Nathan Bossart  wrote:

> I did some light editing to prepare this for commit.

Thank you. I confirmed the test you improved and I am fine with that.

Regards,
Yugo Nagata

> 
> -- 
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com


-- 
Yugo NAGATA 




Re: Sequence Access Methods, round two

2024-03-12 Thread Peter Eisentraut

On 12.03.24 00:44, Michael Paquier wrote:

Anyway, there is one piece of this patch set that I think has a lot of
value outside of the discussion about access methods, which is to
redesign pg_sequence_last_value so that it returns a (last_value,
is_called) tuple rather than just last_value.  This has the benefit of
switching pg_dump to use this function rather than relying on a scan
of the heap table used by a sequence to retrieve the state of a dumped
sequence.  This is the main diff:
-appendPQExpBuffer(query,
-  "SELECT last_value, is_called FROM %s",
-  fmtQualifiedDumpable(tbinfo));
+/*
+ * In versions 17 and up, pg_sequence_last_value() has been switched to
+ * return a tuple with last_value and is_called.
+ */
+if (fout->remoteVersion >= 17)
+appendPQExpBuffer(query,
+  "SELECT last_value, is_called "
+  "FROM pg_sequence_last_value('%s')",
+  fmtQualifiedDumpable(tbinfo));
+else
+appendPQExpBuffer(query,
+  "SELECT last_value, is_called FROM %s",
+  fmtQualifiedDumpable(tbinfo));

Are there any objections to that?  pg_sequence_last_value() is
something that we've only been relying on internally for the catalog
pg_sequences.
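For concreteness, the redesigned call would look something like this (a sketch matching the pg_dump diff above):

```
SELECT last_value, is_called FROM pg_sequence_last_value('myseq'::regclass);
```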


I don't understand what the overall benefit of this change is supposed 
to be.


If this route were to be pursued, it should be a different function 
name.  We shouldn't change the signature of an existing function.






Re: Transaction timeout

2024-03-12 Thread Andrey M. Borodin


> On 13 Mar 2024, at 05:23, Alexander Korotkov  wrote:
> 
> On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin  
> wrote:
>>> On 11 Mar 2024, at 16:18, Alexander Korotkov  wrote:
>>> 
>>> I think if checking psql stderr is problematic, checking just logs is
>>> fine.  Could we wait for the relevant log messages one by one with
>>> $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?
>> 
>> PFA version with $node->wait_for_log()
> 
> I've slightly revised the patch.  I'm going to push it if no objections.

One small note: log_offset was updated, but was not used.

Thanks!


Best regards, Andrey Borodin.


v6-0001-Add-TAP-tests-for-timeouts.patch
Description: Binary data


Re: POC, WIP: OR-clause support for indexes

2024-03-12 Thread Andrei Lepikhov

On 12/3/2024 22:20, Alexander Korotkov wrote:

On Mon, Mar 11, 2024 at 2:43 PM Andrei Lepikhov

I think you are right. It is probably a better place than any other to
remove duplicates in an array. I just think we should sort and remove
duplicates from entry->consts in one pass. Thus, this optimisation
should be applied to sortable constants.


Ok.
The new version of the patch set implements everything we have agreed on
so far. We can return to the MAX_SAOP_ARRAY_SIZE constraint and Alena's
approach to duplicate deletion for the non-sortable cases at the end.



Hmm, we already tried to do it at that point. I vaguely recall some
issues caused by this approach. Anyway, it should be done as quickly as
possible to increase the effect of the optimization.


I think quite strong reasons were given why this shouldn't be
implemented at the parse analysis stage [1], [2], [3].
canonicalize_qual() looks like quite an appropriate place for it, since
it does similar transformations.
Ok. Let's discuss these reasons. In Robert's opinion [1,3], we should do
the transformation based on the cost model. But in the canonicalize_qual
routine, we would still make the transformation blindly. Moreover, the
second patch reduces the weight of this reason, doesn't it? Maybe we
shouldn't think of it as an optimisation but as some 'general form of
expression'?
Peter [2] worries about the possible transformation outcomes at this
stage. But remember, we already transform clauses like ROW() IN (...) to
a series of ORs here, so it is not an issue. Am I wrong?
Why did we discard the attempt with canonicalize_qual on the previous
iteration? The parsing stage is a much more natural place for building
SAOP quals. We can reuse make_scalar_array_op and other stuff, for example.
During the optimisation stage, only the list partitioning machinery
creates SAOPs based on a list of constants. So, in theory, it is possible
to implement. But do we really need to make the code more complex?
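For readers following the thread, the transformation at issue looks roughly like this (a hypothetical example, not taken from the patch's regression tests):

```
-- Before: a chain of ORs over the same expression and operator
SELECT * FROM t WHERE x = 1 OR x = 2 OR x = 3;

-- After: rewritten into a single scalar-array operation (SAOP)
SELECT * FROM t WHERE x = ANY (ARRAY[1, 2, 3]);
```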


Links.
1. 
https://www.postgresql.org/message-id/CA%2BTgmoZCgP6FrBQEusn4yaWm02XU8OPeoEMk91q7PRBgwaAkFw%40mail.gmail.com
2. 
https://www.postgresql.org/message-id/CAH2-Wzm2%3Dnf_JhiM3A2yetxRs8Nd2NuN3JqH%3Dfm_YWYd1oYoPg%40mail.gmail.com
3. 
https://www.postgresql.org/message-id/CA%2BTgmoaOiwMXBBTYknczepoZzKTp-Zgk5ss1%2BCuVQE-eFTqBmA%40mail.gmail.com


--
regards,
Andrei Lepikhov
Postgres Professional
From 86d969f2598a03b1eba84c0c064707de34ee60ac Mon Sep 17 00:00:00 2001
From: Alena Rybakina 
Date: Fri, 2 Feb 2024 22:01:09 +0300
Subject: [PATCH 1/2] Transform OR clauses to ANY expression.

Replace (expr op C1) OR (expr op C2) ... with expr op ANY(ARRAY[C1, C2, ...])
at the preliminary stage of optimization, when we are still working with the
expression tree.
Here C is a constant expression, 'expr' is a non-constant expression, and 'op'
is an operator which returns a boolean result and has a commutator (for the
case of reversed order of the constant and non-constant parts of the
expression, like 'CX op expr').
Sometimes this can lead to a suboptimal plan, but we think it is better to
have an array of elements instead of a long list of OR clauses. There is room
for further optimization by decomposing that array into more optimal parts.
Authors: Alena Rybakina , Andrey Lepikhov 

Reviewed-by: Peter Geoghegan , Ranier Vilela 
Reviewed-by: Alexander Korotkov , Robert Haas 

Reviewed-by: jian he 
---
 .../postgres_fdw/expected/postgres_fdw.out|   8 +-
 doc/src/sgml/config.sgml  |  17 +
 src/backend/nodes/queryjumblefuncs.c  |  27 ++
 src/backend/parser/parse_expr.c   | 416 ++
 src/backend/utils/misc/guc_tables.c   |  11 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/bin/pg_dump/t/002_pg_dump.pl  |   6 +-
 src/include/nodes/queryjumble.h   |   1 +
 src/include/optimizer/optimizer.h |   1 +
 src/test/regress/expected/create_index.out| 156 ++-
 src/test/regress/expected/join.out|  62 ++-
 src/test/regress/expected/partition_prune.out | 215 -
 src/test/regress/expected/stats_ext.out   |  12 +-
 src/test/regress/expected/sysviews.out|   3 +-
 src/test/regress/expected/tidscan.out |  23 +-
 src/test/regress/sql/create_index.sql |  35 ++
 src/test/regress/sql/join.sql |  10 +
 src/test/regress/sql/partition_prune.sql  |  22 +
 src/test/regress/sql/tidscan.sql  |   6 +
 src/tools/pgindent/typedefs.list  |   2 +
 20 files changed, 974 insertions(+), 60 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..a965b43cc6 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8838,18 +8838,18 @@ insert into utrtest values (2, 'qux');
 -- Check case where the foreign partition is a subplan target rel
 explain (verbose, costs off)
 update utrtest 

Re: type cache cleanup improvements

2024-03-12 Thread Michael Paquier
On Tue, Mar 12, 2024 at 06:55:41PM +0300, Teodor Sigaev wrote:
> Playing around, I found one more place which could easily be modified with a
> hash_seq_init_with_hash_value() call.

I think that this patch should be split for clarity, as there are a
few things that are independently useful.  I guess something like
that:
- Introduction of hash_initial_lookup(), that simplifies 3 places of
dynahash.c where the same code is used.  The routine should be
inlined.
- The split in hash_seq_search to force a different type of search is
weird, complicating the dynahash interface by hiding what seems like a
search mode.  Rather than hasHashvalue that's hidden in the middle of
HASH_SEQ_STATUS, could it be better to have an entirely different API
for the search?  That should be a patch on its own, as well.
- The typcache changes.
--
Michael


signature.asc
Description: PGP signature


Re: Improve the connection failure error messages

2024-03-12 Thread Peter Smith
FYI -- some more code has been pushed since this patch was last
updated. AFAICT perhaps you'll want to update this patch again for the
following new connection messages on HEAD:

- slotfuncs.c [1]
- slotsync.c [2]

--
[1] 
https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/slotfuncs.c#L989
[2] 
https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/logical/slotsync.c#L1258

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread shveta malik
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy wrote:
>
> On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot wrote:
> >
> > Hi,
> >
> > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote:
> > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote:
> > > > On Mon, Mar 4, 2024 at 2:11 PM Bertrand Drouvot wrote:
> > > >> On Sun, Mar 03, 2024 at 03:44:34PM -0600, Nathan Bossart wrote:
> > > >> > Unless I am misinterpreting some details, ISTM we could rename this 
> > > >> > column
> > > >> > to invalidation_reason and use it for both logical and physical 
> > > >> > slots.  I'm
> > > >> > not seeing a strong need for another column.
> > > >>
> > > >> Yeah having two columns was more for convenience purpose. Without the 
> > > >> "conflict"
> > > >> one, a slot conflicting with recovery would be "a logical slot having 
> > > >> a non NULL
> > > >> invalidation_reason".
> > > >>
> > > >> I'm also fine with one column if most of you prefer that way.
> > > >
> > > > While we debate on the above, please find the attached v7 patch set
> > > > after rebasing.
> > >
> > > It looks like Bertrand is okay with reusing the same column for both
> > > logical and physical slots
> >
> > Yeah, I'm okay with one column.
>
> Thanks. v8-0001 is how it looks. Please see the v8 patch set with this change.

JFYI, the patch does not apply to the head. There is a conflict in
multiple files.

thanks
Shveta




Re: meson: Specify -Wformat as a common warning flag for extensions

2024-03-12 Thread Tristan Partin

On Tue Mar 12, 2024 at 6:56 PM CDT, Sutou Kouhei wrote:

Hi,

In 
  "Re: meson: Specify -Wformat as a common warning flag for extensions" on Fri, 
08 Mar 2024 10:05:27 -0600,
  "Tristan Partin"  wrote:

> Ok, I figured this out. -Wall implies -Wformat=1. We set warning_level
> to 1 in the Meson project() call, which implies -Wall, and set -Wall
> in CFLAGS for autoconf. That's the reason we don't get issues building
> Postgres. A user making use of the pg_config --cflags option, as Sutou
> is, *will* run into the aforementioned issues, since we don't
> propagate -Wall into pg_config.
> 
> 	$ gcc $(pg_config --cflags) -E - < /dev/null > /dev/null

>cc1: warning: ‘-Wformat-security’ ignored without ‘-Wformat’
>[-Wformat-security]
>$ gcc -Wall $(pg_config --cflags) -E - < /dev/null > /dev/null
>(nothing printed)

Thanks for explaining this. You're right. This is the reason
why we don't need this for PostgreSQL itself but we need
this for PostgreSQL extensions. Sorry. I should have
explained this in the first e-mail...


What should we do to proceed this patch?


Perhaps adding some more clarification in the comments that I wrote.

-  # -Wformat-security requires -Wformat, so check for it
+  # -Wformat-security requires -Wformat. We compile with -Wall in
+  # Postgres, which includes -Wformat=1. -Wformat is shorthand for 
+  # -Wformat=1. The set of flags which includes -Wformat-security is 
+  # persisted into pg_config --cflags, which is commonly used by 
+  # PGXS-based extensions. The lack of -Wformat in the persisted flags
+  # will produce a warning on many GCC versions, so even though adding 
+  # -Wformat here is a no-op for Postgres, it silences other use cases.


That might be too long-winded though :).

--
Tristan Partin
Neon (https://neon.tech)




Re: Add new error_action COPY ON_ERROR "log"

2024-03-12 Thread Michael Paquier
On Wed, Mar 13, 2024 at 08:08:42AM +0530, Bharath Rupireddy wrote:
> On Wed, Mar 13, 2024 at 5:16 AM Michael Paquier  wrote:
>> The attached 0003 is what I had in mind:
>> - Simplification of the LOG generated with quotes applied around the
>> column name, don't see much need to add the relation name, either, for
>> consistency and because the knowledge is known in the query.
>> - A few more tests.
>> - Some doc changes.
> 
> LGTM. So, I've merged those changes into 0001 and 0002.

I've applied the extra tests for now, as this was really confusing.

Hmm.  This NOTICE is really bugging me.  It is true that the clients
would get more information, but the information is duplicated on the
server side because the error context provides the same information as
the NOTICE itself:
NOTICE:  data type incompatibility at line 1 for column "a"
CONTEXT:  COPY aa, line 1, column a: "a"
STATEMENT:  copy aa from stdin with (on_error ignore, log_verbosity verbose); 
--
Michael




Re: RFC: Logging plan of the running query

2024-03-12 Thread torikoshia
On Fri, Feb 16, 2024 at 11:42 PM torikoshia wrote:

> I'm not so sure about the implementation now, i.e. finding the next node
> to be executed from the planstate tree, but I'm going to try this
> approach.

Attached is a patch which takes this approach.

- I saw no way to find the next node to be executed from the planstate
tree, so the patch wraps every ExecProcNode of the planstate tree at
CHECK_FOR_INTERRUPTS().
- To avoid the overhead of this wrapped function call, it is unwrapped at
the end of the EXPLAIN code execution.
- I first tried to use ExecSetExecProcNode() for wrapping, but it
'changes' a node's ExecProcNodeMtd rather than 'adding' some processing
to it. I'm not sure this is the right approach, but the attached patch
adds a new member, ExecProcNodeOriginal, to PlanState to preserve the
original ExecProcNodeMtd (see the sketch below).
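A simplified sketch of the wrapping idea, reconstructed from the description above rather than copied from the patch:

```
/* Hypothetical wrapper installed in place of each node's callback. */
static TupleTableSlot *
ExecProcNodeWithExplain(PlanState *ps)
{
    /* ... emit the plan through the EXPLAIN logging code here ... */

    /* Unwrap: restore the preserved callback so the overhead is paid
     * only once, then continue normal execution. */
    ps->ExecProcNode = ps->ExecProcNodeOriginal;
    return ps->ExecProcNode(ps);
}
```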


Any comments are welcomed.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group CorporationFrom a374bf485e6e7237314179313ac7cb61a0ad784b Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Wed, 13 Mar 2024 13:47:18 +0900
Subject: [PATCH v38] Add function to log the plan of the query

Currently, we have to wait for query execution to finish before we can
check its plan. This is not so convenient when investigating long-running
queries in production environments where we cannot use debuggers.
To improve this situation, this patch adds a pg_log_query_plan() function
that requests logging the plan of the specified backend process.

By default, only superusers are allowed to request logging of the plans,
because allowing any user to issue this request at an unbounded rate
would cause lots of log messages, which could lead to denial of service.

On receipt of the request, the EXPLAIN code for logging is wrapped around
every ExecProcNode, and when the executor runs one of the ExecProcNodes,
the plan is logged. The EXPLAIN code is unwrapped once it has actually
run. In this way, we avoid the overhead we would face if we added
CHECK_FOR_INTERRUPTS()-like mechanisms somewhere in the executor code.
---
 contrib/auto_explain/Makefile |   2 +
 contrib/auto_explain/auto_explain.c   |  23 +-
 contrib/auto_explain/t/001_auto_explain.pl|  35 +++
 doc/src/sgml/func.sgml|  50 
 src/backend/access/transam/xact.c |  17 ++
 src/backend/catalog/system_functions.sql  |   2 +
 src/backend/commands/explain.c| 228 +-
 src/backend/executor/execMain.c   |  19 ++
 src/backend/storage/ipc/procsignal.c  |   4 +
 src/backend/tcop/postgres.c   |   4 +
 src/backend/utils/init/globals.c  |   2 +
 src/include/catalog/pg_proc.dat   |   6 +
 src/include/commands/explain.h|   9 +
 src/include/miscadmin.h   |   1 +
 src/include/nodes/execnodes.h |   3 +
 src/include/storage/procsignal.h  |   1 +
 src/include/tcop/pquery.h |   2 +-
 src/include/utils/elog.h  |   1 +
 .../injection_points/injection_points.c   |  11 +
 src/test/regress/expected/misc_functions.out  |  54 -
 src/test/regress/sql/misc_functions.sql   |  41 +++-
 21 files changed, 473 insertions(+), 42 deletions(-)

diff --git a/contrib/auto_explain/Makefile b/contrib/auto_explain/Makefile
index efd127d3ca..64fe0e0573 100644
--- a/contrib/auto_explain/Makefile
+++ b/contrib/auto_explain/Makefile
@@ -8,6 +8,8 @@ PGFILEDESC = "auto_explain - logging facility for execution plans"
 
 TAP_TESTS = 1
 
+EXTRA_INSTALL = src/test/modules/injection_points
+
 ifdef USE_PGXS
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..e041b10b0e 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -401,26 +401,9 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 			es->format = auto_explain_log_format;
 			es->settings = auto_explain_log_settings;
 
-			ExplainBeginOutput(es);
-			ExplainQueryText(es, queryDesc);
-			ExplainQueryParameters(es, queryDesc->params, auto_explain_log_parameter_max_length);
-			ExplainPrintPlan(es, queryDesc);
-			if (es->analyze && auto_explain_log_triggers)
-ExplainPrintTriggers(es, queryDesc);
-			if (es->costs)
-ExplainPrintJITSummary(es, queryDesc);
-			ExplainEndOutput(es);
-
-			/* Remove last line break */
-			if (es->str->len > 0 && es->str->data[es->str->len - 1] == '\n')
-es->str->data[--es->str->len] = '\0';
-
-			/* Fix JSON to output an object */
-			if (auto_explain_log_format == EXPLAIN_FORMAT_JSON)
-			{
-es->str->data[0] = '{';
-es->str->data[es->str->len - 1] = '}';
-			}
+			ExplainAssembleLogOutput(es, queryDesc, auto_explain_log_format,
+	 auto_explain_log_triggers,
+	 auto_explain_log_parameter_max_length);
 
 			/*
 			 * Note: we rely on the existing logging of context or
diff --git a/con

Re: A failure in t/038_save_logical_slots_shutdown.pl

2024-03-12 Thread Amit Kapila
On Tue, Mar 12, 2024 at 8:46 PM Bharath Rupireddy wrote:
>
> On Tue, Mar 12, 2024 at 6:05 PM Amit Kapila  wrote:
> >
> > The patch looks mostly good to me. I have changed the comments and
> > commit message in the attached. I am planning to push this tomorrow
> > unless you or others have any comments on it.
>
> LGTM.
>

Pushed.

-- 
With Regards,
Amit Kapila.




RE: Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-03-12 Thread Zhijie Hou (Fujitsu)
On Wednesday, March 13, 2024 11:49 AM, vignesh C wrote:
> On Tue, 12 Mar 2024 at 09:34, Ajin Cherian  wrote:
> >
> >
> >
> > On Tue, Mar 12, 2024 at 2:59 PM vignesh C  wrote:
> >>
> >>
> >> Thanks, I have created the following Commitfest entry for this:
> >> https://commitfest.postgresql.org/47/4816/
> >>
> >> Regards,
> >> Vignesh
> >
> >
> > Thanks for the patch, I have verified that the fix works well by following 
> > the
> steps mentioned to reproduce the problem.
> > Reviewing the patch, it seems good and is well documented. Just one minor
> comment I had was probably to change the name of the variable
> table_states_valid to table_states_validity. The current name made sense when
> it was a bool, but now that it is a tri-state enum, it doesn't fit well.
> 
> Thanks for reviewing the patch, the attached v6 patch has the changes for the
> same.

Thanks for the patches.

I saw a recent similar BF error[1] which seems related to the issue that 0001
patch is trying to solve. i.e. The table sync worker is somehow not started
after refreshing the publication on the subscriber. I didn't see other related 
ERRORs in
the log, so I think the reason is the same as the one being discussed in this
thread, which is the table state invalidation got lost. And the 0001 patch
looks good to me.

For 0002, instead of avoiding the latch reset, is it possible to let the
logical rep worker wake up the launcher once after attaching?

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=francolin&dt=2024-03-11%2000%3A52%3A42

Best Regards,
Hou zj


Re: POC: Extension for adding distributed tracing - pg_tracing

2024-03-12 Thread Craig Ringer
On Tue, 12 Mar 2024, 23:45 Anthonin Bonnefoy, <anthonin.bonne...@datadoghq.com> wrote:

>
> > - I don't think it's a good idea to do memory allocations in the middle of a
> > PG_CATCH. If the error was due to out-of-memory, you'll throw another error.
>
> Good point. I was wondering what were the risks of generating spans
> for errors. I will try to find a better way to handle this.
>

The usual approach is to have pre-allocated memory. This must actually be
written (zeroed, usually) or it might be lazily allocated only on page
fault. And it can't be copy-on-write memory allocated once in postmaster
startup and then inherited by fork.

That imposes an overhead for every single postgres backend. So maybe
there's a better solution.
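A minimal sketch of the pre-allocation approach (hypothetical names and size; per the caveat above, a real extension would run this in each backend after fork rather than in the postmaster):

```
#include <stdlib.h>
#include <string.h>

#define ERROR_SPAN_BUFSIZE 8192     /* assumed size, for illustration */

static char *error_span_buf;

static void
preallocate_error_span_buffer(void)
{
    error_span_buf = malloc(ERROR_SPAN_BUFSIZE);
    if (error_span_buf == NULL)
        return;                     /* degrade gracefully: no span on error */

    /* Writing the memory forces the kernel to back every page now,
     * instead of faulting (and possibly failing) inside PG_CATCH. */
    memset(error_span_buf, 0, ERROR_SPAN_BUFSIZE);
}
```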


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Amit Kapila
On Tue, Mar 12, 2024 at 10:10 PM Bharath Rupireddy wrote:
>
> On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila  wrote:
> >
> > Yes, your understanding is correct. I wanted us to consider having new
> > parameters like 'inactive_replication_slot_timeout' to be at
> > slot-level instead of GUC. I think this new parameter doesn't seem to
> > be the similar as 'max_slot_wal_keep_size' which leads to truncation
> > of WAL at global and then invalidates the appropriate slots. OTOH, the
> > 'inactive_replication_slot_timeout' doesn't appear to have a similar
> > global effect.
>
> last_inactive_at is tracked for each slot using which slots get
> invalidated based on inactive_replication_slot_timeout. It's like
> max_slot_wal_keep_size invalidating slots based on restart_lsn. In a
> way, both are similar, right?
>

There is some similarity but 'max_slot_wal_keep_size' leads to
truncation of WAL which in turn leads to invalidation of slots. Here,
I am also trying to be cautious about adding a GUC unless it is required
or a slot-level parameter doesn't serve the need. Having said that, I
see that there is an argument that we should follow the path of the
'max_slot_wal_keep_size' GUC, and there is some value to it, but I still
think the case for avoiding a new GUC for slot inactivity outweighs it.

-- 
With Regards,
Amit Kapila.




Re: remaining sql/json patches

2024-03-12 Thread Himanshu Upadhyaya
On Tue, Mar 12, 2024 at 5:37 PM Amit Langote 
wrote:

>
>
> SELECT JSON_EXISTS(jsonb '{"customer_name": "test", "salary":1000,
> "department_id":1}', '$ ? (@.department_id == $dept_id && @.salary ==
> $sal)' PASSING 1000 AS sal, 1 as dept_id);
>  json_exists
> -
>  t
> (1 row)
>
> Does that make sense?
>
> Yes, got it. Thanks for the clarification.

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: perl: unsafe empty pattern behavior

2024-03-12 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-03-12 Tu 18:59, Tom Lane wrote:
>> I wonder whether perlcritic has sufficiently deep understanding of
>> Perl code that it could find these hazards.

> Yeah, that was my thought too. I'd start with ProhibitComplexRegexes.pm 
> as a template.

Oooh.  Taking a quick look at the source code:

https://metacpan.org/dist/Perl-Critic/source/lib/Perl/Critic/Policy/RegularExpressions/ProhibitComplexRegexes.pm

it seems like it'd be pretty trivial to convert that from "complain if
regex contains more than N characters" to "complain if regex contains
zero characters".
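For illustration, the core of such a policy might be as small as this (an untested sketch against the Perl::Critic plug-in interface; the policy name is invented):

```
package Perl::Critic::Policy::RegularExpressions::ProhibitEmptyRegexes;
use strict;
use warnings;
use base 'Perl::Critic::Policy';
use Perl::Critic::Utils qw( :severities );

sub supported_parameters { return () }
sub default_severity     { return $SEVERITY_MEDIUM }
sub default_themes       { return qw( bugs ) }
sub applies_to           { return qw( PPI::Token::Regexp::Match
                                      PPI::Token::Regexp::Substitution ) }

sub violates {
    my ( $self, $elem ) = @_;
    # An empty pattern // silently reuses the last successful match.
    return if length $elem->get_match_string();
    return $self->violation(
        'Empty regular expression',
        'The empty pattern reuses the last successfully matched regex',
        $elem );
}

1;
```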

regards, tom lane




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Amit Kapila
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot wrote:
>
> On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote:
> > On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot wrote:
>
> > so why to prevent it for
> > these new parameters? This will unnecessarily create inconsistency in
> > the invalidation behavior.
>
> Yeah, but I think wal removal has a direct impact on the slot usuability which
> is probably not the case with the new XID and Timeout ones.
>

BTW, doesn't the XID-based parameter 'max_slot_xid_age' have some
similarity with 'max_slot_wal_keep_size'? I think it will impact the
rows we remove based on xid horizons. Don't we need to consider it while
vacuum computes the xid horizons in ComputeXidHorizons(), similar to
what we do for WAL w.r.t. 'max_slot_wal_keep_size'?

-- 
With Regards,
Amit Kapila.




Re: perl: unsafe empty pattern behavior

2024-03-12 Thread Andrew Dunstan



On 2024-03-12 Tu 18:59, Tom Lane wrote:

Jeff Davis  writes:

On Tue, 2024-03-12 at 18:53 +0100, Alvaro Herrera wrote:

I also tried grepping (for things
like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you
have ... but I only looked for the "qr" literal, not other ways to
get regexes.

I think that's fine. qr// seems the most dangerous, because it seems to
behave differently in different versions of perl.

I wonder whether perlcritic has sufficiently deep understanding of
Perl code that it could find these hazards.  I already checked,
and found that there's no built-in filter for this (at least not
in the perlcritic version I have), but maybe we could write one?
The rules seem to be plug-in modules, so you can make your own
in principle.




Yeah, that was my thought too. I'd start with ProhibitComplexRegexes.pm 
as a template.


If nobody else does it I'll have a go, but it might take a while.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Tom Lane
Jelte Fennema-Nio  writes:
> On Tue, 12 Mar 2024 at 19:28, Alvaro Herrera  wrote:
>>> Hmm, buildfarm member kestrel (which uses
>>> -fsanitize=undefined,alignment) failed:

>> Hm, I tried using the same compile flags, couldn't reproduce.

> Okay, it passed now it seems so I guess this test is flaky somehow.

Two more intermittent failures:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushmaster&dt=2024-03-13%2003%3A15%3A09

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=taipan&dt=2024-03-13%2003%3A15%3A31

These animals all belong to Andres' flotilla, but other than that
I'm not seeing a connection.  I suspect it's basically just a
timing dependency.  Have you thought about the fact that a cancel
request is a no-op if it arrives after the query's done?

regards, tom lane




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Amit Kapila
On Tue, Mar 12, 2024 at 8:55 PM Bharath Rupireddy wrote:
>
> On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila  wrote:
> >
> > > Hm. I get the concern. Are you okay with having invalidation_reason
> > > separately for both logical and physical slots? In such a case,
> > > logical slots that got invalidated on the standby will have duplicate
> > > info in conflict_reason and invalidation_reason, is this fine?
> > >
> >
> > If we have duplicate information in two columns that could be
> > confusing for users. BTW, isn't the recovery conflict occur only
> > because of rows_removed and wal_level_insufficient reasons? The
> > wal_removed or the new reasons you are proposing can't happen because
> > of recovery conflict. Am, I missing something here?
>
> My understanding aligns with yours that the rows_removed and
> wal_level_insufficient invalidations can occur only upon recovery
> conflict.
>
> FWIW, a test named 'synchronized slot has been invalidated' in
> 040_standby_failover_slots_sync.pl inappropriately uses
> conflict_reason = 'wal_removed' logical slot on standby. As per the
> above understanding, it's inappropriate to use conflict_reason here
> because wal_removed invalidation doesn't conflict with recovery.
>
> > > Another idea is to make 'conflict_reason text' as a 'conflicting
> > > boolean' again (revert 007693f2a3), and have 'invalidation_reason
> > > text' for both logical and physical slots. So, whenever 'conflicting'
> > > is true, one can look at invalidation_reason for the reason for
> > > conflict. How does this sound?
> > >
> >
> > So, does this mean that conflicting will only be true for some of the
> > reasons (say wal_level_insufficient, rows_removed, wal_removed) and
> > logical slots but not for others? I think that will also not eliminate
> > the duplicate information as user could have deduced that from single
> > column.
>
> So, how about we turn conflict_reason to only report the reasons that
> actually cause conflict with recovery for logical slots, something
> like below, and then have invalidation_cause as a generic column for
> all sorts of invalidation reasons for both logical and physical slots?
>

If our above understanding is correct then conflict_reason will be a
subset of invalidation_reason. If so, whatever way we arrange this
information, there will be some sort of duplicity unless we just have
one column 'invalidation_reason' and update the docs to interpret it
correctly for conflicts.

-- 
With Regards,
Amit Kapila.




Re: Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-03-12 Thread vignesh C
On Tue, 12 Mar 2024 at 09:34, Ajin Cherian  wrote:
>
>
>
> On Tue, Mar 12, 2024 at 2:59 PM vignesh C  wrote:
>>
>>
>> Thanks, I have created the following Commitfest entry for this:
>> https://commitfest.postgresql.org/47/4816/
>>
>> Regards,
>> Vignesh
>
>
> Thanks for the patch, I have verified that the fix works well by following 
> the steps mentioned to reproduce the problem.
> Reviewing the patch, it seems good and is well documented. Just one minor 
> comment I had was probably to change the name of the variable 
> table_states_valid to table_states_validity. The current name made sense when 
> it was a bool, but now that it is a tri-state enum, it doesn't fit well.

Thanks for reviewing the patch, the attached v6 patch has the changes
for the same.

Regards,
Vignesh
From ab68e343b4b3f7ec09f758089a9f90a16f21de42 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Mon, 5 Feb 2024 14:55:48 +0530
Subject: [PATCH v6 2/2] Apply worker will get started after 180 seconds by the
 launcher in case the apply worker exits immediately after startup.

The apply worker was only getting started after the launcher's 180 second
timeout when the apply worker exited immediately after startup. This was
happening because the launcher process's latch was getting reset after the
apply worker was started, which resulted in the launcher waiting for the
next 180 second timeout before starting the apply worker. Fixed this issue
by not resetting the latch, as this latch will be set for subscription
modifications and apply worker exit. We should check whether the new worker
needs to be started or not and reset the latch in ApplyLauncherMain.
---
 src/backend/replication/logical/launcher.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 66070e9131..f2545482c8 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -185,7 +185,6 @@ WaitForReplicationWorkerAttach(LogicalRepWorker *worker,
 			   BackgroundWorkerHandle *handle)
 {
 	BgwHandleStatus status;
-	int			rc;
 
 	for (;;)
 	{
@@ -220,16 +219,14 @@ WaitForReplicationWorkerAttach(LogicalRepWorker *worker,
 		/*
 		 * We need timeout because we generally don't get notified via latch
 		 * about the worker attach.  But we don't expect to have to wait long.
+		 * Since this latch is also used for subscription creation/modification
+		 * and apply worker process exit signal handling, the latch must not be
+		 * reset here. We should check if the new worker needs to be started or
+		 * not and reset the latch in ApplyLauncherMain.
 		 */
-		rc = WaitLatch(MyLatch,
-	   WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-	   10L, WAIT_EVENT_BGWORKER_STARTUP);
-
-		if (rc & WL_LATCH_SET)
-		{
-			ResetLatch(MyLatch);
-			CHECK_FOR_INTERRUPTS();
-		}
+		(void) WaitLatch(MyLatch,
+		 WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+		 10L, WAIT_EVENT_BGWORKER_STARTUP);
 	}
 }
 
-- 
2.34.1

From 47c6cd886902693074ece4b2a1188dce6f066bf8 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Thu, 1 Feb 2024 09:46:40 +0530
Subject: [PATCH v6 1/2] Table sync missed for newly added tables because
 subscription relation cache invalidation was not handled in certain
 concurrent scenarios.

Table sync was missed if a table sync invalidation occurred while the
list of non-ready tables was being prepared. This occurred because the
table state was being set to valid at the end of the non-ready table
list preparation, irrespective of any invalidations that occurred while
preparing the list. Fixed it by changing the boolean variable to a
tri-state enum and by setting the table state to valid only if no
invalidations occurred while the list was being prepared.
---
 src/backend/replication/logical/tablesync.c | 25 +
 src/tools/pgindent/typedefs.list|  1 +
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index ee06629088..eefe620e5e 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -123,7 +123,14 @@
 #include "utils/syscache.h"
 #include "utils/usercontext.h"
 
-static bool table_states_valid = false;
+typedef enum
+{
+	SYNC_TABLE_STATE_NEEDS_REBUILD,
+	SYNC_TABLE_STATE_REBUILD_STARTED,
+	SYNC_TABLE_STATE_VALID,
+} SyncingTablesState;
+
+static SyncingTablesState table_states_validity = SYNC_TABLE_STATE_NEEDS_REBUILD;
 static List *table_states_not_ready = NIL;
 static bool FetchTableStates(bool *started_tx);
 
@@ -273,7 +280,7 @@ wait_for_worker_state_change(char expected_state)
 void
 invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue)
 {
-	table_states_valid = false;
+	table_states_validity = SYNC_TABLE_STATE_NEEDS_REBUILD;
 }
 
 /*
@@ -1568,13 +1575,15 @@ FetchTableStates(bool *started_tx)
 
 	*started_tx = false;
 
-	if 

Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value

2024-03-12 Thread Nathan Bossart
I did some light editing to prepare this for commit.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 38c78bd92a7b4d4600e6f0dbe58283c21ea87d50 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 12 Mar 2024 22:04:25 -0500
Subject: [PATCH v10 1/1] Add pg_column_toast_chunk_id().

This function returns the chunk_id of an on-disk TOASTed value, or
NULL if the value is un-TOASTed or not on-disk.  This is useful for
identifying which values are actually TOASTed and for investigating
"unexpected chunk number" errors.

XXX: REQUIRES CATVERSION BUMP

Author: Yugo Nagata
Reviewed-by: Jian He
Discussion: https://postgr.es/m/20230329105507.d764497456eeac1ca491b5bd%40sraoss.co.jp
---
 doc/src/sgml/func.sgml   | 17 
 src/backend/utils/adt/varlena.c  | 41 
 src/include/catalog/pg_proc.dat  |  3 ++
 src/test/regress/expected/misc_functions.out | 16 
 src/test/regress/sql/misc_functions.sql  | 12 ++
 5 files changed, 89 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0bb7aeb40e..fe9c1a38b9 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28552,6 +28552,23 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset

   
 
+  
+   
+
+ pg_column_toast_chunk_id
+
+pg_column_toast_chunk_id ( "any" )
+oid
+   
+   
+Shows the chunk_id of an on-disk
+TOASTed value.  Returns NULL
+if the value is un-TOASTed or not on-disk.  See
+ for more information about
+TOAST.
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 543afb66e5..8d28dd42ce 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5105,6 +5105,47 @@ pg_column_compression(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(result));
 }
 
+/*
+ * Return the chunk_id of the on-disk TOASTed value.  Return NULL if the value
+ * is un-TOASTed or not on-disk.
+ */
+Datum
+pg_column_toast_chunk_id(PG_FUNCTION_ARGS)
+{
+	int			typlen;
+	struct varlena *attr;
+	struct varatt_external toast_pointer;
+
+	/* On first call, get the input type's typlen, and save at *fn_extra */
+	if (fcinfo->flinfo->fn_extra == NULL)
+	{
+		/* Lookup the datatype of the supplied argument */
+		Oid			argtypeid = get_fn_expr_argtype(fcinfo->flinfo, 0);
+
+		typlen = get_typlen(argtypeid);
+		if (typlen == 0)		/* should not happen */
+			elog(ERROR, "cache lookup failed for type %u", argtypeid);
+
+		fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+	  sizeof(int));
+		*((int *) fcinfo->flinfo->fn_extra) = typlen;
+	}
+	else
+		typlen = *((int *) fcinfo->flinfo->fn_extra);
+
+	if (typlen != -1)
+		PG_RETURN_NULL();
+
+	attr = (struct varlena *) DatumGetPointer(PG_GETARG_DATUM(0));
+
+	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+		PG_RETURN_NULL();
+
+	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+
+	PG_RETURN_OID(toast_pointer.va_valueid);
+}
+
 /*
  * string_agg - Concatenates values and returns string.
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 291ed876fc..443ca854a6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7447,6 +7447,9 @@
 { oid => '2121', descr => 'compression method for the compressed datum',
   proname => 'pg_column_compression', provolatile => 's', prorettype => 'text',
   proargtypes => 'any', prosrc => 'pg_column_compression' },
+{ oid => '8393', descr => 'chunk ID of on-disk TOASTed value',
+  proname => 'pg_column_toast_chunk_id', provolatile => 's', prorettype => 'oid',
+  proargtypes => 'any', prosrc => 'pg_column_toast_chunk_id' },
 { oid => '2322',
   descr => 'total disk space usage for the specified tablespace',
   proname => 'pg_tablespace_size', provolatile => 'v', prorettype => 'int8',
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index d5f61dfad9..e0ba9fdafa 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -703,3 +703,19 @@ SELECT has_function_privilege('regress_current_logfile',
 (1 row)
 
 DROP ROLE regress_current_logfile;
+-- pg_column_toast_chunk_id
+CREATE TABLE test_chunk_id (a TEXT, b TEXT STORAGE EXTERNAL);
+INSERT INTO test_chunk_id VALUES ('x', repeat('x', 8192));
+SELECT t.relname AS toastrel FROM pg_class c
+  LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
+  WHERE c.relname = 'test_chunk_id'
+\gset
+SELECT pg_column_toast_chunk_id(a) IS NULL,
+  pg_column_toast_chunk_id(b) IN (SELECT chunk_id FROM pg_toast.:toastrel)
+  FROM test_chunk_id;
+ ?column? | ?column? 
+--+--
+ t| t
+(1 row)
+
+DROP TABLE test_chunk_id;
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_f

Re: Infinite loop in XLogPageRead() on standby

2024-03-12 Thread Kyotaro Horiguchi
At Mon, 11 Mar 2024 16:43:32 +0900 (JST), Kyotaro Horiguchi wrote:
> Oh, I once saw the fix work, but seems not to be working after some
> point. The new issue was a corruption of received WAL records on the
> first standby, and it may be related to the setting.

I identified the cause of the second issue. When I tried to replay the
issue, the second standby accidentally received the old timeline's
last page-spanning record till the end while the first standby was
promoting (but it had not been read by recovery). In addition to that,
on the second standby, there's a time window where the timeline
increased but the first segment of the new timeline is not available
yet. In this case, the second standby successfully reads the
page-spanning record in the old timeline even after the second standby
noticed that the timeline ID has been increased, thanks to the
robustness of XLogFileReadAnyTLI().

I think the primary change to XLogPageRead that I suggested is correct
(assuming the use of wal_segment_size instead of the
constant). However, still XLogFileReadAnyTLI() has a chance to read
the segment from the old timeline after the second standby notices a
timeline switch, leading to the second issue. The second issue was
fixed by preventing XLogFileReadAnyTLI from reading segments from
older timelines than those suggested by the latest timeline
history. (In other words, disabling the "AnyTLI" part).

I recall that there was a discussion for commit 4bd0ad9e44, about the
objective of allowing reading segments from older timelines than the
timeline history suggests. In my faint memory, we decided to postpone
the decision to remove the feature due to uncertainty about the
objective. If there's no clear reason to continue using
XLogFileReadAnyTLI(), I suggest we stop its use and instead adopt
XLogFileReadOnTLHistory(), which reads segments that align precisely
with the timeline history.

Of course, regardless of the changes above, if recovery on the second
standby had reached the end of the page-spanning record before
redirection to the first standby, it would need pg_rewind to connect
to the first standby.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add new error_action COPY ON_ERROR "log"

2024-03-12 Thread Bharath Rupireddy
On Wed, Mar 13, 2024 at 5:16 AM Michael Paquier  wrote:
>
> On Tue, Mar 12, 2024 at 12:54:29PM +0530, Bharath Rupireddy wrote:
> > +1. But, do you want to add them now or later as a separate
> > patch/discussion altogether?
>
> The attached 0003 is what I had in mind:
> - Simplification of the LOG generated with quotes applied around the
> column name, don't see much need to add the relation name, either, for
> consistency and because the knowledge is known in the query.
> - A few more tests.
> - Some doc changes.

LGTM. So, I've merged those changes into 0001 and 0002.

> >> Wouldn't it be better to squash the patches together, by the way?
> >
> > I guess not. They are two different things IMV.
>
> Well, 0001 is sitting doing nothing because the COPY code does not
> make use of it internally.

Yes. That's why I left a note in the commit message that a future
commit will use it.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 77b9f28121d6531f40d96f7c00ecdb860550b67f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Wed, 13 Mar 2024 02:20:42 +
Subject: [PATCH v7 1/2] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
messages logged by the COPY command. A value of 'verbose' can be used
to make the command emit more informative messages, while the value
'default' (which is the default) logs no additional messages. More
values such as 'terse', 'row_details', etc. can be added to the
LOG_VERBOSITY option as needed.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml  | 14 +++
 src/backend/commands/copy.c | 38 +
 src/bin/psql/tab-complete.c |  6 -
 src/include/commands/copy.h | 10 
 src/test/regress/expected/copy2.out |  8 ++
 src/test/regress/sql/copy2.sql  |  2 ++
 src/tools/pgindent/typedefs.list|  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 55764fc1f2..eba9b8f64e 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -45,6 +45,7 @@ COPY { table_name [ ( column_name [, ...] ) | * }
 ON_ERROR 'error_action'
 ENCODING 'encoding_name'
+LOG_VERBOSITY [ mode ]
 
  
 
@@ -415,6 +416,19 @@ COPY { table_name [ ( 

 
+   
+LOG_VERBOSITY
+
+ 
+  Sets the verbosity of some of the messages logged by a
+  COPY command.
+  A mode value of
+  verbose can be used to emit more informative messages.
+  default will not log any additional messages.
+ 
+
+   
+

 WHERE
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->log_verbosity = defGetCopyLogVerbosityChoice(defel, pstate);
+		}
 		else
 			ereport(ERROR,
 	(errcode(ERRCODE_

Re: Improve eviction algorithm in ReorderBuffer

2024-03-12 Thread Peter Smith
On Wed, Mar 13, 2024 at 12:48 PM Masahiko Sawada  wrote:
>
> On Wed, Mar 13, 2024 at 10:15 AM Peter Smith  wrote:
> >
> > On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith  wrote:
> > > >
> > ...
> > > > > > 5.
> > > > > > + *
> > > > > > + * If 'indexed' is true, we create a hash table to track of each 
> > > > > > node's
> > > > > > + * index in the heap, enabling to perform some operations such as 
> > > > > > removing
> > > > > > + * the node from the heap.
> > > > > >   */
> > > > > >  binaryheap *
> > > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, 
> > > > > > void *arg)
> > > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare,
> > > > > > + bool indexed, void *arg)
> > > > > >
> > > > > > BEFORE
> > > > > > ... enabling to perform some operations such as removing the node 
> > > > > > from the heap.
> > > > > >
> > > > > > SUGGESTION
> > > > > > ... to help make operations such as removing nodes more efficient.
> > > > > >
> > > > >
> > > > > But these operations literally require the indexed binary heap as we
> > > > > have an assertion:
> > > > >
> > > > > void
> > > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d)
> > > > > {
> > > > > bh_nodeidx_entry *ent;
> > > > >
> > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
> > > > > Assert(heap->bh_indexed);
> > > > >
> > > >
> > > > I didn’t quite understand -- the operations mentioned are "operations
> > > > such as removing the node", but binaryheap_remove_node() also removes
> > > > a node from the heap. So I still felt the comment wording of the patch
> > > > is not quite correct.
> > >
> > > Now I understand your point. That's a valid point.
> > >
> > > >
> > > > Now, if the removal of a node from an indexed heap can *only* be done
> > > > using binaryheap_remove_node_ptr() then:
> > > > - the other removal functions (binaryheap_remove_*) probably need some
> > > > comments to make sure nobody is tempted to call them directly for an
> > > > indexed heap.
> > > > - maybe some refactoring and assertions are needed to ensure those
> > > > *cannot* be called directly for an indexed heap.
> > > >
> > >
> > > If the 'index' is true, the caller can not only use the existing
> > > functions but also newly added functions such as
> > > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about
> > > something like below?
> > >
> >
> > You said: "can not only use the existing functions but also..."
> >
> > Hmm. Is that right? IIUC those existing "remove" functions should NOT
> > be called directly if the heap was "indexed" because they'll delete
> > the node from the heap OK, but any corresponding index for that
> > deleted node will be left lying around -- i.e. everything gets out of
> > sync. This was the reason for my original concern.
> >
>
> All existing binaryheap functions should be available even if the
> binaryheap is 'indexed'. For instance, with the patch,
> binaryheap_remove_node() is:
>
> void
> binaryheap_remove_node(binaryheap *heap, int n)
> {
> int cmp;
>
> Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
> Assert(n >= 0 && n < heap->bh_size);
>
> /* compare last node to the one that is being removed */
> cmp = heap->bh_compare(heap->bh_nodes[--heap->bh_size],
>heap->bh_nodes[n],
>heap->bh_arg);
>
> /* remove the last node, placing it in the vacated entry */
> replace_node(heap, n, heap->bh_nodes[heap->bh_size]);
>
> /* sift as needed to preserve the heap property */
> if (cmp > 0)
> sift_up(heap, n);
> else if (cmp < 0)
> sift_down(heap, n);
> }
>
> The replace_node(), sift_up() and sift_down() update the node's index as
> well if the binaryheap is indexed. When deleting the node from the
> binaryheap, it will also delete its index from the hash table.
>

I see now. Thanks for the information.

~~~

Some more review comments for v8-0002

==

1.
+/*
+ * Remove the node's index from the hash table if the heap is indexed.
+ */
+static bool
+delete_nodeidx(binaryheap *heap, bh_node_type node)
+{
+ if (!binaryheap_indexed(heap))
+ return false;
+
+ return bh_nodeidx_delete(heap->bh_nodeidx, node);
+}

I wasn't sure if having this function was a good idea. Yes, it makes
code more readable, but I felt the heap code ought to be as efficient
as possible so maybe it is better for the index check to be done at
the caller, instead of incurring any overhead of function calls that
might do nothing.

SUGGESTION
if (binaryheap_indexed(heap))
  found = bh_nodeidx_delete(heap->bh_nodeidx, node);

~~~

2.
+/*
+ * binaryheap_update_up
+ *
+ * Sift the given node up after the node's key is updated. The caller must
+ * ensure that the given node is in the heap. O(log n) worst case.
+ *
+ * This function can be used only if the heap is indexed.
+ */
+void

Re: CI speed improvements for FreeBSD

2024-03-12 Thread Thomas Munro
On Wed, Mar 13, 2024 at 4:50 AM Maxim Orlov  wrote:
> I looked at the changes and I liked them.  Here are my thoughts:

Thanks for looking!  Pushed.




Re: Improve eviction algorithm in ReorderBuffer

2024-03-12 Thread Masahiko Sawada
On Wed, Mar 13, 2024 at 10:15 AM Peter Smith  wrote:
>
> On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada  wrote:
> >
> > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith  wrote:
> > >
> ...
> > > > > 5.
> > > > > + *
> > > > > + * If 'indexed' is true, we create a hash table to track of each 
> > > > > node's
> > > > > + * index in the heap, enabling to perform some operations such as 
> > > > > removing
> > > > > + * the node from the heap.
> > > > >   */
> > > > >  binaryheap *
> > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, 
> > > > > void *arg)
> > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare,
> > > > > + bool indexed, void *arg)
> > > > >
> > > > > BEFORE
> > > > > ... enabling to perform some operations such as removing the node 
> > > > > from the heap.
> > > > >
> > > > > SUGGESTION
> > > > > ... to help make operations such as removing nodes more efficient.
> > > > >
> > > >
> > > > But these operations literally require the indexed binary heap as we
> > > > have an assertion:
> > > >
> > > > void
> > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d)
> > > > {
> > > > bh_nodeidx_entry *ent;
> > > >
> > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
> > > > Assert(heap->bh_indexed);
> > > >
> > >
> > > I didn’t quite understand -- the operations mentioned are "operations
> > > such as removing the node", but binaryheap_remove_node() also removes
> > > a node from the heap. So I still felt the comment wording of the patch
> > > is not quite correct.
> >
> > Now I understand your point. That's a valid point.
> >
> > >
> > > Now, if the removal of a node from an indexed heap can *only* be done
> > > using binaryheap_remove_node_ptr() then:
> > > - the other removal functions (binaryheap_remove_*) probably need some
> > > comments to make sure nobody is tempted to call them directly for an
> > > indexed heap.
> > > - maybe some refactoring and assertions are needed to ensure those
> > > *cannot* be called directly for an indexed heap.
> > >
> >
> > If the 'index' is true, the caller can not only use the existing
> > functions but also newly added functions such as
> > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about
> > something like below?
> >
>
> You said: "can not only use the existing functions but also..."
>
> Hmm. Is that right? IIUC those existing "remove" functions should NOT
> be called directly if the heap was "indexed" because they'll delete
> the node from the heap OK, but any corresponding index for that
> deleted node will be left lying around -- i.e. everything gets out of
> sync. This was the reason for my original concern.
>

All existing binaryheap functions should be available even if the
binaryheap is 'indexed'. For instance, with the patch,
binaryheap_remove_node() is:

void
binaryheap_remove_node(binaryheap *heap, int n)
{
int cmp;

Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
Assert(n >= 0 && n < heap->bh_size);

/* compare last node to the one that is being removed */
cmp = heap->bh_compare(heap->bh_nodes[--heap->bh_size],
   heap->bh_nodes[n],
   heap->bh_arg);

/* remove the last node, placing it in the vacated entry */
replace_node(heap, n, heap->bh_nodes[heap->bh_size]);

/* sift as needed to preserve the heap property */
if (cmp > 0)
sift_up(heap, n);
else if (cmp < 0)
sift_down(heap, n);
}

The replace_node(), sift_up() and sift_down() update the node's index as
well if the binaryheap is indexed. When deleting the node from the
binaryheap, it will also delete its index from the hash table.
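
To make that concrete, the indexed flavor of replace_node() looks
roughly like this (simplified; set_node_index() is the helper that
upserts the node's position into the bh_nodeidx hash table):

static void
replace_node(binaryheap *heap, int index, bh_node_type new_node)
{
    /* drop the index entry of the node being overwritten, if tracked */
    delete_nodeidx(heap, heap->bh_nodes[index]);

    /* install the new node and record its position in the hash table */
    heap->bh_nodes[index] = new_node;
    if (binaryheap_indexed(heap))
        set_node_index(heap, new_node, index);
}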

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: [PoC] Improve dead tuple storage for lazy vacuum

2024-03-12 Thread Masahiko Sawada
On Tue, Mar 12, 2024 at 7:34 PM John Naylor  wrote:
>
> On Mon, Mar 11, 2024 at 3:13 PM Masahiko Sawada  wrote:
> >
> > On Mon, Mar 11, 2024 at 12:20 PM John Naylor  
> > wrote:
> > >
> > > On Thu, Mar 7, 2024 at 10:35 PM Masahiko Sawada  
> > > wrote:
>
> > > + ts->context = CurrentMemoryContext;
> > >
> > > As far as I can tell, this member is never accessed again -- am I
> > > missing something?
> >
> > You're right. It was used to re-create the tidstore in the same
> > context again while resetting it, but we no longer support the reset
> > API. Considering it again, would it be better to allocate the iterator
> > struct in the same context as we store the tidstore struct?
>
> That makes sense.
>
> > > + /* DSA for tidstore will be detached at the end of session */
> > >
> > > No other test module pins the mapping, but that doesn't necessarily
> > > mean it's wrong. Is there some advantage over explicitly detaching?
> >
> > One small benefit of not explicitly detaching dsa_area in
> > tidstore_destroy() would be simplicity; IIUC if we want to do that, we
> > need to remember the dsa_area using (for example) a static variable,
> > and free it if it's non-NULL. I've implemented this idea in the
> > attached patch.
>
> Okay, I don't have a strong preference at this point.

I'll keep that change, then.

>
> > > +-- Add tids in random order.
> > >
> > > I don't see any randomization here. I do remember adding row_number to
> > > remove whitespace in the output, but I don't remember a random order.
> > > On that subject, the row_number was an easy trick to avoid extra
> > > whitespace, but maybe we should just teach the setting function to
> > > return blocknumber rather than null?
> >
> > Good idea, fixed.
>
> + test_set_block_offsets
> +------------------------
> + 2147483647
> +  0
> + 4294967294
> +  1
> + 4294967295
>
> Hmm, was the earlier comment about randomness referring to this? I'm
> not sure what other regression tests do in these cases, or how
> relibale this is. If this is a problem we could simply insert this
> result into a temp table so it's not output.

I didn't address the comment about randomness.

I think that we will have both random TIDs tests and fixed TIDs tests
in test_tidstore as we discussed, and probably we can do both tests
with similar steps; insert TIDs into both a temp table and tidstore
and check if the tidstore returned the results as expected by
comparing the results to the temp table. Probably we can have a common
pl/pgsql function that checks that and raises a WARNING or an ERROR.
Given that this is very similar to what we did in test_radixtree, do
we really want to implement it using a pl/pgsql function? When we
discussed it before, I thought the current way made sense. But given
that we're adding more tests and will add more tests in the future,
doing the tests in C will be more maintainable and faster. Also, I
think we can do the debug-build array stuff in the test_tidstore code
instead.
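
As a rough sketch of such a C-side check (TidStoreIsMember() follows
the patch's API; check_tid() itself is a hypothetical test helper):

static void
check_tid(TidStore *ts, BlockNumber blkno, OffsetNumber off, bool expected)
{
    ItemPointerData tid;

    ItemPointerSet(&tid, blkno, off);
    if (TidStoreIsMember(ts, &tid) != expected)
        elog(WARNING, "TID (%u,%u) membership mismatch",
             blkno, (unsigned) off);
}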

>
> > > +Datum
> > > +tidstore_create(PG_FUNCTION_ARGS)
> > > +{
> > > ...
> > > + tidstore = TidStoreCreate(max_bytes, dsa);
> > >
> > > +Datum
> > > +tidstore_set_block_offsets(PG_FUNCTION_ARGS)
> > > +{
> > > 
> > > + TidStoreSetBlockOffsets(tidstore, blkno, offs, noffs);
> > >
> > > These names are too similar. Maybe the test module should do
> > > s/tidstore_/test_/ or similar.
> >
> > Agreed.
>
> Mostly okay, although a couple look a bit generic now. I'll leave it
> up to you if you want to tweak things.
>
> > > In general, the .sql file is still very hard-coded. Functions are
> > > created that contain a VALUES statement. Maybe it's okay for now, but
> > > wanted to mention it. Ideally, we'd have some randomized tests,
> > > without having to display it. That could be in addition to (not
> > > replacing) the small tests we have that display input. (see below)
> > >
> >
> > Agreed to add randomized tests in addition to the existing tests.
>
> I'll try something tomorrow.
>
> > Sounds a good idea. In fact, if there are some bugs in tidstore, it's
> > likely that even initdb would fail in practice. However, it's a very
> > good idea that we can test the tidstore anyway with such a check
> > without a debug-build array.
> >
> > Or as another idea, I wonder if we could keep the debug-build array in
> > some form. For example, we use the array with the particular build
> > flag and set a BF animal for that. That way, we can test the tidstore
> > in more real cases.
>
> I think the purpose of a debug flag is to help developers catch
> mistakes. I don't think it's quite useful enough for that. For one, it
> has the same 1GB limitation as vacuum's current array. For another,
> it'd be a terrible way to debug moving tidbitmap.c from its hash table
> to use TID store -- AND/OR operations and lossy pages are pretty much
> undoable with a copy of vacuum's array.

Valid points.

As I mentioned above, if we im

Re: Improve eviction algorithm in ReorderBuffer

2024-03-12 Thread Peter Smith
On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada  wrote:
>
> On Fri, Mar 8, 2024 at 12:58 PM Peter Smith  wrote:
> >
...
> > > > 5.
> > > > + *
> > > > + * If 'indexed' is true, we create a hash table to track of each node's
> > > > + * index in the heap, enabling to perform some operations such as 
> > > > removing
> > > > + * the node from the heap.
> > > >   */
> > > >  binaryheap *
> > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, void 
> > > > *arg)
> > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare,
> > > > + bool indexed, void *arg)
> > > >
> > > > BEFORE
> > > > ... enabling to perform some operations such as removing the node from 
> > > > the heap.
> > > >
> > > > SUGGESTION
> > > > ... to help make operations such as removing nodes more efficient.
> > > >
> > >
> > > But these operations literally require the indexed binary heap as we
> > > have an assertion:
> > >
> > > void
> > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d)
> > > {
> > > bh_nodeidx_entry *ent;
> > >
> > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
> > > Assert(heap->bh_indexed);
> > >
> >
> > I didn’t quite understand -- the operations mentioned are "operations
> > such as removing the node", but binaryheap_remove_node() also removes
> > a node from the heap. So I still felt the comment wording of the patch
> > is not quite correct.
>
> Now I understand your point. That's a valid point.
>
> >
> > Now, if the removal of a node from an indexed heap can *only* be done
> > using binaryheap_remove_node_ptr() then:
> > - the other removal functions (binaryheap_remove_*) probably need some
> > comments to make sure nobody is tempted to call them directly for an
> > indexed heap.
> > - maybe some refactoring and assertions are needed to ensure those
> > *cannot* be called directly for an indexed heap.
> >
>
> If the 'index' is true, the caller can not only use the existing
> functions but also newly added functions such as
> binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about
> something like below?
>

You said: "can not only use the existing functions but also..."

Hmm. Is that right? IIUC those existing "remove" functions should NOT
be called directly if the heap was "indexed" because they'll delete
the node from the heap OK, but any corresponding index for that
deleted node will be left lying around -- i.e. everything gets out of
sync. This was the reason for my original concern.

>  * If 'indexed' is true, we create a hash table to track each node's
>  * index in the heap, enabling to perform some operations such as
>  * binaryheap_remove_node_ptr() etc.
>

Yeah, something like that... I'll wait for the next patch version
before commenting further.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Support json_errdetail in FRONTEND builds

2024-03-12 Thread Andrew Dunstan



On 2024-03-12 Tu 14:43, Jacob Champion wrote:

Hello,

Both the incremental JSON [1] and OAuth [2] patchsets would be
improved by json_errdetail(), which was removed from FRONTEND builds
in b44669b2ca:


The routine providing a detailed error message based on the error code
is made backend-only, the existing code being unsafe to use in the
frontend as the error message may finish by being palloc'd or point to a
static string, so there is no way to know if the memory of the message
should be pfree'd or not.

Attached is a patch to undo this, by attaching any necessary
allocations to the JsonLexContext so the caller doesn't have to keep
track of them.

This is based on the first patch of the OAuth patchset, which
additionally needs json_errdetail() to be safe to use from libpq
itself. Alvaro pointed out offlist that we don't have to go that far
to re-enable this function for the utilities, so this patch is a sort
of middle ground between what we have now and what OAuth implements.
(There is some additional minimization that could be done to this
patch, but I'm hoping to keep the code structure consistent between
the two, if the result is acceptable.)




Seems reasonable.



Two notes that I wanted to point out explicitly:
- On the other thread, Daniel contributed a destroyStringInfo()
counterpart for makeStringInfo(), which is optional but seemed useful
to include here.



yeah, although maybe worth a different patch.



- After this patch, if a frontend client calls json_errdetail()
without calling freeJsonLexContext(), it will leak memory.



Not too concerned about this:


1. we tend to be a bit more relaxed about that in frontend programs, 
especially those not expected to run for long times and especially on 
error paths, where in many cases the program will just exit anyway.


2. the fix is simple where it's needed.
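
For illustration, a frontend caller would then follow a pattern
something like this (a sketch only; exact signatures vary a little
between branches):

    const char *str = "{\"key\": 42}";    /* some JSON input */
    JsonLexContext *lex;
    JsonParseErrorType err;

    lex = makeJsonLexContextCstringLen(NULL, str, strlen(str),
                                       PG_UTF8, true);
    err = pg_parse_json(lex, &nullSemAction);
    if (err != JSON_SUCCESS)
        fprintf(stderr, "could not parse JSON: %s\n",
                json_errdetail(err, lex));
    freeJsonLexContext(lex);    /* releases anything json_errdetail allocated */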


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PROPOSAL] Skip test citext_utf8 on Windows

2024-03-12 Thread Andrew Dunstan



On 2024-03-11 Mo 22:50, Thomas Munro wrote:

On Tue, Mar 12, 2024 at 2:56 PM Andrew Dunstan  wrote:

On 2024-03-11 Mo 04:21, Oleg Tselebrovskiy wrote:

Greetings, everyone!

While running "installchecks" on databases with UTF-8 encoding the test
citext_utf8 fails because of Turkish dotted I like this:

  SELECT 'i'::citext = 'İ'::citext AS t;
   t
  ---
- t
+ f
  (1 row)

I tried to replicate the test's results by hand and with any collation
that I tried (including --locale="Turkish") this test failed

Also an interesting result of my testing. If you initialize your DB
with -E utf-8 --locale="Turkish" and then run select LOWER('İ');
the output will be this:
  lower
---
  İ
(1 row)

Which I find strange, since lower() uses the collation that was passed
(default in this case but still)

Wouldn't we be better off finding a Windows fix for this, instead of
sweeping it under the rug?

Given the sorry state of our Windows locale support, I've started
wondering about deleting it and telling users to adopt our nascent
built-in support or ICU[1].

This other thread [2] says the sorting is intransitive so I don't
think it really meets our needs anyway.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGJhV__g_TJ0jVqPbnTuqT%2B%2BM6KFv2wj%2B9AV-cABNCXN6Q%40mail.gmail.com#bc35c0b88962ff8c24c27aecc1bca72e
[2] 
https://www.postgresql.org/message-id/flat/1407a2c0-062b-4e4c-b728-438fdff5cb07%40manitou-mail.org



Makes more sense than just hacking the tests to avoid running them on 
Windows. (I also didn't much like doing it by parsing the version 
string, although I know there's at least one precedent for doing that.)



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Transaction timeout

2024-03-12 Thread Alexander Korotkov
On Tue, Mar 12, 2024 at 10:28 AM Andrey M. Borodin  wrote:
> > On 11 Mar 2024, at 16:18, Alexander Korotkov  wrote:
> >
> > I think if checking psql stderr is problematic, checking just logs is
> > fine.  Could we wait for the relevant log messages one by one with
> > $node->wait_for_log() just like 040_standby_failover_slots_sync.pl do?
>
> PFA version with $node->wait_for_log()

I've slightly revised the patch.  I'm going to push it if no objections.

--
Regards,
Alexander Korotkov


v5-0001-Add-TAP-tests-for-timeouts.patch
Description: Binary data


Re: Add basic tests for the low-level backup method.

2024-03-12 Thread David Steele

On 2/29/24 16:55, Michael Paquier wrote:

On Thu, Feb 29, 2024 at 10:30:52AM +1300, David Steele wrote:

On 11/12/23 08:21, David Steele wrote:

As promised in [1], attached are some basic tests for the low-level
backup method.


Added to the 2024-03 CF.


There is already 040_standby_failover_slots_sync.pl in recovery/ that
uses the number of your test script.  You may want to bump it, that's
a nit.


Renamed to 042_low_level_backup.pl.


+unlink("$backup_dir/postmaster.pid")
+   or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid");
+unlink("$backup_dir/postmaster.opts")
+   or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts");
+unlink("$backup_dir/global/pg_control")
+   or BAIL_OUT("unable to unlink $backup_dir/global/pg_control");

RELCACHE_INIT_FILENAME as well?


I'm not trying to implement the full exclusion list here, just enough to 
get the test working. Since exclusions are optional according to the 
docs, I don't think we need them for a valid test.



+# Rather than writing out backup_label, try to recover the backup without
+# backup_label to demonstrate that recovery will not work correctly without it,
+# i.e. the canary table will be missing and the cluster will be corrupt. 
Provide
+# only the WAL segment that recovery will think it needs.

Okay, why not.  No objections to this addition.  I am a bit surprised
that this is not scanning some of the logs produced by the startup
process for particular patterns.


Not sure what to look for here. There are no distinct messages for crash 
recovery. Perhaps there should be?



+# Save backup_label into the backup directory and recover using the primary's
+# archive. This time recovery will succeed and the canary table will be
+# present.

Here are well, I think that we should add some log_contains() with
pre-defined patterns to show that recovery has completed the way we
want it with a backup_label up to the end-of-backup record.


Sure, I added a check for the new log message when recovering with a 
backup_label.


Regards,
-DavidFrom 076429a578ece3c213525220a4961cb5b56c190a Mon Sep 17 00:00:00 2001
From: David Steele 
Date: Wed, 13 Mar 2024 00:01:55 +
Subject: Add basic tests for the low-level backup method.

There are currently no tests for the low-level backup method. pg_backup_start()
and pg_backup_stop() are called but not exercised in any real fashion.

There is a lot more that can be done, but this at least provides some basic
tests and provides a place for future improvement.
---
 src/test/recovery/t/042_low_level_backup.pl | 108 
 1 file changed, 108 insertions(+)
 create mode 100644 src/test/recovery/t/042_low_level_backup.pl

diff --git a/src/test/recovery/t/042_low_level_backup.pl 
b/src/test/recovery/t/042_low_level_backup.pl
new file mode 100644
index 00..22668bdc0d
--- /dev/null
+++ b/src/test/recovery/t/042_low_level_backup.pl
@@ -0,0 +1,108 @@
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+
+# Test low-level backup method by using pg_backup_start() and pg_backup_stop()
+# to create backups.
+
+use strict;
+use warnings;
+
+use File::Copy qw(copy);
+use File::Path qw(remove_tree);
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Start primary node with archiving
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(has_archiving => 1, allows_streaming => 1);
+$node_primary->start;
+
+# Start backup
+my $backup_name = 'backup1';
+my $psql = $node_primary->background_psql('postgres');
+
+$psql->query_safe("SET client_min_messages TO WARNING");
+$psql->set_query_timer_restart();
+$psql->query_safe("select pg_backup_start('test label')");
+
+# Copy files
+my $backup_dir = $node_primary->backup_dir() . '/' . $backup_name;
+
+PostgreSQL::Test::RecursiveCopy::copypath(
+   $node_primary->data_dir(), $backup_dir);
+
+# Clean up some files/paths that should not be in the backup. There is no
+# attempt to replicate all the exclusions done by pg_basebackup here, in part
+# because
+# they are not required, but also to keep the test simple.
+#
+# Also remove pg_control because it needs to be copied later and it will be
+# obvious if the copy fails.
+unlink("$backup_dir/postmaster.pid")
+   or BAIL_OUT("unable to unlink $backup_dir/postmaster.pid");
+unlink("$backup_dir/postmaster.opts")
+   or BAIL_OUT("unable to unlink $backup_dir/postmaster.opts");
+unlink("$backup_dir/global/pg_control")
+   or BAIL_OUT("unable to unlink $backup_dir/global/pg_control");
+
+remove_tree("$backup_dir/pg_wal", {keep_root => 1})
+   or BAIL_OUT("unable to unlink contents of $backup_dir/pg_wal");
+
+# Create a table that will be used to verify that recovery started at the
+# correct location, rather than the location recorded in pg_control
+$psql->query_safe("create table canary (id int)");
+
+# Advance the checkpoint location in pg_control past the location where the
+# backup started. Switch the WAL to make it really ob

Re: Disable LLVM bitcode generation with pgxs.mk framework.

2024-03-12 Thread Xing Guo
> On Tue, Mar 12, 2024 at 10:40 PM Daniel Gustafsson  wrote:
>
> > On 12 Mar 2024, at 14:38, Xing Guo  wrote:
>
> > Would it be possible to add a new switch in the pgxs.mk framework to
> > allow users to disable this feature?
>
> Something like that doesn't seem unreasonable I think.

Thanks.

I added a new option NO_LLVM_BITCODE to pgxs. I'm not sure if the name
is appropriate.

> --
> Daniel Gustafsson
>
From e19a724fad4949bef9bc4d0f8e58719607d979be Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Wed, 13 Mar 2024 07:56:46 +0800
Subject: [PATCH v1] Add NO_LLVM_BITCODE option to pgxs.

This patch adds a new option NO_LLVM_BITCODE to pgxs to allow users to
disable LLVM bitcode generation completely even if the PostgreSQL
installation is configured with --with-llvm.
---
 doc/src/sgml/extend.sgml | 9 +
 src/makefiles/pgxs.mk| 6 ++
 2 files changed, 15 insertions(+)

diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index 218940ee5c..6fe69746c2 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -1719,6 +1719,15 @@ include $(PGXS)
   
  
 
+ 
+  NO_LLVM_BITCODE
+  
+   
+don't generate LLVM bitcode even if the current PostgreSQL installation is configured with --with-llvm
+   
+  
+ 
+
  
   EXTRA_CLEAN
   
diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk
index 0de3737e78..ec6a3c1f09 100644
--- a/src/makefiles/pgxs.mk
+++ b/src/makefiles/pgxs.mk
@@ -53,6 +53,8 @@
 # that don't need their build products to be installed
 #   NO_INSTALLCHECK -- don't define an installcheck target, useful e.g. if
 # tests require special configuration, or don't use pg_regress
+#   NO_LLVM_BITCODE -- don't generate LLVM bitcode even if the current
+# PostgreSQL installation is configured with --with-llvm
 #   EXTRA_CLEAN -- extra files to remove in 'make clean'
 #   PG_CPPFLAGS -- will be prepended to CPPFLAGS
 #   PG_CFLAGS -- will be appended to CFLAGS
@@ -218,6 +220,10 @@ endef
 
 all: $(PROGRAM) $(DATA_built) $(HEADER_allbuilt) $(SCRIPTS_built) $(addsuffix $(DLSUFFIX), $(MODULES)) $(addsuffix .control, $(EXTENSION))
 
+ifdef NO_LLVM_BITCODE
+with_llvm := no
+endif # NO_LLVM_BITCODE
+
 ifeq ($(with_llvm), yes)
 all: $(addsuffix .bc, $(MODULES)) $(patsubst %.o,%.bc, $(OBJS))
 endif
-- 
2.44.0



Re: meson: Specify -Wformat as a common warning flag for extensions

2024-03-12 Thread Sutou Kouhei
Hi,

In 
  "Re: meson: Specify -Wformat as a common warning flag for extensions" on Fri, 
08 Mar 2024 10:05:27 -0600,
  "Tristan Partin"  wrote:

> Ok, I figured this out. -Wall implies -Wformat=1. We set warning_level
> to 1 in the Meson project() call, which implies -Wall, and set -Wall
> in CFLAGS for autoconf. That's the reason we don't get issues building
> Postgres. A user making use of the pg_config --cflags option, as Sutou
> is, *will* run into the aforementioned issues, since we don't
> propogate -Wall into pg_config.
> 
>   $ gcc $(pg_config --cflags) -E - < /dev/null > /dev/null
>   cc1: warning: ‘-Wformat-security’ ignored without ‘-Wformat’
>   [-Wformat-security]
>   $ gcc -Wall $(pg_config --cflags) -E - < /dev/null > /dev/null
>   (nothing printed)

Thanks for explaining this. You're right. This is the reason
why we don't need this for PostgreSQL itself but we need
this for PostgreSQL extensions. Sorry. I should have
explained this in the first e-mail...


What should we do to proceed with this patch?


Thanks,
-- 
kou


Re: Add new error_action COPY ON_ERROR "log"

2024-03-12 Thread Michael Paquier
On Tue, Mar 12, 2024 at 12:54:29PM +0530, Bharath Rupireddy wrote:
> +1. But, do you want to add them now or later as a separate
> patch/discussion altogether?

The attached 0003 is what I had in mind:
- Simplification of the LOG generated with quotes applied around the
column name, don't see much need to add the relation name, either, for
consistency and because the knowledge is known in the query.
- A few more tests.
- Some doc changes.

>> Wouldn't it be better to squash the patches together, by the way?
> 
> I guess not. They are two different things IMV.

Well, 0001 is sitting doing nothing because the COPY code does not
make use of it internally.
--
Michael
From c45474726e084faf876a319485995ce84eef8293 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 11 Mar 2024 11:37:49 +
Subject: [PATCH v6 1/3] Add LOG_VERBOSITY option to COPY command

This commit adds a new option LOG_VERBOSITY to set the verbosity of
messages logged by the COPY command. A value of 'verbose' can be used
to emit more informative messages by the command, while the value
of 'default' (which is the default) can be used to not log any
additional messages. More values such as 'terse', 'row_details',
etc. can be added to the LOG_VERBOSITY option as needed.

An upcoming commit for emitting more info on soft errors by
COPY FROM command with ON_ERROR 'ignore' uses this.

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Masahiko Sawada
Reviewed-by: Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CALj2ACXNA0focNeriYRvQQaCGc4CsTuOnFbzF9LqTKNWxuJdhA%40mail.gmail.com
---
 src/include/commands/copy.h | 10 
 src/backend/commands/copy.c | 38 +
 src/bin/psql/tab-complete.c |  6 -
 src/test/regress/expected/copy2.out |  8 ++
 src/test/regress/sql/copy2.sql  |  2 ++
 doc/src/sgml/ref/copy.sgml  | 14 +++
 src/tools/pgindent/typedefs.list|  1 +
 7 files changed, 78 insertions(+), 1 deletion(-)

diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b3da3cb0be..99d183fa4d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -40,6 +40,15 @@ typedef enum CopyOnErrorChoice
 	COPY_ON_ERROR_IGNORE,		/* ignore errors */
 } CopyOnErrorChoice;
 
+/*
+ * Represents verbosity of logged messages by COPY command.
+ */
+typedef enum CopyLogVerbosityChoice
+{
+	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
+	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+} CopyLogVerbosityChoice;
+
 /*
  * A struct to hold COPY options, in a parsed form. All of these are related
  * to formatting, except for 'freeze', which doesn't really belong here, but
@@ -73,6 +82,7 @@ typedef struct CopyFormatOptions
 	bool	   *force_null_flags;	/* per-column CSV FN flags */
 	bool		convert_selectively;	/* do selective binary conversion? */
 	CopyOnErrorChoice on_error; /* what to do when error happened */
+	CopyLogVerbosityChoice log_verbosity;	/* verbosity of logged messages */
 	List	   *convert_select; /* list of column names (can be NIL) */
 } CopyFormatOptions;
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 056b6733c8..23eb8c9c79 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -428,6 +428,36 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 	return COPY_ON_ERROR_STOP;	/* keep compiler quiet */
 }
 
+/*
+ * Extract a CopyLogVerbosityChoice value from a DefElem.
+ */
+static CopyLogVerbosityChoice
+defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
+{
+	char	   *sval;
+
+	/*
+	 * If no parameter value given, assume the default value.
+	 */
+	if (def->arg == NULL)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+
+	/*
+	 * Allow "default", or "verbose" values.
+	 */
+	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "default") == 0)
+		return COPY_LOG_VERBOSITY_DEFAULT;
+	if (pg_strcasecmp(sval, "verbose") == 0)
+		return COPY_LOG_VERBOSITY_VERBOSE;
+
+	ereport(ERROR,
+			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			 errmsg("COPY LOG_VERBOSITY \"%s\" not recognized", sval),
+			 parser_errposition(pstate, def->location)));
+	return COPY_LOG_VERBOSITY_DEFAULT;	/* keep compiler quiet */
+}
+
 /*
  * Process the statement option list for COPY.
  *
@@ -454,6 +484,7 @@ ProcessCopyOptions(ParseState *pstate,
 	bool		freeze_specified = false;
 	bool		header_specified = false;
 	bool		on_error_specified = false;
+	bool		log_verbosity_specified = false;
 	ListCell   *option;
 
 	/* Support external use for option sanity checking */
@@ -613,6 +644,13 @@ ProcessCopyOptions(ParseState *pstate,
 			on_error_specified = true;
 			opts_out->on_error = defGetCopyOnErrorChoice(defel, pstate, is_from);
 		}
+		else if (strcmp(defel->defname, "log_verbosity") == 0)
+		{
+			if (log_verbosity_specified)
+errorConflictingDefElem(defel, pstate);
+			log_verbosity_specified = true;
+			opts_out->

Re: perl: unsafe empty pattern behavior

2024-03-12 Thread Tom Lane
Jeff Davis  writes:
> On Tue, 2024-03-12 at 18:53 +0100, Alvaro Herrera wrote:
>> I also tried grepping (for things
>> like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you
>> have ... but I only looked for the "qr" literal, not other ways to
>> get regexes.

> I think that's fine. qr// seems the most dangerous, because it seems to
> behave differently in different versions of perl.

I wonder whether perlcritic has sufficiently deep understanding of
Perl code that it could find these hazards.  I already checked,
and found that there's no built-in filter for this (at least not
in the perlcritic version I have), but maybe we could write one?
The rules seem to be plug-in modules, so you can make your own
in principle.

regards, tom lane




Re: perl: unsafe empty pattern behavior

2024-03-12 Thread Jeff Davis
On Tue, 2024-03-12 at 18:53 +0100, Alvaro Herrera wrote:
> I suggest that pg_dump/t/002_pg_dump.pl could use a verification that
> the ->{regexp} thing is not empty.

I'm not sure how exactly to test for an empty pattern. The problem is,
we don't really want to test for an empty pattern, because /(?^:)/ is
fine. The problem is //, which gets turned into an actual pattern
(perhaps empty or perhaps not), and by the time it's in the %tests
hash, I think it's too late to distinguish.

Again, I'm not a perl expert, so suggestions welcome.

>   I also tried grepping (for things
> like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you
> have ... but I only looked for the "qr" literal, not other ways to
> get
> regexes.

I think that's fine. qr// seems the most dangerous, because it seems to
behave differently in different versions of perl.

Grepping for regexes in perl code is an "interesting" exercise.

Regards,
Jeff Davis





Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 19:28, Alvaro Herrera  wrote:
>
> On 2024-Mar-12, Alvaro Herrera wrote:
>
> > Hmm, buildfarm member kestrel (which uses
> > -fsanitize=undefined,alignment) failed:
> >
> > # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc
> > dbname='postgres'
> > test cancellations...
> > libpq_pipeline:260: query did not fail when it was expected
>
> Hm, I tried using the same compile flags, couldn't reproduce.

Okay, it seems it passed now, so I guess this test is flaky somehow.
The error message and the timing difference between the failed and
succeeded buildfarm runs clearly indicate that the pg_sleep ran its
180 seconds to completion (so cancel was never processed for some
reason).

**failed case**
282/285 postgresql:libpq_pipeline / libpq_pipeline/001_libpq_pipeline   ERROR   191.56s   exit status 1

**succeeded case**

252/285 postgresql:libpq_pipeline / libpq_pipeline/001_libpq_pipeline   OK   10.01s   21 subtests passed

I don't see any obvious reason why this test would be flaky, but
I'll think a bit more about it tomorrow.




Re: Spurious pgstat_drop_replslot() call

2024-03-12 Thread Michael Paquier
On Tue, Mar 12, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote:
> Sorry for a bit off-topic, but I've remembered an anomaly with a similar
> assert:
> https://www.postgresql.org/message-id/17947-b9554521ad963c9c%40postgresql.org

Thanks for the reminder.  The invalidation path with the stats drop is
only in 16~.

> Maybe you would find it worth considering while working in this area...
> (I've just run that reproducer on b36fbd9f8 and confirmed that the
> assertion failure is still here.)

Indeed, something needs to happen.  I am not surprised that it still
reproduces; nothing has changed with the locking of the stats entries.
:/
--
Michael




Recent 027_streaming_regress.pl hangs

2024-03-12 Thread Thomas Munro
Hi,

Several animals are timing out while waiting for catchup,
sporadically.  I don't know why.  The oldest example I have found so
far by clicking around is:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-02-23%2015%3A44%3A35

So perhaps something committed ~3 weeks ago triggered this.

There are many examples since, showing as recoveryCheck failures.
Apparently they are all on animals wrangled by Andres.  Hmm.  I think
some/all share a physical host, they seem to have quite high run time
variance, and they're using meson.




Re: Vectored I/O in bulk_write.c

2024-03-12 Thread Thomas Munro
One more observation while I'm thinking about bulk_write.c... hmm, it
writes the data out and asks the checkpointer to fsync it, but doesn't
call smgrwriteback().  I assume that means that on Linux the physical
writeback sometimes won't happen until the checkpointer eventually
calls fsync() sequentially, one segment file at a time.  I see that
it's difficult to decide how to do that though; unlike checkpoints,
which have rate control/spreading, bulk writes could more easily flood
the I/O subsystem in a burst.  I expect most non-Linux systems'
write-behind heuristics to fire up for bulk sequential writes, but on
Linux where most users live, there is no write-behind heuristic AFAIK
(on the most common file systems anyway), so you have to crank that
handle if you want it to wake up and smell the coffee before it hits
internal limits, but then you have to decide how fast to crank it.
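
(For the record, the call I have in mind is nothing fancier than
something like the following after each run of blocks is written out
-- the hard part is deciding when and over how many blocks:

    /* hint the kernel to start writeback of the blocks just written */
    smgrwriteback(bulkstate->smgr, bulkstate->forknum,
                  first_block, nblocks);

where first_block/nblocks would track the most recent flush point.)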

This problem will come into closer focus when we start talking about
streaming writes.  For the current non-streaming bulk_write.c coding,
I don't have any particular idea of what to do about that, so I'm just
noting the observation here.

Sorry for the sudden wall of text/monologue; this is all a sort of
reaction to bulk_write.c that I should perhaps have sent to the
bulk_write.c thread, triggered by a couple of debugging sessions.




Re: On disable_cost

2024-03-12 Thread Tom Lane
David Rowley  writes:
> So maybe the fix could be to set disable_cost to something like
> 1.0e110 and adjust compare_path_costs_fuzzily to not apply the
> fuzz_factor for paths >= disable_cost.   However, I wonder if that
> risks the costs going infinite after a couple of cartesian joins.

Perhaps.  It still does nothing for Robert's point that once we're
forced into using a "disabled" plan type, it'd be better if the
disabled-ness didn't skew subsequent planning choices.

On the whole I agree that getting rid of disable_cost entirely
would be the way to go, if we can replace that with a separate
boolean without driving up the cost of add_path too much.

regards, tom lane




Re: un-revert the MAINTAIN privilege and the pg_maintain predefined role

2024-03-12 Thread Nathan Bossart
On Thu, Mar 07, 2024 at 10:50:00AM -0600, Nathan Bossart wrote:
> Given all of this code was previously reviewed and committed, I am planning
> to forge ahead and commit this early next week, provided no objections or
> additional feedback materialize.

Jeff Davis and I spent some additional time looking at this patch.  There
are existing inconsistencies among the privilege checks for the various
maintenance commands, and the MAINTAIN privilege just builds on the status
quo, with one exception.  In the v1 patch, I proposed skipping privilege
checks when VACUUM recurses to TOAST tables, which means that a user may be
able to process a TOAST table for which they've concurrently lost privileges
on the main relation (since each table is vacuumed in a separate
transaction).  It's easy enough to resolve this inconsistency by sending
down the parent OID when recursing to a TOAST table and using that for the
privilege checks.  AFAICT this avoids any kind of cache lookup hazards
because we hold a session lock on the main relation in this case.  I've
done this in the attached v2.
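
In sketch form, the recursion hands the main table's OID down and
prefers it for the ACL check (variable names here are illustrative,
not necessarily what the patch uses):

    /* in vacuum_rel(), when processing a TOAST table */
    Oid    priv_relid = OidIsValid(toast_parent_oid) ?
        toast_parent_oid : RelationGetRelid(rel);

    if (!vacuum_is_permitted_for_relation(priv_relid, rel->rd_rel,
                                          params->options))
        return false;   /* skip the relation */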

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 1d58f64b1dcded53bd95c926c5476e129352c753 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 12 Mar 2024 10:55:33 -0500
Subject: [PATCH v2 1/1] Reintroduce MAINTAIN privilege and pg_maintain
 predefined role.

Roles with MAINTAIN on a relation may run VACUUM, ANALYZE, REINDEX,
REFRESH MATERIALIZE VIEW, CLUSTER, and LOCK TABLE on the relation.
Roles with privileges of pg_maintain may run those same commands on
all relations.

This was previously committed for v16, but it was reverted in
commit 151c22deee due to concerns about search_path tricks that
could be used to escalate privileges to the table owner.  Commits
2af07e2f74, 59825d1639, and c7ea3f4229 resolved these concerns by
restricting search_path when running maintenance commands.

XXX: NEEDS CATVERSION BUMP

Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/20240305161235.GA3478007%40nathanxps13
---
 doc/src/sgml/ddl.sgml |  35 --
 doc/src/sgml/func.sgml|   2 +-
 .../sgml/ref/alter_default_privileges.sgml|   4 +-
 doc/src/sgml/ref/analyze.sgml |   6 +-
 doc/src/sgml/ref/cluster.sgml |  10 +-
 doc/src/sgml/ref/grant.sgml   |   3 +-
 doc/src/sgml/ref/lock.sgml|   4 +-
 .../sgml/ref/refresh_materialized_view.sgml   |   5 +-
 doc/src/sgml/ref/reindex.sgml |  23 ++--
 doc/src/sgml/ref/revoke.sgml  |   2 +-
 doc/src/sgml/ref/vacuum.sgml  |   6 +-
 doc/src/sgml/user-manag.sgml  |  12 ++
 src/backend/catalog/aclchk.c  |  15 +++
 src/backend/commands/analyze.c|  13 +-
 src/backend/commands/cluster.c|  43 +--
 src/backend/commands/indexcmds.c  |  34 ++---
 src/backend/commands/lockcmds.c   |   2 +-
 src/backend/commands/matview.c|   3 +-
 src/backend/commands/tablecmds.c  |  18 +--
 src/backend/commands/vacuum.c |  76 +++-
 src/backend/postmaster/autovacuum.c   |   1 +
 src/backend/utils/adt/acl.c   |   8 ++
 src/bin/pg_dump/dumputils.c   |   1 +
 src/bin/pg_dump/t/002_pg_dump.pl  |   2 +-
 src/bin/psql/tab-complete.c   |   6 +-
 src/include/catalog/pg_authid.dat |   5 +
 src/include/commands/tablecmds.h  |   5 +-
 src/include/commands/vacuum.h |   5 +-
 src/include/nodes/parsenodes.h|   3 +-
 src/include/utils/acl.h   |   5 +-
 .../expected/cluster-conflict-partition.out   |   8 +-
 .../specs/cluster-conflict-partition.spec |   2 +-
 .../perl/PostgreSQL/Test/AdjustUpgrade.pm |  11 ++
 src/test/regress/expected/cluster.out |   7 ++
 src/test/regress/expected/create_index.out|   4 +-
 src/test/regress/expected/dependency.out  |  22 ++--
 src/test/regress/expected/privileges.out  | 116 ++
 src/test/regress/expected/rowsecurity.out |  34 ++---
 src/test/regress/sql/cluster.sql  |   5 +
 src/test/regress/sql/dependency.sql   |   2 +-
 src/test/regress/sql/privileges.sql   |  67 ++
 41 files changed, 456 insertions(+), 179 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 9d7e2c756b..8616a8e9cc 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1868,8 +1868,8 @@ ALTER TABLE products RENAME TO items;
INSERT, UPDATE, DELETE,
TRUNCATE, REFERENCES, TRIGGER,
CREATE, CONNECT, TEMPORARY,
-   EXECUTE, USAGE, SET
-   and ALTER SYSTEM.
+   EXECUTE, USAGE, SET,
+   ALTER SYSTEM, and MAINTAIN.
The privileges applicable to a particular
object vary depending on the object's type (table, function, etc.).
More detail about the meanings

Re: On disable_cost

2024-03-12 Thread David Rowley
On Wed, 13 Mar 2024 at 08:55, Robert Haas  wrote:
> But in the absence of that, we need some way to privilege the
> non-disabled paths over the disabled ones -- and I'd prefer to have
> something more principled than disable_cost, if we can work out the
> details.

The primary place I see issues with disabled_cost is caused by
STD_FUZZ_FACTOR.  When you add 1.0e10 to a couple of modestly costly
paths, it makes them appear fuzzily the same in cases where one could
be significantly cheaper than the other. If we were to bump up the
disable_cost it would make this problem worse.

I think we do still need some way to pick the cheapest disabled path
when there are no other options.

One way would be to set fuzz_factor to 1.0 when either of the paths
costs in compare_path_costs_fuzzily() is >= disable_cost.
clamp_row_est() does cap row estimates at MAXIMUM_ROWCOUNT (1e100), so
I think there is some value of disable_cost that could almost
certainly ensure we don't reach it because the path is truly expensive
rather than disabled.

So maybe the fix could be to set disable_cost to something like
1.0e110 and adjust compare_path_costs_fuzzily to not apply the
fuzz_factor for paths >= disable_cost.   However, I wonder if that
risks the costs going infinite after a couple of cartesian joins.
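
In sketch form, the compare_path_costs_fuzzily() side of that would be
little more than:

    /* compare disabled paths exactly rather than fuzzily */
    if (path1->total_cost >= disable_cost ||
        path2->total_cost >= disable_cost)
        fuzz_factor = 1.0;

placed ahead of the existing fuzzy comparisons.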

David




Re: remaining sql/json patches

2024-03-12 Thread Alvaro Herrera
About 0002:

I think we should just drop it.  Look at the changes it produces in the
plans for aliases XMLTABLE:

> @@ -1556,7 +1556,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
>          Output: f."COUNTRY_NAME", f."REGION_ID"
>          ->  Seq Scan on public.xmldata
>                Output: xmldata.data
> -        ->  Table Function Scan on "xmltable" f
> +        ->  Table Function Scan on "XMLTABLE" f
>                Output: f."COUNTRY_NAME", f."REGION_ID"
>                Table Function Call: XMLTABLE(('/ROWS/ROW[COUNTRY_NAME="Japan" or COUNTRY_NAME="India"]'::text) PASSING (xmldata.data) COLUMNS "COUNTRY_NAME" text, "REGION_ID" integer)
>                Filter: (f."COUNTRY_NAME" = 'Japan'::text)

Here in text-format EXPLAIN, we already have the alias next to the
"xmltable" moniker, when an alias is present.  This matches the
query itself as well as the labels used in the "Output:" display.
If an alias is not present, then this says just 'Table Function Scan on 
"xmltable"'
and the rest of the plan refers to this as "xmltable", so it's also
fine.

> @@ -1591,7 +1591,7 @@ SELECT f.* FROM xmldata, LATERAL xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
>          "Parent Relationship": "Inner",                    +
>          "Parallel Aware": false,                           +
>          "Async Capable": false,                            +
> -        "Table Function Name": "xmltable",                 +
> +        "Table Function Name": "XMLTABLE",                 +
>          "Alias": "f",                                      +
>          "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],  +
>          "Table Function Call": "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+

This is the JSON-format explain.  Notice that the "Alias" member already
shows the alias "f", so the only thing this change is doing is
uppercasing the "xmltable" to "XMLTABLE".  We're not really achieving
anything here.

I think the only salvageable piece from this, **if anything**, is making
the "xmltable" literal string into uppercase.  That might bring a little
clarity to the fact that this is a keyword and not a user-introduced
name.


In your 0003 I think this would only have relevance in this query,

+-- JSON_TABLE() with alias
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT * FROM
+   JSON_TABLE(
+   jsonb 'null', 'lax $[*]' PASSING 1 + 2 AS a, json '"foo"' AS "b c"
+   COLUMNS (
+   id FOR ORDINALITY,
+   "int" int PATH '$',
+   "text" text PATH '$'
+   )) json_table_func;
+                                                                            QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Table Function Scan on "JSON_TABLE" json_table_func
+   Output: id, "int", text
+   Table Function Call: JSON_TABLE('null'::jsonb, '$[*]' AS json_table_path_0 PASSING 3 AS a, '"foo"'::jsonb AS "b c" COLUMNS (id FOR ORDINALITY, "int" integer PATH '$', text text PATH '$') PLAN (json_table_path_0))
+(3 rows)

and I'm curious to see what this would output if this was to be run
without the 0002 patch.  If I understand things correctly, the alias
would be displayed anyway, meaning 0002 doesn't get us anything.

Please do add a test with EXPLAIN (FORMAT JSON) in 0003.

Thanks

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"




Re: typo in paths.h

2024-03-12 Thread David Rowley
On Wed, 13 Mar 2024 at 07:58, Cary Huang  wrote:
> I noticed that the comment for declaring create_tidscan_paths() in 
> src/include/optimizer/paths.h has a typo. The function is implemented in 
> tidpath.c, not tidpath.h as stated, which does not exist.

Thank you.  Pushed.

David




Re: On disable_cost

2024-03-12 Thread Robert Haas
On Tue, Mar 12, 2024 at 3:36 PM Tom Lane  wrote:
> Yeah.  I keep thinking that the right solution is to not generate
> disabled paths in the first place if there are any other ways to
> produce the same relation.  That has obvious order-of-operations
> problems though, and I've not been able to make it work.

I've expressed the same view in the past. It would be nice not to
waste planner effort on paths that we're just going to throw away, but
I'm not entirely sure what you mean by "obvious order-of-operations
problems."

To me, it seems like what we'd need is to be able to restart the whole
planner process if we run out of steam before we get done. For
example, suppose we're planning a 2-way join where index and
index-only scans are disabled, sorts are disabled, and nested loops
and hash joins are disabled. There's no problem generating just the
non-disabled scan types at the baserel level, but when we reach the
join, we're going to find that the only non-disabled join type is a
merge join, and we're also going to find that we have no paths that
provide pre-sorted input, so we need to sort, which we're also not
allowed to do. If we could give up at that point and restart planning,
disabling all of the plan-choice constraints and now creating all
paths for each RelOptInfo, then everything would, I believe, be just
fine. We'd end up needing neither disable_cost nor the mechanism
proposed by this patch.
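
In pseudocode, the retry idea is nothing more than this (purely
illustrative; no such mechanism exists today, and
ignore_path_disabling is a hypothetical flag):

    final_rel = make_one_rel(root, joinlist);
    if (final_rel->pathlist == NIL)
    {
        /* no plan satisfies the enable_* settings: replan without them */
        root->ignore_path_disabling = true;
        final_rel = make_one_rel(root, joinlist);
    }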

But in the absence of that, we need some way to privilege the
non-disabled paths over the disabled ones -- and I'd prefer to have
something more principled than disable_cost, if we can work out the
details.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: On disable_cost

2024-03-12 Thread Tom Lane
Robert Haas  writes:
> On Tue, Mar 12, 2024 at 1:32 PM Tom Lane  wrote:
>> BTW, having written that paragraph, I wonder if we couldn't get
>> the same end result with a nearly one-line change that consists of
>> making disable_cost be IEEE infinity.

> I don't think so, because I think that what will happen in that case
> is that we'll pick a completely random plan if we can't pick a plan
> that avoids incurring disable_cost. Every plan that contains one
> disabled node anywhere in the plan tree will look like it has exactly
> the same cost as any other such plan.

Good point.

> IMHO, this is actually one of the problems with disable_cost as it
> works today.

Yeah.  I keep thinking that the right solution is to not generate
disabled paths in the first place if there are any other ways to
produce the same relation.  That has obvious order-of-operations
problems though, and I've not been able to make it work.

regards, tom lane




typo in paths.h

2024-03-12 Thread Cary Huang
Hello



I noticed that the comment for declaring create_tidscan_paths() in 
src/include/optimizer/paths.h has a typo. The function is implemented in 
tidpath.c, not tidpath.h as stated, which does not exist.



Made a small patch to correct it.



Thank you





Cary Huang

-

HighGo Software Inc. (Canada)

mailto:cary.hu...@highgo.ca

http://www.highgo.ca

correct_source_filename.patch
Description: Binary data


Broken EXPLAIN output for SubPlan in MERGE

2024-03-12 Thread Dean Rasheed
While playing around with EXPLAIN and SubPlans, I noticed that there's
a bug in how this is handled for MERGE. For example:

drop table if exists src, tgt, ref;
create table src (a int, b text);
create table tgt (a int, b text);
create table ref (a int);

explain (verbose, costs off)
merge into tgt t
  using (select (select r.a from ref r where r.a = s.a) a, b from src s) s
  on t.a = s.a
  when not matched then insert values (s.a, s.b);

QUERY PLAN
---
 Merge on public.tgt t
   ->  Merge Left Join
 Output: t.ctid, s.a, s.b, s.ctid
 Merge Cond: (((SubPlan 1)) = t.a)
 ->  Sort
   Output: s.a, s.b, s.ctid, ((SubPlan 1))
   Sort Key: ((SubPlan 1))
   ->  Seq Scan on public.src s
 Output: s.a, s.b, s.ctid, (SubPlan 1)
 SubPlan 1
   ->  Seq Scan on public.ref r
 Output: r.a
 Filter: (r.a = s.a)
 ->  Sort
   Output: t.ctid, t.a
   Sort Key: t.a
   ->  Seq Scan on public.tgt t
 Output: t.ctid, t.a
   SubPlan 2
 ->  Seq Scan on public.ref r_1
   Output: r_1.a
   Filter: (r_1.a = t.ctid)

The final filter condition "(r_1.a = t.ctid)" is incorrect, and should
be "(r_1.a = s.a)".

What's happening is that the right hand side of that filter expression
is an input Param node which get_parameter() tries to display by
calling find_param_referent() and then drilling down through the
ancestor node (the ModifyTable node) to try to find the real name of
the variable (s.a).

However, that isn't working properly for MERGE because the inner_plan
and inner_tlist of the corresponding deparse_namespace aren't set
correctly. Actually the inner_tlist is correct, but the inner_plan is
set to the ModifyTable node, whereas it needs to be the outer child
node -- in a MERGE, any references to the source relation will be
INNER_VAR references to the targetlist of the join node immediately
under the ModifyTable node.

So I think we want to do something like the attached.

Regards,
Dean
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
new file mode 100644
index 2a1ee69..2231752
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4988,8 +4988,11 @@ set_deparse_plan(deparse_namespace *dpns
 	 * For a WorkTableScan, locate the parent RecursiveUnion plan node and use
 	 * that as INNER referent.
 	 *
-	 * For MERGE, make the inner tlist point to the merge source tlist, which
-	 * is same as the targetlist that the ModifyTable's source plan provides.
+	 * For MERGE, pretend the ModifyTable's source plan (its outer plan) is
+	 * INNER referent.  This is the join from the target relation to the data
+	 * source, and all INNER_VAR Vars in other parts of the query refer to its
+	 * targetlist.
+	 *
 	 * For ON CONFLICT .. UPDATE we just need the inner tlist to point to the
 	 * excluded expression's tlist. (Similar to the SubqueryScan we don't want
 	 * to reuse OUTER, it's used for RETURNING in some modify table cases,
@@ -5004,17 +5007,17 @@ set_deparse_plan(deparse_namespace *dpns
 		dpns->inner_plan = find_recursive_union(dpns,
 (WorkTableScan *) plan);
 	else if (IsA(plan, ModifyTable))
-		dpns->inner_plan = plan;
-	else
-		dpns->inner_plan = innerPlan(plan);
-
-	if (IsA(plan, ModifyTable))
 	{
 		if (((ModifyTable *) plan)->operation == CMD_MERGE)
-			dpns->inner_tlist = dpns->outer_tlist;
+			dpns->inner_plan = outerPlan(plan);
 		else
-			dpns->inner_tlist = ((ModifyTable *) plan)->exclRelTlist;
+			dpns->inner_plan = plan;
 	}
+	else
+		dpns->inner_plan = innerPlan(plan);
+
+	if (IsA(plan, ModifyTable) && ((ModifyTable *) plan)->operation == CMD_INSERT)
+		dpns->inner_tlist = ((ModifyTable *) plan)->exclRelTlist;
 	else if (dpns->inner_plan)
 		dpns->inner_tlist = dpns->inner_plan->targetlist;
 	else
diff --git a/src/test/regress/expected/merge.out b/src/test/regress/expected/merge.out
new file mode 100644
index e3ebf46..1a6f6ad
--- a/src/test/regress/expected/merge.out
+++ b/src/test/regress/expected/merge.out
@@ -1473,6 +1473,56 @@ WHEN MATCHED AND t.a < 10 THEN
 
 DROP TABLE ex_msource, ex_mtarget;
 DROP FUNCTION explain_merge(text);
+-- EXPLAIN SubPlans and InitPlans
+CREATE TABLE src (a int, b int, c int, d int);
+CREATE TABLE tgt (a int, b int, c int, d int);
+CREATE TABLE ref (ab int, cd int);
+EXPLAIN (verbose, costs off)
+MERGE INTO tgt t
+USING (SELECT *, (SELECT count(*) FROM ref r
+   WHERE r.ab = s.a + s.b
+ AND r.cd = s.c - s.d) cnt
+ FROM src s) s
+ON t.a = s.a AND t.b < s.cnt
+WHEN MATCHED AND t.c > s.cnt THEN
+  UPDATE SET (b, c) = (SELECT s.b, s.cnt);
+ QUERY PLAN  
+-

Support json_errdetail in FRONTEND builds

2024-03-12 Thread Jacob Champion
Hello,

Both the incremental JSON [1] and OAuth [2] patchsets would be
improved by json_errdetail(), which was removed from FRONTEND builds
in b44669b2ca:

>The routine providing a detailed error message based on the error code
>is made backend-only, the existing code being unsafe to use in the
>frontend as the error message may finish by being palloc'd or point to a
>static string, so there is no way to know if the memory of the message
>should be pfree'd or not.

Attached is a patch to undo this, by attaching any necessary
allocations to the JsonLexContext so the caller doesn't have to keep
track of them.

This is based on the first patch of the OAuth patchset, which
additionally needs json_errdetail() to be safe to use from libpq
itself. Alvaro pointed out offlist that we don't have to go that far
to re-enable this function for the utilities, so this patch is a sort
of middle ground between what we have now and what OAuth implements.
(There is some additional minimization that could be done to this
patch, but I'm hoping to keep the code structure consistent between
the two, if the result is acceptable.)
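
For illustration, the intended frontend usage would look roughly like
this. It is a sketch only: the exact makeJsonLexContextCstringLen()
signature has shifted between versions, so treat the calls below as
assumptions rather than a settled API:

#include <stdio.h>

#include "common/jsonapi.h"
#include "mb/pg_wchar.h"

static void
check_json(const char *json, size_t len)
{
    JsonLexContext *lex;
    JsonParseErrorType err;

    lex = makeJsonLexContextCstringLen(NULL, json, len, PG_UTF8, false);
    err = pg_parse_json(lex, &nullSemAction);
    if (err != JSON_SUCCESS)
        fprintf(stderr, "invalid JSON: %s\n", json_errdetail(err, lex));

    /* frees the lexer and, with this patch, any error-detail allocation */
    freeJsonLexContext(lex);
}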

Two notes that I wanted to point out explicitly:
- On the other thread, Daniel contributed a destroyStringInfo()
counterpart for makeStringInfo(), which is optional but seemed useful
to include here.
- After this patch, if a frontend client calls json_errdetail()
without calling freeJsonLexContext(), it will leak memory.

Thanks,
--Jacob

[1] 
https://www.postgresql.org/message-id/682c8fff-355c-a04f-57ac-81055c4ccda8%40dunslane.net
[2] 
https://www.postgresql.org/message-id/CAOYmi%2BkKNZCL7uz-LHyBggM%2BfEcf4285pFWwm7spkUb8irY7mQ%40mail.gmail.com


0001-common-jsonapi-support-json_errdetail-in-FRONTEND.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
On 2024-Mar-12, Alvaro Herrera wrote:

> Hmm, buildfarm member kestrel (which uses
> -fsanitize=undefined,alignment) failed:
> 
> # Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc
> dbname='postgres'
> test cancellations... 
> libpq_pipeline:260: query did not fail when it was expected

Hm, I tried using the same compile flags, couldn't reproduce.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)




Re: On disable_cost

2024-03-12 Thread Robert Haas
On Tue, Mar 12, 2024 at 1:32 PM Tom Lane  wrote:
> BTW, having written that paragraph, I wonder if we couldn't get
> the same end result with a nearly one-line change that consists of
> making disable_cost be IEEE infinity.  Years ago we didn't want
> to rely on IEEE float semantics in this area, but nowadays I don't
> see why we shouldn't.

I don't think so, because I think that what will happen in that case
is that we'll pick a completely random plan if we can't pick a plan
that avoids incurring disable_cost. Every plan that contains one
disabled node anywhere in the plan tree will look like it has exactly
the same cost as any other such plan.
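
Here is a tiny standalone illustration of that tie, assuming IEEE
doubles (which is the premise of the infinity proposal):

#include <math.h>
#include <stdio.h>

int
main(void)
{
    double disable_cost = INFINITY;
    double cheap_plan = 10.0 + disable_cost;    /* one disabled node */
    double awful_plan = 1e12 + disable_cost;    /* also one disabled node */

    /* Any finite cost drowns in infinity, so the two plans tie exactly. */
    printf("%d\n", cheap_plan == awful_plan);   /* prints 1 */
    return 0;
}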

IMHO, this is actually one of the problems with disable_cost as it
works today. I think the semantics that we want are: if it's possible
to pick a plan where nothing is disabled, then pick the cheapest such
plan; if not, pick the cheapest plan overall. But the disable_cost
treatment doesn't really do that. It does the first part -- picking
the cheapest plan where nothing is disabled -- but it doesn't do the
second part, because once you add disable_cost into the cost of some
particular plan node, it screws up the rest of the planning, because
the cost estimates for the disabled nodes have no bearing in reality.
Fast-start plans, for example, will look insanely good compared to
what would be the case in normal planning (and we lean too much toward
fast-start plans even normally).

(I don't think we should care how MANY disabled nodes appear in a
plan, particularly. This is a more arguable point. Is a plan with 1
disabled node and 10% more cost better or worse than a plan with 2
disabled nodes and 10% less cost? I'd argue that counting the number
of disabled nodes isn't particularly meaningful.)
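
In other words, a comparison rule along these lines (a standalone
sketch; is_disabled is an imagined per-plan flag meaning "contains at
least one disabled node", not an existing field):

#include <stdbool.h>

typedef struct
{
    bool        is_disabled;
    double      total_cost;
} PlanCost;

static int
compare_plans(const PlanCost *a, const PlanCost *b)
{
    /* Any plan with no disabled node beats any plan with one... */
    if (a->is_disabled != b->is_disabled)
        return a->is_disabled ? 1 : -1;
    /* ...otherwise compare costs exactly as we do today. */
    if (a->total_cost != b->total_cost)
        return (a->total_cost < b->total_cost) ? -1 : 1;
    return 0;
}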

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
Hmm, buildfarm member kestrel (which uses
-fsanitize=undefined,alignment) failed:

# Running: libpq_pipeline -r 700 cancel port=49975 host=/tmp/dFh46H7YGc
dbname='postgres'
test cancellations... 
libpq_pipeline:260: query did not fail when it was expected

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-03-12%2016%3A41%3A27

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The saddest aspect of life right now is that science gathers knowledge faster
 than society gathers wisdom."  (Isaac Asimov)




Re: perl: unsafe empty pattern behavior

2024-03-12 Thread Alvaro Herrera
On 2024-Mar-12, Jeff Davis wrote:

> Do not use perl empty patterns like // or qr// or s//.../, the behavior
> is too surprising for perl non-experts.

Yeah, nasty.

> There are a few such uses in
> our tests; patch attached. Unfortunately, there is no obvious way to
> automatically detect them so I am just relying on grep. I'm sure there
> are others here who know more about perl than I do, so
> suggestions/corrections are welcome.

I suggest that pg_dump/t/002_pg_dump.pl could use a verification that
the ->{regexp} thing is not empty.  I also tried grepping (for things
like qr{}, qr[], qr||, qr!!) and didn't find anything beyond what you
have ... but I only looked for the "qr" literal, not other ways to get
regexes.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree.  (Don Knuth)




Re: On disable_cost

2024-03-12 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 3, 2023 at 5:22 AM Jian Guo  wrote:
>> I have written an initial patch to retire the disable_cost GUC, which sets
>> a flag on the Path struct instead of adding up a big cost that is hard to
>> estimate. Though it involves tons of plan changes in regression tests, I
>> have tested some simple cases, such as eagerly generating two-stage
>> agg plans, and it worked. Could someone help to review?

> I took a look at this patch today. I believe that overall this may
> well be an approach worth pursuing. However, more work is going to be
> needed. Here are some comments:

> 1. You stated that it changes lots of plans in the regression tests,
> but you haven't provided any sort of analysis of why those plans
> changed. I'm kind of surprised that there would be "tons" of plan
> changes. You (or someone) should look into why that's happening.

I've not read the patch, but given this description I would expect
there to be *zero* regression changes --- I don't think we have any
test cases that depend on disable_cost being finite.  If there's more
than zero changes, that very likely indicates a bug in the patch.
Even if there are places where the output legitimately changes, you
need to justify each one and make sure that you didn't invalidate the
intent of that test case.

BTW, having written that paragraph, I wonder if we couldn't get
the same end result with a nearly one-line change that consists of
making disable_cost be IEEE infinity.  Years ago we didn't want
to rely on IEEE float semantics in this area, but nowadays I don't
see why we shouldn't.

regards, tom lane




Re: UUID v7

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 18:18, Sergey Prokhorenko
 wrote:
> Andrey and you simply did not read the RFC a little further down in the text:

You're totally right, sorry about that. Maybe it would be good to move
those subsections around a bit in the RFC though, so that anything
related to only one method is included in the section for that method.




perl: unsafe empty pattern behavior

2024-03-12 Thread Jeff Davis
Moved from discussion on -committers:

https://postgr.es/m/0ef325fa06e7a1605c4e119c4ecb637c67e5fb4e.ca...@j-davis.com

Summary:

Do not use perl empty patterns like // or qr// or s//.../, the behavior
is too surprising for perl non-experts. There are a few such uses in
our tests; patch attached. Unfortunately, there is no obvious way to
automatically detect them so I am just relying on grep. I'm sure there
are others here who know more about perl than I do, so
suggestions/corrections are welcome.

Long version:

Some may know this already, but we just discovered the dangers of using
empty patterns in perl:

"If the PATTERN evaluates to the empty string, the last successfully
matched regular expression is used instead... If no match has
previously succeeded, this will (silently) act instead as a genuine
empty pattern (which will always match)."

https://perldoc.perl.org/perlop#The-empty-pattern-//

In other words, if you have code like:

   if ('xyz' =~ //)
   {
   print "'xyz' matches //\n";
   }

The match will succeed and print, because there's no previous pattern,
so // is a "genuine" empty pattern, which is treated like /.*/ (I
think?). Then, if you add some other code before it:

   if ('abc' =~ /abc/)
   {
   print "'abc' matches /abc/\n";
   }

   if ('xyz' =~ //)
   {
   print "'xyz' matches //\n";
   }

The first match will succeed, but the second match will fail, because
// is treated like /abc/.

On reflection, that does seem very perl-like. But it can cause
surprising action-at-a-distance if not used carefully, especially for
those who aren't experts in perl. It's much safer to just not use the
empty pattern.

If you use qr// instead:

https://perldoc.perl.org/perlop#qr/STRING/msixpodualn

like:

   if ('abc' =~ qr/abc/)
   {
   print "'abc' matches /abc/\n";
   }

   if ('xyz' =~ qr//)
   {
   print "'xyz' matches //\n";
   }

Then the second match may succeed or may fail, and it's not clear from
the documentation what precise circumstances matter. It seems to fail
on older versions of perl (like 5.16.3) and succeed on newer versions
(5.38.2). However, it may also depend on when the qr// is [re]compiled,
or regex flags, or locale, or may just be undefined.

Regards,
Jeff Davis


From be5aa677e37180a8c1b0faebcceab5506b1c8130 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Mon, 11 Mar 2024 16:44:56 -0700
Subject: [PATCH v1] perl: avoid empty regex patterns

Empty patterns have special behavior that uses the last successful
pattern match. This behavior can be surprising, so remove empty
patterns and instead match against exactly what is intended
(e.g. /^$/ or /.*/).

Unfortunately there's not an easy way to check for this in an
automated way, so it's likely that some cases have been missed and
will be missed in the future. This commit just cleans up known
instances.

Discussion: https://postgr.es/m/1548559.1710188...@sss.pgh.pa.us
---
 src/bin/pg_upgrade/t/003_logical_slots.pl   | 4 ++--
 src/bin/pg_upgrade/t/004_subscription.pl| 2 +-
 src/bin/psql/t/001_basic.pl | 8 ++++----
 src/bin/psql/t/010_tab_completion.pl| 2 +-
 src/test/recovery/t/037_invalid_database.pl | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/bin/pg_upgrade/t/003_logical_slots.pl b/src/bin/pg_upgrade/t/003_logical_slots.pl
index 83d71c3084..256dfd53b1 100644
--- a/src/bin/pg_upgrade/t/003_logical_slots.pl
+++ b/src/bin/pg_upgrade/t/003_logical_slots.pl
@@ -78,7 +78,7 @@ command_checks_all(
 	[
 		qr/max_replication_slots \(1\) must be greater than or equal to the number of logical replication slots \(2\) on the old cluster/
 	],
-	[qr//],
+	[qr/^$/],
 	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
 );
 ok( -d $newpub->data_dir . "/pg_upgrade_output.d",
@@ -118,7 +118,7 @@ command_checks_all(
 	[
 		qr/Your installation contains logical replication slots that can't be upgraded./
 	],
-	[qr//],
+	[qr/^$/],
 	'run of pg_upgrade of old cluster with slots having unconsumed WAL records'
 );
 
diff --git a/src/bin/pg_upgrade/t/004_subscription.pl b/src/bin/pg_upgrade/t/004_subscription.pl
index df5d6dffbc..c8ee2390d1 100644
--- a/src/bin/pg_upgrade/t/004_subscription.pl
+++ b/src/bin/pg_upgrade/t/004_subscription.pl
@@ -68,7 +68,7 @@ command_checks_all(
 	[
 		qr/max_replication_slots \(0\) must be greater than or equal to the number of subscriptions \(1\) on the old cluster/
 	],
-	[qr//],
+	[qr/^$/],
 	'run of pg_upgrade where the new cluster has insufficient max_replication_slots'
 );
 
diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl
index 9f0b6cf8ca..ce875ce316 100644
--- a/src/bin/psql/t/001_basic.pl
+++ b/src/bin/psql/t/001_basic.pl
@@ -412,23 +412,23 @@ my $perlbin = $^X;
 $perlbin =~ s!\\!/!g if $PostgreSQL::Test::Utils::windows_os;
 my $pipe_cmd = "$perlbin -pe '' >$g_file";
 
-psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr//, "one command \\g");
+psql_like($node, "SELECT 'one' \\g | $pipe_cmd", qr/^$/, "one command \\g");

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bharath Rupireddy
On Mon, Mar 11, 2024 at 4:09 PM Amit Kapila  wrote:
>
> I don't see how it will be easier for the user to choose the default
> value of 'max_slot_xid_age' compared to 'max_slot_wal_keep_size'. But,
> I agree similar to 'max_slot_wal_keep_size', 'max_slot_xid_age' can be
> another parameter to allow vacuum to proceed removing the rows which
> otherwise it wouldn't have been as those would be required by some
> slot. Now, if this understanding is correct, we should probably make
> this invalidation happen by (auto)vacuum after computing the age based
> on this new parameter.

Currently, the patch computes the XID age in the checkpointer using
the next XID (gets from ReadNextFullTransactionId()) and slot's xmin
and catalog_xmin. I think the checkpointer is best because it already
invalidates slots for wal_removed cause, and flushes all replication
slots to disk. Moving this new invalidation functionality into some
other background process such as autovacuum will not only burden that
process' work but also mix up the unique functionality of that
background process.
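
For clarity, the age test itself is just modular arithmetic against the
next XID. A simplified standalone sketch (the patch works with
FullTransactionIds, so this 32-bit version is illustrative only):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

static bool
slot_xid_age_exceeded(TransactionId next_xid, TransactionId slot_xmin,
                      uint32_t max_slot_xid_age)
{
    if (slot_xmin < FirstNormalTransactionId)
        return false;               /* no normal xmin, nothing to age out */
    /* unsigned subtraction handles XID wraparound correctly */
    return (uint32_t) (next_xid - slot_xmin) > max_slot_xid_age;
}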

Having said that, I'm open to ideas from others, as I'm not so sure if
there's any issue with checkpointer invalidating the slots for new
reasons.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: UUID v7

2024-03-12 Thread Sergey Prokhorenko
Hi Jelte,
I am one of the contributors to this RFC.

Andrey's patch corresponds exactly to Fixed-Length Dedicated Counter Bits 
(Method 1).

Andrey and you simply did not read the RFC a little further down in the text:
__

The following sub-topics cover topics related solely with creating reliable
fixed-length dedicated counters:

- Fixed-Length Dedicated Counter Seeding:
  Implementations utilizing the fixed-length counter method randomly
  initialize the counter with each new timestamp tick. However, when the
  timestamp has not increased, the counter is instead incremented by the
  desired increment logic. When utilizing a randomly seeded counter
  alongside Method 1, the random value MAY be regenerated with each counter
  increment without impacting sortability. The downside is that Method 1 is
  prone to overflows if a counter of adequate length is not selected or the
  random data generated leaves little room for the required number of
  increments. Implementations utilizing fixed-length counter method MAY
  also choose to randomly initialize a portion of the counter rather than
  the entire counter. For example, a 24 bit counter could have the 23 bits
  in least-significant, right-most, position randomly initialized. The
  remaining most significant, left-most counter bit is initialized as zero
  for the sole purpose of guarding against counter rollovers.

- Fixed-Length Dedicated Counter Length:
  Select a counter bit-length that can properly handle the level of
  timestamp precision in use. For example, millisecond precision generally
  requires a larger counter than a timestamp with nanosecond precision.
  General guidance is that the counter SHOULD be at least 12 bits but no
  longer than 42 bits. Care must be taken to ensure that the counter length
  selected leaves room for sufficient entropy in the random portion of the
  UUID after the counter. This entropy helps improve the unguessability
  characteristics of UUIDs created within the batch.

The following sub-topics cover rollover handling with either type of
counter method:

- ...

- Counter Rollover Handling:
  Counter rollovers MUST be handled by the application to avoid sorting
  issues. The general guidance is that applications that care about
  absolute monotonicity and sortability should freeze the counter and wait
  for the timestamp to advance which ensures monotonicity is not broken.
  Alternatively, implementations MAY increment the timestamp ahead of the
  actual time and reinitialize the counter.
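
For what it's worth, Method 1 with a guard bit boils down to something
like this standalone sketch (an imagined 18-bit counter with 17 random
bits; purely illustrative, not the patch's actual layout):

#include <stdint.h>
#include <stdlib.h>

static uint64_t last_ms;
static uint32_t counter;            /* occupies 18 bits of the UUID */

static uint32_t
next_counter(uint64_t now_ms)
{
    if (now_ms != last_ms)
    {
        last_ms = now_ms;
        /* reseed randomly, keeping the most significant (guard) bit zero */
        counter = ((uint32_t) random()) & 0x1FFFF;  /* 17 random bits */
    }
    else
        counter++;                  /* same tick: stay monotonic */

    return counter & 0x3FFFF;       /* the 18-bit counter field */
}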


Sergey Prokhorenko sergeyprokhore...@yahoo.com.au

On Tuesday, 12 March 2024 at 06:36:13 pm GMT+3, Jelte Fennema-Nio wrote:

 On Mon, 11 Mar 2024 at 19:27, Andrey M. Borodin  wrote:
> Sorry for this long and vague explanation, if it still seems too uncertain we 
> can have a chat or something like that. I don't think this number picking 
> stuff deserve to be commented, because it still is quite close to random. RFC 
> gives us too much freedom of choice.

I thought your explanation was quite clear and I agree that this
approach makes the most sense. I sent an email to the RFC authors to
ask for their feedback with you (Andrey) in the CC, because even
though it makes the most sense it does not comply with the either of
method 1 or 2 as described in the RFC.
  

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Alvaro Herrera
On 2024-Mar-12, Jelte Fennema-Nio wrote:

> On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera  wrote:
> > Here's a last one for the cfbot.
> 
> Thanks for committing the first 3 patches btw. Attached a tiny change
> to 0001, which adds "(backing struct for PGcancelConn)" to the comment
> on pg_cancel_conn.

Thanks, I included it.  I hope there were no other changes, because I
didn't verify :-) but if there were, please let me know to incorporate
them.

I made a number of other small changes, mostly to the documentation,
nothing fundamental.  (Someday we should stop using  to
document the libpq functions and use refentry's instead ... it'd be
useful to have manpages for these functions.)

One thing I don't like very much is release_conn_addrinfo(), which is
called conditionally in two places but unconditionally in other places.
Maybe it'd make more sense to put this conditionality inside the
function itself, possibly with a "bool force" flag to suppress that in
the cases where it is not desired.
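
Something like this is what I mean, as a standalone sketch; the struct
members and the guard condition are stand-ins that would need checking
against the real PGconn:

#include <stdbool.h>
#include <stdlib.h>

typedef struct PGconn
{
    void       *addr;               /* stand-in for the resolved address array */
    int         naddr;
    bool        still_connecting;   /* imagined guard for the conditional sites */
} PGconn;

static void
release_conn_addrinfo(PGconn *conn, bool force)
{
    if (conn->addr == NULL)
        return;
    /* The currently-conditional callers would pass force = false. */
    if (!force && conn->still_connecting)
        return;

    free(conn->addr);
    conn->addr = NULL;
    conn->naddr = 0;
}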

In pqConnectDBComplete, we cast the PGconn * to PGcancelConn * in order
to call PQcancelPoll, which is a bit abusive, but I don't know how to do
better.  Maybe we just accept this ... but if PQcancelStart is the only
way to have pqConnectDBStart called from a cancel connection, maybe it'd
be saner to duplicate pqConnectDBStart for cancel conns.

Thanks!

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bharath Rupireddy
On Mon, Mar 11, 2024 at 3:44 PM Amit Kapila  wrote:
>
> Yes, your understanding is correct. I wanted us to consider having new
> parameters like 'inactive_replication_slot_timeout' to be at
> slot-level instead of GUC. I think this new parameter doesn't seem to
> be the similar as 'max_slot_wal_keep_size' which leads to truncation
> of WAL at global and then invalidates the appropriate slots. OTOH, the
> 'inactive_replication_slot_timeout' doesn't appear to have a similar
> global effect.

last_inactive_at is tracked for each slot, and slots get invalidated
based on it and inactive_replication_slot_timeout. It's like
max_slot_wal_keep_size invalidating slots based on restart_lsn. In a
way, both are similar, right?

> The other thing we should consider is what if the
> checkpoint happens at a timeout greater than
> 'inactive_replication_slot_timeout'?

In such a case, the slots get invalidated upon the next checkpoint, as
(current checkpoint time - last_inactive_at) will then be
greater than inactive_replication_slot_timeout.
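
Concretely, the check evaluated whenever the checkpointer runs is just
the following, shown as a standalone sketch with PostgreSQL-style
microsecond timestamps:

#include <stdbool.h>
#include <stdint.h>

typedef int64_t TimestampTz;        /* microseconds, as in the backend */

static bool
slot_inactive_timeout_exceeded(TimestampTz now,
                               TimestampTz last_inactive_at,
                               int timeout_secs)
{
    if (last_inactive_at == 0 || timeout_secs == 0)
        return false;               /* slot in use, or feature disabled */
    return (now - last_inactive_at) > (TimestampTz) timeout_secs * 1000000;
}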

> Shall, we consider doing it via
> some other background process or do we think checkpointer is the best
> we can have?

The same problem exists if we do it with some other background
process. I think the checkpointer is best because it already
invalidates slots for wal_removed cause, and flushes all replication
slots to disk. Moving this new invalidation functionality into some
other background process such as autovacuum will not only burden that
process' work but also mix up the unique functionality of that
background process.

Having said that, I'm open to ideas from others, as I'm not so sure if
there's any issue with checkpointer invalidating the slots for new
reasons.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Statistics Import and Export

2024-03-12 Thread Corey Huinker
>
> No, we should be keeping the lock until the end of the transaction
> (which in this case would be just the one statement, but it would be the
> whole statement and all of the calls in it).  See analyze.c:268 or
> so, where we call relation_close(onerel, NoLock); meaning we're closing
> the relation but we're *not* releasing the lock on it- it'll get
> released at the end of the transaction.
>
>
If that's the case, then changing the two table_close() statements to
NoLock should resolve any remaining concern.


Re: type cache cleanup improvements

2024-03-12 Thread Teodor Sigaev

> Got it. Here is an updated patch where I added a corresponding comment.

Thank you!

Playing around, I found one more place which could easily be modified to use
a hash_seq_init_with_hash_value() call.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c
index af978ccd4b1..28980620662 100644
--- a/src/backend/utils/cache/attoptcache.c
+++ b/src/backend/utils/cache/attoptcache.c
@@ -44,12 +44,10 @@ typedef struct
 
 /*
  * InvalidateAttoptCacheCallback
- *		Flush all cache entries when pg_attribute is updated.
+ *		Flush cache entry (or entries) when pg_attribute is updated.
  *
  * When pg_attribute is updated, we must flush the cache entry at least
- * for that attribute.  Currently, we just flush them all.  Since attribute
- * options are not currently used in performance-critical paths (such as
- * query execution), this seems OK.
+ * for that attribute.
  */
 static void
 InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
@@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
 	HASH_SEQ_STATUS status;
 	AttoptCacheEntry *attopt;
 
-	hash_seq_init(&status, AttoptCacheHash);
+	/*
+	 * By convention, a zero hash value is passed to the callback as a sign
+	 * that it's time to invalidate the cache. See sinval.c, inval.c and
+	 * InvalidateSystemCachesExtended().
+	 */
+	if (hashvalue == 0)
+		hash_seq_init(&status, AttoptCacheHash);
+	else
+		hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue);
+
 	while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL)
 	{
 		if (attopt->opts)
@@ -70,6 +77,17 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
 	}
 }
 
+/*
+ * Hash function compatible with two-arg system cache hash function.
+ */
+static uint32
+relatt_cache_syshash(const void *key, Size keysize)
+{
+	const AttoptCacheKey* ckey = key;
+
+	return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum);
+}
+
 /*
  * InitializeAttoptCache
  *		Initialize the attribute options cache.
@@ -82,9 +100,17 @@ InitializeAttoptCache(void)
 	/* Initialize the hash table. */
 	ctl.keysize = sizeof(AttoptCacheKey);
 	ctl.entrysize = sizeof(AttoptCacheEntry);
+
+	/*
+	 * AttoptCacheEntry takes hash value from the system cache. For
+	 * AttoptCacheHash we use the same hash in order to speed up search by hash
+	 * value. This is used by hash_seq_init_with_hash_value().
+	 */
+	ctl.hash = relatt_cache_syshash;
+
 	AttoptCacheHash =
 		hash_create("Attopt cache", 256, &ctl,
-	HASH_ELEM | HASH_BLOBS);
+	HASH_ELEM | HASH_FUNCTION);
 
 	/* Make sure we've initialized CacheMemoryContext. */
 	if (!CacheMemoryContext)


Re: CI speed improvements for FreeBSD

2024-03-12 Thread Maxim Orlov
Hi!

I looked at the changes and I liked them.  Here are my thoughts:

0001:
1. I think it's a good idea to use RAM. Since it's still UFS, we lose
nothing in terms of testing, but win significantly in speed.
2. The change from "swapoff -a || true" to "swapoff -a" is legit in my view,
since it's better to fail explicitly than to silence any possible problem.
3. The man page says that lowercase suffixes should be used for mdconfig,
but in fact you can use either lowercase or uppercase; see mdconfig.c:
"else if (*p == 'g' || *p == 'G')".

0002:
1. The resource usage will certainly be a bit higher, but, unless I'm
missing something, not drastically so. That said, I do not know how to
measure this increase to get concrete values.
2. And consider the potential benefits of increasing the number of test jobs:
more concurrent processes, more interactions, better test coverage.

Here are my runs:
FreeBSD @master
https://cirrus-ci.com/task/4934701194936320
Run test_world 05:56

FreeBSD @master + 0001
https://cirrus-ci.com/task/5921385306914816
Run test_world 05:06

FreeBSD @master + 0001, + 0002
https://cirrus-ci.com/task/5635288945393664
Run test_world 02:20

For comparison
Debian @master
https://cirrus-ci.com/task/5143705577848832
Run test_world 02:38

Overall, I consider these changes useful: CI runs faster, with better
test coverage, in exchange for a presumably slight increase in resource
usage, which I don't expect to be significant.

-- 
Best regards,
Maxim Orlov.


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bharath Rupireddy
On Tue, Mar 12, 2024 at 9:11 PM Bertrand Drouvot
 wrote:
>
> > AFAIR, we don't prevent similar invalidations due to
> > 'max_slot_wal_keep_size' for sync slots,
>
> Right, we'd invalidate them on the standby should the standby sync slot's
> restart_lsn exceed the limit.

Right. Help me understand this a bit - is the wal_removed invalidation
going to conflict with recovery on the standby?

Per the discussion upthread, I'm trying to understand what
invalidation reasons will exactly cause conflict with recovery? Is it
just rows_removed and wal_level_insufficient invalidations? My
understanding on the conflict with recovery and invalidation reason
has been a bit off track. Perhaps, we need to clarify these two things
in the docs for the end users as well?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 15:04, Jelte Fennema-Nio  wrote:
> Attached a tiny change to 0001

One more tiny comment change, stating that pg_cancel is used by the
deprecated PQcancel function.


v36-0001-libpq-Add-encrypted-and-non-blocking-query-cance.patch
Description: Binary data


v36-0002-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bharath Rupireddy
On Tue, Mar 12, 2024 at 5:51 PM Amit Kapila  wrote:
>
> > Would that make sense to "simply" discard/prevent those kind of 
> > invalidations
> > for "synced" slot on standby? I mean, do they make sense given the fact that
> > those slots are not usable until the standby is promoted?
>
> AFAIR, we don't prevent similar invalidations due to
> 'max_slot_wal_keep_size' for sync slots, so why to prevent it for
> these new parameters? This will unnecessarily create inconsistency in
> the invalidation behavior.

Right. +1 to keep the behaviour consistent for all invalidations.
However, an assertion that inactive_timeout isn't set for synced slots
on the standby isn't a bad idea because we rely on the fact that
walsenders aren't started for synced slots. Again, I think it misses
the consistency in the invalidation behaviour.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: UUID v7

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 07:32, Michael Paquier  wrote:
> Sure, there is no problem in discussing a patch to implement a
> behavior.  But I disagree about taking a risk in merging something
> that could become non-compliant with the approved RFC, if the draft is
> approved at the end, of course.  This just strikes me as a bad idea.

I agree that we shouldn't release UUIDv7 support if the RFC describing
that is not yet approved. But I do think it would be a shame if e.g.
the RFC got approved 2 weeks after Postgres its feature freeze. Which
would then mean we'd have to wait another 1.5 years before actually
using uuidv7. Would it be a reasonable compromise to still merge the
patch for PG17 (assuming the code is good to merge with regards to the
current draft RFC), but revert the commit if the RFC is not approved
before some deadline before the release date (e.g. before the first
release candidate)?




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bertrand Drouvot
Hi,

On Tue, Mar 12, 2024 at 05:51:43PM +0530, Amit Kapila wrote:
> On Tue, Mar 12, 2024 at 1:24 PM Bertrand Drouvot
>  wrote:
> >
> > On Fri, Mar 08, 2024 at 10:42:20PM +0530, Bharath Rupireddy wrote:
> > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila  
> > > wrote:
> > > >
> > > > You might want to consider its interaction with sync slots on standby.
> > > > Say, there is no activity on slots in terms of processing the changes
> > > > for slots. Now, we won't perform sync of such slots on standby showing
> > > > them inactive as per your new criteria where as same slots could still
> > > > be valid on primary as the walsender is still active. This may be more
> > > > of a theoretical point, as in a running system there will probably be
> > > > some activity, but I think this needs some thought.
> > >
> > > I believe the xmin and catalog_xmin of the sync slots on the standby
> > > keep advancing depending on the slots on the primary, no? If yes, the
> > > XID age based invalidation shouldn't be a problem.
> > >
> > > I believe there are no walsenders started for the sync slots on the
> > > standbys, right? If yes, the inactive timeout based invalidation also
> > > shouldn't be a problem. Because, the inactive timeouts for a slot are
> > > tracked only for walsenders because they are the ones that typically
> > > hold replication slots for longer durations and for real replication
> > > use. We did a similar thing in a recent commit [1].
> > >
> > > Is my understanding right? Do you still see any problems with it?
> >
> > Would that make sense to "simply" discard/prevent those kind of 
> > invalidations
> > for "synced" slot on standby? I mean, do they make sense given the fact that
> > those slots are not usable until the standby is promoted?
> >
> 
> AFAIR, we don't prevent similar invalidations due to
> 'max_slot_wal_keep_size' for sync slots,

Right, we'd invalidate them on the standby should the standby sync slot's
restart_lsn exceed the limit.

> so why to prevent it for
> these new parameters? This will unnecessarily create inconsistency in
> the invalidation behavior.

Yeah, but I think WAL removal has a direct impact on the slot's usability, which
is probably not the case with the new XID and timeout ones. That's why I thought
about handling them differently (but I'm also fine if that's not the case).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: UUID v7

2024-03-12 Thread Jelte Fennema-Nio
On Mon, 11 Mar 2024 at 19:27, Andrey M. Borodin  wrote:
> Sorry for this long and vague explanation, if it still seems too uncertain we 
> can have a chat or something like that. I don't think this number picking 
> stuff deserve to be commented, because it still is quite close to random. RFC 
> gives us too much freedom of choice.

I thought your explanation was quite clear and I agree that this
approach makes the most sense. I sent an email to the RFC authors to
ask for their feedback with you (Andrey) in the CC, because even
though it makes the most sense it does not comply with the either of
method 1 or 2 as described in the RFC.




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread Bharath Rupireddy
On Mon, Mar 11, 2024 at 11:26 AM Amit Kapila  wrote:
>
> > Hm. I get the concern. Are you okay with having invalidation_reason
> > separately for both logical and physical slots? In such a case,
> > logical slots that got invalidated on the standby will have duplicate
> > info in conflict_reason and invalidation_reason, is this fine?
> >
>
> If we have duplicate information in two columns that could be
> confusing for users. BTW, doesn't the recovery conflict occur only
> because of the rows_removed and wal_level_insufficient reasons? The
> wal_removed or the new reasons you are proposing can't happen because
> of a recovery conflict. Am I missing something here?

My understanding aligns with yours that the rows_removed and
wal_level_insufficient invalidations can occur only upon recovery
conflict.

FWIW, a test named 'synchronized slot has been invalidated' in
040_standby_failover_slots_sync.pl inappropriately uses
conflict_reason = 'wal_removed' for a logical slot on the standby. As per
the above understanding, it's inappropriate to use conflict_reason here
because the wal_removed invalidation doesn't conflict with recovery.

> > Another idea is to make 'conflict_reason text' as a 'conflicting
> > boolean' again (revert 007693f2a3), and have 'invalidation_reason
> > text' for both logical and physical slots. So, whenever 'conflicting'
> > is true, one can look at invalidation_reason for the reason for
> > conflict. How does this sound?
> >
>
> So, does this mean that conflicting will only be true for some of the
> reasons (say wal_level_insufficient, rows_removed, wal_removed) and
> logical slots but not for others? I think that will also not eliminate
> the duplicate information, as the user could have deduced that from a
> single column.

So, how about we turn conflict_reason to only report the reasons that
actually cause conflict with recovery for logical slots, something
like below, and then have invalidation_cause as a generic column for
all sorts of invalidation reasons for both logical and physical slots?

ReplicationSlotInvalidationCause cause = slot_contents.data.invalidated;

if (slot_contents.data.database == InvalidOid ||
cause == RS_INVAL_NONE ||
(cause != RS_INVAL_HORIZON && cause != RS_INVAL_WAL_LEVEL))
{
nulls[i++] = true;
}
else
{
Assert(cause == RS_INVAL_HORIZON || cause == RS_INVAL_WAL_LEVEL);

values[i++] = CStringGetTextDatum(SlotInvalidationCauses[cause]);
}

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: POC, WIP: OR-clause support for indexes

2024-03-12 Thread Alexander Korotkov
On Mon, Mar 11, 2024 at 2:43 PM Andrei Lepikhov
 wrote:
> On 11/3/2024 18:31, Alexander Korotkov wrote:
> >> I'm not convinced about this limit. The initial reason was to combine
> >> long lists of ORs into the array because such a transformation made at
> >> an early stage increases efficiency.
> >> I understand the necessity of this limit in the array decomposition
> >> routine but not in the creation one.
> >
> > The comment near MAX_SAOP_ARRAY_SIZE says that this limit is because
> > N^2 algorithms could be applied to arrays.  Are you sure that's not
> > true for our case?
> When you operate on an array, indeed. But not when we transform ORs into
> an array. Just check all the places in the optimizer and even the
> executor where we would pass along the list of ORs. This is why I think
> we should use this optimization even more intensively for huge numbers
> of ORs in an attempt to speed up the overall query.

Ok.

> >>> 3) Better save the original order of clauses by putting hash entries and
> >>> untransformable clauses to the same list.  A lot of differences in
> >>> regression tests output have gone.
> >> I agree that reducing the number of changes in regression tests looks
> >> better. But to achieve this, you introduced a hack that increases the
> >> complexity of the code. Is it worth it? Maybe it would be better to make
> >> one-time changes in tests instead of getting this burden on board. Or
> >> have you meant something more introducing the node type?
> >
> > For me the reason is not just a regression test.  The current code
> > keeps the original order of quals as much as possible.  The OR
> > transformation code reorders quals even in cases when it doesn't
> > eventually apply any optimization.  I don't think that's acceptable.
> > However, less hackery ways for this is welcome for sure.
> Why is it unacceptable? Can the user implement some order-dependent
> logic with clauses, and will it be correct?
> Otherwise, it is a matter of taste, and generally, this decision is up
> to you.

I think this is an important property that the user sees the quals in
the plan in the same order as they were in the query.  And if some
transformations are applied, then the order is saved as much as
possible.  I don't think we should sacrifice this property without
strong reasons.  A bit of code complexity is definitely not that
reason for me.

> >>> We don't make array values unique.  That might make query execution
> >>> performance somewhat worse, and also makes selectivity estimation
> >>> worse.  I suggest Andrei and/or Alena should implement making array
> >>> values unique.
> >> The fix Alena has made looks correct. But I urge you to think twice:
> >> The optimizer doesn't care about duplicates, so why do we do it?
> >> What's more, this optimization is intended to speed up queries with long
> >> OR lists. Using the list_append_unique() comparator on such lists could
> >> impact performance. I suggest sticking to the common rule and leaving
> >> the responsibility on the user's shoulders.
> >
> > I don't see why the optimizer doesn't care about duplicates for OR
> > lists.  As I showed before in an example, it successfully removes the
> > duplicate.  So, currently OR transformation clearly introduces a
> > regression in terms of selectivity estimation.  I think we should
> > evade that.
> I think you are right. It is probably a better place than any other to
> remove duplicates in an array. I just think we should sort and remove
> duplicates from entry->consts in one pass. Thus, this optimisation
> should be applied to sortable constants.

Ok.
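
For the record, the one-pass approach is just a sort followed by a
linear squeeze. A standalone sketch, with ints standing in for sortable
Consts:

#include <stdlib.h>

static int
cmp_int(const void *a, const void *b)
{
    int ia = *(const int *) a;
    int ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

/* Sorts vals in place, drops duplicates, and returns the new length. */
static size_t
sort_unique(int *vals, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;
    qsort(vals, n, sizeof(int), cmp_int);
    for (size_t i = 1; i < n; i++)
    {
        if (vals[i] != vals[out])
            vals[++out] = vals[i];
    }
    return out + 1;
}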

> >> At least, we should do this optimization later, in one pass, with
> >> sorting elements before building the array. But what if we don't have a
> >> sort operator for the type?
> >
> > It was probably discussed before, but can we do our work later?  There
> > is a canonicalize_qual() which calls find_duplicate_ors().  This is
> > the place where currently duplicate OR clauses are removed.  Could our
> > OR-to-ANY transformation be just another call from
> > canonicalize_qual()?
> Hmm, we already tried to do it at that point. I vaguely recall some
> issues caused by this approach. Anyway, it should be done as quickly as
> possible to increase the effect of the optimization.

I think there were provided quite strong reasons why this shouldn't be
implemented at the parse analysis stage [1], [2], [3].  The
canonicalize_qual() looks quite appropriate place for that since it
does similar transformations.

Links.
1. 
https://www.postgresql.org/message-id/CA%2BTgmoZCgP6FrBQEusn4yaWm02XU8OPeoEMk91q7PRBgwaAkFw%40mail.gmail.com
2. 
https://www.postgresql.org/message-id/CAH2-Wzm2%3Dnf_JhiM3A2yetxRs8Nd2NuN3JqH%3Dfm_YWYd1oYoPg%40mail.gmail.com
3. 
https://www.postgresql.org/message-id/CA%2BTgmoaOiwMXBBTYknczepoZzKTp-Zgk5ss1%2BCuVQE-eFTqBmA%40mail.gmail.com

--
Regards,
Alexander Korotkov




Re: A failure in t/038_save_logical_slots_shutdown.pl

2024-03-12 Thread Bharath Rupireddy
On Tue, Mar 12, 2024 at 6:05 PM Amit Kapila  wrote:
>
> The patch looks mostly good to me. I have changed the comments and
> commit message in the attached. I am planning to push this tomorrow
> unless you or others have any comments on it.

LGTM.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Disable LLVM bitcode generation with pgxs.mk framework.

2024-03-12 Thread Daniel Gustafsson
> On 12 Mar 2024, at 14:38, Xing Guo  wrote:

> Would it be possible to add a new switch in the pgxs.mk framework to
> allow users to disable this feature?

Something like that doesn't seem unreasonable I think.

--
Daniel Gustafsson





Re: confusing `case when` column name

2024-03-12 Thread Daniel Gustafsson
> On 12 Mar 2024, at 15:19, Tom Lane  wrote:

> users are probably going to end up applying an AS clause most of
> the time if they care about the column name at all.

If anything, we could perhaps add a short note in the CASE documentation about
column naming; the way it's phrased now, a new user could very easily assume it
would be "case".

--
Daniel Gustafsson





Re: On disable_cost

2024-03-12 Thread Robert Haas
On Thu, Aug 3, 2023 at 5:22 AM Jian Guo  wrote:
> I have written an initial patch to retire the disable_cost GUC, which sets a
> flag on the Path struct instead of adding up a big cost that is hard to
> estimate. Though it involves tons of plan changes in regression tests, I
> have tested some simple cases, such as eagerly generating two-stage agg
> plans, and it worked. Could someone help to review?

I took a look at this patch today. I believe that overall this may
well be an approach worth pursuing. However, more work is going to be
needed. Here are some comments:

1. You stated that it changes lots of plans in the regression tests,
but you haven't provided any sort of analysis of why those plans
changed. I'm kind of surprised that there would be "tons" of plan
changes. You (or someone) should look into why that's happening.

2. The change to compare_path_costs_fuzzily() seems incorrect to me.
When path1->is_disabled && path2->is_disabled, costs should be
compared just as they are when neither path is disabled. Instead, the
patch treats any two such paths as having equal cost. That seems
catastrophically bad. Maybe it accounts for some of those plan
changes, although that would only be true if those plans were created
while using some disabling GUC.

3. Instead of adding is_disabled at the end of the Path structure, I
suggest adding it between param_info and parallel_aware. I think if
you do that, the space used by the new field will use up padding bytes
that are currently included in the struct, instead of making it
longer.

4. A critical issue for any patch of this type is performance. This
concern was raised earlier on this thread, but your path doesn't
address it. There's no performance analysis or benchmarking included
in your email. One idea that I have is to write the cost-comparison
test like this:

if (unlikely(path1->is_disabled || path2->is_disabled))
{
if (!path1->is_disabled)
return COSTS_BETTER1;
if (!path2->is_disabled)
return COSTS_BETTER2;
/* if both disabled, fall through */
}

I'm not sure that would be enough to prevent the patch from adding
noticeably to the cost of path comparison, but maybe it would help.

5. The patch changes only compare_path_costs_fuzzily() but I wonder
whether compare_path_costs() and compare_fractional_path_costs() need
similar surgery. Whether they do or don't, there should likely be some
comments explaining the situation.

6. In fact, the patch changes no comments at all, anywhere. I'm not
sure how many comment changes a patch like this needs to make, but the
answer definitely isn't "none".

7. The patch doesn't actually remove disable_cost. I guess it should.

8. When you submit a patch, it's a good idea to also add it on
commitfest.postgresql.org. It doesn't look like that was done in this
case.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: confusing `case when` column name

2024-03-12 Thread Tom Lane
"David G. Johnston"  writes:
> On Tuesday, March 12, 2024, adjk...@126.com  wrote:
>> Need we change the title of the case-when output column?

> Choosing either a or b as the label seems wrong and probably worth changing
> to something that has no meaning and encourages the application of a column
> alias.

Yeah, we won't get any kudos for changing a rule that's stood for
25 years, even if it's not very good.  This is one of the places
where it's just hard to make a great choice automatically, and
users are probably going to end up applying an AS clause most of
the time if they care about the column name at all.

regards, tom lane




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-12 Thread Jelte Fennema-Nio
On Tue, 12 Mar 2024 at 10:19, Alvaro Herrera  wrote:
> Here's a last one for the cfbot.

Thanks for committing the first 3 patches btw. Attached a tiny change
to 0001, which adds "(backing struct for PGcancelConn)" to the comment
on pg_cancel_conn.
From d340fde6883a249fd7c1a90033675a3b5edb603e Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Thu, 14 Dec 2023 13:39:09 +0100
Subject: [PATCH v35 2/2] Start using new libpq cancel APIs

A previous commit introduced new APIs to libpq for cancelling queries.
This replaces the usage of the old APIs in most of the codebase with
these newer ones. This specifically leaves out changes to psql and
pgbench as those would need a much larger refactor to be able to call
them, due to the new functions not being signal-safe.
---
 contrib/dblink/dblink.c   |  30 +++--
 contrib/postgres_fdw/connection.c | 105 +++---
 .../postgres_fdw/expected/postgres_fdw.out|  15 +++
 contrib/postgres_fdw/sql/postgres_fdw.sql |   7 ++
 src/fe_utils/connect_utils.c  |  11 +-
 src/test/isolation/isolationtester.c  |  29 ++---
 6 files changed, 145 insertions(+), 52 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 19a362526d2..98dcca3e6fd 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1346,22 +1346,32 @@ PG_FUNCTION_INFO_V1(dblink_cancel_query);
 Datum
 dblink_cancel_query(PG_FUNCTION_ARGS)
 {
-	int			res;
 	PGconn	   *conn;
-	PGcancel   *cancel;
-	char		errbuf[256];
+	PGcancelConn *cancelConn;
+	char	   *msg;
 
 	dblink_init();
 	conn = dblink_get_named_conn(text_to_cstring(PG_GETARG_TEXT_PP(0)));
-	cancel = PQgetCancel(conn);
+	cancelConn = PQcancelCreate(conn);
 
-	res = PQcancel(cancel, errbuf, 256);
-	PQfreeCancel(cancel);
+	PG_TRY();
+	{
+		if (!PQcancelBlocking(cancelConn))
+		{
+			msg = pchomp(PQcancelErrorMessage(cancelConn));
+		}
+		else
+		{
+			msg = "OK";
+		}
+	}
+	PG_FINALLY();
+	{
+		PQcancelFinish(cancelConn);
+	}
+	PG_END_TRY();
 
-	if (res == 1)
-		PG_RETURN_TEXT_P(cstring_to_text("OK"));
-	else
-		PG_RETURN_TEXT_P(cstring_to_text(errbuf));
+	PG_RETURN_TEXT_P(cstring_to_text(msg));
 }
 
 
diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 4931ebf5915..dcc13dc3b24 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -133,7 +133,7 @@ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
 static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
 static void pgfdw_reset_xact_state(ConnCacheEntry *entry, bool toplevel);
 static bool pgfdw_cancel_query(PGconn *conn);
-static bool pgfdw_cancel_query_begin(PGconn *conn);
+static bool pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime);
 static bool pgfdw_cancel_query_end(PGconn *conn, TimestampTz endtime,
    bool consume_input);
 static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
@@ -1315,36 +1315,104 @@ pgfdw_cancel_query(PGconn *conn)
 	endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
 		  CONNECTION_CLEANUP_TIMEOUT);
 
-	if (!pgfdw_cancel_query_begin(conn))
+	if (!pgfdw_cancel_query_begin(conn, endtime))
 		return false;
 	return pgfdw_cancel_query_end(conn, endtime, false);
 }
 
 static bool
-pgfdw_cancel_query_begin(PGconn *conn)
+pgfdw_cancel_query_begin(PGconn *conn, TimestampTz endtime)
 {
-	PGcancel   *cancel;
-	char		errbuf[256];
+	bool		timed_out = false;
+	bool		failed = false;
+	PGcancelConn *cancel_conn = PQcancelCreate(conn);
 
-	/*
-	 * Issue cancel request.  Unfortunately, there's no good way to limit the
-	 * amount of time that we might block inside PQgetCancel().
-	 */
-	if ((cancel = PQgetCancel(conn)))
+
+	if (!PQcancelStart(cancel_conn))
 	{
-		if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
+		PG_TRY();
 		{
 			ereport(WARNING,
 	(errcode(ERRCODE_CONNECTION_FAILURE),
 	 errmsg("could not send cancel request: %s",
-			errbuf)));
-			PQfreeCancel(cancel);
-			return false;
+			pchomp(PQcancelErrorMessage(cancel_conn)))));
 		}
-		PQfreeCancel(cancel);
+		PG_FINALLY();
+		{
+			PQcancelFinish(cancel_conn);
+		}
+		PG_END_TRY();
+		return false;
 	}
 
-	return true;
+	/* In what follows, do not leak any PGcancelConn on an error. */
+	PG_TRY();
+	{
+		while (true)
+		{
+			TimestampTz now = GetCurrentTimestamp();
+			long		cur_timeout;
+			PostgresPollingStatusType pollres = PQcancelPoll(cancel_conn);
+			int			waitEvents = WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH;
+
+			if (pollres == PGRES_POLLING_OK)
+			{
+break;
+			}
+
+			/* If timeout has expired, give up, else get sleep time. */
+			cur_timeout = TimestampDifferenceMilliseconds(now, endtime);
+			if (cur_timeout <= 0)
+			{
+timed_out = true;
+failed = true;
+goto exit;
+			}
+
+			switch (pollres)
+			{
+case PGRES_POLLING_READING:
+	waitEvents |= WL_SOCKET_READABLE;
+	break;
+case PGRES_POL

Re: CF entries for 17 to be reviewed

2024-03-12 Thread Aleksander Alekseev
Hi,

> > Aleksander, I would greatly appreciate it if you joined me in managing CF. 
> > Together we can move more stuff :)
> > Currently, I'm going through "SQL Commands", and so far I have not gotten to 
> > "Performance" and "Server Features" at all... So if you can handle updating 
> > the statuses of those sections - that would be great.
>
> Server Features:
>
> [...]

I noticed that "Avoid mixing custom and OpenSSL BIO functions" had two
entries [1][2]. I closed [1] and marked it as "Withdrawn" due to lack
of a better status. Maybe we need an additional "Duplicate" status.

[1]: https://commitfest.postgresql.org/47/4834/
[2]: https://commitfest.postgresql.org/47/4835/

-- 
Best regards,
Aleksander Alekseev




Re: Commitfest Manager for March

2024-03-12 Thread Aleksander Alekseev
Hi Andrey,

> > If you need any help please let me know.
>
> Aleksander, I would greatly appreciate it if you joined me in managing CF. 
> Together we can move more stuff :)
> Currently, I'm going through "SQL Commands", and so far I have not gotten to 
> "Performance" and "Server Features" at all... So if you can handle updating 
> the statuses of those sections - that would be great.

OK, I'll take care of the "Performance" and "Server Features"
sections. I submitted my summaries of the entries triaged so far to
the corresponding thread [1].

[1]: 
https://www.postgresql.org/message-id/CAJ7c6TN9SnYdq%3DkfP-txgo5AaT%2Bt9YU%2BvQHfLBZqOBiHwoipAg%40mail.gmail.com

-- 
Best regards,
Aleksander Alekseev




Disable LLVM bitcode generation with pgxs.mk framework.

2024-03-12 Thread Xing Guo
Hi hackers,

When the PostgreSQL server is configured with --with-llvm, the pgxs.mk
framework will generate LLVM bitcode for extensions automatically.
Sometimes, I don't want to generate bitcode for some extensions. I can
turn off this feature by specifying with_llvm=0 in the make command.

```
make with_llvm=0
```

Would it be possible to add a new switch in the pgxs.mk framework to
allow users to disable this feature? E.g., the Makefile looks like:

```
WITH_LLVM=no
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
```
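
In the meantime, the workaround can live in the Makefile itself. An untested
sketch with a hypothetical module name; note the `override` directive, since
Makefile.global (pulled in via $(PGXS)) would otherwise reassign with_llvm to
its configure-time value:

```make
MODULES = my_extension

# "override" is required: a plain assignment would be clobbered when
# "include $(PGXS)" drags in Makefile.global.
override with_llvm := no

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```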

Best Regards,
Xing




Re: [PATCH] LockAcquireExtended improvement

2024-03-12 Thread Robert Haas
On Mon, Mar 11, 2024 at 11:11 PM Jingxian Li  wrote:
> Your version changes less code than mine by pushing the nowait flag down
> into ProcSleep(). This looks fine in general, with one small suggestion:
> I don't think it is necessary to add the 'waiting' suffix to the process
> name in WaitOnLock() when dontwait is true, as follows:

That could be done, but in my opinion it's not necessary. The waiting
suffix will appear only very briefly, and probably only in relatively
rare cases. It doesn't seem worth adding code to avoid it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: CF entries for 17 to be reviewed

2024-03-12 Thread Aleksander Alekseev
Hi,

> > On 6 Mar 2024, at 18:49, Andrey M. Borodin  wrote:
> >
> > Here are statuses for "Refactoring" section:
>
> [...]

> Aleksander, I would greatly appreciate if you join me in managing CF. 
> Together we can move more stuff :)
> Currently, I'm going through "SQL Commands". And so far I had not come to 
> "Performance" and "Server Features" at all... So if you can handle updating 
> statuses of that sections - that would be great.

Server Features:

* Fix partitionwise join with partially-redundant join clauses
I see there is a good discussion in progress here. It doesn't
seem to need more reviewers at the moment.
* ALTER TABLE SET ACCESS METHOD on partitioned tables
Ditto.
* Patch to implement missing join selectivity estimation for range types
The patch doesn't apply and has been "Waiting on Author" for a few
months now. Could be a candidate for closing with RwF.
* logical decoding and replication of sequences, take 2
   According to Tomas Vondra the patch is not ready for PG17. The
patch is marked as "Waiting on Author". Although it could be withdrawn
for now, personally I see no problem with keeping it WoA until the
PG18 cycle begins.
* Add the ability to limit the amount of memory that can be allocated
to backends
   v20231226 doesn't apply. The patch needs love from somebody interested in it.
* Multi-version ICU
   At a quick look, the patch doesn't apply and has been moved between
several commitfests, most recently in the "Waiting on Author" state. It could be
a candidate for closing with RwF.
* Post-special Page Storage TDE support
A large patch divided into 28 (!) parts. It currently needs a rebase,
which shouldn't necessarily stop a reviewer looking for a challenging
task.
* Built-in collation provider for "C" and "C.UTF-8"
Peter E left some feedback today, so I changed the status to
"Waiting on Author"
* ltree hash functions
Marked as RfC and cfbot seems to be happy with the patch. Could
use some attention from a committer?
* UUID v7
The patch is in good shape. Michael argued that the patch should
be merged when RFC is approved. No action seems to be needed until
then.
* (to be continued)


--
Best regards,
Aleksander Alekseev




Re: Make attstattarget nullable

2024-03-12 Thread Tomas Vondra



On 3/12/24 13:47, Peter Eisentraut wrote:
> On 06.03.24 22:34, Tomas Vondra wrote:
>> 0001
>> 
>>
>> 1) I think this bit in ALTER STATISTICS docs is wrong:
>>
>> -  <replaceable class="parameter">new_target</replaceable>
>> +  SET STATISTICS { <replaceable class="parameter">integer</replaceable> | DEFAULT }
>>
>> because it means we now have list entries for name, ..., new_name,
>> new_schema, and then suddenly "SET STATISTICS { integer | DEFAULT }".
>> That's a bit weird.
> 
> Ok, how would you change it?  List out the full clauses of the other
> variants under Parameters as well?

I'd go with a parameter, essentially exactly as it used to be, except
for adding the DEFAULT option. So the list would define new_target, and
mention DEFAULT as a special value.

> We have similar inconsistencies on other ALTER reference pages, so I'm
> not sure what the preferred approach is.
> 

Yeah, the other reference pages may have some inconsistencies too, but
let's not add more.

>> 2) The newtarget handling in AlterStatistics seems rather confusing. Why
>> does it get set to -1 just to ignore the value later? For a while I was
>> 99% sure ALTER STATISTICS ... SET STATISTICS DEFAULT will set the field
>> to -1. Maybe ditching the first if block and directly checking
>> stmt->stxstattarget before setting repl_val/repl_null would be better?
> 
> But we also need to continue accepting -1 for default on input.  The
> current code achieves that, the proposed variant would not.
> 

OK, I did not realize that. But then maybe this should be explained in a
comment before the new "if" block, because people won't realize why it
needs to be this way.
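
Concretely, the three input spellings at play would be (hypothetical
statistics object name; a sketch of intent, not actual output):

```sql
ALTER STATISTICS stats_obj SET STATISTICS 500;      -- explicit target
ALTER STATISTICS stats_obj SET STATISTICS DEFAULT;  -- new spelling, stored as NULL
ALTER STATISTICS stats_obj SET STATISTICS -1;       -- legacy spelling, still
                                                    -- accepted, also means default
```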

> Note that this patch matches the equivalent patch for attstattarget
> (4f622503d6d), which uses the same logic.  We could change it if we have
> a better idea, but then we should change both.
> 
>> 0002
>> 
>>
>> 1) I think InsertPgAttributeTuples comment probably needs to document
>> what the new tupdesc_extra parameter does.
> 
> Yes, I'll update that comment.
> 

OK.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2024-03-12 Thread Aleksander Alekseev
Hi,

> I took a closer look at this patch over the last couple days, and I did
> a bunch of benchmarks to see how expensive the accounting is. The good
> news is that I haven't observed any overhead - I did two simple tests,
> that I think would be particularly sensitive to this:
>
> [...]

Just wanted to let you know that v20231226 doesn't apply. The patch
needs love from somebody interested in it.

Best regards,
Aleksander Alekseev (wearing a co-CFM hat)




Re: confusing `case when` column name

2024-03-12 Thread David G. Johnston
On Tuesday, March 12, 2024, adjk...@126.com  wrote:

>
> Need we change the title of the case-when output column?
>
>
Choosing either a or b as the label seems wrong; it is probably worth changing
the label to something that carries no meaning of its own and thus encourages
applying an explicit column alias.
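
For example (illustrative only; the parser takes the label from the ELSE arm
when it is a plain column reference, and otherwise falls back to "case"):

```sql
CREATE TABLE t (flag boolean, a int, b int);

SELECT CASE WHEN flag THEN a ELSE b END FROM t;  -- column comes out labeled "b"
SELECT CASE WHEN flag THEN a END FROM t;         -- no ELSE: labeled "case"

SELECT CASE WHEN flag THEN a ELSE b END AS a_or_b FROM t;  -- explicit alias
```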

David J.


Re: [PATCHES] Post-special page storage TDE support

2024-03-12 Thread Aleksander Alekseev
Hi David,

> I have finished the reworking of this particular patch series, and have tried 
> to
> organize this in such a way that it will be easily reviewable.  It is
> constructed progressively to be able to follow what is happening here.  As 
> such,
> each individual commit is not guaranteed to compile on its own, so the whole
> series would need to be applied before it works. (It does pass CI tests.)
>
> Here is a brief roadmap of the patches; some of them have additional details 
> in
> the commit message describing a little more about them.
>
> These two patches do some refactoring of existing code to make a common place 
> to
> modify the definitions:
>
> v3-0001-refactor-Create-PageUsableSpace-to-represent-spac.patch
> v3-0002-refactor-Make-PageGetUsablePageSize-routine.patch
>
> These two patches add the ReservedPageSize variable and teach PageInit to use 
> to
> adjust sizing accordingly:
>
> v3-0003-feature-Add-ReservedPageSize-variable.patch
> v3-0004-feature-Adjust-page-sizes-at-PageInit.patch
>
> This patch modifies the definitions of 4 symbols to be computed based on
> PageUsableSpace:
>
> v3-0005-feature-Create-Calc-Limit-and-Dynamic-forms-for-f.patch
>
> These following 4 patches are mechanical replacements of all existing uses of
> these symbols; this provides both visibility into where the existing symbol is
> used as well as distinguishing between parts that care about static allocation
> vs dynamic usage.  The only non-mechanical change is to remove the definition 
> of
> the old symbol so we can be guaranteed that all uses have been considered:
>
> v3-0006-chore-Split-MaxHeapTuplesPerPage-into-Limit-and-D.patch
> v3-0007-chore-Split-MaxIndexTuplesPerPage-into-Limit-and-.patch
> v3-0008-chore-Split-MaxHeapTupleSize-into-Limit-and-Dynam.patch
> v3-0009-chore-Split-MaxTIDsPerBTreePage-into-Limit-and-Dy.patch
>
> The following patches are related to required changes to support dynamic toast
> limits:
>
> v3-0010-feature-Add-hook-for-setting-reloptions-defaults-.patch
> v3-0011-feature-Dynamically-calculate-toast_tuple_target.patch
> v3-0012-feature-Add-Calc-options-for-toast-related-pieces.patch
> v3-0013-chore-Replace-TOAST_MAX_CHUNK_SIZE-with-ClusterTo.patch
> v3-0014-chore-Translation-updates-for-TOAST_MAX_CHUNK_SIZ.patch
>
> In order to calculate some of the sizes, we need to include nbtree.h 
> internals,
> but we can't use in front-end apps, so we separate out the pieces we care 
> about
> into a separate include and use that:
>
> v3-0015-chore-Split-nbtree.h-structure-defs-into-an-inter.patch
>
> This is the meat of the patch; provide a common location for these
> block-size-related constants to be computed using the infra that has been set 
> up
> so far.  Also ensure that we are properly initializing this in front end and
> back end code.  A tricky piece here is we have two separate include files for
> blocksize.h; one which exposes externs as consts for optimizations, and one 
> that
> blocksize.c itself uses without consts, which it uses to create/initialized 
> the
> vars:
>
> v3-0016-feature-Calculate-all-blocksize-constants-in-a-co.patch
>
> Add ControlFile and GUC support for reserved_page_size:
>
> v3-0017-feature-ControlFile-GUC-support-for-reserved_page.patch
>
> Add initdb support for reserving page space:
>
> v3-0018-feature-Add-reserved_page_size-to-initdb-bootstra.patch
>
> Fixes for pg_resetwal:
>
> v3-0019-feature-Updates-for-pg_resetwal.patch
>
> The following 4 patches mechanically replace the Dynamic form to use the new
> Cluster variables:
>
> v3-0020-chore-Rename-MaxHeapTupleSizeDynamic-to-ClusterMa.patch
> v3-0021-chore-Rename-MaxHeapTuplesPerPageDynamic-to-Clust.patch
> v3-0022-chore-Rename-MaxIndexTuplesPerPageDynamic-to-Clus.patch
> v3-0023-chore-Rename-MaxTIDsPerBTreePageDynamic-to-Cluste.patch
>
> Two pieces of optimization required for visibility map:
>
> v3-0024-optimization-Add-support-for-fast-non-division-ba.patch
> v3-0025-optimization-Use-fastdiv-code-in-visibility-map.patch
>
> Update bufpage.h comments:
>
> v3-0026-doc-update-bufpage-docs-w-reserved-space-data.patch
>
> Fixes for bloom to use runtime size:
>
> v3-0027-feature-Teach-bloom-about-PageUsableSpace.patch
>
> Fixes for FSM to use runtime size:
>
> v3-0028-feature-teach-FSM-about-reserved-page-space.patch
>
> I hope this makes sense for reviewing, I know it's a big job, so breaking 
> things up a little more and organizing will hopefully help.
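
For reviewers skimming the roadmap: the "fast non-division-based" items
(v3-0024/0025) refer to the standard reciprocal-multiplication trick for
dividing by a value that only becomes known at startup, such as a page size
reduced by reserved space. An illustrative, self-contained sketch of the idea
follows — not the patch's code, which must also cover the full 32-bit range:

```c
#include <stdint.h>

static uint64_t fastdiv_magic;	/* ceil(2^32 / divisor) */

/* Precompute once, when the runtime divisor becomes known. */
static void
fastdiv_init(uint32_t divisor)
{
	fastdiv_magic = ((((uint64_t) 1) << 32) + divisor - 1) / divisor;
}

/*
 * n / divisor without a division instruction.  Exact whenever
 * n * (divisor - 1) < 2^32, which comfortably covers in-page offsets.
 */
static inline uint32_t
fastdiv(uint32_t n)
{
	return (uint32_t) (((uint64_t) n * fastdiv_magic) >> 32);
}
```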

Just wanted to let you know that the patchset seems to need a rebase,
according to cfbot.

Best regards,
Aleksander Alekseev (wearing a co-CFM hat)




Re: Make attstattarget nullable

2024-03-12 Thread Peter Eisentraut

On 06.03.24 22:34, Tomas Vondra wrote:

> 0001
>
> 1) I think this bit in ALTER STATISTICS docs is wrong:
>
> -  <replaceable class="parameter">new_target</replaceable>
> +  SET STATISTICS { <replaceable class="parameter">integer</replaceable> | DEFAULT }
>
> because it means we now have list entries for name, ..., new_name,
> new_schema, and then suddenly "SET STATISTICS { integer | DEFAULT }".
> That's a bit weird.


Ok, how would you change it?  List out the full clauses of the other 
variants under Parameters as well?


We have similar inconsistencies on other ALTER reference pages, so I'm 
not sure what the preferred approach is.



> 2) The newtarget handling in AlterStatistics seems rather confusing. Why
> does it get set to -1 just to ignore the value later? For a while I was
> 99% sure ALTER STATISTICS ... SET STATISTICS DEFAULT will set the field
> to -1. Maybe ditching the first if block and directly checking
> stmt->stxstattarget before setting repl_val/repl_null would be better?


But we also need to continue accepting -1 for default on input.  The 
current code achieves that, the proposed variant would not.


Note that this patch matches the equivalent patch for attstattarget 
(4f622503d6d), which uses the same logic.  We could change it if we have 
a better idea, but then we should change both.



> 0002
>
> 1) I think InsertPgAttributeTuples comment probably needs to document
> what the new tupdesc_extra parameter does.


Yes, I'll update that comment.




