Re: Escape output of pg_amcheck test

2024-01-13 Thread Peter Eisentraut

On 08.01.24 16:06, Peter Eisentraut wrote:

On 08.01.24 15:04, Aleksander Alekseev wrote:
[...] so I quickly wrote some (wrong) instrumentation to try to test 
your patch.


Yep, it confused me too at first.

Since the encoding happens right before exit() call, maybe it's worth
changing $b in-place in order to make the code slightly more readable
for most of us :)


My patch originally had the old-style

my $b_escaped = $b;
$b_escaped =~ s/.../;

... sprintf(..., $b_escaped);

but then I learned about the newish /r modifier and thought it was 
cooler. :)


committed





Re: Asynchronous execution support for Custom Scan

2024-01-13 Thread vignesh C
On Fri, 2 Dec 2022 at 05:05, Kohei KaiGai  wrote:
>
> > > IIUC, we already can use ctid in the where clause on the latest
> > > PostgreSQL, can't we?
> >
> > Oh, sorry, I missed the TidRangeScan. My apologies for the noise.
> >
> I made the ctidscan extension when we developed the CustomScan API
> towards v9.5 or v9.6, IIRC. It would make sense just as an example of
> the CustomScan API (e.g., the PG-Strom code is too large and complicated
> to learn about this API from), but it makes no sense from the standpoint
> of the functionality.

This patch has been moving from one CF to another without any
discussion happening. It has been more than one year since any
activity in this thread, and I don't see much interest in this
patch. I prefer to return this patch in this CF unless someone is
interested in it and takes it forward.

Regards,
Vignesh




Re: logicalrep_worker_launch -- counting/checking the worker limits

2024-01-13 Thread vignesh C
On Tue, 15 Aug 2023 at 08:09, Peter Smith  wrote:
>
> A rebase was needed due to a recent push [1].

I have changed the status of the patch to "Waiting on Author", as
Amit's queries at [1] have not yet been addressed and resolved. Please
feel free to address them and change the status back again.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1LtFyiMV6e9%2BRr66pe5e-MX7Pk6N3iHd4JgcBW1X4kS6Q%40mail.gmail.com

Regards,
Vignesh




Re: collect_corrupt_items_vacuum.patch

2024-01-13 Thread Alexander Korotkov
Hi, Dmitry!

On Sat, Jan 13, 2024 at 7:33 PM Dmitry Koval  wrote:
> Thank you, there is one small point left (in the comment): can you
> replace "guarantteed to be to be newer" to "guaranteed to be newer",
> file src/backend/storage/ipc/procarray.c?

Fixed.  Thank you for catching this.

--
Regards,
Alexander Korotkov


0001-Fix-false-reports-in-pg_visibility-v5.patch
Description: Binary data


Re: speed up a logical replica setup

2024-01-13 Thread Junwang Zhao
Hi Euler,

On Fri, Jan 12, 2024 at 6:16 AM Euler Taveira  wrote:
>
> On Thu, Jan 11, 2024, at 9:18 AM, Hayato Kuroda (Fujitsu) wrote:
>
> I have been concerned that the patch has not been tested by cfbot due to a
> patch application error. Also, some comments were raised. Therefore, I created a 
> patch
> to move forward.
>
>
> Let me send an updated patch to hopefully keep the CF bot happy. The following
> items are included in this patch:
>
> * drop physical replication slot if standby is using one [1].
> * cleanup small changes (copyright, .gitignore) [2][3]
> * fix getopt_long() options [2]
> * fix format specifier for some messages
> * move doc to Server Application section [4]
> * fix assert failure
> * ignore duplicate database names [2]
> * store subscriber server log into a separate file
> * remove MSVC support
>
> I'm still addressing other reviews and I'll post another version that includes
> it soon.
>
> [1] 
> https://www.postgresql.org/message-id/e02a2c17-22e5-4ba6-b788-de696ab74f1e%40app.fastmail.com
> [2] 
> https://www.postgresql.org/message-id/CALDaNm1joke42n68LdegN5wCpaeoOMex2EHcdZrVZnGD3UhfNQ%40mail.gmail.com
> [3] 
> https://www.postgresql.org/message-id/TY3PR01MB98895BA6C1D72CB8582CACC4F5682%40TY3PR01MB9889.jpnprd01.prod.outlook.com
> [4] 
> https://www.postgresql.org/message-id/TY3PR01MB988978C7362A101927070D29F56A2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
>

+ 
+  pg_subscriber
+  create a new logical replica from a standby server
+ 
I'm a bit confused by this wording, because we are converting a standby
to a logical replica, not creating a new logical replica and leaving the
standby as is. How about:

convert a standby replica to a logical replica

+  
+   The pg_subscriber should be run at the target
+   server. The source server (known as publisher server) should accept logical
+   replication connections from the target server (known as subscriber server).
+   The target server should accept local logical replication connection.
+  

What is *local logical replication*? I can't find any clue in the patch, can you
give me some hint?


>
> --
> Euler Taveira
> EDB   https://www.enterprisedb.com/
>


-- 
Regards
Junwang Zhao




Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2024-01-13 Thread Alexander Korotkov
Hi!

I think this is an in-demand and long-awaited feature.  The thread is
pretty long, but mostly it was disputes about how to save the errors.
The present patch includes the basic infrastructure and the ability to ignore
errors, so it's pretty simple.

On Sat, Jan 13, 2024 at 4:20 PM jian he  wrote:
> On Fri, Jan 12, 2024 at 10:59 AM torikoshia  
> wrote:
> >
> >
> > Thanks for reviewing!
> >
> > Updated the patch merging your suggestions except below points:
> >
> > > +   cstate->num_errors = 0;
> >
> > Since cstate is already initialized in below lines, this may be
> > redundant.
> >
> > | /* Allocate workspace and zero all fields */
> > | cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
> >
> >
> > >  +   Assert(!cstate->escontext->details_wanted);
> >
> > I'm not sure this is necessary, considering we're going to add other
> > options like 'table' and 'log', which need details_wanted soon.
> >
> >
> > --
> > Regards,
>
> Made it so the save_error_to option cannot be used with COPY TO.
> Added a redundant test: save_error_to with COPY TO.

I've incorporated these changes.  Also, I've changed
CopyFormatOptions.save_error_to to enum and made some edits in
comments and the commit message.  I'm going to push this if there are
no objections.

--
Regards,
Alexander Korotkov


0001-Add-new-COPY-option-SAVE_ERROR_TO-v3.patch
Description: Binary data


Postgres and --config-file option

2024-01-13 Thread David G. Johnston
On Saturday, January 13, 2024, Nathan Bossart 
wrote:

> On Sat, Jan 13, 2024 at 01:39:50PM +0300, Aleksander Alekseev wrote:
>
> > Should we remove --config-file from the error message to avoid any
> > confusion? Should we correct --help output? Should we update the
> > documentation?
>
> It might be worthwhile to update the documentation if it would've helped
> prevent confusion here.
>

Pointing out the long form in the -c definition makes sense.

As for the help message, I’d minimally add:

“You must specify the --config-file (or equivalent -c) or -D invocation …”

I’m fine with the status quo regarding the overview documentation
mentioning both forms.  I also haven’t tested whether PGOPTIONS accepts
both forms or only the -c form as presently documented.  Presently the
--name=value form seems discouraged in favor of -c which I’m ok with and
trying to mention both everywhere seems needlessly verbose.   But I’d be
interested in reviewing against an informed patch improving this area more
broadly than dealing with this single deviant usage.  I do like this
specific usage of the long-form option.

David J.


Re: SQL:2011 application time

2024-01-13 Thread jian he
On Thu, Jan 11, 2024 at 10:44 PM Peter Eisentraut  wrote:
>
> On 31.12.23 09:51, Paul Jungwirth wrote:
> > On Wed, Dec 6, 2023 at 12:59 AM Peter Eisentraut 
> > wrote:
> >  >
> >  > On 02.12.23 19:41, Paul Jungwirth wrote:
> >  > > So what do you think of this idea instead?:
> >  > >
> >  > > We could add a new (optional) support function to GiST that translates
> >  > > "well-known" strategy numbers into the opclass's own strategy numbers.
> >  >
> >  > I had some conversations about this behind the scenes.  I think this
> >  > idea makes sense.
> >
> > Here is a patch series with the GiST stratnum support function added. I
> > put this into a separate patch (before all the temporal ones), so it's
> > easier to review. Then in the PK patch (now #2) we call that function to
> > figure out the = and && operators. I think this is a big improvement.
>
> I like this solution.
>
> Here is some more detailed review of the first two patches.  (I reviewed
> v20; I see you have also posted v21, but they don't appear very
> different for this purpose.)
>
> v20-0001-Add-stratnum-GiST-support-function.patch
>
> * contrib/btree_gist/Makefile
>
> Needs corresponding meson.build updates.

fixed

>
> * contrib/btree_gist/btree_gist--1.7--1.8.sql
>
> Should gist_stratnum_btree() live in contrib/btree_gist/ or in core?
> Are there other extensions that use the btree strategy numbers for
> gist?
>
> +ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
> +   FUNCTION  12 (varbit, varbit) gist_stratnum_btree (int2) ;
>
> Is there a reason for the extra space after FUNCTION here (repeated
> throughout the file)?
>

fixed.

> +-- added in 1.4:
>
> What is the purpose of these "added in" comments?
>
>
> v20-0002-Add-temporal-PRIMARY-KEY-and-UNIQUE-constraints.patch
>
> * contrib/btree_gist/Makefile
>
> Also update meson.build.

fixed.

> * contrib/btree_gist/sql/without_overlaps.sql
>
> Maybe also insert a few values, to verify that the constraint actually
> does something?
>

I added an ok and failed INSERT.

> * doc/src/sgml/ref/create_table.sgml
>
> Is "must have a range type" still true?  With the changes to the
> strategy number mapping, any type with a supported operator class
> should work?
>
> * src/backend/utils/adt/ruleutils.c
>
> Is it actually useful to add an argument to
> decompile_column_index_array()?  Wouldn't it be easier to just print
> the " WITHOUT OVERLAPS" in the caller after returning from it?

fixed. i just print it right after decompile_column_index_array.

> * src/include/access/gist_private.h
>
> The added function gistTranslateStratnum() isn't really "private" to
> gist.  So access/gist.h would be a better place for it.
>
> Also, most other functions there appear to be named "GistSomething",
> so a more consistent name might be GistTranslateStratnum.
>
> * src/include/access/stratnum.h
>
> The added StrategyIsValid() doesn't seem that useful?  Plenty of
> existing code just compares against InvalidStrategy, and there is only
> one caller for the new function.  I suggest to do without it.
>

If more StrategyNumbers are used in the future, would StrategyIsValid()
make sense then?

> * src/include/commands/defrem.h
>
> We are using two terms here, well-known strategy number and canonical
> strategy number, to mean the same thing (I think?).  Let's try to
> stick with one.  Or explain the relationship?
>

In my words:
For range types, the well-known strategy number and the canonical strategy
number are the same thing.
For types where GiST does not natively support equality, like int4,
GetOperatorFromCanonicalStrategy will be passed RTEqualStrategyNumber from
ComputeIndexAttrs
and will return BTEqualStrategyNumber.

> If these points are addressed, and maybe with another round of checking
> that all corner cases are covered, I think these patches (0001 and 0002)
> are close to ready.
>

The following is my review:

+ /* exclusionOpNames can be non-NIL if we are creating a partition */
+ if (iswithoutoverlaps && exclusionOpNames == NIL)
+ {
+ indexInfo->ii_ExclusionOps = palloc_array(Oid, nkeycols);
+ indexInfo->ii_ExclusionProcs = palloc_array(Oid, nkeycols);
+ indexInfo->ii_ExclusionStrats = palloc_array(uint16, nkeycols);
+ }
I am not sure the above comment is related to the code

+/*
+ * Returns the btree number for equals, otherwise invalid.
+ *
+ * This is for GiST opclasses in btree_gist (and maybe elsewhere)
+ * that use the BT*StrategyNumber constants.
+ */
+Datum
+gist_stratnum_btree(PG_FUNCTION_ARGS)
+{
+ StrategyNumber strat = PG_GETARG_UINT16(0);
+
+ switch (strat)
+ {
+ case RTEqualStrategyNumber:
+ PG_RETURN_UINT16(BTEqualStrategyNumber);
+ case RTLessStrategyNumber:
+ PG_RETURN_UINT16(BTLessStrategyNumber);
+ case RTLessEqualStrategyNumber:
+ PG_RETURN_UINT16(BTLessEqualStrategyNumber);
+ case RTGreaterStrategyNumber:
+ PG_RETURN_UINT16(BTGreaterStrategyNumber);
+ case RTGreaterEqualStrategyNumber:
+ PG_RETURN_UINT16(BTGreaterEqualStrategyNumber);
+ default:
+ PG_RETURN_UINT16(InvalidStrategy);
+ }
the above comment seems 

Re: Postgres and --config-file option

2024-01-13 Thread Nathan Bossart
On Sat, Jan 13, 2024 at 01:39:50PM +0300, Aleksander Alekseev wrote:
> OK, let's check section "20.1.4. Parameter Interaction via the Shell"
> [1] of the documentation. Currently it doesn't tell anything about the
> ability to specify GUCs --like-this, unless I missed something.

It appears to be documented for 'postgres' as follows [0]:

--name=value
Sets a named run-time parameter; a shorter form of -c.

and similarly within the --help output:

--NAME=VALUE   set run-time parameter

Its documentation also describes this method of specifying parameters in
the 'Examples' section.  The section you refer to calls out "-c", so it is
sort-of indirectly mentioned, but that might be a bit of a generous
assessment.

> Should we remove --config-file from the error message to avoid any
> confusion? Should we correct --help output? Should we update the
> documentation?

It might be worthwhile to update the documentation if it would've helped
prevent confusion here.

Separately, I noticed that this is implemented in postmaster.c by looking
for the '-' option character returned by getopt(), and I'm wondering why
this doesn't use getopt_long() instead.  AFAICT this dates back to the
introduction of GUCs in 6a68f426 (May 2000).
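
For illustration, here is a minimal standalone sketch (my own names, not
the actual postmaster.c code) of the splitting that has to happen for a
"--name=value" long option once getopt() hands it back:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy version of long-option splitting; illustrative only. */
static void
split_long_option(const char *arg, char **name, char **value)
{
    const char *eq = strchr(arg, '=');

    if (eq != NULL)
    {
        size_t len = (size_t) (eq - arg);

        *name = malloc(len + 1);
        memcpy(*name, arg, len);
        (*name)[len] = '\0';
        *value = strdup(eq + 1);
    }
    else
    {
        *name = strdup(arg);
        *value = NULL;          /* flag-style option without a value */
    }
}

int
main(void)
{
    char *name;
    char *value;

    split_long_option("config_file=/tmp/postgresql.conf", &name, &value);
    printf("name=%s value=%s\n", name, value ? value : "(none)");
    free(name);
    free(value);
    return 0;
}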

[0] https://www.postgresql.org/docs/devel/app-postgres.html

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: introduce dynamic shared memory registry

2024-01-13 Thread Nathan Bossart
On Fri, Jan 12, 2024 at 01:45:55PM -0600, Nathan Bossart wrote:
> On Fri, Jan 12, 2024 at 11:13:46PM +0530, Abhijit Menon-Sen wrote:
>> At 2024-01-12 11:21:52 -0600, nathandboss...@gmail.com wrote:
>>> +  Each backend sould obtain a pointer to the reserved shared memory by
>> 
>> sould → should
> 
> D'oh.  Thanks.
> 
>>> +  Add-ins can reserve LWLocks on server startup.  Like with shared 
>>> memory,
>> 
>> (Would "As with shared memory" read better? Maybe, but then again maybe
>> it should be left alone because you also write "Unlike with" elsewhere.)
> 
> I think "As with shared memory..." sounds better here.

Here is a new version of the patch set with these changes.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From b56931edc4488a7376b27ba0e5519cc3a61b4899 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 11 Jan 2024 21:55:25 -0600
Subject: [PATCH v7 1/3] reorganize shared memory and lwlocks documentation

---
 doc/src/sgml/xfunc.sgml | 182 +---
 1 file changed, 114 insertions(+), 68 deletions(-)

diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 89116ae74c..82e1dadcca 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3397,90 +3397,136 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray

 

-Shared Memory and LWLocks
+Shared Memory
 
-
- Add-ins can reserve LWLocks and an allocation of shared memory on server
- startup.  The add-in's shared library must be preloaded by specifying
- it in
- shared_preload_libraries.
- The shared library should register a shmem_request_hook
- in its _PG_init function.  This
- shmem_request_hook can reserve LWLocks or shared memory.
- Shared memory is reserved by calling:
+
+ Requesting Shared Memory at Startup
+
+ 
+  Add-ins can reserve shared memory on server startup.  To do so, the
+  add-in's shared library must be preloaded by specifying it in
+  shared_preload_libraries.
+  The shared library should also register a
+  shmem_request_hook in its
+  _PG_init function.  This
+  shmem_request_hook can reserve shared memory by
+  calling:
 
 void RequestAddinShmemSpace(int size)
 
- from your shmem_request_hook.
-
-
- LWLocks are reserved by calling:
+  Each backend should obtain a pointer to the reserved shared memory by
+  calling:
+
+void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+
+  If this function sets foundPtr to
+  false, the caller should proceed to initialize the
+  contents of the reserved shared memory.  If foundPtr
+  is set to true, the shared memory was already
+  initialized by another backend, and the caller need not initialize
+  further.
+ 
+
+ 
+  To avoid race conditions, each backend should use the LWLock
+  AddinShmemInitLock when initializing its allocation
+  of shared memory, as shown here:
+
+static mystruct *ptr = NULL;
+bool        found;
+
+LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+ptr = ShmemInitStruct("my struct name", size, &found);
+if (!found)
+{
+... initialize contents of shared memory ...
+ptr->locks = GetNamedLWLockTranche("my tranche name");
+}
+LWLockRelease(AddinShmemInitLock);
+
+  shmem_startup_hook provides a convenient place for the
+  initialization code, but it is not strictly required that all such code
+  be placed in this hook.  Each backend will execute the registered
+  shmem_startup_hook shortly after it attaches to shared
+  memory.  Note that add-ins should still acquire
+  AddinShmemInitLock within this hook, as shown in the
+  example above.
+ 
+
+ 
+  An example of a shmem_request_hook and
+  shmem_startup_hook can be found in
+  contrib/pg_stat_statements/pg_stat_statements.c in
+  the PostgreSQL source tree.
+ 
+
+   
+
+   
+LWLocks
+
+
+ Requesting LWLocks at Startup
+
+ 
+  Add-ins can reserve LWLocks on server startup.  As with shared memory,
+  the add-in's shared library must be preloaded by specifying it in
+  shared_preload_libraries,
+  and the shared library should register a
+  shmem_request_hook in its
+  _PG_init function.  This
+  shmem_request_hook can reserve LWLocks by calling:
 
 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
 
- from your shmem_request_hook.  This will ensure that an array of
- num_lwlocks LWLocks is available under the name
- tranche_name.  Use GetNamedLWLockTranche
- to get a pointer to this array.
-
-
- An example of a shmem_request_hook can be found in
- contrib/pg_stat_statements/pg_stat_statements.c in the
- PostgreSQL source tree.
-
-
- There is another, more flexible method of obtaining LWLocks. First,
- allocate a tranche_id from a shared counter by
- calling:
+  

Fix a possible socket leak at Windows (src/backend/port/win32/socket.c)

2024-01-13 Thread Ranier Vilela
Hi.

While there are plans to remove the sockets functions (Windows) [1], I
believe it is worth fixing possible current bugs.

In the pgwin32_socket function (src/backend/port/win32/socket.c), there is
a possible socket leak if the socket cannot be made non-blocking.

Trivial patch attached.
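
To illustrate the shape of the fix (a sketch only, using standard Winsock
calls; this is not the attached patch):

#include <winsock2.h>

/* Sketch: if the socket cannot be switched to non-blocking mode, close
 * it before reporting failure instead of leaking the descriptor. */
static SOCKET
create_nonblocking_socket(int af, int type, int protocol)
{
    SOCKET s = socket(af, type, protocol);
    u_long on = 1;

    if (s == INVALID_SOCKET)
        return INVALID_SOCKET;

    if (ioctlsocket(s, FIONBIO, &on) == SOCKET_ERROR)
    {
        closesocket(s);         /* avoid the leak */
        return INVALID_SOCKET;
    }

    return s;
}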

Best regards,
Ranier Vilela

[1] Re: Windows sockets (select missing events?)



fix-socket-leak-windows.patch
Description: Binary data


Re: reorganize "Shared Memory and LWLocks" section of docs

2024-01-13 Thread Nathan Bossart
On Sat, Jan 13, 2024 at 01:49:08PM +0300, Aleksander Alekseev wrote:
> That's much better, thanks.
> 
> I think the patch could use another pair of eyes, ideally from a
> native English speaker. But if no one will express any objections for
> a while I suggest merging it.

Great.  I've attached a v3 with a couple of fixes suggested in the other
thread [0].  I'll wait a little while longer in case anyone else wants to
take a look.

[0] https://postgr.es/m/ZaF6UpYImGqVIhVp%40toroid.org

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From b56931edc4488a7376b27ba0e5519cc3a61b4899 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 11 Jan 2024 21:55:25 -0600
Subject: [PATCH v3 1/1] reorganize shared memory and lwlocks documentation

---
 doc/src/sgml/xfunc.sgml | 182 +---
 1 file changed, 114 insertions(+), 68 deletions(-)

diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 89116ae74c..82e1dadcca 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3397,90 +3397,136 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray

 

-Shared Memory and LWLocks
+Shared Memory
 
-
- Add-ins can reserve LWLocks and an allocation of shared memory on server
- startup.  The add-in's shared library must be preloaded by specifying
- it in
- shared_preload_libraries.
- The shared library should register a shmem_request_hook
- in its _PG_init function.  This
- shmem_request_hook can reserve LWLocks or shared memory.
- Shared memory is reserved by calling:
+
+ Requesting Shared Memory at Startup
+
+ 
+  Add-ins can reserve shared memory on server startup.  To do so, the
+  add-in's shared library must be preloaded by specifying it in
+  shared_preload_libraries.
+  The shared library should also register a
+  shmem_request_hook in its
+  _PG_init function.  This
+  shmem_request_hook can reserve shared memory by
+  calling:
 
 void RequestAddinShmemSpace(int size)
 
- from your shmem_request_hook.
-
-
- LWLocks are reserved by calling:
+  Each backend should obtain a pointer to the reserved shared memory by
+  calling:
+
+void *ShmemInitStruct(const char *name, Size size, bool *foundPtr)
+
+  If this function sets foundPtr to
+  false, the caller should proceed to initialize the
+  contents of the reserved shared memory.  If foundPtr
+  is set to true, the shared memory was already
+  initialized by another backend, and the caller need not initialize
+  further.
+ 
+
+ 
+  To avoid race conditions, each backend should use the LWLock
+  AddinShmemInitLock when initializing its allocation
+  of shared memory, as shown here:
+
+static mystruct *ptr = NULL;
+bool        found;
+
+LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+ptr = ShmemInitStruct("my struct name", size, &found);
+if (!found)
+{
+... initialize contents of shared memory ...
+ptr->locks = GetNamedLWLockTranche("my tranche name");
+}
+LWLockRelease(AddinShmemInitLock);
+
+  shmem_startup_hook provides a convenient place for the
+  initialization code, but it is not strictly required that all such code
+  be placed in this hook.  Each backend will execute the registered
+  shmem_startup_hook shortly after it attaches to shared
+  memory.  Note that add-ins should still acquire
+  AddinShmemInitLock within this hook, as shown in the
+  example above.
+ 
+
+ 
+  An example of a shmem_request_hook and
+  shmem_startup_hook can be found in
+  contrib/pg_stat_statements/pg_stat_statements.c in
+  the PostgreSQL source tree.
+ 
+
+   
+
+   
+LWLocks
+
+
+ Requesting LWLocks at Startup
+
+ 
+  Add-ins can reserve LWLocks on server startup.  As with shared memory,
+  the add-in's shared library must be preloaded by specifying it in
+  shared_preload_libraries,
+  and the shared library should register a
+  shmem_request_hook in its
+  _PG_init function.  This
+  shmem_request_hook can reserve LWLocks by calling:
 
 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
 
- from your shmem_request_hook.  This will ensure that an array of
- num_lwlocks LWLocks is available under the name
- tranche_name.  Use GetNamedLWLockTranche
- to get a pointer to this array.
-
-
- An example of a shmem_request_hook can be found in
- contrib/pg_stat_statements/pg_stat_statements.c in the
- PostgreSQL source tree.
-
-
- There is another, more flexible method of obtaining LWLocks. First,
- allocate a tranche_id from a shared counter by
- calling:
+  This ensures that an array of num_lwlocks LWLocks is
+  available under the name tranche_name.  A pointer to
+  this array can be obtained by calling:
 
-int LWLockNewTrancheId(void)

Re: Recovering from detoast-related catcache invalidations

2024-01-13 Thread Tom Lane
I wrote:
> Xiaoran Wang  writes:
>> Hmm, how about first checking whether any shared invalidation messages have
>> been accepted, then rechecking the tuple's visibility?
>> If no shared invalidation message was accepted during
>> 'toast_flatten_tuple',
>> there is no need to do the visibility check, which can save several
>> CPU cycles.

> Meh, I'd just as soon not add the additional dependency/risk of bugs.
> This is an expensive and seldom-taken code path, so I don't think
> shaving a few cycles is really important.

It occurred to me that this idea might be more interesting if we
could encapsulate it right into systable_recheck_tuple: something
like having systable_beginscan capture the current
SharedInvalidMessageCounter and save it in the SysScanDesc struct,
then compare in systable_recheck_tuple to possibly short-circuit
that work.  This'd eliminate one of the main bug hazards in the
idea, namely that you might capture SharedInvalidMessageCounter too
late, after something's already happened.  However, the whole idea
only works for catalogs that have catcaches, and the other users of
systable_recheck_tuple are interested in pg_depend which doesn't.
So that put a damper on my enthusiasm for the idea.
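
As a toy model of that short-circuit (all names are illustrative, not
actual PostgreSQL source):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Capture a global invalidation counter at scan start; on recheck, skip
 * the expensive visibility test if the counter has not moved since. */
static uint64_t shared_inval_message_counter = 0;

typedef struct ToyScan
{
    uint64_t inval_count_at_start;
} ToyScan;

static void
toy_beginscan(ToyScan *scan)
{
    scan->inval_count_at_start = shared_inval_message_counter;
}

static bool
toy_recheck_tuple(ToyScan *scan)
{
    if (scan->inval_count_at_start == shared_inval_message_counter)
        return true;            /* no invalidations: short-circuit */
    printf("performing the full visibility recheck\n");
    return true;                /* the real recheck result would go here */
}

int
main(void)
{
    ToyScan scan;

    toy_beginscan(&scan);
    toy_recheck_tuple(&scan);           /* short-circuits */
    shared_inval_message_counter++;     /* an invalidation arrives */
    toy_recheck_tuple(&scan);           /* does the full recheck */
    return 0;
}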

regards, tom lane




Re: cleanup patches for incremental backup

2024-01-13 Thread Alexander Lakhin

Hello Robert,

12.01.2024 17:50, Robert Haas wrote:
> On Thu, Jan 11, 2024 at 8:00 PM Shinoda, Noriyoshi (HPE Services Japan
> - FSIP)  wrote:
>> Thank you for developing the new tool. I have attached a patch that corrects
>> the spelling of the --individual option in the SGML file.
>
> Thanks, committed.


I've found one more typo in the sgml:
summarized_pid
And one in a comment:
sumamry

A trivial fix is attached.

Best regards,
Alexander

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0f7d409e60..210c7c0b02 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26578,7 +26578,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
 record that it has consumed, which must always be greater than or
 equal to summarized_lsn; if the WAL summarizer is
 not running, it will be equal to summarized_lsn.
-summarized_pid is the PID of the WAL summarizer
+summarizer_pid is the PID of the WAL summarizer
 process, if it is running, and otherwise NULL.

   
diff --git a/src/bin/pg_walsummary/t/002_blocks.pl b/src/bin/pg_walsummary/t/002_blocks.pl
index d473471bc7..d609d2c547 100644
--- a/src/bin/pg_walsummary/t/002_blocks.pl
+++ b/src/bin/pg_walsummary/t/002_blocks.pl
@@ -63,7 +63,7 @@ SELECT EXISTS (
 )
 EOM
 
-# Figure out the exact details for the new sumamry file.
+# Figure out the exact details for the new summary file.
 my $details = $node1->safe_psql('postgres', < '$summarized_lsn'


Re: collect_corrupt_items_vacuum.patch

2024-01-13 Thread Dmitry Koval

Hi!

Thank you, there is one small point left (in the comment): can you 
replace "guarantteed to be to be newer" to "guaranteed to be newer", 
file src/backend/storage/ipc/procarray.c?


--
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com




Re: [DOC] Add detail regarding resource consumption wrt max_connections

2024-01-13 Thread Roberto Mello
On Fri, Jan 12, 2024 at 3:15 PM Cary Huang  wrote:

> I think it is good to warn the user about the increased allocation of
> memory for certain parameters so that they do not abuse it by setting it to
> a huge number without knowing the consequences.
>
> It is true that max_connections can increase the size of proc array and
> other resources, which are allocated in the shared buffer, which also means
> less shared buffer to perform regular data operations. I am sure this is
> not the only parameter that affects the memory allocation.
> "max_prepared_xacts" can also affect the shared memory allocation too so
> the same warning message applies here as well. Maybe there are other
> parameters with similar effects.
>
> Instead of stating that higher max_connections results in higher
> allocation, It may be better to tell the user that if the value needs to be
> set much higher, consider increasing the "shared_buffers" setting as well.
>

Appreciate the review, Cary.

My goal was to inform the reader that there are implications to setting
max_connections higher. I've personally seen a user mindlessly set this to
50k connections, unaware it would cause unintended consequences.

I can add a suggestion for the user to consider increasing shared_buffers
in accordance with higher max_connections, but it would be better if there
was a "rule of thumb" guideline to go along. I'm open to suggestions.

I can revise with a similar warning in max_prepared_xacts as well.

Sincerely,

Roberto


Re: Recovering from detoast-related catcache invalidations

2024-01-13 Thread Tom Lane
Xiaoran Wang  writes:
> Hmm, how about first checking whether any shared invalidation messages have
> been accepted, then rechecking the tuple's visibility?
> If no shared invalidation message was accepted during
> 'toast_flatten_tuple',
> there is no need to do the visibility check, which can save several
> CPU cycles.

Meh, I'd just as soon not add the additional dependency/risk of bugs.
This is an expensive and seldom-taken code path, so I don't think
shaving a few cycles is really important.

regards, tom lane




Re: Custom explain options

2024-01-13 Thread Konstantin Knizhnik


On 13/01/2024 4:51 pm, Tomas Vondra wrote:


On 1/12/24 20:30, Konstantin Knizhnik wrote:

On 12/01/2024 7:03 pm, Tomas Vondra wrote:

On 10/21/23 14:16, Konstantin Knizhnik wrote:

Hi hackers,

EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS,
COST,...) which help to provide useful details of query execution.
In Neon we have added PREFETCH option which shows information about page
prefetching during query execution (prefetching is more critical for
Neon
architecture because of separation of compute and storage, so it is
implemented not only for bitmap heap scan as in Vanilla Postgres, but
also for seqscan, indexscan and indexonly scan). Another possible
candidate  for explain options is local file cache (extra caching layer
above shared buffers which is used to somehow replace file system cache
in standalone Postgres).

Not quite related to this patch about EXPLAIN options, but can you share
some details how you implemented prefetching for the other nodes?

I'm asking because I've been working on prefetching for index scans, so
I'm wondering if there's a better way to do this, or how to do it in a
way that would allow neon to maybe leverage that too.

regards


Yes, I am looking at your PR. What we have implemented in Neon is more
specific to the Neon architecture, where storage is separated from compute.
So each page not found in shared buffers has to be downloaded from the page
server. It adds quite noticeable latency, because of the network roundtrip.
While vanilla Postgres can rely on the OS file system cache when a page is not
found in shared buffers (access to the OS file cache is certainly slower than
to shared buffers
because of the syscall and copying of the page, but the performance penalty is
not very large - less than 15%), Neon has no local files and so has to send
a request to the socket.

This is why we have to perform aggressive prefetching whenever it is
possible (when it is possible to predict the order of subsequent pages).
Unlike vanilla Postgres, which implements prefetch only for bitmap heap
scan, we have implemented it for seqscan, index scan, index-only scan,
bitmap heap scan, vacuum, and pg_prewarm.
The main difference between the Neon prefetch and vanilla Postgres prefetch
is that the first one is backend-specific, so each backend prefetches only
the pages which it needs.
This is why we had to rewrite the prefetch for bitmap heap scan, which is
using `fadvise` and assumes that pages prefetched by one backend into the
file cache can be used by any other backend.


I do understand why prefetching is important in neon (likely more than
for core postgres). I'm interested in how it's actually implemented,
whether it's somehow similar to how my patch does things or in some
different (perhaps neon-specific way), and if the approaches are
different then what are the pros/cons. And so on.

So is it implemented in the neon-specific storage, somehow, or where/how
does neon issue the prefetch requests?


Neon mostly preserves the Postgres prefetch mechanism, so we are using
PrefetchBuffer, which checks if the page is present in shared buffers
and, if not, calls smgrprefetch. We are using our own storage manager
implementation which, instead of reading pages from local disk, downloads
them from the page server.
The prefetch implementation in the Neon storage manager is obviously also
different from the one in vanilla Postgres, which uses posix_fadvise.
The Neon prefetch implementation inserts a prefetch request into a ring
buffer and sends it to the server. When a read operation is performed, we
check if there is a corresponding prefetch request in the ring buffer and,
if so, wait for its completion.
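
As a rough, self-contained illustration of that flow (names are purely
illustrative; this is not the Neon code):

#include <stdbool.h>
#include <stdio.h>

#define RING_SIZE 128

/* Toy per-backend prefetch ring: remember in-flight prefetch requests;
 * a read first looks for a matching request and "waits" for it instead
 * of issuing a fresh synchronous request. */
typedef struct ToyPrefetchRing
{
    unsigned long pages[RING_SIZE];
    int head;                   /* next free slot */
    int tail;                   /* oldest in-flight request */
} ToyPrefetchRing;

static void
toy_prefetch(ToyPrefetchRing *ring, unsigned long pageno)
{
    ring->pages[ring->head++ % RING_SIZE] = pageno;
    /* ...send the request to the page server here... */
}

static bool
toy_read(ToyPrefetchRing *ring, unsigned long pageno)
{
    for (int i = ring->tail; i < ring->head; i++)
    {
        if (ring->pages[i % RING_SIZE] == pageno)
        {
            ring->tail = i + 1; /* ...wait for completion here... */
            return true;        /* satisfied by an earlier prefetch */
        }
    }
    return false;               /* ...fall back to a synchronous read... */
}

int
main(void)
{
    ToyPrefetchRing ring = {0};

    toy_prefetch(&ring, 42);
    printf("page 42 prefetched: %d\n", toy_read(&ring, 42));    /* 1 */
    printf("page 43 prefetched: %d\n", toy_read(&ring, 43));    /* 0 */
    return 0;
}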


As I already wrote, prefetch is done locally for each backend, and each 
backend has its own connection with the page server. This may change in 
the future when we implement multiplexing of page server connections, but 
right now prefetch is local. And certainly prefetch can improve 
performance only if we correctly predict subsequent page requests.
If not, then the page server does useless work and the backend has to wait 
for and consume all the issued prefetch requests. This is why, in the 
prefetch implementation for most of the nodes, we start with a minimal 
prefetch distance and then increase it. That allows prefetch to kick in 
only for those queries where it is really efficient (OLAP) and not to 
degrade the performance of simple OLTP queries.


Our prefetch implementation is also compatible with parallel plans, but 
here we need to reserve some range of pages for each parallel worker 
instead of picking pages from some shared queue on demand. That is one of 
the major differences from the Postgres prefetch using posix_fadvise: each 
backend should prefetch only those pages which it is going to read.



Concerning index scans, we have implemented two different approaches: for
index-only scan we try to prefetch leaf pages, and for index scan we
prefetch the referenced heap pages.

In my experience the IOS handling (only prefetching leaf pages) is very
limiting, and may easily lead to index-only scans being way slower 

Re: On login trigger: take three

2024-01-13 Thread Alexander Lakhin

Hello Alexander,

I've discovered one more instability in the event_trigger_login test.
Please look for example at case [1]:
ok 213   + event_trigger   28946 ms
not ok 214   - event_trigger_login  6430 ms
ok 215   - fast_default    19349 ms
ok 216   - tablespace  44789 ms
1..216
# 1 of 216 tests failed.

--- /home/bf/bf-build/skink-master/HEAD/pgsql/src/test/regress/expected/event_trigger_login.out 2023-10-27 
22:55:12.574139524 +
+++ /home/bf/bf-build/skink-master/HEAD/pgsql.build/src/test/regress/results/event_trigger_login.out 2024-01-03 
23:49:50.177461816 +

@@ -40,6 +40,6 @@
 SELECT dathasloginevt FROM pg_database WHERE datname= :'DBNAME';
  dathasloginevt
 
- f
+ t
 (1 row)

2024-01-03 23:49:40.378 UTC [2235175][client backend][3/5949:0] STATEMENT:  
REINDEX INDEX CONCURRENTLY concur_reindex_ind;
...
2024-01-03 23:49:50.340 UTC [2260974][autovacuum worker][5/5439:18812] LOG:  automatic vacuum of table 
"regression.pg_catalog.pg_statistic": index scans: 1

(operations logged around 23:49:50.177461816)

I suspected that this failure was caused by autovacuum, and I've managed to
reproduce it without Valgrind or slowing down a machine.
With /tmp/extra.config:
log_autovacuum_min_duration = 0
autovacuum_naptime = 1
autovacuum = on

I run:
for ((i=1;i<10;i++)); do echo "ITERATION $i"; \
TEMP_CONFIG=/tmp/extra.config \
TESTS=$(printf 'event_trigger_login %.0s' `seq 1000`) \
make -s check-tests || break; done
and get failures on iterations 1, 2, 1...
ok 693   - event_trigger_login    15 ms
not ok 694   - event_trigger_login    15 ms
ok 695   - event_trigger_login    21 ms

Also, I've observed an anomaly after dropping a login event trigger:
CREATE FUNCTION on_login_proc() RETURNS event_trigger AS $$
BEGIN RAISE NOTICE 'You are welcome!'; END; $$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER olt ON login EXECUTE PROCEDURE on_login_proc();
SELECT dathasloginevt FROM pg_database WHERE datname= current_database();
 dathasloginevt

 t
(1 row)

DROP EVENT TRIGGER olt;
SELECT dathasloginevt FROM pg_database WHERE datname= current_database();
 dathasloginevt

 t
(1 row)

Although after reconnection (\c, as done in the event_trigger_login test),
this query returns 'f'.

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2023%3A04%3A20
[2] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-12-26%2019%3A33%3A02

Best regards,
Alexander




Re: Documentation to upgrade logical replication cluster

2024-01-13 Thread vignesh C
On Fri, 5 Jan 2024 at 10:49, Peter Smith  wrote:
>
> Here are some review comments for patch v1-0001.
>
> ==
> doc/src/sgml/ref/pgupgrade.sgml
>
> 1. GENERAL - blank lines
>
> Most (but not all) of your procedure steps are preceded by blank lines
> to make them more readable in the SGML. Add the missing blank lines
> for the steps that didn't have them.

Modified

> 2. GENERAL - for e.g.:
>
> All the "for e.g:" that precedes the code examples can just say
> "e.g.:" like in other examples on this page.

Modified

> ~~~
> 3. GENERAL - reference from elsewhere
>
> I was wondering if "Chapter 30. Logical Replication" should include a
> section that references back to all this just to make it easier to
> find.

I have moved this to Chapter 30 now as it is more applicable there and
also based on feedback from Amit at [1].

> ~~~
>
> 4.
> +
> + Migration of logical replication clusters can be done when all the 
> members
> + of the old logical replication clusters are version 17.0 or later.
> +
>
> /can be done when/is possible only when/

Modified

> ~~~
>
> 5.
> +
> + The prerequisites of publisher upgrade applies to logical Replication
> + cluster upgrades also. See 
> + for the details of publisher upgrade prerequisites.
> +
>
> /applies to/apply to/
> /logical Replication/logical replication/

Modified

> ~~~
>
> 6.
> +
> + The prerequisites of subscriber upgrade applies to logical Replication
> + cluster upgrades also. See 
> + for the details of subscriber upgrade prerequisites.
> +
> +   
>
> /applies to/apply to/
> /logical Replication/logical replication/

Modified

> ~~~
>
> 7.
> +
> + The steps to upgrade logical replication clusters in various scenarios 
> are
> + given below.
> +
>
> The 3 titles do not render very prominently, so it is too easy to get
> lost scrolling up and down looking for the different scenarios. If the
> title rendering can't be improved, at least a list of 3 links here
> (like a TOC) would be helpful.

I added a list of these 3 links in the beginning.

> ~~~
>
> //
> Steps to Upgrade 2 node logical replication cluster
> //
>
> 8. GENERAL - server names
>
> I noticed in this set of steps you called the servers 'pub_data' and
> 'pub_upgraded_data' and 'sub_data' and 'sub_upgraded_data'. I see it
> is easy to read like this, it is also different from all the
> subsequent procedures where the names are just like 'data1', 'data2',
> 'data3', and 'data1_upgraded', 'data2_upgraded', 'data3_upgraded'.
>
> I felt maybe it is better to use a consistent naming for all the procedures.

Modified

> ~~~
>
> 9.
> + 
> +  Steps to Upgrade 2 node logical replication cluster
>
> SUGGESTION
> Steps to upgrade a two-node logical replication cluster

Modified

> ~~~
>
> 10.
> +
> +   
> +
> + 
> +  Let's say publisher is in node1 and subscriber 
> is
> +  in node2.
> + 
> +
>
> 10a.
> This renders as Step 1. But IMO this should not be a "step" at all --
> it's just a description of the scenario.

Modified

> ~
>
> 10b.
> The subsequent steps refer to subscriptions 'sub1_node1_node2' and
> 'sub2_node1_node2'. IMO it would help with the example code if those
> are named up front here too. e.g.
>
> node2 has two subscriptions for changes from node1:
> sub1_node1_node2
> sub2_node1_node2

Modified

> ~~~
>
> 11.
> +
> + 
> +  Upgrade the publisher node node1's server to the
> +  required newer version, for e.g.:
>
> The wording repeating node/node1 seems complicated.
>
> SUGGESTION
> Upgrade the publisher node's server to the required newer version, e.g.:

Modified

> ~~~
>
> 12.
> +
> + 
> +  Start the upgraded publisher node
> node1's server, for e.g.:
>
> IMO better to use the similar wording used for the "Stop" step
>
> SUGGESTION
> Start the upgraded publisher server in node1, e.g.:

Modified

> ~~~
>
> 13.
> +
> + 
> +  Upgrade the subscriber node node2's server to
> +  the required new version, for e.g.:
>
> The wording repeating node/node2 seems complicated.
>
> SUGGESTION
> Upgrade the subscriber node's server to the required newer version, e.g.:

Modified

> ~~~
>
> 14.
> +
> + 
> +  Start the upgraded subscriber node node2's 
> server,
> +  for e.g.:
>
> IMO better to use the similar wording used for the "Stop" step
>
> SUGGESTION
> Start the upgraded subscriber server in node2, e.g.:

Modified

> ~~~
>
> 15.
> +
> + 
> +  Create any tables that were created in the upgraded
> publisher node1
> +  server between step-5 and now, for e.g.:
> +
> +node2=# CREATE TABLE distributors (
> +node2(# did  integer CONSTRAINT no_null NOT NULL,
> +node2(# name varchar(40) NOT NULL
> +node2(# );
> +CREATE TABLE
> +
> + 
> +
>
> 15a
> Maybe it is better to have a link to step-5 instead of 

Re: POC: GROUP BY optimization

2024-01-13 Thread Alexander Korotkov
On Sat, Jan 13, 2024 at 11:09 AM Andrei Lepikhov
 wrote:
> On 11/1/2024 18:30, Alexander Korotkov wrote:
> > On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov  wrote:
> >>> Hmm, I don't see this old code in these patches. Resend 0002-* because
> >>> of trailing spaces.
> >>
> >>
> >> AFAIK, cfbot does not seek old versions of patchset parts in previous 
> >> messages. So for it to run correctly, a new version of the whole patchset 
> >> should be sent even if most patches are unchanged.
> >
> > Please, find the revised patchset with some refactoring and comments
> > improvement from me.  I'll continue looking into this.
> The patch looks better, thanks to your refactoring.
> I propose additional comments and tests to make the code more
> understandable (see attachment).
> I intended to look into this part of the code more, but the tests show a
> difference between PostgreSQL v.15 and v.16, which causes changes in the
> code of this feature.

Makes sense.  I've incorporated your changes into the patchset.

--
Regards,
Alexander Korotkov


0001-Generalize-common-code-of-adding-sort-befor-20240113.patch
Description: Binary data


0002-Explore-alternative-orderings-of-group-by-p-20240113.patch
Description: Binary data


Re: Custom explain options

2024-01-13 Thread Tomas Vondra



On 1/12/24 20:30, Konstantin Knizhnik wrote:
> 
> On 12/01/2024 7:03 pm, Tomas Vondra wrote:
>> On 10/21/23 14:16, Konstantin Knizhnik wrote:
>>> Hi hackers,
>>>
>>> EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS,
>>> COST,...) which help to provide useful details of query execution.
>>> In Neon we have added PREFETCH option which shows information about page
>>> prefetching during query execution (prefetching is more critical for
>>> Neon
>>> architecture because of separation of compute and storage, so it is
>>> implemented not only for bitmap heap scan as in Vanilla Postgres, but
>>> also for seqscan, indexscan and indexonly scan). Another possible
>>> candidate  for explain options is local file cache (extra caching layer
>>> above shared buffers which is used to somehow replace file system cache
>>> in standalone Postgres).
>> Not quite related to this patch about EXPLAIN options, but can you share
>> some details how you implemented prefetching for the other nodes?
>>
>> I'm asking because I've been working on prefetching for index scans, so
>> I'm wondering if there's a better way to do this, or how to do it in a
>> way that would allow neon to maybe leverage that too.
>>
>> regards
>>
> Yes, I am looking at your PR. What we have implemented in Neon is more
> specific to Neon architecture where storage is separated from compute.
> So each page not found in shared buffers has to be downloaded from page
> server. It adds quite noticeable latency, because of network roundtrip.
> While vanilla Postgres can rely on OS file system cache when page is not
> found in shared buffer (access to OS file cache is certainly slower than
> to shared buffers
> because of syscall and copying of page, but performance penalty is not
> very large - less than 15%), Neon has no local files and so has to send
> request to the socket.
> 
> This is why we have to perform aggressive prefetching whenever it is
> possible (when it is possible to predict the order of subsequent pages).
> Unlike vanilla Postgres which implements prefetch only for bitmap heap
> scan, we have implemented it for seqscan, index scan, indexonly scan,
> bitmap heap scan, vacuum, pg_prewarm.
> The main difference between Neon prefetch and vanilla Postgres prefetch
> is that first one is backend specific. So each backend prefetches only
> pages which it needs.
> This is why we have to rewrite prefetch for bitmap heap scan, which is
> using `fadvise` and assumes that pages prefetched by one backend in file
> cache, can be used by any other backend.
> 

I do understand why prefetching is important in neon (likely more than
for core postgres). I'm interested in how it's actually implemented,
whether it's somehow similar to how my patch does things or in some
different (perhaps neon-specific way), and if the approaches are
different then what are the pros/cons. And so on.

So is it implemented in the neon-specific storage, somehow, or where/how
does neon issue the prefetch requests?

> 
> Concerning index scan we have implemented two different approaches: for
> index only scan we try to prefetch leaf pages and for index scan we
> prefetch referenced heap pages.

In my experience the IOS handling (only prefetching leaf pages) is very
limiting, and may easily lead to index-only scans being way slower than
regular index scans. Which is super surprising for users. It's why I
ended up improving the prefetcher to optionally look at the VM etc.

> In both cases we start from prefetch distance 0 and increase it until it
> reaches `effective_io_concurrency` for this relation. Doing so we try to
> avoid prefetching of useless pages and slowdown of "point" lookups
> returning one or few records.
> 

Right, the regular prefetch ramp-up. My patch does the same thing.
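
For reference, the ramp-up idea in miniature (illustrative only, not the
actual patch):

#include <stdio.h>

/* Start at distance 0 and grow toward effective_io_concurrency, so that
 * short "point" lookups never pay for useless prefetches while long
 * scans reach the full prefetch depth. */
int
main(void)
{
    const int effective_io_concurrency = 8;
    int prefetch_distance = 0;

    for (int page = 0; page < 12; page++)
    {
        printf("reading page %2d, prefetch distance %d\n",
               page, prefetch_distance);
        if (prefetch_distance < effective_io_concurrency)
            prefetch_distance++;        /* ramp up gradually */
    }
    return 0;
}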

> If you are interested, you can look at our implementation in neon repo:
> all source are available. But briefly speaking, each backend has its own
> prefetch ring (prefetch requests which are waiting for response). The
> key idea is that we can send several prefetch requests to page server
> and then receive multiple replies. It allows to increased speed of OLAP
> queries up to 10 times.
> 

Can you point me to the actual code / branch where it happens? I did
check the github repo, but I don't see anything relevant in the default
branch (REL_15_STABLE_neon). There are some "prefetch" branches, but
those seem abandoned.

> Heikki thinks that prefetch can be somehow combined with async-io
> proposal (based on io_uring). But right now they have nothing in common.
> 

I can imagine async I/O being useful here, but I find the flow of I/O
requests is quite complex / goes through multiple layers. Or maybe I
just don't understand how it should work.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Add additional extended protocol commands to psql: \parse and \bindx

2024-01-13 Thread Jelte Fennema-Nio
On Thu, 2 Nov 2023 at 10:52, Anthonin Bonnefoy
 wrote:
> The main goal is to provide more ways to test extended protocol in
> regression tests
> similarly to what \bind is doing.

I think this is a great addition. One thing that I think should be
added for completeness though is the ability to deallocate the
prepared statement using PQsendClosePrepared. Other than that the
changes look great.
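
For context, the libpq calls that such psql meta-commands would wrap look
roughly like this (a sketch with error handling omitted; the statement
name and query are mine):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("");     /* connection string assumed */
    const char *values[1] = {"42"};
    PGresult   *res;

    /* \parse: prepare a named statement via the extended protocol */
    res = PQprepare(conn, "stmt1", "SELECT $1::int", 1, NULL);
    PQclear(res);

    /* \bindx: bind parameters to the named statement and execute it */
    res = PQexecPrepared(conn, "stmt1", 1, values, NULL, NULL, 0);
    printf("result: %s\n", PQgetvalue(res, 0, 0));
    PQclear(res);

    /* the suggested completeness addition: close the prepared statement */
    res = PQclosePrepared(conn, "stmt1");
    PQclear(res);

    PQfinish(conn);
    return 0;
}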

Also a tiny nitpick: stmt_name should be replaced with STMT_NAME in
this line of the help message.

> +   HELP0("  \\bindx stmt_name [PARAM]...\n"




Re: Tidy fill hstv array (src/backend/access/heap/pruneheap.c)

2024-01-13 Thread Ranier Vilela
On Tue, Jan 9, 2024 at 06:31, John Naylor 
wrote:

> On Thu, Dec 28, 2023 at 7:58 PM Ranier Vilela  wrote:
> > I think it would be more productive to initialize the entire array with
> -1, and avoid flagging, one by one, the items that should be -1.
>
> This just moves an operation from one place to the other, while
> obliterating the explanatory comment, so I don't see an advantage.
>
Well, I think that is precisely a case for using memset.
The way initialization is currently done is much slower and harmful to
branch prediction.
Of course, the gain should be small, but it fully justifies
switching to memset.
Regarding the comment: once initialization is done via memset, as with
prstate.marked, it becomes irrelevant and unnecessary.
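
A self-contained sketch of the proposal (array name and size are
stand-ins, not the actual pruneheap.c code):

#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 291           /* stand-in for MaxHeapTuplesPerPage */

int
main(void)
{
    signed char htsv[MAX_ITEMS + 1];

    /* One bulk fill instead of assigning -1 to individual entries inside
     * the scan loop; memset with -1 sets every byte to 0xFF, which reads
     * back as -1 for a signed char. */
    memset(htsv, -1, sizeof(htsv));

    printf("htsv[0] = %d\n", htsv[0]);  /* prints -1 */
    return 0;
}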

Best regards,
Ranier Vilela


Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2024-01-13 Thread jian he
On Fri, Jan 12, 2024 at 10:59 AM torikoshia  wrote:
>
>
> Thanks for reviewing!
>
> Updated the patch merging your suggestions except below points:
>
> > +   cstate->num_errors = 0;
>
> Since cstate is already initialized in below lines, this may be
> redundant.
>
> | /* Allocate workspace and zero all fields */
> | cstate = (CopyFromStateData *) palloc0(sizeof(CopyFromStateData));
>
>
> >  +   Assert(!cstate->escontext->details_wanted);
>
> I'm not sure this is necessary, considering we're going to add other
> options like 'table' and 'log', which need details_wanted soon.
>
>
> --
> Regards,

Made it so the save_error_to option cannot be used with COPY TO.
Added a redundant test: save_error_to with COPY TO.


v2-0001-minor-refactor.no-cfbot
Description: Binary data


Re: pg_stat_statements and "IN" conditions

2024-01-13 Thread Dmitry Dolgov
> On Mon, Jan 08, 2024 at 05:10:20PM +0100, Dmitry Dolgov wrote:
> > On Sat, Jan 06, 2024 at 09:04:54PM +0530, vignesh C wrote:
> >
> > CFBot shows documentation build has failed at [1] with:
> > [07:44:55.531] time make -s -j${BUILD_JOBS} -C doc
> > [07:44:57.987] postgres.sgml:572: element xref: validity error : IDREF
> > attribute linkend references an unknown ID
> > "guc-query-id-const-merge-threshold"
> > [07:44:58.179] make[2]: *** [Makefile:70: postgres-full.xml] Error 4
> > [07:44:58.179] make[2]: *** Deleting file 'postgres-full.xml'
> > [07:44:58.181] make[1]: *** [Makefile:8: all] Error 2
> > [07:44:58.182] make: *** [Makefile:16: all] Error 2
> >
> > [1] - https://cirrus-ci.com/task/6688578378399744
>
> Indeed, after moving the configuration option to pgss I forgot to update
> its reference in the docs. Thanks for noticing, will update soon.

Here is the fixed version.
>From 1b99ffd68de6e82d9bbc45c18153ef965a228e28 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Sat, 14 Oct 2023 15:00:48 +0200
Subject: [PATCH v17 1/4] Prevent jumbling of every element in ArrayExpr

pg_stat_statements produces multiple entries for queries like

SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of
ArrayExpr is jumbled. In certain situations it's undesirable, especially
if the list becomes too large.

Make an array of Const expressions contribute only the first/last
elements to the jumble hash. Allow to enable this behavior via the new
pg_stat_statements parameter query_id_const_merge with the default value off.

Reviewed-by: Zhihong Yu, Sergey Dudoladov, Robert Haas, Tom Lane,
Michael Paquier, Sergei Kornilov, Alvaro Herrera, David Geier
Tested-by: Chengxi Sun
---
 contrib/pg_stat_statements/Makefile   |   2 +-
 .../pg_stat_statements/expected/merging.out   | 167 ++
 contrib/pg_stat_statements/meson.build|   1 +
 .../pg_stat_statements/pg_stat_statements.c   |  62 ++-
 contrib/pg_stat_statements/sql/merging.sql|  58 ++
 doc/src/sgml/pgstatstatements.sgml|  57 +-
 src/backend/nodes/gen_node_support.pl |  21 ++-
 src/backend/nodes/queryjumblefuncs.c  | 100 ++-
 src/backend/postmaster/postmaster.c   |   3 +
 src/backend/utils/misc/postgresql.conf.sample |   1 -
 src/include/nodes/primnodes.h |   2 +-
 src/include/nodes/queryjumble.h   |   9 +-
 12 files changed, 458 insertions(+), 25 deletions(-)
 create mode 100644 contrib/pg_stat_statements/expected/merging.out
 create mode 100644 contrib/pg_stat_statements/sql/merging.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index eba4a95d91a..af731fc9a58 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -19,7 +19,7 @@ LDFLAGS_SL += $(filter -lm, $(LIBS))
 
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/pg_stat_statements/pg_stat_statements.conf
 REGRESS = select dml cursors utility level_tracking planning \
-	user_activity wal cleanup oldextversions
+	user_activity wal cleanup oldextversions merging
 # Disabled because these tests require "shared_preload_libraries=pg_stat_statements",
 # which typical installcheck users do not have (e.g. buildfarm clients).
 NO_INSTALLCHECK = 1
diff --git a/contrib/pg_stat_statements/expected/merging.out b/contrib/pg_stat_statements/expected/merging.out
new file mode 100644
index 000..f286c735a36
--- /dev/null
+++ b/contrib/pg_stat_statements/expected/merging.out
@@ -0,0 +1,167 @@
+--
+-- Const merging functionality
+--
+CREATE EXTENSION pg_stat_statements;
+CREATE TABLE test_merge (id int, data int);
+-- IN queries
+-- No merging is performed, as a baseline result
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--
+ 
+(1 row)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9);
+ id | data 
++--
+(0 rows)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
+ id | data 
++--
+(0 rows)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
+ id | data 
++--
+(0 rows)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+query| calls 
+-+---
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9)   | 1
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)  | 1
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11) | 1
+ SELECT pg_stat_statements_reset()   | 1
+(4 rows)
+
+-- Normal scenario, too many simple constants for an IN query
+SET 

Re: Documentation to upgrade logical replication cluster

2024-01-13 Thread vignesh C
On Fri, 5 Jan 2024 at 09:08, Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear Vignesh,
>
> Thanks for making a patch! Below part is my comments.
>
> 1.
> Only two steps were given an id, but I think it should be done for all the steps.
> See [1].

I have added wherever it is required as of now.

> 2.
> I'm not sure it should be listed as step 10. I felt that it should be a new 
> section.
> At that time other steps like "Prepare for {publisher|subscriber} upgrades" 
> can be moved as well.
> Thought?

I have moved all of these to a separate page in logical-replication
under Upgrade

> 3.
> ```
> + The prerequisites of publisher upgrade applies to logical Replication
> ```
>
> Replication -> replication

Modified

> 4.
> ```
> + 
> +  Let's say publisher is in node1 and subscriber 
> is
> +  in node2.
> + 
> ```
>
> I felt it is more friendly if you added the name of directory for each 
> instance.

I have listed this in the pg_upgrade command execution; since it is
mentioned there, I have not added it here.

> 5.
> You did not write the initialization of new node. Was it intentional?

Added it now

> 6.
> ```
> + 
> +  Disable all the subscriptions on node2 that are
> +  subscribing the changes from node1 by using
> +   linkend="sql-altersubscription-params-disable">ALTER SUBSCRIPTION 
> ... DISABLE,
> +  for e.g.:
> +
> +node2=# ALTER SUBSCRIPTION sub1_node1_node2 DISABLE;
> +ALTER SUBSCRIPTION
> +node2=# ALTER SUBSCRIPTION sub2_node1_node2 DISABLE;
> +ALTER SUBSCRIPTION
> +
> + 
> ```
>
> Subscriptions are disabled after stopping a publisher, but that leads to ERRORs on 
> the publisher.
> I think it's better to swap these steps.

Modified

> 7.
> ```
> +
> +dba@node1:/opt/PostgreSQL/postgres//bin$ pg_ctl -D 
> /opt/PostgreSQL/pub_data stop -l logfile
> +
> ```
>
> Hmm. I thought you did not have to show the current directory. You were in the
> bin dir, but it is not our requirement, right?

I kept this just to show the version being used

> 8.
> ```
> +
> +dba@node1:/opt/PostgreSQL/postgres//bin$ pg_upgrade
> +--old-datadir "/opt/PostgreSQL/postgres/17/pub_data"
> +--new-datadir 
> "/opt/PostgreSQL/postgres//pub_upgraded_data"
> +--old-bindir "/opt/PostgreSQL/postgres/17/bin"
> +--new-bindir "/opt/PostgreSQL/postgres//bin"
> +
> ```
>
> For PG17, both old and new bindir look the same. Can we use 18 as new-bindir?

Modified

> 9.
> ```
> + 
> +  Create any tables that were created in node2
> +  between step-2 and now, for e.g.:
> +
> +node2=# CREATE TABLE distributors (
> +node2(# did  integer CONSTRAINT no_null NOT NULL,
> +node2(# name varchar(40) NOT NULL
> +node2(# );
> +CREATE TABLE
> +
> + 
> ```
>
> I think these SQLs must be done on node1, because it has not been started
> between step-2 and step-7.

Modified

> 10.
> ```
> +
> + 
> +  Enable all the subscriptions on node2 that are
> +  subscribing the changes from node1 by using
> +  ALTER 
> SUBSCRIPTION ... ENABLE,
> +  for e.g.:
> +
> +node2=# ALTER SUBSCRIPTION sub1_node1_node2 ENABLE;
> +ALTER SUBSCRIPTION
> +node2=# ALTER SUBSCRIPTION sub2_node1_node2 ENABLE;
> +ALTER SUBSCRIPTION
> +
> + 
> +
> +
> +
> + 
> +  Refresh the publications  using
> +   linkend="sql-altersubscription-params-refresh-publication">ALTER 
> SUBSCRIPTION ... REFRESH PUBLICATION,
> +  for e.g.:
> +
> +node2=# ALTER SUBSCRIPTION sub1_node1_node2 REFRESH PUBLICATION;
> +ALTER SUBSCRIPTION
> +node2=# ALTER SUBSCRIPTION sub2_node1_node2 REFRESH PUBLICATION;
> +ALTER SUBSCRIPTION
> +
> + 
> +
> ```
>
> I was very confused about where they would really be executed. If my above
> comment is correct, should they be executed on node1 as well? Could you
> please check all the notation again?

Modified

> 11.
> ```
> + 
> +  Disable all the subscriptions on node1 that are
> +  subscribing the changes from node2 by using
> +   linkend="sql-altersubscription-params-disable">ALTER SUBSCRIPTION 
> ... DISABLE,
> +  for e.g.:
> +
> +node2=# ALTER SUBSCRIPTION sub1_node2_node1 DISABLE;
> +ALTER SUBSCRIPTION
> +node2=# ALTER SUBSCRIPTION sub2_node2_node1 DISABLE;
> +ALTER SUBSCRIPTION
> +
> + 
> ```
>
> They should be on node1, but noted as node2.

Modified

> 12.
> ```
> + 
> +  Enable all the subscriptions on node1 that are
> +  subscribing the changes from node2 by using
> +  ALTER 
> SUBSCRIPTION ... ENABLE,
> +  for e.g.:
> +
> +node2=# ALTER SUBSCRIPTION sub1_node2_node1 ENABLE;
> +ALTER SUBSCRIPTION
> +node2=# ALTER SUBSCRIPTION sub2_node2_node1 ENABLE;
> +ALTER SUBSCRIPTION
> +
> + 
> ```
>
> You said that "enable all the subscription on node1", but SQLs are done on 
> node2.

Modified

Thanks for the comments, the attached v2 version patch has 

Add support for data change delta tables

2024-01-13 Thread PavelTurk

Hello all,

Currently PostgreSQL doesn't support data change delta tables. For example, it 
doesn't support this type of query:

SELECT * FROM NEW TABLE (
    INSERT INTO phone_book
    VALUES ( 'Peter Doe', '555-2323' )
) AS t

PostgreSQL has RETURNING that provides only a subset of this functionality.

So I suggest adding support for data change delta tables, because this feature
is more powerful and it is included in the SQL standard.
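
For comparison, here is a minimal sketch of today's closest equivalent and of
another delta-table variant defined by the standard (the phone_book column
names are assumed for illustration):

-- Closest existing PostgreSQL feature:
INSERT INTO phone_book VALUES ( 'Peter Doe', '555-2323' ) RETURNING *;

-- The SQL standard also defines OLD TABLE and FINAL TABLE, e.g. reading
-- the pre-update rows of an UPDATE:
SELECT * FROM OLD TABLE (
    UPDATE phone_book SET phone = '555-0000' WHERE name = 'Peter Doe'
) AS t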

Best regards, Pavel




Re: Invalidate the subscription worker in cases where a user loses their superuser status

2024-01-13 Thread Amit Kapila
On Sat, Jan 13, 2024 at 2:02 AM Jeff Davis  wrote:
>
> On Tue, 2023-10-17 at 14:17 +0530, Amit Kapila wrote:
> > > Fair enough. I'll wait till early next week (say till Monday EOD)
> > > to
> > > see if anyone thinks otherwise and push this patch to HEAD after
> > > some
> > > more testing and review.
> > >
> >
> > Pushed.
>
> There was a brief discussion on backporting this here:
>
> https://www.postgresql.org/message-id/CA%2BTgmob2mYpaUMT7aoFOukYTrpxt-WdgcitJnsjWhufnbDWEeg%40mail.gmail.com
>
> It looks like you opted not to backport it. Was there a reason for
> that? The only risky thing I see there is a change in the Subscription
> structure -- I suppose that could be used by extensions?
>

Right, the same is pointed out by me in an email [1].

> (I am fine with it not being backported, but I was inclined to think it
> should be backported.)
>

I don't mind backporting it if you think so but we need to ensure that
we don't break any extensions.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1JztFkYeVANuH0Ja_c3zqDjTyz0j15xQqwCDRPokYhWgw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: A failure in t/038_save_logical_slots_shutdown.pl

2024-01-13 Thread Amit Kapila
On Fri, Jan 12, 2024 at 3:36 PM Bharath Rupireddy
 wrote:
>
> On Fri, Jan 12, 2024 at 9:28 AM Amit Kapila  wrote:
> >
> > On Thu, Jan 11, 2024 at 10:03 PM Bharath Rupireddy
> >  wrote:
> > >
> > > On Thu, Jan 11, 2024 at 4:35 PM vignesh C  wrote:
> > > >
> > > > On further analysis, it was found that in the failing test,
> > > > CHECKPOINT_SHUTDOWN was started in a new page, so there was the WAL
> > > > page header present just before the CHECKPOINT_SHUTDOWN which was
> > > > causing the failure. We could alternatively reproduce the issue by
> > > > switching the WAL file before restarting the server like in the
> > > > attached test change patch.
> > > > There are a couple of ways to fix this issue a) one by switching the
> > > > WAL before the insertion of records so that the CHECKPOINT_SHUTDOWN
> > > > does not get inserted in a new page as in the attached test_fix.patch
> > > > b) by using pg_walinspect to check that the next WAL record is
> > > > CHECKPOINT_SHUTDOWN. I have to try this approach.
> > > >
> > > > Thanks to Bharath and Kuroda-san for offline discussions and helping
> > > > in getting to the root cause.
> > >
> > > IIUC, the problem the commit e0b2eed tries to solve is to ensure there
> > > are no left-over decodable WAL records between confirmed_flush LSN and
> > > a shutdown checkpoint, which is what it is expected from the
> > > t/038_save_logical_slots_shutdown.pl. How about we have a PG function
> > > returning true if there are any decodable WAL records between the
> > > given start_lsn and end_lsn?
> > >
> >
> > But, we already test this in 003_logical_slot during a successful
> > upgrade. Having an explicit test to do the same thing has some merits
> > but not sure if it is worth it.
>
> If the code added by commit e0b2eed is covered by the new upgrade
> test, why not remove 038_save_logical_slots_shutdown.pl altogether?
>

This is a more strict check because it is possible that even if the
latest confirmed_flush location is not persisted, there is no
meaningful decodable WAL between whatever confirmed_flush location was
last saved on disk and the shutdown_checkpoint record.
Kuroda-San/Vignesh, do you have any suggestion on this one?
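
For reference, the pg_walinspect idea (b) from upthread could be expressed
roughly like this in the test (a sketch assuming the pg_walinspect extension
is installed; the two LSNs are placeholders for the slot's confirmed_flush
LSN and the end of WAL):

-- Expect only the shutdown checkpoint in that range, i.e. zero rows here:
SELECT resource_manager, record_type
FROM pg_get_wal_records_info('0/1555D90', '0/1555E28')
WHERE record_type <> 'CHECKPOINT_SHUTDOWN';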

> > The current test tries to ensure that
> > during shutdown after we shutdown walsender and ensures that it sends
> > all the wal records and receipts an ack for the same, there is no
> > other WAL except shutdown_checkpoint. Vignesh's suggestion (a) makes
> > the test robust enough that it shouldn't show spurious failures like
> > the current one reported by you, so shall we try to proceed with that?
>
> Do you mean something like [1]? It ensures the test passes unless any
> writes are added (in future) before the publisher restarts in the test,
> which can again make the tests flaky. How do we ensure no one adds
> anything before the publisher restart in
> 038_save_logical_slots_shutdown.pl? A note before the restart perhaps?
>

I am fine with adding the note.
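
(For completeness, approach (a) amounts to a one-liner in the TAP test
before the decodable records are generated -- a sketch assuming the standard
PostgreSQL::Test::Cluster API and the test's publisher node name:

$node_publisher->safe_psql('postgres', 'SELECT pg_switch_wal();');

so that the shutdown checkpoint cannot land at the start of a fresh WAL
page.)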

-- 
With Regards,
Amit Kapila.




Re: speed up a logical replica setup

2024-01-13 Thread Amit Kapila
On Fri, Jan 12, 2024 at 12:02 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> I didn't review all items but ...
>
> 1.
> An option --port was added to control the port number for physical standby.
> Users can specify a port number via the option, or an environment variable 
> PGSUBPORT.
> If not specified, a fixed value (50111) would be used.
>
> My first reaction as a new user would be: why do I need to specify a port
> if my --subscriber-conninfo already contains a port? Ugh. I'm wondering if
> we can do it behind the scenes. Try a range of ports.
>
> My initial motivation for the setting was to avoid establishing connections
> during pg_subscriber. While considering it more, I started to think that
> --subscriber-conninfo may not be needed. pg_upgrade does not require the
> string: it requires a username, and optionally a port number (which would be
> used during the upgrade) instead. The advantage of this approach is that we
> do not have to parse the connection string.
> What do you think?
>

+1. This seems worth considering. I think unless we have a good reason
to have this parameter, we should try to avoid it.
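
For comparison, a pg_upgrade-style invocation along the lines being discussed
might end up looking something like this (all option names are hypothetical
at this stage):

    $ pg_subscriber -D /opt/PostgreSQL/standby_data -U dba -p 50111 \
        --publisher-conninfo "host=node1 port=5432 dbname=postgres"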

-- 
With Regards,
Amit Kapila.




Re: reorganize "Shared Memory and LWLocks" section of docs

2024-01-13 Thread Aleksander Alekseev
Hi,

Thanks for the updated patch.

> > I see what you mean, but I don't think the problem is the word "each."  I
> > think the problem is the use of passive voice.  What do you think about
> > something like
> >
> >   Each backend will execute the registered shmem_startup_hook shortly
> >   after it attaches to shared memory.

That's much better, thanks.
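
For readers following along, the hook being documented is used like this in
an extension (a minimal sketch of the usual hook-chaining idiom; the module
and function names are made up):

#include "postgres.h"

#include "fmgr.h"
#include "storage/ipc.h"

PG_MODULE_MAGIC;

static shmem_startup_hook_type prev_shmem_startup_hook = NULL;

/* Runs in each backend shortly after it attaches to shared memory. */
static void
my_shmem_startup(void)
{
    if (prev_shmem_startup_hook)
        prev_shmem_startup_hook();

    /* Look up or initialize this module's shared state here. */
}

void
_PG_init(void)
{
    /* Chain onto any hook a previously loaded module installed. */
    prev_shmem_startup_hook = shmem_startup_hook;
    shmem_startup_hook = my_shmem_startup;
}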

I think the patch could use another pair of eyes, ideally from a
native English speaker. But if no one expresses any objections for
a while, I suggest merging it.

-- 
Best regards,
Aleksander Alekseev




Postgres and --config-file option

2024-01-13 Thread Aleksander Alekseev
Hi,

A friend of mine complained about strange behavior of `postgres`. When
executed without any arguments the following error is shown:

```
$ postgres
postgres does not know where to find the server configuration file.
You must specify the --config-file or -D invocation option or set the
PGDATA environment variable.
```

However --config-file is not listed in --help output. Apparently
that's because it's not a regular option but a GUC. It is in fact
supported:

```
$ postgres --config-file=/tmp/fake.txt
postgres: could not access the server configuration file
"/tmp/fake.txt": No such file or directory
```

Additionally --help says:

```
[...]
Please read the documentation for the complete list of run-time
configuration settings and how to set them on the command line or in
the configuration file
```

... which personally I don't find extremely useful to be honest.

OK, let's check section "20.1.4. Parameter Interaction via the Shell"
[1] of the documentation. Currently it doesn't say anything about the
ability to specify GUCs --like-this, unless I missed something.
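
For the record, both spellings below are accepted and equivalent, which is
presumably what the error message alludes to (paths are made up):

```
$ postgres -D /tmp/data -c config_file=/tmp/data/postgresql.conf
$ postgres -D /tmp/data --config-file=/tmp/data/postgresql.conf
```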

Should we remove --config-file from the error message to avoid any
confusion? Should we correct --help output? Should we update the
documentation?

[1]: 
https://www.postgresql.org/docs/current/config-setting.html#CONFIG-SETTING-SHELL

-- 
Best regards,
Aleksander Alekseev




Re: POC: GROUP BY optimization

2024-01-13 Thread Andrei Lepikhov

On 11/1/2024 18:30, Alexander Korotkov wrote:

Hi!

On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov  wrote:

Hmm, I don't see this old code in these patches. Resend 0002-* because
of trailing spaces.



AFAIK, cfbot does not seek old versions of patchset parts in previous messages. 
So for it to run correctly, a new version of the whole patchset should be sent 
even if most patches are unchanged.


Please, find the revised patchset with some refactoring and comments
improvement from me.  I'll continue looking into this.

The patch looks better, thanks to your refactoring.
I propose additional comments and tests to make the code more 
understandable (see attachment).
I intended to look into this part of the code more, but the tests show a 
difference between PostgreSQL v.15 and v.16, which causes changes in the 
code of this feature.


--
regards,
Andrei Lepikhov
Postgres Professional
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index f4b7dcac21..5aac6d6677 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -397,6 +397,11 @@ group_keys_reorder_by_pathkeys(List *pathkeys, List **group_pathkeys,
 			!list_member_ptr(*group_pathkeys, pathkey))
 			break;
 
+		/*
+		 * Since 1349d27 pathkey coming from underlying node can be in the
+		 * root->group_pathkeys but not in the processed_groupClause. So, we
+		 * should be careful here.
+		 */
 		sgc = get_sortgroupref_clause_noerr(pathkey->pk_eclass->ec_sortref,
 											*group_clauses);
 		if (!sgc)
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 423c8ec3b6..0d46e096e5 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2857,6 +2857,28 @@ explain (COSTS OFF) SELECT x,w FROM btg GROUP BY w,x,y,z ORDER BY x*x,z;
 (8 rows)
 
 DROP TABLE btg;
+-- The case when scanning sort order corresponds to aggregate sort order but
+-- cannot be found in the group-by list
+CREATE TABLE t1 (c1 int PRIMARY KEY, c2 int);
+CREATE UNIQUE INDEX ON t1(c2);
+explain (costs off)
+SELECT array_agg(c1 ORDER BY c2),c2
+FROM t1 WHERE c2 < 100 GROUP BY c1 ORDER BY 2;
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Sort
+   Sort Key: c2
+   ->  GroupAggregate
+         Group Key: c1
+         ->  Sort
+               Sort Key: c1, c2
+               ->  Bitmap Heap Scan on t1
+                     Recheck Cond: (c2 < 100)
+                     ->  Bitmap Index Scan on t1_c2_idx
+                           Index Cond: (c2 < 100)
+(10 rows)
+
+DROP TABLE t1 CASCADE;
 RESET enable_hashagg;
 RESET max_parallel_workers;
 RESET max_parallel_workers_per_gather;
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index b9fcceedd7..f99167ac9e 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -1224,6 +1224,15 @@ explain (COSTS OFF) SELECT x,w FROM btg GROUP BY w,x,y,z ORDER BY x*x,z;
 
 DROP TABLE btg;
 
+-- The case when scanning sort order corresponds to aggregate sort order but
+-- cannot be found in the group-by list
+CREATE TABLE t1 (c1 int PRIMARY KEY, c2 int);
+CREATE UNIQUE INDEX ON t1(c2);
+explain (costs off)
+SELECT array_agg(c1 ORDER BY c2),c2
+FROM t1 WHERE c2 < 100 GROUP BY c1 ORDER BY 2;
+DROP TABLE t1 CASCADE;
+
 RESET enable_hashagg;
 RESET max_parallel_workers;
 RESET max_parallel_workers_per_gather;


Re: Recovering from detoast-related catcache invalidations

2024-01-13 Thread Xiaoran Wang
Hmm, how about first checking whether any invalidated shared messages have
been accepted, and then rechecking the tuple's visibility?
If no invalidation message was accepted during 'toast_flatten_tuple',
there is no need to do the visibility check, which can save several
CPU cycles.


    if (inval_count != SharedInvalidMessageCounter &&
        !systable_recheck_tuple(scandesc, ntp))
    {
        heap_freetuple(dtp);
        return NULL;
    }



On Sat, Jan 13, 2024 at 13:16, Xiaoran Wang wrote:

> Great! That's exactly what we need.
>
> The patch LGTM, +1
>
>
> On Sat, Jan 13, 2024 at 04:47, Tom Lane wrote:
>
>> I wrote:
>> > This is uncomfortably much in bed with the tuple table slot code,
>> > perhaps, but I don't see a way to do it more cleanly unless we want
>> > to add some new provisions to that API.  Andres, do you have any
>> > thoughts about that?
>>
>> Oh!  After nosing around a bit more I remembered systable_recheck_tuple,
>> which is meant for exactly this purpose.  So v4 attached.
>>
>> regards, tom lane
>>
>>


Re: [PATCH] New predefined role pg_manage_extensions

2024-01-13 Thread Michael Banck
Hi,

On Fri, Jan 12, 2024 at 04:13:27PM +0100, Jelte Fennema-Nio wrote:
> But I'm not sure that such a pg_manage_extensions role would have any
> fewer permissions than superuser in practice. 

Note that just being able to create an extension does not give blanket
permission to use it. I did a few checks with things I thought might be
problematic like adminpack or plpython3u, and a pg_manage_extensions
user is not allowed to call those functions or use the untrusted
language.
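
To illustrate the checks above (a sketch; pg_manage_extensions is the role
added by the patch, and alice is a made-up user):

GRANT pg_manage_extensions TO alice;

-- As alice, installing an untrusted extension is allowed ...
CREATE EXTENSION plpython3u;

-- ... but using it still needs a separate grant; without USAGE on the
-- language this fails:
CREATE FUNCTION pymax(a int, b int) RETURNS int
    LANGUAGE plpython3u AS 'return max(a, b)';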

> Afaik many extensions that are not marked as trusted, are not trusted
> because they would allow fairly trivial privilege escalation to
> superuser if they were.

While that might be true (or we err on the side of caution), I thought
the rationale was more that they either disclose more information about
the database server than we want to disclose to ordinary users, or that
they allow access to the file system etc.

I think if we have extensions in contrib that trivially allow
non-superusers to become superusers just by being installed, that should
be considered a bug and fixed by making it impossible for ordinary users
to use those extensions without additionally being granted some access to
them.

After all, socially engineering a DBA into installing an extension due
to user demand would be a thing anyway (even if most DBAs might reject
it), and DBAs should probably at least be aware of the specific risks of
a particular extension?


Michael