add tab-complete for memory, serialize option and other minor issues.

2024-04-26 Thread jian he
hi.

I found some minor issues related to the EXPLAIN command.

cannot auto-complete when there is no white space after the comma:
src8=# explain (analyze,b

can auto-complete with a white space after the comma:
src8=# explain (analyze, b

so to make tab-completion work, the comma must be followed by a white space;
I'm not sure why.
--
explain (serialize binary) select 1;
ERROR:  EXPLAIN option SERIALIZE requires ANALYZE

do you think it's better to rephrase it as:
ERROR:  EXPLAIN option SERIALIZE requires ANALYZE option

since we have a separate ANALYZE SQL command.
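
For reference, the accepted spelling pairs SERIALIZE with ANALYZE; a minimal
example of my own, assuming a v17 server:

explain (analyze, serialize binary) select 1;
-- runs the query, serializes the result in binary output format, and adds
-- a Serialization line (time and output size) to the ANALYZE output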
--

 
  Specify the output format, which can be TEXT, XML, JSON, or YAML.
  Non-text output contains the same information as the text output
  format, but is easier for programs to parse.  This parameter defaults to
  TEXT.
 

should we add <literal> markup for {TEXT, XML, JSON, YAML} in the
above paragraph?

--
I created a patch adding tab-completion for the MEMORY and SERIALIZE options.
From bec2c92ef61bb8272608a49e6837141e7e8346b3 Mon Sep 17 00:00:00 2001
From: jian he 
Date: Sat, 27 Apr 2024 10:33:16 +0800
Subject: [PATCH v1 1/1] add Tab-complete for EXPLAIN MEMORY, EXPLAIN SERIALIZE

---
 src/bin/psql/tab-complete.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6fee3160..08641565 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3856,11 +3856,13 @@ psql_completion(const char *text, int start, int end)
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
 			COMPLETE_WITH("ANALYZE", "VERBOSE", "COSTS", "SETTINGS", "GENERIC_PLAN",
-		  "BUFFERS", "WAL", "TIMING", "SUMMARY", "FORMAT");
-		else if (TailMatches("ANALYZE|VERBOSE|COSTS|SETTINGS|GENERIC_PLAN|BUFFERS|WAL|TIMING|SUMMARY"))
+		  "BUFFERS", "WAL", "TIMING", "SUMMARY", "FORMAT", "SERIALIZE", "MEMORY");
+		else if (TailMatches("ANALYZE|VERBOSE|COSTS|SETTINGS|GENERIC_PLAN|BUFFERS|WAL|TIMING|SUMMARY|MEMORY"))
 			COMPLETE_WITH("ON", "OFF");
 		else if (TailMatches("FORMAT"))
 			COMPLETE_WITH("TEXT", "XML", "JSON", "YAML");
+		else if (TailMatches("SERIALIZE"))
+			COMPLETE_WITH("TEXT", "NONE", "BINARY");
 	}
 	else if (Matches("EXPLAIN", "ANALYZE"))
 		COMPLETE_WITH("SELECT", "INSERT INTO", "DELETE FROM", "UPDATE", "DECLARE",

base-commit: 1713e3d6cd393fcc1d4873e75c7fa1f6c7023d75
-- 
2.34.1



Re: WIP: Vectored writeback

2024-04-26 Thread Thomas Munro
Here is a new straw-man patch set.  I'd already shown the basic
techniques for vectored writes from the buffer pool (FlushBuffers(),
note the "s"), but that was sort of kludged into place while I was
hacking on the lower level bits and pieces, and now I'm building
layers further up.  The main idea is: you can clean buffers with a
"WriteStream", and here are a bunch of example users to show that
working.

A WriteStream is approximately the opposite of a ReadStream (committed
in v17).  You push pinned dirty buffers into it (well they don't have
to be dirty, and it's OK if someone else cleans the buffer
concurrently, the point is that you recently dirtied them).  It
combines buffers up to io_combine_limit, but defers writing as long as
possible within some limits to avoid flushing the WAL, and tries to
coordinate with the WAL writer.  The WAL writer interaction is a very
tricky problem, and that aspect is only a toy for now, but it's at
least partially successful (see problems at end).

The CHECKPOINT code uses the WriteStream API directly.  It creates one
stream per tablespace, so that the existing load balancing algorithm
doesn't defeat the I/O combining algorithm.  Unsurprisingly, it looks
like this:

postgres=# checkpoint;

...
pwritev(18,...,2,0x1499e000) = 131072 (0x20000)
pwrite(18,...,131072,0x149be000) = 131072 (0x20000)
pwrite(18,...,131072,0x149de000) = 131072 (0x20000)
...

Sometimes you'll see it signalling the WAL writer.  It builds up a
queue of writes that it doesn't want to perform yet, in the hope of
getting a free ride WRT WAL.

Other places can benefit from a more centrally placed write stream,
indirectly.  Our BAS_BULKWRITE and BAS_VACUUM buffer access strategies
already perform "write-behind".  That's a name I borrowed from some OS
stuff, where the kernel has clues that bulk data (for example a big
file copy) will not likely be needed again soon so you want to get it
out of the way soon before it trashes your whole buffer pool (AKA
"scan resistance"), but you want to defer just a little bit to perform
I/O combining.  That applies directly here, but we have the additional
concern of delaying the referenced WAL write in the hope that someone
else will do it for us.

In this experiment, I am trying to give that pre-existing behaviour an
explicit name (better names welcome!), and optimise it.  If you're
dirtying buffers in a ring, you'll soon crash into your own tail and
have to write it out, and it is very often sequential blocks due to
the scan-like nature of many bulk I/O jobs, so I/O combining is very
effective.  The main problem is that you'll often have to flush WAL
first, which this patch set tries to address to some extent.  In the
strategy write-behind case you don't really need a LSN reordering
queue, just a plain FIFO queue would do, but hopefully that doesn't
cost much.  (Cf CHECKPOINT, which sorts blocks by buffer tag, but
expects LSNs in random order, so it does seem to need reordering.)

With this patch set, instead of calling ReleaseBuffer() after you've
dirtied a buffer in one of those bulk writing code paths, you can use
StrategyReleaseBuffer(), and the strategy will fire it into the stream
to get I/O combining and LSN reordering; it'll be unpinned later, and
certainly before you get the same buffer back for a new block.  So
those write-behind user patches are very short, they just do
s/ReleaseBuffer/StrategyReleaseBuffer/ plus minor details.
Unsurprisingly, it looks like this:

postgres=# copy t from program 'seq -f %1.0f 1 1000';

...
pwrite(44,...,131072,0x2f986000) = 131072 (0x20000) <-- streaming write-behind!
pwrite(44,...,131072,0x2f966000) = 131072 (0x20000)
pwrite(44,...,131072,0x2f946000) = 131072 (0x20000)
...

postgres=# vacuum t;

...
pwrite(35,...,131072,0x3fb3e000) = 131072 (0x20000) <-- streaming write-behind!
preadv(35,...,122880}],2,0x3fb7a000) = 131072 (0x20000) <-- from Melanie's patch
pwritev(35,...,2,0x3fb5e000) = 131072 (0x20000)
pread(35,...,131072,0x3fb9a000) = 131072 (0x20000)
...

Next I considered how to get INSERT, UPDATE, DELETE to participate.
The problem is that they use BAS_BULKREAD, even though they might
dirty buffers.  In master, BAS_BULKREAD doesn't do write-behind,
instead it uses the "reject" mechanism: as soon as it smells a dirty
buffer, it escapes the ring and abandons all hope of scan resistance.
As buffer/README says in parentheses:

  Bulk writes work similarly to VACUUM.  Currently this applies only to
  COPY IN and CREATE TABLE AS SELECT.  (Might it be interesting to make
  seqscan UPDATE and DELETE use the bulkwrite strategy?)  For bulk writes
  we use a ring size of 16MB (but not more than 1/8th of shared_buffers).

Hmm... what I'm now thinking is that the distinction might be a little
bogus.  Who knows how much scanned data will finish up being dirtied?
I wonder if it would make more sense to abandon
BAS_BULKREAD/BAS_BULKWRITE, and instead make an adaptive strategy.  A
ring that starts small, and grows/shrinks in 

Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Dilip Kumar
On Fri, Apr 26, 2024 at 5:24 PM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Congratulations to both of you.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Roberto Mello
On Fri, Apr 26, 2024 at 5:54 AM Jonathan S. Katz 
wrote:

> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>

Amazing accomplishment. Thank you for your contributions to PostgreSQL.
Well deserved!

Roberto


Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread John Naylor
On Fri, Apr 26, 2024 at 6:54 PM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Congratulations to you both!




Automatic tablespace management in pg_basebackup

2024-04-26 Thread Thom Brown
Hi,

Manually specifying tablespace mappings in pg_basebackup, especially in
environments where tablespaces can come and go, or with incremental
backups, can be tedious and error-prone. I propose a solution using
pattern-based mapping to automate this process.

So rather than having to specify:

-T /path/to/original/tablespace/a=/path/to/backup/tablespace/a -T
/path/to/original/tablespace/b=/path/to/backup/tablespace/b

And then having to come up with a new location to map to for each subsequent
incremental backup, perhaps we could have a parameter (I’m just going to
choose M for “mapping”), like so:

-M %p/%d_backup_1.1

Where it can interpolate the following values:
%p = path
%d = directory
%l = label (not sure about this one)


Using the -M example above, when pg_basebackup finds:

/path/to/original/tablespace/a
/path/to/original/tablespace/b

It creates:

/path/to/original/tablespace/a_backup_1.1
/path/to/original/tablespace/b_backup_1.1


Or:

-M /path/to/backup/tablespaces/1.1/%d

Creates:

/path/to/backup/tablespaces/1.1/a
/path/to/backup/tablespaces/1.1/b


Or possibly allowing something like %l to insert the backup label.

For example:

-M /path/to/backup/tablespaces/%d_%l -l 1.1

Creates:

/path/to/backup/tablespaces/a_1.1
/path/to/backup/tablespaces/b_1.1


This of course would not work if there were tablespaces as follows:

/path/to/first/tablespace/a
/path/to/second/tablespace/a

Where %d would yield the same result for both tablespaces.  However, this
seems like an unlikely scenario, as the tablespace names within the database
would need to be unique while the user nonetheless chose directory names
that aren't unique.  This could just be a scenario that isn't supported.

Perhaps even allow it to auto-increment a version number it defines
itself.  Maybe %v implies “make up a version number here, and if one
existed in the manifest previously, increment it”.


Ultimately, it would turn this:

pg_basebackup
  -D /Users/thombrown/Development/backups/data1.5
  -h /tmp
  -p 5999
  -c fast
  -U thombrown
  -l 1.5
  -T
/Users/thombrown/Development/tablespaces/ts_a=/Users/thombrown/Development/backups/tablespaces/1.5/backup_ts_a
  -T
/Users/thombrown/Development/tablespaces/ts_b=/Users/thombrown/Development/backups/tablespaces/1.5/backup_ts_b
  -T
/Users/thombrown/Development/tablespaces/ts_c=/Users/thombrown/Development/backups/tablespaces/1.5/backup_ts_c
  -T
/Users/thombrown/Development/tablespaces/ts_d=/Users/thombrown/Development/backups/tablespaces/1.5/backup_ts_d
  -i /Users/thombrown/Development/backups/data1.4/backup_manifest

Into this:

pg_basebackup
  -D /Users/thombrown/Development/backups/1.5/data
  -h /tmp
  -p 5999
  -c fast
  -U thombrown
  -l 1.5
  -M /Users/thombrown/Development/backups/tablespaces/%v/%d
  -i /Users/thombrown/Development/backups/data1.4/backup_manifest

In fact, if I were permitted to get carried away:

-D /Users/thombrown/Development/backups/%v/%d

Then, the only thing that needs changing for each incremental backup is the
manifest location (and optionally the label).


Given that pg_combinebackup has the same option, I imagine something
similar would need to be added there too.  We should already know where the
tablespaces reside, as they are in the final backup specified in the list
of backups, so that seems to just be a matter of getting input of how the
tablespaces should be named in the reconstructed backup.

For example:

pg_combinebackup
  -T
/Users/thombrown/Development/backups/tablespaces/1.4/ts_a=/Users/thombrown/Development/backups/tablespaces/2.0_combined/ts_a
  -T
/Users/thombrown/Development/backups/tablespaces/1.4/ts_b=/Users/thombrown/Development/backups/tablespaces/2.0_combined/ts_b
  -T
/Users/thombrown/Development/backups/tablespaces/1.4/ts_c=/Users/thombrown/Development/backups/tablespaces/2.0_combined/ts_c
  -T
/Users/thombrown/Development/backups/tablespaces/1.4/ts_d=/Users/thombrown/Development/backups/tablespaces/2.0_combined/ts_d
  -o /Users/thombrown/Development/backups/combined
  /Users/thombrown/Development/backups/data{1.0_full,1.1,1.2,1.3,1.4}

Becomes:
pg_combinebackup
  -M /Users/thombrown/Development/backups/tablespaces/%v_combined/%d
  -o /Users/thombrown/Development/backups/%v_combined/%d
  /Users/thombrown/Development/backups/{1.0_full,1.1,1.2,1.3,1.4}/data

You may have inferred that I decided pg_combinebackup increments the
version to the next major version, whereas pg_basebackup in incremental
mode increments the minor version number.

This, of course, becomes messy if the user decided to include the version
number in the backup tablespace directory name, but then these sorts of
things need to be figured out prior to placing into production anyway.

I also get the feeling that accepting an unquoted % as a parameter on the
command line could be problematic, such as it having a special meaning I
haven't accounted for here.  In which case, it may require quoting.

Thoughts?

Regards

Thom


Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread DEVOPS_WwIT

Congratulations!

On 2024/4/26 19:54, Jonathan S. Katz wrote:
The Core Team would like to extend our congratulations to Melanie 
Plageman and Richard Guo, who have accepted invitations to become our 
newest PostgreSQL committers.


Please join us in wishing them much success and few reverts!

Thanks,

Jonathan





Re: Direct SSL connection with ALPN and HBA rules

2024-04-26 Thread Heikki Linnakangas

On 23/04/2024 20:02, Jacob Champion wrote:

On Fri, Apr 19, 2024 at 2:43 PM Heikki Linnakangas  wrote:


On 19/04/2024 19:48, Jacob Champion wrote:

On Fri, Apr 19, 2024 at 6:56 AM Heikki Linnakangas  wrote:

With direct SSL negotiation, we always require ALPN.


   (As an aside: I haven't gotten to test the version of the patch that
made it into 17 yet, but from a quick glance it looks like we're not
rejecting mismatched ALPN during the handshake as noted in [1].)


Ah, good catch, that fell through the cracks. Agreed, the client should
reject a direct SSL connection if the server didn't send ALPN. I'll add
that to the Open Items so we don't forget again.


Yes, the client should also reject, but that's not what I'm referring
to above. The server needs to fail the TLS handshake itself with the
proper error code (I think it's `no_application_protocol`?); otherwise
a client implementing a different protocol could consume the
application-level bytes coming back from the server and act on them.
That's the protocol confusion attack from ALPACA we're trying to
avoid.


I finally understood what you mean. So if the client supports ALPN, but 
the list of protocols that it provides does not include 'postgresql', 
the server should reject the connection with a 'no_application_protocol' 
alert. Makes sense. I thought OpenSSL would do that with the alpn 
callback we have, but it does not.


The attached patch makes that change. I used the alpn_cb() function in 
openssl's own s_server program as example for that.


Unfortunately the error message you got in the client with that was 
horrible (I modified the server to not accept the 'postgresql' protocol):


psql "dbname=postgres sslmode=require host=localhost"
psql: error: connection to server at "localhost" (::1), port 5432 
failed: SSL error: SSL error code 167773280


This is similar to the case with system errors discussed at 
https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649fa...@highgo.ca, but 
this one is equally bad on OpenSSL 1.1.1 and 3.3.0. It seems like an 
OpenSSL bug to me, because there is an error string "no application 
protocol" in the OpenSSL sources (ssl/ssl_err.c):


{ERR_PACK(ERR_LIB_SSL, 0, SSL_R_NO_APPLICATION_PROTOCOL),
"no application protocol"},

and in the server log, you get that message. But the error code seen in 
the client is different. There are also messages for other alerts, for 
example:


{ERR_PACK(ERR_LIB_SSL, 0, SSL_R_TLSV13_ALERT_MISSING_EXTENSION),
"tlsv13 alert missing extension"},

The bottom line is that that seems like a bug of omission to me in 
OpenSSL, but I wouldn't hold my breath waiting for it to be fixed. We 
can easily check for that error code and print the right message 
ourselves however, as in the attached patch.


--
Heikki Linnakangas
Neon (https://neon.tech)
From 125e9adda6cdab644b660772e29c713863e93cc2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Sat, 27 Apr 2024 01:47:55 +0300
Subject: [PATCH 1/1] Reject SSL connection if ALPN is used but there's no
 common protocol

If the client supports ALPN but tries to use some other protocol, like
HTTPS, reject the connection in the server. That is surely a confusion
of some sort, like trying to reach a PostgreSQL server with a
browser. Furthermore, the ALPN RFC 7301 says:

> In the event that the server supports no protocols that the client
> advertises, then the server SHALL respond with a fatal
> "no_application_protocol" alert.

This commit makes the server follow that advice.

In the client, specifically check for the OpenSSL error code for the
"no_application_protocol" alert. Otherwise you got a cryptic "SSL
error: SSL error code 167773280" error if you tried to connect to a
non-PostgreSQL server that rejects the connection with
"no_application_protocol". ERR_reason_error_string() returns NULL for
that code, which frankly seems like an OpenSSL bug to me, but we can
easily print a better message ourselves.
---
 src/backend/libpq/be-secure-openssl.c    | 10 +++++++---
 src/interfaces/libpq/fe-secure-openssl.c | 12 ++++++++++++
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index fc46a33539..60cf68aac4 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -1336,10 +1336,14 @@ alpn_cb(SSL *ssl,
 
 	if (retval == OPENSSL_NPN_NEGOTIATED)
 		return SSL_TLSEXT_ERR_OK;
-	else if (retval == OPENSSL_NPN_NO_OVERLAP)
-		return SSL_TLSEXT_ERR_NOACK;
 	else
-		return SSL_TLSEXT_ERR_NOACK;
+	{
+		/*
+		 * The client doesn't support our protocol.  Reject the connection
+		 * with TLS "no_application_protocol" alert, per RFC 7301.
+		 */
+		return SSL_TLSEXT_ERR_ALERT_FATAL;
+	}
 }
 
 
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 0de21dc7e4..ee1a47f2b1 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -1741,6 

Re: Support a wildcard in backtrace_functions

2024-04-26 Thread Michael Paquier
On Fri, Apr 26, 2024 at 02:39:16PM -0400, Tom Lane wrote:
> Well, in that case we have to have some kind of control GUC, and
> I think the consensus is that the one we have now is under-designed.
> So I also vote for a full revert and try again in v18.

Okay, fine by me to move on with a revert.
--
Michael




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Michael Paquier
On Fri, Apr 26, 2024 at 06:54:26AM -0500, Jonathan S. Katz wrote:
> The Core Team would like to extend our congratulations to Melanie Plageman
> and Richard Guo, who have accepted invitations to become our newest
> PostgreSQL committers.
> 
> Please join us in wishing them much success and few reverts!

Congratulations!
--
Michael




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Yasir
Congratulations to both of you Melanie and Richard.
Thank you so much for stepping forward to this great cause.

Regards...

Yasir Hussain
Bitnine Global

On Fri, Apr 26, 2024 at 4:54 PM Jonathan S. Katz 
wrote:

> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>
> Thanks,
>
> Jonathan
>


Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Jaime Casanova
On Fri, Apr 26, 2024 at 6:54 AM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>

No surprise here! Congratulations to both of you!
You both have done a great job!


-- 
Jaime Casanova
SYSTEMGUARDS S.A.




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-04-26 Thread Tom Lane
Daniel Gustafsson  writes:
> On 21 Apr 2024, at 23:08, Tom Lane  wrote:
>> So the meson animals are not running the test that sets up the
>> problematic data.

> I took a look at this, reading code and the linked thread.  My gut feeling is
> that Stephen is right in that the underlying bug is these privileges ending up
> in pg_init_privs to begin with.  That being said, I wasn't able to fix that in
> a way that doesn't seem like a terrible hack.

Hmm, can't we put the duplicate logic inside recordExtensionInitPriv?
Even if these calls need a different result from others, adding a flag
parameter seems superior to having N copies of the logic.

A bigger problem though is that I think you are addressing the
original complaint from the older thread, which while it's a fine
thing to fix seems orthogonal to the failure we're seeing in the
buildfarm.  The buildfarm's problem is not that we're recording
incorrect pg_init_privs entries, it's that when we do create such
entries we're failing to show their dependency on the grantee role
in pg_shdepend.  We've missed spotting that so far because it's
so seldom that pg_init_privs entries reference any but built-in
roles (or at least roles that'd likely outlive the extension).

regards, tom lane




Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-04-26 Thread Tom Lane
Robert Haas  writes:
> On Wed, Apr 24, 2024 at 8:04 PM Michael Paquier  wrote:
>> Not sure that I would bother with a second one.  But, well, why not if
>> people want to rename it, as long as you keep compatibility.

> I vote for just standardizing on XLOG_CONTROL_FILE. That name seems
> sufficiently intuitive to me, and I'd rather have one identifier for
> this than two. It's simpler that way.

+1.  Back when we did the great xlog-to-wal renaming, we explicitly
agreed that we wouldn't change internal symbols referring to xlog.
It might or might not be appropriate to revisit that decision,
but I sure don't want to do it piecemeal, one symbol at a time.

Also, if we did rename this one, the logical choice would be
WAL_CONTROL_FILE not PG_CONTROL_FILE.

regards, tom lane




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Tatsuo Ishii
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
> 
> Please join us in wishing them much success and few reverts!

Congratulations!
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Peter Geoghegan
On Fri, Apr 26, 2024 at 7:54 AM Jonathan S. Katz  wrote:
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.

Congratulations to both!

-- 
Peter Geoghegan




Re: SQL:2011 application time

2024-04-26 Thread Paul Jungwirth

On 4/26/24 12:25, Robert Haas wrote:

I think this thread should be added to the open items list.


Thanks! I sent a request to pgsql-www to get edit permission. I didn't realize there was a wiki page 
tracking things like this. I agree it needs to be fixed if we want to include the feature.


Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com




Re: SQL:2011 application time

2024-04-26 Thread Robert Haas
On Wed, Apr 3, 2024 at 1:30 AM Paul Jungwirth
 wrote:
> I found some problems with temporal primary keys and the idea of uniqueness, 
> especially around the
> indisunique column. Here are some small fixes and a proposal for a larger 
> fix, which I think we need
> but I'd like some feedback on.

I think this thread should be added to the open items list. You're
raising questions about whether the feature that was committed to this
release is fully correct. If it isn't, we shouldn't release it without
fixing it.

https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Tarball builds in the new world order

2024-04-26 Thread Tom Lane
Concretely, I'm proposing the attached.  Peter didn't like
PG_COMMIT_HASH, so I have PG_COMMIT_REFSPEC below, but I'm not
wedded to that if a better name is proposed.

regards, tom lane

diff --git a/GNUmakefile.in b/GNUmakefile.in
index 30553b2a95..27357e5e3b 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -102,10 +102,18 @@ distdir-location:
 # on, Unix machines.
 
 $(distdir).tar.gz:
-	$(GIT) -C $(srcdir) -c core.autocrlf=false archive --format tar.gz -9 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+ifneq ($(PG_COMMIT_REFSPEC),)
+	$(GIT) -C $(srcdir) -c core.autocrlf=false archive --format tar.gz -9 --prefix $(distdir)/ $(PG_COMMIT_REFSPEC) -o $(abs_top_builddir)/$@
+else
+	@echo "Please specify PG_COMMIT_REFSPEC." && exit 1
+endif
 
 $(distdir).tar.bz2:
-	$(GIT) -C $(srcdir) -c core.autocrlf=false -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
+ifneq ($(PG_COMMIT_REFSPEC),)
+	$(GIT) -C $(srcdir) -c core.autocrlf=false -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ $(PG_COMMIT_REFSPEC) -o $(abs_top_builddir)/$@
+else
+	@echo "Please specify PG_COMMIT_REFSPEC." && exit 1
+endif
 
 distcheck: dist
 	rm -rf $(dummy)
--- mk-one-release.orig	2024-04-23 17:30:08.983226671 -0400
+++ mk-one-release	2024-04-26 15:17:29.713669677 -0400
@@ -39,13 +39,17 @@ mkdir pgsql
 git archive ${gitref} | tar xf - -C pgsql
 
 # Include the git ref in the output tarballs
+# (This has no effect with v17 and up; instead we rely on "git archive"
+# to include the commit hash in the tar header)
 echo ${gitref} >pgsql/.gitrevision
 
 cd pgsql
 ./configure
 
 # Produce .tar.gz and .tar.bz2 files
-make dist
+# (With v17 and up, this will repeat the "git archive" call; the contents
+# of the working directory don't matter directly to the results.)
+make dist PG_COMMIT_REFSPEC=${gitref}
 
 # Generate md5 and sha256 sums, and copy files to output
 for x in *.tar.*; do


Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 2:25 PM Tom Lane  wrote:
> Robert Haas  writes:
> > No other programming language that I know of, and no other database
> > that I know of, looks at x.y.z and says "ok, well first we have to
> > figure out whether the object is named x or x.y or x.y.z, and then
> > after that, we'll use whatever is left over as a field selector."
>
> It may indeed be true that nobody but SQL does that, but nonetheless
> this is exactly what SQL99 requires AFAICT.  The reason we use this
> parenthesis notation is precisely that we didn't want to get into
> that sort of tea-leaf-reading about how many identifiers mean what.
> The parens put it on the user to tell us what part of the chain
> is field selection.

I really thought this was just PostgreSQL, not SQL generally, but I
just experimented a bit with Oracle on dbfiddle.uk using this example:

CREATE TYPE foo AS OBJECT (a number(10), b varchar2(2000));
CREATE TABLE bar (quux foo);
INSERT INTO bar VALUES (foo(1, 'one'));
SELECT bar.quux, quux, (quux).a, (bar.quux).a FROM bar;

This works, but if I delete the parentheses from the last line, then
it fails. So evidently my understanding of how this works in other
systems is incorrect, or incomplete. I feel like I've encountered
cases where we required extra parenthesization that Oracle didn't
need, but it's hard to discuss that without examples, and I don't have
them right now.

> Yes, we can.  Please do not rant further about this until you've
> read the <identifier chain> section of a recent SQL spec.

I'm hurt to see emails that I spent time on characterized as a rant,
even if I was wrong on the facts. And I think appealing to the SQL
standard is a poor way of trying to end debate on a topic.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support a wildcard in backtrace_functions

2024-04-26 Thread Andres Freund
On 2024-04-26 14:39:16 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I don't think enabling backtraces without a way to disable them is a good 
> > idea
> > - security vulnerabilities in backtrace generation code are far from 
> > unheard
> > of and can make error handling a lot slower...
>
> Well, in that case we have to have some kind of control GUC, and
> I think the consensus is that the one we have now is under-designed.
> So I also vote for a full revert and try again in v18.

Yea, makes sense. I just wanted to point out that some level of control is
needed, not say that what we have now is right.




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Isaac Morland
On Fri, 26 Apr 2024 at 14:04, Robert Haas  wrote:

> systems have this problem. I wonder if anyone knows of another system
> that works like PostgreSQL in this regard (without sharing code).
>

In Haskell, the period (".") is used both to form a qualified name
(module.name), very similar to our schema.object, and as a perfectly normal
operator which is defined by the standard prelude as function composition
(but you can re-bind it to any object whatsoever). This is disambiguated in a
very simple way, however: module names must begin with an uppercase letter
while variable names must begin with a lowercase letter.

A related point is that parentheses in Haskell act to group expressions,
but they, and commas, are not involved in calling functions: to call a
function, just write it to the left of its parameter (and it only has one
parameter, officially).

All this might sound weird but it actually works very well in the Haskell
context.


Re: Support a wildcard in backtrace_functions

2024-04-26 Thread Tom Lane
Andres Freund  writes:
> I don't think enabling backtraces without a way to disable them is a good idea
> - security vulnerabilities in backtrace generation code are far from unheard
> of and can make error handling a lot slower...

Well, in that case we have to have some kind of control GUC, and
I think the consensus is that the one we have now is under-designed.
So I also vote for a full revert and try again in v18.

regards, tom lane




Re: Support a wildcard in backtrace_functions

2024-04-26 Thread Andres Freund
Hi,

On 2024-04-19 15:24:17 -0400, Tom Lane wrote:
> I can't say that I care for "backtrace_on_internal_error".
> Re-reading that thread, I see I argued for having *no* GUC and
> just enabling that behavior all the time.  I lost that fight,
> but it should have been clear that a GUC of this exact shape
> is a design dead end --- and that's what we're seeing now.

I don't think enabling backtraces without a way to disable them is a good idea
- security vulnerabilities in backtrace generation code are far from unheard
of and can make error handling a lot slower...

Greetings,

Andres Freund




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Andres Freund
On 2024-04-26 06:54:26 -0500, Jonathan S. Katz wrote:
> The Core Team would like to extend our congratulations to Melanie Plageman
> and Richard Guo, who have accepted invitations to become our newest
> PostgreSQL committers.
> 
> Please join us in wishing them much success and few reverts!

Congratulations!




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Tom Lane
Robert Haas  writes:
> No other programming language that I know of, and no other database
> that I know of, looks at x.y.z and says "ok, well first we have to
> figure out whether the object is named x or x.y or x.y.z, and then
> after that, we'll use whatever is left over as a field selector."

It may indeed be true that nobody but SQL does that, but nonetheless
this is exactly what SQL99 requires AFAICT.  The reason we use this
parenthesis notation is precisely that we didn't want to get into
that sort of tea-leaf-reading about how many identifiers mean what.
The parens put it on the user to tell us what part of the chain
is field selection.

Now do you see why I'd prefer to ditch the SQL92-compatibility
measures?  If we said that the first identifier in a chain must
be a correlation name or parameter name, never anything else,
it'd be substantially saner.

> Yet that's essentially what we've done with period, and I don't think
> we can blame that on the SQL standard

Yes, we can.  Please do not rant further about this until you've
read the <identifier chain> section of a recent SQL spec.

regards, tom lane




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 12:54 PM Tom Lane  wrote:
> But that's exactly the point: we DON'T consider the initial identifier
> of a qualified name "as a unit", and neither does the standard.
> We have to figure out how many of the identifiers name an object
> (column or table) referenced in the query, and that is not clear
> up-front.  SQL99's rules don't make that any better.  The parens in
> our current notation serve to separate the object name from any field
> selection(s) done on the object.  There's still some ambiguity,
> because we allow you to write either "(table.column).field" or
> "(table).column.field", but we've dealt with that for ages.

I agree that this is exactly the point.

No other programming language that I know of, and no other database
that I know of, looks at x.y.z and says "ok, well first we have to
figure out whether the object is named x or x.y or x.y.z, and then
after that, we'll use whatever is left over as a field selector."
Instead, they have a top-level namespace where x refers to one and
only one thing, and then they look for something called y inside of
that, and if that's a valid object then they look inside of that for
z.

JavaScript is probably the purest example of this. Everything is an
object, and x.y just looks up 'x' in the object that is the current
namespace. Assuming that returns an object rather than nothing, we
then try to find 'y' inside of that object.

I'm not an Oracle expert, but I am under the impression that the way
that Oracle works is closer to that than it is to our
read-the-tea-leaves approach.

I'm almost positive you're about to tell me that there's no way in the
infernal regions that we could make a semantics change of this
magnitude, and maybe you're right. But I think our current approach is
deeply unsatisfying and seriously counterintuitive. People do not get
an error about x.y and think "oh, right, I need to write (x).y so that
the parser understands that the name is x rather than x.y and the .y
part is field-selection rather than a part of the name itself." They
get an error about x.y and say "crap, I guess this syntax isn't
supported" and then when you show them that "(x).y" fixes it, they say
"why in the world does that fix it?" or "wow, that's dumb."

Imagine if we made _ perform string concatenation but also continued
to allow it as an identifier character. When we saw a_b without
spaces, we'd test for whether there's an a_b variable, and/or whether
there are a and b variables, to guess which interpretation was meant.
I hope we would all agree that this would be insane language design.
Yet that's essentially what we've done with period, and I don't think
we can blame that on the SQL standard, because I don't think other
systems have this problem. I wonder if anyone knows of another system
that works like PostgreSQL in this regard (without sharing code).

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: partitioning and identity column

2024-04-26 Thread Alexander Lakhin

26.04.2024 15:57, Ashutosh Bapat wrote:

Thanks Alexander for the report.

On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin  wrote:


CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR:  no owned sequence found


I don't think creating a table like a partition is common or even useful. Usually one would create it from the 
partitioned table. But if we consider that to be a use case, I think the error is expected since a partition doesn't have its 
own identity; it shares it with the partitioned table. Maybe we could give a better message. But I will look into this 
and fix it if the solution makes sense.


Maybe it's uncommon, but it's allowed, so users may want to
CREATE TABLE sometable (LIKE partX INCLUDING ALL), for example, if the
partition has a somewhat different structure. And thinking about how such
a restriction could be described in the docs, I would prefer to avoid this
error at the implementation level.
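
For context, a minimal sequence that runs into this error presumably looks
something like the following (my own reconstruction; the original setup was
not quoted here):

CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b int)
  PARTITION BY LIST (b);
CREATE TABLE tbl2 PARTITION OF tbl1 FOR VALUES IN (1);
CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR:  no owned sequence found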



Do you want to track this in open items?



If you are inclined to fix this behavior,  I would add this item.

Best regards,
Alexander

Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Alexander Korotkov
On Fri, Apr 26, 2024 at 2:54 PM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Many congratulations!  Well deserved!

--
Regards,
Alexander Korotkov




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Amul Sul
Many congratulations both of you..!!

Regards,
Amul

On Friday 26 April 2024, Jonathan S. Katz  wrote:

> The Core Team would like to extend our congratulations to Melanie Plageman
> and Richard Guo, who have accepted invitations to become our newest
> PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>
> Thanks,
>
> Jonathan
>


-- 
Regards,
Amul Sul
EDB: http://www.enterprisedb.com


Re: Help update PostgreSQL 13.12 to 13.14

2024-04-26 Thread Kashif Zeeshan
On Fri, Apr 26, 2024 at 9:22 PM •Isaac Rv  wrote:

> Look, I tried with yum and it did update, but it still did not go to
> 13.14
>
> sudo yum update postgresql13
> Updating Subscription Management repositories.
>
> This system is registered with an entitlement server, but is not receiving
> updates. You can use subscription-manager to assign subscriptions.
>
> Last metadata expiration check: 0:07:02 ago on Fri 26 Apr 2024 10:01:36 AM
> CST.
> Dependencies resolved.
> Nothing to do.
> Complete!
>

It seems yum is not able to get the latest package update; try clearing
the cache and rebuilding it:

yum clean all

yum makecache
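
Once the packages are updated and the server restarted, a quick way to confirm
the running minor version from psql (a generic check, nothing specific to this
setup) is:

SELECT version();
SHOW server_version;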



>
> On Thu, 25 Apr 2024 at 23:16, Kashif Zeeshan ()
> wrote:
>
>>
>>
>> On Thu, Apr 25, 2024 at 11:55 PM •Isaac Rv 
>> wrote:
>>
>>> I understand, yes; I've been told it's simple, but I don't understand: do I
>>> just download the binaries, and which folder do I replace them in? There isn't
>>> really a guide as such on how to do it. Could you help me?
>>>
>>
>> Follow the below steps
>> 1. Backup your data
>> 2. Review the release notes of the update release
>> 3. Stop the PG Server
>> 4. Upgrade postgres to newer version, e.g. on CentOS use the command
>> 'sudo yum update postgresql'
>> 5. Restart PG Server
>>
>> Thanks
>> Kashif Zeeshan
>> Bitnine Global
>>
>>>
>>> On Thu, 25 Apr 2024 at 11:20, Kashif Zeeshan (<
>>> kashi.zees...@gmail.com>) wrote:
>>>
 Hi Isaac

 You are doing a minor version upgrade, so it's not a big effort compared to
 a major version upgrade; the following is the process to do it.

 *Minor releases never change the internal storage format and are always
 compatible with earlier and later minor releases of the same major version
 number. For example, version 10.1 is compatible with version 10.0 and
 version 10.6. Similarly, for example, 9.5.3 is compatible with 9.5.0,
 9.5.1, and 9.5.6. To update between compatible versions, you simply replace
 the executables while the server is down and restart the server. The data
 directory remains unchanged — minor upgrades are that simple.*


 Please follow the links below for more information.
 https://www.postgresql.org/docs/13/upgrading.html
 https://www.postgresql.org/support/versioning/

 Thanks
 Kashif Zeeshan
 Bitnine Global

 On Thu, Apr 25, 2024 at 9:37 PM •Isaac Rv 
 wrote:

> Hello everyone, I hope you're doing well. Does anyone have a guide or
> know how to perform an upgrade from PostgreSQL 13.12 to 13.14 on Linux?
> I've searched in various places but haven't found any solid guides, and
> truth be told, I'm a bit of a novice with PostgreSQL. Any help would be
> appreciated.
>



Re: AIX support

2024-04-26 Thread Sriram RK

> > It would definitely make sense for a new port to start by getting
> > things going with gcc only, and then look at resurrecting xlc
> > support.

> Sriram mentioned upthread that he was looking at both of them.  I'd be
> ready to assume that most of the interest is in xlc, not gcc.  But I
> may be wrong.

Just a heads-up, we got a node in the OSU lab for the buildfarm. Will let you 
know once we have the buildfarm set up on that node.

Also, we are making progress on setting up the buildfarm on a local node as 
well, but currently some tests are failing; it seems to be an issue with plperl.

aix01::build-farm-17#
./run_build.pl --keepall  --nosend --nostatus --verbose=5  --force REL_16_STABLE

Fri Apr 26 00:53:50 2024: buildfarm run for AIXnode01:REL_16_STABLE starting
AIXnode01:REL_16_STABLE [00:53:50] checking out source ...
AIXnode01:REL_16_STABLE [00:53:56] checking if build run needed ...
AIXnode01:REL_16_STABLE [00:53:56] copying source to pgsql.build ...
AIXnode01:REL_16_STABLE [00:54:08] running configure ...
AIXnode01:REL_16_STABLE [00:55:01] running build ...
AIXnode01:REL_16_STABLE [01:08:09] running basic regression tests ...
AIXnode01:REL_16_STABLE [01:09:51] running make contrib ...
AIXnode01:REL_16_STABLE [01:11:08] running make testmodules ...
AIXnode01:REL_16_STABLE [01:11:19] running install ...
AIXnode01:REL_16_STABLE [01:11:48] running make contrib install ...
AIXnode01:REL_16_STABLE [01:12:01] running testmodules install ...
AIXnode01:REL_16_STABLE [01:12:06] checking test-decoding
gmake: gcc: A file or directory in the path name does not exist.
AIXnode01:REL_16_STABLE [01:12:28] running make check miscellaneous modules ...
gmake: gcc: A file or directory in the path name does not exist.
AIXnode01:REL_16_STABLE [01:13:50] setting up db cluster (C)...
AIXnode01:REL_16_STABLE [01:13:53] starting db (C)...
AIXnode01:REL_16_STABLE [01:13:53] running installcheck (C)...
AIXnode01:REL_16_STABLE [01:15:05] restarting db (C)...
AIXnode01:REL_16_STABLE [01:15:07] running make isolation check ...
AIXnode01:REL_16_STABLE [01:15:51] restarting db (C)...
AIXnode01:REL_16_STABLE [01:15:56] running make PL installcheck (C)...
Branch: REL_16_STABLE
Stage PLCheck-C failed with status 2





Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-26 Thread Tom Lane
Anton Voloshin  writes:
> On 26/04/2024 17:38, Anton Voloshin wrote:
>> I will try to report this to Perl community later.

> Reported under https://github.com/Perl/perl5/issues/22176

Thanks for doing that.

> Perl 5.36.3 seems to be fine (latest stable release before 5.38.x).
> 5.38.0 and 5.38.2 are broken.

If the misbehavior is that new, I'm inclined to do nothing about it,
figuring that they'll fix it sooner not later.  If we were seeing
failures in main-line check-world tests then maybe it'd be worth
band-aiding those, but AFAICS we're not.

regards, tom lane




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 26, 2024 at 12:15 PM Tom Lane  wrote:
>> If I'm reading SQL99 correctly, they deny allowing the first
>> identifier to be a column name when there's more than one identifier,
>> so that you must table-qualify a composite column before you can
>> select a field from it.  But they allow all the other possibilities
>> and claim it's user error if more than one could apply, which seems
>> like an awful design to me.

> I'm less certain how that should be spelled. The rules you propose
> make sense to me up to a point, but what happens if the same
> unqualified name is both a table alias and a function parameter name?
> I think I need a way of forcing the function-parameter interpretation.
> You could make function_name.parameter_name resolve to that, but then
> what happens if function_name is also a table alias in the containing
> query? It's really hard to think of a set of rules here that don't
> leave any room for unfixable problems. Maybe the answer is that we
> should support some completely different notion for unambiguously
> referencing parameters, like ${parameter_name}. I don't know. I think
> that what you're proposing here could be a nice improvement but it
> definitely seems tricky to get it completely right.

I think you're moving the goal posts too far.  It's on the user to
spell the initially-written query unambiguously: if you chose a
function parameter name that matches a table correlation name in the
query, that's your fault and you'd better rename one of those things.
What concerns me is the hazard that the query is okay, and we store
it, and then subsequent object creations or renamings create a
situation where an identifier chain is ambiguous per the SQL99 rules.
ruleutils has to be able to deparse the stored query in a way that is
valid regardless of that.  Giving first priority to correlation and
parameter names makes this possible because external operations, even
including renaming tables or columns used in the query, won't affect
either.

regards, tom lane

PS: ruleutils does sometimes choose new correlation names, and it
suddenly occurs to me that it's probably not being careful to avoid
duplicating function parameter names.  But that's independent of this
discussion.




Re: AIX support

2024-04-26 Thread Bruce Momjian
On Thu, Apr 25, 2024 at 01:06:24PM +0900, Michael Paquier wrote:
> Anyway, getting an access to such compilers to be able to debug issues
> on hosts that take less than 12h to just compile the code would
> certainly help its adoption.  So seeing commitment in the form of
> patches and access to environments would help a lot.  Overall,
> studying that afresh with v18 looks like a good idea, assuming that
> anybody who commits such patches has access to hosts to evaluate them,
> with buildfarm members running on top, of course.

Agreed.  They can't even have a buildfarm member for PG 17 since it
doesn't compile anymore, so someone has to go over the reverted patches,
figure out which ones are still valid, and report back.  Trying to add a
port, with possible breakage, during beta seems too risky compared to
the value.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-26 Thread Anton Voloshin

On 26/04/2024 17:38, Anton Voloshin wrote:

I will try to report this to Perl community later.


Reported under https://github.com/Perl/perl5/issues/22176

Perl 5.36.3 seems to be fine (latest stable release before 5.38.x).
5.38.0 and 5.38.2 are broken.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: AIX support

2024-04-26 Thread Bruce Momjian
On Thu, Apr 25, 2024 at 10:16:34AM +0200, Álvaro Herrera wrote:
> On 2024-Apr-24, Bruce Momjian wrote:
> 
> > I agree that targeting PG 18 for a new-er AIX port is the reasonable
> > approach.  If there is huge demand, someone can create an AIX fork for
> > PG 17 using the reverted patches --- yeah, lots of pain there, but we
> > have carried the AIX pain for too long with too little support.
> 
> I'm not sure how large the demand would be for an AIX port specifically
> of 17, though.  I mean, people using older versions can continue to use
> 16 until 18 is released.  Upgrading past one major version is hardly
> unheard of.

Agreed.  They seem to have packages for 11/12, and only 15 recently.  I
don't see how PG 17 would be missed, unless there are many people
compiling from source.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 26, 2024 at 11:55 AM Tom Lane  wrote:
>> Well, that would be one way of making the consistency problem be not
>> our problem, but it would be a sad restriction.  It'd void a lot of
>> the arguable use-case for this feature, if you ask me.  I realize
>> that non-superusers couldn't create the C-language I/O functions that
>> would be most at risk here, but you could imagine people building
>> I/O functions in some other PL.

> Huh, I hadn't considered that. I figured the performance would be too
> bad to even think about it. I also wasn't even sure such a thing would
> be supportable: I thought cstrings were generally limited to
> C/internal functions.

Performance could indeed be an issue, but I think people taking this
path would be doing so because they value programmer time more than
machine time.  And while there once were good reasons to not let
user functions deal in cstrings, I'm not sure there are anymore.
(The point would deserve closer investigation if we actually tried
to move forward on it, of course.)

> Rather, what I
> don't like about the status quo is that putting parentheses around
> something that we were already going to consider as a unit changes the
> meaning of it. And that's exactly what happens when you write x.y vs.
> (x).y.

But that's exactly the point: we DON'T consider the initial identifier
of a qualified name "as a unit", and neither does the standard.
We have to figure out how many of the identifiers name an object
(column or table) referenced in the query, and that is not clear
up-front.  SQL99's rules don't make that any better.  The parens in
our current notation serve to separate the object name from any field
selection(s) done on the object.  There's still some ambiguity,
because we allow you to write either "(table.column).field" or
"(table).column.field", but we've dealt with that for ages.

regards, tom lane




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 12:15 PM Tom Lane  wrote:
> > I'm not familiar with these rules. Do they allow stuff like a.b.c.d.e,
> > or better yet, a.b(args).c(args).d(args).e(args)?
>
> The former.
>
> <identifier chain> ::=
>   <identifier> [ { <period> <identifier> }... ]

OK, nice.

> The hard part is to figure out what the first identifier is:
> column name? table correlation name (AS name)? schema name or
> catalog name of a qualified table name? function parameter name?
> After that, as long as what you have is of composite type,
> you can drill down into it.

Right, makes sense.

> If I'm reading SQL99 correctly, they deny allowing the first
> identifier to be a column name when there's more than one identifier,
> so that you must table-qualify a composite column before you can
> select a field from it.  But they allow all the other possibilities
> and claim it's user error if more than one could apply, which seems
> like an awful design to me.  At minimum I'd want to say that the
> correlation name should be the first choice and wins if there's
> a match, regardless of anything else, because otherwise there is
> no safe way for ruleutils to deparse such a construct.  And
> probably function parameter name should be second choice and
> similarly win over other choices, for the same reason.  The other
> options are SQL92 compatibility holdovers and should only be
> considered if we can't find a matching correlation or parameter name.

I definitely agree that there must always be some way to make it
unambiguous, not just because of deparsing but also because users are
going to want a way to force their preferred interpretation. I've been
a PostgreSQL developer now for considerably longer than I was an end
user, but I definitely would not have liked "ERROR: you can't get
there from here".

I'm less certain how that should be spelled. The rules you propose
make sense to me up to a point, but what happens if the same
unqualified name is both a table alias and a function parameter name?
I think I need a way of forcing the function-parameter interpretation.
You could make function_name.parameter_name resolve to that, but then
what happens if function_name is also a table alias in the containing
query? It's really hard to think of a set of rules here that don't
leave any room for unfixable problems. Maybe the answer is that we
should support some completely different notion for unambiguously
referencing parameters, like ${parameter_name}. I don't know. I think
that what you're proposing here could be a nice improvement but it
definitely seems tricky to get it completely right.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 11:55 AM Tom Lane  wrote:
> Well, that would be one way of making the consistency problem be not
> our problem, but it would be a sad restriction.  It'd void a lot of
> the arguable use-case for this feature, if you ask me.  I realize
> that non-superusers couldn't create the C-language I/O functions that
> would be most at risk here, but you could imagine people building
> I/O functions in some other PL.

Huh, I hadn't considered that. I figured the performance would be too
bad to even think about it. I also wasn't even sure such a thing would
be supportable: I thought cstrings were generally limited to
C/internal functions.

> >> I take it that you also don't believe that "2 + 3 * 4" should yield
> >> different results from "(2 + 3) * 4"?
>
> > Isaac's rebuttal to this particular point was perfect; I have nothing to 
> > add.
>
> As far as I could tell, Isaac's rebuttal was completely off-point.

OK, I'm not sure why, but let me explain my position. In an expression
like (2 + 3) * 4, the parentheses change the order of evaluation,
which makes sense. That's what parentheses are for, or at least one
thing that parentheses are for. But in an expression like
(my_point).x, that's not the case. There's only one operator here, the
period, and so there's only one possible order of evaluation, so why
do parentheses make any difference? Having (my_point).x be different
from my_point.x is like having 2 + 3 give a different answer from (2 +
3), which would be bonkers.

But it's not at all like the difference between 2 + 3 * 4 and (2 + 3)
* 4. The comparable case there would be foo.bar.baz as against
(foo.bar).baz or alternatively foo.(bar.baz). Now there are two dot
operators, and one of them has to be applied first, and there's some
default based on associativity, and if you want it the other way you
stick parentheses in there to tell the parser what you meant.

And the reason I thought Isaac put it well is that he said, "In that
expression 2 + 3 is not a subexpression, although 3 * 4 is, thanks to
the operator precedence rules." Exactly so. 2 + 3 * 4 is going to be
parsed as something like OpExpr(+, 2, OpExpr(*, 3, 4)) -- and that
does not have OpExpr(+, 2, 3) anywhere inside of it, so my comment
that parenthesizing a subexpression shouldn't change its meaning is
not relevant here. I'm perfectly fine with parentheses changing which
things we parse as subexpressions. Users have no license to just stick
parentheses into your SQL expressions in random places and expect that
they don't do anything; if that were so, we'd have to make ((2 +)
3)()()() evaluate to 5, which is obviously nonsense. Rather, what I
don't like about the status quo is that putting parentheses around
something that we were already going to consider as a unit changes the
meaning of it. And that's exactly what happens when you write x.y vs.
(x).y. The parentheses around the x make us think that it's a
different kind of thing, somehow. That seems odd, and the practical
result is that you have to insert a bunch of parentheses into your
PostgreSQL code that look like they shouldn't be needed, but are.
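
To make that concrete, here is a minimal SQL sketch of the behavior in
question (the type and table names are invented for illustration):

CREATE TYPE pt AS (x int, y int);
CREATE TABLE t (my_point pt);
INSERT INTO t VALUES (ROW(1, 2));

SELECT my_point.x FROM t;    -- fails: my_point is parsed as a table name,
                             -- so this draws a missing-FROM-clause-entry error
SELECT (my_point).x FROM t;  -- works: the parentheses force the parser to
                             -- treat my_point as a value and select field x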

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 25, 2024 at 5:51 PM Tom Lane  wrote:
>> A different approach we could take is to implement the SQL99 rules
>> for <identifier chain>, or at least move closer to that.

> I'm not familiar with these rules. Do they allow stuff like a.b.c.d.e,
> or better yet, a.b(args).c(args).d(args).e(args)?

The former.

  <identifier chain> ::=
    <identifier> [ { <period> <identifier> }... ]

The hard part is to figure out what the first identifier is:
column name? table correlation name (AS name)? schema name or
catalog name of a qualified table name? function parameter name?
After that, as long as what you have is of composite type,
you can drill down into it.

If I'm reading SQL99 correctly, they deny allowing the first
identifier to be a column name when there's more than one identifier,
so that you must table-qualify a composite column before you can
select a field from it.  But they allow all the other possibilities
and claim it's user error if more than one could apply, which seems
like an awful design to me.  At minimum I'd want to say that the
correlation name should be the first choice and wins if there's
a match, regardless of anything else, because otherwise there is
no safe way for ruleutils to deparse such a construct.  And
probably function parameter name should be second choice and
similarly win over other choices, for the same reason.  The other
options are SQL92 compatibility holdovers and should only be
considered if we can't find a matching correlation or parameter name.

The net result of doing it like this, I think, is that we'd accept
some cases where SQL99 would prefer to raise an ambiguity error.
But we'd still be much closer to the modern standard than we are
today.
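
As a concrete sketch of what that resolution order would mean in practice
(hypothetical -- the unparenthesized form is not accepted today):

CREATE TYPE complex AS (r float8, i float8);
CREATE TABLE tbl (val complex);

-- Today, the head identifier of "val.r" would be taken as a table name,
-- so field selection has to be parenthesized:
SELECT (tbl.val).r FROM tbl;

-- Under the SQL99 <identifier chain> rules above, with the correlation
-- name winning first, something like this could be accepted directly:
-- SELECT tbl.val.r FROM tbl;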

regards, tom lane




Re: Support a wildcard in backtrace_functions

2024-04-26 Thread Robert Haas
On Tue, Apr 23, 2024 at 1:05 AM Michael Paquier  wrote:
> At this stage, my opinion would tend in favor of a revert of the GUC.
> The second class of cases is useful to stress many unexpected cases,
> and I don't expect this number to go down over time, but increase with
> more advanced tests added into core (I/O failures with partial writes
> for availability, etc).

All right. I think there is a consensus in favor of reverting
a740b213d4b4d3360ad0cac696e47e5ec0eb8864. Maybe not an absolutely
iron-clad consensus, but there are a couple of votes explicitly in
favor of that course of action and the other votes seem to mostly be
of the form "well, reverting is ONE option."

Peter?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 25, 2024 at 5:05 PM Tom Lane  wrote:
>> What I'm trying to say is: given that the command "alter type T alter
>> attribute A type foo" exists, users would reasonably expect the system
>> to honor that on its own for any composite type, because that's what
>> it does today.  But it can't if T has custom I/O functions, at least
>> not without understanding how to rewrite those functions.

> I understand your point, but I don't agree with it. Ordinary users
> wouldn't be able to create types like this anyway, because there's no
> way we can allow an unprivileged user to set an input or output
> function. It would have to be restricted to superusers, just as we do
> for base types.

Well, that would be one way of making the consistency problem be not
our problem, but it would be a sad restriction.  It'd void a lot of
the arguable use-case for this feature, if you ask me.  I realize
that non-superusers couldn't create the C-language I/O functions that
would be most at risk here, but you could imagine people building
I/O functions in some other PL.

(We'd have to remove the restriction that cstring isn't an allowed
input or return type for user-defined functions; but AFAIK that's
just a holdover from days when cstring was a lot more magic than
it is now.)

Maybe there's an argument that PL functions already have to be
proof enough against datatype inconsistencies that nothing really
awful could happen.  Not sure.

In any case, if we have to put strange restrictions on a composite
type when it has custom I/O functions, then that still is an
indication that the feature is a hack that doesn't play nice
with the rest of the system.  So I remain of the opinion that
we shouldn't go there.  If field selection support for custom
types will solve the use-case, I find that a lot more attractive.

>> I take it that you also don't believe that "2 + 3 * 4" should yield
>> different results from "(2 + 3) * 4"?

> Isaac's rebuttal to this particular point was perfect; I have nothing to add.

As far as I could tell, Isaac's rebuttal was completely off-point.

regards, tom lane




Re: trying again to get incremental backup

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 6:44 PM Thom Brown  wrote:
> I would like to query the following:
>
> --tablespace-mapping=olddir=newdir
>
> Relocates the tablespace in directory olddir to newdir during the backup. 
> olddir is the absolute path of the tablespace as it exists in the first 
> backup specified on the command line, and newdir is the absolute path to use 
> for the tablespace in the reconstructed backup.
>
> The first backup specified on the command line will be the regular, full, 
> non-incremental backup.  But if a tablespace was introduced subsequently, it 
> would only appear in an incremental backup.  Wouldn't this then mean that a 
> mapping would need to be provided based on the path to the tablespace of that 
> incremental backup's copy?

Yes. Tomas Vondra found the same issue, which I have fixed in
1713e3d6cd393fcc1d4873e75c7fa1f6c7023d75.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Direct SSL connection with ALPN and HBA rules

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 5:50 PM Heikki Linnakangas  wrote:
> On 25/04/2024 21:13, Jacob Champion wrote:
> > On Thu, Apr 25, 2024 at 10:35 AM Robert Haas  wrote:
> >> Maybe I'm missing something here, but why doesn't sslnegotiation
> >> override sslmode completely? Or alternatively, why not remove
> >> sslnegotiation entirely and just have more sslmode values? I mean
> >> maybe this shouldn't happen categorically, but if I say I want to
> >> require a direct SSL connection, to me that implies that I don't want
> >> an indirect SSL connection, and I really don't want a non-SSL
> >> connection.
>
> My thinking with sslnegotiation is that it controls how SSL is
> negotiated with the server, if SSL is to be used at all. It does not
> control whether SSL is used or required; that's what sslmode is for.

I think this might boil down to the order in which someone thinks that
different settings should be applied. It sounds like your mental model
is that GSS settings are applied first, and then SSL settings are
applied afterwards, and then within the SSL bucket you can select how
you want to do SSL (direct or negotiated) and how required it is. My
mental model is different: I imagine that since direct SSL happens
from the first byte exchanged over the socket, direct SSL "happens
first", making settings that pertain to negotiated GSS and negotiated
SSL irrelevant. Because, logically, if you've decided to use direct
SSL, you're not even going to get a chance to negotiate those things.
I understand that the code as written works around that, by being able
to open a new connection if it turns out that we need to negotiate
that stuff after all, but IMHO that's rather confusing.
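
For example, the combination being debated looks roughly like this (the
option names are the ones discussed in this thread; the exact set of
accepted values may differ from what finally ships):

psql "host=db.example.com dbname=app sslmode=require sslnegotiation=direct"

Under the model where sslmode decides whether SSL is used and
sslnegotiation only decides how, the two options above are independent;
under the other model, asking for direct negotiation would by itself rule
out negotiated SSL, negotiated GSS, and plaintext connections.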

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 5:51 PM Tom Lane  wrote:
> A different approach we could take is to implement the SQL99 rules
> for <identifier chain>, or at least move closer to that.  Our
> existing rules for resolving qualified column references are more
> or less SQL92.  I think the reasons we didn't do that when we first
> implemented SQL99 are

I'm not familiar with these rules. Do they allow stuff like a.b.c.d.e,
or better yet, a.b(args).c(args).d(args).e(args)?

> Still, maybe it's time to think about changing?  We could use
> the "the standard says so" excuse with anybody who complains.

I certainly agree that if we're going to break stuff, breaking stuff
to get closer to the standard is superior to other ways of breaking
stuff. Without knowing what we'd get out of it, I don't have an
opinion about whether it's worth it here or not, but making our syntax
more like other programming languages and especially other popular
database products does seem to me to have positive value.

> In the long run I wish we could ditch the SQL92 rules altogether
> and say that the head identifier of a qualified column reference
> must be a table's correlation name, not a schema or catalog name.
> There's zero good reason for the latter two cases, other than
> compatibility with thirty-year-old design mistakes.  I kind of
> doubt we could make that fly though.

Yeah, I think that would break too much stuff.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-26 Thread Anton Voloshin

On 26/04/2024 05:20, Tom Lane wrote:

Haven't we worked around that everywhere it matters, in commits such
as 8421f6bce and 605062227?


Yes, needing 8421f6bce and 605062227 was, perhaps, surprising, but 
reasonable. Unlike breaking floating point constants in the source code. 
But, I guess, you're right and, since it does look like a Perl bug, 
we'll have to work around that in all places where we use floating-point 
constants in Perl code, which are surprisingly few.


> For me, check-world passes under
> LANG=ru_RU, even with perl 5.38.2 (where I do confirm that your
> test script fails).  The buildfarm isn't unhappy either.

Indeed, check-world seems to run fine on my machine and on the bf as well.

Grepping and browsing through, I've only found three spots with \d\.\d 
directly in Perl code as a float, only one of them needs correction.


1. src/test/perl/PostgreSQL/Test/Kerberos.pm in master
src/test/kerberos/t/001_auth.pl in REL_16_STABLE
> if ($krb5_version >= 1.15)

I guess adding use locale ':numeric' would be the easiest workaround here.
Alternatively, we could also split version into krb5_major_version and 
krb5_minor_version while parsing krb5-config --version's output above, 
but I don't think that's warranted. So I suggest something along the 
lines of 0001-use-numeric-locale-in-kerberos-test-rel16.patch and 
*-master.patch (attached, REL_16 and master need this change in 
different places).


I did verify, by providing a fake 'krb5-config', that before the fix, with 
LANG=ru_RU.UTF-8, Perl 5.38.2 and, say, krb5 "version" 1.13, it 
would still add the "listen" lines to kdc.conf by mistake (presumably 
confusing some versions of Kerberos).


2 and 3. contrib/intarray/bench/create_test.pl
> if (rand() < 0.7)
and
> if ($#sect < 0 || rand() < 0.1)

PostgreSQL::Test::Utils is not used there, so it's OK, no change needed.

I did not find any other float constants in .pl/.pm files in master (I 
could have missed something).


> Particularly in
> this way --- what are we supposed to do, write "if (0 < 0,5)"?
> That means something else.

Yep. I will try to report this to Perl community later.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom bc187cd95f91d5a7fe97ba4ccfaf75b77e2f03c8 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Fri, 26 Apr 2024 12:24:49 +
Subject: [PATCH] use numeric locale in kerberos test to work around perl bug

After b124104e7 we became susceptible to Perl bug (at least on Perl
5.38.2), where on locales with non-dot fractional part separator,
floating-point constants in perl source are broken (1.15 would be '1').
Work around that by using numeric locale in this file.
---
 src/test/kerberos/t/001_auth.pl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index d74e4af464e..9475f14d50e 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -23,6 +23,8 @@ use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::Cluster;
 use Test::More;
 use Time::HiRes qw(usleep);
+# Work around the Perl 5.38.2 bug: we need floating-point constants.
+use locale ':numeric';
 
 if ($ENV{with_gssapi} ne 'yes')
 {
-- 
2.43.0

From d6ee9a5567cea503c211fa41a0f741c71aa60ad5 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Fri, 26 Apr 2024 14:23:28 +
Subject: [PATCH] use numeric locale in kerberos test to work around perl bug

After b124104e7 we became susceptible to Perl bug (at least on Perl
5.38.2), where on locales with non-dot fractional part separator,
floating-point constants in perl source are broken (1.15 would be '1').
Work around that by using numeric locale in this file.
---
 src/test/perl/PostgreSQL/Test/Kerberos.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/test/perl/PostgreSQL/Test/Kerberos.pm b/src/test/perl/PostgreSQL/Test/Kerberos.pm
index f7810da9c1d..2c29c67329e 100644
--- a/src/test/perl/PostgreSQL/Test/Kerberos.pm
+++ b/src/test/perl/PostgreSQL/Test/Kerberos.pm
@@ -9,6 +9,8 @@ package PostgreSQL::Test::Kerberos;
 use strict;
 use warnings FATAL => 'all';
 use PostgreSQL::Test::Utils;
+# Work around the Perl 5.38.2 bug: we need floating-point constants.
+use locale ':numeric';
 
 our ($krb5_bin_dir, $krb5_sbin_dir, $krb5_config, $kinit, $klist,
 	 $kdb5_util, $kadmin_local, $krb5kdc,
-- 
2.43.0



Re: Why don't we support external input/output functions for the composite types

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 5:05 PM Tom Lane  wrote:
> > I'm not sure I really buy this. Changing the column definitions
> > amounts to changing the on-disk format, and no data type can survive a
> > change to the on-disk format without updating the I/O functions to
> > match.
>
> What I'm trying to say is: given that the command "alter type T alter
> attribute A type foo" exists, users would reasonably expect the system
> to honor that on its own for any composite type, because that's what
> it does today.  But it can't if T has custom I/O functions, at least
> not without understanding how to rewrite those functions.

I understand your point, but I don't agree with it. Ordinary users
wouldn't be able to create types like this anyway, because there's no
way we can allow an unprivileged user to set an input or output
function. It would have to be restricted to superusers, just as we do
for base types. And IMHO those have basically the same issue: you have
to ensure that all the functions and operators that operate on the
type, and any subscripting operations, are on the same page about what
the underlying storage is. This doesn't seem different. It may well
still be a bad idea for other reasons, or just kind of useless, but I
disagree that it's a bad idea for that particular reason.

> Nope, it isn't IMO.  We already added infrastructure to allow
> arbitrary custom types to define subscripting operations.  I think a
> case could be made to allow them to define field selection, too.

That would be cool!

> I take it that you also don't believe that "2 + 3 * 4" should yield
> different results from "(2 + 3) * 4"?

Isaac's rebuttal to this particular point was perfect; I have nothing to add.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread David E. Wheeler
On Apr 26, 2024, at 07:54, Jonathan S. Katz  wrote:

> The Core Team would like to extend our congratulations to Melanie Plageman 
> and Richard Guo, who have accepted invitations to become our newest 
> PostgreSQL committers.
> 
> Please join us in wishing them much success and few reverts!

Great news, congratulations!

Best,

David





Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 9:40 AM Joe Conway  wrote:
> > Can you elaborate on why you think that? I mean, to me, that's almost
> > equivalent to removing autovacuum_vacuum_scale_factor entirely,
> > because only for very small tables will that calculation produce a
> > value lower than 500k.
>
> If I understood Nathan's proposed calc, for small tables you would still
> get (thresh + sf * numtuples). Once that number exceeds the new limit
> parameter, then the latter would kick in. So small tables would retain
> the current behavior and large enough tables would be clamped.

Right. But with a 500k threshold, "large enough" is not very large at
all. The default scale factor is 20%, so the crossover point is at 2.5
million tuples. That's pgbench scale factor 25, which is a 320MB
table.

> It depends on workload to be sure. Just because a table is large, it
> doesn't mean that dead rows are generated that fast.

That is true, as far as it goes.

> Admittedly it has been quite a while since I looked at all this that
> closely, but if A/V runs on some large busy table for a few milliseconds
> once every few minutes, that is far less disruptive than A/V running for
> tens of seconds once every few hours or for minutes ones every few days
> -- or whatever. The key thing to me is the "few milliseconds" runtime.
> The short duration means that no one notices an impact, and the longer
> duration almost guarantees that an impact will be felt.

Sure, I mean, I totally agree with that, but how is a vacuum on a
large table going to run for milliseconds? If it can skip index
vacuuming, sure, then it's quick, because it only needs to scan the
heap pages that are not all-visible. But as soon as index vacuuming is
triggered, it's going to take a while. You can't afford to trigger
that constantly.

Let's compare the current situation to the situation post-patch with a
cap of 500k. Consider a table 1024 times larger than the one I
mentioned above, so pgbench scale factor 25600, size on disk 320GB.
Currently, that table will be vacuumed for bloat when the number of
dead tuples exceeds 20% of the table size, because that's the default
value of autovacuum_vacuum_scale_factor. The table has 2.56 billion
tuples, so that means that we're going to vacuum it when there are
more than 510 million dead tuples. Post-patch, we will vacuum when we
have 500 thousand dead tuples. Suppose a uniform workload that slowly
updates rows in the table. If we were previously autovacuuming the
table once per day (1440 minutes) we're now going to try to vacuum it
almost every minute (1440 minutes / 1024 = 84 seconds).

Unless I'm missing something major, that's completely bonkers. It
might be true that it would be a good idea to vacuum such a table more
often than we do at present, but there's no shot that we want to do it
that much more often. The pgbench_accounts_pkey index will, I believe,
be on the order of 8-10GB at that scale. We can't possibly want to
incur that much extra I/O every minute, and I don't think it's going
to finish in milliseconds, either.
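
To put the arithmetic above in one place, here is a sketch in SQL, on the
assumption that the proposal works out to LEAST(autovacuum_vacuum_threshold
+ autovacuum_vacuum_scale_factor * reltuples, autovacuum_max_threshold) with
the values discussed here (50, 0.2, 500000) -- an illustration of the
thread's formula, not of anything committed:

SELECT reltuples::bigint,
       (50 + 0.2 * reltuples)::bigint              AS current_trigger,
       LEAST(50 + 0.2 * reltuples, 500000)::bigint AS capped_trigger
FROM (VALUES (2.5e6), (2.56e9)) AS t(reltuples);
-- 2.5e6 tuples:  500050 vs 500000  (the crossover point, ~pgbench scale 25)
-- 2.56e9 tuples: 512000050 vs 500000, i.e. roughly 1024x more frequent
-- vacuums, which is where "once per day" vs "every 84 seconds" comes from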

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Pavel Borisov
On Fri, 26 Apr 2024 at 17:48, Nathan Bossart 
wrote:

> On Fri, Apr 26, 2024 at 06:54:26AM -0500, Jonathan S. Katz wrote:
> > The Core Team would like to extend our congratulations to Melanie
> Plageman
> > and Richard Guo, who have accepted invitations to become our newest
> > PostgreSQL committers.
> >
> > Please join us in wishing them much success and few reverts!
>

Congratulations! Well deserved!

Pavel Borisov


Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Nathan Bossart
On Fri, Apr 26, 2024 at 06:54:26AM -0500, Jonathan S. Katz wrote:
> The Core Team would like to extend our congratulations to Melanie Plageman
> and Richard Guo, who have accepted invitations to become our newest
> PostgreSQL committers.
> 
> Please join us in wishing them much success and few reverts!

Congratulations to both!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-04-26 Thread Daniel Gustafsson
> On 21 Apr 2024, at 23:08, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>>> On 6 Apr 2024, at 01:10, Tom Lane  wrote:
>>> (So now I'm wondering why *only* copperhead has shown this so far.
>>> Are our other cross-version-upgrade testing animals AWOL?)
> 
>> Clicking around searching for Xversion animals I didn't spot any, but without
>> access to the database it's nontrivial to know which animal does what.
> 
> I believe I see why this is (or isn't) happening.  The animals
> currently running xversion tests are copperhead, crake, drongo,
> and fairywren.  copperhead is using the makefiles while the others
> are using meson.  And I find this in
> src/test/modules/test_pg_dump/meson.build (from 3f0e786cc):
> 
># doesn't delete its user
>'runningcheck': false,
> 
> So the meson animals are not running the test that sets up the
> problematic data.

ugh =/

> I think we should remove the above, since (a) the reason to have
> it is gone, and (b) it seems really bad that the set of tests
> run by meson is different from that run by the makefiles.

Agreed.

> However, once we do that, those other three animals will presumably go
> red, greatly complicating detection of any Windows-specific problems.
> So I'm inclined to not do it till just before we intend to commit
> a fix for the underlying problem.  (Enough before that we can confirm
> that they do go red.)

Agreed, we definitely want that but compromising the ability to find Windows
issues at this point in the cycle seems bad.

> ... were you going to look at it?  I can take a whack if it's
> too far down your priority list.

I took a look at this, reading code and the linked thread.  My gut feeling is
that Stephen is right in that the underlying bug is these privileges ending up
in pg_init_privs to begin with.  That being said, I wasn't able to fix that in
a way that doesn't seem like a terrible hack.  The attached POC hack fixes it
for me but I'm not sure how to fix it properly. Your wisdom would be much 
appreciated.

Clusters which already have such entries aren't helped by a fix for this though,
fixing that would either require pg_dump to skip them, or pg_upgrade to have a
check along with instructions for fixing the issue.  Not sure what's the best
strategy here, the lack of complaints could indicate this isn't terribly common
so spending cycles on it for every pg_dump might be excessive compared to a
pg_upgrade check?
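
For anyone wanting to see whether an existing cluster carries such entries,
something along these lines lists the extension-origin rows in
pg_init_privs, among which any leftovers from the problem above would show
up (a rough sketch, limited to relations):

SELECT ip.objoid::regclass AS object, ip.privtype, ip.initprivs
FROM pg_init_privs ip
WHERE ip.classoid = 'pg_class'::regclass
  AND ip.privtype = 'e';   -- 'e' = entries recorded for extension objects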

--
Daniel Gustafsson



init_privs.diff
Description: Binary data


Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Aleksander Alekseev
On Fri, Apr 26, 2024 at 2:54 PM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Completely deserved. Congrats!

-- 
Best regards,
Aleksander Alekseev




Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Joe Conway

On 4/26/24 09:31, Robert Haas wrote:

On Fri, Apr 26, 2024 at 9:22 AM Joe Conway  wrote:

Although I don't think 500000 is necessarily too small. In my view,
having autovac run very quickly, even if more frequently, provides an
overall better user experience.


Can you elaborate on why you think that? I mean, to me, that's almost
equivalent to removing autovacuum_vacuum_scale_factor entirely,
because only for very small tables will that calculation produce a
value lower than 500k.


If I understood Nathan's proposed calc, for small tables you would still 
get (thresh + sf * numtuples). Once that number exceeds the new limit 
parameter, then the latter would kick in. So small tables would retain 
the current behavior and large enough tables would be clamped.



We might need to try to figure out some test cases here. My intuition
is that this is going to vacuum large tables insanely aggressively.


It depends on workload to be sure. Just because a table is large, it 
doesn't mean that dead rows are generated that fast.


Admittedly it has been quite a while since I looked at all this that 
closely, but if A/V runs on some large busy table for a few milliseconds 
once every few minutes, that is far less disruptive than A/V running for 
tens of seconds once every few hours or for minutes once every few days 
-- or whatever. The key thing to me is the "few milliseconds" runtime. 
The short duration means that no one notices an impact, and the longer 
duration almost guarantees that an impact will be felt.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 4:43 AM Michael Banck  wrote:
> > I believe that the defaults should work well in moderately sized databases
> > with moderate usage characteristics.  If you have large tables or a high
> > number of transactions per second, you can be expected to make the effort
> > and adjust the settings for your case.  Adding more GUCs makes life *harder*
> > for the users who are trying to understand and configure how autovacuum 
> > works.
>
> Well, I disagree to some degree. I agree that the defaults should work
> well in moderately sized databases with moderate usage characteristics.
> But I also think we can do better than telling DBAs to they have to
> manually fine-tune autovacuum for large tables (and frequently
> implementing by hand what this patch proposes, namely setting
> autovacuum_vacuum_scale_factor to 0 and autovacuum_vacuum_threshold to a
> high number), as this is cumbersome and needs adult supervision that is
> not always available. Of course, it would be great if we just slap some
> AI into the autovacuum launcher that figures things out automagically,
> but I don't think we are there, yet.
>
> So this proposal (probably along with a higher default threshold than
> 500000, but IMO less than what Robert and Nathan suggested) sounds like
> a step forward to me. DBAs can set the threshold lower if they want, or
> maybe we can just turn it off by default if we cannot agree on a sane
> default, but I think this (using the simplified formula from Nathan) is
> a good approach that takes some pain away from autovacuum tuning and
> reserves that for the really difficult cases.

I agree with this. If having an extra setting substantially reduces
the number of cases that require manual tuning, it's totally worth it.
And I think it will.

To be clear, I don't think this is the biggest problem with the
autovacuum algorithm, not by quite a bit. But it's a relatively easy
one to fix.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-26 Thread Pavel Borisov
Hi, Hackers!

On Thu, 25 Apr 2024 at 00:26, Justin Pryzby  wrote:

> On Mon, Apr 22, 2024 at 01:31:48PM +0300, Alexander Korotkov wrote:
> > Hi!
> >
> > On Fri, Apr 19, 2024 at 4:29 PM Alexander Korotkov 
> wrote:
> > > On Fri, Apr 19, 2024 at 2:26 AM Dmitry Koval 
> wrote:
> > > > 18.04.2024 19:00, Alexander Lakhin wrote:
> > > > > leaves a strange constraint:
> > > > > \d+ t*
> > > > >Table "public.tp_0"
> > > > > ...
> > > > > Not-null constraints:
> > > > >  "merge-16385-26BCB0-tmp_i_not_null" NOT NULL "i"
> > > >
> > > > Thanks!
> > > > Attached fix (with test) for this case.
> > > > The patch should be applied after patches
> > > > v6-0001- ... .patch ... v6-0004- ... .patch
> > >
> > > I've incorporated this fix with 0001 patch.
> > >
> > > Also added to the patchset
> > > 005 – tab completion by Dagfinn [1]
> > > 006 – draft fix for table AM issue spotted by Alexander Lakhin [2]
> > > 007 – doc review by Justin [3]
> > >
> > > I'm continuing work on this.
> > >
> > > Links
> > > 1.
> https://www.postgresql.org/message-id/87plumiox2.fsf%40wibble.ilmari.org
> > > 2.
> https://www.postgresql.org/message-id/84ada05b-be5c-473e-6d1c-ebe5dd21b190%40gmail.com
> > > 3.
> https://www.postgresql.org/message-id/ZiGH0xc1lxJ71ZfB%40pryzbyj2023
> >
> > 0001
> > The way we handle name collisions during MERGE PARTITIONS operation is
> > reworked by integration of patch [3].  This makes note about commit in
> > [2] not relevant.
>
> This patch also/already fixes the schema issue I reported.  Thanks.
>
> If you wanted to include a test case for that:
>
> begin;
> CREATE SCHEMA s;
> CREATE SCHEMA t;
> CREATE TABLE p(i int) PARTITION BY RANGE(i);
> CREATE TABLE s.c1 PARTITION OF p FOR VALUES FROM (1)TO(2);
> CREATE TABLE s.c2 PARTITION OF p FOR VALUES FROM (2)TO(3);
> ALTER TABLE p MERGE PARTITIONS (s.c1, s.c2) INTO s.c1; -- misbehaves if
> merging into the same name as an existing partition
> \d+ p
> ...
> Partitions: c1 FOR VALUES FROM (1) TO (3)
>
> > 0002
> > The persistence of the new partition is copied as suggested in [1].
> > But the checks are in-place, because search_path could influence new
> > table persistence.  Per review [2], commit message typos are fixed,
> > documentation is revised, revised tests to cover schema-qualification,
> > usage of search_path.
>
> Subject: [PATCH v8 2/7] Make new partitions with parent's persistence
> during MERGE/SPLIT operations
>
> This patch adds documentation saying:
> +  Any indexes, constraints and user-defined row-level triggers that
> exist
> +  in the parent table are cloned on new partitions [...]
>
> Which is good to say, and addresses part of my message [0]
> [0] ZiJW1g2nbQs9ekwK@pryzbyj2023
>
> But it doesn't have anything to do with "creating new partitions with
> parent's persistence".  Maybe there was a merge conflict and the docs
> ended up in the wrong patch ?
>
> Also, defaults, storage options, compression are also copied.  As will
> be anything else from LIKE.  And since anything added in the future will
> also be copied, maybe it's better to just say that the tables will be
> created the same way as "LIKE .. INCLUDING ALL EXCLUDING ..", or
> similar.  Otherwise, the next person who adds a new option for LIKE
> would have to remember to update this paragraph...
>
> Also, extended stats objects are currently cloned to new child tables.
> But I suggested in [0] that they probably shouldn't be.
>
> > 007 – doc review by Justin [3]
>
> I suggest to drop this patch for now.  I'll send some more minor fixes to
> docs and code comments once the other patches are settled.
>
I've looked at the patchset:

0001 Looks good.
0002 Also good, with the docs modification proposed by Justin.
0003:
Looks like unused code:
5268 datum = cmpval ? list_nth(spec->lowerdatums, abs(cmpval) - 1) : NULL;
overridden by
5278 datum = list_nth(spec->upperdatums, abs(cmpval) - 1);
and
5290 datum = list_nth(spec->upperdatums, abs(cmpval) - 1);

Otherwise - good.

0004:
I suggest also getting rid of three-noun compound words like:
salesperson_name. Maybe salesperson -> clerk? Or maybe use the same terms
like in pgbench: branches, tellers, accounts, balance.

0005: Good
0006: Patch is right
In comments:
+  New partitions will have the same table access method,
+  same column names and types as the partitioned table to which they
belong.
(I'd suggest removing the second "same")

Tests pass. I suppose it would be better to add tests for
SPLIT/MERGE PARTITION(S) similar to those covering ATTACH/DETACH PARTITION (e.g.:
subscription/t/013_partition.pl and regression tests)

Overall, great work! Thanks!

Regards,
Pavel Borisov,
Supabase.


Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 9:22 AM Joe Conway  wrote:
> Although I don't think 500000 is necessarily too small. In my view,
> having autovac run very quickly, even if more frequently, provides an
> overall better user experience.

Can you elaborate on why you think that? I mean, to me, that's almost
equivalent to removing autovacuum_vacuum_scale_factor entirely,
because only for very small tables will that calculation produce a
value lower than 500k.

We might need to try to figure out some test cases here. My intuition
is that this is going to vacuum large tables insanely aggressively.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 10:24 PM Laurenz Albe  wrote:
> I don't find that convincing.  Why are 2TB of wasted space in a 10TB
> table worse than 2TB of wasted space in 100 tables of 100GB each?

It's not worse, but it's more avoidable. No matter what you do, any
table that suffers a reasonable number of updates and/or deletes is
going to have some wasted space. When a tuple is deleted or updated,
the old one has to stick around until its xmax is all-visible, and
then after that until the page is HOT pruned which may not happen
immediately, and then even after that the line pointer sticks around
until the next vacuum which doesn't happen instantly either. No matter
how aggressive you make autovacuum, or even no matter how aggressively
you vacuum manually, non-insert-only tables are always going to end up
containing some bloat.

But how much? Well, it's basically given by
RATE_AT_WHICH_SPACE_IS_WASTED * AVERAGE_TIME_UNTIL_SPACE_IS_RECLAIMED.
Which, you'll note, does not really depend on the table size. It does
a little bit, because the time until a tuple is fully removed,
including the line pointer, depends on how long vacuum takes, and
vacuum takes longer on a big table than on a small one. But the effect is
much less than linear, I believe, because you can HOT-prune as soon as
the xmax is all-visible, which reclaims most of the space instantly.
So in practice, the minimum feasible steady-state bloat for a table
depends a great deal on how fast updates and deletes are happening,
but only weakly on the size of the table.
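
(As a made-up illustration of that formula: a workload that wastes 10MB of
space per minute, with dead space reclaimed on average 30 minutes after it
is created, settles at roughly 10MB/min * 30min = 300MB of steady-state
bloat, whether the table is 10GB or 10TB.)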

Which, in plain English, means that you should be able to vacuum a
10TB table often enough that it doesn't accumulate 2TB of bloat, if
you want to. It's going to be harder to vacuum a 10GB table often
enough that it doesn't accumulate 2GB of bloat. And it's going to be
*really* hard to vacuum a 10MB table often enough that it doesn't
accumulate 2MB of bloat. The only way you're going to be able to do
that last one at all is if the update rate is very low.

> > Another reason, at least in existing releases, is that at some
> > point index vacuuming hits a wall because we run out of space for dead
> > tuples. We *most definitely* want to do index vacuuming before we get
> > to the point where we're going to have to do multiple cycles of index
> > vacuuming.
>
> That is more convincing.  But do we need a GUC for that?  What about
> making a table eligible for autovacuum as soon as the number of dead
> tuples reaches 90% of what you can hold in "autovacuum_work_mem"?

That would have been a good idea to do in existing releases, a long
time before now, but we didn't. However, the new dead TID store
changes the picture, because if I understand John Naylor's remarks
correctly, the new TID store can hold so many TIDs so efficiently that
you basically won't run out of memory. So now I think this wouldn't be
effective - yet I still think it's wrong to let the vacuum threshold
scale without bound as the table size increases.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Joe Conway

On 4/26/24 04:43, Michael Banck wrote:

So this proposal (probably along with a higher default threshold than
500000, but IMO less than what Robert and Nathan suggested) sounds like
a step forward to me. DBAs can set the threshold lower if they want, or
maybe we can just turn it off by default if we cannot agree on a sane
default, but I think this (using the simplified formula from Nathan) is
a good approach that takes some pain away from autovacuum tuning and
reserves that for the really difficult cases.


+1 to the above

Although I don't think 500000 is necessarily too small. In my view, 
having autovac run very quickly, even if more frequently, provides an 
overall better user experience.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Jacob Champion
On Fri, Apr 26, 2024 at 4:54 AM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.

Congratulations!!

--Jacob




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Ashutosh Bapat
Congratulations to Melanie and Richard.

On Fri, Apr 26, 2024 at 5:24 PM Jonathan S. Katz 
wrote:

> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!
>
> Thanks,
>
> Jonathan
>


-- 
Best Wishes,
Ashutosh Bapat


Re: Test to dump and restore objects left behind by regression

2024-04-26 Thread Ashutosh Bapat
On Fri, Feb 23, 2024 at 10:46 AM Ashutosh Bapat <
ashutosh.bapat@gmail.com> wrote:

> On Thu, Feb 22, 2024 at 8:35 PM Tom Lane  wrote:
> >
> > Peter Eisentraut  writes:
> > > The problem is, we don't really have any end-to-end coverage of
> >
> > > dump
> > > restore
> > > dump again
> > > compare the two dumps
> >
> > > with a database with lots of interesting objects in it.
> >
> > I'm very much against adding another full run of the core regression
> > tests to support this.
>
> This will be taken care of by Peter's latest idea of augmenting
> existing 002_pg_upgrade.pl.
>
>
Incorporated the test into 002_pg_upgrade.pl.

Some points for discussion:
1. The test still hardcodes the diffs between two dumps. Haven't found a
better way to do it. I did consider removing the problematic objects from
the regression database but thought against it since we would lose some
coverage.

2. The new code tests dump and restore of just the regression database and
does not use pg_dumpall like pg_upgrade. Should it instead perform
pg_dumpall? I decided against it since a. we are interested in dumping and
restoring objects left behind by regression, b. I didn't find a way to
provide the format option to pg_dumpall. The test could be enhanced to use
different dump formats.

I have added it to the next commitfest.
https://commitfest.postgresql.org/48/4956/

-- 
Best Wishes,
Ashutosh Bapat
From cd1d0d3a2fe5ef6b7659ab710f0287d186ca0051 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 15 Apr 2024 08:20:34 +0200
Subject: [PATCH] pg_dump/restore regression objects

002_pg_upgrade.pl tests pg_upgrade on the regression database left
behind by regression run. Modify it to test pg_dump/restore the
regression database.

Author: Ashutosh Bapat
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2wf61pt+hfquzjbqourz...@mail.gmail.com
---
 src/bin/pg_upgrade/t/002_pg_upgrade.pl | 117 +
 1 file changed, 117 insertions(+)

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 3e67121a8d..e79bd85a2a 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -197,6 +197,7 @@ is( $result,
 my $srcdir = abs_path("../../..");
 
 # Set up the data of the old instance with a dump or pg_regress.
+my $db_from_regress;
 if (defined($ENV{olddump}))
 {
 	# Use the dump specified.
@@ -207,6 +208,7 @@ if (defined($ENV{olddump}))
 	# not exist yet, and we are done here.
 	$oldnode->command_ok([ 'psql', '-X', '-f', $olddumpfile, 'postgres' ],
 		'loaded old dump file');
+	$db_from_regress = 0;
 }
 else
 {
@@ -258,6 +260,7 @@ else
 		}
 	}
 	is($rc, 0, 'regression tests pass');
+	$db_from_regress = 1;
 }
 
 # Initialize a new node for the upgrade.
@@ -510,4 +513,118 @@ if ($compare_res != 0)
 	print "=== EOF ===\n";
 }
 
+# Test normal dump/restore of the objects left behind by regression. Ideally it
+# should be done in a separate test, but doing it here saves us one full
+# regression run.
+if ($db_from_regress)
+{
+	my $dst_node = PostgreSQL::Test::Cluster->new('dst_node');
+	my $dump3_file = "$tempdir/dump3.sql";
+	my $dump4_file = "$tempdir/dump4.sql";
+	my $dump5_file = "$tempdir/dump5.sql";
+
+	$dst_node->init();
+	$oldnode->start;
+
+	# Dump source database for comparison later
+	command_ok(
+		[
+			'pg_dump', '-s', '-d', 'regression',
+			'-h', $oldnode->host,
+			'-p', $oldnode->port,
+			'-f', $dump4_file
+		],
+		'pg_dump on source instance');
+
+	# Dump to be restored
+	command_ok(
+		[
+			'pg_dump', '-Fc', '-d', 'regression',
+			'-h', $oldnode->host,
+			'-p', $oldnode->port,
+			'-f', $dump3_file
+		],
+		'pg_dump on source instance');
+
+	$dst_node->start;
+	$dst_node->command_ok(
+			[ 'createdb', 'regression' ],
+			"created destination database");
+
+	# Restore into destination database
+	command_ok(
+		[
+			'pg_restore', '-d', 'regression',
+			'-h', $dst_node->host,
+			'-p', $dst_node->port,
+			$dump3_file
+		],
+		'pg_restore on destination instance');
+
+	# Dump from destination database for comparison
+	command_ok(
+		[
+			'pg_dump', '-s', '-d', 'regression',
+			'-h', $dst_node->host,
+			'-p', $dst_node->port,
+			'-f', $dump5_file
+		],
+		'pg_dump on destination instance');
+
+	# Compare the two dumps. Ideally there should be no difference in the two
+	# dumps. But the column order in the dumps differs for inheritance
+	# children. Some regression tests purposefully create the child table with
+	# columns in different order than the parent using CREATE TABLE ...
+	# followed by ALTER TABLE ... INHERIT. These child tables are dumped as a
+	# single CREATE TABLE ... INHERITS with column order same as the child.
+	# When the child table is restored using this command, it creates the child
+	# table with same column order as the parent. The restored table is dumped
+	# as CREATE TABLE ... INHERITS but with columns order same as parent. Thus
+	# the column orders differ between 

Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Andrew Dunstan



On 2024-04-26 Fr 07:54, Jonathan S. Katz wrote:
The Core Team would like to extend our congratulations to Melanie 
Plageman and Richard Guo, who have accepted invitations to become our 
newest PostgreSQL committers.


Please join us in wishing them much success and few reverts!




Very happy about both of these. Many congratulations to Melanie and Richard.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-04-26 Thread Robert Haas
On Wed, Apr 24, 2024 at 8:04 PM Michael Paquier  wrote:
> On Wed, Apr 24, 2024 at 12:32:46PM +0300, Anton A. Melnikov wrote:
> > To ensure backward compatibility we can save the old macro like this:
> >
> > #define XLOG_CONTROL_FILE "global/pg_control"
> > #define PG_CONTROL_FILE   XLOG_CONTROL_FILE
> >
> > With the best wishes,
>
> Not sure that I would bother with a second one.  But, well, why not if
> people want to rename it, as long as you keep compatibility.

I vote for just standardizing on XLOG_CONTROL_FILE. That name seems
sufficiently intuitive to me, and I'd rather have one identifier for
this than two. It's simpler that way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Nazir Bilal Yavuz
On Fri, 26 Apr 2024 at 14:54, Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Congratulations to both of you!

-- 
Regards,
Nazir Bilal Yavuz
Microsoft




Re: BitmapHeapScan streaming read user and prelim refactoring

2024-04-26 Thread Melanie Plageman
On Thu, Apr 25, 2024 at 7:57 PM Tom Lane  wrote:
>
> I wrote:
> > Hmm, is that actually true?  There's no more reason to think a tuple
> > in a temp table is old enough to be visible to all other sessions
> > than one in any other table.  It could be all right if we had a
> > special-case rule for setting all-visible in temp tables.  Which
> > indeed I thought we had, but I can't find any evidence of that in
> > vacuumlazy.c, nor did a trawl of the commit log turn up anything
> > promising.  Am I just looking in the wrong place?
>
> Ah, never mind that --- I must be looking in the wrong place.
> Direct experimentation proves that VACUUM will set all-visible bits
> for temp tables even in the presence of concurrent transactions.

If this seems correct to you, are you okay with the rest of the fix
and test? We could close this open item once the patch is acceptable.

- Melanie




Re: partitioning and identity column

2024-04-26 Thread Ashutosh Bapat
Thanks Alexander for the report.

On Fri, Apr 26, 2024 at 5:30 PM Alexander Lakhin 
wrote:

> Hello Ashutosh and Peter,
>
> 16.01.2024 21:59, Peter Eisentraut wrote:
> > On 09.01.24 15:10, Ashutosh Bapat wrote:
> >> Here's complete patch-set.
> >
> > Looks good!  Committed.
> >
>
> Please take a look at a new error case introduced by 699586315:
> CREATE TABLE tbl1 (a int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY)
>PARTITION BY LIST (a);
> CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT;
>
> CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
> ERROR:  no owned sequence found
>

I don't think creating a table like a partition is common or even useful.
Usually it would be created from the partitioned table. But if we consider
that to be a use case, I think the error is expected since a partition
doesn't have its own identity; it shares it with the partitioned table.
Maybe we could give a better message. But I will look into this and fix it
if the solution makes sense.

Do you want to track this in open items?

-- 
Best Wishes,
Ashutosh Bapat


Re: some additional (small) problems with pg_combinebackup and tablespaces

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 2:03 PM Daniel Gustafsson  wrote:
> LGTM, only one small error in the commitmessage: s/Gustaffson/Gustafsson/

Oh no! You're in danger of becoming the second person on this project
whose name I chronically misspell. Fixed (I hope) and committed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Fabrízio de Royes Mello
On Fri, Apr 26, 2024 at 8:54 AM Jonathan S. Katz 
wrote:

> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
>
It is a HUGE step for the community having a woman committer... Congrats to
you both Melanie and Richard!!!

Regards,

-- 
Fabrízio de Royes Mello


Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Amit Langote
On Fri, Apr 26, 2024 at 8:54 PM Jonathan S. Katz  wrote:
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Congratulations to both!

-- 
Thanks, Amit Langote




Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Jesper Pedersen

On 4/26/24 07:54, Jonathan S. Katz wrote:
The Core Team would like to extend our congratulations to Melanie 
Plageman and Richard Guo, who have accepted invitations to become our 
newest PostgreSQL committers.


Please join us in wishing them much success and few reverts!



Congrats !

Best regards,
 Jesper





Re: New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Bharath Rupireddy
On Fri, Apr 26, 2024 at 5:24 PM Jonathan S. Katz  wrote:
>
> The Core Team would like to extend our congratulations to Melanie
> Plageman and Richard Guo, who have accepted invitations to become our
> newest PostgreSQL committers.
>
> Please join us in wishing them much success and few reverts!

Congratulations to both of you!

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Fix parallel vacuum buffer usage reporting

2024-04-26 Thread Alena Rybakina

Hi!


The same script was run, but using vacuum verbose analyze, and I
saw the difference again in the fifth step:
with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
master: buffer usage: 32346 hits, 573 misses, 1360 dirtied

Isn't there a chance for the checkpointer to run during this time? 
That could make the conditions between the two runs slightly different 
and explain the change in buffer report.


[0] 
https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831


Looking at the script, you won't trigger the problem.


Thank you for the link; I took it into account in my next experiments.

I repeated the test with a single index and without checkpoint processing, 
and the buffer usage numbers almost matched:


master branch: buffer usage: 32315 hits, 606 misses, 4486 dirtied

with applied patch v4 version: buffer usage: 32315 hits, 606 misses, 
4489 dirtied


I think you are right - the problem was interference from the checkpoint 
process; by the way, I also checked the first version of the patch. To cut 
a long story short, everything is fine now with one index.


Just in case, I'll explain: I considered this case because your patch 
could have influenced it too.


On 25.04.2024 10:17, Anthonin Bonnefoy wrote:


On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina 
 wrote:


I tested the main postgres branch with and without your fix using
a script that was written by me. It consists of five scenarios and
I made a comparison in the logs between the original version of
the master branch and the master branch with your patch:

 Hi! Thanks for the tests.

I have attached a test file (vacuum_check_logs.sql)

The reporting issue will only happen if there's a parallel index 
vacuum and it will only happen if there's at least 2 indexes [0]. You 
will need to create an additional index.


Speaking of the problem, I added another index and repeated the test and 
found a significant difference:


 * I found it when I committed the transaction (3):

master: 2964 hits, 0 misses, 0 dirtied

with applied patch v4 version: buffer usage: 33013 hits, 0 misses, 3 dirtied

 * When I deleted all the data from the table and later started vacuum
   verbose again (4):

master: buffer usage: 51486 hits, 0 misses, 0 dirtied

with applied patch v4 version: buffer usage: 77924 hits, 0 misses, 0 dirtied

 * When I inserted 1 million rows into the table and updated them (5):

master: buffer usage: 27904 hits, 5021 misses, 1777 dirtied

with applied patch v4 version: buffer usage: 41051 hits, 9973 misses, 
2564 dirtied


As I see, the number of pages is significantly higher than it was in the 
master branch and, frankly, I couldn't fully figure out whether it is a 
mistake or not.


I attached a test script (vacuum_check_logs.sql) with two indexes and 
no checkpoints; I also attached log files: the first one (vacuum_test) 
is the result of testing on the master branch, the second one is with your 
patch applied (vacuum_test_v4).


--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


vacuum_check_logs.sql
Description: application/sql
postgres=# \i ~/vacuum_check_logs.sql 
psql:/home/alena/vacuum_check_logs.sql:2: ERROR:  extension "dblink" does not 
exist
CREATE EXTENSION
 dblink_connect 

 OK
(1 row)

psql:/home/alena/vacuum_check_logs.sql:5: ERROR:  table "vestat" does not exist
SET
SET
CREATE TABLE
CREATE INDEX
CREATE INDEX
psql:/home/alena/vacuum_check_logs.sql:16: INFO:  vacuuming 
"postgres.public.vestat"
psql:/home/alena/vacuum_check_logs.sql:16: INFO:  finished vacuuming 
"postgres.public.vestat": index scans: 0
pages: 0 removed, 0 remain, 0 scanned (100.00% of total)
tuples: 0 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 742, which was 0 XIDs old when operation ended
new relfrozenxid: 742, which is 3 XIDs ahead of previous value
frozen: 0 pages from table (100.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (100.00% of total) had 0 dead item 
identifiers removed
I/O timings: read: 0.049 ms, write: 0.000 ms
avg read rate: 40.584 MB/s, avg write rate: 0.000 MB/s
buffer usage: 23 hits, 2 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 237 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM
 pg_sleep 
--
 
(1 row)

INSERT 0 100
DELETE 10
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  vacuuming 
"postgres.public.vestat"
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  launched 1 parallel vacuum 
worker for index vacuuming (planned: 1)
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  table "vestat": truncated 
3922 to 3530 pages
psql:/home/alena/vacuum_check_logs.sql:25: INFO:  finished vacuuming 
"postgres.public.vestat": index scans: 1
pages: 392 removed, 3530 remain, 3922 scanned (100.00% of total)
tuples: 10 removed, 90 remain, 0 are dead but not yet 

Re: partitioning and identity column

2024-04-26 Thread Alexander Lakhin

Hello Ashutosh and Peter,

16.01.2024 21:59, Peter Eisentraut wrote:

On 09.01.24 15:10, Ashutosh Bapat wrote:

Here's complete patch-set.


Looks good!  Committed.



Please take a look at a new error case introduced by 699586315:
CREATE TABLE tbl1 (a int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY)
  PARTITION BY LIST (a);
CREATE TABLE tbl2 PARTITION OF tbl1 DEFAULT;

CREATE TABLE tbl3 (LIKE tbl2 INCLUDING IDENTITY);
ERROR:  no owned sequence found

Best regards,
Alexander




New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Jonathan S. Katz
The Core Team would like to extend our congratulations to Melanie 
Plageman and Richard Guo, who have accepted invitations to become our 
newest PostgreSQL committers.


Please join us in wishing them much success and few reverts!

Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Fix the description of GUC "max_locks_per_transaction" and "max_pred_locks_per_transaction" in guc_table.c

2024-04-26 Thread Michael Banck
Hi,

On Fri, Apr 07, 2023 at 01:32:22PM -0400, Tom Lane wrote:
> "wangw.f...@fujitsu.com"  writes:
> > On Tues, Apr 4, 2023 at 23:48 PM Tom Lane  wrote:
> >> I like the "per eligible process" wording, at least for guc_tables.c;
> >> or maybe it could be "per server process"?  That would be more
> >> accurate and not much longer than what we have now.
> 
> > Thanks both for sharing your opinions.
> > I agree that verbose descriptions make maintenance difficult.
> > For consistency, I unified the formulas in guc_tables.c and pg-doc into the 
> > same
> > suggested short formula. Attach the new patch.
> 
> After studying this for awhile, I decided "server process" is probably
> the better term --- people will have some idea what that means, while
> "eligible process" is not a term we use anywhere else.  Pushed with
> that change and some minor other wordsmithing.

I stumbled upon this change while looking at the documentation, searching
for guidance on what max_locks_per_transaction should be set to (or
rather, for a pointer that max_locks_per_transaction is not actually
"per transaction", but a shared pool of roughly
max_locks_per_transaction * max_connections).

While I agree that the exact formula is too verbose, I find the current
wording ("per server process or prepared transaction") to be misleading;
I can see how somebody sees that as a dynamic limit based on the current
number of running server processes or prepared transactions, not
something that is allocated at server start based on some hardcoded
GUCs.

I don't have a good alternative wording for now, but I wanted to point
out that currently the wording does not seem to imply
max_{connections,prepared_transactions} being at play at all. Probably
the GUC description cannot be made much clearer without making it too
verbose, but I think the description in config.sgml has more leeway to
get a mention of max_connections back.


Michael




Re: Extend ALTER DEFAULT PRIVILEGES for large objects

2024-04-26 Thread Matthias van de Meent
On Fri, 26 Apr 2024 at 10:54, Yugo NAGATA  wrote:
>
> On Wed, 24 Apr 2024 16:08:39 -0500
> Nathan Bossart  wrote:
>
> > On Tue, Apr 23, 2024 at 11:47:38PM -0400, Tom Lane wrote:
> > > On the whole I find this proposed feature pretty unexciting
> > > and dubiously worthy of the implementation/maintenance effort.
> >
> > I don't have any particularly strong feelings on $SUBJECT, but I'll admit
> > I'd be much more interested in resolving any remaining reasons folks are
> > using large objects over TOAST.  I see a couple of reasons listed in the
> > docs [0] that might be worth examining.
> >
> > [0] https://www.postgresql.org/docs/devel/lo-intro.html
>
> If we could replace large objects with BYTEA in all use cases, large objects
> would be completely obsolete. However, some users do in fact use large objects
> today, so improving this feature seems beneficial for them.
>
>
> Apart from that, extending TOAST to support more than 1GB of data and
> stream-style access seems a good challenge. I don't know if there was a
> proposal for this in the past. This is just a thought: for this purpose, we
> will need a new type of varlena that can contain large size information,
> and a new toast table schema that can store offset information, or some way
> to convert an offset to chunk_seq.

If you're interested in this, you may want to check out [0] and [1] as
threads on the topic of improving TOAST handling of large values ([1]
being a thread where the limitations of our current external TOAST
pointer became clear once more), and maybe talk with Aleksander
Alekseev and Nikita Malakhov. They've been working closely with
systems that involve toast pointers and their limitations.

The most recent update on the work of Nikita (reworking TOAST
handling) [2] is that he got started adapting their externally
pluggable toast into type-internal methods only, though I've not yet
noticed any updated patches appear on the list.

As for other issues with creating larger TOAST values:
TOAST has a value limit of ~1GB, which means a single large value (or
two, for that matter) won't break anything in the wire protocol, as
DataRow messages have a message size field of uint32 [^3]. However, if
we're going to allow even larger values to be stored in a table's
attributes, we'll have to figure out how we're going to transfer those
larger values to (and from) clients. For large objects, this is much
less of an issue because the IO operations are already chunked by
design, but this may not work well for types that you want to use in
your table's columns.
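
As a purely illustrative sketch of that chunked access pattern (the OID and the
descriptor are simply whatever lo_create and lo_open return here; 131072 is the
INV_WRITE flag):

BEGIN;
SELECT lo_create(0);                       -- server assigns an OID, say 16402
SELECT lo_open(16402, 131072);             -- open for writing, returns a descriptor, say 0
SELECT lowrite(0, 'first chunk'::bytea);   -- the value is streamed piece by piece...
SELECT lowrite(0, 'second chunk'::bytea);  -- ...rather than shipped as one huge DataRow
SELECT lo_close(0);
COMMIT;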

Kind regards,

Matthias van de Meent

[0] 
https://postgr.es/m/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com
[1] 
https://postgr.es/m/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com
[2] 
https://postgr.es/m/CAN-LCVOgMrda9hOdzGkCMdwY6dH0JQa13QvPsqUwY57TEn6jww%40mail.gmail.com

[^3] Most, if not all PostgreSQL wire protocol messages have this
uint32 message size field, but the DataRow one is relevant here as
it's the one way users get their data out of the database.




Re: Extend ALTER DEFAULT PRIVILEGES for large objects

2024-04-26 Thread Yugo NAGATA
On Wed, 24 Apr 2024 16:08:39 -0500
Nathan Bossart  wrote:

> On Tue, Apr 23, 2024 at 11:47:38PM -0400, Tom Lane wrote:
> > On the whole I find this proposed feature pretty unexciting
> > and dubiously worthy of the implementation/maintenance effort.
> 
> I don't have any particularly strong feelings on $SUBJECT, but I'll admit
> I'd be much more interested in resolving any remaining reasons folks are
> using large objects over TOAST.  I see a couple of reasons listed in the
> docs [0] that might be worth examining.
> 
> [0] https://www.postgresql.org/docs/devel/lo-intro.html

If we could replace large objects with BYTEA in all use cases, large objects
would be completely obsolete. However, some users do in fact use large objects
today, so improving this feature seems beneficial for them.


Apart from that, extending TOAST to support more than 1GB of data and
stream-style access seems a good challenge. I don't know if there was a
proposal for this in the past. This is just a thought: for this purpose, we
will need a new type of varlena that can contain large size information,
and a new toast table schema that can store offset information, or some way
to convert an offset to chunk_seq.
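
Purely as a conceptual sketch of the kind of layout described above (not an
actual proposal), such a chunk table might look something like this, with an
extra offset column so that a byte offset can be mapped to a chunk_seq:

CREATE TABLE toast_bigval_sketch (
    chunk_id   oid,     -- identifies the toasted value
    chunk_seq  int,     -- ordinal position of the chunk within the value
    chunk_off  bigint,  -- byte offset of the chunk, for stream-style seeks
    chunk_data bytea
);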

Regards,
Yugo Nagata

> -- 
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com


-- 
Yugo NAGATA 




Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Michael Banck
Hi,

On Fri, Apr 26, 2024 at 10:18:00AM +0200, Laurenz Albe wrote:
> On Fri, 2024-04-26 at 09:35 +0200, Frédéric Yhuel wrote:
> > On 26/04/2024 at 04:24, Laurenz Albe wrote:
> > > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:
> > > > I believe that the underlying problem here can be summarized in this
> > > > way: just because I'm OK with 2MB of bloat in my 10MB table doesn't
> > > > mean that I'm OK with 2TB of bloat in my 10TB table. One reason for
> > > > this is simply that I can afford to waste 2MB much more easily than I
> > > > can afford to waste 2TB -- and that applies both on disk and in
> > > > memory.
> > > 
> > > I don't find that convincing.  Why are 2TB of wasted space in a 10TB
> > > table worse than 2TB of wasted space in 100 tables of 100GB each?
> > 
> > Good point, but another way of summarizing the problem would be that the 
> > autovacuum_*_scale_factor parameters work well as long as we have a more 
> > or less evenly distributed access pattern in the table.
> > 
> > Suppose my very large table gets updated only for its 1% most recent 
> > rows. We probably want to decrease autovacuum_analyze_scale_factor and 
> > autovacuum_vacuum_scale_factor for this one.
> > 
> > Partitioning would be a good solution, but IMHO postgres should be able 
> > to handle this case anyway, ideally without per-table configuration.
> 
> I agree that you may well want autovacuum and autoanalyze to treat your large
> table differently from your small tables.
> 
> But I am reluctant to accept even more autovacuum GUCs.  It's not like
> we don't have enough of them, rather the opposite.  You can slap on more
> GUCs to treat more special cases, but we will never reach the goal of
> having a default that will make everybody happy.
> 
> I believe that the defaults should work well in moderately sized databases
> with moderate usage characteristics.  If you have large tables or a high
> number of transactions per second, you can be expected to make the effort
> and adjust the settings for your case.  Adding more GUCs makes life *harder*
> for the users who are trying to understand and configure how autovacuum works.

Well, I disagree to some degree. I agree that the defaults should work
well in moderately sized databases with moderate usage characteristics.
But I also think we can do better than telling DBAs that they have to
manually fine-tune autovacuum for large tables (frequently implementing
by hand what this patch proposes, namely setting
autovacuum_vacuum_scale_factor to 0 and autovacuum_vacuum_threshold to a
high number), as this is cumbersome and needs adult supervision that is
not always available. Of course, it would be great if we could just slap
some AI into the autovacuum launcher that figures things out
automagically, but I don't think we are there yet.
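
The manual per-table tuning mentioned above looks roughly like this (the table
name and threshold value are only placeholders):

ALTER TABLE big_history SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold = 1000000
);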

So this proposal (probably along with a higher default threshold than
50, but IMO less than what Robert and Nathan suggested) sounds like
a step forward to me. DBAs can set the threshold lower if they want, or
maybe we can just turn it off by default if we cannot agree on a sane
default, but I think this (using the simplified formula from Nathan) is
a good approach that takes some pain away from autovacuum tuning and
reserves manual tuning for the really difficult cases.


Michael




Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Laurenz Albe
On Fri, 2024-04-26 at 09:35 +0200, Frédéric Yhuel wrote:
> 
> On 26/04/2024 at 04:24, Laurenz Albe wrote:
> > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:
> > > I believe that the underlying problem here can be summarized in this
> > > way: just because I'm OK with 2MB of bloat in my 10MB table doesn't
> > > mean that I'm OK with 2TB of bloat in my 10TB table. One reason for
> > > this is simply that I can afford to waste 2MB much more easily than I
> > > can afford to waste 2TB -- and that applies both on disk and in
> > > memory.
> > 
> > I don't find that convincing.  Why are 2TB of wasted space in a 10TB
> > table worse than 2TB of wasted space in 100 tables of 100GB each?
> 
> Good point, but another way of summarizing the problem would be that the 
> autovacuum_*_scale_factor parameters work well as long as we have a more 
> or less evenly distributed access pattern in the table.
> 
> Suppose my very large table gets updated only for its 1% most recent 
> rows. We probably want to decrease autovacuum_analyze_scale_factor and 
> autovacuum_vacuum_scale_factor for this one.
> 
> Partitioning would be a good solution, but IMHO postgres should be able 
> to handle this case anyway, ideally without per-table configuration.

I agree that you may well want autovacuum and autoanalyze to treat your large
table differently from your small tables.

But I am reluctant to accept even more autovacuum GUCs.  It's not like
we don't have enough of them, rather the opposite.  You can slap on more
GUCs to treat more special cases, but we will never reach the goal of
having a default that will make everybody happy.

I believe that the defaults should work well in moderately sized databases
with moderate usage characteristics.  If you have large tables or a high
number of transactions per second, you can be expected to make the effort
and adjust the settings for your case.  Adding more GUCs makes life *harder*
for the users who are trying to understand and configure how autovacuum works.

Yours,
Laurenz Albe




Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Frédéric Yhuel




On 25/04/2024 at 22:21, Robert Haas wrote:

The analyze case, I feel, is really murky.
autovacuum_analyze_scale_factor stands for the proposition that as the
table becomes larger, analyze doesn't need to be done as often. If
what you're concerned about is the frequency estimates, that's true:
an injection of a million new rows can shift frequencies dramatically
in a small table, but the effect is blunted in a large one. But a lot
of the cases I've seen have involved the histogram boundaries. If
you're inserting data into a table in increasing order, every new
million rows shifts the boundary of the last histogram bucket by the
same amount. You either need those rows included in the histogram to
get good query plans, or you don't. If you do, the frequency with
which you need to analyze does not change as the table grows. If you
don't, then it probably does. But the answer doesn't really depend on
how big the table is already, but on your workload. So it's unclear to
me that the proposed parameter is the right idea here at all. It's
also unclear to me that the existing system is the right idea. 


This is very interesting. And what about ndistinct? I believe it could 
be problematic, too, in some (admittedly rare or pathological) cases.


For example, suppose that the actual number of distinct values grows
from 1000 to 20 after a batch of insertions, for a particular
column. OK, in such a case, the default analyze sampling isn't large
enough to compute an ndistinct close enough to reality anyway. But
without any analyze at all, it can lead to very bad planning - think of
a Nested Loop with a parallel seq scan for the outer table instead of a
simple efficient index scan, because the index scan of the inner table
is overestimated (both each index scan's cost and the number of rows returned).
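
For reference, one stopgap available today (sketched with placeholder names and
a placeholder value) is to pin the estimate by hand, so planning no longer
depends on a badly stale sample:

ALTER TABLE big_table ALTER COLUMN some_col SET (n_distinct = 200000);
ANALYZE big_table;  -- the pinned value now overrides the sampled estimate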





Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Michael Banck
Hi,

On Fri, Apr 26, 2024 at 04:24:45AM +0200, Laurenz Albe wrote:
> On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:
> > Another reason, at least in existing releases, is that at some
> > point index vacuuming hits a wall because we run out of space for dead
> > tuples. We *most definitely* want to do index vacuuming before we get
> > to the point where we're going to have to do multiple cycles of index
> > vacuuming.
> 
> That is more convincing.  But do we need a GUC for that?  What about
> making a table eligible for autovacuum as soon as the number of dead
> tuples reaches 90% of what you can hold in "autovacuum_work_mem"?

Due to the improvements in v17, this would basically never trigger
according to my understanding, or at least only after an excessive
amount of bloat has been accumulated.


Michael




RE: Improving the latch handling between logical replication launcher and worker processes.

2024-04-26 Thread Zhijie Hou (Fujitsu)
On Thursday, April 25, 2024 4:59 PM vignesh C  wrote:
> 
> Hi,
> 
> Currently the launcher's latch is used for the following: a) worker process
> attach, b) worker process exit, and c) subscription creation.
> Since this same latch is used for multiple cases, the launcher process is not
> able to handle concurrent scenarios such as: a) the launcher has started a new
> apply worker and is waiting for it to attach, while b) CREATE SUBSCRIPTION sub2
> sends the launcher a wakeup signal. In this scenario, both of them set the
> launcher's latch, but the launcher cannot tell that both events have occurred:
> 1) a worker has attached, and 2) a subscription has been created and an apply
> worker should be started for it. As a result, the apply worker for the newly
> created subscription is not started immediately, but only after the
> 180-second timeout.
> 
> I have started a new thread for this based on suggestions at [1].
> 
> a) Introduce a new latch to handle worker attach and exit.

I found that the startup process also uses two latches (see recoveryWakeupLatch)
for different purposes, so maybe this is OK. But note that both the logical
launcher and the apply worker call logicalrep_worker_launch(); if we only add
one new latch, both will wait on the same new latch, and the launcher may
consume a SetLatch that should have been consumed by the apply worker
(although that's rare).

> b) Add a new GUC launcher_retry_time which gives more flexibility to users, as
> suggested by Amit at [1]. Before 5a3a953, wal_retrieve_retry_interval played
> a similar role to the suggested new GUC launcher_retry_time, e.g. even if a
> worker was launched, the launcher only waited wal_retrieve_retry_interval
> before the next round.

IIUC, the issue does not happen frequently, and may not be noticeable where
apply workers aren't restarted often. So I am slightly unsure whether it's
worth adding a new GUC.

> c) Don't reset the latch at worker attach and allow launcher main to identify 
> and
> handle it. For this there is a patch v6-0002 available at [2].

This seems simple. I found we are doing something similar in
RegisterSyncRequest() and WalSummarizerMain().

Best Regards,
Hou zj


Re: Improve the connection failure error messages

2024-04-26 Thread Daniel Gustafsson
> On 22 Mar 2024, at 11:42, Nisha Moond  wrote:

> Here is the v4 patch with changes required in slotfuncs.c and slotsync.c 
> files.

-   errmsg("could not connect to the primary server: %s", err));
+   errmsg("\"%s\" could not connect to the primary server: %s", 
app_name.data, err));

Messages like this should perhaps have translator comments to indicate what the
leading "%s" will contain?

--
Daniel Gustafsson





Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Frédéric Yhuel




On 26/04/2024 at 04:24, Laurenz Albe wrote:

On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:

I believe that the underlying problem here can be summarized in this
way: just because I'm OK with 2MB of bloat in my 10MB table doesn't
mean that I'm OK with 2TB of bloat in my 10TB table. One reason for
this is simply that I can afford to waste 2MB much more easily than I
can afford to waste 2TB -- and that applies both on disk and in
memory.


I don't find that convincing.  Why are 2TB of wasted space in a 10TB
table worse than 2TB of wasted space in 100 tables of 100GB each?



Good point, but another way of summarizing the problem would be that the 
autovacuum_*_scale_factor parameters work well as long as we have a more 
or less evenly distributed access pattern in the table.


Suppose my very large table gets updated only for its 1% most recent 
rows. We probably want to decrease autovacuum_analyze_scale_factor and 
autovacuum_vacuum_scale_factor for this one.


Partitioning would be a good solution, but IMHO postgres should be able 
to handle this case anyway, ideally without per-table configuration.





Re: Short-circuit sort_inner_and_outer if there are no mergejoin clauses

2024-04-26 Thread Richard Guo
On Thu, Apr 25, 2024 at 7:25 PM Ashutosh Bapat 
wrote:

> Quickly looking at the function, the patch would make it more apparent
> that the function is a noop when mergeclause_list is empty.
>

Thanks for looking at this patch.  Yes, that's what it does.


> I haven't looked closely to see if creating unique path nonetheless is
> useful somewhere else.
>

It seems that one of the side effects of create_unique_path is that it
caches the generated unique path so that we can avoid creating it
repeatedly for the same rel.  But this does not seem to justify calling
create_unique_path when we know it is unnecessary.

> Please add to the next commitfest.
>

Done.


> If the patch shows some measurable performance improvement, it would
> become more attractive.
>

I doubt that there is measurable performance improvement.  But I found
that throughout the run of the regression tests, sort_inner_and_outer is
called a total of 44,424 times.  Among these calls, there are 11,064
instances where mergeclause_list is found to be empty.  This accounts
for roughly a quarter of the calls.  I think this suggests that the
shortcut the patch adds is worthwhile.

Thanks
Richard


Re: Row pattern recognition

2024-04-26 Thread Tatsuo Ishii
> On Tue, Apr 23, 2024 at 8:13 PM Tatsuo Ishii  wrote:
>> SELECT v.a, count(*) OVER w
>> FROM (VALUES ('A'),('B'),('B'),('C')) AS v (a)
>> WINDOW w AS (
>>   ORDER BY v.a
>>   ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
>>   PATTERN (B+)
>>   DEFINE B AS a = 'B'
>> )
>>  a | count
>> ---+---
>>  A | 0
>>  B | 2
>>  B |
>>  C | 0
>> (4 rows)
>>
>> Here row 3 is skipped because the pattern B matches rows 2 and 3. In
>> this case I think count(*) should return 0 rather than NULL for row 3.
> 
> I think returning zero would match Vik's explanation upthread [1],
> yes. Unfortunately I don't have a spec handy to double-check for
> myself right now.

Ok. I believe you and Vik are correct.
I am modifying the patch in this direction.
Attached is the regression diff after modifying the patch.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff -U3 
/usr/local/src/pgsql/current/postgresql/src/test/regress/expected/rpr.out 
/usr/local/src/pgsql/current/postgresql/src/test/regress/results/rpr.out
--- /usr/local/src/pgsql/current/postgresql/src/test/regress/expected/rpr.out   
2024-04-24 11:30:27.710523139 +0900
+++ /usr/local/src/pgsql/current/postgresql/src/test/regress/results/rpr.out
2024-04-26 14:39:03.543759205 +0900
@@ -181,8 +181,8 @@
  company1 | 07-01-2023 |   100 | 0
  company1 | 07-02-2023 |   200 | 0
  company1 | 07-03-2023 |   150 | 3
- company1 | 07-04-2023 |   140 |  
- company1 | 07-05-2023 |   150 |  
+ company1 | 07-04-2023 |   140 | 0
+ company1 | 07-05-2023 |   150 | 0
  company1 | 07-06-2023 |90 | 0
  company1 | 07-07-2023 |   110 | 0
  company1 | 07-08-2023 |   130 | 0
@@ -556,24 +556,24 @@
  company  |   tdate| price | first_value | last_value | count 
 --++---+-++---
  company1 | 07-01-2023 |   100 | 07-01-2023  | 07-03-2023 | 3
- company1 | 07-02-2023 |   200 | ||  
- company1 | 07-03-2023 |   150 | ||  
+ company1 | 07-02-2023 |   200 | || 0
+ company1 | 07-03-2023 |   150 | || 0
  company1 | 07-04-2023 |   140 | 07-04-2023  | 07-06-2023 | 3
- company1 | 07-05-2023 |   150 | ||  
- company1 | 07-06-2023 |90 | ||  
+ company1 | 07-05-2023 |   150 | || 0
+ company1 | 07-06-2023 |90 | || 0
  company1 | 07-07-2023 |   110 | 07-07-2023  | 07-09-2023 | 3
- company1 | 07-08-2023 |   130 | ||  
- company1 | 07-09-2023 |   120 | ||  
+ company1 | 07-08-2023 |   130 | || 0
+ company1 | 07-09-2023 |   120 | || 0
  company1 | 07-10-2023 |   130 | || 0
  company2 | 07-01-2023 |50 | 07-01-2023  | 07-03-2023 | 3
- company2 | 07-02-2023 |  2000 | ||  
- company2 | 07-03-2023 |  1500 | ||  
+ company2 | 07-02-2023 |  2000 | || 0
+ company2 | 07-03-2023 |  1500 | || 0
  company2 | 07-04-2023 |  1400 | 07-04-2023  | 07-06-2023 | 3
- company2 | 07-05-2023 |  1500 | ||  
- company2 | 07-06-2023 |60 | ||  
+ company2 | 07-05-2023 |  1500 | || 0
+ company2 | 07-06-2023 |60 | || 0
  company2 | 07-07-2023 |  1100 | 07-07-2023  | 07-09-2023 | 3
- company2 | 07-08-2023 |  1300 | ||  
- company2 | 07-09-2023 |  1200 | ||  
+ company2 | 07-08-2023 |  1300 | || 0
+ company2 | 07-09-2023 |  1200 | || 0
  company2 | 07-10-2023 |  1300 | || 0
 (20 rows)
 
@@ -604,24 +604,24 @@
  company  |   tdate| price | first_value | last_value | max  | min | sum  
|  avg  | count 
 
--++---+-++--+-+--+---+---
  company1 | 07-01-2023 |   100 | 100 |140 |  200 | 100 |  590 
|  147.5000 | 4
- company1 | 07-02-2023 |   200 | ||  | |  
|   |  
- company1 | 07-03-2023 |   150 | ||  | |  
|   |  
- company1 | 07-04-2023 |   140 | ||  | |  
|   |  
+ company1 | 07-02-2023 |   200 | ||  | |  
|   | 0
+ company1 | 07-03-2023 |   150 | ||  | |  
|   | 0
+ company1 |