Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jelte Fennema-Nio
On Fri, 29 Dec 2023 at 19:38, Tom Lane  wrote:
> > 2. It paves the way for GUCs that can only be set using a protocol
> > message (and not using SET).
>
> This is assuming facts not in evidence.

How about, instead of talking about protocol-only GUCs (which are
indeed currently a theoretical concept), we start considering this
patch as a way to modify parameters for protocol extensions? Protocol
extension parameters (starting with _pq_.) can currently only be
configured in the StartupMessage, and afterwards there is no way to
modify them.

I believe [1], [2] and [3] from my initial email could all use such
protocol extension parameters, if those parameters could be changed
after the initial startup.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jelte Fennema-Nio
On Fri, 29 Dec 2023 at 19:38, Tom Lane  wrote:
>> Jelte Fennema-Nio  writes:
> > 1. Protocol messages are much easier to inspect for connection poolers
> > than queries
>
> Unless you somehow forbid clients from setting GUCs in the old way,
> exactly how will that help a pooler?

A co-operating client could choose to only use this new message type
to set GUCs such as search_path or pgbouncer.sharding_key. That's
already a big improvement over the status quo, where PgBouncer (and
most other poolers) only learn about GUC changes by observing the
ParameterStatus responses from Postgres. At that point it is obviously
too late to make any routing decisions, because to get the
ParameterStatus back the pooler already needs to have forwarded the
SET command to an actual Postgres server. The same holds for any
setting changes that are specific to the pooler and that Postgres
doesn't even know about, such as pgbouncer.pool_mode.

> > 3. Being able to change GUCs while in an aborted transaction.
>
> I'm really dubious that that's sane.  How will it interact with, for
> example, changes to the same GUC done in the now-failed transaction?
> Or for that matter, changes that happen later in the current
> transaction?  It seems like you can't even think about this unless
> you deem GUC changes made this way to be non-transactional, which
> seems very wart-y and full of consistency problems.

I think that's a fair criticism of the current patch. I do think it's
quite useful for drivers/poolers not to have to worry about their
changes to protocol settings being rolled back unexpectedly because
they made the change while the client was doing a transaction.
Particularly because it's easy for poolers to detect when a Sync is
sent without parsing queries, but not when a BEGIN is sent (PgBouncer
uses the ReadyForQuery response from the server to detect if a
transaction is open or not). But I agree that this behaviour is
fraught with problems for any non-protocol level settings.
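
To make the rollback hazard concrete, here's a minimal example of the
current behaviour (using application_name, since that's a GUC_REPORT
setting poolers typically track):

BEGIN;
SET application_name = 'batch_job';   -- pooler updates its tracked value
SELECT 1/0;                           -- transaction is now aborted
ROLLBACK;                             -- the SET is undone, the value reverts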

I feel like a reasonable solution would be to make this ParameterSet
message transactional after all, but explicitly define the relevant
protocol-only GUCs to be non-transactional.

> I agree that GUC_LIST_QUOTE is a mess, but "I'm too lazy to deal
> with that" seems like a rather poor argument for instead expending
> the amount of labor involved in a protocol change.

Fair enough; honestly, this is more of a bonus benefit to me.

On Fri, 29 Dec 2023 at 20:36, Tom Lane  wrote:
> Yeah, there's definitely an issue about what level of the client-side
> software ought to be able to set such parameters.  I'm not sure that
> "only the lowest level" is the correct answer though.  As an example,
> libpq doesn't especially care what encoding it's dealing with, nor
> (AFAIR) whether COPY data is text or binary.  The calling application
> probably cares, but then we end up needing a bunch of new plumbing to
> pass the settings through.  That's not really providing a lot of
> value-add IMO.

The value-add that I think it provides is forcing the user to go
through a well-defined interface when requesting behavioral changes to
the protocol. A client/pooler probably still wants to allow a user to
change these protocol-level settings, but it wants to exert some
control over when that happens. Clients could do so by exposing a thin
wrapper around PQparameterSet and recording that they should now parse
responses from Postgres differently. And poolers could do this by
understanding the message and taking a relevant action (such as
enabling/disabling compression on outgoing messages). With a
well-defined interface, clients and poolers know when these
protocol-related settings are being changed, and can possibly even
slightly change the value before sending it to the server (e.g. by
adding a few extra GUCs to the list of GUCs that should be GUC_REPORTed
because PgBouncer wants to track them).

Achieving the same without this new ParameterSet message and
PQparameterSet might seem possible too, but then all clients and
poolers would need to implement a partial (and probably buggy) SQL
parser to check if the query that is being sent is "SET
binary_protocol = ...". And even this would not really be enough,
since a function would be able to change the relevant GUC without the
client ever sending SET ...

So, to summarize: every piece of software in between the user writing
queries and Postgres sending responses has some expectations and
requirements about what things look like at the protocol level.
Requiring those intermediaries to look at the application layer (i.e.
the actual queries) to understand what to expect at the protocol layer
is a layering violation. Thus, having a generic way to change
protocol-level configuration options at the actual protocol level is
needed to ensure layer separation.




Re: [17] CREATE SUBSCRIPTION ... SERVER

2023-12-29 Thread Jeff Davis
On Tue, 2023-09-05 at 12:08 -0700, Jeff Davis wrote:
> OK, so we could have a built-in FDW called pg_connection that would
> do
> the right kinds of validation; and then also allow other FDWs but the
> subscription would have to do its own validation.

Attached is a rough rebased version implementing the above, with a
built-in pg_connection_fdw foreign data wrapper.

Regards,
Jeff Davis

From 776cd8e5e1541c56b1767aa595fc609fdeffa5e3 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Wed, 23 Aug 2023 10:31:16 -0700
Subject: [PATCH v3] CREATE SUBSCRIPTION ... SERVER.

---
 doc/src/sgml/ref/alter_subscription.sgml  |   7 +-
 doc/src/sgml/ref/create_subscription.sgml |  15 +-
 doc/src/sgml/user-manag.sgml  |  21 +-
 src/backend/catalog/Makefile  |   1 +
 src/backend/catalog/pg_subscription.c |  17 +-
 src/backend/catalog/system_functions.sql  |   2 +
 src/backend/commands/subscriptioncmds.c   | 197 ++--
 src/backend/foreign/foreign.c | 214 ++
 src/backend/parser/gram.y |  20 ++
 src/backend/replication/logical/worker.c  |  12 +-
 src/bin/pg_dump/pg_dump.c |  59 -
 src/bin/pg_dump/pg_dump.h |   1 +
 src/bin/psql/describe.c   |   2 +-
 src/bin/psql/tab-complete.c   |   2 +-
 src/include/catalog/meson.build   |   1 +
 src/include/catalog/pg_authid.dat |   5 +
 .../catalog/pg_foreign_data_wrapper.dat   |  22 ++
 src/include/catalog/pg_proc.dat   |   8 +
 src/include/catalog/pg_subscription.h |   5 +-
 src/include/foreign/foreign.h |   1 +
 src/include/nodes/parsenodes.h|   3 +
 src/test/regress/expected/foreign_data.out|  60 -
 src/test/regress/expected/subscription.out|  38 
 src/test/regress/sql/foreign_data.sql |  41 +++-
 src/test/regress/sql/subscription.sql |  39 
 src/test/subscription/t/001_rep_changes.pl|  57 +
 26 files changed, 799 insertions(+), 51 deletions(-)
 create mode 100644 src/include/catalog/pg_foreign_data_wrapper.dat

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 6d36ff0dc9..f2235061bb 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -21,6 +21,7 @@ PostgreSQL documentation
 
  
 
+ALTER SUBSCRIPTION name SERVER servername
 ALTER SUBSCRIPTION name CONNECTION 'conninfo'
 ALTER SUBSCRIPTION name SET PUBLICATION publication_name [, ...] [ WITH ( publication_option [= value] [, ... ] ) ]
 ALTER SUBSCRIPTION name ADD PUBLICATION publication_name [, ...] [ WITH ( publication_option [= value] [, ... ] ) ]
@@ -98,9 +99,9 @@ ALTER SUBSCRIPTION name RENAME TO <
 CONNECTION 'conninfo'
 
  
-  This clause replaces the connection string originally set by
-  .  See there for more
-  information.
+  This clause replaces the foreign server or connection string originally
+  set by  with the connection
+  string conninfo.
  
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f1c20b3a46..cd76b2e32d 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
  
 
 CREATE SUBSCRIPTION subscription_name
-CONNECTION 'conninfo'
+{ SERVER servername | CONNECTION 'conninfo' }
 PUBLICATION publication_name [, ...]
 [ WITH ( subscription_parameter [= value] [, ... ] ) ]
 
@@ -77,6 +77,15 @@ CREATE SUBSCRIPTION subscription_name

 
+   
+SERVER servername
+
+ 
+  A foreign server to use for the connection.
+ 
+
+   
+

 CONNECTION 'conninfo'
 
@@ -363,6 +372,10 @@ CREATE SUBSCRIPTION subscription_name
   true. Only superusers can set
   this value to false.
  
+ 
+  Only allowed when CONNECTION is
+  specified. Otherwise, see .
+ 
 

 
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index 92a299d2d3..d63a33a4b3 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -687,11 +687,20 @@ DROP ROLE doomed_role;
Allow use of connection slots reserved via
.
   
+  
+   pg_create_connection
+   Allow users with CREATE permission on the
+   database to issue CREATE
+   SERVER if FOR CONNECTION ONLY is
+   specified.
+  
   
pg_create_subscription
Allow users with CREATE permission on the
-   database to issue
-   CREATE SUBSCRIPTION.
+   database to issue CREATE
+   SUBSCRIPTION.  This role is a member of
+   pg_create_connection.
   
  
 
@@ -737,6 +746,14 @@ DROP ROLE doomed_role;
   great care should be taken when granting these roles to 

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jelte Fennema-Nio
On Fri, 29 Dec 2023 at 19:32, Jeff Davis  wrote:
> On Fri, 2023-12-29 at 18:29 +0100, Jelte Fennema-Nio wrote:
> > There's definitely still some more work that needs to be done
> > (docs for new libpq APIs, protocol version bump, working protocol
> > version negotiation).
>
> That is my biggest concern right now: what will new clients connecting
> to old servers do?
>
> If the version is bumped, should we look around for other unrelated
> protocol changes to make at the same time? Do we want a more generic
> form of negotiation?

This is not that big of a deal. Since it's only an addition of a new
message type, it's completely backwards compatible with the current
protocol version, i.e. as long as a client simply doesn't send it when
the server reports an older protocol version, everything works fine.
The protocol already has version negotiation built in, and the server
implements it in a reasonable way. The only problem is that no-one
bothered to implement the client side of it in libpq, because it wasn't
necessary yet: 3.x only had a single minor version.

Patch v20230911-0003 [5] from thread [3] implements client-side
handling in (imho) a sane way. I think it probably still needs some
small tweaks and discussion on whether this is the exact user-facing
API that we want. But there are no big hurdles, implementation- or
protocol-wise, to making the next libpq client backwards compatible
with old servers. I think it's worth merging something like that patch
anyway, because it's necessary for pretty much any protocol change we
would like to do. After that, there's pretty much no downside to
bumping minor versions of the protocol anymore, so we could even do it
every release if needed. Thus I don't think it's necessary to bundle
any protocol changes together.

[3]: 
https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8
[5]: 
https://www.postgresql.org/message-id/attachment/150192/v20230911-0003-allow-to-connect-to-server-with-major-protocol-versi.patch




Re: Statistics Import and Export

2023-12-29 Thread Tomas Vondra
On 12/29/23 17:27, Corey Huinker wrote:
> But maybe it's enough to just do what you did - if we get an MCELEM
> slot, can it ever contain anything else than array of elements of the
> attribute array type? I'd bet that'd cause all sorts of issues, no?
> 
> 
> Thanks for the explanation of why it wasn't working for me. Knowing that
> the case of MCELEM + is-array-type is the only case where we'd need to
> do that puts me at ease.
> 

But I didn't claim MCELEM is the only slot where this might be an issue.
I merely asked whether an MCELEM slot can ever contain an array with an
element type different from the "original" attribute.

After thinking about this a bit more, and doing a couple experiments
with a trivial custom data type, I think this is true:

1) MCELEM slots for "real" array types are OK

I don't think we allow "real" arrays to be created by users directly;
all arrays are created implicitly by the system. Those types always have
array_typanalyze, which guarantees MCELEM has the correct element type.

I haven't found a way to either inject my custom array type or change
the typanalyze to some custom function. So I think this is OK.
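
For reference, a quick way to see what an MCELEM slot contains for an
implicitly created array type is the pg_stats view, which exposes it as
most_common_elems (the table and data here are made up for
illustration):

create table arrtest (tags text[]);
insert into arrtest select array['common', 'rare'] from generate_series(1, 1000);
analyze arrtest;
-- array_typanalyze fills MCELEM with values of the element type (text),
-- not of the array type itself:
select most_common_elems from pg_stats
 where tablename = 'arrtest' and attname = 'tags';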


2) I'm not sure we can extend this to regular data types / other slots

For example, I think I can implement a data type with custom typanalyze
function (and custom compute_stats function) that fills slots with some
other / strange stuff. For example I might build MCV with hashes of the
original data, a CountMin sketch, or something like that.

Granted, I don't think people do that often, but as long as the type
also implements custom selectivity functions for the operators, I think
this would work.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: broken master regress tests

2023-12-29 Thread Jeff Davis
On Thu, 2023-12-28 at 22:00 +0300, Alexander Lakhin wrote:
> But looking at the result with the comment above that "do" block, I
> wonder whether this successful CREATE COLLATION command is so important
> to perform in that tricky way, if we want to demonstrate that
> nondeterministic collations are not supported.

Thank you, pushed this version. There are other similar commands in the
file, so I think it's fine. It exercises a specific locale that might
be different from datcollate.

Regards,
Jeff Davis





Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Tom Lane
Jeff Davis  writes:
> On Fri, 2023-12-29 at 13:38 -0500, Tom Lane wrote:
>> This is assuming facts not in evidence.  Why would we want such a
>> thing?

> The problem came up during the binary_formats GUC discussion: it
> doesn't really make sense to change that with a SQL query, and doing so
> can cause strange things to happen.
> We already have the issue with client_encoding and binary format COPY,
> so arguably it's not worth trying to solve it. But protocol-only GUCs
> was one idea that came up.

Yeah, there's definitely an issue about what level of the client-side
software ought to be able to set such parameters.  I'm not sure that
"only the lowest level" is the correct answer though.  As an example,
libpq doesn't especially care what encoding it's dealing with, nor
(AFAIR) whether COPY data is text or binary.  The calling application
probably cares, but then we end up needing a bunch of new plumbing to
pass the settings through.  That's not really providing a lot of
value-add IMO.

regards, tom lane




Re: Make all Perl warnings fatal

2023-12-29 Thread Peter Eisentraut

On 22.12.23 22:33, Peter Eisentraut wrote:
> On 12.09.23 07:42, Peter Eisentraut wrote:
>> On 10.08.23 07:58, Peter Eisentraut wrote:
>>> There are also a couple of issues in the MSVC legacy build system
>>> that would need to be tightened up in order to survive with fatal
>>> Perl warnings.  Obviously, there is a question whether it's worth
>>> spending any time on that anymore.
>>
>> It looks like there are no principled objections to the overall idea
>> here, but given this dependency on the MSVC build system removal, I'm
>> going to set this patch to Returned with feedback for now.
>
> Now that that is done, here is an updated patch for this.
>
> I found one more bug in the Perl code because of this, a fix for which
> is included here.
>
> With this fix, this passes all the CI tests on all platforms.

committed




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jeff Davis
On Fri, 2023-12-29 at 13:38 -0500, Tom Lane wrote:
> > 2. It paves the way for GUCs that can only be set using a protocol
> > message (and not using SET).
> 
> This is assuming facts not in evidence.  Why would we want such a
> thing?

The problem came up during the binary_formats GUC discussion: it
doesn't really make sense to change that with a SQL query, and doing so
can cause strange things to happen.

We already have the issue with client_encoding and binary format COPY,
so arguably it's not worth trying to solve it. But protocol-only GUCs
was one idea that came up.

Regards,
Jeff Davis





Re: [17] collation provider "builtin"

2023-12-29 Thread Jeff Davis
On Thu, 2023-06-15 at 15:08 -0400, Joe Conway wrote:
> > I haven't studied the details yet but +1 for this idea.  It models
> > what we are actually doing.
> 
> +1 agreed

I am combining this discussion with my "built-in CTYPE provider"
proposal here:

https://www.postgresql.org/message-id/804eb67b37f41d3afeb2b6469cbe8bfa79c562cc.ca...@j-davis.com

and the most recent patch is posted there. Having a built-in provider
is more useful if it also offers a "C.UTF-8" locale that is superior to
the libc locale of the same name.

Regards,
Jeff Davis





Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Tom Lane
Jelte Fennema-Nio  writes:
> Currently the only way to set GUCs from a client is by using SET
> commands or set them in the StartupMessage. I think it would be very
> useful to be able to change settings using a message at the protocol
> level. For the following reasons:

> 1. Protocol messages are much easier to inspect for connection poolers
> than queries

Unless you somehow forbid clients from setting GUCs in the old way,
exactly how will that help a pooler?

> 2. It paves the way for GUCs that can only be set using a protocol
> message (and not using SET).

This is assuming facts not in evidence.  Why would we want such a thing?

> 3. Being able to change GUCs while in an aborted transaction.

I'm really dubious that that's sane.  How will it interact with, for
example, changes to the same GUC done in the now-failed transaction?
Or for that matter, changes that happen later in the current
transaction?  It seems like you can't even think about this unless
you deem GUC changes made this way to be non-transactional, which
seems very wart-y and full of consistency problems.

> 4. Have an easy way to use the value contained in a ParameterStatus
> message as the value for a GUC

I agree that GUC_LIST_QUOTE is a mess, but "I'm too lazy to deal
with that" seems like a rather poor argument for instead expending
the amount of labor involved in a protocol change.

On the whole, this feels like you are trying to force some things
into the GUC model that should not be there.  I do perceive that
there are things that could be protocol-level variables, but
trying to say they are a kind of GUC seems like it will create
more problems than it solves.

regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jeff Davis
On Fri, 2023-12-29 at 18:29 +0100, Jelte Fennema-Nio wrote:
> 2. It paves the way for GUCs that can only be set using a protocol
> message (and not using SET).

That sounds useful for GUCs that can interfere with the client, such as
client_encoding or the proposed GUC you referred to at [1].

> There's definitely still some more work that needs to be done
> (docs for new libpq APIs, protocol version bump, working protocol
> version negotiation).

That is my biggest concern right now: what will new clients connecting
to old servers do?

If the version is bumped, should we look around for other unrelated
protocol changes to make at the same time? Do we want a more generic
form of negotiation?

Regards,
Jeff Davis





Re: Avoid computing ORDER BY junk columns unnecessarily

2023-12-29 Thread Tom Lane
Heikki Linnakangas  writes:
> On 29/12/2023 01:42, Tom Lane wrote:
>> I didn't stop to trace the details but I'm pretty sure this is why
>> you're getting the bogus extra projections shown in the 0005 patch.

> They're not bogus. With the patches, projecting away the junk columns is 
> visible in the plan as an extra Result node, while currently it's done 
> as an implicit step in the executor. That seems fine and arguably an 
> even more honest representation of what's happening, although I don't 
> like the extra verbosity in EXPLAIN output.

I didn't bring it up in my previous message, but I'm not really on
board with trying to get rid of the executor junkfilter.  It seems
to me that that code is probably faster than a general-purpose
projection, and surely faster than an extra Result node, so I think
that is not a goal that would improve matters.

However, what I *was* trying to say is that I think those projections
occur because the lower-level plan node is still outputting the
columns you want to suppress, and then the planner realizes that it
needs a projection to create the shortened tlist.  But that's not
saving any work, because we still computed the expensive function :-(.
We need a more thorough treatment of the problem to ensure that the
lower-level plan nodes don't emit these columns either.

> Related to this, we are not currently costing the target list evaluation 
> correctly for index-only scans.

Right, and we're dumb in other ways too: if the index can return
f(x) but not x, we fail to realize that we can use it for an IOS
in the first place, because there's an assumption that we'd better
be able to fetch x.  Currently I think the best way around that
might be via the other discussion that's going on about unifying
regular indexscans and index-only scans [1].  If we do that then
we could postpone the decision about whether we actually need x
itself, and perhaps that would simplify getting this right.

I'm kind of inclined to think that it'd be best to push the
other discussion forward first, and come back to this area
when it's done, because that will be touching a lot of the
same code as well as (maybe) simplifying the planner's problem.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/20230609000600.syqy447e6metnvyj%40awork3.anarazel.de




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Pavel Stehule
Hi

On Fri, Dec 29, 2023 at 18:29, Jelte Fennema-Nio wrote:

> Currently the only way to set GUCs from a client is by using SET
> commands or set them in the StartupMessage. I think it would be very
> useful to be able to change settings using a message at the protocol
> level. For the following reasons:
>
> 1. Protocol messages are much easier to inspect for connection poolers
> than queries
> 2. It paves the way for GUCs that can only be set using a protocol
> message (and not using SET).
> 3. Being able to change GUCs while in an aborted transaction.
> 4. Have an easy way to use the value contained in a ParameterStatus
> message as the value for a GUC
>
> I attached a patch that adds a new protocol message to set a GUC
> value. There's definitely still some more work that needs to be done
> (docs for new libpq APIs, protocol version bump, working protocol
> version negotiation). But the core functionality is working. At this
> point I'd mainly like some feedback on the general idea.
>
> The sections below go into more detail on each of the reasons I mentioned
> above:
>
> (1) PgBouncer does not inspect query strings, to avoid having to
> write/maintain a SQL parser (even a partial one). But that means that
> clients cannot configure any behaviour of PgBouncer for their session.
> A few examples of useful things to configure would be:
> a. allow changing between session and transaction pooling on the same
> connection.
> b. intercepting changes to search_path, and routing different schemas
> to different machines (useful for Citus its schema based sharding).
> c. intercepting changing of pgbouncer.sharding_key, and route to
> different machines based on this value.
>
> (2) There are currently multiple threads ongoing that propose very
> similar protocol changes for very different purposes. Essentially all
> of them boil down to sending a protocol message to the server to
> change some other protocol behaviour. And the reason why they cannot
> use GUCs, is because the driver and/or connection pooler need to know
> what the setting is and be able to choose it without a user running
> some SQL suddenly changing the value. The threads I'm talking about
> are: Choosing specific types that use binary format for encoding [1].
> Changing what GUCs are reported to the client using ParameterStatus
> (a.k.a configurable GUC_REPORT) [2]. Changing the compression method
> that is used to compress messages[3].
>
> Another benefit could be to allow a connection pooler to configure
> certain settings to not be changeable with SQL. For instance if a
> pooler could ensure that a client couldn't later change
> session_authorization, it could use session_authorization to set the
> user and then multiplex client connections from different users over
> the same connection to the database.
>
> (3) For psql it's useful to be able to control what messages it gets a
> ParameterStatus for, even when the transaction is in aborted state.
> Because that way it could decide what parameters status updates to
> request based on the prompt it needs to display. And the prompt can be
> changed even during an aborted transaction.
>
> (4) PgBouncer uses the value contained in the ParameterStatus message
> to correctly set some GUCs back to their expected value. But to do
> this you currently need to use a SET query, which in turn requires
> quoting the value using SQL quoting . This wouldn't be so bad, except
> that GUC_LIST_QUOTE exists. Parameters with GUC_LIST_QUOTE have each
> item in the list returned **double** quoted, and then those double
> quoted items separated by commas. But to be able to correctly set
> them, they need to be given each separately **single** quoted and
> separated by commas. Doing that would require a lot of parsing logic
> to replace double quotes with single quotes correctly. For now
> pgbouncer only handles the empty string case correctly, for the
> situations where the difference between double and single quotes
> matters[4].
>
> [1]:
> https://www.postgresql.org/message-id/flat/CA%2BTgmoZyAh%2BhdN8zvHeN40n9vTstw8K1KjuWdgDuAMMbFAZqHg%40mail.gmail.com#e3a603bbc091e796148a2d660a4a1c1f
> [2]:
> https://www.postgresql.org/message-id/flat/cafj8prbfu-wzzqhnrwrhn67n0ug8a9-0-9boo69pptchibd...@mail.gmail.com
> [3]:
> https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8
> [4]:
> https://github.com/pgbouncer/pgbouncer/blob/fb468025d61e1ffdc6dbc819558f45414e0a176e/src/varcache.c#L172-L183
>
> P.S. I included authors and some reviewers of the threads I mentioned
> for 2 in the CC. Since this patch is meant to be a generic protocol
> change that could be used by all of them.
>

+1

Pavel


Re: [PATCH] plpython function causes server panic

2023-12-29 Thread Tom Lane
I wrote:
> Hao Zhang  writes:
>> IMHO, there are other error reports in the function
>> BeginInternalSubTransaction(), like

> Sure, but all the other ones are extremely hard to hit, which is why
> we didn't bother to worry about them to begin with.  If we want to
> make this more formally bulletproof, my inclination would be to
> (a) get rid of the IsInParallelMode restriction and then (b) turn
> the function into a critical section, so that any other error gets
> treated as a PANIC.

Here's a draft patch along this line.  Basically the idea is that
subtransactions used for error control are now legal in parallel
mode (including in parallel workers) so long as they don't try to
acquire their own XIDs.  I had to clean up some error handling
in xact.c, but really this is a pretty simple patch.
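
As an illustration (my own example, not one from the patch), a plpgsql
function like the following currently has to be marked PARALLEL UNSAFE
purely because its EXCEPTION block establishes a subtransaction; under
this patch it could be marked parallel safe, since that subtransaction
never needs its own XID:

create function safe_div(a numeric, b numeric) returns numeric
language plpgsql parallel safe as
$$
begin
  return a / b;
exception when division_by_zero then
  -- the EXCEPTION clause makes plpgsql run the block in a subtransaction
  return null;
end;
$$;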

Rather than a true critical section (ie PANIC on failure), it seemed
to me to be enough to force FATAL exit if BeginInternalSubTransaction
fails.  Given the likelihood that our transaction state is messed up
if we get a failure partway through, it's not clear to me that we
could do much better than that even if we were willing to make an API
change for BeginInternalSubTransaction.

I haven't thought hard about what new test cases we might want to
add for this.  It gets through check-world as is, meaning that
nobody has made any test cases exercising the previous restrictions
either.  There might be more documentation work to be done, too.

regards, tom lane

diff --git a/doc/src/sgml/parallel.sgml b/doc/src/sgml/parallel.sgml
index 5acc9537d6..7237be40bd 100644
--- a/doc/src/sgml/parallel.sgml
+++ b/doc/src/sgml/parallel.sgml
@@ -546,9 +546,8 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
 
   
 Functions and aggregates must be marked PARALLEL UNSAFE if
-they write to the database, access sequences, change the transaction state
-even temporarily (e.g., a PL/pgSQL function that establishes an
-EXCEPTION block to catch errors), or make persistent changes to
+they write to the database, access sequences, change the transaction state,
+or make persistent changes to
 settings.  Similarly, functions must be marked PARALLEL
 RESTRICTED if they access temporary tables, client connection state,
 cursors, prepared statements, or miscellaneous backend-local state that
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8442c5e6a7..2aa218638c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -638,7 +638,9 @@ AssignTransactionId(TransactionState s)
 	 * operation, so we can't account for new XIDs at this point.
 	 */
 	if (IsInParallelMode() || IsParallelWorker())
-		elog(ERROR, "cannot assign XIDs during a parallel operation");
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
+				 errmsg("cannot assign XIDs during a parallel operation")));
 
 	/*
 	 * Ensure parent(s) have XIDs, so that a child always has an XID later
@@ -826,7 +828,11 @@ GetCurrentCommandId(bool used)
 		 * could relax this restriction when currentCommandIdUsed was already
 		 * true at the start of the parallel operation.
 		 */
-		Assert(!IsParallelWorker());
+		if (IsParallelWorker())
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
+					 errmsg("cannot modify data in a parallel worker")));
+
 		currentCommandIdUsed = true;
 	}
 	return currentCommandId;
@@ -1091,7 +1097,9 @@ CommandCounterIncrement(void)
 		 * point.
 		 */
 		if (IsInParallelMode() || IsParallelWorker())
-			elog(ERROR, "cannot start commands during a parallel operation");
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
+					 errmsg("cannot start commands during a parallel operation")));
 
 		currentCommandId += 1;
 		if (currentCommandId == InvalidCommandId)
@@ -4227,7 +4235,7 @@ DefineSavepoint(const char *name)
 	 * is TBLOCK_PARALLEL_INPROGRESS, so we can treat that as an invalid case
 	 * below.)
 	 */
-	if (IsInParallelMode())
+	if (IsInParallelMode() || IsParallelWorker())
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot define savepoints during a parallel operation")));
@@ -4314,7 +4322,7 @@ ReleaseSavepoint(const char *name)
 	 * is TBLOCK_PARALLEL_INPROGRESS, so we can treat that as an invalid case
 	 * below.)
 	 */
-	if (IsInParallelMode())
+	if (IsInParallelMode() || IsParallelWorker())
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot release savepoints during a parallel operation")));
@@ -4423,7 +4431,7 @@ RollbackToSavepoint(const char *name)
 	 * is TBLOCK_PARALLEL_INPROGRESS, so we can treat that as an invalid case
 	 * below.)
 	 */
-	if (IsInParallelMode())
+	if (IsInParallelMode() || IsParallelWorker())
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot rollback to savepoints during a parallel operation")));
@@ -4529,38 +4537,39 @@ 

Re: [HACKERS] Changing references of password encryption to hashing

2023-12-29 Thread Bruce Momjian
On Wed, Dec 27, 2023 at 10:52:15PM +0100, Peter Eisentraut wrote:
> On 27.12.23 02:04, Bruce Momjian wrote:
> > I did _not_ change the user API, so CREATE/ALTER ROLE still uses
> > [ENCRYPTED] PASSWORD, the GUC is still called password_encryption, and
> > the libpq function is still called PQencryptPasswordConn().  This makes
> > the user interface confusing since the API uses "encryption" but the
> > text calls it "hashing".  Is there support for renaming the API to use
> > "hash" and keeping "encrypt" for backward compatibility?
> 
> Yeah, this is clearly confusing.  I think we should just leave it alone.
> Some gentle rewording here and there to clarify things might be appropriate
> (but the text already uses hashing on some occasions), but the blanket
> replacement does not make things better, I think.

Seeing no more replies, I will abandon this improvement idea.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Avoid computing ORDER BY junk columns unnecessarily

2023-12-29 Thread Heikki Linnakangas

On 29/12/2023 01:42, Tom Lane wrote:
> I wrote:
>> Yeah, fair point.  I'll try to take a look at your patchset after
>> the holidays.
>
> I spent some time looking at this patch, and I'm not very pleased
> with it.  My basic complaint is that this is a band-aid that only
> touches things at a surface level, whereas I think we need a much
> deeper set of changes in order to have a plausible solution.
> Specifically, you're only munging the final top-level targetlist,
> not anything for lower plan levels.  That looks like it works in
> simple cases, but it leaves everything on the table as soon as
> there's more than one level of plan involved.

Yeah, that's fair.

> I didn't stop to trace the details but I'm pretty sure this is why
> you're getting the bogus extra projections shown in the 0005 patch.

They're not bogus. With the patches, projecting away the junk columns is
visible in the plan as an extra Result node, while currently it's done
as an implicit step in the executor. That seems fine and arguably an
even more honest representation of what's happening, although I don't
like the extra verbosity in EXPLAIN output.

> Moreover, this isn't doing anything for cost estimates, which means
> the planner doesn't really understand that more-desirable plans are
> more desirable, and it may not pick an available plan that would
> exploit what we want to have happen here.
>
> As an example, say we have an index on f(x) and the query requires
> sorting by f(x) but not final output of f(x).  If we devise a plan
> that uses the index to sort and doesn't emit f(x), we need to not
> charge the evaluation cost of f() for that plan --- this might
> make the difference between picking that plan and not.  Right now
> I believe we'll charge all plans for f(), so that some other plan
> might look cheaper even when f() is very expensive.
>
> Another example: we might be using an indexscan but not relying on
> its sort order, for example because its output is going into a hash
> join and then we'll sort afterwards.  For expensive f() it would
> still be useful if the index could emit f(x) so that we don't have
> to calculate f() at the sort step.  Right now I don't think we can
> even emit that plan shape, never mind recognize why it's cheaper.


Related to this, we are not currently costing the target list evaluation 
correctly for index-only scans. Here's an example:


create or replace function expensive(i int) returns int
language plpgsql as
$$
  begin return i; end;
$$
immutable cost 1000000;

create table atab (i int);
insert into atab select g from generate_series(1, 10000) g;
create index on atab (i, expensive(i));

postgres=# explain verbose select expensive(i) from atab order by
expensive(i);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Sort  (cost=25000809.39..25000834.39 rows=10000 width=4)
   Output: (expensive(i))
   Sort Key: (expensive(atab.i))
   ->  Seq Scan on public.atab  (cost=0.00..25000145.00 rows=10000 width=4)
         Output: expensive(i)
(5 rows)

postgres=# set enable_seqscan=off; set enable_bitmapscan=off;
SET
SET
postgres=# explain verbose select expensive(i) from atab order by
expensive(i);
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Sort  (cost=25001114.67..25001139.67 rows=10000 width=4)
   Output: (expensive(i))
   Sort Key: (expensive(atab.i))
   ->  Index Only Scan using atab_i_expensive_idx on public.atab
(cost=0.29..25000450.29 rows=10000 width=4)
         Output: (expensive(i))
(5 rows)

The cost of the index only scan ought to be much lower than the seqscan, 
as it can return the pre-computed expensive(i) from the index.


That could probably be fixed without any of the other changes we've been 
discussing here, though.



> I have only vague ideas about how to do this better.  It might work
> to set up multiple PathTargets for a relation that has such an
> index, one for the base case where the scan only emits x and f() is
> computed above, one for the case where we don't need either x or
> f(x) because we're relying on the index order, and one that emits
> f(x) with the expectation that a sort will happen above.  Then we'd
> potentially generate multiple Paths representing the very same
> indexscan but with different PathTargets, and differing targets
> would have to become something that add_path treats as a reason to
> keep multiple Paths for the same relation.  I'm a little frightened
> about the possible growth in number of paths considered, but how
> else would we keep track of the differing costs of these cases?


Hmm, if there are multiple functions like that in the target list, would 
you need to create different paths for each combination of expressions? 
That could really explode the number of paths.


Perhaps each Path could include "optional" target entries that it can 
calculate more cheaply, with a separate cost for each such 

Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jelte Fennema-Nio
Currently the only way to set GUCs from a client is by using SET
commands or by setting them in the StartupMessage. I think it would be
very useful to be able to change settings using a message at the
protocol level, for the following reasons:

1. Protocol messages are much easier to inspect for connection poolers
than queries
2. It paves the way for GUCs that can only be set using a protocol
message (and not using SET).
3. Being able to change GUCs while in an aborted transaction.
4. Have an easy way to use the value contained in a ParameterStatus
message as the value for a GUC

I attached a patch that adds a new protocol message to set a GUC
value. There's definitely still some more work that needs to be done
(docs for new libpq APIs, protocol version bump, working protocol
version negotiation). But the core functionality is working. At this
point I'd mainly like some feedback on the general idea.

The sections below go into more detail on each of the reasons I mentioned above:

(1) PgBouncer does not inspect query strings, to avoid having to
write/maintain a SQL parser (even a partial one). But that means that
clients cannot configure any behaviour of PgBouncer for their session.
A few examples of useful things to configure would be:
a. allowing a client to change between session and transaction pooling
on the same connection.
b. intercepting changes to search_path, and routing different schemas
to different machines (useful for Citus's schema-based sharding).
c. intercepting changes to pgbouncer.sharding_key, and routing to
different machines based on this value.

(2) There are currently multiple ongoing threads that propose very
similar protocol changes for very different purposes. Essentially all
of them boil down to sending a protocol message to the server to
change some other protocol behaviour. And the reason why they cannot
use GUCs is that the driver and/or connection pooler needs to know
what the setting is and be able to choose it without a user running
some SQL suddenly changing the value. The threads I'm talking about
are: choosing specific types that use binary format for encoding [1];
changing what GUCs are reported to the client using ParameterStatus
(a.k.a. configurable GUC_REPORT) [2]; and changing the compression
method that is used to compress messages [3].

Another benefit could be to allow a connection pooler to configure
certain settings to not be changeable with SQL. For instance if a
pooler could ensure that a client couldn't later change
session_authorization, it could use session_authorization to set the
user and then multiplex client connections from different users over
the same connection to the database.
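
For instance, the multiplexing could look roughly like this (a sketch;
it assumes the pooler connects as a role that may use SET SESSION
AUTHORIZATION, and that clients are prevented from issuing it
themselves):

-- on one shared server connection, before forwarding alice's queries:
SET session_authorization = 'alice';
-- ...alice's queries...
RESET session_authorization;
-- and before forwarding bob's queries:
SET session_authorization = 'bob';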

(3) For psql it's useful to be able to control which messages it gets a
ParameterStatus for, even when the transaction is in an aborted state,
because that way it could decide which parameter status updates to
request based on the prompt it needs to display. And the prompt can be
changed even during an aborted transaction.
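
Today, any attempt to do this with SQL in an aborted transaction is
simply rejected, for example:

BEGIN;
SELECT 1/0;                  -- ERROR: division by zero; aborts the transaction
SET application_name = 'x';
-- ERROR: current transaction is aborted, commands ignored until end of
-- transaction block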

(4) PgBouncer uses the value contained in the ParameterStatus message
to correctly set some GUCs back to their expected value. But to do
this you currently need to use a SET query, which in turn requires
quoting the value using SQL quoting. This wouldn't be so bad, except
that GUC_LIST_QUOTE exists. Parameters with GUC_LIST_QUOTE have each
item in the list returned **double** quoted, with those double-quoted
items separated by commas. But to be able to correctly set them, each
item needs to be **single** quoted separately and separated by commas.
Doing that would require a lot of parsing logic to replace double
quotes with single quotes correctly. For now pgbouncer only handles
the empty string case correctly, for the situations where the
difference between double and single quotes matters [4].
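
To make the quoting mismatch concrete, here's a sketch (the schema name
is invented):

CREATE SCHEMA "my schema";
SET search_path = "my schema", public;
SHOW search_path;   -- reports: "my schema", public
-- Feeding that reported string back as-is doesn't work; each item has
-- to be re-quoted with single quotes instead:
SET search_path = 'my schema', 'public';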

[1]: 
https://www.postgresql.org/message-id/flat/CA%2BTgmoZyAh%2BhdN8zvHeN40n9vTstw8K1KjuWdgDuAMMbFAZqHg%40mail.gmail.com#e3a603bbc091e796148a2d660a4a1c1f
[2]: 
https://www.postgresql.org/message-id/flat/cafj8prbfu-wzzqhnrwrhn67n0ug8a9-0-9boo69pptchibd...@mail.gmail.com
[3]: 
https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8
[4]: 
https://github.com/pgbouncer/pgbouncer/blob/fb468025d61e1ffdc6dbc819558f45414e0a176e/src/varcache.c#L172-L183

P.S. I included authors and some reviewers of the threads I mentioned
for (2) in the CC, since this patch is meant to be a generic protocol
change that could be used by all of them.


v1-0001-Add-support-to-change-GUCs-at-the-protocol-level.patch
Description: Binary data


Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)

2023-12-29 Thread Ranier Vilela
On Fri, Dec 29, 2023 at 08:53, Ranier Vilela wrote:

> On Thu, Dec 28, 2023 at 22:16, Tomas Vondra <
> tomas.von...@enterprisedb.com> wrote:
>
>>
>>
>> On 12/27/23 12:37, Ranier Vilela wrote:
>> > On Tue, Dec 26, 2023 at 19:07, Tomas Vondra
>> > <tomas.von...@enterprisedb.com> wrote:
>> >
>> >
>> >
>> > On 12/26/23 19:10, Ranier Vilela wrote:
>> > > Hi,
>> > >
>> > > The commit b437571 I
>> > > think has an oversight.
>> > > When allocate memory and initialize private spool in function:
>> > > _brin_leader_participate_as_worker
>> > >
>> > > The behavior is the bs_spool (heap and index fields)
>> > > are left empty.
>> > >
>> > > The code affected is:
>> > >   buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
>> > > - buildstate->bs_spool->heap = buildstate->bs_spool->heap;
>> > > - buildstate->bs_spool->index = buildstate->bs_spool->index;
>> > > + buildstate->bs_spool->heap = heap;
>> > > + buildstate->bs_spool->index = index;
>> > >
>> > > Is the fix correct?
>> > >
>> >
>> > Thanks for noticing this.
>> >
>> > You're welcome.
>> >
>> >
>> > Yes, I believe this is a bug - the assignments
>> > are certainly wrong, it leaves the fields set to NULL.
>> >
>> > I wonder how come this didn't fail during testing. Surely, if the
>> leader
>> > participates as a worker, the tuplesort_begin_index_brin shall be
>> called
>> > with heap/index being NULL, leading to some failure during the
>> sort. But
>> > maybe this means we don't actually need the heap/index fields, it's
>> just
>> > a copy of TuplesortIndexArg, but BRIN does not need that because we
>> sort
>> > the tuples by blkno, and we don't need the descriptors for that.
>> >
>> > Unfortunately I can't test on Windows, since I can't build with meson on
>> > Windows.
>> >
>> >
>> > In any case, the _brin_parallel_scan_and_build does not actually
>> need
>> > the separate heap/index arguments, those are already in the spool.
>> >
>> > Yeah, for sure.
>> >
>> >
>> > I'll try to figure out if we want to simplify the tuplesort or
>> remove
>> > the arguments from _brin_parallel_scan_and_build.
>> >
>>
>> Here is a patch simplifying the BRIN parallel create code a little bit.
>> As I suspected, we don't need the heap/index in the spool at all, and we
>> don't need to pass it to tuplesort_begin_index_brin either - we only
>> need blkno, and we have that in the datum1 field. This also means we
>> don't need TuplesortIndexBrinArg.
>>
> With Windows 10, msvc 2022, it compiles and passes the ninja tests.
>
> But, if you allow me, I would like to try another approach to
> simplification.
> Instead of increasing the arguments in the call, wouldn't it be better to
> decrease them
> and this way all arguments will be passed in the registers instead of on a
> stack?
>
> bs_spool may well contain this data and will probably be useful in the
> future.
>
> I made a v1 version, based on your patch, for your consideration.
>
As I wrote, the new patch version was just for consideration.
It seems to be more a question of style, so it's better to withdraw it.

Anyway, +1 for your original patch.

Best regards,
Ranier Vilela


Re: Statistics Import and Export

2023-12-29 Thread Corey Huinker
>
> But maybe it's enough to just do what you did - if we get an MCELEM
> slot, can it ever contain anything else than array of elements of the
> attribute array type? I'd bet that'd cause all sorts of issues, no?
>
>
Thanks for the explanation of why it wasn't working for me. Knowing that
the case of MCELEM + is-array-type is the only case where we'd need to do
that puts me at ease.


Re: libpq compression (part 3)

2023-12-29 Thread Jelte Fennema-Nio
On Fri, 29 Dec 2023 at 11:02, Andrey M. Borodin  wrote:
> This patchset allows sending CompressionMethod message, which allows to set 
> another codec\level picked from the set of negotiated codec sets (during 
> startup).

Did you mean to attach a patchset? I don't see the CompressionMethod
message in the v2 patchset. Only a CompressedData one.




Re: introduce dynamic shared memory registry

2023-12-29 Thread Bharath Rupireddy
On Wed, Dec 20, 2023 at 9:33 PM Nathan Bossart  wrote:
>
> On Wed, Dec 20, 2023 at 03:28:38PM +0530, Bharath Rupireddy wrote:
> > Isn't the worker_spi best place to show the use of the DSM registry
> > instead of a separate test extension? Note the custom wait event
> > feature that added its usage code to worker_spi. Since worker_spi
> > demonstrates typical coding patterns, having just set_val_in_shmem()
> > and get_val_in_shmem() in there makes this patch simple and shaves
> > some code off.
>
> I don't agree.  The test case really isn't that complicated, and I'd rather
> have a dedicated test suite for this feature that we can build on instead
> of trying to squeeze it into something unrelated.

With the use of the DSM registry for pg_prewarm, do we need this
test_dsm_registry module at all? The 0002 patch pretty much
demonstrates how to use the DSM registry. With this comment and my
earlier comment on incorporating the use of the DSM registry in
worker_spi, I'm trying to make a point about reducing the code for this
feature. However, if others have different opinions, I'm okay with
having a separate test module.

> > 1. IIUC, this feature lets external modules create as many DSM
> > segments as possible with different keys right? If yes, is capping the
> > max number of DSMs a good idea?
>
> Why?  Even if it is a good idea, what limit could we choose that wouldn't
> be arbitrary and eventually cause problems down the road?

I've tried with a shared memory structure size of 10GB on my
development machine, and I have seen an intermittent crash (I haven't
had a chance to dive deep into it). I think the shared memory that a
named-DSM segment can get via this DSM registry feature depends on the
total shared memory area that a typical DSM client can allocate today.
I'm not sure it's worth limiting the shared memory for this feature,
given that we don't limit shared memory allocated via the startup hook.

> > 2. Why does this feature have to deal with DSMs? Why not DSAs? With
> > DSA and an API that gives the DSA handle to the external modules, the
> > modules can dsa_allocate and dsa_free right? Do you see any problem
> > with it?
>
> Please see upthread discussion [0].

Per my understanding, this feature allows one to define and manage
named-DSM segments.

> > +typedef struct DSMRegistryEntry
> > +{
> > +    char        key[256];
> >
> > Key length 256 feels too much, can it be capped at NAMEDATALEN 64
> > bytes (similar to some of the key lengths for hash_create()) to start
> > with?
>
> Why is it too much?

Are we expecting, for instance, a 128-bit UUID to be used as a key, and
hence limiting it to a higher value of 256 instead of just NAMEDATALEN?
My thoughts were around saving a few bytes of shared memory, which can
add up when multiple modules use the DSM registry with multiple DSM
segments.

> > 4. Do we need on_dsm_detach for each DSM created?
>
> Presently, I've designed this such that the DSM remains attached for the
> lifetime of a session (and stays present even if all attached sessions go
> away) to mimic what you get when you allocate shared memory during startup.
> Perhaps there's a use-case for having backends do some cleanup before
> exiting, in which case a detach_cb might be useful.  IMHO we should wait
> for a concrete use-case before adding too many bells and whistles, though.

On Thu, Dec 28, 2023 at 9:05 PM Nathan Bossart  wrote:
>
> On Wed, Dec 27, 2023 at 01:53:27PM -0600, Nathan Bossart wrote:
> > Here is a new version of the patch.  In addition to various small changes,
> > I've rewritten the test suite to use an integer and a lock, added a
> > dsm_registry_find() function, and adjusted autoprewarm to use the registry.
>
> Here's a v3 that fixes a silly mistake in the test.

Some comments on the v3 patch set:

1. Typo: missing "an" before "already-attached".
+/* Return address of already-attached DSM registry entry. */

2. Do you see any immediate uses of dsm_registry_find()? Especially
given that dsm_registry_init_or_attach() does the necessary work
(returning the DSM address if a DSM already exists for a given key) for
a postgres process. If there is no immediate use (the argument for
removing this function is similar to the one for detach_cb above), I'm
sure we can introduce it when there's one.

3. I think we don't need miscadmin.h inclusion in autoprewarm.c, do we?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Remove useless GROUP BY columns considering unique index

2023-12-29 Thread Zhang Mingli
Hi,

This idea first came from the thread "remove_useless_groupby_columns
does not need to record constraint dependencies" [0], which points out
that a unique index whose columns all have NOT NULL constraints could
do the same work as a primary key when removing useless GROUP BY
columns. I studied it and implemented the idea.

Ex:

create temp table t2 (a int, b int, c int not null, primary key (a, b), 
unique(c));

explain (costs off) select * from t2 group by a,b,c;
      QUERY PLAN
--------------------------
 HashAggregate
   Group Key: c
   ->  Seq Scan on t2

The plan drops columns a and b, as I went a little further than the
original idea.

For this query, since t2 has primary key (a, b), before this patch we
could drop column c because {a, b} is the PK.
Now that we also have a unique index (c) with a NOT NULL constraint, we
can instead drop columns {a, b} and keep just {c}.

So we have multiple choices: group by a, b (c is removed by the PK) or
group by c (a, b are removed by the unique NOT NULL index).
I implemented it to choose the one with fewer columns, so that we can
drop as many columns as possible.
I think that's good for the planner, saving some cost, e.g. sorting
fewer columns.
There may be a better choice for some reason, like trying to keep the
PK for the planner?
I'm not sure about that, and it doesn't seem worth much complexity.

A NOT NULL guarantee may also be derived from a primary key, ex:
create temp table t2 (a int, b int, c int not null, primary key (a, b),
unique(a));
The primary key (a, b) ensures a is NOT NULL and we have a unique
index (a), but handling that would make it more complex to check whether
a unique index could be used.
I also doubt it's worth doing.
So my patch keeps it simple: check a unique index's columns, and it's a
valid candidate only if all of them have NOT NULL constraints.
And we choose the best one, the one with the fewest columns, in
get_min_unique_not_null_attnos(); the reasoning is that fewer kept
columns mean more GROUP BY columns can be removed.

create temp table t3 (a int, b int, c int not null, d int not null, primary key 
(a, b), unique(c, d));
-- Test primary key beats unique not null index.
explain (costs off) select * from t3 group by a,b,c,d;
      QUERY PLAN
--------------------------
 HashAggregate
   Group Key: a, b
   ->  Seq Scan on t3
(3 rows)

create temp table t4 (a int, b int not null, c int not null, d int not null, 
primary key (a, b), unique(b, c), unique(d));
-- Test unique not null index with less columns wins.
explain (costs off) select * from t4 group by a,b,c,d;
      QUERY PLAN
--------------------------
 HashAggregate
   Group Key: d
   ->  Seq Scan on t4
(3 rows)

The unique indexes could overlap with the primary key and with each other.

create temp table t5 (a int not null, b int not null, c int not null, d int not 
null, unique (a, b), unique(b, c), unique(a, c, d));
-- Test unique not null indices have overlap.
explain (costs off) select * from t5 group by a,b,c,d;
      QUERY PLAN
--------------------------
 HashAggregate
   Group Key: a, b
   ->  Seq Scan on t5
(3 rows)
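
For contrast, a unique index on a nullable column must not be used (my
own example, not one of the patch's tests), since a unique index still
permits multiple NULLs; only the primary key applies here:

create temp table t6 (a int, b int, c int, primary key (a, b), unique(c));
explain (costs off) select * from t6 group by a,b,c;
      QUERY PLAN
--------------------------
 HashAggregate
   Group Key: a, b
   ->  Seq Scan on t6
(3 rows)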





[0] https://www.postgresql.org/message-id/flat/CAApHDvrdYa%3DVhOoMe4ZZjZ-G4ALnD-xuAeUNCRTL%2BPYMVN8OnQ%40mail.gmail.com


Zhang Mingli
www.hashdata.xyz


v1-0001-Remove-useless-GROUP-BY-columns-considering-unique-i.patch
Description: Binary data


Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)

2023-12-29 Thread Tomas Vondra



On 12/29/23 14:53, Ranier Vilela wrote:
> 
> 
> On Fri, Dec 29, 2023 at 10:33, Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
> 
> 
> 
> On 12/29/23 12:53, Ranier Vilela wrote:
> > On Thu, Dec 28, 2023 at 22:16, Tomas Vondra
> > <tomas.von...@enterprisedb.com> wrote:
> >
> >
> >
> >     On 12/27/23 12:37, Ranier Vilela wrote:
> >     > On Tue, Dec 26, 2023 at 19:07, Tomas Vondra
> >     > <tomas.von...@enterprisedb.com> wrote:
> >     >
> >     >
> >     >
> >     >     On 12/26/23 19:10, Ranier Vilela wrote:
> >     >     > Hi,
> >     >     >
> >     >     > The commit b437571 I
> >     >     > think has an oversight.
> >     >     > When allocate memory and initialize private spool in
> function:
> >     >     > _brin_leader_participate_as_worker
> >     >     >
> >     >     > The behavior is the bs_spool (heap and index fields)
> >     >     > are left empty.
> >     >     >
> >     >     > The code affected is:
> >     >     >   buildstate->bs_spool = (BrinSpool *)
> >     palloc0(sizeof(BrinSpool));
> >     >     > - buildstate->bs_spool->heap = buildstate->bs_spool->heap;
> >     >     > - buildstate->bs_spool->index =
> buildstate->bs_spool->index;
> >     >     > + buildstate->bs_spool->heap = heap;
> >     >     > + buildstate->bs_spool->index = index;
> >     >     >
> >     >     > Is the fix correct?
> >     >     >
> >     >
> >     >     Thanks for noticing this.
> >     >
> >     > You're welcome.
> >     >  
> >     >
> >     >     Yes, I believe this is a bug - the assignments
> >     >     are certainly wrong, it leaves the fields set to NULL.
> >     >
> >     >     I wonder how come this didn't fail during testing.
> Surely, if
> >     the leader
> >     >     participates as a worker, the tuplesort_begin_index_brin
> shall
> >     be called
> >     >     with heap/index being NULL, leading to some failure
> during the
> >     sort. But
> >     >     maybe this means we don't actually need the heap/index
> fields,
> >     it's just
> >     >     a copy of TuplesortIndexArg, but BRIN does not need that
> >     because we sort
> >     >     the tuples by blkno, and we don't need the descriptors
> for that.
> >     >
> >     > Unfortunately I can't test on Windows, since I can't build with
> >     meson on
> >     > Windows.
> >     >
> >     >
> >     >     In any case, the _brin_parallel_scan_and_build does not
> >     actually need
> >     >     the separate heap/index arguments, those are already in
> the spool.
> >     >
> >     > Yeah, for sure.
> >     >
> >     >
> >     >     I'll try to figure out if we want to simplify the
> tuplesort or
> >     remove
> >     >     the arguments from _brin_parallel_scan_and_build.
> >     >
> >
> >     Here is a patch simplifying the BRIN parallel create code a
> little bit.
> >     As I suspected, we don't need the heap/index in the spool at
> all, and we
> >     don't need to pass it to tuplesort_begin_index_brin either -
> we only
> >     need blkno, and we have that in the datum1 field. This also
> means we
> >     don't need TuplesortIndexBrinArg.
> >
> > With Windows 10, msvc 2022, compile end pass ninja test.
> >
> > But, if you allow me, I would like to try another approach to
> > simplification.
> > Instead of increasing the arguments in the call, wouldn't it be better
> > to decrease them 
> > and this way all arguments will be passed in the registers instead
> of on
> > a stack?
> >
> 
> If this was beneficial, we'd be passing everything through structs and
> not as explicit arguments. But we don't. If you're arguing it's
> beneficial in this case, it'd be good to see it demonstrated.
> 

Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)

2023-12-29 Thread Ranier Vilela
Em sex., 29 de dez. de 2023 às 10:33, Tomas Vondra <
tomas.von...@enterprisedb.com> escreveu:

>
>
> On 12/29/23 12:53, Ranier Vilela wrote:
> > [...]
> >
> > But, if you allow me, I would like to try another approach to
> > simplification.
> > Instead of increasing the number of arguments in the call, wouldn't it
> > be better to decrease them, so that all arguments are passed in
> > registers instead of on the stack?
> >
>
> If this was beneficial, we'd be passing everything through structs and
> not as explicit arguments. But we don't. If you're arguing it's
> beneficial in this case, it'd be good to see it demonstrated.
>
Please see https://www.agner.org/optimize/optimizing_cpp.pdf
Excerpt:
"Use 64-bit mode
Parameter transfer is more efficient in 64-bit mode than in 32-bit mode,
and more efficient in 64-bit Linux than in 64-bit Windows. In 64-bit Linux,
the first six integer parameters and the first eight floating point
parameters are transferred in registers, totaling up to fourteen register
parameters. In 64-bit Windows, the first four parameters are transferred in
registers, regardless of whether they are integers or floating point
numbers."

With the function call:
_brin_parallel_scan_and_build(buildstate, buildstate->bs_spool,
brinshared, sharedsort, heapRel, indexRel, sortmem, false);
we have:
Linux -> the first six parameters in registers and two on the stack
Windows -> the first four parameters in registers and four on the stack
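
To make that concrete, here is a small standalone C toy, not PostgreSQL 
code, whose two functions illustrate the difference:

#include <stdio.h>

/*
 * Toy demonstration of the point above.  On the x86-64 System V ABI
 * (Linux), the first six integer/pointer arguments travel in rdi, rsi,
 * rdx, rcx, r8 and r9, so f8() spills g and h to the stack while f5()
 * stays entirely in registers.  On 64-bit Windows only the first four
 * arguments use registers.  Inspect the output of "gcc -O2 -S" to see it.
 */
long
f8(long a, long b, long c, long d, long e, long f, long g, long h)
{
	return a + b + c + d + e + f + g + h;
}

long
f5(long a, long b, long c, long d, long e)
{
	return a + b + c + d + e;
}

int
main(void)
{
	printf("%ld %ld\n", f8(1, 2, 3, 4, 5, 6, 7, 8), f5(1, 2, 3, 4, 5));
	return 0;
}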


> > bs_spool may well contain this data and will probably be useful in the
> > future.
> >
> > I made a v1 version, based on your patch, for your consideration.
> >
>
> I did actually consider doing it 

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

2023-12-29 Thread Matthias van de Meent
On Fri, 29 Dec 2023, 13:49 Maxim Orlov,  wrote:

> Hi!
>
> As was discussed in [0], our overall goal is to move Postgres to 64-bit
> XIDs.  It's obvious that such a big patch set couldn't possibly be
> committed "at once".  The SLRU patch set [1] was committed a short while
> ago as a first significant step in this direction.
>
> This thread is the next step in this endeavor.  My objective here is to
> commit some changes which are, as far as I understand, mandatory for any
> kind of 64-bit XID implementation.  And I'm sure there will be points
> for discussion here.
>
> My original intention was to make PGPROC->xmin, PGPROC->xid and
> PROC_HDR->xids 64-bit.  But in reality, it turned out to be much more
> difficult than I expected.  On the one hand, the patch became too big,
> and on the other hand, it relies heavily on the epoch and on "adjusting"
> XIDs to FXIDs.  Therefore, for now, I decided to limit myself to more
> atomic and independent changes.  However, as I said above, these changes
> are required for any implementation of 64-bit XIDs.
>
> So, PFA patches to switch PGPROC->xid
>

I think this could be fine, but ...

> and XLogRecord->xl_xid to FullTransactionId.
>

I don't think this is an actionable change, as it wastes 4 more bytes (or
8 with alignment) in nearly all WAL records that don't use the
HEAP/HEAP2/XLOG rmgrs, which would then be up to 10 (if not 14, when
64-bit-aligned) bytes per record. Unless something like [0] gets committed,
this will add a significant write overhead to all operations, even if they
are not doing anything that needs an XID.

Also, I don't think we need to support transactions that stay alive and
change things for longer than 2^31 concurrently created transactions, so we
could well add an fxid to each WAL segment header (and checkpoint record?)
and calculate the fxid of each record as a relative fxid off of that.
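
For concreteness, a minimal sketch of that widening step, under the stated
2^31 assumption (names and placement are mine, not committed code):

#include "postgres.h"
#include "access/transam.h"

/*
 * Assumes the segment header carries a base FullTransactionId and that no
 * record in the segment is more than 2^31 XIDs away from it, so a record's
 * 32-bit xid can be widened back using its signed distance from the base.
 */
static FullTransactionId
fxid_from_segment_base(FullTransactionId base, TransactionId xid)
{
	uint64		base_val = U64FromFullTransactionId(base);
	int32		diff = (int32) (xid - (TransactionId) base_val);
	FullTransactionId result;

	result.value = base_val + (int64) diff;
	return result;
}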


Kind regards

Matthias van de Meent

[0] https://commitfest.postgresql.org/46/4386/


Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)

2023-12-29 Thread Tomas Vondra



On 12/29/23 12:53, Ranier Vilela wrote:
> Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra
> <tomas.von...@enterprisedb.com> escreveu:
>
> > [...]
> >
> > Here is a patch simplifying the BRIN parallel create code a little bit.
> > As I suspected, we don't need the heap/index in the spool at all, and we
> > don't need to pass it to tuplesort_begin_index_brin either - we only
> > need blkno, and we have that in the datum1 field. This also means we
> > don't need TuplesortIndexBrinArg.
> >
>
> With Windows 10 and msvc 2022, it compiles and passes the ninja tests.
>
> But, if you allow me, I would like to try another approach to
> simplification.
> Instead of increasing the number of arguments in the call, wouldn't it
> be better to decrease them, so that all arguments are passed in
> registers instead of on the stack?
>

If this was beneficial, we'd be passing everything through structs and
not as explicit arguments. But we don't. If you're arguing it's
beneficial in this case, it'd be good to see it demonstrated.

> bs_spool may well contain this data and will probably be useful in the
> future.
> 
> I made a v1 version, based on your patch, for your consideration.
> 

I did actually consider doing it this way yesterday, but I don't like
this approach. I don't believe it's faster (and even if it was, the
difference is going to be negligible), and parameters hidden in some
struct increase the cognitive load. I like explicit arguments.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Add Index-level REINDEX with multiple jobs

2023-12-29 Thread Maxim Orlov
Hi!

Recently, one of our customers came to us with a question: why does the
reindex utility not support multiple jobs for indexes (the -i option)?
And, of course, it is because we cannot control the concurrent processing
of multiple indexes on the same relation.  This was discussed somewhere
in [0], I believe.  So, the customer had to write a shell script to do
the job, and so on.

But it seems not that complicated to split the indexes by parent table
and do the reindex in multiple jobs, or am I missing something?
PFA a patch implementing this.
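
For illustration, a toy standalone C program (not the patch code; the table
and index names are made up) showing the grouping rule: indexes sharing a
parent table land in the same job, so they never run concurrently:

#include <stdio.h>
#include <string.h>

typedef struct
{
	const char *table;
	const char *index;
} IdxSpec;

int
main(void)
{
	IdxSpec		specs[] = {
		{"t1", "t1_a_idx"}, {"t2", "t2_a_idx"}, {"t1", "t1_b_idx"},
	};
	int			n = sizeof(specs) / sizeof(specs[0]);
	int			done[3] = {0, 0, 0};

	for (int i = 0; i < n; i++)
	{
		if (done[i])
			continue;
		/* start a new job and pull in every index of the same table */
		printf("job for table %s: %s", specs[i].table, specs[i].index);
		for (int j = i + 1; j < n; j++)
		{
			if (!done[j] && strcmp(specs[i].table, specs[j].table) == 0)
			{
				printf(", %s", specs[j].index);
				done[j] = 1;
			}
		}
		printf("\n");
	}
	return 0;
}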

As always, any opinions are very welcome!

[0]
https://www.postgresql.org/message-id/flat/CAOBaU_YrnH_Jqo46NhaJ7uRBiWWEcS40VNRQxgFbqYo9kApUsg%40mail.gmail.com

-- 
Best regards,
Maxim Orlov.


v1-0001-Add-Index-level-REINDEX-with-multiple-jobs.patch
Description: Binary data


Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

2023-12-29 Thread Maxim Orlov
Hi!

As was discussed in [0], our overall goal is to move Postgres to 64-bit
XIDs.  It's obvious that such a big patch set couldn't possibly be
committed "at once".  The SLRU patch set [1] was committed a short while
ago as a first significant step in this direction.

This thread is the next step in this endeavor.  My objective here is to
commit some changes which are, as far as I understand, mandatory for any
kind of 64-bit XID implementation.  And I'm sure there will be points for
discussion here.

My original intention was to make PGPROC->xmin, PGPROC->xid and
PROC_HDR->xids 64-bit.  But in reality, it turned out to be much more
difficult than I expected.  On the one hand, the patch became too big,
and on the other hand, it relies heavily on the epoch and on "adjusting"
XIDs to FXIDs.  Therefore, for now, I decided to limit myself to more
atomic and independent changes.  However, as I said above, these changes
are required for any implementation of 64-bit XIDs.

So, PFA patches to switch PGPROC->xid and XLogRecord->xl_xid to
FullTransactionId.

As always, any opinions are very welcome!

[0]
https://www.postgresql.org/message-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyj...@mail.gmail.com
[1]
https://www.postgresql.org/message-id/flat/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com

-- 
Best regards,
Maxim Orlov.


v1-0002-Switch-to-FullTransactionId-for-XLogRecord-xl_xid.patch
Description: Binary data


v1-0001-Switch-to-FullTransactionId-for-PGPROC-xid.patch
Description: Binary data


Re: Fix Brin Private Spool Initialization (src/backend/access/brin/brin.c)

2023-12-29 Thread Ranier Vilela
Em qui., 28 de dez. de 2023 às 22:16, Tomas Vondra <
tomas.von...@enterprisedb.com> escreveu:

>
>
> On 12/27/23 12:37, Ranier Vilela wrote:
> > Em ter., 26 de dez. de 2023 às 19:07, Tomas Vondra
> > mailto:tomas.von...@enterprisedb.com>>
> > escreveu:
> >
> >
> >
> > On 12/26/23 19:10, Ranier Vilela wrote:
> > > Hi,
> > >
> > > I
> >  > > I
> > > think has an oversight.
> > > When allocate memory and initialize private spool in function:
> > > _brin_leader_participate_as_worker
> > >
> > > The behavior is the bs_spool (heap and index fields)
> > > are left empty.
> > >
> > > The code affected is:
> > >   buildstate->bs_spool = (BrinSpool *) palloc0(sizeof(BrinSpool));
> > > - buildstate->bs_spool->heap = buildstate->bs_spool->heap;
> > > - buildstate->bs_spool->index = buildstate->bs_spool->index;
> > > + buildstate->bs_spool->heap = heap;
> > > + buildstate->bs_spool->index = index;
> > >
> > > Is the fix correct?
> > >
> >
> > Thanks for noticing this.
> >
> > You're welcome.
> >
> >
> > Yes, I believe this is a bug - the assignments
> > are certainly wrong, it leaves the fields set to NULL.
> >
> > I wonder how come this didn't fail during testing. Surely, if the
> leader
> > participates as a worker, the tuplesort_begin_index_brin shall be
> called
> > with heap/index being NULL, leading to some failure during the sort.
> But
> > maybe this means we don't actually need the heap/index fields, it's
> just
> > a copy of TuplesortIndexArg, but BRIN does not need that because we
> sort
> > the tuples by blkno, and we don't need the descriptors for that.
> >
> > Unfortunately I can't test on Windows, since I can't build with meson on
> > Windows.
> >
> >
> > In any case, the _brin_parallel_scan_and_build does not actually need
> > the separate heap/index arguments, those are already in the spool.
> >
> > Yeah, for sure.
> >
> >
> > I'll try to figure out if we want to simplify the tuplesort or remove
> > the arguments from _brin_parallel_scan_and_build.
> >
>
> Here is a patch simplifying the BRIN parallel create code a little bit.
> As I suspected, we don't need the heap/index in the spool at all, and we
> don't need to pass it to tuplesort_begin_index_brin either - we only
> need blkno, and we have that in the datum1 field. This also means we
> don't need TuplesortIndexBrinArg.
>
With Windows 10 and msvc 2022, it compiles and passes the ninja tests.

But, if you allow me, I would like to try another approach to
simplification.
Instead of increasing the number of arguments in the call, wouldn't it
be better to decrease them, so that all arguments are passed in registers
instead of on the stack?

bs_spool may well contain this data and will probably be useful in the
future.

I made a v1 version, based on your patch, for your consideration.

best regards,
Ranier Vilela


v1-brin-parallel-create-simplify.patch
Description: Binary data


Re: Add PQsendSyncMessage() to libpq

2023-12-29 Thread Jelte Fennema-Nio
On Mon, 13 Nov 2023 at 09:20, Anthonin Bonnefoy
 wrote:
> \syncpipeline
> tps = 2607.587877 (without initial connection time)
> ...
> \endpipeline
> \startpipeline
> tps = 2290.527462 (without initial connection time)

Those are some nice improvements. And I think once this patch is in,
it would make sense to add the pgbench feature you're suggesting,
because that indeed makes it easy to see what performance improvements
can be gained for your workload.
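
For reference, a minimal sketch of how the new call is meant to be used in
pipeline mode (this assumes a libpq that includes the PQsendSyncMessage()
patch; error handling is trimmed):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/* connection parameters taken from the environment */
	PGconn	   *conn = PQconnectdb("");

	if (PQstatus(conn) != CONNECTION_OK || !PQenterPipelineMode(conn))
	{
		PQfinish(conn);
		return 1;
	}

	/* queue two statements, each followed by a Sync ... */
	PQsendQueryParams(conn, "SELECT 1", 0, NULL, NULL, NULL, NULL, 0);
	PQsendSyncMessage(conn);	/* adds a Sync without flushing */
	PQsendQueryParams(conn, "SELECT 2", 0, NULL, NULL, NULL, NULL, 0);
	PQsendSyncMessage(conn);

	/* ... and flush the whole batch once, unlike PQpipelineSync(),
	 * which flushes every time it is called */
	PQflush(conn);

	/* consume results with PQgetResult() as usual, then clean up */
	PQfinish(conn);
	return 0;
}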




Re: Add PQsendSyncMessage() to libpq

2023-12-29 Thread Jelte Fennema-Nio
On Sun, 12 Nov 2023 at 14:37, Anton Kirilov  wrote:
> Since there haven't been any comments from the other people who have
> chimed in on this e-mail thread, I will assume that there is consensus
> (we are doing a U-turn with the implementation approach after all), so
> here is the updated version of the patch.

The new patch looks great to me. And indeed consensus seems to have
been reached on the approach and that this patch is useful. So I'm
taking the liberty of marking this patch as Ready for Committer.




Re: Transaction timeout

2023-12-29 Thread Andrey M. Borodin



> On 29 Dec 2023, at 16:00, Junwang Zhao  wrote:
> 
> After exploring the code, I found scheduling the timeout in
> `StartTransaction` might be a reasonable idea, all the chain
> commands will call this function.
> 
> What concerns me is that it is also called by StartParallelWorkerTransaction,
> I'm not sure if we should enable this timeout for parallel execution.

I think for parallel workers we should mimic statement_timeout, because these 
workers have a per-statement lifetime.


Best regards, Andrey Borodin.



Re: Transaction timeout

2023-12-29 Thread Junwang Zhao
On Fri, Dec 29, 2023 at 6:00 PM Andrey M. Borodin  wrote:
>
>
>
> > On 28 Dec 2023, at 21:02, Junwang Zhao  wrote:
> >
> > Seems V5~V17 doesn't work as expected for Nikolay's case:
> >
>
> Yeah, that's a problem.
> > So I propose the following change, what do you think?
> This breaks COMMIT AND CHAIN.
>
> PFA v18: I've added a test for Nik's case and for COMMIT AND CHAIN. Now we
> need to fix stuff to pass these tests (I've crafted the expected output).
> We also need a test for the patchset step "Try to enable transaction_timeout
> before next command".
>
> Thanks!

After exploring the code, I found that scheduling the timeout in
`StartTransaction` might be a reasonable idea, since all the chained
commands will call this function.

What concerns me is that it is also called by StartParallelWorkerTransaction,
I'm not sure if we should enable this timeout for parallel execution.

Thought?
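
To make the idea concrete, here is a minimal sketch (not the patch itself;
the GUC variable and timeout id names below are assumed from the patchset):

#include "postgres.h"
#include "access/parallel.h"
#include "utils/timeout.h"

/* the patchset's GUC variable, in milliseconds; name assumed */
extern int	transaction_timeout;

static void
arm_transaction_timeout(void)
{
	/*
	 * Would be called from StartTransaction().  Parallel workers are
	 * skipped, since they should follow statement_timeout semantics
	 * instead, as discussed above.  TRANSACTION_TIMEOUT is the timeout
	 * id the patchset registers (name assumed).
	 */
	if (transaction_timeout > 0 && !IsParallelWorker())
		enable_timeout_after(TRANSACTION_TIMEOUT, transaction_timeout);
}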

>
>
> Best regards, Andrey Borodin.



-- 
Regards
Junwang Zhao


v19-0002-Use-test-from-Li-Japin-Also-add-tests-for-multip.patch
Description: Binary data


v19-0003-Try-to-enable-transaction_timeout-before-next-co.patch
Description: Binary data


v19-0001-Introduce-transaction_timeout.patch
Description: Binary data


v19-0004-fix-reschedule-timeout-for-each-commmand.patch
Description: Binary data


Re: Removing unneeded self joins

2023-12-29 Thread Alexander Lakhin

Hi Andrei,

29.12.2023 12:58, Andrei Lepikhov wrote:
> Thanks for the report!
> The problem is with the resultRelation field. We forgot to replace the
> relid there.
> Could you check your issue with the patch in the attachment? Does it
> resolve this case?

Yes, with the patch applied I see no error.
Thank you!

Best regards,
Alexander




Re: Revise the Asserts added to bimapset manipulation functions

2023-12-29 Thread Richard Guo
On Fri, Dec 29, 2023 at 5:22 PM David Rowley  wrote:

> On Fri, 29 Dec 2023 at 21:07, Richard Guo  wrote:
> > It seems to me that the former scenario also makes sense in some cases.
> > For instance, let's say there are two pointers in two structs, s1->p and
> > s2->p, both referencing the same bitmapset.  If we modify the bitmapset
> > via s1->p (even if no reallocation could occur), s2 would see different
> > content via its pointer 'p'.
>
> That statement simply isn't true.  If there's no reallocation then
> both pointers point to the same memory and, therefore have the same
> content, not different content.  In the absence of a reallocation,
> then the only time s1->p and s2->p could differ is if s1->p became an
> empty set as a result of the modification.


Sorry I didn't make myself clear.  By "different content via s2->p" I
mean different content than what came before, not than s1->p.  There was
an issue caused by such modifications reported before in [1].  In that
case, the problematic query is

explain (costs off)
select * from t t1
   inner join t t2 on t1.a = t2.a
left join t t3 on t1.b > 1 and t1.b < 2;

The 'required_relids' in the two RestrictInfos for 't1.b > 1' and 't1.b
< 2' both reference the same bitmapset.  The content of this bitmapset
is {t1, t3}.  However, as we have decided to remove the t1/t2 join by
eliminating t1, we need to substitute the Vars of t1 with the Vars of
t2.  To achieve this, we remove each of the two RestrictInfos from the
joininfo lists it is in and perform the necessary replacements.

After applying this process to the first RestrictInfo, the bitmapset now
becomes {t2, t3}.  Consequently, the second RestrictInfo also perceives
{t2, t3} as its required_relids.  As a result, when attempting to remove
it from the joininfo lists, a problem arises, because it is not in t2's
joininfo list.
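
A minimal sketch of that hazard (illustrative only, not taken from the
planner code):

#include "postgres.h"
#include "nodes/bitmapset.h"

static void
shared_bitmapset_hazard(void)
{
	Bitmapset  *shared = bms_make_singleton(1);
	Bitmapset  *alias = shared; /* second reference, like the second
								 * RestrictInfo's required_relids above */

	/* may repalloc, and always changes the set that alias points to */
	shared = bms_add_member(shared, 42);

	/* alias may now dangle, or silently show the new content */
	(void) alias;
}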


Just to clarify, I am not objecting to your v2 patch.  I just want to
make sure what is our purpose in introducing REALLOCATE_BITMAPSETS.  I'd
like to confirm whether our intention aligns with the former scenario or
the latter one that I mentioned upthread.


Also, here are some review comments for the v2 patch:

* There is no reallocation that happens in bms_add_members(); do we need
to call bms_copy_and_free() there?

* Do you think we can add Assert(bms_is_valid_set(a)) for bms_free()?

[1]
https://www.postgresql.org/message-id/flat/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com

Thanks
Richard


Re: Track in pg_replication_slots the reason why slots conflict?

2023-12-29 Thread Michael Paquier
On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote:
> Does anyone have a preference for a column name? The options on the
> table are conflict_cause, conflicting_cause, conflict_reason. Any
> others? I was checking docs for similar usage and found
> "pg_event_trigger_table_rewrite_reason" function, so based on that we
> can even go with conflict_reason.

"conflict_reason" sounds like the natural choice here.
--
Michael


signature.asc
Description: PGP signature


Re: libpq compression (part 3)

2023-12-29 Thread Andrey M. Borodin



> On 21 Dec 2023, at 05:30, Jelte Fennema-Nio  wrote:
> 
> One thing I'm wondering: should it be possible for the client to change the 
> compression it wants mid-connection?

This patchset allows sending a CompressionMethod message, which lets the 
client switch to another codec/level pair picked from the set of codecs 
negotiated during startup.


Best regards, Andrey Borodin.



Re: Transaction timeout

2023-12-29 Thread Andrey M. Borodin


> On 28 Dec 2023, at 21:02, Junwang Zhao  wrote:
> 
> Seems V5~V17 doesn't work as expected for Nikolay's case:
> 

Yeah, that's a problem.
> So I propose the following change, what do you think?
This breaks COMMIT AND CHAIN.

PFA v18: I've added a test for Nik's case and for COMMIT AND CHAIN. Now we need 
to fix stuff to pass these tests (I've crafted the expected output).
We also need a test for the patchset step "Try to enable transaction_timeout 
before next command".

Thanks!


Best regards, Andrey Borodin.


v18-0001-Introduce-transaction_timeout.patch
Description: Binary data


v18-0002-Use-test-from-Li-Japin.patch
Description: Binary data


v18-0003-Try-to-enable-transaction_timeout-before-next-co.patch
Description: Binary data


Re: Removing unneeded self joins

2023-12-29 Thread Andrei Lepikhov

Hi Alexander,

On 29/12/2023 12:00, Alexander Lakhin wrote:
> 23.10.2023 14:29, Alexander Korotkov wrote:
>> Fixed all of the above. Thank you for catching this!
>
> I've discovered that starting from d3d55ce57 the following query:
> CREATE TABLE t(a int PRIMARY KEY);
>
> WITH tt AS (SELECT * FROM t)
> UPDATE t SET a = tt.a + 1 FROM tt
> WHERE tt.a = t.a RETURNING t.a;
>
> triggers an error "variable not found in subplan target lists".
> (Commits 8a8ed916f and b5fb6736e don't fix this, unfortunately.)

Thanks for the report!
The problem is with the resultRelation field. We forgot to replace the
relid there.
Could you check your issue with the patch in the attachment? Does it
resolve this case?


--
regards,
Andrei Lepikhov
Postgres Professional
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 6c02fe8908..f79c673afd 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1861,6 +1861,8 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
 	/* Replace varno in all the query structures */
 	query_tree_walker(root->parse, replace_varno_walker, ,
 					  QTW_EXAMINE_SORTGROUP);
+	if (root->parse->resultRelation == toRemove->relid)
+		root->parse->resultRelation = toKeep->relid;
 
 	/* Replace links in the planner info */
 	remove_rel_from_query(root, toRemove, toKeep->relid, NULL, NULL);
@@ -1870,6 +1872,9 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
 				  toRemove->relid, toKeep->relid);
 	replace_varno((Node *) root->processed_groupClause,
 				  toRemove->relid, toKeep->relid);
+	replace_relid(root->all_result_relids, toRemove->relid, toKeep->relid);
+	replace_relid(root->leaf_result_relids, toRemove->relid, toKeep->relid);
+
 
 	/*
 	 * There may be references to the rel in root->fkey_list, but if so,


Re: Revise the Asserts added to bimapset manipulation functions

2023-12-29 Thread David Rowley
On Fri, 29 Dec 2023 at 21:07, Richard Guo  wrote:
> It seems to me that the former scenario also makes sense in some cases.
> For instance, let's say there are two pointers in two structs, s1->p and
> s2->p, both referencing the same bitmapset.  If we modify the bitmapset
> via s1->p (even if no reallocation could occur), s2 would see different
> content via its pointer 'p'.

That statement simply isn't true.  If there's no reallocation then
both pointers point to the same memory and, therefore have the same
content, not different content.  In the absence of a reallocation,
then the only time s1->p and s2->p could differ is if s1->p became an
empty set as a result of the modification.

>  Sometimes this is just wrong and could
> cause problems.  If we can force a reallocation for every modification
> of the bitmapset, it'd be much easier to find such bugs.

Whether it's intended or unintended, at least it's consistent,
therefore isn't going to behave differently if the number of
bitmapwords in the set changes. All REALLOCATE_BITMAPSETS does for
bms_int_members(), bms_join() and bms_del_members() is change one
consistent behaviour (we never reallocate) into some other consistent
behaviour (we always reallocate).

If we make REALLOCATE_BITMAPSETS work how I propose in my patch then
the reallocation is just limited to cases where the set *could* be
repalloced by a modification.  That change gives us consistent
behaviour as the set is *always* reallocated when it *could* be
reallocated.  Making it consistent to me, seems good as a debug
option. Swapping one consistent behaviour for another (as you propose)
seems pointless and more likely to force us to change code that works
perfectly fine today.

In any case, the comments in bms_int_members(), bms_join() and
bms_del_members() are now only true when REALLOCATE_BITMAPSETS is not
defined. Are you proposing we leave those comments outdated? or do you
propose that we mention that the left inputs are not recycled when
building with REALLOCATE_BITMAPSETS?  In my view, neither of these is
acceptable as often the primary reason to use something like
bms_int_members() over bms_intersect() is exactly because the left
input is recycled.  I don't want people to have to start contorting
code because bms_int_members()'s left input recycling cannot be relied
on.
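
For reference, a small sketch of the recycling contract being discussed
(illustrative, not test code):

#include "postgres.h"
#include "nodes/bitmapset.h"

/*
 * bms_int_members() modifies and returns its left input (an intersection
 * never grows a set, so no repalloc is needed), only returning something
 * else when the result is empty.  bms_intersect() would instead allocate
 * a fresh set and leave both inputs untouched.
 */
static void
int_members_recycles_left_input(void)
{
	Bitmapset  *a = bms_add_member(bms_make_singleton(1), 2);
	Bitmapset  *b = bms_make_singleton(2);

	a = bms_int_members(a, b);	/* reuses a's memory in place */

	bms_free(a);
	bms_free(b);
}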

David




Commitfest 2024-01 starting in 3 days!

2023-12-29 Thread vignesh C
Hi,

Commitfest 2024-01 is starting in 3 days!
Please register any patches that have not yet been registered. Also, if
you have a pending patch that is not yet submitted, please submit and
register it for the 2024-01 Commitfest.
I will be having a look at the commitfest entries, correcting the
status of any that need it, and updating the authors.
Kindly keep your patches updated so that the reviewers/committers can
review them and get them committed.

Regards,
Vignesh




Re: pg_upgrade and logical replication

2023-12-29 Thread vignesh C
On Thu, 28 Dec 2023 at 15:59, Amit Kapila  wrote:
>
> On Wed, Dec 13, 2023 at 12:09 PM vignesh C  wrote:
> >
> > Thanks for the comments, the attached v25 version patch has the
> > changes for the same.
> >
>
> I have looked at it again and made some cosmetic changes like changing
> some comments and a minor change in one of the error messages. See if
> the changes look okay to you.

Thanks, the changes look good.

Regards,
Vignesh




Re: Revise the Asserts added to bimapset manipulation functions

2023-12-29 Thread Richard Guo
On Fri, Dec 29, 2023 at 9:15 AM David Rowley  wrote:

> I looked into this a bit more and I just can't see why the current
> code feels like it must do the reallocation in functions such as
> bms_del_members().  We don't repalloc the set there, ever, so why do
> we need to do it when building with REALLOCATE_BITMAPSETS?  It seems
> to me the point of REALLOCATE_BITMAPSETS is to ensure that *if* we
> possibly could end up reallocating the set that we *do* reallocate.


I think the argument here is whether the REALLOCATE_BITMAPSETS option is
intended to force a reallocation for every modification of a bitmapset,
or only for modifications that could potentially require the set to be
reallocated.

IIUC, the v2 patch addresses the latter scenario.  I agree that it can
help find bugs in cases where there's more than one reference to a set,
and a modification that could reallocate the bitmapset might leave the
other references being dangling pointers.

It seems to me that the former scenario also makes sense in some cases.
For instance, let's say there are two pointers in two structs, s1->p and
s2->p, both referencing the same bitmapset.  If we modify the bitmapset
via s1->p (even if no reallocation could occur), s2 would see different
content via its pointer 'p'.  Sometimes this is just wrong and could
cause problems.  If we can force a reallocation for every modification
of the bitmapset, it'd be much easier to find such bugs.

Having said that, I think the code checked in by 71a3e8c43b and
7d58f2342b are far from perfect.  And I agree that the bms_copy_and_free
in v2 patch is a good idea, as well as the bms_is_valid_set.

Thanks
Richard