Re: Make ON_ERROR_STOP stop on shell script failure

2022-09-27 Thread walther

Fujii Masao:

One concern about this patch is that some applications already depend on
the current behavior of ON_ERROR_STOP, i.e., psql doesn't stop even when
the shell command returns non-zero exit code. If so, we might need to
extend ON_ERROR_STOP so that it accepts the following setting values.


I just got bitten by this and I definitely consider this a bug. I expect 
psql to stop when a shell script fails and I have ON_ERROR_STOP set. I 
don't think this should be made more complicated with different settings.


If someone needs to have ON_ERROR_STOP set, but continue execution after 
a certain shell command, they could still do something like this:


\! might_fail || true

Best

Wolfgang




Re: fixing CREATEROLE

2022-11-22 Thread walther

Robert Haas:

It seems
to me that the root of any fix in this area must be to change the rule
that CREATEROLE can administer any role whatsoever.


Agreed.


Instead, I propose
to change things so that you can only administer roles for which you
have ADMIN OPTION. [...] I'm curious to hear what other people think of
these proposals, [...]
Third, someone could well have a better or just
different idea how to fix the problems in this area than what I'm
proposing here.


Once you can restrict CREATEROLE to only manage "your own" (no matter 
how that is defined, e.g. via ADMIN or through some "ownership" concept) 
roles, the possibility to "namespace" those roles somehow will become a 
lot more important. For example in a multi-tenant setup in the same 
cluster, where each tenant has their own database and admin user with a 
restricted CREATEROLE privilege, it will very quickly be at least quite 
annoying to have conflicts with other tenants' role names. I'm not sure 
whether it could even be a serious problem, because I should still be 
able to GRANT my own roles to other roles from other tenants - and that 
could affect matching of +group records in pg_hba.conf?


My suggestion to $subject and the namespace problem would be to 
introduce database-specific roles, i.e. add a database column to 
pg_authid. Having this column set to 0 will make the role a cluster-wide 
role ("cluster role") just as currently the case. But having a database 
oid set will make the role exist in the context of that database only 
("database role"). Then, the following principles should be enforced:


- database roles can not share the same name with a cluster role.
- database roles can have the same name as database roles in other 
databases.
- database roles can not be members of database roles in **other** 
databases.
- database roles with CREATEROLE can only create or alter database roles 
in their own database, but not roles in other databases and also not 
cluster roles.
- database roles with CREATEROLE can GRANT all database roles in the 
same database, but only those cluster roles they have ADMIN privilege on.

- database roles with CREATEROLE can not set SUPERUSER.

To be able to create database roles with a cluster role, there needs to 
be some syntax, e.g. something like


CREATE ROLE name IN DATABASE dbname ...

A database role with CREATEROLE should not need to use that syntax, 
though - every CREATE ROLE should be IN DATABASE anyway.
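Under the principles above, name conflicts would then play out roughly 
like this (hypothetical syntax, none of this is implemented):

```sql
-- Hypothetical examples under this proposal:
CREATE ROLE tenant_admin IN DATABASE tenant1;  -- ok: database role
CREATE ROLE tenant_admin IN DATABASE tenant2;  -- ok: same name, different database
CREATE ROLE tenant_admin;                      -- error: a cluster role must not
                                               -- share its name with a database role
```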


With database roles, it would be possible to hand out CREATEROLE without 
the ability to grant SUPERUSER or any of the built-in roles. It would be 
much more useful on top of that, too. Not only is the namespace problem 
mentioned above solved, but it would also be possible to let pg_dump 
dump a whole database, including the database roles and their 
memberships. This would allow dumping (and restoring) a single 
tenant/application including the relevant roles and privileges - without 
dumping all roles in the cluster. Plus, it's backwards compatible 
because without creating database roles, everything stays exactly the same.


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-22 Thread walther

Robert Haas:

2. There are some serious implementation challenges because the
constraints on duplicate object names must be something which can be
enforced by unique constraints on the relevant catalogs. Off-hand, I
don't see how to do that. It would be easy to make the cluster roles
all have unique names, and it would be easy to make the database roles
have unique names within each database, but I have no idea how you
would keep a database role from having the same name as a cluster
role. For anyone to try to implement this, we'd need to have a
solution to that problem.


For each database created, create a partial unique index:

CREATE UNIQUE INDEX ... ON pg_authid (rolname) WHERE roldatabase IN (0, );


Is that possible on catalogs?

Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-22 Thread walther

Tom Lane:

No, we don't support partial indexes on catalogs, and I don't think
we want to change that.  Partial indexes would require expression
evaluations occurring at very inopportune times.


I see. Is that the same for indexes *on* an expression? Or would those 
be ok?


With a custom operator, an EXCLUDE constraint on the ROW(reldatabase, 
relname) expression could work. The operator would compare:

- (0, name1) and (0, name2) as name1 == name2
- (db1, name1) and (0, name2) as name1 == name2
- (0, name1) and (db2, name2) as name1 == name2
- (db1, name1) and (db2, name2) as db1 == db2 && name1 == name2

or just (db1 == 0 || db2 == 0 || db1 == db2) && name1 == name2.
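For illustration only, that comparison could be sketched as a plain SQL 
function - the real thing would have to be a C-level operator usable in 
a catalog EXCLUDE constraint:

```sql
-- Sketch of the proposed conflict predicate: two entries collide when
-- their names match and either one is a cluster role (database oid 0)
-- or both belong to the same database.
CREATE FUNCTION roles_conflict(db1 oid, name1 name, db2 oid, name2 name)
  RETURNS boolean
  LANGUAGE sql IMMUTABLE
  AS $$ SELECT (db1 = 0 OR db2 = 0 OR db1 = db2) AND name1 = name2 $$;
```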

Now, you are going to tell me that EXCLUDE constraints are not supported 
on catalogs either, right? ;)


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-22 Thread walther

Wolfgang Walther:

Tom Lane:

No, we don't support partial indexes on catalogs, and I don't think
we want to change that.  Partial indexes would require expression
evaluations occurring at very inopportune times.


I see. Is that the same for indexes *on* an expression? Or would those 
be ok?


With a custom operator, an EXCLUDE constraint on the ROW(reldatabase, 
relname) expression could work. The operator would compare:

- (0, name1) and (0, name2) as name1 == name2
- (db1, name1) and (0, name2) as name1 == name2
- (0, name1) and (db2, name2) as name1 == name2
- (db1, name1) and (db2, name2) as db1 == db2 && name1 == name2

or just (db1 == 0 || db2 == 0 || db1 == db2) && name1 == name2.


Does it even need to be on the expression? I don't think so. It would be 
enough to just make it compare relname WITH = and reldatabase WITH the 
custom operator (db1 == 0 || db2 == 0 || db1 == db2), right?


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-23 Thread walther

Robert Haas:

I have to admit that when I realized that was the natural place to put
them to make the patch work, my first reaction internally was "well,
that can't possibly be right, role properties suck!". But I didn't and
still don't see where else to put them that makes any sense at all, so
I eventually decided that my initial reaction was misguided. So I
can't really blame you for not liking it either, and would be happy if
we could come up with something else that feels better. I just don't
know what it is: at least as of this moment in time, I believe these
naturally ARE properties of the role [...]

That might be the wrong view. As I say, I'm open to other ideas, and
it's possible there's some nicer way to do it that I just don't see
right now.


INHERITCREATEDROLES and SETCREATEDROLES behave much like DEFAULT 
PRIVILEGES. What about something like:


ALTER DEFAULT PRIVILEGES FOR alice
GRANT TO alice WITH INHERIT FALSE, SET TRUE, ADMIN TRUE

The "abbreviated grant" is very much abbreviated, because the original 
syntax GRANT a TO b is already quite short to begin with, i.e. there is 
no ON ROLE or something like that in it.


The initial DEFAULT privilege would be INHERIT FALSE, SET FALSE, ADMIN 
TRUE, I guess?


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-28 Thread walther

Robert Haas:

I don't know if changing the syntax from A to B is really getting us
anywhere. I generally agree that the ALTER DEFAULT PRIVILEGES syntax
looks nicer than the CREATE/ALTER ROLE syntax, but I'm not sure that's
a sufficient reason to move the control over this behavior to ALTER
DEFAULT PRIVILEGES.


The list of role attributes can currently be roughly divided into the 
following categories:
- Settings with role-specific values: CONNECTION LIMIT, PASSWORD, VALID 
UNTIL. It's hard to imagine storing them anywhere else, because they 
need to have a different value for each role. Those are not just "flags" 
like all the other attributes.
- Two special attributes in INHERIT and BYPASSRLS regarding 
security/privileges. Those were invented because there was no other 
syntax to do the same thing. Those could be interpreted as privileges to 
do something, too - but the ability to grant them explicitly is lacking. 
There is no SET BYPASSRLS ON/OFF or SET INHERIT ON/OFF. Of course the INHERIT 
case is now a bit different, because there is the inherit grant option 
you introduced.
- Cluster-wide privileges: SUPERUSER, CREATEDB, CREATEROLE, LOGIN, 
REPLICATION. Those can't be granted on some kind of object, because 
there is no such global object. You could imagine inventing some kind of 
global CLUSTER object and do something like GRANT SUPERUSER ON CLUSTER 
TO alice; instead. Turning those into role attributes was the choice 
made instead. Most likely it would have been only a syntactic difference 
anyway: Even if there was something like GRANT .. ON CLUSTER, you'd most 
likely implement that as... storing those grants as role attributes.


Your patch is introducing a new category of role attributes - those that 
are affecting default behavior. But there is already a way to express 
this right now, and that's ALTER DEFAULT PRIVILEGES in this case. Imho, 
the question asked should not be "why change from syntax A to B?" but 
rather: Why introduce a new category of role attributes, when there is a 
way to express the same concept already? I can't see any compelling 
reason for that, yet.


Or not "yet", but rather "anymore". If I understood and remember 
correctly, you implemented it in a way that a user could not change 
those new attributes on their own role. This is in fact different to how 
ALTER DEFAULT PRIVILEGES works, so you could have made an argument that 
this was better expressed as role attributes. But I think this was asked 
and agreed on to act differently, so that the user can change this 
default behavior of what happens when they create a role for themselves. 
And now this reason is gone - there is no reason NOT to implement it as 
DEFAULT PRIVILEGES.



One thing to consider is that, as I've designed
this, whether or not ADMIN is included in the grant is non-negotiable.
I am, at least at present, inclined to think that was the right call,
partly because Mark Dilger expressed a lot of concern about the
CREATEROLE user losing control over the role they'd just created, and
allowing ADMIN to be turned off would have exactly that effect. Plus a
grant with INHERIT FALSE, SET FALSE, ADMIN FALSE would end up being 
almost identical to no grant at all, which seems pointless. Basically, 
without ADMIN, the implicit GRANT fails to accomplish its intended
purpose, so I don't like having that as a possibility.


With how you implemented it right now, is it possible to do the following?

CREATE ROLE alice;
REVOKE ADMIN OPTION FOR alice FROM CURRENT_USER;

If the answer is yes, then there is no reason to allow a user to set a 
shortcut for SET and INHERIT, but not for ADMIN.


If the answer is no, then you could just not allow specifying the ADMIN 
option in the ALTER DEFAULT PRIVILEGES statement and always force it to 
be TRUE.




The other thing that's a little weird about the syntax which you
propose is that it's not obviously related to CREATE ROLE. The intent
of the patch as implemented is to allow control over only the implicit
GRANT that is created when a new role is created, not all grants that
might be created by or to a particular user. Setting defaults for all
grants doesn't seem like a particularly good idea to me, but it's
definitely a different idea than what the patch proposes to do.


Before I proposed that, I was confused about this for a moment, too - 
but it turns out to be wrong. ALTER DEFAULT PRIVILEGES in general works as:


When object A is created, issue a GRANT ON A automatically.

In my proposal, the "object" is not the GRANT of that role. It's the 
role itself. So the default privileges express what should happen when 
the role is created. The default privileges would NOT affect a regular 
GRANT role TO role_spec command. They only run that command when a role 
is created.
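For comparison, this is how existing ALTER DEFAULT PRIVILEGES behaves 
for tables (real syntax; role names are made up):

```sql
-- This makes every table that alice creates from now on readable by
-- bob. It fires only at CREATE TABLE time - it does not change the
-- behavior of regular GRANT commands, and it does not affect tables
-- created by anyone else.
ALTER DEFAULT PRIVILEGES FOR ROLE alice
  GRANT SELECT ON TABLES TO bob;
```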



I did spend some time thinking about trying to tie this to the
CREATEROLE syntax itself. For example, instead of CREATE ROLE alice
CREATEROLE INHERITCREATEDROLES SETCREATEDROLES you could write CREATE
ROLE alic

Re: fixing CREATEROLE

2022-11-28 Thread walther

David G. Johnston:

A quick tally of the thread so far:

No Defaults needed: David J., Mark?, Tom?
Defaults needed - attached to role directly: Robert
Defaults needed - defined within Default Privileges: Walther?


s/Walther/Wolfgang

The capability itself seems orthogonal to the rest of the patch to track 
these details better.  I think we can "Fix CREATEROLE" without any 
feature regarding optional default behaviors and would suggest this 
patch be so limited and that another thread be started for discussion of 
(assuming a default specifying mechanism is wanted overall) how it 
should look.  Let's not let a usability debate distract us from fixing a 
real problem.


+1

I didn't argue for whether defaults are needed in this case or not. I 
just said that ADP is better for defaults than role attributes are. Or 
the other way around: I think role attributes are not a good way to 
express those.


Personally, I'm in the No Defaults needed camp, too.

Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-28 Thread walther

Robert Haas:

In my proposal, the "object" is not the GRANT of that role. It's the
role itself. So the default privileges express what should happen when
the role is created. The default privileges would NOT affect a regular
GRANT role TO role_spec command. They only run that command when a role
is created.


I agree that this is what you are proposing, but it is not what your
proposed syntax says. Your proposed syntax simply says ALTER DEFAULT
PRIVILEGES .. GRANT. Users who read that are going to think it
controls the default behavior for all grants, because that's what the
syntax says. If the proposed syntax mentioned CREATE ROLE someplace,
maybe that would have some potential. A proposal to make a command
that controls CREATE ROLE and only CREATE ROLE and mentions neither
CREATE nor ROLE anywhere in the syntax is never going to be
acceptable.


Yes, I agree - the abbreviated GRANT syntax is confusing/misleading in 
that case. Consistent with the other syntaxes, but easily confused 
nonetheless.



It's no. Well, OK, you can do it, but it doesn't revoke anything,
because you can only revoke your own grant, not the bootstrap
superuser's grant.


Ah, I see. I hadn't understood that difference regarding the bootstrap 
superuser until now.


So in that sense, the ADP GRANT would be an additional GRANT issued by 
the user that created the role in addition to the bootstrap superuser's 
grant. You can't revoke the bootstrap superuser's grant - but you can't 
modify it either. And there is no need to add SET or INHERIT to the 
bootstrap superuser's grant, because you can grant the role yourself 
again, with those options.


I think it would be very strange to have a default for that bootstrap 
superuser's grant. Or rather: A different default than the minimum 
required - and that's just ADMIN, not SET, not INHERIT. When you have 
the minimum, you can always choose to grant SET and INHERIT later on 
yourself - and revoke it, too! But when the SET and INHERIT are on the 
bootstrap superuser's grant - then there is no way for you to revoke SET 
or INHERIT on that grant anymore later.


Why should the superuser, who gave you CREATEROLE, insist on you having 
SET or INHERIT forever and disallow revoking them from yourself?
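With the grant options in PostgreSQL 16, that self-service would look 
roughly like this (a sketch, assuming alice holds ADMIN on bob):

```sql
-- alice's own grant, issued in addition to the bootstrap superuser's
-- ADMIN-only grant:
GRANT bob TO alice WITH SET TRUE, INHERIT TRUE;

-- ...and alice can take those options back again later:
REVOKE SET OPTION FOR bob FROM alice;
REVOKE INHERIT OPTION FOR bob FROM alice;
```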


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-28 Thread walther

Mark Dilger:

Robert's patch tries to deal with the (possibly unwanted) role membership by 
setting up defaults to mitigate the effects, but that is more confusing to me 
than just de-conflating role membership from role administration, and giving 
role creators administration over roles they create, without in so doing giving 
them role membership.  I don't recall enough details about how hard it is to 
de-conflate role membership from role administration, and maybe that's a 
non-starter for reasons I don't recall at the moment.


Isn't this just GRANT .. WITH SET FALSE, INHERIT FALSE, ADMIN TRUE? That 
should allow role administration, without actually granting membership 
in that role, yet, right?


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-28 Thread walther

Mark Dilger:

Isn't this just GRANT .. WITH SET FALSE, INHERIT FALSE, ADMIN TRUE? That should 
allow role administration, without actually granting membership in that role, 
yet, right?


Can you clarify what you mean here?  Are you inventing a new syntax?

+GRANT bob TO alice WITH SET FALSE, INHERIT FALSE, ADMIN TRUE;
+ERROR:  syntax error at or near "SET"
+LINE 1: GRANT bob TO alice WITH SET FALSE, INHERIT FALSE, ADMIN TRUE...


This is valid syntax on latest master.

Best,

Wolfgang





Re: fixing CREATEROLE

2022-11-28 Thread walther

Robert Haas:

And the result is that I've got like five people, some of whom
participated in those discussions, showing up to say "hey, we don't
need the ability to set defaults." Well, if that's the case, then why
did we have hundreds and hundreds of emails within the last 12 months
arguing about which way it should work?


For me: "Needed" as in "required". I don't think we *require* defaults 
to make this useful, just as David said as well. Personally, I don't 
need defaults either, at least I didn't have a use-case for it, yet. I'm 
not objecting to introducing defaults, but I do object to *how* they were 
introduced in your patch set, so far. It just wasn't consistent with the 
other stuff that already exists.


Best,

Wolfgang




Re: fixing CREATEROLE

2022-11-28 Thread walther

Robert Haas:

1) Automatically install an additional membership grant, with the CREATEROLE 
user as the grantor, specifying INHERIT OR SET as TRUE (I personally favor 
attaching these to ALTER ROLE, modifiable only by oneself)


Hmm, that's an interesting alternative to what I actually implemented.
Some people might like it better, because it puts the behavior fully
under the control of the CREATEROLE user, which a number of you seem
to favor.


+1


I suppose if we did it that way, it could even be a GUC, like
create_role_automatic_grant_options.


I don't think using GUCs for that is any better. ALTER DEFAULT 
PRIVILEGES is the correct way to do it. The only argument against it 
was, so far, that it's easy to confuse with default options for newly 
created role grants, due to the abbreviated grant syntax.


I propose a slightly different syntax instead:

ALTER DEFAULT PRIVILEGES GRANT CREATED ROLE TO role_specification WITH ...;

This, together with the proposal above regarding the grantor, should be 
consistent.


Is there any other argument to be made against ADP?

Note that ADP allows much more than just creating a grant for the 
CREATEROLE user, which would be the case if the default GRANT was made 
TO the_create_role_user. But it could be made towards *other* users as 
well, so you could do something like this:


CREATE ROLE alice CREATEROLE;
CREATE ROLE bob;

ALTER DEFAULT PRIVILEGES FOR alice GRANT CREATED ROLE TO bob WITH SET 
TRUE, INHERIT FALSE;


This is much more flexible than role attributes or GUCs.

Best,

Wolfgang




Re: Updatable Views and INSERT INTO ... ON CONFLICT

2022-09-02 Thread walther

Joel Jacobson:
I note it's not yet possible to INSERT INTO an Updatable View using the 
ON CONFLICT feature.


To be clear, it seems to be supported for AUTO-updatable views and for 
views with manually created RULES, but not for views with INSTEAD OF 
triggers.


Not saying it is desired, just trying to better understand the limits of 
Updatable Views.


It's certainly desired. I tried to use it in the past.

> Are there reasons why it would not be possible to develop support for
> INSERT INTO ... ON CONFLICT for Updatable Views?

I think the main challenge is that, when a view has an INSTEAD OF INSERT 
trigger, the INSERT statement inside the trigger function is not the 
same statement that was issued against the view. Auto-updatable views 
rewrite the original query, so they can support this.


For this to work, the outer INSERT would have to "catch" the error that 
the trigger function throws on a conflict - and then the outer INSERT 
would have to execute an UPDATE on the view instead.


I don't know about the internals of INSERT .. ON CONFLICT, but I'd 
assume the conflict handling + update happens much later than calling 
the INSTEAD OF trigger, which makes it impossible to do right now.


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread walther

Robert Haas:

Fairly obviously, my thinking here is biased by having written the
patch to allow restricting SET ROLE. If alice can neither inherit
bob's privileges nor SET ROLE bob, she had better not be able to
create objects owned by bob, because otherwise she can make a table,
add an expression index that calls a user-defined function, do stuff
until it needs to be autovacuumed, and then give it to bob, and boom,
exploit. But that doesn't mean that the is_member_of_role() tests here
have to be changed to has_privs_of_role(). They could be changed to
has_privs_of_role() || member_can_set_role(). And if the consensus is
to do it that way, I'm OK with that.


A different line of thought (compared to the "USAGE" privilege I 
discussed earlier), would be:

To transfer ownership of an object, you need two sets of privileges:
- You need to have the privilege to initiate a request to transfer 
ownership.

- You need to have the privilege to accept a request to transfer ownership.

Let's imagine there'd be such a request created temporarily, then when I 
start the process of changing ownership, I would have to change to the 
other role and then accept that request.


In theory, I could also inherit that privilege, but that's not how the 
system works today. By using is_member_of_role, the decision was already 
made that this should not depend on inheritance. What is left, is the 
ability to do it via SET ROLE only.


So it should not be has_privs_of_role() nor has_privs_of_role() || 
member_can_set_role(), as you suggested above, but rather just 
member_can_set_role() only. Of course, only in the context of the SET 
ROLE patch.


Basically, with that patch is_member_of_role() has to become 
member_can_set_role().



I'm just a little unconvinced that it's actually the best route. I
think that logic of the form "well Alice could just SET ROLE and do it
anyway" is weak -- and not only because of the patch to allow
restricting SET ROLE, but because AFAICT there is no point to the
INHERIT option in the first place unless it is to force you to issue
SET ROLE. That is literally the only thing it does. If we're going to
have weird exceptions where you don't have to SET ROLE after all, why
even have INHERIT in the first place?


As stated above, I don't think this is about INHERIT. INHERIT works fine 
both without the SET ROLE patch (and keeping is_member_of_role) and with 
the SET ROLE patch (and changing to member_can_set_role).


The exception is made, because there is no formal two-step process for 
requesting and accepting a transfer of ownership. Or alternatively: 
There is no exception, it's just that during the command to transfer 
ownership, the current role has to be changed temporarily to the 
accepting role. And that's the same as checking is_member_of_role or 
member_can_set_role, respectively.


Best

Wolfgang




Re: Possibility to disable `ALTER SYSTEM`

2024-03-19 Thread walther

Greg Sabino Mullane:
On Tue, Mar 19, 2024 at 12:05 PM Tom Lane wrote:


If you aren't willing to build a solution that blocks off mods
using COPY TO FILE/PROGRAM and other readily-available-to-superusers
tools (plpythonu for instance), I think you shouldn't bother asking
for a feature at all.  Just trust your superusers.


There is a huge gap between using a well-documented standard tool like 
ALTER SYSTEM and going out of your way to modify the configuration files 
through trickery. I think we need to only solve the former as in "hey, 
please don't do that because your changes will be overwritten"


Recap: The requested feature is not supposed to be a security feature. 
It is supposed to prevent the admin from accidentally doing the wrong 
thing - but not from willfully doing the same through different means.


This very much sounds like a "warning" - how about turning the feature 
into one?


Have a GUC warn_on_alter_system = "", which allows the 
Kubernetes operator to set it to something like "hey, please don't do 
that because your changes will be overwritten. Use xyz operator instead.".


This will hardly be taken as a security feature by anyone, but should 
essentially achieve what is asked for.


A more sophisticated way would be to make that GUC throw an error, but 
have a syntax for ALTER SYSTEM to override this - i.e. similar to a 
--force flag.


Best,

Wolfgang




Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

2024-03-21 Thread walther

Tom Lane:

Actually, roles_is_member_of sucks before v16 too; the new thing
is only that it's being invoked during GRANT ROLE.  Using the
roles created by the given test case, I see in v15:

[...]
So it takes ~3.5s to populate the roles_is_member_of cache for "acc"
given this membership set.  This is actually considerably worse than
in v16 or HEAD, where the same test takes about 1.6s for me.


Ah, this reminds me that I hit the same problem about a year ago, but 
haven't had the time to put together a test-case, yet. In my case, it's 
like this:
- I have one role "authenticator" with which the application (PostgREST) 
connects to the database.
- This role has been granted all of the actual user roles and will then 
do a SET ROLE for each authenticated request it handles.
- In my case that's currently about 120k roles granted to 
"authenticator", back then it was probably around 60k.
- The first request (SET ROLE) for each session took between 5 and 10 
*minutes* to succeed - subsequent requests were instant.
- When the "authenticator" role is made SUPERUSER, the first request is 
instant, too.


I guess this matches exactly what you are observing.
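For reference, the setup looks roughly like this (a sketch with made-up 
role names; the real instance has ~120k grants):

```sql
-- One login role for the application, with many user roles granted to it:
CREATE ROLE authenticator LOGIN;

DO $$
BEGIN
  FOR i IN 1..60000 LOOP
    EXECUTE format('CREATE ROLE user_%s', i);
    EXECUTE format('GRANT user_%s TO authenticator', i);
  END LOOP;
END $$;

-- In a fresh session connected as authenticator, the first SET ROLE has
-- to populate the roles_is_member_of cache - and that is the slow part:
SET ROLE user_42;
```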

There is one more thing that is actually even worse, though: When you 
try to cancel the query or terminate the backend while the SET ROLE is 
still running, this will not work. It will not only not cancel the 
query, but somehow leave the process for that backend in some kind of 
zombie state that is impossible to recover from.


All of this was v15.

Best,

Wolfgang




Re: Building with meson on NixOS/nixpkgs

2024-03-29 Thread walther

Wolfgang Walther:
To build on NixOS/nixpkgs I came up with a few small patches to 
meson.build. All of this works fine with Autoconf/Make already.


In v3, I added another small patch for meson, this one about proper 
handling of -Dlibedit_preferred when used together with -Dreadline=enabled.


Best,

Wolfgang

From 1e7db8b57f69ed9866fdf874bbd0dcb33d25c045 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 17:18:38 +0100
Subject: [PATCH v3 1/4] Fallback to uuid for ossp-uuid with meson

The upstream name for the ossp-uuid package / pkg-config file is "uuid". Many
distributions change this to be "ossp-uuid" to not conflict with e2fsprogs.

This lookup fails on distributions which don't change this name, for example
NixOS / nixpkgs. Both "ossp-uuid" and "uuid" are also checked in configure.ac.
---
 meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 18b5be842e3..4be5f65e8b6 100644
--- a/meson.build
+++ b/meson.build
@@ -1346,7 +1346,8 @@ if uuidopt != 'none'
 uuidfunc = 'uuid_to_string'
 uuidheader = 'uuid.h'
   elif uuidopt == 'ossp'
-uuid = dependency('ossp-uuid', required: true)
+# upstream is called "uuid", but many distros change this to "ossp-uuid"
+uuid = dependency('ossp-uuid', 'uuid', required: true)
 uuidfunc = 'uuid_export'
 uuidheader = 'uuid.h'
   else
-- 
2.44.0

From 28e8067b6b04ac601946fab4aa0849b3dfec5a7d Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 22:06:25 +0100
Subject: [PATCH v3 2/4] Fallback to clang in PATH with meson

Some distributions put clang into a different path than the llvm binary path.

For example, this is the case on NixOS / nixpkgs, which failed to find clang
with meson before this patch.
---
 meson.build | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 4be5f65e8b6..4a6af978fd4 100644
--- a/meson.build
+++ b/meson.build
@@ -759,7 +759,10 @@ if add_languages('cpp', required: llvmopt, native: false)
 llvm_binpath = llvm.get_variable(configtool: 'bindir')
 
 ccache = find_program('ccache', native: true, required: false)
-clang = find_program(llvm_binpath / 'clang', required: true)
+
+# Some distros put LLVM and clang in different paths, so fallback to
+# find via PATH, too.
+clang = find_program(llvm_binpath / 'clang', 'clang', required: true)
   endif
 elif llvmopt.auto()
   message('llvm requires a C++ compiler')
-- 
2.44.0

From 01fda05ea14e8cf0d201b749c84b1f3cb532f353 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Mon, 11 Mar 2024 19:54:41 +0100
Subject: [PATCH v3 3/4] Support absolute bindir/libdir in regression tests
 with meson

Passing an absolute bindir/libdir will install the binaries and libraries to
/tmp_install/ and /tmp_install/ respectively.

This is path is correctly passed to the regression test suite via configure/make,
but not via meson, yet. This is because the "/" operator in the following expression
throws away the whole left side when the right side is an absolute path:

  test_install_location / get_option('libdir')

This was already correctly handled for dir_prefix, which is likely absolute as well.
This patch handles both bindir and libdir in the same way - prefixing absolute paths
with the tmp_install path correctly.
---
 meson.build | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 4a6af978fd4..3e628659843 100644
--- a/meson.build
+++ b/meson.build
@@ -3048,15 +3048,17 @@ test_install_destdir = meson.build_root() / 'tmp_install/'
 if build_system != 'windows'
   # On unixoid systems this is trivial, we just prepend the destdir
   assert(dir_prefix.startswith('/')) # enforced by meson
-  test_install_location = '@0@@1@'.format(test_install_destdir, dir_prefix)
+  temp_install_bindir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_bin)
+  temp_install_libdir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_lib)
 else
   # drives, drive-relative paths, etc make this complicated on windows, call
   # into a copy of meson's logic for it
   command = [
 python, '-c',
 'import sys; from pathlib import PurePath; d1=sys.argv[1]; d2=sys.argv[2]; print(str(PurePath(d1, *PurePath(d2).parts[1:])))',
-test_install_destdir, dir_prefix]
-  test_install_location = run_command(command, check: true).stdout().strip()
+test_install_destdir]
+  temp_install_bindir = run_command(command, dir_prefix / dir_bin, check: true).stdout().strip()
+  temp_install_libdir = run_command(command, dir_prefix / dir_lib, check: true).stdout().st

Re: Building with musl in CI and the build farm

2024-03-30 Thread walther
Here's an update on the progress to run musl (Alpine Linux) in the 
buildfarm.


Wolfgang Walther:
The animal runs in a docker container via GitHub Actions in [2]. Right 
now it's still running with --test, until I get the credentials to 
activate it.


The animals have been activated and are reporting now. Thanks, Andrew!


I tried to enable everything (except systemd, because Alpine doesn't 
have it) and run all tests. The LDAP tests are failing right now, but 
that is likely something that I need to fix in the Dockerfile - it's 
failing to start the slapd, IIRC. There are other issues, though - all 
of them have open pull requests in that repo [3].


ldap tests are enabled, just a missing package.


I also had to skip the recovery check. Andrew mentioned that he had to 
do that, too, when he was still running his animal on Alpine. Not sure 
what this is about, yet.


This was about a missing init process in the docker image. Without an 
init process reaping zombie processes, the recovery tests end up with 
some supposed-to-be-terminated backends still running and can't start 
them up again. Fixed by adding a minimal init process with "tinit".



Building --with-icu fails two tests. One of them (001_initdb) is fixed 
by having the "locale" command in your PATH, which is not the case on 
Alpine by default. I assume this will not break on your debian/musl 
build, Andres - but it will also probably not return any sane values, 
because it will run glibc's locale command.
I haven't looked into that in detail, yet, but I think the other test 
(icu/010_database) fails because it expects that setlocale(LC_COLLATE, 
) throws an error. I think it doesn't do that on musl, 
because LC_COLLATE is not implemented.
Those failing tests are not "just failing", but probably mean that we 
need to do something about how we deal with locale/setlocale on musl.
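The difference can be observed through Python's thin wrapper around setlocale(3) - a sketch of the presumed mechanism, not the actual Postgres test:

```python
import locale

# glibc rejects unknown locale names: setlocale() returns NULL, which
# Python surfaces as locale.Error.  musl's locale categories are mostly
# stubs, so the same call tends to succeed there - presumably why the
# test's expected error never happens on Alpine.
try:
    locale.setlocale(locale.LC_COLLATE, "xx_NOTALOCALE")
    print("accepted")
except locale.Error:
    print("rejected")  # expected on glibc
```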


I still need to look into this in depth.


The last failure is about building --with-nls. This fails with something 
like:


ld: src/port/strerror.c:72:(.text+0x2d8): undefined reference to 
`libintl_gettext'


Of course, gettext-dev is installed, libintl.so is available in /usr/lib 
and it also contains the symbol. So not sure what's happening here.


This is an Alpine Linux packaging issue. Theoretically, it could be made 
to work by introducing some configure/meson flag like "--with-gettext" 
or so, to prefer gettext's libintl over the libc-builtin. However, NixOS 
/ nixpkgs with its pkgsMusl overlay manages to solve this issue just 
fine, builds with --enable-nls and gettext work. Thus, I conclude this 
is best solved upstream in Alpine Linux.


TLDR: The only real issue which is still open from PostgreSQL's side is 
around locales and ICU - certainly the pain point in musl. Will look 
into it further.


Best,

Wolfgang




Re: Building with musl in CI and the build farm

2024-03-31 Thread walther

About building one of the CI tasks with musl:

Andres Freund:

I'd rather adapt one of the existing tasks, to avoid increasing CI costs unduly.


I looked into this and I think the only task that could be changed is 
the SanityCheck. This is because this builds without any additional 
features enabled. I guess that makes sense, because otherwise those 
dependencies would first have to be built with musl-gcc as well.




FWIW, except for one small issue, building postgres against musl works on 
debian and the tests pass if I install first.


After the fix for LD_LIBRARY_PATH this now works as expected without 
installing first. I confirmed it works on debian with CC=musl-gcc.




The small problem mentioned above is that on debian linux/fs.h isn't available
when building with musl, which in turn causes src/bin/pg_upgrade/file.c to
fail to compile.


According to [1], this can be worked around by linking some folders:

ln -s /usr/include/linux /usr/include/x86_64-linux-musl/
ln -s /usr/include/asm-generic /usr/include/x86_64-linux-musl/
ln -s /usr/include/x86_64-linux-gnu/asm /usr/include/x86_64-linux-musl/

Please find a patch to use musl-gcc in SanityCheck attached. Logs from 
the CI run are in [2]. It has this in the configure phase:


[13:19:52.712] Using 'CC' from environment with value: 'ccache musl-gcc'
[13:19:52.712] C compiler for the host machine: ccache musl-gcc (gcc 
10.2.1 "cc (Debian 10.2.1-6) 10.2.1 20210110")

[13:19:52.712] C linker for the host machine: musl-gcc ld.bfd 2.35.2
[13:19:52.712] Using 'CC' from environment with value: 'ccache musl-gcc'

So meson picks up musl-gcc properly. I also double checked that without 
the links above, the build does indeed fail with the linux/fs.h error.


I assume the installation of musl-tools should be done in the 
pg-vm-images repo instead of the additional script here?


Best,

Wolfgang

[1]: 
https://debian-bugs-dist.debian.narkive.com/VlFkLigg/bug-789789-musl-fails-to-compile-stuff-that-depends-on-kernel-headers

[2]: https://cirrus-ci.com/task/5741892590108672

From 4a69d9851e7bbd7cd521d236847af9ebf5e6253b Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 31 Mar 2024 15:17:43 +0200
Subject: [PATCH] Build SanityCheck against musl

---
 .cirrus.tasks.yml | 9 +
 1 file changed, 9 insertions(+)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 72f553e69f4..5815c51abbe 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -69,6 +69,7 @@ task:
 CCACHE_DIR: ${CIRRUS_WORKING_DIR}/ccache_dir
 # no options enabled, should be small
 CCACHE_MAXSIZE: "150M"
+CC: ccache musl-gcc
 
   # While containers would start up a bit quicker, building is a bit
   # slower. This way we don't have to maintain a container image.
@@ -77,6 +78,14 @@ task:
   ccache_cache:
 folder: $CCACHE_DIR
 
+  setup_additional_packages_script: |
+apt-get update
+DEBIAN_FRONTEND=noninteractive apt-get -y install musl-tools
+ln -s -t /usr/include/x86_64-linux-musl/ \
+  /usr/include/linux \
+  /usr/include/asm-generic \
+  /usr/include/x86_64-linux-gnu/asm
+
   create_user_script: |
 useradd -m postgres
 chown -R postgres:postgres .
-- 
2.44.0



Re: Security lessons from liblzma

2024-04-01 Thread walther
I looked through the repositories of 19 linux distros [1] to see what 
kind of patches are applied often. Many of them share the same package 
managers / repositories and thus the same patches. I made sure to look 
at some smaller, "other" distros as well, to see what kind of problems 
appear outside the mainstream distros.


Andres Freund:

I've previously proposed a patch to make pkglibdir configurable, I think we
should just go for that.


+1. Other paths, which some distros need to configure are pkgincludedir 
and pgxs (independently of pkglibdir).


Also a configurable directory to look up extensions, possibly even to be 
changed at run-time like [2]. The patch says this:



This directory is prepended to paths when loading extensions (control and SQL 
files), and to the '$libdir' directive when loading modules that back 
functions. The location is made configurable to allow build-time testing of 
extensions that have not been installed to their proper location yet.


This seems like a great thing to have. This might also be relevant in 
light of recent discussions in the ecosystem around extension management.


All the path-related issues have in common, that while it's easy to move 
files around to their proper locations later, they all need to adjust 
pg_config's output.



Andres Freund:

For the various defines, ISTM we should just make them #ifndef guarded, then
they could be overridden by defining them at configure time. Some of them,
like DEFAULT_PGSOCKET_DIR seem to be overridden by just about every
distro. And others would be nice to easily override anyway, I e.g. dislike the
default DEFAULT_PAGER value.


DEFAULT_PAGER is also overridden by a few distros. DEFAULT_EDITOR by one 
as well. As you said DEFAULT_PGSOCKET_DIR in **a lot** of them.



Andres Freund:

postgresql-17-rpm-pgsql.patch:

We should probably make this stuff properly configurable. The current logic
with inferring whether to add /postgresql is just weird. Perhaps a configure
option that defaults to the current logic when set to an empty string but can
be overridden?


+1 for that option to force the suffix, no matter whether postgres/pgsql 
are in the path already.



Some other commonly patched issues are:
- Building only libpq, only libecpg - or disabling both and building 
only the server code. This comes up often, it's not supported nicely in 
our build system, yet. I think meson already has some build targets for 
parts of that, but it's very hard / impossible to then install only this 
subset of the build. It would be great to be able to build and install 
only the frontend code (including only the frontend headers) or only the 
backend code (including headers) etc.
- Related to the above is pg_config and how to deal with it when 
installing separate client and server packages, in different locations, 
too. Some distros provide a second version of pg_config as pg_server_config.


Those two issues and the path-related issues above make it harder than 
it should be to provide separate packages for each major version, which 
can be installed at the same time (versioned prefixes, multiple server 
packages, but maybe only a single libpq package etc.).



Some small patches that might not be widespread, but could possibly 
still be upstreamed:

- jit-s390x [3] (Alpine, Debian, Fedora, openSUSE)
- pg_config --major-version option for extension builds [4] (Alpine)
- Some fixes for man pages [5] (AlmaLinux, CentOS, Fedora)


TLDR: I think it should be possible to make the build system more 
flexible in some areas without introducing distro specific things in 
core. This should eliminate the need for many of the same patches across 
the board for a lot of distros.


Best,

Wolfgang

[1]: ALT Linux, Adélie Linux, AlmaLinux, Alpine Linux, Arch Linux, 
CentOS, Crux, Debian, Fedora, Gentoo, GoboLinux, Guix, Mandriva, NixOS, 
OpenWRT, Rocky Linux, Ubuntu, Void Linux, openSUSE


[2]: 
https://salsa.debian.org/postgresql/postgresql/-/blob/17/debian/patches/extension_destdir?ref_type=heads


[3]: 
https://salsa.debian.org/postgresql/postgresql/-/blob/17/debian/patches/jit-s390x?ref_type=heads


[4]: 
https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/postgresql16/pg_config-add-major-version.patch?ref_type=heads


[5]: 
https://gitlab.com/redhat/centos-stream/rpms/postgresql/-/blob/c9s/postgresql-man.patch





Re: RFC: Additional Directory for Extensions

2024-04-03 Thread walther

Alvaro Herrera:

I support the idea of there being a second location from where to load
shared libraries ... but I don't like the idea of making it
runtime-configurable.  If we want to continue to tighten up what
superuser can do, then one of the things that has to go away is the
ability to load shared libraries from arbitrary locations
(dynamic_library_path).  I think we should instead look at making those
locations hardcoded at compile time.  The packager can then decide where
those things go, and the superuser no longer has the ability to load
arbitrary code from arbitrary locations.


The use-case for runtime configuration of this seems to be build-time 
testing of extensions against an already installed server. For this 
purpose it should be enough to be able to set this directory at startup 
- it doesn't need to be changed while the server is actually running. 
Then you could spin up a temporary postgres instance with the extension 
directory pointing at the build directory and test.


Would startup-configurable be better than runtime-configurable regarding 
your concerns?


I can also imagine that it would be very helpful in a container setup to 
be able to set an environment variable with this path instead of having 
to recompile all of postgres to change it.


Best,

Wolfgang




Re: psql: add \create_function command

2024-01-26 Thread walther

Tom Lane:

Or we could cut out the intermediate variable altogether
by inventing something that works like :'...' but reads
from a file not a variable.  That might be too specialized
though, and I'm not sure about good syntax for it either.
Maybe like

CREATE FUNCTION foo() RETURNS whatever AS :{source_file.txt} LANGUAGE ...;


That would indeed be very useful! I would immediately use this in a lot 
of places.
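A workaround that already works today reads the file into a psql variable via a backtick shell escape, at the cost of the intermediate variable Tom mentioned (file and function names are illustrative):

```
\set body `cat source_file.txt`
CREATE FUNCTION foo() RETURNS whatever AS :'body' LANGUAGE plpgsql;
```

The :'body' form quotes the variable's content as an SQL string literal, so the file contents end up as the function body.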





Re: psql: add \create_function command

2024-01-26 Thread walther

Pavel Stehule:
looks a little bit obscure - why do you need to do it from psql? And how 
frequently do you do it?


I store all my SQL code in git and use "psql -e" to "bundle" it into an 
extension, which is then deployed to production.


The code is spread over many files, which include other files via \ir. 
Sometimes you need to include other types of files, though - for example 
code in other languages as Steve mentioned, but I have also had cases 
for yaml files, markdown templates, even binary assets which should 
still be considered "code" and not data.


So anything in that direction would help.




Re: MERGE ... RETURNING

2024-03-08 Thread walther

Jeff Davis:

To summarize, most of the problem has been in retrieving the action
(INSERT/UPDATE/DELETE) taken or the WHEN-clause number applied to a
particular matched row. The reason this is important is because the row
returned is the old row for a DELETE action, and the new row for an
INSERT or UPDATE action. Without a way to distinguish the particular
action, the RETURNING clause returns a mixture of old and new rows,
which would be hard to use sensibly.


It seems to me that all of this is only a problem, because there is only
one RETURNING clause.

Dean Rasheed wrote in the very first post to this thread:

I considered allowing a separate RETURNING list at the end of each
action, but rapidly dismissed that idea. Firstly, it introduces
shift/reduce conflicts to the grammar. These can be resolved by making
the "AS" before column aliases non-optional, but that's pretty ugly,
and there may be a better way. More serious drawbacks are that this
syntax is much more cumbersome for the end user, having to repeat the
RETURNING clause several times, and the implementation is likely to be
pretty complex, so I didn't pursue it.


I can't judge the grammar and complexity issues, but as a potential user
it seems to me to be less complex to have multiple RETURNING clauses, 
where I could inject my own constants about the specific actions, than 
to have to deal with any of the suggested functions / clauses. More 
repetitive, yes - but not more complex.


More importantly, I could add RETURNING to only some of the actions and 
not always all at the same time - which seems pretty useful to me.


Best,

Wolfgang




Re: RFC: adding pytest as a supported test framework

2024-06-12 Thread walther

Jelte Fennema-Nio:

As scripting languages go, the ones that are still fairly heavily in
use are Javascript, Python, Ruby, and PHP. I think all of those could
probably work, but my personal order of preference would be Python,
Ruby, Javascript, PHP.

Finally, I'm definitely biased towards using Python myself. But I
think there's good reasons for that:
1. In the data space, that Postgres in, Python is very heavily used for analysis
2. Everyone coming out of university these days knows it to some extent
3. Multiple people in the community have been writing Postgres related
tests in python and have enjoyed doing so (me[1], Jacob[2],
Alexander[3])


PostgREST also uses pytest for integration tests - and that was a very 
good decision compared to the bash based tests we had before.


One more argument for Python compared to the other mentioned scripting 
languages: Python is already a development dependency via meson. None of 
the other 3 are. In a future where meson will be the only build system, 
we will have python "for free" already.


Best,

Wolfgang




Re: Meson far from ready on Windows

2024-06-22 Thread walther

Andres Freund:

FWIW, dynamic linking has a noticeable overhead on other platforms too. A
non-dependencies-enabled postgres can do about 2x the connections-per-second
than a fully kitted out postgres can (basically due to more memory mapping
metadata being copied).  But on windows the overhead is larger because so much
more happens for every new connections, including loading all dlls from
scratch.

I suspect linking a few libraries statically would be quite worth it on
windows. On other platforms it'd be quite inadvisable to statically link
libraries, due to security updates, [...]


That's not necessarily true. The nix package manager and thus NixOS 
track all dependencies for a piece of software. If any of the 
dependencies are updated, all dependents are rebuilt, too. So the 
security concern doesn't apply here. There is a "static overlay", which 
builds everything linked fully statically. Unfortunately, PostgreSQL 
doesn't build in that, so far.


Lately, I have been looking into building at least libpq in that static 
overlay, via Meson. There are two related config options:

-Ddefault_library=shared|static|both
-Dprefer_static

The first controls which libraries (libpq, ...) to build ourselves. The 
second controls linking, IIUC also against external dependencies.


Maybe it would be a first step to support -Dprefer_static?

Then this could be set on Windows.

Best,

Wolfgang




Re: Building with meson on NixOS/nixpkgs

2024-04-17 Thread walther

Peter Eisentraut:

On 29.03.24 19:47, walt...@technowledgy.de wrote:
 > -    uuid = dependency('ossp-uuid', required: true)
 > +    # upstream is called "uuid", but many distros change this to "ossp-uuid"
 > +    uuid = dependency('ossp-uuid', 'uuid', required: true)

How would this behave if you have only uuid.pc from e2fsprogs installed 
but choose -Duuid=ossp?  Then it would pick up uuid.pc here, but fail to 
compile later?


It would still fail the meson setup step, because for e2fs we have:

uuidfunc = 'uuid_generate'
uuidheader = 'uuid/uuid.h'

while for ossp we have:

uuidfunc = 'uuid_export'
uuidheader = 'uuid.h'

and later we do:

if not cc.has_header_symbol(uuidheader, uuidfunc, args: test_c_args, dependencies: uuid)
  error('uuid library @0@ missing required function @1@'.format(uuidopt, uuidfunc))
endif

Best,

Wolfgang




Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread walther

Tom Lane:

The thing that was bothering me most about this is that I don't
understand why that's a useful check.  If I meant to type

UPDATE mytab SET mycol = 42;

and instead I type

UPDATEE mytab SET mycol = 42;

your proposed feature would catch that; great.  But if I type

UPDATE mytabb SET mycol = 42;

it won't.  How does that make sense?  I'm not entirely sure where
to draw the line about what a "syntax check" should catch, but this
seems a bit south of what I'd want in a syntax-checking editor.

BTW, if you do feel that a pure grammar check is worthwhile, you
should look at the ecpg preprocessor, which does more or less that
with the SQL portions of its input.  ecpg could be a better model
to follow because it doesn't bring all the dependencies of the server
and so is much more likely to appear in a client-side installation.
It's kind of an ugly, unmaintained mess at the moment, sadly.


Would working with ecpg allow to get back a parse tree of the query to 
do stuff with that?


This is really what is missing for the ecosystem. A libpqparser for 
tools to use: Formatters, linters, query rewriters, simple syntax 
checkers... they are all missing access to postgres' own parser.


Best,

Wolfgang





Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread walther

Tom Lane:

This is really what is missing for the ecosystem. A libpqparser for
tools to use: Formatters, linters, query rewriters, simple syntax
checkers... they are all missing access to postgres' own parser.


To get to that, you'd need some kind of agreement on what the syntax
tree is.  I doubt our existing implementation would be directly useful
to very many tools, and even if it is, do they want to track constant
version-to-version changes?


Correct, on top of what the syntax tree currently has, one would 
probably need:

- comments
- locations (line number / character) for everything, including those of 
comments


Otherwise it's impossible to print proper SQL again without losing 
information.


And then on top of that, to be really really useful, you'd need to be 
able to parse partial statements, too, to support all kinds of "language 
server" applications.


Tracking version-to-version changes is exactly the reason why it would 
be good to have that from upstream, imho. New syntax is added in 
(almost?) every release and everyone outside core trying to write their 
own parser and staying up2date with **all** the new syntax.. will 
eventually fail.


Yes, there could be changes to the produced parse tree as well and you'd 
also need to adjust, for example, your SQL-printers. But it should be 
easier to stay up2date than right now.


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-22 Thread walther

Alvaro Herrera:

Perhaps the solution to all this is to avoid having the variables be
implicitly present in the range table of all queries.  Instead, if you
need a variable's value, then you need to add the variable to the FROM
clause;


+1

This should make it easier to work with composite type schema variables 
in some cases.  It could also enable schema qualifying of schema 
variables, or at least make it easier to do, I think.


In this case variables would share the same namespace as tables and 
views, right?  So I could not create a variable with the same name as 
another table.  Which is a good thing, I guess.  Not sure how it's 
currently implemented in the patch.


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-25 Thread walther

Pavel Stehule:
Sure there is more possibilities, but I don't want to lost the 
possibility to write code like


CREATE TEMP VARIABLE _x;

LET _x = 'hello';

DO $$
BEGIN
   RAISE NOTICE '%', _x;
END;
$$;

So I am searching for a way to do it safely, but still intuitive and 
user friendly.


Maybe a middle-way between this and Alvaro's proposal could be:

Whenever you have a FROM clause, a variable must be added to it to be 
accessible.  When you don't have a FROM clause, you can access it directly.


This would make the following work:

RAISE NOTICE '%', _x;

SELECT _x;

SELECT tbl.*, _x FROM tbl, _x;

SELECT tbl.*, (SELECT _x) FROM tbl, _x;

SELECT tbl.*, (SELECT _x FROM _x) FROM tbl;


But the following would be an error:

SELECT tbl.*, _x FROM tbl;

SELECT tbl.*, (SELECT _x) FROM tbl;


Best,

Wolfgang




Re: Build with LTO / -flto on macOS

2024-06-03 Thread walther

Wolfgang Walther:

Peter:
I don't think we explicitly offer LTO builds as part of the make build 
system, so anyone trying this would do it sort of self-service, by 
passing additional options to configure or make.  In which case they 
might as well pass the -export_dynamic option along in the same way?


The challenge is that it defeats the purpose of LTO to pass this along 
to everything, e.g. via CFLAGS. The Makefiles set this in LDFLAGS_EX_BE 
only, so it only affects the backend binary. This is not at all obvious 
and took me quite a while to figure out why LTO silently didn't strip 
symbols from other binaries. It does work to explicitly set 
LDFLAGS_EX_BE, though.


Oh, and more importantly: LDFLAGS_EX_BE is not available on all back 
branches. It was only introduced in v16 in preparation for meson. So up 
to v15, I would have to patch src/makefiles/Makefile.darwin to set 
export_dynamic.


So back-patching a change like this would certainly help to get LTO 
across versions seamlessly - which is what I am trying to achieve while 
packaging all versions in nixpkgs / NixOS.


Best,

Wolfgang




Re: Meson far from ready on Windows

2024-08-18 Thread walther

Andres Freund:

That's not necessarily true. The nix package manager and thus NixOS track all 
dependencies for a piece of software. If any of the dependencies are updated, all 
dependents are rebuilt, too. So the security concern doesn't apply here. There is a 
"static overlay", which builds everything linked fully statically.


Right. There's definitely some scenario where it's ok, I was simplifying a bit.


Unfortunately, PostgreSQL doesn't build in that, so far.


I've built mostly statically linked pg without much of a problem, what trouble 
did you encounter? Think there were some issues with linking Kerberos and 
openldap statically, but not on postgres' side.


Mostly the "can't disable shared libraries / backend builds" part 
mentioned below.



Building the postgres backend without support for dynamic linking doesn't make 
sense though. Extensions are just too ingrained a part of pg.


I think there might be some limited use-cases for a fully-static 
postgres backend without the ability to load extensions: Even if we get 
libpq to build fine in the fully-static overlay mentioned above, a lot 
of reverse dependencies have to disable tests, because they need a 
running postgres server to run their tests against.


Providing a really simple postgres backend, with only minimal 
functionality, would allow some basic sanity tests, even in this purely 
static environment.



Lately, I have been looking into building at least libpq in that static 
overlay, via Meson. There are two related config options:
-Ddefault_library=shared|static|both
-Dprefer_static

The first controls which libraries (libpq, ...) to build ourselves. The second 
controls linking, IIUC also against external dependencies.


Pg by default builds a static libpq on nearly all platforms (not aix I think 
and maybe not Windows when building with autoconf, not sure about the old msvc 
system) today?


Yes, PG builds a static libpq today. But it's hard-to-impossible to 
*disable building the shared library*. In the fully static overlay, this 
causes the build to fail, because shared libraries can't be build.



Maybe it would be a first step to support -Dprefer_static?


That should work for nearly all dependencies today. Except for libintl, I 
think.  I found that there are a lot of buglets in static link dependencies of 
various libraries though.


To support prefer_static, we'd also have to look at our internal 
linking, i.e. whether for example psql is linked against libpq 
statically or dynamically. Once prefer_static controls that, that's 
already a step forward to be able to build more of the code-base without 
shared libraries available.


Best,

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-02-04 Thread walther

Christoph Heiss wrote:

As part of a customer project we are looking to implement an reloption for 
views which when set, runs the subquery as invoked by the user rather than the 
view owner, as is currently the case.
The rewrite rule's table references are then checked as if the user were 
referencing the table(s) directly.

This feature is similar to so-called 'SECURITY INVOKER' views in other DBMS. 


This is a feature I have long been looking for. I tested the patch (v5) 
and found two cases that I feel need to be either fixed or documented 
explicitly.



Case 1 - Schema privileges:

create schema a;
create table a.t();

create schema b;
create view b.v with (security_invoker=true) as table a.t;

create role alice;
grant usage on schema b to alice; -- missing schema a
grant select on table a.t, b.v to alice;

set role alice;
table a.t; -- ERROR: permission denied for schema a (good)
table b.v; -- no error (good or bad?)

User alice does not have USAGE privileges on schema a, but only on table 
a.t. A SELECT directly on the table fails as expected, but a SELECT on 
the view succeeds. I assume the schema access is checked when the query 
is parsed - and at that stage, the user is still the view owner?
The docs mention explicitly that *all* objects are accessed with invoker 
privileges, which is not the case.


Personally I actually like this. It allows to keep a view-based api in a 
separate schema, while:

- preserving full RLS capabilities and
- forcing the user to go through the api, because a direct access to the 
data schema is not possible.


However, since this behavior was likely unintended until now, it raises 
the question whether there are any other privilege checks that are not 
taking the invoking user into account properly?



Case 2 - Chained views:

create schema a;
create table a.t();

create role bob;
grant create on database postgres to bob;
grant usage on schema a to bob;
set role bob;
create schema b;
create view b.v1 with (security_invoker=true) as table a.t;
create view b.v2 with (security_invoker=false) as table b.v1;

reset role;
create role alice;
grant usage on schema a, b to alice;
grant select on table a.t to bob;
grant select on table b.v2 to alice;

set role alice;
table b.v2; -- ERROR: permission denied for table t (bad)

When alice runs the SELECT on b.v2, the query on b.v1 is made with bob 
privileges as the view owner of b.v2. This is verified, because alice 
does not have privileges to access b.v1, but no such error is thrown.


b.v1 will then access a.t - and my first assumption was, that in this 
case a.t should be accessed by bob, still as the view owner of b.v2. 
Clearly, this is not the case as the permission denied error shows.


This is not actually a problem with this patch, I think, but just 
highlighting a quirk in the current implementation of views 
(security_invoker=false) in general: While the query will be run with 
the view owner, the CURRENT_USER is still the invoker, even "after" the 
view. In other words, the current implementation is *not* the same as 
"security definer". It's somewhere between "security definer" and 
"security invoker" - a strange mix really.


Afaik this mix is not documented explicitly so far. But the 
security_invoker reloption exposes it in a much less expected way, so I 
only see two options really:
a) make the current implementation of security_invoker=false a true 
"security definer", i.e. change the CURRENT_USER "after" the view for good.

b) document the "security infiner/devoker" default behavior as a feature.

I really like a), as this would make a clear cut between security 
definer and security invoker views - but this would be a big breaking 
change, which I don't think is acceptable.


Best,

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-02-09 Thread walther

Laurenz Albe:

So even though the view owner "duff" has no permissions
on the schema "viewtest", we can still select from the table.
Permissions on the schema containing the table are not
checked, only permissions on the table itself.

[...]

If not, I don't know if it is the business of this patch to
change the behavior.


Ah, good find. In that case, I suggest to change the docs slightly to 
say that the schema will not be checked.


In one place it's described as "it will cause all access to the 
underlying tables to be checked as ..." which is fine, I think. But in 
another place it's "access to tables, functions and *other objects* 
referenced in the view, ..." which is misleading.



I agree that the name "security_invoker" is suggestive of SECURITY INVOKER
in CREATE FUNCTION, but the behavior is different.
Perhaps the solution is as simple as choosing a different name that does
not prompt this association, for example "permissions_invoker".


Yes, given that there is not much that can be done about the 
functionality anymore, a different name would be better. This would also 
avoid the implicit "if security_invoker=false, the view behaves like 
SECURITY DEFINER" association, which is also clearly wrong. And this 
assumption is actually what made me think the chained views example was 
somehow off.


I am not convinced "permissions_invoker" is much better, though. The 
difference between SECURITY INVOKER and SECURITY DEFINER is invoker vs 
definer... where, I think, we need something else to describe what we 
currently have and what the patch provides.


Maybe we can look at it from the other perspective: Both ways of 
operating keep the CURRENT_USER the same, pretty much like what we 
understand "security invoker" should do. The difference, however, is the 
current default in which the permissions are checked with the view 
*owner*. Let's treat this difference as the thing that can be set: 
security_owner=true|false. Or run_as_owner=true|false.


xxx_owner=true would be the default and xxx_owner=false could be set 
explicitly to get the behavior we are looking for in this patch?




I guess more documentation how this works would be a good idea.
[...] but since it exposes this
in new ways, it might as well clarify how all this works.


+1

Best

Wolfgang




Re: faulty link

2022-02-10 Thread walther




leads to
https://www.postgresql.org/docs/release/14.2/

which gives 'Not Found' for me (Netherlands)


Same here: Not Found. (Germany)




Re: [PATCH] Add reloption for views to enable RLS

2022-02-15 Thread walther

Laurenz Albe:

So even though the view owner "duff" has no permissions
on the schema "viewtest", we can still select from the table.
Permissions on the schema containing the table are not
checked, only permissions on the table itself.

I am not sure how to feel about this.  It is not what I would have
expected, but changing it would be a compatibility break.
Should this be considered a live bug in PostgreSQL?


I now found the docs to say:


USAGE:
For schemas, allows access to objects contained in the schema (assuming 
that the objects' own privilege requirements are also met). Essentially 
this allows the grantee to “look up” objects within the schema. Without 
this permission, it is still possible to see the object names, e.g., by 
querying system catalogs. Also, after revoking this permission, existing 
sessions might have statements that have previously performed this 
lookup, so this is not a completely secure way to prevent object access.



So, this seems to be perfectly fine.

Best

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-02-15 Thread walther

Christoph Heiss:
xxx_owner=true would be the default and xxx_owner=false could be set 
explicitly to get the behavior we are looking for in this patch?


I'm not sure if an option which is on by default would be best, IMHO. I 
would rather have an off-by-default option, so that you explicitly have 
to turn *on* that behavior rather than turning *off* the current.


Just out of curiosity I asked myself whether there were any other 
boolean options that default to true in postgres - and there are plenty. 
./configure options, client connection settings, server config options, 
etc - but also some SQL statements:

- CREATE USER defaults to LOGIN
- CREATE ROLE defaults to INHERIT
- CREATE COLLATION defaults to DETERMINISTIC=true

There are even reloptions that do, e.g. vacuum_truncate.


My best suggestion is maybe something like run_as_invoker=t|f, but that 
would probably raise the same "invoker vs definer" association.


It is slightly better, I agree. But, yes, that same association is 
raised easily. The more I think about it, the more it becomes clear that 
really the current default behavior of "running the query as the view 
owner" is the special thing here, not the behavior you are introducing.


If we were to start from scratch, it would be pretty obvious - to me - 
that run_as_owner=false would be the default, and the run_as_owner=true 
would need to be turned on explicitly. I'm thinking about "run_as_owner" 
as the better design and "defaults to true" as a backwards compatibility 
thing.


But yeah, it would be good to hear other opinions on that, too.

Best

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-02-15 Thread walther

Laurenz Albe:

I converted the option to run_as_owner=true|false in the attached v7.
It now definitely seems like the right way to move forward and getting
more feedback.

I think we are straying from the target.

"run_as_owner" seems wrong to me, because it is all about permission
checking and *not* about running.  As we have established, the query
is always executed by the caller.

So my preferred bikeshed colors would be "permissions_owner" or
"permissions_caller".


My main point was the "xxx_owner = true by default" thing. Whether xxx 
is "permissions" or "run_as" doesn't change that. permissions_caller, 
however, would be a step backwards.


I can see how permissions_owner is better than run_as_owner. The code 
uses checkAsUser, so check_as_owner would be an option, too. Although 
that could easily be associated with WITH CHECK OPTION. Thinking about 
that, the difference between LOCAL and CASCADED for CHECK OPTION pretty 
much sums up one of the confusing bits about the whole thing, too.


Maybe "local_permissions_owner = true | false"? That would make it 
crystal-clear, that this is only about the very first permissions check 
and not about any checks later in a chain of multiple views.


"local_permissions = owner | caller" could also work - as long as we're 
not using any of definer or invoker.


Best

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-02-15 Thread walther

Laurenz Albe:

I'd be happy with "check_as_owner", except it is unclear *what* is checked.


Yeah, that could be associated with WITH CHECK OPTION, too, as in "do 
the CHECK OPTION stuff as the owner".



"check_permissions_as_owner" is ok with me, but a bit long.


check_permissions_as_owner is exactly what happens. The additional "as" 
shouldn't be a problem in length - but is much better to read. I 
wouldn't associate that with CHECK OPTION either. +1


Best

Wolfgang




Suggestion: optionally return default value instead of error on failed cast

2020-12-12 Thread Wolfgang Walther

Hi,

currently a failed cast throws an error. It would be useful to have a 
way to get a default value instead.


T-SQL has try_cast [1]
Oracle has CAST(... AS .. DEFAULT ... ON CONVERSION ERROR) [2]

The DEFAULT ... ON CONVERSION ERROR syntax seems like it could be 
implemented in PostgreSQL. Even if only DEFAULT NULL was supported (at 
first) that would already help.


The short syntax could be extended for the DEFAULT NULL case, too:

SELECT '...'::type -- throws error
SELECT '...':::type -- returns NULL

I couldn't find any previous discussion on this, please advise in case I 
just missed it.
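Until something like this exists, a common user-level workaround is a 
wrapper function that traps the conversion error. A sketch (the function 
name and the fixed target type are mine; trapping the exception makes 
this fairly slow in hot paths, which is part of why built-in syntax 
would be nice):

```sql
CREATE FUNCTION try_cast_int(p text, fallback int DEFAULT NULL)
RETURNS int
LANGUAGE plpgsql
IMMUTABLE
AS $$
BEGIN
    RETURN p::int;
-- OTHERS also catches e.g. numeric_value_out_of_range; a narrower
-- condition like invalid_text_representation could be used instead.
EXCEPTION WHEN OTHERS THEN
    RETURN fallback;
END;
$$;

SELECT try_cast_int('42');      -- 42
SELECT try_cast_int('abc');     -- NULL
SELECT try_cast_int('abc', 0);  -- 0
```

One such function is needed per target type, though - another argument 
for doing this in the cast machinery itself.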


Thoughts?

Best

Wolfgang

[1]: 
https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql
[2]: 
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/CAST.html





Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2020-08-03 Thread Wolfgang Walther

Tom Lane:

We don't generally act that way in other ALTER commands and I don't see
a strong argument to start doing so here.  [...]

In short, I'm inclined to argue that this variant of ALTER TABLE
should replace *all* the fields of the constraint with the same
properties it'd have if you'd created it fresh using the same syntax.
This is by analogy to CREATE OR REPLACE commands, which don't
preserve any of the old properties of the replaced object.  Given
the interactions between these fields, I think you're going to end up
with a surprising mess of ad-hoc choices if you do differently.
Indeed, you already have, but I think it'll get worse if anyone
tries to extend the feature set further.


I don't think the analogy to CREATE OR REPLACE holds. Semantically 
REPLACE and ALTER are very different. Using ALTER the expectation is to 
change something, keeping everything else unchanged. Looking at all the 
other ALTER TABLE actions, especially ALTER COLUMN, it looks like every 
command does exactly one thing and not more. I don't think deferrability 
and ON UPDATE / ON CASCADE should be changed together at all, neither 
implicitly nor explicitly.


There seems to be a fundamental difference between deferrability and the 
ON UPDATE/ON DELETE clauses as well - the latter only apply to FOREIGN 
KEYs, while the former apply to multiple types of constraints.


Matheus de Oliveira:
I'm still not sure if the chosen path is the best way. But I'd be glad 
to follow any directions we all see fit.


For now, this patch applies two methods:
1. Changes the full constraint definition (which keeps compatibility with 
current ALTER CONSTRAINT):

     ALTER CONSTRAINT <name> [<deferrability>] [<on update>] [<on delete>]

2. Changes only the subset explicitly given in the command (a new way; I've 
chosen to just add SET in the middle, similar to `ALTER COLUMN ... SET 
{DEFAULT | NOT NULL}`):

     ALTER CONSTRAINT <name> SET [<deferrability>] [<on update>] [<on delete>]

I'm OK with changing the approach, we just need to choose the color :D


The `ALTER CONSTRAINT <name> SET [<deferrability>] [<on update>] [<on delete>]` 
variant has the same problem of implied changes: What happens if you only 
do e.g. ALTER CONSTRAINT SET ON UPDATE xy - will the ON DELETE part be kept 
as-is or set to the default?


Also, since the ON UPDATE/ON DELETE just applies to FOREIGN KEYs and no 
other constraints, there's one level of "nesting" missing in your SET 
variant, I think.


I suggest to:

- keep `ALTER CONSTRAINT constraint_name [ DEFERRABLE | NOT DEFERRABLE ] 
[ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]` exactly as-is


- add both:
 + `ALTER CONSTRAINT constraint_name [ALTER] FOREIGN KEY ON UPDATE 
referential_action`
 + `ALTER CONSTRAINT constraint_name [ALTER] FOREIGN KEY ON DELETE 
referential_action`


This does not imply any changes, that are not in the command - very much 
analog to the ALTER COLUMN variants.


This could also be extended in the future with stuff like `ALTER 
CONSTRAINT constraint_name [ALTER] FOREIGN KEY MATCH [ FULL | PARTIAL | 
SIMPLE ]`.
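To make the suggestion concrete, usage would look roughly like this 
(hypothetical syntax - none of this is implemented; the table and 
constraint names are made up):

```sql
-- Change only the ON DELETE action; ON UPDATE and the deferrability
-- settings of the constraint would be left untouched.
ALTER TABLE orders
    ALTER CONSTRAINT orders_customer_fkey
    FOREIGN KEY ON DELETE SET NULL;

-- Analogous single-property change for ON UPDATE:
ALTER TABLE orders
    ALTER CONSTRAINT orders_customer_fkey
    FOREIGN KEY ON UPDATE CASCADE;
```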





Re: Allow an alias to be attached directly to a JOIN ... USING

2020-08-03 Thread Wolfgang Walther

Peter Eisentraut:

On 2019-12-31 00:07, Vik Fearing wrote:

One thing I notice is that the joined columns are still accessible from
their respective table names when they should not be per spec.  That
might be one of those "silly restrictions" that we choose to ignore, but
it should probably be noted somewhere, at the very least in a code
comment if not in user documentation. (This is my reading of SQL:2016 SR
11.a.i)


Here is a rebased patch.

The above comment is valid.  One reason I didn't implement it is that it 
would create inconsistencies with existing behavior, which is already 
nonstandard.


For example,

create table a (id int, a1 int, a2 int);
create table b (id int, b2 int, b3 int);

makes

select a.id from a join b using (id);

invalid.  Adding an explicit alias for the common column names doesn't 
change that semantically, because an implicit alias also exists if an 
explicit one isn't specified.

I just looked through the patch without applying or testing it - but I 
couldn't find anything that would indicate that this is not going to 
work for e.g. a LEFT JOIN as well. First PG patch I looked at, so tell 
me if I missed something there.


So given this:

SELECT x.id FROM a LEFT JOIN b USING (id) AS x

will this return NULL or a.id for rows that don't match in b? This 
should definitely be mentioned in the docs and I guess a test wouldn't 
be too bad as well?
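For reference, the standard defines the common column of a JOIN ... USING 
as the coalesced value of the two input columns, so my reading is that 
the query would behave as follows (worth verifying against the patch):

```sql
CREATE TABLE a (id int, a1 int, a2 int);
CREATE TABLE b (id int, b2 int, b3 int);
INSERT INTO a VALUES (1, 10, 20), (2, 11, 21);
INSERT INTO b VALUES (1, 30, 40);

-- x.id should be COALESCE(a.id, b.id) per the standard's definition of
-- the USING join column, so the unmatched row would yield 2, not NULL:
SELECT x.id FROM a LEFT JOIN b USING (id) AS x;
```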


In any case: If a.id and b.id would not be available anymore, but just 
x.id, either the id value itself or the NULL value (indicating the 
missing row in b) are lost. So this seems like a no-go.


> I agree that some documentation would be in order if we decide to leave
> it like this.

Keep it like that!




Re: extension patch of CREATE OR REPLACE TRIGGER

2020-08-03 Thread Wolfgang Walther

osumi.takami...@fujitsu.com:

* I'm a little bit concerned about the semantics of changing the
tgdeferrable/tginitdeferred properties of an existing trigger.  If there are 
trigger
events pending, and the trigger is redefined in such a way that those events
should already have been fired, what then?

OK. I need a discussion about this point.
There would be two ideas to define the behavior of this semantics change, I 
think.
The first idea is to throw an error that means
the *pending* trigger can't be replaced during the session.
The second one is just to replace the trigger and ignite the new trigger
at the end of the session when its tginitdeferred is set true.
For me, the first one sounds safer. Yet, I'd like to know other opinions.


IMHO, constraint triggers should behave the same in that regard as other 
constraints. I just checked:


BEGIN;
CREATE TABLE t1 (a int CONSTRAINT u UNIQUE DEFERRABLE INITIALLY DEFERRED);
INSERT INTO t1 VALUES (1),(1);
ALTER TABLE t1 ALTER CONSTRAINT u NOT DEFERRABLE;

will throw with:

ERROR:  cannot ALTER TABLE "t1" because it has pending trigger events
SQL state: 55006

So if a trigger event is pending, CREATE OR REPLACE for that trigger 
should throw. I think it should do in any case, not just when changing 
deferrability. This makes it easier to reason about.


If the user has a pending trigger, they can still do SET CONSTRAINTS 
trigger_name IMMEDIATE; to resolve that and then do CREATE OR REPLACE 
TRIGGER, just like in the ALTER TABLE case.
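The flow would then look like this (a sketch; the trigger body is elided 
and CREATE OR REPLACE TRIGGER is the hypothetical command this patch 
adds):

```sql
BEGIN;
INSERT INTO trig_table DEFAULT VALUES;  -- queues a deferred trigger event

-- Fire the pending event now, so the trigger is no longer pending:
SET CONSTRAINTS my_trig IMMEDIATE;

-- Only now would replacing the trigger be allowed:
CREATE OR REPLACE TRIGGER my_trig ...;
COMMIT;
```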



regression=# create constraint trigger my_trig after insert on trig_table
deferrable initially deferred for each row execute procedure
before_replacement();
CREATE TRIGGER
regression=# begin;
BEGIN
regression=*# insert into trig_table default values;
INSERT 0 1
regression=*# drop trigger my_trig on trig_table;
DROP TRIGGER
regression=*# commit;
ERROR:  relation 38489 has no triggers

I could reproduce this bug, using the current master without my patch.
So this is another issue.
I'm thinking that throwing an error when *pending* trigger is dropped
makes sense. Does everyone agree with it ?


Just tested the same example as above, but with DROP TABLE t1; instead 
of ALTER TABLE. This throws with:


ERROR:  cannot DROP TABLE "t1" because it has pending trigger events
SQL state: 55006

So yes, your suggestion makes a lot of sense!




Re: Allow foreign keys to reference a superset of unique columns

2022-09-25 Thread Wolfgang Walther

James Coleman:

If I'm following properly this sounds like an overengineered EAV
schema, and neither of those things inspires me to think "this is a
use case I want to support".

That being said, I know that sometimes examples that have been
abstracted enough to share aren't always the best, so perhaps there's
something underlying this that's a more valuable example.


Most my use-cases are slightly denormalized and I was looking for an 
example that didn't require those kind of FKS only because of the 
denormalization. So that's why it might have been a bit artifical or 
abstracted too much.


Take another example: I deal with multi-tenancy systems, for which I 
want to use RLS to separate the data between tenants:


CREATE TABLE tenants (tenant INT PRIMARY KEY);

Each tenant can create multiple users and groups:

CREATE TABLE users (
  "user" INT PRIMARY KEY,
  tenant INT NOT NULL REFERENCES tenants
);

CREATE TABLE groups (
  "group" INT PRIMARY KEY,
  tenant INT NOT NULL REFERENCES tenants
);

Users can be members of groups. The simple approach would be:

CREATE TABLE members (
  PRIMARY KEY ("user", "group"),
  "user" INT REFERENCES users,
  "group" INT REFERENCES groups
);

But this falls short in two aspects:
- To make RLS policies simpler to write and quicker to execute, I want 
to add "tenant" columns to **all** other tables. A slightly denormalized 
schema for efficiency.
- The schema above does not ensure that users can only be members in 
groups of the same tenant. Our business model requires to separate 
tenants cleanly, but as written above, cross-tenant memberships would be 
allowed.


In comes the "tenant" column which solves both of this:

CREATE TABLE members (
  PRIMARY KEY ("user", "group"),
  tenant INT REFERENCES tenants,
  "user" INT,
  "group" INT,
  FOREIGN KEY ("user", tenant) REFERENCES users ("user", tenant),
  FOREIGN KEY ("group", tenant) REFERENCES groups ("group", tenant)
);

This is not possible to do right now, without adding more UNIQUE 
constraints to the users and groups tables - on a superset of already 
unique columns.
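With the denormalized tenant column in place, the RLS policy on members 
stays a simple single-table check. A sketch (the app.tenant setting name 
is made up):

```sql
ALTER TABLE members ENABLE ROW LEVEL SECURITY;

-- No subquery into users or groups is needed to find the tenant:
CREATE POLICY tenant_isolation ON members
    USING (tenant = current_setting('app.tenant')::int);

-- Without members.tenant, the policy would have to join through users:
-- USING ("user" IN (SELECT "user" FROM users
--                   WHERE tenant = current_setting('app.tenant')::int))
```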



bar.y is a little bit like a generated value in that sense, it should
always match foo.b. I think it would be great, if we could actually go a
step further, too: On an update to bar.x to a new value, if foo.a=bar.x
exists, I would like to set bar.y automatically to the new foo.b.
Otherwise those kind of updates always have to either query foo before,
or add a trigger to do the same.


Isn't this actually contradictory to the behavior you currently have
with a multi-column foreign key? In the example above then an update
to bar.x is going to update the rows in foo that match bar.x = foo.a
and bar.y = foo.b *using the old values of bar.x and bar.y* to be the
new values.


No, I think there was a misunderstanding. An update to bar should not 
update rows in foo. An update to bar.x should update bar.y implicitly, 
to match the new value of foo.b.



You seem to be suggesting that instead it should look for
other rows that already match the *new value* of only one of the
columns in the constraint.


Yes. I think basically what I'm suggesting is, that for an FK to a 
superset of unique columns, all the FK-logic should still be done on the 
already unique set of columns only - and then the additional columns 
should be mirrored into the referencing table. The referencing table can 
then put additional constraints on this column. In the members example 
above, this additional constraint is the fact that the tenant column 
can't be filled with two different values for the users and groups FKs. 
But this could also be a CHECK constraint to allow FKs only to a subset 
of rows in the target table:


CREATE TYPE foo_type AS ENUM ('A', 'B', 'C');

CREATE TABLE foo (
  f INT PRIMARY KEY,
  type foo_type
);

CREATE TABLE bar (
  b INT PRIMARY KEY,
  f INT,
  ftype foo_type CHECK (ftype <> 'C'),
  FOREIGN KEY (f, ftype) REFERENCES foo (f, type);
);

In this example, the additional ftype column is just used to enforce 
that bar can only reference rows with type A or B, but not C. Assume:


INSERT INTO foo VALUES (1, 'A'), (2, 'B'), (3, 'C');

In this case, it would be nice to be able to do the following, i.e. 
derive the value for bar.ftype automatically:


INSERT INTO bar (b, f) VALUES (10, 1); -- bar.ftype is then 'A'
UPDATE bar SET f = 2 WHERE b = 10; -- bar.ftype is then 'B'

And it would throw errors in the following cases, because the 
automatically derived value fails the CHECK constraint:


INSERT INTO bar (b, f) VALUES (20, 3);
UPDATE bar SET f = 3 WHERE b = 10;

Note: This "automatically derived columns" extension would be a separate 
feature. Really nice to have, but the above mentioned FKs to supersets 
of unique columns would be very valuable without it already.
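Until such a feature exists, the derivation can be emulated with a 
trigger along these lines (a sketch, using the foo/bar tables from 
above; names are mine):

```sql
-- Keep bar.ftype in sync with foo.type whenever bar.f changes.
CREATE FUNCTION bar_sync_ftype() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    SELECT type INTO NEW.ftype FROM foo WHERE f = NEW.f;
    RETURN NEW;  -- the CHECK constraint on ftype fires afterwards
END;
$$;

CREATE TRIGGER bar_sync_ftype
    BEFORE INSERT OR UPDATE OF f ON bar
    FOR EACH ROW EXECUTE FUNCTION bar_sync_ftype();
```

This is exactly the kind of extra query the FK triggers already perform 
internally, which is why doing it as part of the FK would be more 
efficient.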


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-25 Thread Wolfgang Walther

Robert Haas:

Well, maybe. Suppose that role A has been granted pg_read_all_settings
WITH INHERIT TRUE, SET TRUE and role B has been granted
pg_read_all_settings WITH INHERIT TRUE, SET FALSE. A can create a
table owned by pg_read_all_settings. If A does that, then B can now
create a trigger on that table and usurp the privileges of
pg_read_all_settings, after which B can now create any number of
objects owned by pg_read_all_settings.


I'm not seeing how this is possible. A trigger function would run with 
the invoking user's privileges by default, right? So B would have to 
create a trigger with a SECURITY DEFINER function, which is owned by 
pg_read_all_settings to actually usurp the privileges of that role. But 
creating objects with that owner is exactly the thing B can't do.


What am I missing?

Best

Wolfgang




Add ON CONFLICT DO RETURN clause

2022-09-25 Thread Wolfgang Walther
When using ON CONFLICT DO NOTHING together with RETURNING, the 
conflicted rows are not returned. Sometimes, this would be useful 
though, for example when generated columns or default values are in play:


CREATE TABLE x (
  id INT PRIMARY KEY,
  created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

To get the created_at timestamp for a certain id **and** at the same 
time create this id in case it does not exist, yet, I can currently do:


INSERT INTO x (id) VALUES (1)
  ON CONFLICT DO UPDATE
  SET id=EXCLUDED.id
  RETURNING created_at;

However that will result in a useless UPDATE of the row.

I could probably add a trigger to prevent the UPDATE in that case. Or I 
could do something in a CTE. Or in multiple statements in plpgsql - this 
is what I currently do in application code.
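For completeness, the CTE variant I use today looks roughly like this. 
Note that under READ COMMITTED the outer SELECT can still miss a row 
inserted by a concurrent transaction after the statement's snapshot was 
taken - which is part of what the proposed clause would avoid:

```sql
WITH ins AS (
    INSERT INTO x (id) VALUES (1)
    ON CONFLICT DO NOTHING
    RETURNING created_at
)
SELECT created_at FROM ins       -- row was just inserted
UNION ALL
SELECT created_at FROM x         -- row already existed
WHERE id = 1
LIMIT 1;
```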


The attached patch adds a DO RETURN clause to be able to do this:

INSERT INTO x (id) VALUES (1)
  ON CONFLICT DO RETURN
  RETURNING created_at;

Much simpler. This will either insert or do nothing - but in both cases 
return a row.


Thoughts?

Best

Wolfgang

>From 83a0031ed2ded46cbf6fd130bd76680267db7a5e Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 25 Sep 2022 16:20:44 +0200
Subject: [PATCH v1] Add ON CONFLICT DO RETURN clause

This behaves the same as DO NOTHING, but returns the row when used together 
with RETURNING.
---
 doc/src/sgml/postgres-fdw.sgml|   6 +-
 doc/src/sgml/ref/insert.sgml  |  15 +-
 src/backend/commands/explain.c|  21 +-
 src/backend/executor/nodeModifyTable.c|  24 +-
 src/backend/optimizer/util/plancat.c  |   4 +
 src/backend/parser/gram.y |  10 +
 src/backend/parser/parse_clause.c |   7 +
 src/backend/utils/adt/ruleutils.c |   4 +
 src/include/nodes/nodes.h |   1 +
 .../expected/insert-conflict-do-nothing-2.out | 186 
 .../expected/insert-conflict-do-nothing.out   |  46 
 .../expected/partition-key-update-3.out   | 206 ++
 .../specs/insert-conflict-do-nothing-2.spec   |  11 +
 .../specs/insert-conflict-do-nothing.spec |   5 +
 .../specs/partition-key-update-3.spec |  11 +
 src/test/regress/expected/insert_conflict.out |  25 +++
 src/test/regress/sql/insert_conflict.sql  |  19 ++
 17 files changed, 587 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index bfd344cdc0..e5b6b8501f 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -80,9 +80,9 @@
  
   Note that postgres_fdw currently lacks support for
   INSERT statements with an ON CONFLICT DO
-  UPDATE clause.  However, the ON CONFLICT DO 
NOTHING
-  clause is supported, provided a unique index inference specification
-  is omitted.
+  UPDATE or ON CONFLICT DO RETURN clause.
+  However, the ON CONFLICT DO NOTHING clause is supported,
+  provided a unique index inference specification is omitted.
   Note also that postgres_fdw supports row movement
   invoked by UPDATE statements executed on partitioned
   tables, but it currently does not handle the case where a remote partition
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 7cea70329e..eb0c721637 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -36,6 +36,7 @@ INSERT INTO table_name [ AS and conflict_action is 
one of:
 
 DO NOTHING
+DO RETURN
 DO UPDATE SET { column_name = 
{ expression | DEFAULT } |
 ( column_name 
[, ...] ) = [ ROW ] ( { expression 
| DEFAULT } [, ...] ) |
 ( column_name 
[, ...] ) = ( sub-SELECT )
@@ -336,9 +337,11 @@ INSERT INTO table_name [ AS conflict_target is violated, the
 alternative conflict_action is taken.
 ON CONFLICT DO NOTHING simply avoids inserting
-a row as its alternative action.  ON CONFLICT DO
-UPDATE updates the existing row that conflicts with the
-row proposed for insertion as its alternative action.
+a row as its alternative action.  ON CONFLICT DO RETURN
+avoids inserting the row, but returns the row when 
RETURNING
+is specified.  ON CONFLICT DO UPDATE updates the
+existing row that conflicts with the row proposed for insertion as
+its alternative action.

 

@@ -379,7 +382,7 @@ INSERT INTO table_name [ AS arbiter
 indexes.  Either performs unique index
 inference, or names a constraint explicitly.  For
-ON CONFLICT DO NOTHING, it is optional to
+DO NOTHING and DO RETURN, it is 
optional to
 specify a conflict_target; when
 omitted, conflicts with all usable constraints (and unique
 indexes) are handled.  For ON CONFLICT DO
@@ -395,8 +398,8 @@ INSERT INTO table_name [ AS 
 conflict_action specifies an
 alternative ON CONFLICT action.  It can be
-either DO NOTHING, or a DO
-UPDATE clause specifying the exact details of 

Re: Allow foreign keys to reference a superset of unique columns

2022-09-25 Thread Wolfgang Walther

James Coleman:

If we have a declared constraint on x,y where x is unique based on an
index including on x I do not think we should have that fk constraint
work differently than a constraint on x,y where there is a unique
index on x,y. That would seem to be incredibly confusing behavior
(even if it would be useful for some specific use case).


I don't think it's behaving differently from how it does now. See below. 
But I can see how that could be confusing. Maybe it's just about 
describing the feature in a better way than I did so far. Or maybe it 
needs a different syntax.


Anyway, I don't think it's just a specific use case. In every use case I 
had for $subject so far, the immediate next step was to write some 
triggers to fetch those derived values from the referenced table.


Ultimately it's a question of efficiency: We can achieve the same thing 
in two ways today:
- We can either **not** add the additional column (members.tenant, 
bar.ftype in my examples) to the referencing table at all, and add 
constraint triggers that do all those checks instead. This adds 
complexity to write the triggers and more complicated RLS policies etc, 
and also is potentially slower when executing those more complicated 
queries.
- Or we can add the additional column, but also add an additional unique 
index on the referenced table, and then make it part of the FK. This 
removes some of the constraint triggers and makes RLS policies simpler 
and likely faster to execute queries. It comes at a cost of additional 
cost of storage, though - and this is something that $subject tries to 
address.


Still, even when $subject is allowed, in practice we need some of the 
triggers to fetch those dependent values. Considering that the current 
FK triggers already do the same kind of queries at the same times, it'd 
be more efficient to have those FK queries fetch those dependent values.



But this could also be a CHECK constraint to allow FKs only to a subset
of rows in the target table:


Are you suggesting a check constraint that queries another table?


No. I was talking about the CHECK constraint in my example in the next 
paragraph of that mail. The CHECK constraint on bar.ftype is a regular 
CHECK constraint, but because of how ftype is updated automatically, it 
effectively behaves like some kind of additional constraint on the FK 
itself.



This "derive the value automatically" is not what foreign key
constraints do right now at all, right? And if fact it's contradictory
to existing behavior, no?


I don't think it's contradicting. Maybe a better way to put my idea is this:

For a foreign key to a superset of unique columns, the already-unique 
columns should behave according to the specified ON UPDATE clause. 
However, the extra columns should always behave as they were ON UPDATE 
CASCADE. And additionally, they should behave similar to something like 
ON INSERT CASCADE. Although that INSERT is about the referencing table, 
not the referenced table, so the analogy isn't 100%.


I guess this would also be a more direct answer to Tom's earlier 
question about what to expect in the ON UPDATE scenario.


Best

Wolfgang




Re: Add ON CONFLICT DO RETURN clause

2022-09-25 Thread Wolfgang Walther

Peter Geoghegan:

On Sun, Sep 25, 2022 at 8:55 AM Wolfgang Walther
 wrote:

The attached patch adds a DO RETURN clause to be able to do this:

INSERT INTO x (id) VALUES (1)
ON CONFLICT DO RETURN
RETURNING created_at;

Much simpler. This will either insert or do nothing - but in both cases
return a row.


How can you tell which it was, though?


I guess I can't reliably. But isn't that the same in the ON UPDATE case?

In the use cases I had so far, I didn't need to know.


I don't see why this statement should ever perform steps for any row
that are equivalent to DO NOTHING processing -- it should at least
lock each and every affected row, if only to conclusively determine
that there really must be a conflict.

In general ON CONFLICT DO UPDATE allows the user to add a WHERE clause
to back out of updating a row based on an arbitrary predicate. DO
NOTHING has no such WHERE clause. So DO NOTHING quite literally does
nothing for any rows that had conflicts, unlike DO UPDATE, which will
at the very least lock the row (with or without an explicit WHERE
clause).

The READ COMMITTED behavior for DO NOTHING is a bit iffy, even
compared to DO UPDATE, but the advantages in bulk loading scenarios
can be decisive. Or at least they were before we had MERGE.


Agreed - it needs to lock the row. I don't think I fully understood what 
"nothing" in DO NOTHING extended to.


I guess I want DO RETURN to behave more like a DO SELECT, so with the 
same semantics as selecting the row?


Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread Wolfgang Walther

James Coleman:

So the broader point I'm trying to make is that, as I understand it,
indexes backing foreign key constraints is an implementation detail.
The SQL standard details the behavior of foreign key constraints
regardless of implementation details like a backing index. That means
that the behavior of two column foreign key constraints is defined in
a single way whether or not there's a backing index at all or whether
such a backing index, if present, contains one or two columns.

I understand that for the use case you're describing this isn't the
absolute most efficient way to implement the desired data semantics.
But it would be incredibly confusing (and, I think, a violation of the
SQL standard) to have one foreign key constraint work in a different
way from another such constraint when both are indistinguishable at
the constraint level (the backing index isn't an attribute of the
constraint; it's merely an implementation detail).


Ah, thanks, I understand better now.

The two would only be indistinguishable at the constraint level, if 
$subject was implemented by allowing to create unique constraints on a 
superset of unique columns, backed by a different index (the suggestion 
we both made independently). But if it was possible to reference a 
superset of unique columns, where there was only a unique constraint put 
on a subset of the referenced columns (the idea originally introduced in 
this thread), then there would be a difference, right?


That's if it was only the backing index that is not part of the SQL 
standard, and not also the fact that a foreign key should reference a 
primary key or unique constraint?


Anyway, I can see very well how that would be quite confusing overall. 
It would probably be wiser to allow something roughly like this (if at 
all, of course):


CREATE TABLE bar (
  b INT PRIMARY KEY,
  f INT,
  ftype foo_type GENERATED ALWAYS AS REFERENCE TO foo.type,
  FOREIGN KEY (f, ftype) REFERENCES foo (f, type)
);

It likely wouldn't work exactly like that, but given a foreign key to 
foo, the GENERATED clause could be used to fetch the value through the 
same triggers that form that FK for efficiency. My main point for now 
is: With a much more explicit syntax anything near that, this would 
certainly be an entirely different feature than $subject **and** it 
would be possible to implement on top of $subject. If at all.


So no need for me to distract this thread from $subject anymore. I think 
the idea of allowing to create unique constraints on a superset of the 
columns of an already existing unique index is a good one, so let's 
discuss this further.


Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread Wolfgang Walther

James Coleman:

As I was reading through the email chain I had this thought: could you
get the same benefit (or 90% of it anyway) by instead allowing the
creation of a uniqueness constraint that contains more columns than
the index backing it? So long as the index backing it still guaranteed
the uniqueness on a subset of columns that would seem to be safe.

Tom notes the additional columns being nullable is something to think
about. But if we go the route of allowing unique constraints with
backing indexes having a subset of columns from the constraint I don't
think the nullability is an issue since it's already the case that a
unique constraint can be declared on columns that are nullable. Indeed
it's also the case that we already support a foreign key constraint
backed by a unique constraint including nullable columns.

Because such an approach would, I believe, avoid changing the foreign
key code (or semantics) at all, I believe that would address Tom's
concerns about information_schema and fuzziness of semantics.



Could we create this additional unique constraint implicitly, when using 
FOREIGN KEY ... REFERENCES on a superset of unique columns? This would 
make it easier to use, but still give proper information_schema output.
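
To illustrate, a sketch of what such an implicit creation could look like (hypothetical behavior, not an existing feature; all names made up):

```sql
CREATE TABLE foo (a INT PRIMARY KEY, b INT);

-- The FK references a superset of foo's unique columns ...
CREATE TABLE bar (
  x INT,
  y INT,
  FOREIGN KEY (x, y) REFERENCES foo (a, b)
);

-- ... so the FK creation could implicitly run something like:
--   ALTER TABLE foo ADD CONSTRAINT foo_a_b_key UNIQUE (a, b);
-- which would then show up properly in information_schema.
```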


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

This shows that if rhaas (or whoever) performs DML on a table owned by
pg_read_all_settings, he might trigger arbitrary code written by alice
to run under his own user ID. Now, that hazard would exist anyway for
tables owned by alice, but now it also exists for any tables owned by
pg_read_all_settings.


This hazard exists for all tables that alice has been granted the 
TRIGGER privilege on. While we prevent alice from creating tables that 
are owned by pg_read_all_settings, we do not prevent inheriting the 
TRIGGER privilege.
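
A minimal sketch of that hazard, with all role and object names hypothetical:

```sql
-- alice has been granted TRIGGER on a table she does not own:
GRANT TRIGGER ON some_table TO alice;

-- alice (as herself) attaches a function she controls:
CREATE FUNCTION alice_trap() RETURNS trigger
  LANGUAGE plpgsql AS $$
BEGIN
  -- this body runs with the privileges of whoever fires the trigger
  RETURN NEW;
END $$;

CREATE TRIGGER trap BEFORE INSERT ON some_table
  FOR EACH ROW EXECUTE FUNCTION alice_trap();

-- Any other role inserting into some_table now executes
-- alice's code under its own user ID.
```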



I'm slightly skeptical of that conclusion because the whole thing just
feels a bit flimsy. Like, the whole idea that you can compromise your
account by inserting a row into somebody else's table feels a little
nuts to me. Triggers and row-level security policies make it easy to
do things that look safe and are actually very dangerous. I think
anyone would reasonably expect that calling a function owned by some
other user might be risky, because who knows what that function might
do, but it seems less obvious that accessing a table could execute
arbitrary code, yet it can. And it is even less obvious that creating
a table owned by one role might give some other role who inherits that
user's privileges to booby-trap that table in a way that might fool a
third user into doing something unsafe. But I have no idea what we
could reasonably do to improve the situation.


Right. This will always be the case when giving out the TRIGGER 
privilege on one of your tables to somebody else.


There are two kinds of TRIGGER privileges: an explicitly GRANTed 
privilege and an implicit privilege that is given to the table owner.


I think, when WITH INHERIT TRUE, SET FALSE is set, we should:
- Inherit all explicitly granted privileges
- Not inherit any DDL privileges implicitly given through ownership: 
CREATE, REFERENCES, TRIGGER.
- Inherit all other privileges implicitly given through ownership (DML + 
others)


Those implicit DDL privileges should be considered part of WITH SET 
TRUE. When you can't do SET ROLE x, then you can't act as the owner of 
any object owned by x.


Or to put it the other way around: Only allow implicit ownership 
privileges to be executed when the CURRENT_USER is the owner. But 
provide a shortcut, when you have the WITH SET TRUE option on a role, so 
that you don't need to do SET ROLE + CREATE TRIGGER, but can just do 
CREATE TRIGGER instead. This is similar to the mental model of 
"requesting and accepting a transfer of ownership" with an implicit SET 
ROLE built-in, that I used before.


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

I don't think we're going to be very happy if we redefine inheriting
the privileges of another role to mean inheriting only some of them.
That seems pretty counterintuitive to me. I also think that this
particular definition is pretty fuzzy.


Scratch my previous suggestion. A new, less fuzzy definition would be: 
Ownership is not a privilege itself and as such not inheritable.


When role A is granted to role B, two things happen:
1. Role B now has the right to use the GRANTed privileges of role A.
2. Role B now has the right to become role A via SET ROLE.

WITH SET controls whether point 2 is the case or not.

WITH INHERIT controls whether role B actually executes their right to 
use those privileges ("inheritance") **and** whether the set role is 
done implicitly for anything that requires ownership, but of course only 
WITH SET TRUE.


This is the same way that the role attributes INHERIT / NOINHERIT behave.


Your previous proposal was to make the SET attribute of a GRANT
control not only the ability to SET ROLE to the target role but also
the ability to create objects owned by that role and/or transfer
objects to that role. I think some people might find that behavior a
little bit surprising - certainly, it goes beyond what the name SET
implies - but it is at least simple enough to explain in one sentence,
and the consequences don't seem too difficult to reason about.


This would be included in the above.


Here, though, it doesn't really seem simple enough to explain in one
sentence, nor does it seem easy to reason about.


I think the "ownership is not inheritable" idea is easy to explain.


There are many operations which are permitted or declined just using
an owner-check. One example is commenting on an object. That sure
sounds like it would fit within your proposed "DDL privileges
implicitly given through ownership" category, but it doesn't really
present any security hazard, so I do not think there is a good reason
to restrict that from a user who has INHERIT TRUE, SET FALSE. Another
is renaming an object, which is a little more murky. You can't
directly usurp someone's privileges by renaming an object that they
own, but you could potentially rename an object out of the way and
replace it with one that you own and thus induce a user to do
something dangerous. I don't really want to make even small exceptions
to the idea that inheriting a role's privileges means inheriting all
of them, and I especially don't want to make large exceptions, or
exceptions that involve judgement calls about the relative degree of
risk of each possible operation.


I would not make this about security-risks only. We didn't distinguish 
between privileges and ownership that much before, because we didn't 
have WITH INHERIT or WITH SET. Now that we have both, we could do so.


The ideas of "inherited GRANTs" and "a shortcut to avoid SET ROLE to do 
owner-things" should be better to explain.


No judgement required.

All of this is to find a way to make WITH INHERIT TRUE, SET FALSE a 
"real", risk-free thing - and not just some syntactic sugar. And if that 
comes with the inability to COMMENT ON TABLE 
owned_by_pg_read_all_settings... fine. No need for that at all.


However, it would come with the inability to do SELECT * FROM 
owned_by_pg_read_all_settings, **unless** explicitly GRANTed to the 
owner, too. This might feel strange at first, but should not be a 
problem either. WITH INHERIT TRUE, SET FALSE is designed for built-in 
roles or other container roles that group a set of privileges. Those 
roles should not have objects they own anyway. And if they still do, 
denying access to those objects unless explicitly granted is the safe way.
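
In GRANT terms, the setup this is designed for would be roughly the following (a sketch using the syntax from the WITH INHERIT / WITH SET patches under discussion; alice is a made-up role):

```sql
-- pg_read_all_settings used as a pure privilege container:
GRANT pg_read_all_settings TO alice WITH INHERIT TRUE, SET FALSE;

-- alice inherits the role's explicitly granted privileges, but
-- cannot SET ROLE to it and, under this proposal, could not act
-- as the owner of any object that role happens to own.
```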


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

Scratch my previous suggestion. A new, less fuzzy definition would be:
Ownership is not a privilege itself and as such not inheritable.
[...]

If I'm understanding correctly, this would amount to a major
redefinition of what it means to inherit privileges, and I think the
chances of such a change being accepted are approximately zero.
Inheriting privileges needs to keep meaning what it means now, namely,
you inherit all the rights of the granted role.


No. Inheriting stays the same, it's just WITH SET that's different from 
what it is "now".



I don't. And even if I did think it were easy to explain, I don't
think it would be a good idea. One of my first patches to PostgreSQL
added a grantable TRUNCATE privilege to tables. I think that, under
your proposed definitions, the addition of this privilege would have
had the result that a role grant would cease to allow the recipient to
truncate tables owned by the granted role. There is currently a
proposal on the table to make VACUUM and ANALYZE grantable permissions
on tables, which would have the same issue. I think that if I made it
so that adding such privileges resulted in role inheritance not
working for those operations any more, people would come after me with
pitchforks. And I wouldn't blame them: that sounds terrible.


No, there is a misunderstanding. In my proposal, when you do WITH SET 
TRUE everything stays exactly the same as it is right now.


I'm just saying WITH SET FALSE should take away more of the things you 
can do (all the ownership things) to a point where it's safe to GRANT .. 
WITH INHERIT TRUE, SET FALSE and still be useful for pre-defined or 
privilege-container roles.


Could be discussed in the WITH SET thread, but it's a natural extension 
of the categories (1) and (2) in your original email. It's all about 
ownership.


Best

Wolfgang




Re: allowing for control over SET ROLE

2022-09-02 Thread Wolfgang Walther

Robert Haas:

Beginning in
e3ce2de09d814f8770b2e3b3c152b7671bcdb83f, the inheritance behavior of
role-grants can be overridden for individual grants, so that some
grants are inherited and others are not.


That's a great thing to have!


However, there is no similar
facility for controlling whether a role can SET ROLE to some other
role of which it is a member. At present, if role A is a member of
role B, then A can SET ROLE B, and that's it.

In some circumstances, it may be desirable to control this behavior.


+1


rhaas=# grant oncall to oncallbot with inherit false, set false, admin true;


Looking at the syntax here, I'm not sure whether adding more WITH 
options is the best way to do this. From a user perspective WITH SET 
TRUE looks more like a privilege granted on how to use this database 
object (role). Something like this would be more consistent with the 
other GRANT variants:


GRANT SET ON ROLE oncall TO oncallbot WITH GRANT OPTION;

This is obviously not exactly the same as the command above, because 
oncallbot would be able to use SET ROLE directly. But as discussed, this 
is more cosmetic anyway, because they could GRANT it to themselves.


The full syntax could look like this:

GRANT { INHERIT | SET | ALL [ PRIVILEGES ] }
  ON ROLE role_name [, ...]
  TO role_specification [, ...] WITH GRANT OPTION
  [ GRANTED BY role_specification ]

With this new syntax, the existing

GRANT role_name TO role_specification [WITH ADMIN OPTION];

would be the same as

GRANT ALL ON ROLE role_name TO role_specification [WITH GRANT OPTION];

This would slightly change the way INHERIT works: As a privilege, it 
would not override the member's role INHERIT attribute, but would 
control whether that attribute is applied. This means:


- INHERIT attribute + INHERIT granted -> inheritance (same)
- INHERIT attribute + INHERIT not granted -> no inheritance (different!)
- NOINHERIT attribute  + INHERIT not granted -> no inheritance (same)
- NOINHERIT attribute  + INHERIT granted -> no inheritance (different!)

This would allow us to do the following:

GRANT INHERIT ON ROLE pg_read_all_settings TO seer_bot WITH GRANT OPTION;

seer_bot would now be able to GRANT pg_read_all_settings to other users, 
too - but without the ability to use or grant SET ROLE to anyone. As 
long as seer_bot has the NOINHERIT attribute set, they wouldn't use that 
privilege, though - which might be desired for the bot.


Similarly, it would be possible for the oncallbot in the example above to 
be able to grant SET ROLE only - and not INHERIT.


I realize that there has been a lot of discussion about roles and 
privileges in the past year. I have tried to follow those discussions, 
but it's likely that I missed some good arguments against my proposal above.


Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-02 Thread Wolfgang Walther

Kaiting Chen:

I'd like to propose a change to PostgreSQL to allow the creation of a foreign
key constraint referencing a superset of uniquely constrained columns.


+1

Tom Lane:

TBH, I think this is a fundamentally bad idea and should be rejected
outright.  It fuzzes the semantics of the FK relationship, and I'm
not convinced that there are legitimate use-cases.  Your example
schema could easily be dismissed as bad design that should be done
some other way.


I had to add quite a few unique constraints on a superset of already 
uniquely constrained columns in the past, just to be able to support FKs 
to those columns. I think those cases most often come up when dealing 
with slightly denormalized schemas, e.g. for efficiency.


One other use-case I had recently was along the following lines, in 
abstract terms:


CREATE TABLE classes (class INT PRIMARY KEY, ...);

CREATE TABLE instances (
  instance INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Think about classes and instances as in OOP. So the table classes 
contains some definitions for different types of object and the table 
instances realizes them into concrete objects.


Now, assume you have some property of a class than is best modeled as a 
table like this:


CREATE TABLE classes_prop (
  property INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Now, assume you need to store data for each of those classes_prop rows 
for each instance. You'd do the following:


CREATE TABLE instances_prop (
  instance INT REFERENCES instances,
  property INT REFERENCES classes_prop,
  ...
);

However, this does not ensure that the instance and the property you're 
referencing in instances_prop are actually from the same class, so you 
add a class column:


CREATE TABLE instances_prop (
  instance INT,
  class INT,
  property INT,
  FOREIGN KEY (instance, class) REFERENCES instances,
  FOREIGN KEY (property, class) REFERENCES classes_prop,
  ...
);

But this won't work, without creating some UNIQUE constraints on those 
supersets of the PK column first.
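
For completeness, the extra constraints needed today to make the example above work (a sketch of the current workaround):

```sql
-- Today, the referenced column sets must be declared unique first:
ALTER TABLE instances    ADD UNIQUE (instance, class);
ALTER TABLE classes_prop ADD UNIQUE (property, class);

-- Only then are the two composite FKs accepted:
CREATE TABLE instances_prop (
  instance INT,
  class    INT,
  property INT,
  FOREIGN KEY (instance, class) REFERENCES instances (instance, class),
  FOREIGN KEY (property, class) REFERENCES classes_prop (property, class)
);
```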



For one example of where the semantics get fuzzy, it's not
very clear how the extra-baggage columns ought to participate in
CASCADE updates.  Currently, if we have
CREATE TABLE foo (a integer PRIMARY KEY, b integer);
then an update that changes only foo.b doesn't need to update
referencing tables, and I think we even have optimizations that
assume that if no unique-key columns are touched then RI checks
need not be made.  But if you did
CREATE TABLE bar (x integer, y integer,
  FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE 
CASCADE);
then perhaps you expect bar.y to be updated ... or maybe you don't?


In all use-cases I had so far, I would expect bar.y to be updated, too.

I think it would not even be possible to NOT update bar.y, because the 
FK would then not match anymore. foo.a is the PK, so the value in bar.x 
already forces bar.y to be the same as foo.b at all times.


bar.y is a little bit like a generated value in that sense, it should 
always match foo.b. I think it would be great, if we could actually go a 
step further, too: On an update to bar.x to a new value, if foo.a=bar.x 
exists, I would like to set bar.y automatically to the new foo.b. 
Otherwise those kind of updates always have to either query foo before, 
or add a trigger to do the same.
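
Such a trigger-based workaround might look roughly like this for the foo/bar example (a sketch of today's manual solution, not part of the proposal):

```sql
-- Keep bar.y in sync with foo.b whenever bar.x is set or changed:
CREATE FUNCTION bar_sync_y() RETURNS trigger
  LANGUAGE plpgsql AS $$
BEGIN
  SELECT b INTO NEW.y FROM foo WHERE a = NEW.x;
  RETURN NEW;
END $$;

CREATE TRIGGER bar_sync_y
  BEFORE INSERT OR UPDATE OF x ON bar
  FOR EACH ROW EXECUTE FUNCTION bar_sync_y();
```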


In the classes/instances example above, when updating 
instances_prop.property to a new value, instances_prop.class would be 
updated automatically to match classes_prop.class. This would fail, when 
the class is different than the class required by the FK to instances, 
though, providing exactly the safe-guard that this constraint was 
supposed to provide, without incurring additional overhead in update 
statements.


In the foo/bar example above, which is just a bit of denormalization, 
this automatic update would also be helpful - because rejecting the 
update on the grounds that the columns don't match doesn't make sense here.



Another example is that I think the idea is only well-defined when
the subset column(s) are a primary key, or at least all marked NOT NULL.
Otherwise they're not as unique as you're claiming.


I fail to see why. My understanding is that rows with NULL values in the 
referenced table can't participate in FK matches anyway, because both 
MATCH SIMPLE and MATCH FULL wouldn't require a match when any/all of the 
columns in the referencing table are NULL. MATCH PARTIAL is not 
implemented, so I can't tell whether the semantics would be different there.


I'm not sure whether a FK on a superset of unique columns would be 
useful with MATCH SIMPLE. Maybe it could be forced to be MATCH FULL, if 
MATCH SIMPLE is indeed not well-defined.



It's also unclear to me how this ought to interact with the
information_schema views concerning foreign keys.  We generally
feel that we don't want to present any non-SQL-compatible data
in information_schema, for fear that it will confuse 

Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Wolfgang Walther

Robert Haas:

Fairly obviously, my thinking here is biased by having written the
patch to allow restricting SET ROLE. If alice can neither inherit
bob's privileges nor SET ROLE bob, she had better not be able to
create objects owned by bob, because otherwise she can make a table,
add an expression index that calls a user-defined function, do stuff
until it needs to be autovacuumed, and then give it to bob, and boom,
exploit. But that doesn't mean that the is_member_of_role() tests here
have to be changed to has_privs_of_role(). They could be changed to
has_privs_of_role() || member_can_set_role(). And if the consensus is
to do it that way, I'm OK with that.

I'm just a little unconvinced that it's actually the best route. I
think that logic of the form "well Alice could just SET ROLE and do it
anyway" is weak -- and not only because of the patch to allow
restricting SET ROLE, but because AFAICT there is no point to the
INHERIT option in the first place unless it is to force you to issue
SET ROLE. That is literally the only thing it does. If we're going to
have weird exceptions where you don't have to SET ROLE after all, why
even have INHERIT in the first place?


I think to change the owner of an object from role A to role B, you just 
need a different "privilege" on that role B to "use" the role that way, 
which is distinct from INHERIT or SET ROLE "privileges".


When you are allowed to INHERIT a role, you are allowed to use the 
GRANTs that have been given to this role. When you are allowed to SET 
ROLE, then you are allowed to switch into this role. You could think of 
another "privilege", USAGE on a role, which would allow you to "use" 
this role as a target in a statement to change the owner of an object.


To change the owner for an object from role A to role B, you need:
- the privilege to ALTER the object, which is implied when you are A
- the privilege to "use" role B as a target

So basically the privilege to use role B as the new owner, is a 
privilege you have **on** the role object B, while the privilege to 
change the owner of an object is something you have **through** your 
membership in role A.


Up to v15, there were no separate privileges for this. You were either a 
member of a role or you were not. Now with INHERIT and maybe SET ROLE 
privileges/grant options, we can do two things:
- Keep the ability to use a role as a target in those statements as the 
most basic privilege on a role, that is implied by membership in that 
role and can't be taken away (currently the case), or

- invent a new privilege or grant option to allow changing that.

But mixing this with either INHERIT or SET ROLE doesn't make sense, imho.

Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Wolfgang Walther

Robert Haas:

I think to change the owner of an object from role A to role B, you just
need a different "privilege" on that role B to "use" the role that way,
which is distinct from INHERIT or SET ROLE "privileges".


It's not distinct, though, because if you can transfer ownership of a
table to another user, you can use that ability to gain the privileges
of that user.


Right, but the inverse is not necessarily true, so you could have SET 
ROLE privileges, but not "USAGE" - and then couldn't change the owner of 
an object to this role.


USAGE is not a good term, because it implies "least amount of 
privileges", but in this case it's quite the opposite.


In any case, adding a grant option for SET ROLE, while keeping the 
required privileges for a transfer of ownership at the minimum 
(membership only), doesn't really make sense. I guess both threads 
should be discussed together?


Best

Wolfgang




Building with meson on NixOS/nixpkgs

2024-03-16 Thread Wolfgang Walther
To build on NixOS/nixpkgs I came up with a few small patches to 
meson.build. All of this works fine with Autoconf/Make already.

From 24ae72b9b0adc578c6729eff59c9038e6b4ac517 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 17:18:38 +0100
Subject: [PATCH 1/3] Fallback to uuid for ossp-uuid with meson

The upstream name for the ossp-uuid package / pkg-config file is "uuid". Many
distributions change this to be "ossp-uuid" to not conflict with e2fsprogs.

This lookup fails on distributions which don't change this name, for example
NixOS / nixpkgs. Both "ossp-uuid" and "uuid" are also checked in configure.ac.
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c8fdfeb0ec3..d9ad1ce902f 100644
--- a/meson.build
+++ b/meson.build
@@ -1346,7 +1346,7 @@ if uuidopt != 'none'
 uuidfunc = 'uuid_to_string'
 uuidheader = 'uuid.h'
   elif uuidopt == 'ossp'
-uuid = dependency('ossp-uuid', required: true)
+uuid = dependency('ossp-uuid', 'uuid', required: true)
 uuidfunc = 'uuid_export'
 uuidheader = 'uuid.h'
   else
-- 
2.44.0

From 56e0abbcc3b950b6e93eddc6ede453ce529423ea Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 22:06:25 +0100
Subject: [PATCH 2/3] Fallback to clang in PATH with meson

Some distributions put clang into a different path than the llvm binary path.

For example, this is the case on NixOS / nixpkgs, which failed to find clang
with meson before this patch.
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index d9ad1ce902f..7c64fb04ea5 100644
--- a/meson.build
+++ b/meson.build
@@ -759,7 +759,7 @@ if add_languages('cpp', required: llvmopt, native: false)
 llvm_binpath = llvm.get_variable(configtool: 'bindir')
 
 ccache = find_program('ccache', native: true, required: false)
-clang = find_program(llvm_binpath / 'clang', required: true)
+clang = find_program(llvm_binpath / 'clang', 'clang', required: true)
   endif
 elif llvmopt.auto()
   message('llvm requires a C++ compiler')
-- 
2.44.0

From 62f10689d843227fca6d54e86462d0be5c4f434f Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Mon, 11 Mar 2024 19:54:41 +0100
Subject: [PATCH 3/3] Support absolute bindir/libdir in regression tests with
 meson

Passing an absolute bindir/libdir will install the binaries and libraries to
/tmp_install/ and /tmp_install/ respectively.

This path is correctly passed to the regression test suite via configure/make,
but not via meson, yet. This is because the "/" operator in the following expression
throws away the whole left side when the right side is an absolute path:

  test_install_location / get_option('libdir')

This was already correctly handled for dir_prefix, which is likely absolute as well.
This patch handles both bindir and libdir in the same way - prefixing absolute paths
with the tmp_install path correctly.
---
 meson.build | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 7c64fb04ea5..e8438209058 100644
--- a/meson.build
+++ b/meson.build
@@ -3044,15 +3044,17 @@ test_install_destdir = meson.build_root() / 'tmp_install/'
 if build_system != 'windows'
   # On unixoid systems this is trivial, we just prepend the destdir
   assert(dir_prefix.startswith('/')) # enforced by meson
-  test_install_location = '@0@@1@'.format(test_install_destdir, dir_prefix)
+  temp_install_bindir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_bin)
+  temp_install_libdir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_lib)
 else
   # drives, drive-relative paths, etc make this complicated on windows, call
   # into a copy of meson's logic for it
   command = [
 python, '-c',
 'import sys; from pathlib import PurePath; d1=sys.argv[1]; d2=sys.argv[2]; print(str(PurePath(d1, *PurePath(d2).parts[1:])))',
-test_install_destdir, dir_prefix]
-  test_install_location = run_command(command, check: true).stdout().strip()
+test_install_destdir]
+  temp_install_bindir = run_command(command, dir_prefix / dir_bin, check: true).stdout().strip()
+  temp_install_libdir = run_command(command, dir_prefix / dir_lib, check: true).stdout().strip()
 endif
 
 meson_install_args = meson_args + ['install'] + {
@@ -3089,7 +3091,6 @@ testport = 4
 
 test_env = environment()
 
-temp_install_bindir = test_install_location / get_option('bindir')
 test_initdb_template = meson.build_root() / 'tmp_install' / 'initdb-template'
 test_env.set('PG_REGRESS', pg_regress.full_path())
test_env.set(

Re: Building with meson on NixOS/nixpkgs

2024-03-21 Thread Wolfgang Walther

Nazir Bilal Yavuz:

0001 & 0002: Adding code comments to explain why they have fallback
could be nice.
0003: Looks good to me.


Added some comments in the attached.

Best,

Wolfgang

From 2d271aafd96a0ea21710a06ac5236e47217c36d1 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 17:18:38 +0100
Subject: [PATCH v2 1/3] Fallback to uuid for ossp-uuid with meson

The upstream name for the ossp-uuid package / pkg-config file is "uuid". Many
distributions change this to be "ossp-uuid" to not conflict with e2fsprogs.

This lookup fails on distributions which don't change this name, for example
NixOS / nixpkgs. Both "ossp-uuid" and "uuid" are also checked in configure.ac.
---
 meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c8fdfeb0ec3..942c79c8be3 100644
--- a/meson.build
+++ b/meson.build
@@ -1346,7 +1346,8 @@ if uuidopt != 'none'
 uuidfunc = 'uuid_to_string'
 uuidheader = 'uuid.h'
   elif uuidopt == 'ossp'
-uuid = dependency('ossp-uuid', required: true)
+# upstream is called "uuid", but many distros change this to "ossp-uuid"
+uuid = dependency('ossp-uuid', 'uuid', required: true)
 uuidfunc = 'uuid_export'
 uuidheader = 'uuid.h'
   else
-- 
2.44.0

From f150a8dcb92b08eab40b5dfec130a18f297c709f Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 22:06:25 +0100
Subject: [PATCH v2 2/3] Fallback to clang in PATH with meson

Some distributions put clang into a different path than the llvm binary path.

For example, this is the case on NixOS / nixpkgs, which failed to find clang
with meson before this patch.
---
 meson.build | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 942c79c8be3..f9e96b01cfa 100644
--- a/meson.build
+++ b/meson.build
@@ -759,7 +759,10 @@ if add_languages('cpp', required: llvmopt, native: false)
 llvm_binpath = llvm.get_variable(configtool: 'bindir')
 
 ccache = find_program('ccache', native: true, required: false)
-clang = find_program(llvm_binpath / 'clang', required: true)
+
+# Some distros put LLVM and clang in different paths, so fallback to
+# find via PATH, too.
+clang = find_program(llvm_binpath / 'clang', 'clang', required: true)
   endif
 elif llvmopt.auto()
   message('llvm requires a C++ compiler')
-- 
2.44.0

From fddee56b5e27a3b5e4c406e8caa2d230b49eb447 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Mon, 11 Mar 2024 19:54:41 +0100
Subject: [PATCH v2 3/3] Support absolute bindir/libdir in regression tests
 with meson

Passing an absolute bindir/libdir will install the binaries and libraries to
/tmp_install/ and /tmp_install/ respectively.

This path is correctly passed to the regression test suite via configure/make,
but not via meson, yet. This is because the "/" operator in the following expression
throws away the whole left side when the right side is an absolute path:

  test_install_location / get_option('libdir')

This was already correctly handled for dir_prefix, which is likely absolute as well.
This patch handles both bindir and libdir in the same way - prefixing absolute paths
with the tmp_install path correctly.
---
 meson.build | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index f9e96b01cfa..144c443b5e4 100644
--- a/meson.build
+++ b/meson.build
@@ -3048,15 +3048,17 @@ test_install_destdir = meson.build_root() / 'tmp_install/'
 if build_system != 'windows'
   # On unixoid systems this is trivial, we just prepend the destdir
   assert(dir_prefix.startswith('/')) # enforced by meson
-  test_install_location = '@0@@1@'.format(test_install_destdir, dir_prefix)
+  temp_install_bindir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_bin)
+  temp_install_libdir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_lib)
 else
   # drives, drive-relative paths, etc make this complicated on windows, call
   # into a copy of meson's logic for it
   command = [
 python, '-c',
 'import sys; from pathlib import PurePath; d1=sys.argv[1]; d2=sys.argv[2]; print(str(PurePath(d1, *PurePath(d2).parts[1:])))',
-test_install_destdir, dir_prefix]
-  test_install_location = run_command(command, check: true).stdout().strip()
+test_install_destdir]
+  temp_install_bindir = run_command(command, dir_prefix / dir_bin, check: true).stdout().strip()
+  temp_install_libdir = run_command(command, dir_prefix / dir_lib, check: true).stdout().strip()
 endif
 
 meson_install_args = meson_args + ['install'] + {
@@ -3093,7 +3095,6 @@ testport = 4
 
 tes

Building with musl in CI and the build farm

2024-03-26 Thread Wolfgang Walther
The need to do $subject came up in [1]. Moving this to a separate 
discussion on -hackers, because there are more issues to solve than just 
the LD_LIBRARY_PATH problem.


Andres Freund:

FWIW, except for one small issue, building postgres against musl works on
debian and the tests pass if I install first.


The small problem mentioned above is that on debian linux/fs.h isn't available
when building with musl, which in turn causes src/bin/pg_upgrade/file.c to
fail to compile.  I assume that's not the case on "fully musl" distro?


Correct, I have not seen this before on Alpine.

Here is my progress setting up a buildfarm animal to run on Alpine Linux 
and the issues I found, so far:


The animal runs in a docker container via GitHub Actions in [2]. Right 
now it's still running with --test, until I get the credentials to 
activate it.


I tried to enable everything (except systemd, because Alpine doesn't 
have it) and run all tests. The LDAP tests are failing right now, but 
that is likely something that I need to fix in the Dockerfile - it's 
failing to start the slapd, IIRC. There are other issues, though - all 
of them have open pull requests in that repo [3].


I also had to skip the recovery check. Andrew mentioned that he had to 
do that, too, when he was still running his animal on Alpine. Not sure 
what this is about, yet.


Building --with-icu fails two tests. One of them (001_initdb) is fixed 
by having the "locale" command in your PATH, which is not the case on 
Alpine by default. I assume this will not break on your debian/musl 
build, Andres - but it will also probably not return any sane values, 
because it will run glibc's locale command.
I haven't looked into that in detail, yet, but I think the other test 
(icu/010_database) fails because it expects that setlocale(LC_COLLATE, 
) throws an error. I think it doesn't do that on musl, 
because LC_COLLATE is not implemented.
Those failing tests are not "just failing", but probably mean that we 
need to do something about how we deal with locale/setlocale on musl.
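A minimal sketch of the suspected difference (the locale name here is made up, and the real TAP test exercises this through CREATE DATABASE, not like this):

```python
import locale

def rejects_unknown_collate(name: str) -> bool:
    """True if the C library refuses an unknown LC_COLLATE locale name."""
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return False   # musl tends to accept anything here
    except locale.Error:
        return True    # glibc rejects unknown locale names

# Prints True on glibc, but likely False on musl, where LC_COLLATE is
# effectively not implemented - which is what makes the test fail.
print(rejects_unknown_collate("xx_XX.UTF-8"))
```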


The last failure is about building --with-nls. This fails with something 
like:


ld: src/port/strerror.c:72:(.text+0x2d8): undefined reference to 
`libintl_gettext'


Of course, gettext-dev is installed, libintl.so is available in /usr/lib 
and it also contains the symbol. So not sure what's happening here.


Andres, did you build --with-icu and/or --with-nls on debian/musl? Did 
you run the recovery tests?


Best,

Wolfgang

[1]: 
https://postgr.es/m/fddd1cd6-dc16-40a2-9eb5-d7fef2101488%40technowledgy.de
[2]: 
https://github.com/technowledgy/postgresql-buildfarm-alpine/actions/workflows/run.yaml

[3]: https://github.com/technowledgy/postgresql-buildfarm-alpine/pulls




Re: Building with musl in CI and the build farm

2024-04-04 Thread Wolfgang Walther

Peter Eisentraut:

On 31.03.24 15:34, walt...@technowledgy.de wrote:
I'd rather adapt one of the existing tasks, to avoid increasing CI 
costs unduly.


I looked into this and I think the only task that could be changed is 
the SanityCheck.


I think SanityCheck should run a simple, "average" environment, like the 
current Debian one.  Otherwise, niche problems with musl or multi-arch 
or whatever will throw off the entire build pipeline.


All the errors/problems I have seen so far, while setting up the 
buildfarm animal on Alpine Linux, have been way beyond what SanityCheck 
does. Problems only appeared in the tests suites, of which sanity check 
only runs *very* basic ones. I don't have much experience with the 
"cross" setup, that "musl on debian" essentially is, though.


All those things are certainly out of scope for CI - they are tested in 
the build farm instead.


I do agree: SanityCheck doesn't feel like the right place to put this. 
But on the other side.. if it really fails to *build* with musl, then it 
shouldn't make a difference whether you will be notified about that 
immediately or later in the CI pipeline. It certainly needs the fewest 
additional resources to put it there.


I'm not sure what Andres meant by "adapting one of the existing 
tasks". It could fit as another step in the "Linux - Debian Bullseye - 
Autoconf" task, too, a bit similar to how the meson task builds for 32 
and 64 bit. This would still not be an entirely new task like I proposed 
initially (to run in Docker).


Best,

Wolfgang




Re: Building with musl in CI and the build farm

2024-04-04 Thread Wolfgang Walther

Tom Lane:

That is not the concern here.  What I think Peter is worried about,
and certainly what I'm worried about, is that a breakage in
SanityCheck comprehensively breaks all CI testing for all Postgres
developers.


You'd have to commit a failing patch first to break CI for all other 
developers. If you're only going to commit patches that pass those CI 
tasks, then this is not going to happen. Then it only becomes a question 
of how much feedback *you* get from a single CI run of your own patch.



To be blunt, I do not think we need to test musl in the CI pipeline.
I see it as one of the niche platforms that the buildfarm exists
to test.


I don't really have an opinion on this. I'm fine with having musl in the 
buildfarm only. I don't expect the core build itself to fail with musl 
anyway, this has been working fine for years. Andres asked for it to be 
added to CI, so maybe he sees more value on top of just "building with 
musl"?


Best,

Wolfgang




Re: Building with musl in CI and the build farm

2024-04-04 Thread Wolfgang Walther

Tom Lane:

You'd have to commit a failing patch first to break CI for all other
developers.


No, what I'm more worried about is some change in the environment
causing the build to start failing.  When that happens, it'd better
be an environment that many of us are familiar with and can test/fix.


The way I understand it, the images for the VMs in which those CI tasks 
run are not just dynamically updated - they are actually tested before 
they are used in CI. So the environment doesn't just change suddenly.


See e.g. [1] for a pull request to the repo containing those images to 
update the linux debian image from bullseye to bookworm. This is exactly 
the image we're talking about. Before this image is used in postgres CI, 
it's tested and confirmed that it actually works there. If one of the 
jobs was using musl - that would be tested as well. So this job would 
not just suddenly start failing for everybody.


I do see the "familiarity" argument for the SanityCheck task, but for a 
different reason: Even though it's unlikely for this job to fail for 
musl specific reasons - if you're not familiar with musl and can't 
easily test it locally, you might not be able to tell immediately 
whether it's musl specific or not. If musl was run in one of the later 
jobs, it's much different: You see all tests failing - alright, not musl 
specific. You see only the musl test failing - yeah, musl problem. This 
should give developers much more confidence looking at the results.


Best,

Wolfgang

[1]: https://github.com/anarazel/pg-vm-images/pull/91




Docs: Order of json aggregate functions

2024-06-19 Thread Wolfgang Walther
The order of json related aggregate functions in the docs is currently 
like this:


[...]
json_agg
json_objectagg
json_object_agg
json_object_agg_strict
json_object_agg_unique
json_arrayagg
json_object_agg_unique_strict
max
min
range_agg
range_intersect_agg
json_agg_strict
[...]

json_arrayagg and json_agg_strict are out of place.

The attached patch puts them in the right spot. This is the same down to v16.

Best,

Wolfgang

From ad857a824d893a3e421c6c577c1215f71c1ebfe3 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Wed, 19 Jun 2024 19:40:49 +0200
Subject: [PATCH v1] Fix order of json aggregate functions in docs

---
 doc/src/sgml/func.sgml | 96 +-
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2609269610b..c3b342d832f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21790,6 +21790,54 @@ SELECT NULLIF(value, '(none)') ...
No
   
 
+  
+   
+
+ json_agg_strict
+
+json_agg_strict ( anyelement )
+json
+   
+   
+
+ jsonb_agg_strict
+
+jsonb_agg_strict ( anyelement )
+jsonb
+   
+   
+Collects all the input values, skipping nulls, into a JSON array.
+Values are converted to JSON as per to_json
+or to_jsonb.
+   
+   No
+  
+
+  
+   
+json_arrayagg
+json_arrayagg (
+ value_expression 
+ ORDER BY sort_expression 
+ { NULL | ABSENT } ON NULL 
+ RETURNING data_type  FORMAT JSON  ENCODING UTF8   )
+   
+   
+Behaves in the same way as json_array
+but as an aggregate function so it only takes one
+value_expression parameter.
+If ABSENT ON NULL is specified, any NULL
+values are omitted.
+If ORDER BY is specified, the elements will
+appear in the array in that order rather than in the input order.
+   
+   
+SELECT json_arrayagg(v) FROM (VALUES(2),(1)) t(v)
+[2, 1]
+   
+   No
+  
+
   

  json_objectagg
@@ -21900,31 +21948,6 @@ SELECT NULLIF(value, '(none)') ...
No
   
 
-  
-   
-json_arrayagg
-json_arrayagg (
- value_expression 
- ORDER BY sort_expression 
- { NULL | ABSENT } ON NULL 
- RETURNING data_type  FORMAT JSON  ENCODING UTF8   )
-   
-   
-Behaves in the same way as json_array
-but as an aggregate function so it only takes one
-value_expression parameter.
-If ABSENT ON NULL is specified, any NULL
-values are omitted.
-If ORDER BY is specified, the elements will
-appear in the array in that order rather than in the input order.
-   
-   
-SELECT json_arrayagg(v) FROM (VALUES(2),(1)) t(v)
-[2, 1]
-   
-   No
-  
-
   

 
@@ -22033,29 +22056,6 @@ SELECT NULLIF(value, '(none)') ...
No
   
 
-  
-   
-
- json_agg_strict
-
-json_agg_strict ( anyelement )
-json
-   
-   
-
- jsonb_agg_strict
-
-jsonb_agg_strict ( anyelement )
-jsonb
-   
-   
-Collects all the input values, skipping nulls, into a JSON array.
-Values are converted to JSON as per to_json
-or to_jsonb.
-   
-   No
-  
-
   

 
-- 
2.45.1



Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
2. But my main argument is, it is not really safe - it solves Peter's 
use case, but if I use a reverse example of Peter's case, I still have a 
problem.


I can have a variable x, and then I can write query like `SELECT x FROM x`;

but if somebody creates table x(x int), then the query `SELECT x FROM x` 
will be correct, but it is surely something else. So the requirement to 
use the variable inside the FROM clause doesn't help. It doesn't work.


But in this case you could make variables and tables share the same 
namespace, i.e. forbid creating a variable with the same name as an 
already existing table.
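A toy model of what such a shared namespace would mean (pure illustration, nothing like the actual catalog code):

```python
class SharedNamespace:
    """Toy model: tables and session variables draw from one name pool."""

    def __init__(self):
        self._objects = {}

    def create(self, kind: str, name: str) -> None:
        # Refuse any name that is already taken, regardless of object kind.
        if name in self._objects:
            raise ValueError(
                f'"{name}" already exists as a {self._objects[name]}')
        self._objects[name] = kind

ns = SharedNamespace()
ns.create("table", "x")
try:
    ns.create("variable", "x")   # conflicts: the name is taken by the table
except ValueError as e:
    print(e)
# -> "x" already exists as a table
```

With that rule in place, `SELECT x FROM x` can never silently switch meaning between a variable and a later-created table.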


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:

But in this case you could make variables and tables share the same
namespace, i.e. forbid creating a variable with the same name as an
already existing table.


It helps, but not on 100% - there is a search path


I think we can ignore the search_path for this discussion. That's not a 
problem of variables vs tables, but just a search path related problem. 
It is exactly the same thing right now, when you create a new table x(x) 
in a schema which happens to be earlier in your search path.


The objection to the proposed approach for variables was that it would 
introduce *new* ambiguities, which Alvaro's suggestion avoids.


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
Session variables can be used in queries, but should also be usable in 
PL/pgSQL expressions, and then the mandatory usage in the FROM clause 
would cause a lot of problems and unreadable code, like


DO $$
BEGIN
   RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);

END
$$

This requirement makes variables unusable in PL/pgSQL.


I already proposed earlier to only require listing them in FROM when 
there is actually a related FROM.


In this case you could still write:

RAISE NOTICE '% %', x, (SELECT a,b FROM y);

(assuming only x is a variable here)

Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
When you write RAISE NOTICE '%', x, then the PL/pgSQL parser rewrites it 
to RAISE NOTICE '%', SELECT $1


There is no parser just for expressions.


That's why my suggestion in [1] already made a difference between:

SELECT var;

and

SELECT col, var FROM table, var;

So the "only require variable-in-FROM if FROM is used" should extend to 
the SQL level.


That should be possible, right?
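A toy illustration of that rule (plain string matching, nothing like the real SQL parser):

```python
import re

def requires_from_listing(query: str) -> bool:
    # Toy heuristic for the rule discussed above: variables would only
    # need to be listed in FROM when the query has a FROM clause at all.
    return re.search(r"\bFROM\b", query, re.IGNORECASE) is not None

print(requires_from_listing("SELECT var"))                     # False
print(requires_from_listing("SELECT col, var FROM tab, var"))  # True
```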

Best,

Wolfgang

[1]: 
https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de





Build with LTO / -flto on macOS

2024-06-03 Thread Wolfgang Walther
Building with clang and -flto on macOS currently fails with errors 
similar to [1]. This is because the --export-dynamic flag is called 
-export_dynamic [2] instead and we have not been passing this variant to 
the linker, so far.


Attached patch fixes that for configure/make.

CC: Tom, who hit the same in [3] and Andres who last touched 
--export-dynamic in 9db49fc5bfdc0126be03f4b8986013e59d93b91d.


Will also create an issue upstream for meson, because the logic is 
built-in there.


Would be great if this could be back-patched, since this is the same in 
all live versions.


Best,

Wolfgang

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: 
https://opensource.apple.com/source/ld64/ld64-609/doc/man/man1/ld.1.auto.html 
(grep for export_dynamic)

[3]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us

From 55175631d25af11971ec7b7f89d5bf4958ed3c8b Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 2 Jun 2024 10:46:56 +0200
Subject: [PATCH v1] Make building with clang's LTO work on macOS

When building with -flto the backend binary must keep many otherwise
unused symbols to make them available to dynamically loaded modules /
extensions.  This has been done via -Wl,--export-dynamic on many
platforms for years.  This flag is not supported by Apple's clang,
though.  Here it's called -Wl,-export_dynamic instead.

Thus, make configure pick up on this variant of the flag as well.  Meson
has the logic to detect this flag built-in, but doesn't support this
variant either.  This needs to be raised upstream.

Without this fix, building with -flto fails with errors similar to [1]
and [2].  This happens for all currently live versions, including 17 and
HEAD.

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us
---
 configure| 39 +++
 configure.ac |  2 ++
 2 files changed, 41 insertions(+)

diff --git a/configure b/configure
index 7b03db56a67..4c0a1383428 100755
--- a/configure
+++ b/configure
@@ -19135,6 +19135,7 @@ fi
 # For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
 # (backend specific ldflags). One some platforms this will always fail (e.g.,
 # windows), but on others it depends on the choice of linker (e.g., solaris).
+# macos uses -export_dynamic instead.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE" >&5
 $as_echo_n "checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE... " >&6; }
 if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic+:} false; then :
@@ -19173,6 +19174,44 @@ if test x"$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic" = x"yes"; then
   LDFLAGS_EX_BE="${LDFLAGS_EX_BE} -Wl,--export-dynamic"
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE" >&5
+$as_echo_n "checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE... " >&6; }
+if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_LDFLAGS=$LDFLAGS
+LDFLAGS="$pgac_save_LDFLAGS -Wl,-export_dynamic"
+if test "$cross_compiling" = yes; then :
+  pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic="assuming no"
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+extern void $link_test_func (); void (*fptr) () = $link_test_func;
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_run "$LINENO"; then :
+  pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic=yes
+else
+  pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+LDFLAGS="$pgac_save_LDFLAGS"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic" >&5
+$as_echo "$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic" >&6; }
+if test x"$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic" = x"yes"; then
+  LDFLAGS_EX_BE="${LDFLAGS_EX_BE} -Wl,-export_dynamic"
+fi
+
 
 
 # Create compiler version string
diff --git a/configure.ac b/configure.ac
index 63e7be38472..4f5053fe592 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2423,7 +2423,9 @@ fi
 # For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
 # (backend specific ldflags). One some platforms this will always fail (e.g.,
 # windows), but on others it depends on the choice of linker (e.g., solaris).
+# macos uses -export_dynamic instead.
 PGAC_PROG_CC_LD_VARFLAGS_OPT(LDFLAGS_EX_BE, [-Wl,--export-dynamic], $link_test_func)
+PGAC_PROG_CC_LD_VARFLAGS_OPT(LDFLAGS_EX_BE, [-Wl,-export_dynamic], $link_test_func)
 AC_SUBST(LDFLAGS_EX_BE)
 
 # Create compiler version string
-- 
2.45.1



Re: Build with LTO / -flto on macOS

2024-06-03 Thread Wolfgang Walther

Peter Eisentraut:
It's probably worth clarifying that this option is needed on macOS only 
if LTO is also enabled.  For standard (non-LTO) builds, the 
export-dynamic behavior is already the default on macOS (otherwise 
nothing in PostgreSQL would work).


Right, man page say this:

> Preserves all global symbols in main executables during LTO.  Without
> this option, Link Time Optimization is allowed to inline and remove
> global functions. This option is used when a main executable may load a
> plug-in which requires certain symbols from the main executable.


Peter:
I don't think we explicitly offer LTO builds as part of the make build 
system, so anyone trying this would do it sort of self-service, by 
passing additional options to configure or make.  In which case they 
might as well pass the -export_dynamic option along in the same way?


The challenge is that passing this along to everything, e.g. via CFLAGS, 
defeats the purpose of LTO. The Makefiles set this in LDFLAGS_EX_BE 
only, so it only affects the backend binary. This is not at all obvious, 
and it took me quite a while to figure out why LTO silently didn't strip 
symbols from the other binaries. It does work to explicitly set 
LDFLAGS_EX_BE, though.


Also, passing the LTO flag on Linux "just works" (clang, not GCC 
necessarily).


I don't mind addressing this in PG18, but I would hesitate with 
backpatching.  With macOS, it's always hard to figure out whether these 
kinds of options work the same way going versions back.


All the versions for ld64 are in [1]. It seems this was introduced in 
ld64-224.1 [2] the first time. It was not there in ld64-136 [3]. Finally 
the man page has **exactly** the same wording in the latest version 
ld64-609 [4].


We could go further and compare the source, but I think it's safe to 
assume that this flag hasn't changed much and should not affect non-LTO 
builds. And for even older versions it would just not be supported, so 
configure would not use it.


Best,

Wolfgang

[1]: https://opensource.apple.com/source/ld64/
[2]: 
https://opensource.apple.com/source/ld64/ld64-224.1/doc/man/man1/ld.1.auto.html
[3]: 
https://opensource.apple.com/source/ld64/ld64-136/doc/man/man1/ld.1.auto.html
[4]: 
https://opensource.apple.com/source/ld64/ld64-609/doc/man/man1/ld.1.auto.html





Re: Build with LTO / -flto on macOS

2024-06-04 Thread Wolfgang Walther

Andres Freund:

Gah. Apples tendency to just break stuff that has worked across *nix-y
platforms for decades is pretty annoying. They could just have made
--export-dynamic an alias for --export_dynamic, but no, everyone needs a
special macos thingy in their build scripts.


Interestingly enough, my Linux ld supports -export_dynamic, too - but it 
isn't mentioned anywhere in the man pages.




Also, passing the LTO flag on Linux "just works" (clang, not GCC
necessarily).


It should just work on gcc, or at least has in the recent past.


Well it "works" in a sense that the build succeeds and check-world as 
well. But there are some symbols in all the client binaries that I know 
are unused (paths to .../include etc.), and which LLVM's LTO strips out 
happily - that are still in there after GCC's LTO.


GCC can remove them with -fdata-sections -ffunction-sections 
-fmerge-constants and -Wl,--gc-sections. But not with -flto. At least I 
didn't manage to.
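The section-GC effect described above can be reproduced with a small experiment (a sketch that assumes gcc and binutils' nm are on the PATH, and skips itself otherwise):

```python
import os
import shutil
import subprocess
import tempfile

SRC = "void unused_helper(void) {}\nint main(void) { return 0; }\n"

def keeps_unused_symbol(extra_flags):
    """Compile a tiny program and report whether nm still sees unused_helper."""
    with tempfile.TemporaryDirectory() as d:
        c_file = os.path.join(d, "demo.c")
        binary = os.path.join(d, "demo")
        with open(c_file, "w") as f:
            f.write(SRC)
        subprocess.run(["gcc", c_file, "-o", binary, *extra_flags], check=True)
        nm = subprocess.run(["nm", binary], capture_output=True, text=True)
        return "unused_helper" in nm.stdout

if shutil.which("gcc") and shutil.which("nm"):
    # A plain build keeps the unreferenced function; with
    # -ffunction-sections + --gc-sections the linker drops its section.
    print(keeps_unused_symbol([]))
    print(keeps_unused_symbol(
        ["-ffunction-sections", "-fdata-sections", "-Wl,--gc-sections"]))
```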




ISTM if we want to test for -export_dynamic like what you proposed, we should
do so only if --export-dynamic wasn't found. No need to incur the overhead on
!macos.


Makes sense! v2 attached.

I also attached a .backpatch to show what that would look like for v15 
and down.



Peter Eisentraut:
> With the native compiler tooling on macOS, it is not safe to assume
> anything, including that the man pages are accurate or that the
> documented options actually work correctly and don't break anything
> else.  Unless we have actual testing on all the supported macOS
> versions, I don't believe it.

Which macOS versions are "supported"?

I just set up a VM with macOS Mojave (2018) and tested both the .patch 
on HEAD as well as the .backpatch on REL_12_STABLE with -flto. Build 
passed, make check-world as well.


clang --version for Mojave:
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.5.0

clang --version for Sonoma (where I tested before):
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: x86_64-apple-darwin23.5.0

Since PostgreSQL 12 is from 2019 and Mojave from 2018, I think that's 
far enough back?



> Given that LTO apparently never worked on macOS, this is not a
> regression, so I wouldn't backpatch it.  I'm not objecting, but I don't
> want to touch it.

Fair enough! Hopefully my testing convinces more than the man pages ;)

Best,

Wolfgang

From 3ca5357bbdb9aae29a1785d5ca2179d6cca15cdd Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 2 Jun 2024 10:46:56 +0200
Subject: [PATCH v2] Make building with clang's LTO work on macOS

When building with -flto the backend binary must keep many otherwise
unused symbols to make them available to dynamically loaded modules /
extensions.  This has been done via -Wl,--export-dynamic on many
platforms for years.  This flag is not supported by Apple's clang,
though.  Here it's called -Wl,-export_dynamic instead.

Thus, make configure pick up on this variant of the flag as well.  Meson
has the logic to detect this flag built-in, but doesn't support this
variant either.  This needs to be raised upstream.

Without this fix, building with -flto fails with errors similar to [1]
and [2].  This happens for all currently live versions, including 17 and
HEAD.

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us
---
 configure| 41 +
 configure.ac |  4 
 2 files changed, 45 insertions(+)

diff --git a/configure b/configure
index 7b03db56a67..771dcfbdef9 100755
--- a/configure
+++ b/configure
@@ -19135,6 +19135,7 @@ fi
 # For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
 # (backend specific ldflags). One some platforms this will always fail (e.g.,
 # windows), but on others it depends on the choice of linker (e.g., solaris).
+# macos uses -export_dynamic instead.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE" >&5
 $as_echo_n "checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE... " >&6; }
 if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic+:} false; then :
@@ -19173,6 +19174,46 @@ if test x"$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic" = x"yes"; then
   LDFLAGS_EX_BE="${LDFLAGS_EX_BE} -Wl,--export-dynamic"
 fi
 
+if test x"$LDFLAGS_EX_BE" = x""; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE" >&5
+$as_echo_n "checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE... " >&6; }
+if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_L

Re: Build with LTO / -flto on macOS

2024-06-05 Thread Wolfgang Walther

Peter Eisentraut:

On 04.06.24 18:41, Tom Lane wrote:

Relevant to this: I wonder what we think the supported macOS versions
are, anyway.  AFAICS, the buildfarm only covers current (Sonoma)
and current-1 (Ventura) major versions, and only the latest minor
versions in those OS branches.


For other OS lines I think we are settling on supporting what the OS 
vendor supports.  So for macOS at the moment this would be current, 
current-1, and current-2, per 
.


So I tested both HEAD and v12 on current and current-5, both successful. 
That should cover current-1 and current-2, too. If you want me to test 
any other macOS versions in between, or any other PG versions, I can do that.


I would really like to upstream these kinds of patches and see them 
backpatched - otherwise we need to carry them around for up to 5 years 
in the distros. And in light of the discussion in [1], my goal is to 
reduce the number of patches carried to a minimum. Yes, those patches 
are simple enough - but the more patches you have, the less likely you 
are to spot a malicious patch in between.


Best,

Wolfgang

[1]: https://postgr.es/m/flat/ZgdCpFThi9ODcCsJ%40momjian.us




Re: Building with meson on NixOS/nixpkgs

2024-08-17 Thread Wolfgang Walther

Tristan Partin:

On Fri Aug 9, 2024 at 11:14 AM CDT, Andres Freund wrote:
[..]

commit a00fae9d43e5adabc56e64a4df6d332062666501
Author: Heikki Linnakangas 
Date:   2024-07-27 13:53:08 +0300

    Fallback to uuid for ossp-uuid with meson
[..]

I think this is a redundant change with

commit 2416fdb3ee30bdd2810408f93f14d47bff840fea
Author: Andres Freund 
Date:   2024-07-20 13:51:08 -0700

    meson: Add support for detecting ossp-uuid without pkg-config
[..]


I'm not sure I would call them redundant. It's cheaper (and better) to 
do a pkg-config lookup than it is to do the various checks in your 
patch. I think the two patches are complementary. Yours serves Windows 
plus anywhere else that doesn't have a pkg-config file, while Wolfgang's 
serves distros that install the pkg-config file under a different name.


Agreed.

There is also a small difference in output for meson: When uuid is 
queried via pkg-config, meson also detects the version, so I get this 
output:


  External libraries
[..]
uuid   : YES 1.6.2


Without pkg-config:

  External libraries
[..]
uuid   : YES

Best,

Wolfgang




Re: Building with meson on NixOS/nixpkgs

2024-08-17 Thread Wolfgang Walther

Tristan Partin:

On Fri Aug 9, 2024 at 11:14 AM CDT, Andres Freund wrote:

commit 4d8de281b5834c8f5e0be6ae21e884e69dffd4ce
Author: Heikki Linnakangas 
Date:   2024-07-27 13:53:11 +0300

    Fallback to clang in PATH with meson
[..]

I think this is a bad change unfortunately - this way clang and llvm 
version can mismatch. Yes, we've done it that way for autoconf, but back 
then LLVM broke compatibility far less often.


See the attached patch on how we could make this situation better.


Works great.

With the correct clang on path:

Program clang found: YES 18.1.8 18.1.8 
(/nix/store/mr1y1rxkx59dr2bci2akmw2zkbbpmc15-clang-wrapper-18.1.8/bin/clang)


With a mismatching version on path:

Program 
/nix/store/x4gwwwlw2ylv0d9vjmkx3dmlcb7gingd-llvm-18.1.8/bin/clang clang 
found: NO found 16.0.6 but need: '18.1.8' 
(/nix/store/r85xsa9z0s04n0y21xhrii47bh74g2a8-clang-wrapper-16.0.6/bin/clang)


Yes, the match is exact, also fails with a newer version:

Program 
/nix/store/x4gwwwlw2ylv0d9vjmkx3dmlcb7gingd-llvm-18.1.8/bin/clang clang 
found: NO found 19.1.0 but need: '18.1.8' 
(/nix/store/rjsfx6sxjpkgd4f9hl9apm0n8dk7jd9w-clang-wrapper-19.1.0-rc2/bin/clang)


+1 for this patch.

Best,

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-03-02 Thread Wolfgang Walther

Dean Rasheed:

That is also the main reason I preferred naming it "security_invoker" -
it is consistent with other databases and eases transition from such
systems.

[...]

For my part, I find myself more and more convinced that
"security_invoker" is the right name, because it matches the
terminology used for functions, and in other database systems. I think
the parallels between security invoker functions and security invoker
views are quite strong.

[...]

When we come to write the release notes for this feature, saying that
this version of PG now supports security invoker views is going to
mean a lot more to people who already use that feature in other
databases.

What are other people's opinions?


All those points in favor of security_invoker are very good indeed. The 
main objection was not the term invoker, though, but the implicit 
association it creates, as in "security_invoker=false behaves like 
security definer". But this is clearly wrong: the "security definer" 
semantics as used for functions or in other databases just don't apply 
as the default in PG.


I think renaming the reloption was a shortcut to avoid that association, 
while the best way to deal with it would be explicit documentation. 
Meanwhile, the patch has added a mention of CURRENT_USER, so that's a 
first step. Maybe an explicit mention that security_invoker=false is 
NOT the same as "security definer", together with an explanation of why, 
would already be enough?


Best

Wolfgang