Re: Regression tests fail with tzdata 2024b

2024-09-16 Thread Wolfgang Walther

Tom Lane:

I was wondering whether the timezone used by pg_regress could be made
configurable.


Yes, I understood that you were suggesting that.  My point is that
it wouldn't do you any good: you will still have to change any
regression test cases that depend on behavior PST8PDT has/had that
is different from America/Los_Angeles.  That being the case,
I don't see much value in making it configurable.


Just changing it back to PST8PDT wouldn't really help, as Tom pointed 
out. You'd still get different results depending on which tzdata version 
you are running with.


The core regression tests need to be run with a timezone that exercises 
special cases in the timezone handling code. But that might not be true 
for extensions - all they might want is output that is stable across 
major and minor versions of Postgres and across tzdata versions. It 
could be helpful to set pg_regress' timezone to UTC for that, for example?
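
For example, a quick sketch of how timezone-dependent output changes 
(the exact formatting also depends on DateStyle):

SET timezone = 'America/Los_Angeles';
SELECT '2000-01-01 00:00:00+00'::timestamptz;
-- 1999-12-31 16:00:00-08

SET timezone = 'UTC';
SELECT '2000-01-01 00:00:00+00'::timestamptz;
-- 2000-01-01 00:00:00+00

An extension that only wants stable output across versions would be 
served fine by UTC, which no tzdata release is going to change.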


Best,

Wolfgang




Re: Regression tests fail with tzdata 2024b

2024-09-15 Thread Wolfgang Walther

Tom Lane:

Also, as a real place to a greater extent
than "PST8PDT" is, it's more subject to historical revisionism when
somebody turns up evidence of local law having been different than
TZDB currently thinks.


I have now tried all versions of tzdata that we had in tree back to 
2018g; they all work fine with the same regression test output. 2018g 
was an arbitrary cutoff, I just didn't try any further back.


In the end, we don't need a default timezone that will never change. We 
just need one that hasn't changed across a reasonable number of past 
releases. Once America/Los_Angeles changes, we'd need to switch to a 
different zone - possibly one that wouldn't work today. Kind of a 
sliding window.
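
Checking what any given tzdata version thinks about a candidate zone is 
easy enough - a quick sketch:

SELECT abbrev, utc_offset, is_dst
FROM pg_timezone_names
WHERE name = 'America/Los_Angeles';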


One positive might be: With this timezone, we are more likely to see 
relevant changes mentioned in the upstream release notes.


Best,

Wolfgang




Regression tests fail with tzdata 2024b

2024-09-14 Thread Wolfgang Walther
 856143121.00
  Sat Feb 16 17:32:01 2097 PST|  100 | 1000.000 |  1.00 |   
2487022 |   4011903121.00
@@ -2304,7 +2304,7 @@
 SELECT * FROM TIMESTAMPTZ_TST ORDER BY a;
  a |   b
 ---+
- 1 | Wed Mar 12 13:58:48 1000 PST
+ 1 | Wed Mar 12 14:05:50 1000 LMT
  2 | Sun Mar 12 14:58:48 1 PDT
  3 | Sun Mar 12 14:58:48 10 PDT
  3 | Sun Mar 12 14:58:48 1 PDT
diff -U3 /build/source/src/test/regress/expected/horology.out 
/build/source/build/testrun/regress/regress/results/horology.out
--- /build/source/src/test/regress/expected/horology.out 1970-01-01 
00:00:01.000000000 +0000
+++ /build/source/build/testrun/regress/regress/results/horology.out 
2024-09-13 17:58:09.785778080 +0000
@@ -1033,12 +1033,12 @@
  Sat Feb 14 17:32:01 1998 PST
  Sun Feb 15 17:32:01 1998 PST
  Mon Feb 16 17:32:01 1998 PST
- Thu Feb 16 17:32:01 0096 PST BC
- Sun Feb 16 17:32:01 0098 PST
- Fri Feb 16 17:32:01 0598 PST
- Wed Feb 16 17:32:01 1098 PST
- Sun Feb 16 17:32:01 1698 PST
- Fri Feb 16 17:32:01 1798 PST
+ Thu Feb 16 17:32:01 0096 LMT BC
+ Sun Feb 16 17:32:01 0098 LMT
+ Fri Feb 16 17:32:01 0598 LMT
+ Wed Feb 16 17:32:01 1098 LMT
+ Sun Feb 16 17:32:01 1698 LMT
+ Fri Feb 16 17:32:01 1798 LMT
  Wed Feb 16 17:32:01 1898 PST
  Mon Feb 16 17:32:01 1998 PST
  Sun Feb 16 17:32:01 2098 PST
@@ -1104,12 +1104,12 @@
  Wed Feb 14 17:32:01 1996 PST
  Thu Feb 15 17:32:01 1996 PST
  Fri Feb 16 17:32:01 1996 PST
- Mon Feb 16 17:32:01 0098 PST BC
- Thu Feb 16 17:32:01 0096 PST
- Tue Feb 16 17:32:01 0596 PST
- Sun Feb 16 17:32:01 1096 PST
- Thu Feb 16 17:32:01 1696 PST
- Tue Feb 16 17:32:01 1796 PST
+ Mon Feb 16 17:32:01 0098 LMT BC
+ Thu Feb 16 17:32:01 0096 LMT
+ Tue Feb 16 17:32:01 0596 LMT
+ Sun Feb 16 17:32:01 1096 LMT
+ Thu Feb 16 17:32:01 1696 LMT
+ Tue Feb 16 17:32:01 1796 LMT
  Sun Feb 16 17:32:01 1896 PST
  Fri Feb 16 17:32:01 1996 PST
  Thu Feb 16 17:32:01 2096 PST
@@ -2388,7 +2388,7 @@
 SELECT '4714-11-24 BC'::date::timestamptz;
timestamptz   
 -
- Mon Nov 24 00:00:00 4714 PST BC
+ Mon Nov 24 00:00:00 4714 LMT BC
 (1 row)
 
 SET TimeZone = 'UTC-2';
@@ -2966,13 +2966,13 @@
 SELECT to_timestamp('0097/Feb/16 --> 08:14:30', 'YYYY/Mon/DD --> HH:MI:SS');
  to_timestamp 
 --
- Sat Feb 16 08:14:30 0097 PST
+ Sat Feb 16 08:14:30 0097 LMT
 (1 row)
 
 SELECT to_timestamp('97/2/16 8:14:30', 'FMYYYY/FMMM/FMDD FMHH:FMMI:FMSS');
  to_timestamp 
 --
- Sat Feb 16 08:14:30 0097 PST
+ Sat Feb 16 08:14:30 0097 LMT
 (1 row)
 
 SELECT to_timestamp('2011$03!18 23_38_15', 'YYYY-MM-DD HH24:MI:SS');
@@ -3009,7 +3009,7 @@
 SELECT to_timestamp('1,582nd VIII 21', 'Y,YYYth FMRM DD');
  to_timestamp 
 --
- Sat Aug 21 00:00:00 1582 PST
+ Sat Aug 21 00:00:00 1582 LMT
 (1 row)
 
 SELECT to_timestamp('15 "text between quote marks" 98 54 45',
@@ -3073,7 +3073,7 @@
 SELECT to_timestamp('1997 BC 11 16', 'YYYY BC MM DD');
   to_timestamp   
 -
- Tue Nov 16 00:00:00 1997 PST BC
+ Tue Nov 16 00:00:00 1997 LMT BC
 (1 row)
 
 SELECT to_timestamp('1997 A.D. 11 16', 'YYYY B.C. MM DD');
@@ -3085,7 +3085,7 @@
 SELECT to_timestamp('1997 B.C. 11 16', 'YYYY B.C. MM DD');
   to_timestamp   
 -
- Tue Nov 16 00:00:00 1997 PST BC
+ Tue Nov 16 00:00:00 1997 LMT BC
 (1 row)
 
 SELECT to_timestamp('9-1116', 'Y-MMDD');
@@ -3355,19 +3355,19 @@
 SELECT to_timestamp('44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
   to_timestamp   
 -
- Fri Feb 01 11:12:13 0044 PST BC
+ Fri Feb 01 11:12:13 0044 LMT BC
 (1 row)
 
 SELECT to_timestamp('-44-02-01 11:12:13','YYYY-MM-DD HH24:MI:SS');
   to_timestamp   
 -
- Fri Feb 01 11:12:13 0044 PST BC
+ Fri Feb 01 11:12:13 0044 LMT BC
 (1 row)
 
 SELECT to_timestamp('-44-02-01 11:12:13 BC','YYYY-MM-DD HH24:MI:SS BC');
  to_timestamp 
 --
- Mon Feb 01 11:12:13 0044 PST
+ Mon Feb 01 11:12:13 0044 LMT
 (1 row)
 
 --
From c5de73fbb86f7d600c90ba42ab9bd3d4eca34d1b Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 14 Sep 2024 14:42:38 +0200
Subject: [PATCH v1] Run regression tests with timezone America/Los_Angeles

Previously, the regression tests were running with "PST8PDT", which was
changed to link to America/Los_Angeles in tzdata 2024b.  Running the
tests --with-system-tzdata=2024b would therefore fail.

Run the regression tests with America/Los_Angeles to get consistent
output in both 2024a and 2024b.
---
 do

Re: Building with meson on NixOS/nixpkgs

2024-08-17 Thread Wolfgang Walther

Tristan Partin:

On Fri Aug 9, 2024 at 11:14 AM CDT, Andres Freund wrote:

commit 4d8de281b5834c8f5e0be6ae21e884e69dffd4ce
Author: Heikki Linnakangas 
Date:   2024-07-27 13:53:11 +0300

    Fallback to clang in PATH with meson
[..]

I think this is a bad change unfortunately - this way clang and llvm 
version
can mismatch. Yes, we've done it that way for autoconf, but back then 
LLVM

broke compatibility far less often.


See the attached patch on how we could make this situation better.


Works great.

With the correct clang on path:

Program clang found: YES 18.1.8 18.1.8 
(/nix/store/mr1y1rxkx59dr2bci2akmw2zkbbpmc15-clang-wrapper-18.1.8/bin/clang)


With a mismatching version on path:

Program 
/nix/store/x4gwwwlw2ylv0d9vjmkx3dmlcb7gingd-llvm-18.1.8/bin/clang clang 
found: NO found 16.0.6 but need: '18.1.8' 
(/nix/store/r85xsa9z0s04n0y21xhrii47bh74g2a8-clang-wrapper-16.0.6/bin/clang)


Yes, the match is exact; it also fails with a newer version:

Program 
/nix/store/x4gwwwlw2ylv0d9vjmkx3dmlcb7gingd-llvm-18.1.8/bin/clang clang 
found: NO found 19.1.0 but need: '18.1.8' 
(/nix/store/rjsfx6sxjpkgd4f9hl9apm0n8dk7jd9w-clang-wrapper-19.1.0-rc2/bin/clang)
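
For reference, meson's find_program can express this kind of version 
requirement directly - a sketch of how I understand the check, not 
necessarily the patch's exact code:

# look in llvm's bindir first, fall back to PATH, and require clang's
# version to match the llvm version exactly
clang = find_program(llvm_binpath / 'clang', 'clang',
                     required: true, version: '==' + llvm.version())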


+1 for this patch.

Best,

Wolfgang




Re: Building with meson on NixOS/nixpkgs

2024-08-17 Thread Wolfgang Walther

Tristan Partin:

On Fri Aug 9, 2024 at 11:14 AM CDT, Andres Freund wrote:
[..]

commit a00fae9d43e5adabc56e64a4df6d332062666501
Author: Heikki Linnakangas 
Date:   2024-07-27 13:53:08 +0300

    Fallback to uuid for ossp-uuid with meson
[..]

I think this is a redundant change with

commit 2416fdb3ee30bdd2810408f93f14d47bff840fea
Author: Andres Freund 
Date:   2024-07-20 13:51:08 -0700

    meson: Add support for detecting ossp-uuid without pkg-config
[..]


I'm not sure I would call them redundant. It's cheaper (and better) to 
do a pkg-config lookup than it is to do the various checks in your 
patch. I think the two patches are complementary. Yours serves Windows 
plus anywhere else that doesn't have a pkg-config file, while Wolfgang's 
serves distros that install the pkg-config file under a different name.


Agreed.

There is also a small difference in meson's output: When uuid is found 
via pkg-config, meson also detects the version, so I get this output:


  External libraries
[..]
uuid   : YES 1.6.2


Without pkg-config:

  External libraries
[..]
uuid   : YES

Best,

Wolfgang




Docs: Order of json aggregate functions

2024-06-19 Thread Wolfgang Walther
The order of json-related aggregate functions in the docs is currently 
like this:


[...]
json_agg
json_objectagg
json_object_agg
json_object_agg_strict
json_object_agg_unique
json_arrayagg
json_object_agg_unique_strict
max
min
range_agg
range_intersect_agg
json_agg_strict
[...]

json_arrayagg and json_agg_strict are out of place.

The attached patch puts them in the right spot. The order is the same down to v16.

Best,

Wolfgang

From ad857a824d893a3e421c6c577c1215f71c1ebfe3 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Wed, 19 Jun 2024 19:40:49 +0200
Subject: [PATCH v1] Fix order of json aggregate functions in docs

---
 doc/src/sgml/func.sgml | 96 +-
 1 file changed, 48 insertions(+), 48 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2609269610b..c3b342d832f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21790,6 +21790,54 @@ SELECT NULLIF(value, '(none)') ...
No
   
 
+  
+   
+
+ json_agg_strict
+
+json_agg_strict ( anyelement )
+json
+   
+   
+
+ jsonb_agg_strict
+
+jsonb_agg_strict ( anyelement )
+jsonb
+   
+   
+Collects all the input values, skipping nulls, into a JSON array.
+Values are converted to JSON as per to_json
+or to_jsonb.
+   
+   No
+  
+
+  
+   
+json_arrayagg
+json_arrayagg (
+ value_expression 
+ ORDER BY sort_expression 
+ { NULL | ABSENT } ON NULL 
+ RETURNING data_type  FORMAT JSON  ENCODING UTF8   )
+   
+   
+Behaves in the same way as json_array
+but as an aggregate function so it only takes one
+value_expression parameter.
+If ABSENT ON NULL is specified, any NULL
+values are omitted.
+If ORDER BY is specified, the elements will
+appear in the array in that order rather than in the input order.
+   
+   
+SELECT json_arrayagg(v) FROM (VALUES(2),(1)) t(v)
+[2, 1]
+   
+   No
+  
+
   

  json_objectagg
@@ -21900,31 +21948,6 @@ SELECT NULLIF(value, '(none)') ...
No
   
 
-  
-   
-json_arrayagg
-json_arrayagg (
- value_expression 
- ORDER BY sort_expression 
- { NULL | ABSENT } ON NULL 
- RETURNING data_type  FORMAT JSON  ENCODING UTF8   )
-   
-   
-Behaves in the same way as json_array
-but as an aggregate function so it only takes one
-value_expression parameter.
-If ABSENT ON NULL is specified, any NULL
-values are omitted.
-If ORDER BY is specified, the elements will
-appear in the array in that order rather than in the input order.
-   
-   
-SELECT json_arrayagg(v) FROM (VALUES(2),(1)) t(v)
-[2, 1]
-   
-   No
-  
-
   

 
@@ -22033,29 +22056,6 @@ SELECT NULLIF(value, '(none)') ...
No
   
 
-  
-   
-
- json_agg_strict
-
-json_agg_strict ( anyelement )
-json
-   
-   
-
- jsonb_agg_strict
-
-jsonb_agg_strict ( anyelement )
-jsonb
-   
-   
-Collects all the input values, skipping nulls, into a JSON array.
-Values are converted to JSON as per to_json
-or to_jsonb.
-   
-   No
-  
-
   

 
-- 
2.45.1



Re: Build with LTO / -flto on macOS

2024-06-05 Thread Wolfgang Walther

Peter Eisentraut:

On 04.06.24 18:41, Tom Lane wrote:

Relevant to this: I wonder what we think the supported macOS versions
are, anyway.  AFAICS, the buildfarm only covers current (Sonoma)
and current-1 (Ventura) major versions, and only the latest minor
versions in those OS branches.


For other OS lines I think we are settling on supporting what the OS 
vendor supports.  So for macOS at the moment this would be current, 
current-1, and current-2, per 
.


So I tested both HEAD and v12 on current and current-5, both 
successfully. That should cover current-1 and current-2, too. If you 
want me to test any other macOS versions in between, or any other PG 
versions, I can do that.


I would really like to upstream these kinds of patches and see them 
backpatched - otherwise we need to carry those patches around for up to 
5 years in the distros. And in light of the discussion in [1], my goal 
is to reduce the number of patches carried to a minimum. Yes, those 
patches are simple enough - but the more patches you have, the less 
likely you are to spot a malicious patch in between.


Best,

Wolfgang

[1]: https://postgr.es/m/flat/ZgdCpFThi9ODcCsJ%40momjian.us




Re: Build with LTO / -flto on macOS

2024-06-04 Thread Wolfgang Walther

Andres Freund:

Gah. Apple's tendency to just break stuff that has worked across *nix-y
platforms for decades is pretty annoying. They could just have made
--export-dynamic an alias for --export_dynamic, but no, everyone needs a
special macos thingy in their build scripts.


Interestingly enough, my Linux ld does support -export_dynamic, too - 
but it isn't mentioned anywhere in the man pages or such.




Also, passing the LTO flag on Linux "just works" (clang, not GCC
necessarily).


It should just work on gcc, or at least has in the recent past.


Well it "works" in a sense that the build succeeds and check-world as 
well. But there are some symbols in all the client binaries that I know 
are unused (paths to .../include etc.), and which LLVM's LTO strips out 
happily - that are still in there after GCC's LTO.


GCC can remove them with -fdata-sections -ffunction-sections 
-fmerge-constants and -Wl,--gc-sections. But not with -flto - at least 
I didn't manage to make it do so.
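
For reference, this is roughly how I checked for those leftovers - a 
sketch, the binary path is just an example:

# after an -flto build: configure-time paths like .../include should
# be gone from the client binaries, but with GCC's -flto they remain
strings src/bin/psql/psql | grep '/include'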




ISTM if we want to test for -export_dynamic like what you proposed, we should
do so only if --export-dynamic wasn't found. No need to incur the overhead on
!macos.


Makes sense! v2 attached.

I also attached a .backpatch to show what that would look like for v15 
and down.



Peter Eisentraut:
> With the native compiler tooling on macOS, it is not safe to assume
> anything, including that the man pages are accurate or that the
> documented options actually work correctly and don't break anything
> else.  Unless we have actual testing on all the supported macOS
> versions, I don't believe it.

Which macOS versions are "supported"?

I just set up a VM with macOS Mojave (2018) and tested both the .patch 
on HEAD as well as the .backpatch on REL_12_STABLE with -flto. The 
build passed, and so did make check-world.

clang --version for Mojave:
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.5.0

clang --version for Sonoma (where I tested before):
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: x86_64-apple-darwin23.5.0

Since PostgreSQL 12 is from 2019 and Mojave from 2018, I think that's 
far enough back?



> Given that LTO apparently never worked on macOS, this is not a
> regression, so I wouldn't backpatch it.  I'm not objecting, but I don't
> want to touch it.

Fair enough! Hopefully my testing convinces more than the man pages ;)

Best,

Wolfgang

From 3ca5357bbdb9aae29a1785d5ca2179d6cca15cdd Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 2 Jun 2024 10:46:56 +0200
Subject: [PATCH v2] Make building with clang's LTO work on macOS

When building with -flto the backend binary must keep many otherwise
unused symbols to make them available to dynamically loaded modules /
extensions.  This has been done via -Wl,--export-dynamic on many
platforms for years.  This flag is not supported by Apple's clang,
though.  Here it's called -Wl,-export_dynamic instead.

Thus, make configure pick up on this variant of the flag as well.  Meson
has the logic to detect this flag built-in, but doesn't support this
variant either.  This needs to be raised upstream.

Without this fix, building with -flto fails with errors similar to [1]
and [2].  This happens for all currently live versions, including 17 and
HEAD.

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us
---
 configure| 41 +
 configure.ac |  4 
 2 files changed, 45 insertions(+)

diff --git a/configure b/configure
index 7b03db56a67..771dcfbdef9 100755
--- a/configure
+++ b/configure
@@ -19135,6 +19135,7 @@ fi
 # For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
 # (backend specific ldflags). On some platforms this will always fail (e.g.,
 # windows), but on others it depends on the choice of linker (e.g., solaris).
+# macos uses -export_dynamic instead.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE" >&5
 $as_echo_n "checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE... " >&6; }
 if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic+:} false; then :
@@ -19173,6 +19174,46 @@ if test x"$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic" = x"yes"; then
   LDFLAGS_EX_BE="${LDFLAGS_EX_BE} -Wl,--export-dynamic"
 fi
 
+if test x"$LDFLAGS_EX_BE" = x""; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE" >&5
+$as_echo_n "checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE... " >&6; }
+if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_L

Re: Build with LTO / -flto on macOS

2024-06-03 Thread Wolfgang Walther

Peter Eisentraut:
It's probably worth clarifying that this option is needed on macOS only 
if LTO is also enabled.  For standard (non-LTO) builds, the 
export-dynamic behavior is already the default on macOS (otherwise 
nothing in PostgreSQL would work).


Right, the man page says this:

> Preserves all global symbols in main executables during LTO.  Without 
this option, Link Time Optimization is allowed to inline and remove 
global functions. This option is used when a main executable may load a 
plug-in which requires certain symbols from the main executable.


Peter:
I don't think we explicitly offer LTO builds as part of the make build 
system, so anyone trying this would do it sort of self-service, by 
passing additional options to configure or make.  In which case they 
might as well pass the -export_dynamic option along in the same way?


The challenge is that it defeats the purpose of LTO to pass this along 
to everything, e.g. via CFLAGS. The Makefiles set this in LDFLAGS_EX_BE 
only, so it only affects the backend binary. This is not at all obvious, 
and it took me quite a while to figure out why LTO silently didn't strip 
symbols from the other binaries. It does work to explicitly set 
LDFLAGS_EX_BE, though.
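
To make that explicit - a sketch, assuming a make build:

# works: only the backend keeps all its symbols exported
./configure CFLAGS="-flto"
make LDFLAGS_EX_BE="-Wl,-export_dynamic"

# defeats LTO's stripping, because it now applies to every binary
./configure CFLAGS="-flto" LDFLAGS="-Wl,-export_dynamic"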


Also, passing the LTO flag on Linux "just works" (clang, not GCC 
necessarily).


I don't mind addressing this in PG18, but I would hesitate with 
backpatching.  With macOS, it's always hard to figure out whether these 
kinds of options work the same way going versions back.


All the versions of ld64 are in [1]. It seems this flag was first 
introduced in ld64-224.1 [2]; it was not there in ld64-136 [3]. And the 
man page has **exactly** the same wording in the latest version, 
ld64-609 [4].


We could go further and compare the source, but I think it's safe to 
assume that this flag hasn't changed much and should not affect non-LTO 
builds. And for even older versions it would just not be supported, so 
configure would not use it.


Best,

Wolfgang

[1]: https://opensource.apple.com/source/ld64/
[2]: 
https://opensource.apple.com/source/ld64/ld64-224.1/doc/man/man1/ld.1.auto.html
[3]: 
https://opensource.apple.com/source/ld64/ld64-136/doc/man/man1/ld.1.auto.html
[4]: 
https://opensource.apple.com/source/ld64/ld64-609/doc/man/man1/ld.1.auto.html





Build with LTO / -flto on macOS

2024-06-03 Thread Wolfgang Walther
Building with clang and -flto on macOS currently fails with errors 
similar to [1]. This is because the --export-dynamic flag is called 
-export_dynamic [2] instead, and we have not been passing this variant 
to the linker so far.


Attached patch fixes that for configure/make.

CC: Tom, who hit the same in [3] and Andres who last touched 
--export-dynamic in 9db49fc5bfdc0126be03f4b8986013e59d93b91d.


Will also create an issue upstream for meson, because the logic is 
built-in there.


Would be great if this could be back-patched, since this is the same in 
all live versions.


Best,

Wolfgang

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: 
https://opensource.apple.com/source/ld64/ld64-609/doc/man/man1/ld.1.auto.html 
(grep for export_dynamic)

[3]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us

From 55175631d25af11971ec7b7f89d5bf4958ed3c8b Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 2 Jun 2024 10:46:56 +0200
Subject: [PATCH v1] Make building with clang's LTO work on macOS

When building with -flto the backend binary must keep many otherwise
unused symbols to make them available to dynamically loaded modules /
extensions.  This has been done via -Wl,--export-dynamic on many
platforms for years.  This flag is not supported by Apple's clang,
though.  Here it's called -Wl,-export_dynamic instead.

Thus, make configure pick up on this variant of the flag as well.  Meson
has the logic to detect this flag built-in, but doesn't support this
variant either.  This needs to be raised upstream.

Without this fix, building with -flto fails with errors similar to [1]
and [2].  This happens for all currently live versions, including 17 and
HEAD.

[1]: https://postgr.es/m/1581936537572-0.post%40n3.nabble.com
[2]: https://postgr.es/m/21800.1499270547%40sss.pgh.pa.us
---
 configure| 39 +++
 configure.ac |  2 ++
 2 files changed, 41 insertions(+)

diff --git a/configure b/configure
index 7b03db56a67..4c0a1383428 100755
--- a/configure
+++ b/configure
@@ -19135,6 +19135,7 @@ fi
 # For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
 # (backend specific ldflags). On some platforms this will always fail (e.g.,
 # windows), but on others it depends on the choice of linker (e.g., solaris).
+# macos uses -export_dynamic instead.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE" >&5
 $as_echo_n "checking whether $CC supports -Wl,--export-dynamic, for LDFLAGS_EX_BE... " >&6; }
 if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic+:} false; then :
@@ -19173,6 +19174,44 @@ if test x"$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl___export_dynamic" = x"yes"; then
   LDFLAGS_EX_BE="${LDFLAGS_EX_BE} -Wl,--export-dynamic"
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE" >&5
+$as_echo_n "checking whether $CC supports -Wl,-export_dynamic, for LDFLAGS_EX_BE... " >&6; }
+if ${pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_LDFLAGS=$LDFLAGS
+LDFLAGS="$pgac_save_LDFLAGS -Wl,-export_dynamic"
+if test "$cross_compiling" = yes; then :
+  pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic="assuming no"
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+extern void $link_test_func (); void (*fptr) () = $link_test_func;
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_run "$LINENO"; then :
+  pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic=yes
+else
+  pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+LDFLAGS="$pgac_save_LDFLAGS"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic" >&5
+$as_echo "$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic" >&6; }
+if test x"$pgac_cv_prog_cc_LDFLAGS_EX_BE__Wl__export_dynamic" = x"yes"; then
+  LDFLAGS_EX_BE="${LDFLAGS_EX_BE} -Wl,-export_dynamic"
+fi
+
 
 
 # Create compiler version string
diff --git a/configure.ac b/configure.ac
index 63e7be38472..4f5053fe592 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2423,7 +2423,9 @@ fi
 # For linkers that understand --export-dynamic, add that to the LDFLAGS_EX_BE
 # (backend specific ldflags). On some platforms this will always fail (e.g.,
 # windows), but on others it depends on the choice of linker (e.g., solaris).
+# macos uses -export_dynamic instead.
 PGAC_PROG_CC_LD_VARFLAGS_OPT(LDFLAGS_EX_BE, [-Wl,--export-dynamic], $link_test_func)
+PGAC_PROG_CC_LD_VARFLAGS_OPT(LDFLAGS_EX_BE, [-Wl,-export_dynamic], $link_test_func)
 AC_SUBST(LDFLAGS_EX_BE)
 
 # Create compiler version string
-- 
2.45.1



Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
When you write RAISE NOTICE '%', x, then the PLpgSQL parser rewrites it to 
RAISE NOTICE '%', SELECT $1


There is no parser just for expressions.


That's why my suggestion in [1] already made a difference between:

SELECT var;

and

SELECT col, var FROM table, var;

So the "only require variable-in-FROM if FROM is used" should extend to 
the SQL level.


That should be possible, right?

Best,

Wolfgang

[1]: 
https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de





Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
The session variables can be used in queries, but should also be usable 
in PL/pgSQL expressions, and then the mandatory usage in the FROM clause 
will cause a lot of problems and unreadable code like


DO $$
BEGIN
   RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y);

END
$$

This requirement makes variables unusable in PL


I already proposed earlier to only require listing them in FROM when 
there is actually a related FROM.


In this case you could still write:

RAISE NOTICE '% %', x, (SELECT a,b FROM y);

(assuming only x is a variable here)

Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:

But in this case you could make variables and tables share the same
namespace, i.e. forbid creating a variable with the same name as an
already existing table.


It helps, but not 100% - there is still the search path


I think we can ignore the search_path for this discussion. That's not a 
problem of variables vs. tables, but just a search-path-related problem. 
It is exactly the same thing right now, when you create a new table x(x) 
in a schema which happens to be earlier in your search path.
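
A minimal sketch of that existing behavior:

CREATE SCHEMA a;
CREATE SCHEMA b;
CREATE TABLE b.x (x int);
SET search_path = a, b;
SELECT x FROM x;  -- resolves to b.x
CREATE TABLE a.x (x int);
SELECT x FROM x;  -- now silently resolves to a.x instead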


The objection to the proposed approach for variables was that it would 
introduce *new* ambiguities, which Alvaro's suggestion avoids.


Best,

Wolfgang




Re: Schema variables - new implementation for Postgres 15

2024-05-31 Thread Wolfgang Walther

Pavel Stehule:
2. But my main argument is, it is not really safe - it solves Peter's 
use case, but if I use a reverse example of Peter's case, I still have a 
problem.


I can have a variable x, and then I can write a query like `SELECT x FROM x`;

but if somebody creates a table x(x int), then the query `SELECT x FROM x` 
will still be valid, but it surely means something else. So the requirement 
to use the variable inside the FROM clause doesn't help. It doesn't work.


But in this case you could make variables and tables share the same 
namespace, i.e. forbid creating a variable with the same name as an 
already existing table.
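
That is, roughly - assuming the CREATE VARIABLE syntax from the patch:

CREATE VARIABLE x AS int;
CREATE TABLE x (x int);  -- would now be rejected: name already in use
-- and vice versa, CREATE VARIABLE would be rejected if a table x
-- already existed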


Best,

Wolfgang




Re: Building with musl in CI and the build farm

2024-04-04 Thread Wolfgang Walther

Tom Lane:

You'd have to commit a failing patch first to break CI for all other
developers.


No, what I'm more worried about is some change in the environment
causing the build to start failing.  When that happens, it'd better
be an environment that many of us are familiar with and can test/fix.


The way I understand this works is that the images for the VMs in which 
those CI tasks run are not just dynamically updated - they are actually 
tested before they are used in CI. So the environment doesn't just 
change suddenly.


See e.g. [1] for a pull request to the repo containing those images, 
updating the Linux Debian image from bullseye to bookworm. This is 
exactly the image we're talking about. Before this image is used in 
postgres CI, it's tested and confirmed to actually work there. If one 
of the jobs were using musl, that would be tested as well. So this job 
would not just suddenly start failing for everybody.


I do see the "familiarity" argument for the SanityCheck task, but for a 
different reason: Even though it's unlikely for this job to fail for 
musl specific reasons - if you're not familiar with musl and can't 
easily test it locally, you might not be able to tell immediately 
whether it's musl specific or not. If musl was run in one of the later 
jobs, it's much different: You see all tests failing - alright, not musl 
specific. You see only the musl test failing - yeah, musl problem. This 
should give developers much more confidence looking at the results.


Best,

Wolfgang

[1]: https://github.com/anarazel/pg-vm-images/pull/91




Re: Building with musl in CI and the build farm

2024-04-04 Thread Wolfgang Walther

Tom Lane:

That is not the concern here.  What I think Peter is worried about,
and certainly what I'm worried about, is that a breakage in
SanityCheck comprehensively breaks all CI testing for all Postgres
developers.


You'd have to commit a failing patch first to break CI for all other 
developers. If you're only going to commit patches that pass those CI 
tasks, then this is not going to happen. Then it only becomes a question 
of how much feedback *you* get from a single CI run of your own patch.



To be blunt, I do not think we need to test musl in the CI pipeline.
I see it as one of the niche platforms that the buildfarm exists
to test.


I don't really have an opinion on this. I'm fine with having musl in 
the buildfarm only. I don't expect the core build itself to fail with 
musl anyway - this has been working fine for years. Andres asked for it 
to be added to CI, so maybe he sees more value on top of just "building 
with musl"?


Best,

Wolfgang




Re: Building with musl in CI and the build farm

2024-04-04 Thread Wolfgang Walther

Peter Eisentraut:

On 31.03.24 15:34, walt...@technowledgy.de wrote:
I'd rather adapt one of the existing tasks, to avoid increasing CI 
costs unduly.


I looked into this and I think the only task that could be changed is 
the SanityCheck.


I think SanityCheck should run a simple, "average" environment, like the 
current Debian one.  Otherwise, niche problems with musl or multi-arch 
or whatever will throw off the entire build pipeline.


All the errors/problems I have seen so far while setting up the 
buildfarm animal on Alpine Linux have been way beyond what SanityCheck 
does. Problems only appeared in the test suites, of which SanityCheck 
only runs *very* basic ones. I don't have much experience with the 
"cross" setup that "musl on debian" essentially is, though.


All those things are certainly out of scope for CI - they are tested in 
the build farm instead.


I do agree: SanityCheck doesn't feel like the right place to put this. 
But on the other hand: if it really fails to *build* with musl, then it 
shouldn't make a difference whether you are notified about that 
immediately or later in the CI pipeline. And it certainly needs the 
fewest additional resources to put it there.


I'm not sure what Andres meant by "adapting one of the existing tasks". 
It could fit as another step in the "Linux - Debian Bullseye - Autoconf" 
task, too - a bit similar to how the meson task builds for both 32 and 
64 bit. This would still not be an entirely new task like I proposed 
initially (to run in Docker).


Best,

Wolfgang




Building with musl in CI and the build farm

2024-03-26 Thread Wolfgang Walther
The need to do $subject came up in [1]. Moving this to a separate 
discussion on -hackers, because there are more issues to solve than just 
the LD_LIBRARY_PATH problem.


Andres Freund:

FWIW, except for one small issue, building postgres against musl works on
debian and the tests pass if I install first.


The small problem mentioned above is that on debian linux/fs.h isn't available
when building with musl, which in turn causes src/bin/pg_upgrade/file.c to
fail to compile.  I assume that's not the case on a "fully musl" distro?


Correct, I have not seen this before on Alpine.

Here is my progress setting up a buildfarm animal to run on Alpine Linux 
and the issues I found, so far:


The animal runs in a docker container via GitHub Actions in [2]. Right 
now it's still running with --test, until I get the credentials to 
activate it.


I tried to enable everything (except systemd, because Alpine doesn't 
have it) and run all tests. The LDAP tests are failing right now, but 
that is likely something that I need to fix in the Dockerfile - it's 
failing to start the slapd, IIRC. There are other issues, though - all 
of them have open pull requests in that repo [3].


I also had to skip the recovery check. Andrew mentioned that he had to 
do that, too, when he was still running his animal on Alpine. Not sure 
what this is about, yet.


Building --with-icu fails two tests. One of them (001_initdb) is fixed 
by having the "locale" command in your PATH, which is not the case on 
Alpine by default. I assume this will not break on your debian/musl 
build, Andres - but it will also probably not return any sane values, 
because it will run glibc's locale command.
I haven't looked into that in detail, yet, but I think the other test 
(icu/010_database) fails because it expects that setlocale(LC_COLLATE, 
) throws an error. I think it doesn't do that on musl, 
because LC_COLLATE is not implemented.
Those failing tests are not "just failing", but probably mean that we 
need to do something about how we deal with locale/setlocale on musl.


The last failure is about building --with-nls. This fails with something 
like:


ld: src/port/strerror.c:72:(.text+0x2d8): undefined reference to 
`libintl_gettext'


Of course, gettext-dev is installed, libintl.so is available in /usr/lib, 
and it also contains the symbol. So I'm not sure what's happening here.
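
A sketch of that check (the exact library file name may differ):

nm -D /usr/lib/libintl.so | grep libintl_gettext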


Andres, did you build --with-icu and/or --with-nls on debian/musl? Did 
you run the recovery tests?


Best,

Wolfgang

[1]: 
https://postgr.es/m/fddd1cd6-dc16-40a2-9eb5-d7fef2101488%40technowledgy.de
[2]: 
https://github.com/technowledgy/postgresql-buildfarm-alpine/actions/workflows/run.yaml

[3]: https://github.com/technowledgy/postgresql-buildfarm-alpine/pulls




Re: Building with meson on NixOS/nixpkgs

2024-03-21 Thread Wolfgang Walther

Nazir Bilal Yavuz:

0001 & 0002: Adding code comments to explain why they have fallback
could be nice.
0003: Looks good to me.


Added some comments in the attached.

Best,

Wolfgang

From 2d271aafd96a0ea21710a06ac5236e47217c36d1 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 17:18:38 +0100
Subject: [PATCH v2 1/3] Fallback to uuid for ossp-uuid with meson

The upstream name for the ossp-uuid package / pkg-config file is "uuid". Many
distributions change this to be "ossp-uuid" to not conflict with e2fsprogs.

This lookup fails on distributions which don't change this name, for example
NixOS / nixpkgs. Both "ossp-uuid" and "uuid" are also checked in configure.ac.
---
 meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c8fdfeb0ec3..942c79c8be3 100644
--- a/meson.build
+++ b/meson.build
@@ -1346,7 +1346,8 @@ if uuidopt != 'none'
 uuidfunc = 'uuid_to_string'
 uuidheader = 'uuid.h'
   elif uuidopt == 'ossp'
-uuid = dependency('ossp-uuid', required: true)
+# upstream is called "uuid", but many distros change this to "ossp-uuid"
+uuid = dependency('ossp-uuid', 'uuid', required: true)
 uuidfunc = 'uuid_export'
 uuidheader = 'uuid.h'
   else
-- 
2.44.0

From f150a8dcb92b08eab40b5dfec130a18f297c709f Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 22:06:25 +0100
Subject: [PATCH v2 2/3] Fallback to clang in PATH with meson

Some distributions put clang into a different path than the llvm binary path.

For example, this is the case on NixOS / nixpkgs, which failed to find clang
with meson before this patch.
---
 meson.build | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 942c79c8be3..f9e96b01cfa 100644
--- a/meson.build
+++ b/meson.build
@@ -759,7 +759,10 @@ if add_languages('cpp', required: llvmopt, native: false)
 llvm_binpath = llvm.get_variable(configtool: 'bindir')
 
 ccache = find_program('ccache', native: true, required: false)
-clang = find_program(llvm_binpath / 'clang', required: true)
+
+# Some distros put LLVM and clang in different paths, so fallback to
+# find via PATH, too.
+clang = find_program(llvm_binpath / 'clang', 'clang', required: true)
   endif
 elif llvmopt.auto()
   message('llvm requires a C++ compiler')
-- 
2.44.0

From fddee56b5e27a3b5e4c406e8caa2d230b49eb447 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Mon, 11 Mar 2024 19:54:41 +0100
Subject: [PATCH v2 3/3] Support absolute bindir/libdir in regression tests
 with meson

Passing an absolute bindir/libdir will install the binaries and libraries to
/tmp_install/ and /tmp_install/ respectively.

This path is correctly passed to the regression test suite via configure/make,
but not via meson, yet. This is because the "/" operator in the following expression
throws away the whole left side when the right side is an absolute path:

  test_install_location / get_option('libdir')

This was already correctly handled for dir_prefix, which is likely absolute as well.
This patch handles both bindir and libdir in the same way - prefixing absolute paths
with the tmp_install path correctly.
---
 meson.build | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index f9e96b01cfa..144c443b5e4 100644
--- a/meson.build
+++ b/meson.build
@@ -3048,15 +3048,17 @@ test_install_destdir = meson.build_root() / 'tmp_install/'
 if build_system != 'windows'
   # On unixoid systems this is trivial, we just prepend the destdir
   assert(dir_prefix.startswith('/')) # enforced by meson
-  test_install_location = '@0@@1@'.format(test_install_destdir, dir_prefix)
+  temp_install_bindir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_bin)
+  temp_install_libdir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_lib)
 else
   # drives, drive-relative paths, etc make this complicated on windows, call
   # into a copy of meson's logic for it
   command = [
 python, '-c',
 'import sys; from pathlib import PurePath; d1=sys.argv[1]; d2=sys.argv[2]; print(str(PurePath(d1, *PurePath(d2).parts[1:])))',
-test_install_destdir, dir_prefix]
-  test_install_location = run_command(command, check: true).stdout().strip()
+test_install_destdir]
+  temp_install_bindir = run_command(command, dir_prefix / dir_bin, check: true).stdout().strip()
+  temp_install_libdir = run_command(command, dir_prefix / dir_lib, check: true).stdout().strip()
 endif
 
 meson_install_args = meson_args + ['install'] + {
testport = 40000
 
 tes

Building with meson on NixOS/nixpkgs

2024-03-16 Thread Wolfgang Walther
To build on NixOS/nixpkgs I came up with a few small patches to 
meson.build. All of this works fine with Autoconf/Make already.

From 24ae72b9b0adc578c6729eff59c9038e6b4ac517 Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 17:18:38 +0100
Subject: [PATCH 1/3] Fallback to uuid for ossp-uuid with meson

The upstream name for the ossp-uuid package / pkg-config file is "uuid". Many
distributions change this to be "ossp-uuid" to not conflict with e2fsprogs.

This lookup fails on distributions which don't change this name, for example
NixOS / nixpkgs. Both "ossp-uuid" and "uuid" are also checked in configure.ac.
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c8fdfeb0ec3..d9ad1ce902f 100644
--- a/meson.build
+++ b/meson.build
@@ -1346,7 +1346,7 @@ if uuidopt != 'none'
 uuidfunc = 'uuid_to_string'
 uuidheader = 'uuid.h'
   elif uuidopt == 'ossp'
-uuid = dependency('ossp-uuid', required: true)
+uuid = dependency('ossp-uuid', 'uuid', required: true)
 uuidfunc = 'uuid_export'
 uuidheader = 'uuid.h'
   else
-- 
2.44.0

From 56e0abbcc3b950b6e93eddc6ede453ce529423ea Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sat, 2 Mar 2024 22:06:25 +0100
Subject: [PATCH 2/3] Fallback to clang in PATH with meson

Some distributions put clang into a different path than the llvm binary path.

For example, this is the case on NixOS / nixpkgs, which failed to find clang
with meson before this patch.
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index d9ad1ce902f..7c64fb04ea5 100644
--- a/meson.build
+++ b/meson.build
@@ -759,7 +759,7 @@ if add_languages('cpp', required: llvmopt, native: false)
 llvm_binpath = llvm.get_variable(configtool: 'bindir')
 
 ccache = find_program('ccache', native: true, required: false)
-clang = find_program(llvm_binpath / 'clang', required: true)
+clang = find_program(llvm_binpath / 'clang', 'clang', required: true)
   endif
 elif llvmopt.auto()
   message('llvm requires a C++ compiler')
-- 
2.44.0

From 62f10689d843227fca6d54e86462d0be5c4f434f Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Mon, 11 Mar 2024 19:54:41 +0100
Subject: [PATCH 3/3] Support absolute bindir/libdir in regression tests with
 meson

Passing an absolute bindir/libdir will install the binaries and libraries to
/tmp_install/ and /tmp_install/ respectively.

This path is correctly passed to the regression test suite via configure/make,
but not via meson, yet. This is because the "/" operator in the following expression
throws away the whole left side when the right side is an absolute path:

  test_install_location / get_option('libdir')

This was already correctly handled for dir_prefix, which is likely absolute as well.
This patch handles both bindir and libdir in the same way - prefixing absolute paths
with the tmp_install path correctly.
---
 meson.build | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 7c64fb04ea5..e8438209058 100644
--- a/meson.build
+++ b/meson.build
@@ -3044,15 +3044,17 @@ test_install_destdir = meson.build_root() / 'tmp_install/'
 if build_system != 'windows'
   # On unixoid systems this is trivial, we just prepend the destdir
   assert(dir_prefix.startswith('/')) # enforced by meson
-  test_install_location = '@0@@1@'.format(test_install_destdir, dir_prefix)
+  temp_install_bindir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_bin)
+  temp_install_libdir = '@0@@1@'.format(test_install_destdir, dir_prefix / dir_lib)
 else
   # drives, drive-relative paths, etc make this complicated on windows, call
   # into a copy of meson's logic for it
   command = [
 python, '-c',
 'import sys; from pathlib import PurePath; d1=sys.argv[1]; d2=sys.argv[2]; print(str(PurePath(d1, *PurePath(d2).parts[1:])))',
-test_install_destdir, dir_prefix]
-  test_install_location = run_command(command, check: true).stdout().strip()
+test_install_destdir]
+  temp_install_bindir = run_command(command, dir_prefix / dir_bin, check: true).stdout().strip()
+  temp_install_libdir = run_command(command, dir_prefix / dir_lib, check: true).stdout().strip()
 endif
 
 meson_install_args = meson_args + ['install'] + {
@@ -3089,7 +3091,6 @@ testport = 4
 
 test_env = environment()
 
-temp_install_bindir = test_install_location / get_option('bindir')
 test_initdb_template = meson.build_root() / 'tmp_install' / 'initdb-template'
 test_env.set('PG_REGRESS', pg_regress.full_path())
test_env.set('

Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

Scratch my previous suggestion. A new, less fuzzy definition would be:
Ownership is not a privilege itself and as such not inheritable.
[...]

If I'm understanding correctly, this would amount to a major
redefinition of what it means to inherit privileges, and I think the
chances of such a change being accepted are approximately zero.
Inheriting privileges needs to keep meaning what it means now, namely,
you inherit all the rights of the granted role.


No. Inheriting stays the same, it's just WITH SET that's different from 
what it is "now".



I don't. And even if I did think it were easy to explain, I don't
think it would be a good idea. One of my first patches to PostgreSQL
added a grantable TRUNCATE privilege to tables. I think that, under
your proposed definitions, the addition of this privilege would have
had the result that a role grant would cease to allow the recipient to
truncate tables owned by the granted role. There is currently a
proposal on the table to make VACUUM and ANALYZE grantable permissions
on tables, which would have the same issue. I think that if I made it
so that adding such privileges resulted in role inheritance not
working for those operations any more, people would come after me with
pitchforks. And I wouldn't blame them: that sounds terrible.


No, there is a misunderstanding. In my proposal, when you do WITH SET 
TRUE everything stays exactly the same as it is right now.


I'm just saying WITH SET FALSE should take away more of the things you 
can do (all the ownership things) to a point where it's safe to GRANT .. 
WITH INHERIT TRUE, SET FALSE and still be useful for pre-defined or 
privilege-container roles.


Could be discussed in the WITH SET thread, but it's a natural extension 
of the categories (1) and (2) in your original email. It's all about 
ownership.


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

I don't think we're going to be very happy if we redefine inheriting
the privileges of another role to mean inheriting only some of them.
That seems pretty counterintuitive to me. I also think that this
particular definition is pretty fuzzy.


Scratch my previous suggestion. A new, less fuzzy definition would be: 
Ownership is not a privilege itself and as such not inheritable.


When role A is granted to role B, two things happen:
1. Role B now has the right to use the GRANTed privileges of role A.
2. Role B now has the right to become role A via SET ROLE.

WITH SET controls whether point 2 is the case or not.

WITH INHERIT controls whether role B actually exercises its right to 
use those privileges ("inheritance") **and** whether the SET ROLE is 
done implicitly for anything that requires ownership - but of course 
only WITH SET TRUE.


This is the same way that the role attributes INHERIT / NOINHERIT behave.
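
In SQL terms, with the syntax from the WITH SET thread - a sketch:

GRANT a TO b WITH INHERIT TRUE, SET FALSE;
-- point 1: b can use a's granted privileges right away
-- point 2 withheld: this fails
SET ROLE a;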


Your previous proposal was to make the SET attribute of a GRANT
control not only the ability to SET ROLE to the target role but also
the ability to create objects owned by that role and/or transfer
objects to that role. I think some people might find that behavior a
little bit surprising - certainly, it goes beyond what the name SET
implies - but it is at least simple enough to explain in one sentence,
and the consequences don't seem too difficult to reason about.


This would be included in the above.


Here, though, it doesn't really seem simple enough to explain in one
sentence, nor does it seem easy to reason about.


I think the "ownership is not inheritable" idea is easy to explain.


There are many operations which are permitted or declined just using
an owner-check. One example is commenting on an object. That sure
sounds like it would fit within your proposed "DDL privileges
implicitly given through ownership" category, but it doesn't really
present any security hazard, so I do not think there is a good reason
to restrict that from a user who has INHERIT TRUE, SET FALSE. Another
is renaming an object, which is a little more murky. You can't
directly usurp someone's privileges by renaming an object that they
own, but you could potentially rename an object out of the way and
replace it with one that you own and thus induce a user to do
something dangerous. I don't really want to make even small exceptions
to the idea that inheriting a role's privileges means inheriting all
of them, and I especially don't want to make large exceptions, or
exceptions that involve judgement calls about the relative degree of
risk of each possible operation.


I would not make this about security risks only. We didn't distinguish 
between privileges and ownership that much before, because we didn't 
have WITH INHERIT or WITH SET. Now that we have both, we could do so.


The ideas of "inherited GRANTs" and "a shortcut to avoid SET ROLE to do 
owner-things" should be better to explain.


No judgement required.

All of this is to find a way to make WITH INHERIT TRUE, SET FALSE a 
"real", risk-free thing - and not just some syntactic sugar. And if that 
comes with the inability to COMMENT ON TABLE 
owned_by_pg_read_all_settings... fine. No need for that at all.


However, it would come with the inability to do SELECT * FROM 
owned_by_pg_read_all_settings, **unless** explicitly GRANTed to the 
owner, too. This might feel strange at first, but should not be a 
problem either. WITH INHERIT TRUE, SET FALSE is designed for built-in 
roles or other container roles that group a set of privileges. Those 
roles should not have objects they own anyway. And if they still do, 
denying access to those objects unless explicitly granted is the safe way.


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

This shows that if rhaas (or whoever) performs DML on a table owned by
pg_read_all_settings, he might trigger arbitrary code written by alice
to run under his own user ID. Now, that hazard would exist anyway for
tables owned by alice, but now it also exists for any tables owned by
pg_read_all_settings.


This hazard exists for all tables that alice has been granted the 
TRIGGER privilege on. While we prevent alice from creating tables that 
are owned by pg_read_all_settings, we do not prevent inheriting the 
TRIGGER privilege.



I'm slightly skeptical of that conclusion because the whole thing just
feels a bit flimsy. Like, the whole idea that you can compromise your
account by inserting a row into somebody else's table feels a little
nuts to me. Triggers and row-level security policies make it easy to
do things that look safe and are actually very dangerous. I think
anyone would reasonably expect that calling a function owned by some
other user might be risky, because who knows what that function might
do, but it seems less obvious that accessing a table could execute
arbitrary code, yet it can. And it is even less obvious that creating
a table owned by one role might give some other role who inherits that
user's privileges to booby-trap that table in a way that might fool a
third user into doing something unsafe. But I have no idea what we
could reasonably do to improve the situation.


Right. This will always be the case when giving out the TRIGGER 
privilege on one of your tables to somebody else.


There are two kinds of TRIGGER privileges: an explicitly GRANTed 
privilege and an implicit privilege that is given to the table owner.


I think, when WITH INHERIT TRUE, SET FALSE is set, we should:
- Inherit all explicitly granted privileges
- Not inherit any DDL privileges implicitly given through ownership: 
CREATE, REFERENCES, TRIGGER.
- Inherit all other privileges implicitly given through ownership (DML + 
others)
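
A sketch of what that would mean in practice (the table and trigger 
function are made up):

GRANT pg_read_all_settings TO alice WITH INHERIT TRUE, SET FALSE;
-- explicitly granted privileges and DML through ownership: inherited
-- implicit DDL through ownership: not inherited, so this would now be
-- rejected for alice:
CREATE TRIGGER t BEFORE INSERT ON owned_by_pg_read_all_settings
  FOR EACH ROW EXECUTE FUNCTION f();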


Those implicit DDL privileges should be considered part of WITH SET 
TRUE. When you can't do SET ROLE x, then you can't act as the owner of 
any object owned by x.


Or to put it the other way around: Only allow implicit ownership 
privileges to be executed when the CURRENT_USER is the owner. But 
provide a shortcut, when you have the WITH SET TRUE option on a role, so 
that you don't need to do SET ROLE + CREATE TRIGGER, but can just do 
CREATE TRIGGER instead. This is similar to the mental model of 
"requesting and accepting a transfer of ownership" with an implicit SET 
ROLE built-in, that I used before.


Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread Wolfgang Walther

James Coleman:

As I was reading through the email chain I had this thought: could you
get the same benefit (or 90% of it anyway) by instead allowing the
creation of a uniqueness constraint that contains more columns than
the index backing it? So long as the index backing it still guaranteed
the uniqueness on a subset of columns that would seem to be safe.

Tom notes the additional columns being nullable is something to think
about. But if we go the route of allowing unique constraints with
backing indexes having a subset of columns from the constraint I don't
think the nullability is an issue since it's already the case that a
unique constraint can be declared on columns that are nullable. Indeed
it's also the case that we already support a foreign key constraint
backed by a unique constraint including nullable columns.

Because such an approach would, I believe, avoid changing the foreign
key code (or semantics) at all, I believe that would address Tom's
concerns about information_schema and fuzziness of semantics.



Could we create this additional unique constraint implicitly, when using 
FOREIGN KEY ... REFERENCES on a superset of unique columns? This would 
make it easier to use, but still give proper information_schema output.
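
A sketch, using the members/tenant example from upthread:

-- today, the FK needs the redundant unique constraint spelled out:
CREATE TABLE users (
  id     int PRIMARY KEY,
  tenant int NOT NULL,
  UNIQUE (id, tenant)  -- extra index, only there to back the FK
);
CREATE TABLE members (
  user_id int,
  tenant  int,
  FOREIGN KEY (user_id, tenant) REFERENCES users (id, tenant)
);

-- with the implicit variant, the UNIQUE (id, tenant) constraint would
-- be created automatically by the FOREIGN KEY clause, backed by the
-- existing primary key index on (id)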


Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread Wolfgang Walther

James Coleman:

So the broader point I'm trying to make is that, as I understand it,
indexes backing foreign key constraints is an implementation detail.
The SQL standard details the behavior of foreign key constraints
regardless of implementation details like a backing index. That means
that the behavior of two column foreign key constraints is defined in
a single way whether or not there's a backing index at all or whether
such a backing index, if present, contains one or two columns.

I understand that for the use case you're describing this isn't the
absolute most efficient way to implement the desired data semantics.
But it would be incredibly confusing (and, I think, a violation of the
SQL standard) to have one foreign key constraint work in a different
way from another such constraint when both are indistinguishable at
the constraint level (the backing index isn't an attribute of the
constraint; it's merely an implementation detail).


Ah, thanks, I understand better now.

The two would only be indistinguishable at the constraint level if 
$subject were implemented by allowing the creation of unique constraints 
on a superset of unique columns, backed by a different index (the 
suggestion we both made independently). But if it were possible to 
reference a superset of unique columns where only a subset of the 
referenced columns has a unique constraint (the idea originally 
introduced in this thread), then there would be a difference, right?


That is, assuming it's only the backing index that is not part of the 
SQL standard - and not also the fact that a foreign key should reference 
a primary key or unique constraint?


Anyway, I can see very well how that would be quite confusing overall. 
It would probably be wiser to allow something roughly like this (if at 
all, of course):


CREATE TABLE bar (
  b INT PRIMARY KEY,
  f INT,
  ftype foo_type GENERATED ALWAYS AS REFERENCE TO foo.type,
  FOREIGN KEY (f, ftype) REFERENCES foo (f, type)
);

It likely wouldn't work exactly like that, but given a foreign key to 
foo, the GENERATED clause could be used to fetch the value through the 
same triggers that form that FK, for efficiency. My main point for now 
is: With a much more explicit syntax anywhere near that, this would 
certainly be an entirely different feature than $subject **and** it 
would be possible to implement on top of $subject. If at all.


So no need for me to distract this thread from $subject anymore. I think 
the idea of allowing the creation of unique constraints on a superset of 
the columns of an already existing unique index is a good one, so let's 
discuss this further.


Best

Wolfgang




Re: Add ON CONFLICT DO RETURN clause

2022-09-25 Thread Wolfgang Walther

Peter Geoghegan:

On Sun, Sep 25, 2022 at 8:55 AM Wolfgang Walther
 wrote:

The attached patch adds a DO RETURN clause to be able to do this:

INSERT INTO x (id) VALUES (1)
ON CONFLICT DO RETURN
RETURNING created_at;

Much simpler. This will either insert or do nothing - but in both cases
return a row.


How can you tell which it was, though?


I guess I can't reliably. But isn't that the same in the ON UPDATE case?

In the use cases I had so far, I didn't need to know.
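
As an aside: with DO UPDATE there is at least the well-known trick of 
inspecting the xmax system column in RETURNING - a freshly inserted row 
reports xmax = 0, an updated one does not. That relies on implementation 
details rather than documented behavior, so it's a heuristic at best:

INSERT INTO x (id) VALUES (1)
  ON CONFLICT (id) DO UPDATE
  SET id = EXCLUDED.id
  RETURNING created_at, (xmax = 0) AS inserted;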


I don't see why this statement should ever perform steps for any row
that are equivalent to DO NOTHING processing -- it should at least
lock each and every affected row, if only to conclusively determine
that there really must be a conflict.

In general ON CONFLICT DO UPDATE allows the user to add a WHERE clause
to back out of updating a row based on an arbitrary predicate. DO
NOTHING has no such WHERE clause. So DO NOTHING quite literally does
nothing for any rows that had conflicts, unlike DO UPDATE, which will
at the very least lock the row (with or without an explicit WHERE
clause).

The READ COMMITTED behavior for DO NOTHING is a bit iffy, even
compared to DO UPDATE, but the advantages in bulk loading scenarios
can be decisive. Or at least they were before we had MERGE.


Agreed - it needs to lock the row. I don't think I fully understood what 
"nothing" in DO NOTHING extended to.


I guess I want DO RETURN to behave more like a DO SELECT, so with the 
same semantics as selecting the row?


Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-25 Thread Wolfgang Walther

James Coleman:

If we have a declared constraint on x,y where x is unique based on an
index including on x I do not think we should have that fk constraint
work differently than a constraint on x,y where there is a unique
index on x,y. That would seem to be incredibly confusing behavior
(even if it would be useful for some specific use case).


I don't think it's behaving differently from how it does now. See below. 
But I can see how that could be confusing. Maybe it's just about 
describing the feature in a better way than I did so far. Or maybe it 
needs a different syntax.


Anyway, I don't think it's just a specific use case. In every use case I 
had for $subject so far, the immediate next step was to write some 
triggers to fetch those derived values from the referenced table.


Ultimately it's a question of efficiency: We can achieve the same thing 
in two ways today:
- We can either **not** add the additional column (members.tenant, 
bar.ftype in my examples) to the referencing table at all, and add 
constraint triggers that do all those checks instead. This adds 
complexity for writing the triggers, leads to more complicated RLS 
policies etc., and is potentially slower when executing those more 
complicated queries.
- Or we can add the additional column, but also add an additional unique 
index on the referenced table, and then make it part of the FK. This 
removes some of the constraint triggers, makes RLS policies simpler, 
and likely makes queries faster to execute. It comes at the cost of 
additional storage, though - and this is something that $subject tries 
to address.


Still, even when $subject is allowed, in practice we need some of the 
triggers to fetch those dependent values. Considering that the current 
FK triggers already do the same kind of queries at the same times, it'd 
be more efficient to have those FK queries fetch those dependent values.



But this could also be a CHECK constraint to allow FKs only to a subset
of rows in the target table:


Are you suggesting a check constraint that queries another table?


No. I was talking about the CHECK constraint in my example in the next 
paragraph of that mail. The CHECK constraint on bar.ftype is a regular 
CHECK constraint, but because of how ftype is updated automatically, it 
effectively behaves like some kind of additional constraint on the FK 
itself.



This "derive the value automatically" is not what foreign key
constraints do right now at all, right? And in fact it's contradictory
to existing behavior, no?


I don't think it's contradictory. Maybe a better way to put my idea is this:

For a foreign key to a superset of unique columns, the already-unique 
columns should behave according to the specified ON UPDATE clause. 
However, the extra columns should always behave as if they were ON UPDATE 
CASCADE. And additionally, they should behave similarly to something like 
ON INSERT CASCADE. Although that INSERT is about the referencing table, 
not the referenced table, so the analogy isn't 100%.
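
To illustrate the rule with the foo/bar ftype example from my other 
mail, assuming FOREIGN KEY (f, ftype) REFERENCES foo (f, type) ON UPDATE 
RESTRICT:

UPDATE foo SET type = 'B' WHERE f = 1;
-- foo.f, the actually unique part, is untouched, so RESTRICT would not
-- apply; bar.ftype, the extra column, would follow to 'B' regardless,
-- as if it were ON UPDATE CASCADE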


I guess this would also be a more direct answer to Tom's earlier 
question about what to expect in the ON UPDATE scenario.


Best

Wolfgang




Add ON CONFLICT DO RETURN clause

2022-09-25 Thread Wolfgang Walther
When using ON CONFLICT DO NOTHING together with RETURNING, the 
conflicted rows are not returned. Sometimes, this would be useful 
though, for example when generated columns or default values are in play:


CREATE TABLE x (
  id INT PRIMARY KEY,
  created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

To get the created_at timestamp for a certain id **and** at the same 
time create this id in case it does not exist yet, I can currently do:


INSERT INTO x (id) VALUES (1)
  ON CONFLICT DO UPDATE
  SET id=EXCLUDED.id
  RETURNING created_at;

However that will result in a useless UPDATE of the row.

I could probably add a trigger to prevent the UPDATE in that case. Or I 
could do something in a CTE. Or in multiple statements in plpgsql - this 
is what I currently do in application code.
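
For illustration, the CTE variant would look roughly like this - a 
sketch only, and under READ COMMITTED it has the well-known race window 
where a concurrent insert can leave both branches empty:

WITH ins AS (
  INSERT INTO x (id) VALUES (1)
  ON CONFLICT DO NOTHING
  RETURNING created_at
)
SELECT created_at FROM ins
UNION ALL
SELECT created_at FROM x WHERE id = 1
LIMIT 1;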


The attached patch adds a DO RETURN clause to be able to do this:

INSERT INTO x (id) VALUES (1)
  ON CONFLICT DO RETURN
  RETURNING created_at;

Much simpler. This will either insert or do nothing - but in both cases 
return a row.


Thoughts?

Best

Wolfgang

>From 83a0031ed2ded46cbf6fd130bd76680267db7a5e Mon Sep 17 00:00:00 2001
From: Wolfgang Walther 
Date: Sun, 25 Sep 2022 16:20:44 +0200
Subject: [PATCH v1] Add ON CONFLICT DO RETURN clause

This behaves the same as DO NOTHING, but returns the row when used together 
with RETURNING.
---
 doc/src/sgml/postgres-fdw.sgml|   6 +-
 doc/src/sgml/ref/insert.sgml  |  15 +-
 src/backend/commands/explain.c|  21 +-
 src/backend/executor/nodeModifyTable.c|  24 +-
 src/backend/optimizer/util/plancat.c  |   4 +
 src/backend/parser/gram.y |  10 +
 src/backend/parser/parse_clause.c |   7 +
 src/backend/utils/adt/ruleutils.c |   4 +
 src/include/nodes/nodes.h |   1 +
 .../expected/insert-conflict-do-nothing-2.out | 186 
 .../expected/insert-conflict-do-nothing.out   |  46 
 .../expected/partition-key-update-3.out   | 206 ++
 .../specs/insert-conflict-do-nothing-2.spec   |  11 +
 .../specs/insert-conflict-do-nothing.spec |   5 +
 .../specs/partition-key-update-3.spec |  11 +
 src/test/regress/expected/insert_conflict.out |  25 +++
 src/test/regress/sql/insert_conflict.sql  |  19 ++
 17 files changed, 587 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index bfd344cdc0..e5b6b8501f 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -80,9 +80,9 @@
  
   Note that postgres_fdw currently lacks support for
   INSERT statements with an ON CONFLICT DO
-  UPDATE clause.  However, the ON CONFLICT DO 
NOTHING
-  clause is supported, provided a unique index inference specification
-  is omitted.
+  UPDATE or ON CONFLICT DO RETURN clause.
+  However, the ON CONFLICT DO NOTHING clause is supported,
+  provided a unique index inference specification is omitted.
   Note also that postgres_fdw supports row movement
   invoked by UPDATE statements executed on partitioned
   tables, but it currently does not handle the case where a remote partition
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 7cea70329e..eb0c721637 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -36,6 +36,7 @@ INSERT INTO table_name [ AS and conflict_action is 
one of:
 
 DO NOTHING
+DO RETURN
 DO UPDATE SET { column_name = 
{ expression | DEFAULT } |
 ( column_name 
[, ...] ) = [ ROW ] ( { expression 
| DEFAULT } [, ...] ) |
 ( column_name 
[, ...] ) = ( sub-SELECT )
@@ -336,9 +337,11 @@ INSERT INTO table_name [ AS conflict_target is violated, the
 alternative conflict_action is taken.
 ON CONFLICT DO NOTHING simply avoids inserting
-a row as its alternative action.  ON CONFLICT DO
-UPDATE updates the existing row that conflicts with the
-row proposed for insertion as its alternative action.
+a row as its alternative action.  ON CONFLICT DO RETURN
+avoids inserting the row, but returns the row when 
RETURNING
+is specified.  ON CONFLICT DO UPDATE updates the
+existing row that conflicts with the row proposed for insertion as
+its alternative action.

 

@@ -379,7 +382,7 @@ INSERT INTO table_name [ AS arbiter
 indexes.  Either performs unique index
 inference, or names a constraint explicitly.  For
-ON CONFLICT DO NOTHING, it is optional to
+DO NOTHING and DO RETURN, it is 
optional to
 specify a conflict_target; when
 omitted, conflicts with all usable constraints (and unique
 indexes) are handled.  For ON CONFLICT DO
@@ -395,8 +398,8 @@ INSERT INTO table_name [ AS 
 conflict_action specifies an
 alternative ON CONFLICT action.  It can be
-either DO NOTHING, or a DO
-UPDATE clause specifying the exact details of 

Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-25 Thread Wolfgang Walther

Robert Haas:

Well, maybe. Suppose that role A has been granted pg_read_all_settings
WITH INHERIT TRUE, SET TRUE and role B has been granted
pg_read_all_settings WITH INHERIT TRUE, SET FALSE. A can create a
table owned by pg_read_all_settings. If A does that, then B can now
create a trigger on that table and usurp the privileges of
pg_read_all_settings, after which B can now create any number of
objects owned by pg_read_all_settings.


I'm not seeing how this is possible. A trigger function would run with 
the invoking user's privileges by default, right? So B would have to 
create a trigger with a SECURITY DEFINER function, which is owned by 
pg_read_all_settings to actually usurp the privileges of that role. But 
creating objects with that owner is exactly the thing B can't do.


What am I missing?

Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-25 Thread Wolfgang Walther

James Coleman:

If I'm following properly this sounds like an overengineered EAV
schema, and neither of those things inspires me to think "this is a
use case I want to support".

That being said, I know that sometimes examples that have been
abstracted enough to share aren't always the best, so perhaps there's
something underlying this that's a more valuable example.


Most of my use-cases are slightly denormalized and I was looking for an 
example that didn't require those kinds of FKs only because of the 
denormalization. So that's why it might have been a bit artificial or 
abstracted too much.


Take another example: I deal with multi-tenancy systems, for which I 
want to use RLS to separate the data between tenants:


CREATE TABLE tenants (tenant INT PRIMARY KEY);

Each tenant can create multiple users and groups:

CREATE TABLE users (
  "user" INT PRIMARY KEY,
  tenant INT NOT NULL REFERENCES tenants
);

CREATE TABLE groups (
  "group" INT PRIMARY KEY,
  tenant INT NOT NULL REFERENCES tenants
);

Users can be members of groups. The simple approach would be:

CREATE TABLE members (
  PRIMARY KEY ("user", "group"),
  "user" INT REFERENCES users,
  "group" INT REFERENCES groups
);

But this falls short in two aspects:
- To make RLS policies simpler to write and quicker to execute, I want 
to add "tenant" columns to **all** other tables. A slightly denormalized 
schema for efficiency.
- The schema above does not ensure that users can only be members in 
groups of the same tenant. Our business model requires separating 
tenants cleanly, but as written above, cross-tenant memberships would be 
allowed.


In comes the "tenant" column which solves both of this:

CREATE TABLE members (
  PRIMARY KEY ("user", "group"),
  tenant INT REFERENCES tenants,
  "user" INT,
  "group" INT,
  FOREIGN KEY ("user", tenant) REFERENCES users ("user", tenant),
  FOREIGN KEY ("group", tenant) REFERENCES groups ("group", tenant)
);

This is not possible to do right now without adding more UNIQUE 
constraints to the users and groups tables - on a superset of already 
unique columns.
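
To spell out what's needed today, plus the kind of RLS policy the extra 
tenant column enables (the app.tenant setting name is just for 
illustration):

-- today's workaround: extra unique constraints on supersets of the PKs
ALTER TABLE users ADD UNIQUE ("user", tenant);
ALTER TABLE groups ADD UNIQUE ("group", tenant);

-- with members.tenant in place, the policy stays simple and fast
ALTER TABLE members ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON members
  USING (tenant = current_setting('app.tenant')::int);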



bar.y is a little bit like a generated value in that sense, it should
always match foo.b. I think it would be great, if we could actually go a
step further, too: On an update to bar.x to a new value, if foo.a=bar.x
exists, I would like to set bar.y automatically to the new foo.b.
Otherwise those kind of updates always have to either query foo before,
or add a trigger to do the same.


Isn't this actually contradictory to the behavior you currently have
with a multi-column foreign key? In the example above then an update
to bar.x is going to update the rows in foo that match bar.x = foo.a
and bar.y = foo.b *using the old values of bar.x and bar.y* to be the
new values.


No, I think there was a misunderstanding. An update to bar should not 
update rows in foo. An update to bar.x should update bar.y implicitly, 
to match the new value of foo.b.



You seem to be suggesting that instead it should look for
other rows that already match the *new value* of only one of the
columns in the constraint.


Yes. I think basically what I'm suggesting is that for an FK to a 
superset of unique columns, all the FK logic should still be done on the 
already unique set of columns only - and then the additional columns 
should be mirrored into the referencing table. The referencing table can 
then put additional constraints on this column. In the members example 
above, this additional constraint is the fact that the tenant column 
can't be filled with two different values for the users and groups FKs. 
But this could also be a CHECK constraint to allow FKs only to a subset 
of rows in the target table:


CREATE TYPE foo_type AS ENUM ('A', 'B', 'C');

CREATE TABLE foo (
  f INT PRIMARY KEY,
  type foo_type
);

CREATE TABLE bar (
  b INT PRIMARY KEY,
  f INT,
  ftype foo_type CHECK (ftype <> 'C'),
  FOREIGN KEY (f, ftype) REFERENCES foo (f, type)
);

In this example, the additional ftype column is just used to enforce 
that bar can only reference rows with type A or B, but not C. Assume:


INSERT INTO foo VALUES (1, 'A'), (2, 'B'), (3, 'C');

In this case, it would be nice to be able to do the following, i.e. 
derive the value for bar.ftype automatically:


INSERT INTO bar (b, f) VALUES (10, 1); -- bar.ftype is then 'A'
UPDATE bar SET f = 2 WHERE b = 10; -- bar.ftype is then 'B'

And it would throw errors in the following cases, because the 
automatically derived value fails the CHECK constraint:


INSERT INTO bar (b, f) VALUES (20, 3);
UPDATE bar SET f = 3 WHERE b = 10;

Note: This "automatically derived columns" extension would be a separate 
feature. Really nice to have, but the above mentioned FKs to supersets 
of unique columns would be very valuable without it already.


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Wolfgang Walther

Robert Haas:

I think to change the owner of an object from role A to role B, you just
need a different "privilege" on that role B to "use" the role that way,
which is distinct from INHERIT or SET ROLE "privileges".


It's not distinct, though, because if you can transfer ownership of a
table to another user, you can use that ability to gain the privileges
of that user.


Right, but the inverse is not necessarily true, so you could have SET 
ROLE privileges, but not "USAGE" - and then couldn't change the owner of 
an object to this role.


USAGE is not a good term, because it implies "least amount of 
privileges", but in this case it's quite the opposite.


In any case, adding a grant option for SET ROLE, while keeping the 
required privileges for a transfer of ownership at the minimum 
(membership only), doesn't really make sense. I guess both threads 
should be discussed together?


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Wolfgang Walther

Robert Haas:

Fairly obviously, my thinking here is biased by having written the
patch to allow restricting SET ROLE. If alice can neither inherit
bob's privileges nor SET ROLE bob, she had better not be able to
create objects owned by bob, because otherwise she can make a table,
add an expression index that calls a user-defined function, do stuff
until it needs to be autovacuumed, and then give it to bob, and boom,
exploit. But that doesn't mean that the is_member_of_role() tests here
have to be changed to has_privs_of_role(). They could be changed to
has_privs_of_role() || member_can_set_role(). And if the consensus is
to do it that way, I'm OK with that.

I'm just a little unconvinced that it's actually the best route. I
think that logic of the form "well Alice could just SET ROLE and do it
anyway" is weak -- and not only because of the patch to allow
restricting SET ROLE, but because AFAICT there is no point to the
INHERIT option in the first place unless it is to force you to issue
SET ROLE. That is literally the only thing it does. If we're going to
have weird exceptions where you don't have to SET ROLE after all, why
even have INHERIT in the first place?


I think to change the owner of an object from role A to role B, you just 
need a different "privilege" on that role B to "use" the role that way, 
which is distinct from INHERIT or SET ROLE "privileges".


When you are allowed to INHERIT a role, you are allowed to use the 
GRANTs that have been given to this role. When you are allowed to SET 
ROLE, then you are allowed to switch into this role. You could think of 
another "privilege", USAGE on a role, which would allow you to "use" 
this role as a target in a statement to change the owner of an object.


To change the owner for an object from role A to role B, you need:
- the privilege to ALTER the object, which is implied when you are A
- the privilege to "use" role B as a target

So basically the privilege to use role B as the new owner is a 
privilege you have **on** the role object B, while the privilege to 
change the owner of an object is something you have **through** your 
membership in role A.


Up to v15, there were no separate privileges for this. You were either a 
member of a role or you were not. Now with INHERIT and maybe SET ROLE 
privileges/grant options, we can do two things:
- Keep the ability to use a role as a target in those statements as the 
most basic privilege on a role, that is implied by membership in that 
role and can't be taken away (currently the case), or

- invent a new privilege or grant option to allow changing that.

But mixing this with either INHERIT or SET ROLE doesn't make sense, imho.

Best

Wolfgang




Re: Allow foreign keys to reference a superset of unique columns

2022-09-02 Thread Wolfgang Walther

Kaiting Chen:

I'd like to propose a change to PostgreSQL to allow the creation of a foreign
key constraint referencing a superset of uniquely constrained columns.


+1

Tom Lane:

TBH, I think this is a fundamentally bad idea and should be rejected
outright.  It fuzzes the semantics of the FK relationship, and I'm
not convinced that there are legitimate use-cases.  Your example
schema could easily be dismissed as bad design that should be done
some other way.


I had to add quite a few unique constraints on a superset of already 
uniquely constrained columns in the past, just to be able to support FKs 
to those columns. I think those cases most often come up when dealing 
with slightly denormalized schemas, e.g. for efficiency.


One other use-case I had recently was along the following lines, in 
abstract terms:


CREATE TABLE classes (class INT PRIMARY KEY, ...);

CREATE TABLE instances (
  instance INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Think about classes and instances as in OOP. So the table classes 
contains some definitions for different types of object and the table 
instances realizes them into concrete objects.


Now, assume you have some property of a class that is best modeled as a 
table like this:


CREATE TABLE classes_prop (
  property INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Now, assume you need to store data for each of those classes_prop rows 
for each instance. You'd do the following:


CREATE TABLE instances_prop (
  instance INT REFERENCES instances,
  property INT REFERENCES classes_prop,
  ...
);

However, this does not ensure that the instance and the property you're 
referencing in instances_prop are actually from the same class, so you 
add a class column:


CREATE TABLE instances_prop (
  instance INT,
  class INT,
  property INT,
  FOREIGN KEY (instance, class) REFERENCES instances,
  FOREIGN KEY (property, class) REFERENCES classes_prop,
  ...
);

But this won't work without creating some UNIQUE constraints on those 
supersets of the PK columns first.
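
Concretely, today that means something like:

ALTER TABLE instances ADD UNIQUE (instance, class);
ALTER TABLE classes_prop ADD UNIQUE (property, class);

even though instance and property are each already primary keys on their 
own.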



For one example of where the semantics get fuzzy, it's not
very clear how the extra-baggage columns ought to participate in
CASCADE updates.  Currently, if we have
CREATE TABLE foo (a integer PRIMARY KEY, b integer);
then an update that changes only foo.b doesn't need to update
referencing tables, and I think we even have optimizations that
assume that if no unique-key columns are touched then RI checks
need not be made.  But if you did
CREATE TABLE bar (x integer, y integer,
  FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE 
CASCADE);
then perhaps you expect bar.y to be updated ... or maybe you don't?


In all use-cases I had so far, I would expect bar.y to be updated, too.

I think it would not even be possible to NOT update bar.y, because the 
FK would then not match anymore. foo.a is the PK, so the value in bar.x 
already forces bar.y to be the same as foo.b at all times.


bar.y is a little bit like a generated value in that sense, it should 
always match foo.b. I think it would be great, if we could actually go a 
step further, too: On an update to bar.x to a new value, if foo.a=bar.x 
exists, I would like to set bar.y automatically to the new foo.b. 
Otherwise those kind of updates always have to either query foo before, 
or add a trigger to do the same.


In the classes/instances example above, when updating 
instances_prop.property to a new value, instances_prop.class would be 
updated automatically to match classes_prop.class. This would fail when 
the class is different from the class required by the FK to instances, 
though, providing exactly the safeguard that this constraint was 
supposed to provide, without incurring additional overhead in update 
statements.


In the foo/bar example above, which is just a bit of denormalization, 
this automatic update would also be helpful - because rejecting the 
update on the grounds that the columns don't match doesn't make sense here.



Another example is that I think the idea is only well-defined when
the subset column(s) are a primary key, or at least all marked NOT NULL.
Otherwise they're not as unique as you're claiming.


I fail to see why. My understanding is that rows with NULL values in the 
referenced table can't participate in FK matches anyway, because both 
MATCH SIMPLE and MATCH FULL wouldn't require a match when any/all of the 
columns in the referencing table are NULL. MATCH PARTIAL is not 
implemented, so I can't tell whether the semantics would be different there.


I'm not sure whether a FK on a superset of unique columns would be 
useful with MATCH SIMPLE. Maybe it could be forced to be MATCH FULL, if 
MATCH SIMPLE is indeed not well-defined.
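
For illustration, with Tom's foo/bar example from above:

-- MATCH SIMPLE (the default): a NULL in any referencing column
-- skips the check entirely
INSERT INTO bar (x, y) VALUES (1, NULL);  -- allowed, no match required

-- MATCH FULL: either all referencing columns are NULL, or none may be;
-- declaring FOREIGN KEY (x, y) REFERENCES foo (a, b) MATCH FULL
-- would reject the mixed NULL/non-NULL row above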



It's also unclear to me how this ought to interact with the
information_schema views concerning foreign keys.  We generally
feel that we don't want to present any non-SQL-compatible data
in information_schema, for fear that it will confuse 

Re: allowing for control over SET ROLE

2022-09-02 Thread Wolfgang Walther

Robert Haas:

Beginning in
e3ce2de09d814f8770b2e3b3c152b7671bcdb83f, the inheritance behavior of
role-grants can be overridden for individual grants, so that some
grants are inherited and others are not.


That's a great thing to have!


However, there is no similar
facility for controlling whether a role can SET ROLE to some other
role of which it is a member. At present, if role A is a member of
role B, then A can SET ROLE B, and that's it.

In some circumstances, it may be desirable to control this behavior.


+1


rhaas=# grant oncall to oncallbot with inherit false, set false, admin true;


Looking at the syntax here, I'm not sure whether adding more WITH 
options is the best way to do this. From a user perspective WITH SET 
TRUE looks more like a privilege granted on how to use this database 
object (role). Something like this would be more consistent with the 
other GRANT variants:


GRANT SET ON ROLE oncall TO oncallbot WITH GRANT OPTION;

This is obviously not exactly the same as the command above, because 
oncallbot would be able to use SET ROLE directly. But as discussed, this 
is more cosmetic anyway, because they could GRANT it to themselves.


The full syntax could look like this:

GRANT { INHERIT | SET | ALL [ PRIVILEGES ] }
  ON ROLE role_name [, ...]
  TO role_specification [, ...] [ WITH GRANT OPTION ]
  [ GRANTED BY role_specification ]

With this new syntax, the existing

GRANT role_name TO role_specification [WITH ADMIN OPTION];

would be the same as

GRANT ALL ON ROLE role_name TO role_specification [WITH GRANT OPTION];

This would slightly change the way INHERIT works: As a privilege, it 
would not override the member's role INHERIT attribute, but would 
control whether that attribute is applied. This means:


- INHERIT attribute + INHERIT granted -> inheritance (same)
- INHERIT attribute + INHERIT not granted -> no inheritance (different!)
- NOINHERIT attribute  + INHERIT not granted -> no inheritance (same)
- NOINHERIT attribute  + INHERIT granted -> no inheritance (different!)

This would allow us to do the following:

GRANT INHERIT ON ROLE pg_read_all_settings TO seer_bot WITH GRANT OPTION;

seer_bot would now be able to GRANT pg_read_all_settings to other users, 
too - but without the ability to use or grant SET ROLE to anyone. As 
long as seer_bot has the NOINHERIT attribute set, they wouldn't use that 
privilege, though - which might be desired for the bot.


Similarly, it would be possible for the oncallbot in the example above to 
be able to grant SET ROLE only - and not INHERIT.


I realize that there has been a lot of discussion about roles and 
privileges in the past year. I have tried to follow those discussions, 
but it's likely that I missed some good arguments against my proposal above.


Best

Wolfgang




Re: [PATCH] Add reloption for views to enable RLS

2022-03-02 Thread Wolfgang Walther

Dean Rasheed:

That is also the main reason I preferred naming it "security_invoker" -
it is consistent with other databases and eases transition from such
systems.

[...]

For my part, I find myself more and more convinced that
"security_invoker" is the right name, because it matches the
terminology used for functions, and in other database systems. I think
the parallels between security invoker functions and security invoker
views are quite strong.

[...]

When we come to write the release notes for this feature, saying that
this version of PG now supports security invoker views is going to
mean a lot more to people who already use that feature in other
databases.

What are other people's opinions?


All those points in favor of security_invoker are very good indeed. The 
main objection was not the term invoker, though, but the implicit 
association it creates as in "security_invoker=false behaves like 
security definer". But this is clearly wrong, the "security definer" 
semantics as used for functions or in other databases just don't apply 
as the default in PG.


I think renaming the reloption was a shortcut to avoid that association, 
while the best way to deal with that would be explicit documentation. 
Meanwhile, the patch has added a mention about CURRENT_USER, so that's a 
first step. Maybe an explicit mention that security_invoker=false is 
NOT the same as "security definer", and explaining why, would already be 
enough?


Best

Wolfgang





Suggestion: optionally return default value instead of error on failed cast

2020-12-12 Thread Wolfgang Walther

Hi,

currently a failed cast throws an error. It would be useful to have a 
way to get a default value instead.


T-SQL has try_cast [1]
Oracle has CAST(... AS .. DEFAULT ... ON CONVERSION ERROR) [2]

The DEFAULT ... ON CONVERSION ERROR syntax seems like it could be 
implemented in PostgreSQL. Even if only DEFAULT NULL was supported (at 
first) that would already help.
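
For context, the usual workaround today is a hand-rolled PL/pgSQL helper 
per target type, which is verbose and noticeably slower than a native 
cast would be (try_cast_int is just an illustrative name):

CREATE FUNCTION try_cast_int(p text) RETURNS int
  LANGUAGE plpgsql IMMUTABLE AS
$$
BEGIN
  RETURN p::int;
EXCEPTION WHEN others THEN  -- deliberately broad: any cast failure
  RETURN NULL;
END
$$;

SELECT try_cast_int('42');   -- 42
SELECT try_cast_int('foo');  -- NULL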


The short syntax could be extended for the DEFAULT NULL case, too:

SELECT '...'::type -- throws error
SELECT '...':::type -- returns NULL

I couldn't find any previous discussion on this; please advise in case I 
just missed it.


Thoughts?

Best

Wolfgang

[1]: 
https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql
[2]: 
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/CAST.html





Re: extension patch of CREATE OR REPLACE TRIGGER

2020-08-03 Thread Wolfgang Walther

osumi.takami...@fujitsu.com:

* I'm a little bit concerned about the semantics of changing the
tgdeferrable/tginitdeferred properties of an existing trigger.  If there are 
trigger
events pending, and the trigger is redefined in such a way that those events
should already have been fired, what then?

OK. I need a discussion about this point.
There would be two ideas to define the behavior of this semantics change, I 
think.
The first idea is to throw an error that means
the *pending* trigger can't be replaced during the session.
The second one is just to replace the trigger and fire the new trigger
at the end of the transaction when its tginitdeferred is set true.
For me, the first one sounds safer. Yet, I'd like to know other opinions.


IMHO, constraint triggers should behave the same in that regard as other 
constraints. I just checked:


BEGIN;
CREATE TABLE t1 (a int CONSTRAINT u UNIQUE DEFERRABLE INITIALLY DEFERRED);
INSERT INTO t1 VALUES (1),(1);
ALTER TABLE t1 ALTER CONSTRAINT u NOT DEFERRABLE;

will throw with:

ERROR:  cannot ALTER TABLE "t1" because it has pending trigger events
SQL state: 55006

So if a trigger event is pending, CREATE OR REPLACE for that trigger 
should throw. I think it should do so in any case, not just when changing 
deferrability. This makes it easier to reason about.


If the user has a pending trigger, they can still do SET CONSTRAINTS 
trigger_name IMMEDIATE; to resolve that and then do CREATE OR REPLACE 
TRIGGER, just like in the ALTER TABLE case.
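
A sketch of that flow, reusing the trigger from the quoted example below 
and assuming the patch's CREATE OR REPLACE syntax:

BEGIN;
INSERT INTO trig_table DEFAULT VALUES;  -- queues a deferred trigger event
SET CONSTRAINTS my_trig IMMEDIATE;      -- fires the pending event now
CREATE OR REPLACE CONSTRAINT TRIGGER my_trig
  AFTER INSERT ON trig_table
  DEFERRABLE INITIALLY DEFERRED
  FOR EACH ROW EXECUTE PROCEDURE before_replacement();  -- no pending events left
COMMIT;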



regression=# create constraint trigger my_trig after insert on trig_table
deferrable initially deferred for each row execute procedure
before_replacement();
CREATE TRIGGER
regression=# begin;
BEGIN
regression=*# insert into trig_table default values;
INSERT 0 1
regression=*# drop trigger my_trig on trig_table;
DROP TRIGGER
regression=*# commit;
ERROR:  relation 38489 has no triggers

I could reproduce this bug, using the current master without my patch.
So this is another issue.
I'm thinking that throwing an error when a *pending* trigger is dropped
makes sense. Does everyone agree with it?


Just tested the same example as above, but with DROP TABLE t1; instead 
of ALTER TABLE. This throws with:


ERROR:  cannot DROP TABLE "t1" because it has pending trigger events
SQL state: 55006

So yes, your suggestion makes a lot of sense!




Re: Allow an alias to be attached directly to a JOIN ... USING

2020-08-03 Thread Wolfgang Walther

Peter Eisentraut:

On 2019-12-31 00:07, Vik Fearing wrote:

One thing I notice is that the joined columns are still accessible from
their respective table names when they should not be per spec.  That
might be one of those "silly restrictions" that we choose to ignore, but
it should probably be noted somewhere, at the very least in a code
comment if not in user documentation. (This is my reading of SQL:2016 SR
11.a.i)


Here is a rebased patch.

The above comment is valid.  One reason I didn't implement it is that it 
would create inconsistencies with existing behavior, which is already 
nonstandard.


For example,

create table a (id int, a1 int, a2 int);
create table b (id int, b2 int, b3 int);

makes

select a.id from a join b using (id);

invalid.  Adding an explicit alias for the common column names doesn't 
change that semantically, because an implicit alias also exists if an 
explicit one isn't specified.

I just looked through the patch without applying or testing it - but I 
couldn't find anything that would indicate that this is not going to 
work for e.g. a LEFT JOIN as well. First PG patch I looked at, so tell 
me if I missed something there.


So given this:

SELECT x.id FROM a LEFT JOIN b USING (id) AS x

will this return NULL or a.id for rows that don't match in b? This 
should definitely be mentioned in the docs and I guess a test wouldn't 
be too bad as well?
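
For reference, with the tables a and b from above: without the alias, 
the merged USING column of an outer join already behaves like 
COALESCE(a.id, b.id), so presumably x.id would follow the same rule:

INSERT INTO a VALUES (1, 0, 0);  -- no matching row in b

SELECT id FROM a LEFT JOIN b USING (id);  -- returns 1, not NULL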


In any case: if a.id and b.id were not available anymore, but just 
x.id, either the id value itself or the NULL value (indicating the 
missing row in b) would be lost. So this seems like a no-go.


> I agree that some documentation would be in order if we decide to leave
> it like this.

Keep it like that!




Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2020-08-03 Thread Wolfgang Walther

Tom Lane:

We don't generally act that way in other ALTER commands and I don't see
a strong argument to start doing so here.  [...]

In short, I'm inclined to argue that this variant of ALTER TABLE
should replace *all* the fields of the constraint with the same
properties it'd have if you'd created it fresh using the same syntax.
This is by analogy to CREATE OR REPLACE commands, which don't
preserve any of the old properties of the replaced object.  Given
the interactions between these fields, I think you're going to end up
with a surprising mess of ad-hoc choices if you do differently.
Indeed, you already have, but I think it'll get worse if anyone
tries to extend the feature set further.


I don't think the analogy to CREATE OR REPLACE holds. Semantically, 
REPLACE and ALTER are very different. Using ALTER, the expectation is to 
change something while keeping everything else unchanged. Looking at all 
the other ALTER TABLE actions, especially ALTER COLUMN, it looks like 
every command does exactly one thing and not more. I don't think 
deferrability and ON UPDATE / ON DELETE should be changed together at 
all, neither implicitly nor explicitly.


There seems to be a fundamental difference between deferrability and the 
ON UPDATE/ON DELETE clauses as well - the latter only apply to FOREIGN 
KEYs, while the former apply to multiple types of constraints.


Matheus de Oliveira:

I'm still not sure if the chosen path is the best way. But I'd be glad 
to follow any directions we all see fit.


For now, this patch applies two methods:
1. Changes full constraint definition (which keeps compatibility with 
current ALTER CONSTRAINT):

     ALTER CONSTRAINT name [ deferrability ] [ ON UPDATE action ] [ ON DELETE action ]
2. Changes only the subset explicitly seen in the command (a new way, I've 
chosen to just add SET in the middle, similar to `ALTER COLUMN ... SET 
{DEFAULT | NOT NULL}` ):

     ALTER CONSTRAINT name SET [ deferrability ] [ ON UPDATE action ] [ ON DELETE action ]

I'm OK with changing the approach, we just need to choose the color :D


The `ALTER CONSTRAINT name SET [ deferrability ] [ ON UPDATE action ] [ ON DELETE action ]` 
has the same problem about implied changes: What happens if you only do 
e.g. ALTER CONSTRAINT SET ON UPDATE xy - will the ON DELETE part be kept 
as-is or set to the default?


Also, since the ON UPDATE/ON DELETE just applies to FOREIGN KEYs and no 
other constraints, there's one level of "nesting" missing in your SET 
variant, I think.


I suggest to:

- keep `ALTER CONSTRAINT constraint_name [ DEFERRABLE | NOT DEFERRABLE ] 
[ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]` exactly as-is


- add both:
 + `ALTER CONSTRAINT constraint_name [ALTER] FOREIGN KEY ON UPDATE 
referential_action`
 + `ALTER CONSTRAINT constraint_name [ALTER] FOREIGN KEY ON DELETE 
referential_action`


This does not imply any changes that are not in the command - very much 
analogous to the ALTER COLUMN variants.
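
Usage would then look like this (hypothetical syntax per the suggestion 
above, with a made-up constraint name):

ALTER TABLE bar
  ALTER CONSTRAINT bar_f_fkey FOREIGN KEY ON UPDATE CASCADE;
ALTER TABLE bar
  ALTER CONSTRAINT bar_f_fkey FOREIGN KEY ON DELETE RESTRICT;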


This could also be extended in the future with stuff like `ALTER 
CONSTRAINT constraint_name [ALTER] FOREIGN KEY MATCH [ FULL | PARTIAL | 
SIMPLE ]`.