Re: Popcount optimization using AVX512
On Thu, Oct 31, 2024 at 07:58:06PM +, Devulapalli, Raghuveer wrote: > LGTM. Thanks. Barring additional feedback, I plan to commit this soon. -- nathan
Re: Popcount optimization using AVX512
On Wed, Oct 30, 2024 at 04:10:10PM -0500, Nathan Bossart wrote: > On Wed, Oct 30, 2024 at 08:53:10PM +, Raghuveer Devulapalli wrote: >> BTW, I just realized function attributes for xsave and avx512 don't work >> on MSVC (see >> https://developercommunity.visualstudio.com/t/support-function-target-attribute-and-mutiversioning/10130630). >> Not sure if you care about it. Its an easy fix (see >> https://gcc.godbolt.org/z/Pebdj3vMx). > > Oh, good catch. IIUC we only need to check for #ifndef _MSC_VER in the > configure programs for meson. pg_attribute_target will be empty on MSVC, > and I believe we only support meson builds there. Here is an updated patch with this change. -- nathan >From 8cf7c08220a9c0a1dec809794af2dfb719981923 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 16 Oct 2024 15:57:55 -0500 Subject: [PATCH v2 1/1] use __attribute__((target(...))) for AVX-512 stuff --- config/c-compiler.m4 | 60 +- configure| 163 ++- configure.ac | 17 +-- meson.build | 21 ++-- src/Makefile.global.in | 5 - src/include/c.h | 10 ++ src/makefiles/meson.build| 4 +- src/port/Makefile| 12 +- src/port/meson.build | 7 +- src/port/pg_popcount_avx512.c| 86 +- src/port/pg_popcount_avx512_choose.c | 102 - 11 files changed, 175 insertions(+), 312 deletions(-) delete mode 100644 src/port/pg_popcount_avx512_choose.c diff --git a/config/c-compiler.m4 b/config/c-compiler.m4 index 10f8c7bd0a..aa90f8ef33 100644 --- a/config/c-compiler.m4 +++ b/config/c-compiler.m4 @@ -700,20 +700,20 @@ undefine([Ac_cachevar])dnl # Check if the compiler supports the XSAVE instructions using the _xgetbv # intrinsic function. # -# An optional compiler flag can be passed as argument (e.g., -mxsave). If the -# intrinsic is supported, sets pgac_xsave_intrinsics and CFLAGS_XSAVE. +# If the intrinsics are supported, sets pgac_xsave_intrinsics. AC_DEFUN([PGAC_XSAVE_INTRINSICS], -[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics_$1])])dnl -AC_CACHE_CHECK([for _xgetbv with CFLAGS=$1], [Ac_cachevar], -[pgac_save_CFLAGS=$CFLAGS -CFLAGS="$pgac_save_CFLAGS $1" -AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ], - [return _xgetbv(0) & 0xe0;])], +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics])])dnl +AC_CACHE_CHECK([for _xgetbv], [Ac_cachevar], +[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include +__attribute__((target("xsave"))) +static int xsave_test(void) +{ + return _xgetbv(0) & 0xe0; +}], + [return xsave_test();])], [Ac_cachevar=yes], - [Ac_cachevar=no]) -CFLAGS="$pgac_save_CFLAGS"]) + [Ac_cachevar=no])]) if test x"$Ac_cachevar" = x"yes"; then - CFLAGS_XSAVE="$1" pgac_xsave_intrinsics=yes fi undefine([Ac_cachevar])dnl @@ -725,29 +725,27 @@ undefine([Ac_cachevar])dnl # _mm512_setzero_si512, _mm512_maskz_loadu_epi8, _mm512_popcnt_epi64, # _mm512_add_epi64, and _mm512_reduce_add_epi64 intrinsic functions. # -# Optional compiler flags can be passed as argument (e.g., -mavx512vpopcntdq -# -mavx512bw). If the intrinsics are supported, sets -# pgac_avx512_popcnt_intrinsics and CFLAGS_POPCNT. +# If the intrinsics are supported, sets pgac_avx512_popcnt_intrinsics. AC_DEFUN([PGAC_AVX512_POPCNT_INTRINSICS], -[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics_$1])])dnl -AC_CACHE_CHECK([for _mm512_popcnt_epi64 with CFLAGS=$1], [Ac_cachevar], -[pgac_save_CFLAGS=$CFLAGS -CFLAGS="$pgac_save_CFLAGS $1" -AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ], - [const char buf@<:@sizeof(__m512i)@:>@; - PG_INT64_TYPE popcnt = 0; - __m512i accum = _mm512_setzero_si512(); - const __m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *) buf); - const __m512i cnt = _mm512_popcnt_epi64(val); - accum = _mm512_add_epi64(accum, cnt); - popcnt = _mm512_reduce_add_epi64(accum); - /* return computed value, to prevent the above being optimized away */ - return popcnt == 0;])], +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics])])dnl +AC_CACHE_CHECK([for _mm512_popcnt_epi64], [Ac_cachevar], +[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include +__attribute__((target("avx512vpopcntdq","avx512bw"))) +static int popcount_test(void) +{ + const char buf@<:@sizeof(__m512i)@:>@; + PG_INT64_TYPE popcnt = 0; + __m512i accum = _mm512_setzero_si512(); + const __m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *) buf); + const __m512i cnt = _mm512_popcnt_epi64(val); + accum = _mm512_add_epi64(accum, cnt); + popcnt = _mm512_reduce_add_epi64(accum); + retur
Re: Time to add a Git .mailmap?
On Thu, Oct 31, 2024 at 11:37:13AM +0100, Daniel Gustafsson wrote: > When looking at our Git tree for a recent conference presentation I happened > to > notice that we have recently gained duplicate names in the shortlog. Not sure > if we care enough to fix that with a .mailmap, but if we do the attached diff > makes sure that all commits are accounted for a single committer entry. Seems reasonable to me. -- nathan
Re: Popcount optimization using AVX512
On Wed, Oct 30, 2024 at 08:53:10PM +, Raghuveer Devulapalli wrote: > BTW, I just realized function attributes for xsave and avx512 don't work > on MSVC (see > https://developercommunity.visualstudio.com/t/support-function-target-attribute-and-mutiversioning/10130630). > Not sure if you care about it. Its an easy fix (see > https://gcc.godbolt.org/z/Pebdj3vMx). Oh, good catch. IIUC we only need to check for #ifndef _MSC_VER in the configure programs for meson. pg_attribute_target will be empty on MSVC, and I believe we only support meson builds there. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Fri, Oct 18, 2024 at 08:08:55AM +0900, Michael Paquier wrote: > 0002 and 0003 look sane enough to me as shaped. I'd need to spend a > few more hours on 0001 if I were to do a second round on it, but you > don't really need a second opinion, do you? :D I'll manage. 0001 was a doozy to back-patch, and this obviously isn't a terribly pressing issue, so I plan to wait until after the November minor release to apply this. I'm having second thoughts on 0002, so I'll probably leave that one uncommitted, but IMHO we definitely need 0003 to prevent this issue from reoccurring. -- nathan >From 58eefc6c2f90ac7d1f23f5b5f3052042cf0e75d9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 30 Oct 2024 15:42:29 -0500 Subject: [PATCH v5 1/1] Ensure we have a snapshot when updating various system catalogs. Reported-by: Alexander Lakhin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 12 --- src/backend/commands/indexcmds.c| 2 +- src/backend/commands/tablecmds.c| 8 +++ src/backend/postmaster/autovacuum.c | 7 ++ src/backend/replication/logical/tablesync.c | 21 ++ src/backend/replication/logical/worker.c| 24 + 5 files changed, 61 insertions(+), 1 deletion(-) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 2f652463e3..fd86f28a35 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -4230,7 +4230,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein false); /* -* Updating pg_index might involve TOAST table access, so ensure we +* Swapping the indexes might involve TOAST table access, so ensure we * have a valid snapshot. */ PushActiveSnapshot(GetTransactionSnapshot()); diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 4345b96de5..f1649419d6 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19383,9 +19383,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* +* Detaching the partition might involve TOAST table access, so ensure we +* have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index dc3cf87aba..e84285b644 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2209,6 +2209,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname; + /* +* Deletion might involve TOAST table access, so ensure we have a +* valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2221,6 +2227,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index 118503fcb7..57e1eede41 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) replorigin_session_origin_lsn = InvalidXLogRecPtr; replorigin_session_origin_timestamp = 0; + /* +* Updating pg_replication_origin might involve TOAST table access, so +* ensure we have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Drop the tablesync's origin tracking if exists. * @@ -387,6 +393,8 @
Re: Reduce one comparison in binaryheap's sift down
On Wed, Oct 30, 2024 at 08:09:17AM -0400, Robert Haas wrote: > I would be delighted if you could take care of it. Committed. -- nathan
Re: Reduce one comparison in binaryheap's sift down
On Tue, Oct 29, 2024 at 11:21:15AM +0800, cca5507 wrote: > Agree, new version patch is attached. I might editorialize a bit, but overall this looks pretty good to me. Robert, would you like to commit this, or shall I? -- nathan
Re: sunsetting md5 password support
On Mon, Oct 28, 2024 at 04:10:29PM -0500, Jim Nasby wrote: > Patch itself looks good, but it does leave me wondering if cleartext > should also be deprecated? I see that Tom has already chimed in on this point. In any case, this is probably a topic for another thread. > Might also be worth mentioning deprecation in pg_hba.conf. Yeah. I vaguely recall waffling on whether to add one there, and for whatever reason, I decided against it. I've added it in v4. -- nathan >From 716ca7332ed3e9e7e23146c926f93cb759a0d227 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 28 Oct 2024 19:54:02 -0500 Subject: [PATCH v4 1/1] Deprecate MD5 passwords. MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: Greg Sabino Mullane, Jim Nasby Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan --- .../passwordcheck/expected/passwordcheck.out | 1 + .../expected/passwordcheck_1.out | 1 + contrib/passwordcheck/sql/passwordcheck.sql | 1 + doc/src/sgml/catalogs.sgml| 9 +++ doc/src/sgml/client-auth.sgml | 17 + doc/src/sgml/config.sgml | 24 +++ doc/src/sgml/libpq.sgml | 9 +++ doc/src/sgml/protocol.sgml| 8 +++ doc/src/sgml/ref/create_role.sgml | 8 +++ doc/src/sgml/runtime.sgml | 10 src/backend/libpq/crypt.c | 10 src/backend/utils/misc/guc_tables.c | 9 +++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/libpq/crypt.h | 3 +++ src/test/regress/expected/password.out| 15 src/test/regress/expected/password_1.out | 9 +++ 16 files changed, 135 insertions(+) diff --git a/contrib/passwordcheck/expected/passwordcheck.out b/contrib/passwordcheck/expected/passwordcheck.out index 2027681daf..dfb2ccfe00 100644 --- a/contrib/passwordcheck/expected/passwordcheck.out +++ b/contrib/passwordcheck/expected/passwordcheck.out @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; -- ok diff --git a/contrib/passwordcheck/expected/passwordcheck_1.out b/contrib/passwordcheck/expected/passwordcheck_1.out index 5d8d5dcc1c..9519d60a49 100644 --- a/contrib/passwordcheck/expected/passwordcheck_1.out +++ b/contrib/passwordcheck/expected/passwordcheck_1.out @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; -- ok diff --git a/contrib/passwordcheck/sql/passwordcheck.sql b/contrib/passwordcheck/sql/passwordcheck.sql index 1fbd6b0e96..5953ece5c2 100644 --- a/contrib/passwordcheck/sql/passwordcheck.sql +++ b/contrib/passwordcheck/sql/passwordcheck.sql @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 964c819a02..0b9ca087c8 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1618,6 +1618,15 @@ will store the md5 hash of xyzzyjoe. + + +Support for MD5-encrypted passwords is deprecated and will be removed in a +future release of PostgreSQL. Refer to + for details about migrating to another +password type. + + + If the password is encrypted with SCRAM-SHA-256, it has the format: diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index 51343de7ca..782b49c85a 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -531,6 +531,15 @@ include_dir directory user's password. See for details. + + + Support for MD5-encrypted passwords is deprecated and will be + removed in a future release of + PostgreSQL. Refer to +for details about migrating to + another password type. + + @@ -1260,6 +1269,14 @@ omicron bryanh guest1 server is encrypted for SCRAM (see below), then SCRAM-based authentication will automatically be chose
Re: Assertion failure when autovacuum drops orphan temp indexes.
Committed. -- nathan
Re: Reduce one comparison in binaryheap's sift down
On Mon, Oct 28, 2024 at 12:40:20PM -0400, Robert Haas wrote: > Hmm, so at present we compare the parent to the left child and to the > right child. If it's smaller than neither, everything is OK. If it's > smaller than one, we swap it with that one. If it's smaller than both, > we compare the left and right child with each other and swap the > parent with the larger of the two. Hence, if a node has 2 children, we > always do 2 comparisons, and we sometimes do 3 comparisons. > > With the patch, we first compare the two children to each other, and > then compare the larger one to the parent. If the parent is smaller > than the larger child, we swap them. Hene, if a node has 2 children, > we always do 2 comparisons. > > Unless I'm missing something, that does seem significantly better. That sounds right to me. I think there are some ways to simplify the code a little further, though. For example, you can initialize larger_off to left_off, and before any comparisons happen, break if the node has no children, i.e., left_off >= heap->bh_size. That should help reduce the number of offset assignments and comparisons, which I found difficult to read at first. -- nathan
Re: Assertion failure when autovacuum drops orphan temp indexes.
On Sun, Oct 27, 2024 at 09:36:38PM -0700, Masahiko Sawada wrote: > Thank you for your suggestion. IMHO I'm not sure we need to have a > regression test for this bug as it's just an oversight of recently > committed code. Agreed. FWIW cfbot seems to catch this already (CTRL+F for "index_drop" in the highlights page [0]), and I saw it on my laptop shortly after the issue was reported. [0] http://cfbot.cputube.org/highlights/all.html -- nathan
Re: sunsetting md5 password support
rebased -- nathan >From 0f867b4b560c83f33b448df5a9b20a6d61ba2611 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 11 Oct 2024 16:21:09 -0500 Subject: [PATCH v3 1/1] Deprecate MD5 passwords. MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: ??? Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan --- .../passwordcheck/expected/passwordcheck.out | 1 + .../expected/passwordcheck_1.out | 1 + contrib/passwordcheck/sql/passwordcheck.sql | 1 + doc/src/sgml/catalogs.sgml| 9 +++ doc/src/sgml/client-auth.sgml | 8 +++ doc/src/sgml/config.sgml | 24 +++ doc/src/sgml/libpq.sgml | 9 +++ doc/src/sgml/protocol.sgml| 8 +++ doc/src/sgml/ref/create_role.sgml | 8 +++ doc/src/sgml/runtime.sgml | 10 src/backend/libpq/crypt.c | 10 src/backend/utils/misc/guc_tables.c | 9 +++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/libpq/crypt.h | 3 +++ src/test/regress/expected/password.out| 15 src/test/regress/expected/password_1.out | 9 +++ 16 files changed, 126 insertions(+) diff --git a/contrib/passwordcheck/expected/passwordcheck.out b/contrib/passwordcheck/expected/passwordcheck.out index 2027681daf..dfb2ccfe00 100644 --- a/contrib/passwordcheck/expected/passwordcheck.out +++ b/contrib/passwordcheck/expected/passwordcheck.out @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; -- ok diff --git a/contrib/passwordcheck/expected/passwordcheck_1.out b/contrib/passwordcheck/expected/passwordcheck_1.out index 5d8d5dcc1c..9519d60a49 100644 --- a/contrib/passwordcheck/expected/passwordcheck_1.out +++ b/contrib/passwordcheck/expected/passwordcheck_1.out @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; -- ok diff --git a/contrib/passwordcheck/sql/passwordcheck.sql b/contrib/passwordcheck/sql/passwordcheck.sql index 1fbd6b0e96..5953ece5c2 100644 --- a/contrib/passwordcheck/sql/passwordcheck.sql +++ b/contrib/passwordcheck/sql/passwordcheck.sql @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 964c819a02..0b9ca087c8 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1618,6 +1618,15 @@ will store the md5 hash of xyzzyjoe. + + +Support for MD5-encrypted passwords is deprecated and will be removed in a +future release of PostgreSQL. Refer to + for details about migrating to another +password type. + + + If the password is encrypted with SCRAM-SHA-256, it has the format: diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index 51343de7ca..60a1d16288 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -1260,6 +1260,14 @@ omicron bryanh guest1 server is encrypted for SCRAM (see below), then SCRAM-based authentication will automatically be chosen instead. + + + +Support for MD5-encrypted passwords is deprecated and will be removed +in a future release of PostgreSQL. Refer to +the text below for details about migrating to another password type. + + diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index dc401087dc..02311f6e3b 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1124,6 +1124,14 @@ include_dir 'conf.d' mechanism, and hence not work with passwords encrypted with SCRAM-SHA-256. See for more details. + + + Support for MD5-encrypted passwords is deprecated and will be removed + in a future release of PostgreSQL. Refer + to for details about migrating to + another password type. + + @@ -789
Re: Assertion failure when autovacuum drops orphan temp indexes.
On Sat, Oct 26, 2024 at 12:43:44AM -0700, Masahiko Sawada wrote: > I got the following assertion failure from an autovaucum worker: > > 2024-10-26 00:32:03.685 PDT [830672] LOG: autovacuum: dropping orphan > temp table "postgres.pg_temp_0.hoge" > TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: > "index.c", Line: 2345, PID: 830672 IMHO the best way to handle this is to just unconditionally push a snapshot in this code path instead of making assumptions about what callers will do. -- nathan diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 9162b9f81a..74d0f3097e 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -2336,13 +2336,9 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) /* * Updating pg_index might involve TOAST table access, so ensure we have a -* valid snapshot. We only expect to get here without a snapshot in the -* concurrent path. +* valid snapshot. */ - if (concurrent) - PushActiveSnapshot(GetTransactionSnapshot()); - else - Assert(HaveRegisteredOrActiveSnapshot()); + PushActiveSnapshot(GetTransactionSnapshot()); /* * fix INDEX relation, and check for expressional index @@ -2361,8 +2357,7 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) ReleaseSysCache(tuple); table_close(indexRelation, RowExclusiveLock); - if (concurrent) - PopActiveSnapshot(); + PopActiveSnapshot(); /* * if it has any expression columns, we might have stored statistics about
Re: use a non-locking initial test in TAS_SPIN on AArch64
On Wed, Oct 23, 2024 at 09:46:56AM -0500, Nathan Bossart wrote: > I have a couple of Apple M-series machines I can test, too. After some preliminary tests on an M3, I'm not seeing any gains outside the noise range. That's not too surprising because it's likely more difficult to create a lot of spinlock contention on these smaller machines. But, at the very least, I'm not seeing a regression. -- nathan
Re: use a non-locking initial test in TAS_SPIN on AArch64
On Wed, Oct 23, 2024 at 11:01:05AM +0800, Jingtang Zhang wrote: > The result looks great, but the discussion in [0] shows that the result may > vary among different ARM chips. Could you provide the chip model of this > test? So that we can do a cross validation of this patch. This is on a c8g.24xlarge, which is using Neoverse-V2 and Armv9.0-a [0]. > I'm willing to test it on Alibaba Cloud Yitian 710 if I have time. That would be great. I have a couple of Apple M-series machines I can test, too. [0] https://github.com/aws/aws-graviton-getting-started/blob/main/README.md#building-for-graviton -- nathan
use a non-locking initial test in TAS_SPIN on AArch64
My colleague Salvatore Dipietro (CC'd) sent me a couple of profiles that showed an enormous amount of s_lock() time going to the __sync_lock_test_and_set() call in the AArch64 implementation of tas(). Upon closer inspection, I noticed that we don't implement a custom TAS_SPIN() for this architecture, so I quickly hacked together the attached patch and ran a couple of benchmarks that stressed the spinlock code. I found no discussion about TAS_SPIN() on ARM in the archives, but I did notice that the initial AArch64 support was added [0] before x86_64 started using a non-locking test [1]. These benchmarks are for a c8g.24xlarge running a select-only pgbench with 256 clients and pg_stat_statements.track_planning enabled. without the patch: 90.04% postgres[.] s_lock 1.07% pg_stat_statements.so [.] pgss_store 0.59% postgres[.] LWLockRelease 0.56% postgres[.] perform_spin_delay 0.31% [kernel][k] arch_local_irq_enable |while (TAS_SPIN(lock)) |{ |perform_spin_delay(&delayStatus); 0.12 |2c: -> bl perform_spin_delay |tas(): |return __sync_lock_test_and_set(lock, 1); 0.01 |30: swpa w20, w1, [x19] |s_lock(): 99.87 | add x0, sp, #0x28 |while (TAS_SPIN(lock)) 0.00 |^ cbnz w1, 2c tps = 74135.100891 (without initial connection time) with the patch: 30.46% postgres[.] s_lock 5.88% postgres[.] perform_spin_delay 4.61% [kernel][k] arch_local_irq_enable 3.31% [kernel][k] next_uptodate_page 2.50% postgres[.] hash_search_with_hash_value |while (TAS_SPIN(lock)) |{ |perform_spin_delay(&delayStatus); 0.63 |2c:+-->add x0, sp, #0x28 0.07 | |-> bl perform_spin_delay | |while (TAS_SPIN(lock)) 0.25 |34:| ldr w0, [x19] 65.19 | +--cbnz w0, 2c |tas(): |return __sync_lock_test_and_set(lock, 1); 0.00 | swpa w20, w0, [x19] |s_lock(): 33.82 |^ cbnz w0, 2c tps = 549462.785554 (without initial connection time) [0] https://postgr.es/c/5c7603c [1] https://postgr.es/c/b03d196 -- nathan >From a69b7d38c40f0fb49f714c49bb45c292e7c8587d Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 22 Oct 2024 14:33:39 -0500 Subject: [PATCH v1 1/1] Use a non-locking initial test in TAS_SPIN on AArch64. --- src/include/storage/s_lock.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h index e94ed5f48b..07f343baf8 100644 --- a/src/include/storage/s_lock.h +++ b/src/include/storage/s_lock.h @@ -270,6 +270,12 @@ tas(volatile slock_t *lock) */ #if defined(__aarch64__) +/* + * On ARM64, it's a win to use a non-locking test before the TAS proper. It + * may be a win on 32-bit ARM, too, but nobody's tested it yet. + */ +#define TAS_SPIN(lock)(*(lock) ? 1 : TAS(lock)) + #define SPIN_DELAY() spin_delay() static __inline__ void -- 2.39.5 (Apple Git-154)
Re: Use more CppAsString2() in pg_amcheck.c
On Fri, Oct 18, 2024 at 06:50:50PM +0900, Michael Paquier wrote: > - appendPQExpBufferStr(&sql, "\nWHERE c.relpersistence != 't'"); > + appendPQExpBufferStr(&sql, "\nWHERE c.relpersistence != " > + > CppAsString2(RELPERSISTENCE_TEMP) " "); I think these whitespace changes may cause invalid statements to be produced in some code paths. IMHO those should be reserved for a separate patch, anyway. -- nathan
Re: Proposal for Updating CRC32C with AVX-512 Algorithm.
On Tue, Oct 08, 2024 at 08:19:27PM +, Devulapalli, Raghuveer wrote: > Hi all, I'm currently in the process of reviewing and analyzing Paul's > patch. In the meantime, I'm open to addressing any questions or feedback > you may have. I've proposed a patch to move the existing AVX-512 code in Postgres to use __attribute__((target("..."))) instead of per-translation-unit compiler flags [0]. We should likely do something similar for this one. [0] https://postgr.es/m/ZxAqRG1-8fJLMRUY%40nathan -- nathan
Re: Doc: shared_memory_size_in_huge_pages with the "SHOW" command.
On Fri, Oct 18, 2024 at 02:47:37PM +0900, Fujii Masao wrote: > On 2024/10/18 7:10, Michael Paquier wrote: >> Fine by me. Thanks. > > LGTM, too. Thanks! Committed. -- nathan
Re: Doc: shared_memory_size_in_huge_pages with the "SHOW" command.
On Thu, Oct 17, 2024 at 02:07:32PM +0900, Seino Yuki wrote: > I've fixed the patch and will register it in the CF. Thanks. Here is what I have staged for commit. -- nathan >From 8aff8ec54a6588614f952693f509b75bf240851c Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 17 Oct 2024 16:56:52 -0500 Subject: [PATCH v3 1/1] Adjust documentation for configuring Linux huge pages. The present wording about viewing shared_memory_size_in_huge_pages seems to suggest that the parameter cannot be viewed after startup at all, whereas the intent is to make it clear that you can't use "postgres -C" to view this GUC while the server is running. This commit rephrases this section to remove the ambiguity. Author: Seino Yuki Reviewed-by: Michael Paquier, David G. Johnston, Fujii Masao Discussion: https://postgr.es/m/420584fd274f9ec4f337da55ffb3b790%40oss.nttdata.com Backpatch-through: 15 --- doc/src/sgml/runtime.sgml | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 2c4d5ef640..a47fa67b38 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -1429,11 +1429,10 @@ export PG_OOM_ADJUST_VALUE=0 with CONFIG_HUGETLBFS=y and CONFIG_HUGETLB_PAGE=y. You will also have to configure the operating system to provide enough huge pages of the desired size. -To determine the number of huge pages needed, use the -postgres command to see the value of -. Note that the -server must be shut down to view this runtime-computed parameter. -This might look like: +The runtime-computed parameter + reports the number +of huge pages required. This parameter can be viewed before starting the +server with a postgres command like: $ postgres -D $PGDATA -C shared_memory_size_in_huge_pages 3170 -- 2.39.5 (Apple Git-154)
Re: Popcount optimization using AVX512
On Tue, Oct 08, 2024 at 09:36:03PM -0500, Nathan Bossart wrote: > On Wed, Jul 31, 2024 at 04:43:02PM -0500, Nathan Bossart wrote: >> On Wed, Jul 31, 2024 at 01:52:54PM -0700, Andres Freund wrote: >>> I think we'd be better off enabling architectural features on a per-function >>> basis, roughly like this: >>> >>> [...] >>> >>> /* FIXME: Should be gated by configure check of -mavx512vpopcntdq >>> -mavx512bw support */ >>> pg_enable_target("avx512vpopcntdq,avx512bw") >>> uint64_t >>> pg_popcount_avx512(const char *buf, int bytes) >>> ... >> >> I remember wondering why the CRC-32C code wasn't already doing something >> like this (old compiler versions? non-gcc-like compilers?), and I'm not >> sure I ever discovered the reason, so out of an abundance of caution I used >> the same approach for AVX-512. If we can convince ourselves that >> __attribute__((target("..."))) is standard enough at this point, +1 for >> moving to that. > > [...] > > So, at least for the CRC code, __attribute__((target("..."))) was probably > not widely available enough yet when it was first added. Unfortunately, > the ARMv8 CRC target support (without -march) is still pretty new, but it > might be possible to switch the others to a per-function approach in v18. Here is a first attempt at using __attribute__((target("..."))) for the AVX-512 stuff. Besides allowing us to consolidate the code into a single file, this simplifies the build file changes quite a bit. -- nathan >From c97e25e56347c90f169a5ce069a9ea06c873915b Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 16 Oct 2024 15:57:55 -0500 Subject: [PATCH v1 1/1] use __attribute__((target(...))) for AVX-512 stuff --- config/c-compiler.m4 | 60 +- configure| 163 ++- configure.ac | 17 +-- meson.build | 17 +-- src/Makefile.global.in | 5 - src/include/c.h | 10 ++ src/makefiles/meson.build| 4 +- src/port/Makefile| 12 +- src/port/meson.build | 7 +- src/port/pg_popcount_avx512.c| 86 +- src/port/pg_popcount_avx512_choose.c | 102 - 11 files changed, 171 insertions(+), 312 deletions(-) delete mode 100644 src/port/pg_popcount_avx512_choose.c diff --git a/config/c-compiler.m4 b/config/c-compiler.m4 index 10f8c7bd0a..aa90f8ef33 100644 --- a/config/c-compiler.m4 +++ b/config/c-compiler.m4 @@ -700,20 +700,20 @@ undefine([Ac_cachevar])dnl # Check if the compiler supports the XSAVE instructions using the _xgetbv # intrinsic function. # -# An optional compiler flag can be passed as argument (e.g., -mxsave). If the -# intrinsic is supported, sets pgac_xsave_intrinsics and CFLAGS_XSAVE. +# If the intrinsics are supported, sets pgac_xsave_intrinsics. AC_DEFUN([PGAC_XSAVE_INTRINSICS], -[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics_$1])])dnl -AC_CACHE_CHECK([for _xgetbv with CFLAGS=$1], [Ac_cachevar], -[pgac_save_CFLAGS=$CFLAGS -CFLAGS="$pgac_save_CFLAGS $1" -AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ], - [return _xgetbv(0) & 0xe0;])], +[define([Ac_cachevar], [AS_TR_SH([pgac_cv_xsave_intrinsics])])dnl +AC_CACHE_CHECK([for _xgetbv], [Ac_cachevar], +[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include +__attribute__((target("xsave"))) +static int xsave_test(void) +{ + return _xgetbv(0) & 0xe0; +}], + [return xsave_test();])], [Ac_cachevar=yes], - [Ac_cachevar=no]) -CFLAGS="$pgac_save_CFLAGS"]) + [Ac_cachevar=no])]) if test x"$Ac_cachevar" = x"yes"; then - CFLAGS_XSAVE="$1" pgac_xsave_intrinsics=yes fi undefine([Ac_cachevar])dnl @@ -725,29 +725,27 @@ undefine([Ac_cachevar])dnl # _mm512_setzero_si512, _mm512_maskz_loadu_epi8, _mm512_popcnt_epi64, # _mm512_add_epi64, and _mm512_reduce_add_epi64 intrinsic functions. # -# Optional compiler flags can be passed as argument (e.g., -mavx512vpopcntdq -# -mavx512bw). If the intrinsics are supported, sets -# pgac_avx512_popcnt_intrinsics and CFLAGS_POPCNT. +# If the intrinsics are supported, sets pgac_avx512_popcnt_intrinsics. AC_DEFUN([PGAC_AVX512_POPCNT_INTRINSICS], -[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx512_popcnt_intrinsics_$1])])dnl -AC_CACHE_CHECK([for _mm512_popcnt_epi64 with CFLAGS=$1], [Ac_cachevar], -[pgac_save_CFLAGS=$CFLAGS -CFLAGS="$pgac_save_CFLAGS $1" -AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ], - [const char buf@<:@sizeof(__m512i)@:>@; - PG_INT64_TYPE popcnt = 0; - __m512i accum = _mm512_setzero_si512(); - const __m512i val = _mm512_maskz_loadu_epi8((__mmask64) 0xf0f0f0f0f0f0f0f0, (const __m512i *)
Re: Misleading error "permission denied for table"
On Wed, Oct 16, 2024 at 07:36:29PM +0530, Ashutosh Bapat wrote: > In privileges.sql there are tests for column level privileges e.g. > > INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) DO UPDATE set > three = 10 RETURNING atest5.three; > ERROR: permission denied for table atest5 > > In the above case the current user regress_priv_user4, doesn't have > privileges to access atest5.three. But the error does not mention > atest5.three anywhere. In fact, if the same query were to be changed > to return atest5.four, it would succeed since the user has privileges > to access column atest5.four. > > Shouldn't we report "permission defined for column atest5.three? We do have "permission denied for column" messages in aclchk.c (e.g., aclcheck_error_col()), but I don't see them actually used anywhere (or at least they don't show up in any expected regression test output). I'm inclined to agree that a more specific error would be nice, but I worry there's some hidden complexity that's prevented it thus far... -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Wed, Oct 16, 2024 at 10:24:32AM +0900, Michael Paquier wrote: > On Tue, Oct 15, 2024 at 08:20:17PM -0500, Nathan Bossart wrote: >> I assume all of this will get compiled out in non-USE_ASSERT_CHECKING >> builds as-is, but I see no problem with surrounding it with an #ifdef to be >> sure. > > Yeah, I'm not sure that that would always be the case when optimized. > Code generated can be dumb sometimes even if compilers got much > smarter in the last 10 years or so (not compiler guy here). Here is a new version of the patch set with this #ifdef added. I plan to give each of the code paths adjusted by 0001 a closer look, but otherwise I'm hoping to commit these soon. -- nathan >From b9444fd22d8dcc05e3a27817943be7f5296503fa Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 14 Oct 2024 14:43:26 -0500 Subject: [PATCH v4 1/3] Ensure we have a snapshot when updating various system catalogs. Reported-by: Alexander Lakhin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: FIXME --- src/backend/commands/tablecmds.c| 13 +++ src/backend/replication/logical/tablesync.c | 21 ++ src/backend/replication/logical/worker.c| 24 + 3 files changed, 58 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 1ccc80087c..c6f82e93b9 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19283,9 +19283,22 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* +* Detaching the partition might involve TOAST table access, so ensure we +* have a valid snapshot. We only expect to get here without a snapshot +* in the concurrent path. +*/ + if (concurrent) + PushActiveSnapshot(GetTransactionSnapshot()); + else + Assert(HaveRegisteredOrActiveSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + if (concurrent) + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index e03e761392..d3fbb18438 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) replorigin_session_origin_lsn = InvalidXLogRecPtr; replorigin_session_origin_timestamp = 0; + /* +* Updating pg_replication_origin might involve TOAST table access, so +* ensure we have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Drop the tablesync's origin tracking if exists. * @@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) */ replorigin_drop_by_name(originname, true, false); + PopActiveSnapshot(); + finish_sync_worker(); } else @@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn) started_tx = true; } + /* +* Updating pg_replication_origin might involve TOAST table +* access, so ensure we have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Remove the tablesync origin tracking if exists. * @@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn) sizeof(originname)); replorigin_drop_by_name(originname, true, false); + PopActiveSnapshot(); + /* * Update the state to READY only after the origin cleanup. */ @@ -1455,8 +1471,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) * Then advance to the LSN got from walrcv_create_slot. This is WAL * logged for the purpose of recovery. Locks are to prevent the
Re: sunsetting md5 password support
On Fri, Oct 11, 2024 at 04:36:27PM -0500, Nathan Bossart wrote: > Here is a first attempt at a patch for marking MD5 passwords as deprecated. > It's quite bare-bones at the moment, so I anticipate future revisions will > add more content. Besides sprinkling several deprecation notices > throughout the documentation, this patch teaches CREATE ROLE and ALTER ROLE > to emit warnings when setting MD5 passwords. A new GUC named > md5_password_warnings can be set to "off" to disable these warnings. I > considered adding even more warnings (e.g., when authenticating), but I > felt that would be far too noisy. In v2, I've added an entry for the new md5_password_warnings GUC to the documentation, and I've simplified the passwordcheck test changes a bit. -- nathan >From e7786f4d2e4c892d34c7992ffddbd1cf9e109be7 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 11 Oct 2024 16:21:09 -0500 Subject: [PATCH v2 1/1] Deprecate MD5 passwords. MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: ??? Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan --- .../passwordcheck/expected/passwordcheck.out | 1 + .../expected/passwordcheck_1.out | 1 + contrib/passwordcheck/sql/passwordcheck.sql | 1 + doc/src/sgml/catalogs.sgml| 9 +++ doc/src/sgml/client-auth.sgml | 8 +++ doc/src/sgml/config.sgml | 24 +++ doc/src/sgml/libpq.sgml | 9 +++ doc/src/sgml/protocol.sgml| 8 +++ doc/src/sgml/ref/create_role.sgml | 8 +++ doc/src/sgml/runtime.sgml | 10 src/backend/libpq/crypt.c | 10 src/backend/utils/misc/guc_tables.c | 9 +++ src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/libpq/crypt.h | 3 +++ src/test/modules/test_misc/t/003_check_guc.pl | 2 +- src/test/regress/expected/password.out| 15 src/test/regress/expected/password_1.out | 9 +++ 17 files changed, 127 insertions(+), 1 deletion(-) diff --git a/contrib/passwordcheck/expected/passwordcheck.out b/contrib/passwordcheck/expected/passwordcheck.out index 2027681daf..dfb2ccfe00 100644 --- a/contrib/passwordcheck/expected/passwordcheck.out +++ b/contrib/passwordcheck/expected/passwordcheck.out @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; -- ok diff --git a/contrib/passwordcheck/expected/passwordcheck_1.out b/contrib/passwordcheck/expected/passwordcheck_1.out index 5d8d5dcc1c..9519d60a49 100644 --- a/contrib/passwordcheck/expected/passwordcheck_1.out +++ b/contrib/passwordcheck/expected/passwordcheck_1.out @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; -- ok diff --git a/contrib/passwordcheck/sql/passwordcheck.sql b/contrib/passwordcheck/sql/passwordcheck.sql index 1fbd6b0e96..5953ece5c2 100644 --- a/contrib/passwordcheck/sql/passwordcheck.sql +++ b/contrib/passwordcheck/sql/passwordcheck.sql @@ -1,3 +1,4 @@ +SET md5_password_warnings = off; LOAD 'passwordcheck'; CREATE USER regress_passwordcheck_user1; diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 964c819a02..0b9ca087c8 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1618,6 +1618,15 @@ will store the md5 hash of xyzzyjoe. + + +Support for MD5-encrypted passwords is deprecated and will be removed in a +future release of PostgreSQL. Refer to + for details about migrating to another +password type. + + + If the password is encrypted with SCRAM-SHA-256, it has the format: diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index 51343de7ca..60a1d16288 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -1260,6 +1260,14 @@ omicron bryanh guest1 server is encrypted for SCRAM (see below), then SCRAM-based authentication will automatically be chosen instead. + + + +Support f
Re: Doc: shared_memory_size_in_huge_pages with the "SHOW" command.
On Wed, Oct 16, 2024 at 03:31:05PM +0900, Fujii Masao wrote: > Another idea is to use the same wording as for num_os_semaphores in > runtime.sgml, like this: > > This parameter can be viewed > before starting the server with a postgres command > like: WFM -- nathan
Re: Doc: shared_memory_size_in_huge_pages with the "SHOW" command.
On Wed, Oct 16, 2024 at 09:31:49AM +0900, Michael Paquier wrote: > I am not sure that neither mentioning nor recommending SHOW for this > area is a good idea, the point being to encourage its use before the > server is running as we don't want to allocate anything when tuning > it. Agreed. > Point taken that the current wording can be confusing because of the > "*must* be shut down". How about doing a simpler s/must/should/ for a > softer wording? Well, the point is that you can't use "postgres -C" to view this GUC while the server is running. I can see how the present wording might be misinterpreted, though. Perhaps we should change this sentence to something like: Note that the server must be shut down to view this runtime-computed parameter with the postgres command. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Wed, Oct 16, 2024 at 09:12:31AM +0900, Michael Paquier wrote: > - if (!ctx->rel->rd_rel->reltoastrelid) > + if (!OidIsValid(RelationGetToastRelid(ctx->rel))) > > This set of diffs in 0002 is a nice cleanup. I'd wish for relying > less on zero comparitons when assuming that InvalidOid is in use. I'm wondering if there's any concern about this one causing back-patching pain. If so, I can just add the macro for use in new code. > +static inline void > +AssertHasSnapshotForToast(Relation rel) > +{ > + /* bootstrap mode in particular breaks this rule */ > + if (!IsNormalProcessingMode()) > + return; > + > + /* if the relation doesn't have a TOAST table, we are good */ > + if (!OidIsValid(RelationGetToastRelid(rel))) > + return; > + > + Assert(HaveRegisteredOrActiveSnapshot()); > +} > > Using a separate inlined routine is indeed cleaner as you have > documented the assumptions behind the check. Wouldn't it be better to > use a USE_ASSERT_CHECKING block? These two checks for normal > processing and toastrelid are cheap lookups, but we don't need them at > all in non-assert paths, so I'd suggest to ignore them entirely for > the non-USE_ASSERT_CHECKING case. I assume all of this will get compiled out in non-USE_ASSERT_CHECKING builds as-is, but I see no problem with surrounding it with an #ifdef to be sure. -- nathan
Re: Make foreach_ptr macro work in C++ extensions
Committed. -- nathan
Re: Detailed release notes
On Tue, Oct 15, 2024 at 02:23:15PM -0400, Bruce Momjian wrote: > On Tue, Oct 15, 2024 at 02:21:31PM -0400, Tom Lane wrote: >> Bruce Momjian writes: >> > On Tue, Oct 15, 2024 at 11:04:50AM -0700, Jacob Champion wrote: >> >> Maybe plus-minus (±), to represent the diff? >> >> > Uh, that might make sense to people who vie diffs often, but not the >> > average reader of the document. >> >> Well, the § symbol hasn't got that much mnemonic value either. >> >> However ... wasn't one of the arguments for § that there's some >> other project(s) already using it for this purpose? If so, I think >> we'd be better off joining the crowd. > > I don't remember anyone else using it, but its purpose to point to a > place with more detail makes logical sense. Another option could be delta (Δ or δ). That's not LATIN-1, but I'd imagine it's relatively well supported at this point. -- nathan
Re: Improve node type forward reference
On Tue, Oct 15, 2024 at 09:02:48AM +0200, Peter Eisentraut wrote: > On 14.10.24 23:28, Nathan Bossart wrote: >> On Mon, Oct 14, 2024 at 09:47:59AM +0200, Peter Eisentraut wrote: >> > But we can do this better by using an incomplete struct, like >> > >> > struct Query *viewQuery ...; >> > >> > That way, everything has the correct type and fewer casts are required. >> > This >> > technique is already used elsewhere in node type definitions. >> >> I noticed that the examples in parsenodes.h are for structs defined within >> the same file. If the struct is defined in a separate file, I guess you >> might need to include another header file wherever it is used, but that >> doesn't seem too bad. > > No, you can leave the struct incomplete. You only need to provide its full > definition (= include the other header file) if you actually want to access > the struct's fields. That's what I figured. This one LGTM, too, then. -- nathan
Re: Improve node type forward reference
On Mon, Oct 14, 2024 at 09:47:59AM +0200, Peter Eisentraut wrote: > But we can do this better by using an incomplete struct, like > > struct Query *viewQuery ...; > > That way, everything has the correct type and fewer casts are required. This > technique is already used elsewhere in node type definitions. I noticed that the examples in parsenodes.h are for structs defined within the same file. If the struct is defined in a separate file, I guess you might need to include another header file wherever it is used, but that doesn't seem too bad. > The second patch just removes some more unnecessary casts around > copyObject() that I found while working on this. LGTM -- nathan
Re: Make foreach_ptr macro work in C++ extensions
On Mon, Oct 14, 2024 at 10:05:42PM +0200, Jelte Fennema-Nio wrote: > I've been writing a C++ extension recently and it turns out that the > foreach_ptr macro added in PG17 by one of my patches causes a > compilation failure when used in C++. This is due to C++ its stricter > rules around implicit casts from void pointers. Whoops. > Attached is a tiny patch that fixes that by adding an explicit cast to > the macro. Looks reasonable to me. > I think it would be good to backpatch this to PG17 as well, > as to not introduce some differences in the macro across versions. I'd consider this a "low-risk" fix, which our versioning page [0] says is acceptable for a minor release, so I'll plan on back-patching unless someone objects. [0] https://www.postgresql.org/support/versioning/ -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Tue, Oct 08, 2024 at 01:50:52PM -0500, Nathan Bossart wrote: > 0002 could probably use some more commentary, but otherwise I think it is > in decent shape. You (Michael) seem to be implying that I should > back-patch the actual fixes and only apply the new assertions to v18. Am I > understanding you correctly? Here is a reorganized patch set. 0001 would be back-patched, but the others would only be applied to v18. -- nathan >From 6792eeb656e8fc707d85dc07595e2183853d2ce5 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 14 Oct 2024 14:43:26 -0500 Subject: [PATCH v3 1/3] Ensure we have a snapshot when updating various system catalogs. Reported-by: Alexander Lakhin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: FIXME --- src/backend/commands/tablecmds.c| 13 +++ src/backend/replication/logical/tablesync.c | 21 ++ src/backend/replication/logical/worker.c| 24 + 3 files changed, 58 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 1ccc80087c..c6f82e93b9 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19283,9 +19283,22 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* +* Detaching the partition might involve TOAST table access, so ensure we +* have a valid snapshot. We only expect to get here without a snapshot +* in the concurrent path. +*/ + if (concurrent) + PushActiveSnapshot(GetTransactionSnapshot()); + else + Assert(HaveRegisteredOrActiveSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + if (concurrent) + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index e03e761392..d3fbb18438 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) replorigin_session_origin_lsn = InvalidXLogRecPtr; replorigin_session_origin_timestamp = 0; + /* +* Updating pg_replication_origin might involve TOAST table access, so +* ensure we have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Drop the tablesync's origin tracking if exists. * @@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn) */ replorigin_drop_by_name(originname, true, false); + PopActiveSnapshot(); + finish_sync_worker(); } else @@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn) started_tx = true; } + /* +* Updating pg_replication_origin might involve TOAST table +* access, so ensure we have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Remove the tablesync origin tracking if exists. * @@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn) sizeof(originname)); replorigin_drop_by_name(originname, true, false); + PopActiveSnapshot(); + /* * Update the state to READY only after the origin cleanup. */ @@ -1455,8 +1471,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) * Then advance to the LSN got from walrcv_create_slot. This is WAL * logged for the purpose of recovery. Locks are to prevent the * replication origin from vanishing while advancing. +* +* Updating pg_replication_origin might involve TOAST table access, so +* ensure we have a valid snapshot. */ + PushActiveSnapshot
Re: sunsetting md5 password support
On Fri, Oct 11, 2024 at 09:47:58AM -0400, Andrew Dunstan wrote: > On 2024-10-10 Th 6:28 PM, Tom Lane wrote: >> On the whole I agree with Heikki's comment that we should just >> do it (disallow MD5, full stop) whenever we feel that enough >> time has passed. These intermediate states are mostly going to >> add headaches. Maybe we could do something with an intermediate >> release that just emits warnings, without any feature changes. > > I also agree with this. Here is a first attempt at a patch for marking MD5 passwords as deprecated. It's quite bare-bones at the moment, so I anticipate future revisions will add more content. Besides sprinkling several deprecation notices throughout the documentation, this patch teaches CREATE ROLE and ALTER ROLE to emit warnings when setting MD5 passwords. A new GUC named md5_password_warnings can be set to "off" to disable these warnings. I considered adding even more warnings (e.g., when authenticating), but I felt that would be far too noisy. -- nathan >From a20f43d93562c4234de228ae3f0afae12750d7a0 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 11 Oct 2024 16:21:09 -0500 Subject: [PATCH v1 1/1] Deprecate MD5 passwords. MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: ??? Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan --- contrib/passwordcheck/expected/passwordcheck.out | 3 +++ .../passwordcheck/expected/passwordcheck_1.out| 6 ++ doc/src/sgml/catalogs.sgml| 9 + doc/src/sgml/client-auth.sgml | 8 doc/src/sgml/config.sgml | 8 doc/src/sgml/libpq.sgml | 9 + doc/src/sgml/protocol.sgml| 8 doc/src/sgml/ref/create_role.sgml | 8 doc/src/sgml/runtime.sgml | 10 ++ src/backend/libpq/crypt.c | 10 ++ src/backend/utils/misc/guc_tables.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/libpq/crypt.h | 3 +++ src/test/modules/test_misc/t/003_check_guc.pl | 2 +- src/test/regress/expected/password.out| 15 +++ src/test/regress/expected/password_1.out | 9 + 16 files changed, 117 insertions(+), 1 deletion(-) diff --git a/contrib/passwordcheck/expected/passwordcheck.out b/contrib/passwordcheck/expected/passwordcheck.out index 2027681daf..2f91b53b8a 100644 --- a/contrib/passwordcheck/expected/passwordcheck.out +++ b/contrib/passwordcheck/expected/passwordcheck.out @@ -13,6 +13,9 @@ ALTER USER regress_passwordcheck_user1 PASSWORD 'alessnicelongpassword'; ERROR: password must contain both letters and nonletters -- encrypted ok (password is "secret") ALTER USER regress_passwordcheck_user1 PASSWORD 'md592350e12ac34e52dd598f90893bb3ae7'; +WARNING: setting an MD5-encrypted password +DETAIL: MD5 password support is deprecated and will be removed in a future release of PostgreSQL. +HINT: Refer to the PostgreSQL documentation for details about migrating to another password type. -- error: password is user name ALTER USER regress_passwordcheck_user1 PASSWORD 'md507a112732ed9f2087fa90b192d44e358'; ERROR: password must not equal user name diff --git a/contrib/passwordcheck/expected/passwordcheck_1.out b/contrib/passwordcheck/expected/passwordcheck_1.out index 5d8d5dcc1c..ebb72956b0 100644 --- a/contrib/passwordcheck/expected/passwordcheck_1.out +++ b/contrib/passwordcheck/expected/passwordcheck_1.out @@ -13,6 +13,12 @@ ALTER USER regress_passwordcheck_user1 PASSWORD 'alessnicelongpassword'; ERROR: password must contain both letters and nonletters -- encrypted ok (password is "secret") ALTER USER regress_passwordcheck_user1 PASSWORD 'md592350e12ac34e52dd598f90893bb3ae7'; +WARNING: setting an MD5-encrypted password +DETAIL: MD5 password support is deprecated and will be removed in a future release of PostgreSQL. +HINT: Refer to the PostgreSQL documentation for details about migrating to another password type. -- error: password is u
Re: Function for listing pg_wal/summaries directory
On Fri, Oct 11, 2024 at 11:09:30AM +0900, Fujii Masao wrote: > On 2024/10/08 23:36, Nathan Bossart wrote: >> The patch posted upthread looks reasonable to me, so I'll go commit it soon >> unless there is any feedback. > > Thanks! The patch looks good to me, too. Committed. -- nathan
Re: Statistics Import and Export
On Thu, Oct 10, 2024 at 03:49:16PM -0400, Corey Huinker wrote: >> One other question I had when looking at this patch is whether we could >> remove dataOnly/schemaOnly from DumpOptions and RestoreOptions. Once 0007 >> is applied, those variables become particularly hazardous, so we really >> want to prevent folks from using them in new code. > > Well, the very next patch in the series adds --statistics-only, so I don't > think we're getting rid of user-facing command switches. However, I could > see us taking away the dataOnly/schemaOnly internal variables, thus > preventing coders from playing with those sharp objects. That's what I meant. The user-facing options would stay the same, but the internal variables would be local to main() so that other functions would be forced to use dumpData, dumpSchema, etc. -- nathan
Re: sunsetting md5 password support
On Thu, Oct 10, 2024 at 02:11:53AM +0300, Heikki Linnakangas wrote: > My feeling is that it would be less confusing to users to just disallow md5 > passwords in one release. I'm not sure these intermediate steps are really > doing anyone any favors. As I'm reading the various responses in this thread, I do find myself leaning in this direction. My intent with the incremental approach was to provide gentle reminders to migrate for a few years before removing support completely, but I suppose there will always be a subset of users that will wait until we actually follow through. If we went this route, we could still do step 1 (add deprecation notices), but there would just be one more step along the lines of "after X years, remove all support." (Or maybe we would remove server support after X years and then remove libpq support after Y more years.) In general, it seems like folks are generally onboard with removing MD5 password support. For v18, the only thing I'm hoping to accomplish is to get the deprecation notices added, so I will start writing a patch for that. Perhaps we should also consider adding WARNINGs whenever folks use MD5 passwords in any fashion (with a corresponding GUC to turn those off). -- nathan
Re: Pg17 Crash in Planning (Arrays + Casting + UDF)
On Wed, Oct 09, 2024 at 04:21:53PM -0400, Tom Lane wrote: > Paul Ramsey writes: >> This extremely odd case [2] came in via a report using a lot of PostGIS >> functions, but it can be reconfigured into a pure-PostgreSQL crasher >> [1]. > > Thanks for the report! Looks like estimate_array_length() is > incautiously assuming that the "root" pointer it receives will > never be NULL. Yup, git-bisect points me to commit 9391f71. > We could change their signatures ... but it's explicitly documented > that eval_const_expressions allows NULL for root, so there would > presumably still be code paths that'd fail. It looks like the only > safe fix is to ensure that estimate_array_length will cope with NULL > for root. Do you mean something like this? -else if (arrayexpr) +else if (arrayexpr && root != NULL) That'd at least be no worse than how it worked before estimate_array_length() tried to use statistics, so that seems reasonable to me. -- nathan
sunsetting md5 password support
In this message, I propose a multi-year, incremental approach to remove MD5 password support from Postgres. The problems with MD5 password hashes in Postgres are well-understood, so I won't discuss them in too much detail here. But suffice it to say that MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time [0], and cracking MD5-hashed passwords is trivial on modern hardware [1]. Furthermore, MD5 password hashes in Postgres are vulnerable to pass-the-hash attacks [2] [3], i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and AFAIK is generally considered far superior. Since v14, this method has been the default for the password_encryption parameter, which determines the algorithm to use to store new passwords on disk (unless the password has already been hashed by the client, as is recommended). Given there is a battle-tested alternative to MD5, I propose we take the following steps. I am not wedded to the exact details, but I feel that this would be a reasonably conservative path forward. 1. In v18, continue to support MD5 passwords, but place several notes in the documentation and release notes that unambiguously indicate that MD5 password support is deprecated and will be removed in a future release. 2. In v19, allow upgrading with MD5 passwords and allow authenticating with them, but disallow creating new ones (i.e., restrict/remove password_encryption and don't allow setting pre-hashed MD5 passwords). 3. In v20, allow upgrading with MD5 passwords, but disallow using them for authentication. Users would only be able to update these passwords to SCRAM-SHA-256 after upgrading. 4. In v21, disallow upgrading with MD5 passwords. At this point, there should be no remaining MD5 password support in Postgres. With this plan, the first version with all MD5 password support removed would be released in 2028. Considering SCRAM-SHA-256 was first introduced in 2017 and has been the default for password_encryption since 2021, users will have had several years to migrate. Thoughts? [0] https://en.wikipedia.org/wiki/MD5#Security [1] https://www.postgresql.org/docs/devel/pgcrypto.html#PGCRYPTO-HASH-SPEED-TABLE [2] https://hashcat.net/misc/postgres-pth/postgres-pth.pdf [3] https://www.postgresql.org/docs/devel/auth-password.html -- nathan
Re: Remove deprecated -H option from oid2name
On Wed, Oct 09, 2024 at 06:45:07PM +0200, Daniel Gustafsson wrote: >> On 9 Oct 2024, at 18:33, Nathan Bossart wrote: >> >> On Wed, Oct 09, 2024 at 10:08:31AM +0200, Daniel Gustafsson wrote: >>> The -H option to oid2name was deprecated in v12, and with v12 going out of >>> support it seems about time to remove it for v18. Not the most >>> groundbreaking >>> cleanup, but also no point in carrying it around for 5 more years. >> >> Seems fine to me, although I don't see a problem with keeping it, either. > > There is little to no cost in keeping it, but then we might as well remove the > deprecation notice. If we keep marking things deprecated without removing > them > then we risk being seen as crying wolf and users stop caring to update away > from things we really want to deprecate. I tend to agree, but I still think it's worth considering whether removing a few lines of code is worth breaking folks' scripts. In this case, I'm not too worried... Another problem is that "deprecated" may or may not imply that the feature will be removed in the future. IMHO we should be clear about that when we intend to remove something down the road (e.g., "this flag is deprecated and will be removed in a future major release of PostgreSQL"). -- nathan
Re: Remove deprecated -H option from oid2name
On Wed, Oct 09, 2024 at 10:08:31AM +0200, Daniel Gustafsson wrote: > The -H option to oid2name was deprecated in v12, and with v12 going out of > support it seems about time to remove it for v18. Not the most groundbreaking > cleanup, but also no point in carrying it around for 5 more years. Seems fine to me, although I don't see a problem with keeping it, either. -- nathan
Re: Popcount optimization using AVX512
On Wed, Jul 31, 2024 at 04:43:02PM -0500, Nathan Bossart wrote: > On Wed, Jul 31, 2024 at 01:52:54PM -0700, Andres Freund wrote: >> I think we'd be better off enabling architectural features on a per-function >> basis, roughly like this: >> >> [...] >> >> /* FIXME: Should be gated by configure check of -mavx512vpopcntdq -mavx512bw >> support */ >> pg_enable_target("avx512vpopcntdq,avx512bw") >> uint64_t >> pg_popcount_avx512(const char *buf, int bytes) >> ... > > I remember wondering why the CRC-32C code wasn't already doing something > like this (old compiler versions? non-gcc-like compilers?), and I'm not > sure I ever discovered the reason, so out of an abundance of caution I used > the same approach for AVX-512. If we can convince ourselves that > __attribute__((target("..."))) is standard enough at this point, +1 for > moving to that. I looked into this some more, and found the following: * We added SSE 4.2 CRC support in April 2015 (commit 3dc2d62). gcc support for __attribute__((target("sse4.2"))) was added in 4.9.0 (April 2014). clang support was added in 3.8 (March 2016). * We added ARMv8 CRC support in April 2018 (commit f044d71). gcc support for __attribute__((target("+crc"))) was added in 6.3 (December 2016). I didn't find precisely when clang support was added, but until 16.0.0 (March 2023), including arm_acle.h requires the -march flag [0], and you had to use "crc" (plus sign omitted) as the target [1]. * We added AVX-512 support in April 2024 (commit 792752a). gcc support for __attribute__((target("avx512vpopcntdq,avx512bw"))) was added in 7.1 (May 2017). clang support was added in 5.0.0 (September 2017). However, the "xsave" target was not supported until 9.1 for gcc (May 2019) and 9.0.0 for clang (September 2019), and we need that for our AVX-512 code, too. So, at least for the CRC code, __attribute__((target("..."))) was probably not widely available enough yet when it was first added. Unfortunately, the ARMv8 CRC target support (without -march) is still pretty new, but it might be possible to switch the others to a per-function approach in v18. [0] https://github.com/llvm/llvm-project/commit/30b67c6 [1] https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#arm-and-aarch64-support -- nathan
Re: Avoid possible overflow (src/port/bsearch_arg.c)
On Tue, Oct 08, 2024 at 04:09:00PM -0300, Ranier Vilela wrote: > The port function *bsearch_arg* mimics the C function > *bsearch*. > > The API signature is: > void * > bsearch_arg(const void *key, const void *base0, > size_t nmemb, size_t size, > int (*compar) (const void *, const void *, void *), > void *arg) > > So, the parameter *nmemb* is size_t. > Therefore, a call with nmemb greater than INT_MAX is possible. > > Internally the code uses the *int* type to iterate through the number of > members, which makes overflow possible. I traced this back to commit bfa2cee (v14), which both moved bsearch_arg() to its current location and adjusted the style a bit. Your patch looks reasonable to me. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Wed, Sep 25, 2024 at 01:05:26PM +0900, Michael Paquier wrote: > On Tue, Sep 24, 2024 at 02:26:08PM -0500, Nathan Bossart wrote: >> So... maybe we >> should just remove pg_replication_origin's TOAST table instead... > > I'd rather keep it, FWIW. Contrary to pg_authid it does not imply > problems at the same scale because we would have access to the toast > relation in all the code paths with logical workers or table syncs. > The other one was at early authentication stages. Okay. > It sounds to me that we should be much more proactive in detecting > these failures and add something like that on HEAD. That's cheap > enough. As the checks are the same for all these code paths, perhaps > just hide them behind a local macro to reduce the duplication? In v2, I moved the assertions to a new function called by the heapam.c routines. I was hoping to move them to the tableam.h routines, but several callers (in particular, the catalog/indexing.c ones that are causing problems) call the heap ones directly. I've also included a 0001 patch that introduces a RelationGetToastRelid() macro because I got tired of typing "rel->rd_rel->reltoastrelid". 0002 could probably use some more commentary, but otherwise I think it is in decent shape. You (Michael) seem to be implying that I should back-patch the actual fixes and only apply the new assertions to v18. Am I understanding you correctly? -- nathan >From 3eb1c0997e18d9e7a56cc3e974076e58613a1bb9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 8 Oct 2024 11:28:58 -0500 Subject: [PATCH v2 1/2] add RelationGetToastRelid macro --- contrib/amcheck/verify_heapam.c | 6 +++--- src/backend/access/common/toast_internals.c | 2 +- src/backend/access/heap/heaptoast.c | 6 +++--- src/backend/catalog/heap.c| 2 +- src/backend/catalog/index.c | 2 +- src/backend/catalog/toasting.c| 2 +- src/backend/commands/cluster.c| 19 ++- src/backend/commands/indexcmds.c | 4 ++-- src/backend/commands/tablecmds.c | 8 src/backend/commands/vacuum.c | 2 +- .../replication/logical/reorderbuffer.c | 4 ++-- src/backend/utils/adt/dbsize.c| 4 ++-- src/include/utils/rel.h | 6 ++ 13 files changed, 37 insertions(+), 30 deletions(-) diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c index f2526ed63a..78316daacd 100644 --- a/contrib/amcheck/verify_heapam.c +++ b/contrib/amcheck/verify_heapam.c @@ -367,12 +367,12 @@ verify_heapam(PG_FUNCTION_ARGS) } /* Optionally open the toast relation, if any. */ - if (ctx.rel->rd_rel->reltoastrelid && check_toast) + if (RelationGetToastRelid(ctx.rel) && check_toast) { int offset; /* Main relation has associated toast relation */ - ctx.toast_rel = table_open(ctx.rel->rd_rel->reltoastrelid, + ctx.toast_rel = table_open(RelationGetToastRelid(ctx.rel), AccessShareLock); offset = toast_open_indexes(ctx.toast_rel, AccessShareLock, @@ -1721,7 +1721,7 @@ check_tuple_attribute(HeapCheckContext *ctx) } /* The relation better have a toast table */ - if (!ctx->rel->rd_rel->reltoastrelid) + if (!OidIsValid(RelationGetToastRelid(ctx->rel))) { report_corruption(ctx, psprintf("toast value %u is external but relation has no toast relation", diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c index 90d0654e62..39d93d12f2 100644 --- a/src/backend/access/common/toast_internals.c +++ b/src/backend/access/common/toast_internals.c @@ -151,7 +151,7 @@ toast_save_datum(Relation rel, Datum value, * uniqueness of the OID we assign to the toasted item, even though it has * additional columns besides OID. */ - toastrel = table_open(rel->rd_rel->reltoastrelid, RowExclusiveLock); + toastrel = table_open(RelationGetToastRelid(rel), RowExclusiveLock); toasttupDesc = toastrel->rd_att; /* Open all the toast indexes and look for the valid one */ diff --git a/src/backend/access/heap/heaptoast.c b/src/backend/access/heap/heaptoast.c index a420e16530..cf6f3dff53 100644 --- a/src/backend/access/heap/heaptoast.c +++ b/src/backend/access/heap/heaptoast.c @@ -213,7 +213,7 @@ heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup, * XXX maybe the threshold should be less tha
Re: Statistics Import and Export
I took a look at v29-0006. On Tue, Sep 17, 2024 at 05:02:49AM -0400, Corey Huinker wrote: > From: Corey Huinker > Date: Sat, 4 May 2024 04:52:38 -0400 > Subject: [PATCH v29 6/7] Add derivative flags dumpSchema, dumpData. > > User-set flags --schema-only and --data-only are often consulted by > various operations to determine if they should be skipped or not. While > this logic works when there are only two mutually-exclusive -only > options, it will get progressively more confusing when more are added. After glancing at v29-0007, I see what you mean. > In anticipation of this, create the flags dumpSchema and dumpData which > are derivative of the existing options schemaOnly and dataOnly. This > allows us to restate current skip-this-section tests in terms of what is > enabled, rather than checking if the other -only mode is turned off. This seems like a reasonable refactoring exercise that we could take care of before the rest of the patch set goes in. I added one new reference to dopt.schemaOnly in commit bd15b7d, so that should probably be revised to !dumpData, too. I also noticed a few references to dataOnly/schemaOnly in comments that should likely be adjusted. One other question I had when looking at this patch is whether we could remove dataOnly/schemaOnly from DumpOptions and RestoreOptions. Once 0007 is applied, those variables become particularly hazardous, so we really want to prevent folks from using them in new code. -- nathan
Re: Function for listing pg_wal/summaries directory
On Tue, Oct 08, 2024 at 01:19:52PM +0900, Michael Paquier wrote: > On Tue, Oct 08, 2024 at 12:41:16PM +0900, Fujii Masao wrote: >> One benefit of supporting something like pg_ls_summariesdir() is that >> it allows us to view the last modification time of each WAL summary file >> and estimate when they'll be removed based on wal_summary_keep_time. >> >> Of course, we could also extend the existing function to report >> the last modification time if this use case is valid, though. > > My argument is about knowing the size of each file, for monitoring of > disk space. The retention can be controlled by a GUC based on time, > and this function requires knowing about the file name format. Okay. I have no problem with adding something like pg_ls_summariesdir(), but I guess I was hopeful we could just add any missing information to the existing WAL summarization information functions. A new pg_ls_*dir() function would indeed fit nicely with the existing suite of generic file access functions. The patch posted upthread looks reasonable to me, so I'll go commit it soon unless there is any feedback. IMHO we should consider alphabetizing the table in the docs [0], too. [0] https://www.postgresql.org/docs/devel/functions-admin.html#FUNCTIONS-ADMIN-GENFILE -- nathan
Re: pg_upgrade check for invalid databases
On Mon, Oct 07, 2024 at 03:37:35PM -0400, Bruce Momjian wrote: > On Tue, Oct 1, 2024 at 09:28:54AM +0200, Daniel Gustafsson wrote: >> Correct, sorry for being unclear. The consistency argument would be to >> expand >> pg_upgrade to report all invalid databases rather than just the first found; >> attempting to fix problems would be a new behavior. > > Yes, historically pg_upgrade will fail if it finds anything unusual, > mostly because what it does normally is already scary enough. If users > what pg_upgrade to do cleanups, it would be enabled by a separate flag, > or even a new command-line app. While I suspect it's rare that someone CTRL-C's out of an accidental DROP DATABASE and then runs pg_upgrade before trying to recover the data, I agree with the principle of having pg_upgrade fail by default for things like this. If we did add a new flag, the new invalid database report that Daniel mentions could say something like "try again with --skip-invalid-databases to have pg_upgrade automatically drop invalid databases." -- nathan
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Mon, Oct 07, 2024 at 02:00:05PM -0500, Nathan Bossart wrote: > I refrained from introducing INT64_HEX_FORMAT/UINT64_HEX_FORMAT in c.h > because I felt there was a nonzero chance of that causing problems with > third-party code on the back-branches. We could probably add them for v18, > though. Like so... -- nathan >From 03f7536acd06b91e6891ccfc86470a5a51b45736 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 7 Oct 2024 14:16:07 -0500 Subject: [PATCH 1/1] add INT64_HEX_FORMAT and UINT64_HEX_FORMAT --- contrib/postgres_fdw/option.c| 2 +- src/backend/utils/error/csvlog.c | 2 +- src/backend/utils/error/elog.c | 4 ++-- src/backend/utils/error/jsonlog.c| 2 +- src/include/c.h | 2 ++ src/test/modules/test_radixtree/test_radixtree.c | 2 -- 6 files changed, 7 insertions(+), 7 deletions(-) diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index d740893918..9a6785720d 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -522,7 +522,7 @@ process_pgfdw_appname(const char *appname) appendStringInfoString(&buf, application_name); break; case 'c': - appendStringInfo(&buf, "%" INT64_MODIFIER "x.%x", MyStartTime, MyProcPid); + appendStringInfo(&buf, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); break; case 'C': appendStringInfoString(&buf, cluster_name); diff --git a/src/backend/utils/error/csvlog.c b/src/backend/utils/error/csvlog.c index eab8df3fcc..acdffb6d06 100644 --- a/src/backend/utils/error/csvlog.c +++ b/src/backend/utils/error/csvlog.c @@ -120,7 +120,7 @@ write_csvlog(ErrorData *edata) appendStringInfoChar(&buf, ','); /* session id */ - appendStringInfo(&buf, "%" INT64_MODIFIER "x.%x", MyStartTime, MyProcPid); + appendStringInfo(&buf, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); appendStringInfoChar(&buf, ','); /* Line number */ diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 987ff98067..c71508c923 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -2944,12 +2944,12 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata) { charstrfbuf[128]; - snprintf(strfbuf, sizeof(strfbuf) - 1, "%" INT64_MODIFIER "x.%x", + snprintf(strfbuf, sizeof(strfbuf) - 1, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); appendStringInfo(buf, "%*s", padding, strfbuf); } else - appendStringInfo(buf, "%" INT64_MODIFIER "x.%x", MyStartTime, MyProcPid); + appendStringInfo(buf, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); break; case 'p': if (padding != 0) diff --git a/src/backend/utils/error/jsonlog.c b/src/backend/utils/error/jsonlog.c index 2c7b14cacb..492383a89e 100644 --- a/src/backend/utils/error/jsonlog.c +++ b/src/backend/utils/error/jsonlog.c @@ -168,7 +168,7 @@ write_jsonlog(ErrorData *edata) } /* Session id */ - appendJSONKeyValueFmt(&buf, "session_id", true, "%" INT64_MODIFIER "x.%x", + appendJSONKeyValueFmt(&buf, "session_id", true, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); /* Line number */ diff --git a/src/include/c.h b/src/include/c.h index 3297d89ff0..9520b89b00 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -550,6 +550,8 @@ typedef unsigned long long int uint64; /* snprintf format strings to use for 64-bit integers */ #define INT64_FORMAT "%" INT64_MODIFIER "d" #define UINT64_FORMAT "%" INT64_MODIFIER "u" +#define INT64_HEX_FORMAT "%" INT64_MODIFIER "x" +#define UINT64_HEX_FORMAT "%" INT64_MODIFIER "x" /* * 128-bit signed and unsigned integers diff --git a/src/test/modules/test_radixtree/test_radixtree.c b/src/test/modules/test_radixtree/test_radixtree.c index 1d9165a3a2..3e6863f660 100644 --- a/src/test/modules/test_radixtree/test_radixtree.c +++ b/src/test/modules/test_radixtree/test_radixtree.c @@ -23,8 +23,6 @@ /* uncomment to use shared memory for the tree */ /* #define TEST_SHARED_RT */ -#define UINT64_HEX_FORMAT "%" INT64_MODIFIER "X" - /* Convenient macros to test results */ #define EXPECT_TRUE(expr) \ do { \ -- 2.39.5 (Apple Git-154)
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Fri, Sep 27, 2024 at 02:10:47PM -0500, Nathan Bossart wrote: > Alright. I was able to back-patch it to v12 without too much trouble, > fortunately. I'll commit that soon unless anyone else has feedback. Committed, thanks! I refrained from introducing INT64_HEX_FORMAT/UINT64_HEX_FORMAT in c.h because I felt there was a nonzero chance of that causing problems with third-party code on the back-branches. We could probably add them for v18, though. -- nathan
Re: Should rolpassword be toastable?
Committed with the limit set to 512 bytes. We have plenty of time to adjust this limit as needed before it takes effect in v18. -- nathan
Re: Function for listing pg_wal/summaries directory
On Mon, Oct 07, 2024 at 10:07:10AM +0900, Michael Paquier wrote: > On Fri, Oct 04, 2024 at 10:02:11AM -0500, Nathan Bossart wrote: >> Could you explain why you feel the existing support functions are >> insufficient? > > Because it is not possible to outsource the scan of pg_wal/summaries/ > to a different role, no? I was under the impression that you could do this with pg_available_wal_summaries() [0]. [0] https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-WAL-SUMMARY -- nathan
Re: Should rolpassword be toastable?
On Sun, Oct 06, 2024 at 11:42:53AM +0200, Hannu Krosing wrote: > On Fri, Oct 4, 2024 at 4:48 PM Nathan Bossart > wrote: >> Since BLCKSZ can be as low as 1024, I think 512 would be a good choice. > > Where did you get the minimal value of 1024 from ? https://www.postgresql.org/docs/devel/install-make.html#CONFIGURE-OPTION-WITH-BLOCKSIZE -- nathan
Re: Function for listing pg_wal/summaries directory
On Fri, Oct 04, 2024 at 11:32:08AM +0900, btogiwarayuushi wrote: > While WAL summaries feature and some support functions have been added in > version 17, merely listing the contents of the pg_wal/summaries directory is > missing. Could you explain why you feel the existing support functions are insufficient? -- nathan
Re: New PostgreSQL Contributors
On Fri, Oct 04, 2024 at 03:55:23PM +0200, Christoph Berg wrote: > New PostgreSQL Contributors: > > * Antonin Houska > * Ants Aasma > * Georgios Kokolatos > * Henrietta Dombrovskaya > * Ian Lawrence Barwick > * Jelte Fennema-Nio > * Karen Jex > * Pavlo Golub > * Zhang Mingli > > New PostgreSQL Major Contributors: > > * Bertrand Drouvot > * Laurenz Albe > * Richard Guo Congratulations to all! -- nathan
Re: Should rolpassword be toastable?
On Thu, Oct 03, 2024 at 10:33:04PM -0400, Tom Lane wrote: > "Jonathan S. Katz" writes: >> I think Tom's initial suggestion (BLCKSZ/2) is better than 256, given we >> really don't know what' out there in the wild, and this could end up >> being a breaking change. Every other type in pg_authid is pretty small. > > I'm having second thoughts about that though, based on the argument > that we don't really want a platform-dependent limit here. > Admittedly, nobody changes BLCKSZ on production systems, but it's > still theoretically an issue. I don't have a problem with selecting > a larger limit such as 512 or 1024 though. Since BLCKSZ can be as low as 1024, I think 512 would be a good choice. > However, if you wanted to allow multiple passwords I'm not > sure about a good way. The most recent proposal I'm aware of [0] did seem to target that use-case. One option might be to move rolpassword to a different catalog. In any case, I don't think it matters much for the patch at hand. [0] https://postgr.es/m/CAGB%2BVh5SQQorNDEKP%2B0G%3DsmxHRhbhs%2BVkmQWD5Vh98fmn8X4dg%40mail.gmail.com -- nathan
Re: Should rolpassword be toastable?
On Thu, Oct 03, 2024 at 05:39:06PM -0400, Tom Lane wrote: > I agree with the idea that complaining about the password being too > long is far more intelligible than that. Another problem with > leaving it as it stands in HEAD is that the effective limit is now > platform-specific, if not indeed dependent on the phase of the moon > (or at least, the other contents of the pg_authid row). I fear it > would be very easy to construct cases where a password is accepted > on one machine but fails with "row is too big" on another. A > uniform limit seems much less fraught with surprises. I don't mind proceeding with the patch if there is strong support for it. I wavered only because it's hard to be confident that we are choosing the right limit. AFAICT 256 bytes ought to be sufficient to avoid "row is too big" errors independent of BLCKSZ today, but maybe someone will add another varlena column in the future that breaks it. Or maybe we add a new password hashing method that produces longer strings. Or maybe someone is doing something really out there like storing additional information in the salt. I don't have any reason to believe that any of these things are happening or are likely to happen anytime soon, but they seem similar in likelihood to someone building a custom driver that generates ginormous hashes. But I can also buy the argument that none of this is a strong enough reason to avoid making the error message nicer... -- nathan
Re: Should rolpassword be toastable?
On Sat, Sep 21, 2024 at 03:25:54PM -0500, Nathan Bossart wrote: > Thanks for reviewing. I went ahead and committed 0002 since it seems like > there's consensus on that one. I've attached a rebased version of 0001 > with s/characters/bytes. For the reasons discussed upthread [0], I can't bring myself to add an arbitrary limit to the password hash length. I am going to leave 0001 uncommitted for now. [0] https://postgr.es/m/Zu2eT2H8OT3OXauc%40nathan -- nathan
Re: Enable data checksums by default
On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote: > I have committed 0001 (the new option) and 0004 (the docs tweak). I think > there is consensus for the rest, too, but I'll leave it for a few more days > to think about. I guess the test failure has to be addressed. Here is a rebased patch with the test fix (for cfbot). I have made no other changes. -- nathan >From 912894763686e29e907a5f5600e0caabca961dad Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 3 Oct 2024 16:01:19 -0500 Subject: [PATCH v3 1/1] use data checksums by default --- contrib/amcheck/t/001_verify_heapam.pl| 2 +- doc/src/sgml/ref/initdb.sgml | 2 ++ src/bin/initdb/initdb.c | 2 +- src/bin/initdb/t/001_initdb.pl| 18 ++ src/bin/pg_amcheck/t/003_check.pl | 2 +- src/bin/pg_amcheck/t/004_verify_heapam.pl | 2 +- src/bin/pg_checksums/t/002_actions.pl | 2 +- src/test/perl/PostgreSQL/Test/Cluster.pm | 7 +++ 8 files changed, 24 insertions(+), 13 deletions(-) diff --git a/contrib/amcheck/t/001_verify_heapam.pl b/contrib/amcheck/t/001_verify_heapam.pl index 481e4dbe4f..b380d1b034 100644 --- a/contrib/amcheck/t/001_verify_heapam.pl +++ b/contrib/amcheck/t/001_verify_heapam.pl @@ -15,7 +15,7 @@ my $node; # Test set-up # $node = PostgreSQL::Test::Cluster->new('test'); -$node->init; +$node->init(no_checksums => 1); $node->append_conf('postgresql.conf', 'autovacuum=off'); $node->start; $node->safe_psql('postgres', q(CREATE EXTENSION amcheck)); diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml index a0ce261999..f9f00c 100644 --- a/doc/src/sgml/ref/initdb.sgml +++ b/doc/src/sgml/ref/initdb.sgml @@ -273,6 +273,8 @@ PostgreSQL documentation pg_stat_database view. See for details. +Data checksums are enabled by default. They can be +disabled by use of --no-data-checksums. diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index eac6cbd8e5..9eff46e799 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -164,7 +164,7 @@ static bool noinstructions = false; static bool do_sync = true; static bool sync_only = false; static bool show_setting = false; -static bool data_checksums = false; +static bool data_checksums = true; static char *xlog_dir = NULL; static int wal_segment_size_mb = (DEFAULT_XLOG_SEG_SIZE) / (1024 * 1024); static DataDirSyncMethod sync_method = DATA_DIR_SYNC_METHOD_FSYNC; diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl index 8072adb97f..7520d3d0dd 100644 --- a/src/bin/initdb/t/001_initdb.pl +++ b/src/bin/initdb/t/001_initdb.pl @@ -69,16 +69,11 @@ mkdir $datadir; } } -# Control file should tell that data checksums are disabled by default. +# Control file should tell that data checksums are enabled by default. command_like( [ 'pg_controldata', $datadir ], - qr/Data page checksum version:.*0/, - 'checksums are disabled in control file'); -# pg_checksums fails with checksums disabled by default. This is -# not part of the tests included in pg_checksums to save from -# the creation of an extra instance. -command_fails([ 'pg_checksums', '-D', $datadir ], - "pg_checksums fails with data checksum disabled"); + qr/Data page checksum version:.*1/, + 'checksums are enabled in control file'); command_ok([ 'initdb', '-S', $datadir ], 'sync only'); command_fails([ 'initdb', $datadir ], 'existing data directory'); @@ -280,4 +275,11 @@ command_like( qr/Data page checksum version:.*0/, 'checksums are disabled in control file'); +# pg_checksums fails with checksums disabled. This is +# not part of the tests included in pg_checksums to save from +# the creation of an extra instance. +command_fails( + [ 'pg_checksums', '-D', $datadir_nochecksums ], + "pg_checksums fails with data checksum disabled"); + done_testing(); diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl index d99b094dba..c140b1622b 100644 --- a/src/bin/pg_amcheck/t/003_check.pl +++ b/src/bin/pg_amcheck/t/003_check.pl @@ -120,7 +120,7 @@ sub perform_all_corruptions() # Test set-up $node = PostgreSQL::Test::Cluster->new('test'); -$node->init; +$node->init(no_checksums => 1); $node->append_conf('postgresql.conf', 'autovacuum=off'); $node->start; $port = $node->port; diff --git a/src/bin/pg_amcheck/t/004_verify_heapam.pl b/src/bin/pg_amcheck/t/004_verify_heapam.pl index f6d2c5f787..2c17f7d068 100644 --- a/src/bin/pg_amcheck/t/004_verify_heapam.pl +++ b/src/bin/pg_amcheck/t/004_verify_heapam.pl @@ -181,7 +181,7 @@ m
Re: ACL_MAINTAIN, Lack of comment content
On Mon, Sep 30, 2024 at 04:13:55PM +0200, Daniel Gustafsson wrote: >> On 30 Sep 2024, at 12:38, Yugo Nagata wrote: >> >> Should we put a comma between REINDEX and "and" as following? >> >> "... MATERIALIZED VIEW, REINDEX, and LOCK TABLE on all relations." > > I'm not a native speaker so I'm not sure which is right, but grepping for > other > lists of items shows that the last "and" item is often preceded by a comma so > I'll do that. I'm not aware of a project policy around the Oxford comma [0], but I tend to include one. [0] https://en.wikipedia.org/wiki/Serial_comma -- nathan
Re: pg_upgrade check for invalid databases
On Sun, Sep 29, 2024 at 08:45:50PM -0400, Thomas Krennwallner wrote: > if a cluster contains invalid databases that we cannot connect to anymore, > pg_upgrade would currently fail when trying to connect to the first > encountered invalid database with > > [...] > > If there is more than one invalid database, we need to run pg_upgrade more > than once (unless the user inspects pg_database). > > I attached two small patches for PG 17 and PG 18 (can be easily backported > to all previous versions upon request). Instead of just failing to connect > with an error, we collect all invalid databases in a report file > invalid_databases.txt: Should we have pg_upgrade skip invalid databases? If the only valid action is to drop them, IMHO it seems unnecessary to require users to manually drop them before retrying pg_upgrade. -- nathan
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, Sep 27, 2024 at 01:27:38PM -0700, Jeff Davis wrote: > Looks good to me. Thanks, committed. -- nathan
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, Sep 27, 2024 at 09:22:48AM -0700, Jeff Davis wrote: > I suggest that we add the wording to the > query portion of the doc, near "security- > restricted operation". How does this look? diff --git a/doc/src/sgml/ref/create_materialized_view.sgml b/doc/src/sgml/ref/create_materialized_view.sgml index 0d2fea2b97..62d897931c 100644 --- a/doc/src/sgml/ref/create_materialized_view.sgml +++ b/doc/src/sgml/ref/create_materialized_view.sgml @@ -143,7 +143,9 @@ CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] table_name A SELECT, TABLE, or VALUES command. This query will run within a security-restricted operation; in particular, calls to functions that - themselves create temporary tables will fail. + themselves create temporary tables will fail. Also, while the query is + running, the is temporarily changed to + pg_catalog, pg_temp. -- nathan
Re: Thoughts about NUM_BUFFER_PARTITIONS
On Mon, Aug 05, 2024 at 12:32:20AM +0500, Andrey M. Borodin wrote: > I´ve found some dead code: BufMappingPartitionLockByIndex() is unused, > and unused for a long time. See patch 1. I don't see a reason to also get rid of BufTableHashPartition(), but otherwise this looks reasonable to me. It would probably be a good idea to first check whether there are any external callers we can find. > I´ve prototyped similar GUC for anyone willing to do such experiments. > See patch 2, 4. Probably, I´ll do some experiments too, on customer's > clusters and workloads :) Like Tomas, I'm not too wild about making this a GUC. And as Heikki pointed out upthread, a good first step would be to benchmark different NUM_BUFFER_PARTITIONS settings on modern hardware. I suspect the current setting is much lower than optimal (e.g., I doubled it and saw a TPS boost for read-only pgbench on an i5-13500T), but I don't think we fully understand the performance characteristics with different settings yet. If we find that the ideal setting depends heavily on workload/hardware, then perhaps we can consider adding a GUC, but I don't think we are there yet. >> Anyway, this value is inherently a trade off. If it wasn't, we'd set it >> to something super high from the start. But having more partitions of >> the lock table has a cost too, because some parts need to acquire all >> the partition locks (and that's O(N) where N = number of partitions). > > I´ve found none such cases, actually. Or, perhaps, I was not looking good > enough. pg_buffercache iterates over buffers and releases locks. See > patch 3 to fix comments. Yeah, I think 0003 is reasonable, too. pg_buffercache stopped acquiring all the partition locks in v10 (see commit 6e65454), which is also the commit that removed all remaining uses of BufMappingPartitionLockByIndex(). In fact, I think BufMappingPartitionLockByIndex() was introduced just for this pg_buffercache use-case (see commit ea9df81). -- nathan
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Fri, Sep 27, 2024 at 02:48:01PM +, Max Johnson wrote: > I think that it would be a good idea to include these fixes in the next > minor release. After working for a couple months on getting our embedded > systems 2038 compliant, it has become very apparent that 2038 will be a > substantial ordeal. Maximizing the number of systems that include this > fix would make things a little easier when that time comes around. Alright. I was able to back-patch it to v12 without too much trouble, fortunately. I'll commit that soon unless anyone else has feedback. -- nathan
Re: micro-optimize nbtcompare.c routines
I've marked this one as Withdrawn. Apologies for the noise. -- nathan
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Fri, Sep 27, 2024 at 12:42:34PM +0900, Yugo NAGATA wrote: > I agree with you. I overlooked WITH NO DATA. > I attached a updated patch. Thanks. Unless someone objects, I plan to commit this shortly. -- nathan
Re: micro-optimize nbtcompare.c routines
On Fri, Sep 27, 2024 at 02:50:13PM +1200, David Rowley wrote: > I had been looking at [1] (which I've added your version to now). I > had been surprised to see gcc emitting different code for the first 3 > versions. Clang does a better job at figuring out they all do the same > thing and emitting the same code for each. Interesting. > I played around with the attached (hacked up) qsort.c to see if there > was any difference. Likely function call overhead kills the > performance anyway. There does not seem to be much difference between > them. I've not tested with an inlined comparison function. I'd expect worse performance with the branchless routines for the inlined case. However, I recall that clang was able to optimize med3() as well as it can with the branching routines, so that may not always be true. > Looking at your version, it doesn't look like there's any sort of > improvement in terms of the instructions. Certainly, for clang, it's > worse as it adds a shift left instruction and an additional compare. > No jumps, at least. I think I may have forgotten to add -O2 when I was inspecting this code with godbolt.org earlier. *facepalm* The different versions look pretty comparable with that added. > What's your reasoning for returning INT_MIN and INT_MAX? That's just for the compile option added by commit c87cb5f, which IIUC is intended to test that we correctly handle comparisons that return INT_MIN. -- nathan
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Wed, Sep 25, 2024 at 08:04:59PM +, Max Johnson wrote: > I think your patch looks good, no objections. I am happy to have contributed. Great. I've attached what I have staged for commit. My first instinct was to not bother back-patching this since all currently-supported versions will have been out of support for over 8 years by the time this becomes a practical issue. However, I wonder if it makes sense to back-patch for the kinds of 32-bit embedded systems you cited upthread. I can imagine that such systems might need to work for a very long time without any software updates, in which case it'd probably be a good idea to make this fix available in the next minor release. What do you think? -- nathan >From 84f05c822f44b8b8ec437540bbae6bb43c72e916 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 26 Sep 2024 21:00:41 -0500 Subject: [PATCH v4 1/1] Fix Y2K38 issues with MyStartTime. Several places treat MyStartTime as a "long", which is only 32 bits wide on some platforms. In reality, MyStartTime is a pg_time_t, i.e., a signed 64-bit integer. This will lead to interesting bugs on the aforementioned systems in 2038 when signed 32-bit integers are no longer sufficient to store Unix time (e.g., "pg_ctl start" hanging). To fix, ensure that MyStartTime is handled as a 64-bit value everywhere. (Of course, users will need to ensure that time_t is 64 bits wide on their system, too, but there's little that Postgres can do about that.) This could probably be back-patched, but since all currently-supported versions will have been out of support for over 8 years by the time this becomes a practical issue, I'm not going to bother. Co-authored-by: Max Johnson Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com --- contrib/postgres_fdw/option.c| 3 ++- src/backend/utils/error/csvlog.c | 2 +- src/backend/utils/error/elog.c | 7 --- src/backend/utils/error/jsonlog.c| 4 ++-- src/backend/utils/init/miscinit.c| 4 ++-- src/bin/pg_ctl/pg_ctl.c | 2 +- src/include/c.h | 2 ++ src/test/modules/test_radixtree/test_radixtree.c | 2 -- 8 files changed, 14 insertions(+), 12 deletions(-) diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index 630b304338..da16fc78d4 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -522,7 +522,8 @@ process_pgfdw_appname(const char *appname) appendStringInfoString(&buf, application_name); break; case 'c': - appendStringInfo(&buf, "%lx.%x", (long) (MyStartTime), MyProcPid); + appendStringInfo(&buf, INT64_HEX_FORMAT ".%x", +MyStartTime, MyProcPid); break; case 'C': appendStringInfoString(&buf, cluster_name); diff --git a/src/backend/utils/error/csvlog.c b/src/backend/utils/error/csvlog.c index 855e130a97..acdffb6d06 100644 --- a/src/backend/utils/error/csvlog.c +++ b/src/backend/utils/error/csvlog.c @@ -120,7 +120,7 @@ write_csvlog(ErrorData *edata) appendStringInfoChar(&buf, ','); /* session id */ - appendStringInfo(&buf, "%lx.%x", (long) MyStartTime, MyProcPid); + appendStringInfo(&buf, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); appendStringInfoChar(&buf, ','); /* Line number */ diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 5cbb5b5416..70e4c30df3 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -2944,12 +2944,13 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata) { charstrfbuf[128]; - snprintf(strfbuf, sizeof(strfbuf) - 1, "%lx.%x", -(long) (MyStartTime), MyProcPid); + snprintf(strfbuf, sizeof(strfbuf) - 1, INT64_HEX_FORMAT ".%x", +MyStartTime, MyProcPid); appendStringInfo(buf, "%*s", padding, strfbuf); } else - appendStringInfo(buf, "%lx.%x", (long) (MyStartTime), MyProcPid); + appendStringInfo(b
Re: MAINTAIN privilege -- what do we need to un-revert it?
On Mon, Aug 05, 2024 at 04:05:02PM +0900, Yugo Nagata wrote: > + > + While CREATE MATERIALIZED VIEW is running, the + linkend="guc-search-path"/> is temporarily changed to pg_catalog, > + pg_temp. > + I think we should mention that this is not true when WITH NO DATA is used. Maybe something like: Unless WITH NO DATA is used, the search_path is temporarily changed to pg_catalog, pg_temp while CREATE MATERIALIZED VIEW is running. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Tue, Sep 24, 2024 at 02:26:08PM -0500, Nathan Bossart wrote: > On Tue, Sep 24, 2024 at 01:21:45PM +0900, Michael Paquier wrote: >> On Mon, Sep 23, 2024 at 10:50:21AM -0500, Nathan Bossart wrote: >>> I carefully inspected all the code paths this patch touches, and I think >>> I've got all the details right, but I would be grateful if someone else >>> could take a look. >> >> No objections from here with putting the snapshots pops and pushes >> outside the inner routines of reindex/drop concurrently, meaning that >> ReindexRelationConcurrently(), DefineIndex() and index_drop() are fine >> to do these operations. > > Great. I plan to push 0001 shortly. Committed this one. -- nathan
micro-optimize nbtcompare.c routines
Here's a patch that adjusts several routines in nbtcompare.c and related files to use the branchless integer comparison functions added in commit 6b80394. It's probably unlikely this produces a measurable benefit (at least I've been unable to find any in my admittedly-limited testing), but in theory it should save a cycle here and there. I was hoping that this would trim many lines of code, but maintaining the STRESS_SORT_INT_MIN stuff eats up most of what we save. Anyway, I don't feel too strongly about this patch, but I went to the trouble of writing it, and so I figured I'd post it. -- nathan >From bd2e1eb92b28cf855b529b09813f1b09084c7cfc Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 26 Sep 2024 15:03:32 -0500 Subject: [PATCH v1 1/1] micro-optimize nbtcompare.c routines --- src/backend/access/nbtree/nbtcompare.c | 162 - src/backend/storage/page/itemptr.c | 18 +-- src/backend/utils/sort/tuplesort.c | 21 ++-- 3 files changed, 96 insertions(+), 105 deletions(-) diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c index 1c72867c84..2df0c5dbe0 100644 --- a/src/backend/access/nbtree/nbtcompare.c +++ b/src/backend/access/nbtree/nbtcompare.c @@ -57,17 +57,16 @@ #include +#include "common/int.h" #include "utils/fmgrprotos.h" #include "utils/sortsupport.h" -#ifdef STRESS_SORT_INT_MIN -#define A_LESS_THAN_B INT_MIN -#define A_GREATER_THAN_B INT_MAX -#else -#define A_LESS_THAN_B (-1) -#define A_GREATER_THAN_B 1 -#endif - +#define STRESS_SORT_INT_MIN_CMP(a, b) \ +( \ + ((a) > (b)) ? INT_MAX : \ + ((a) < (b)) ? INT_MIN : \ + 0 \ +) Datum btboolcmp(PG_FUNCTION_ARGS) @@ -84,7 +83,11 @@ btint2cmp(PG_FUNCTION_ARGS) int16 a = PG_GETARG_INT16(0); int16 b = PG_GETARG_INT16(1); - PG_RETURN_INT32((int32) a - (int32) b); +#ifdef STRESS_SORT_INT_MIN + PG_RETURN_INT32(STRESS_SORT_INT_MIN_CMP(a, b)); +#else + PG_RETURN_INT32(pg_cmp_s16(a, b)); +#endif } static int @@ -93,7 +96,11 @@ btint2fastcmp(Datum x, Datum y, SortSupport ssup) int16 a = DatumGetInt16(x); int16 b = DatumGetInt16(y); - return (int) a - (int) b; +#ifdef STRESS_SORT_INT_MIN + return STRESS_SORT_INT_MIN_CMP(a, b); +#else + return pg_cmp_s16(a, b); +#endif } Datum @@ -111,12 +118,11 @@ btint4cmp(PG_FUNCTION_ARGS) int32 a = PG_GETARG_INT32(0); int32 b = PG_GETARG_INT32(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); +#ifdef STRESS_SORT_INT_MIN + PG_RETURN_INT32(STRESS_SORT_INT_MIN_CMP(a, b)); +#else + PG_RETURN_INT32(pg_cmp_s32(a, b)); +#endif } Datum @@ -134,12 +140,11 @@ btint8cmp(PG_FUNCTION_ARGS) int64 a = PG_GETARG_INT64(0); int64 b = PG_GETARG_INT64(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); +#ifdef STRESS_SORT_INT_MIN + PG_RETURN_INT32(STRESS_SORT_INT_MIN_CMP(a, b)); +#else + PG_RETURN_INT32(pg_cmp_s64(a, b)); +#endif } #if SIZEOF_DATUM < 8 @@ -149,12 +154,11 @@ btint8fastcmp(Datum x, Datum y, SortSupport ssup) int64 a = DatumGetInt64(x); int64 b = DatumGetInt64(y); - if (a > b) - return A_GREATER_THAN_B; - else if (a == b) - return 0; - else - return A_LESS_THAN_B; +#ifdef STRESS_SORT_INT_MIN + return STRESS_SORT_INT_MIN_CMP(a, b); +#else + return pg_cmp_s64(a, b); +#endif } #endif @@ -177,12 +181,11 @@ btint48cmp(PG_FUNCTION_ARGS) int32 a = PG_GETARG_INT32(0); int64 b = PG_GETARG_INT64(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); +#ifdef STRESS_SORT_INT_MIN + PG_RETURN_INT32(STRESS_SORT_INT_MIN_CMP(a, b)); +#else + PG_RETURN_INT32(pg_cmp_s64(a, b)); +#endif } Datum @@ -191,12 +194,11 @@ btint84cmp(PG_FUNCTION_ARGS) int64 a = PG_GETARG_INT64(0); int32 b = PG_GETARG_INT32(1); - if (a > b) - PG_RETURN_INT32(A_GREATER_THAN_B); - else if (a == b) - PG_RETURN_INT32(0); - else - PG_RETURN_INT32(A_LESS_THAN_B); +#ifdef STRESS_SORT_INT_MIN + PG_RETURN_INT32(STRESS_SORT_INT_MIN_CMP(a, b)); +#else + PG_RETURN_INT32(pg_cmp_s64(a, b)); +#endif } Datum @@ -205,12 +207,11 @@ btint24c
Re: miscellaneous pg_upgrade cleanup
Committed. -- nathan
Re: CREATE INDEX regression in 17 RC1 or expected behavior?
On Thu, Sep 26, 2024 at 12:22:32PM +0930, Tom Dunstan wrote: > Reporting in case this is unexpected. At the very least if a function used > in an index must now always find other functions using an explicit path, it > seems like this should be documented and noted in the release notes. The first compatibility entry in the release notes [0] has the following sentence: Functions used by expression indexes and materialized views that need to reference non-default schemas must specify a search path during function creation. Do you think this needs to be expanded upon? [0] https://www.postgresql.org/docs/release/17.0/ -- nathan
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Wed, Sep 25, 2024 at 03:17:45PM +, Max Johnson wrote: > I have amended my patch to reflect the changes that were discussed and > have verified on my system that it works the same as before. I have also > fixed a typo and changed the name of the patch to more accurately reflect > what it does now. Please let me know if there is anything else you'd like > me to do. Thanks! I went through all the other uses of MyStartTime and fixed those as needed, too. Please let me know what you think. -- nathan diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c index 630b304338..da16fc78d4 100644 --- a/contrib/postgres_fdw/option.c +++ b/contrib/postgres_fdw/option.c @@ -522,7 +522,8 @@ process_pgfdw_appname(const char *appname) appendStringInfoString(&buf, application_name); break; case 'c': - appendStringInfo(&buf, "%lx.%x", (long) (MyStartTime), MyProcPid); + appendStringInfo(&buf, INT64_HEX_FORMAT ".%x", +MyStartTime, MyProcPid); break; case 'C': appendStringInfoString(&buf, cluster_name); diff --git a/src/backend/utils/error/csvlog.c b/src/backend/utils/error/csvlog.c index 855e130a97..acdffb6d06 100644 --- a/src/backend/utils/error/csvlog.c +++ b/src/backend/utils/error/csvlog.c @@ -120,7 +120,7 @@ write_csvlog(ErrorData *edata) appendStringInfoChar(&buf, ','); /* session id */ - appendStringInfo(&buf, "%lx.%x", (long) MyStartTime, MyProcPid); + appendStringInfo(&buf, INT64_HEX_FORMAT ".%x", MyStartTime, MyProcPid); appendStringInfoChar(&buf, ','); /* Line number */ diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 5cbb5b5416..70e4c30df3 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -2944,12 +2944,13 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata) { charstrfbuf[128]; - snprintf(strfbuf, sizeof(strfbuf) - 1, "%lx.%x", -(long) (MyStartTime), MyProcPid); + snprintf(strfbuf, sizeof(strfbuf) - 1, INT64_HEX_FORMAT ".%x", +MyStartTime, MyProcPid); appendStringInfo(buf, "%*s", padding, strfbuf); } else - appendStringInfo(buf, "%lx.%x", (long) (MyStartTime), MyProcPid); + appendStringInfo(buf, INT64_HEX_FORMAT ".%x", + MyStartTime, MyProcPid); break; case 'p': if (padding != 0) diff --git a/src/backend/utils/error/jsonlog.c b/src/backend/utils/error/jsonlog.c index bd0124869d..492383a89e 100644 --- a/src/backend/utils/error/jsonlog.c +++ b/src/backend/utils/error/jsonlog.c @@ -168,8 +168,8 @@ write_jsonlog(ErrorData *edata) } /* Session id */ - appendJSONKeyValueFmt(&buf, "session_id", true, "%lx.%x", - (long) MyStartTime, MyProcPid); + appendJSONKeyValueFmt(&buf, "session_id", true, INT64_HEX_FORMAT ".%x", + MyStartTime, MyProcPid); /* Line number */ appendJSONKeyValueFmt(&buf, "line_num", false, "%ld", log_line_number); diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c index 537d92c0cf..ef60f41b8c 100644 --- a/src/backend/utils/init/miscinit.c +++ b/src/backend/utils/init/miscinit.c @@ -1372,10 +1372,10 @@ CreateLockFile(const char *filename, bool amPostmaster, * both datadir and socket lockfiles; although more stuff may get added to * the datadir lockfile later. */ - snprintf(buffer, sizeof(buffer), "%d\n%s\n%ld\n%d\n%s\n", + snprintf(buffer, sizeof(buffer), "%d\n%s\n" INT64_FORMAT "\n%d\n%s\n", amPostmaster ? (int) my_pid : -((int) my_pid), DataDir, -(long) MyStartTime, +MyStartTime, PostPortNumber, socketDir); diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c index e7e878c22f..d6bb2c3311 100644 --- a/src/bin/pg_ctl/pg_ctl.c +++ b/src/bin/pg_ctl/pg_ctl.c @@ -618,7 +618,7 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint) * Al
Re: src/backend/optimizer/util/plancat.c -> Is this correct English
On Wed, Sep 25, 2024 at 04:52:47PM +, Daniel Westermann (DWE) wrote: > just came across this: > > src/backend/optimizer/util/plancat.c -> Is this correct English? > -> We need not lock the relation since it was already locked ... > > I am not a native speaker, but this sounds strange. I think it's fine. It could also be phrased like this: We do not need to lock the relation since... -- nathan
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Tue, Sep 24, 2024 at 04:44:41PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> I think we should use INT64_FORMAT here. That'll choose the right length >> modifier for the platform. And I don't think we need to cast MyStartTime, >> since it's a pg_time_t (which is just an int64). > > Agreed. However, a quick grep finds half a dozen other places that > are casting MyStartTime to long. We should fix them all. +1 > Also note that if any of the other places are using translatable > format strings, INT64_FORMAT is problematic in that context, and > "long long" is a better answer for them. At a glance, I'm not seeing any translatable format strings that involve MyStartTime. But that is good to know... -- nathan
Re: pg_ctl/miscinit: print "MyStartTime" as a long long instead of long to avoid 2038 problem.
On Tue, Sep 24, 2024 at 07:33:24PM +, Max Johnson wrote: > I have found an instance of a time overflow with the start time that is > written in "postmaster.pid". On a 32-bit Linux system, if the system date > is past 01/19/2038, when you start Postgres with `pg_ctl start -D > {datadir} ...`, the start time will have rolled back to approximately > 1900. This is an instance of the "2038 problem". On my system, pg_ctl > will not exit if the start time has overflowed. Nice find. I think this has been the case since the start time was added to the lock files [0]. > - snprintf(buffer, sizeof(buffer), "%d\n%s\n%ld\n%d\n%s\n", > + snprintf(buffer, sizeof(buffer), "%d\n%s\n%lld\n%d\n%s\n", >amPostmaster ? (int) my_pid : -((int) my_pid), >DataDir, > - (long) MyStartTime, > + (long long) MyStartTime, >PostPortNumber, >socketDir); I think we should use INT64_FORMAT here. That'll choose the right length modifier for the platform. And I don't think we need to cast MyStartTime, since it's a pg_time_t (which is just an int64). [0] https://postgr.es/c/30aeda4 -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Tue, Sep 24, 2024 at 01:21:45PM +0900, Michael Paquier wrote: > On Mon, Sep 23, 2024 at 10:50:21AM -0500, Nathan Bossart wrote: >> I carefully inspected all the code paths this patch touches, and I think >> I've got all the details right, but I would be grateful if someone else >> could take a look. > > No objections from here with putting the snapshots pops and pushes > outside the inner routines of reindex/drop concurrently, meaning that > ReindexRelationConcurrently(), DefineIndex() and index_drop() are fine > to do these operations. Great. I plan to push 0001 shortly. > Actually, thinking more... Could it be better to have some more > sanity checks in the stack outside the toast code for catalogs with > toast tables? For example, I could imagine adding a check in > CatalogTupleUpdate() so as all catalog accessed that have a toast > relation require an active snapshot. That would make checks more > aggressive, because we would not need any toast data in a catalog to > make sure that there is a snapshot set. This strikes me as something > we could do better to improve the detection of failures like the one > reported by Alexander when updating catalog tuples as this can be > triggered each time we do a CatalogTupleUpdate() when dirtying a > catalog tuple. The idea is then to have something before the > HaveRegisteredOrActiveSnapshot() in the toast internals, for catalogs, > and we would not require toast data to detect problems. I gave this a try and, unsurprisingly, found a bunch of other problems. I hastily hacked together the attached patch that should fix all of them, but I'd still like to comb through the code a bit more. The three catalogs with problems are pg_replication_origin, pg_subscription, and pg_constraint. pg_contraint has had a TOAST table for a while, and I don't think it's unheard of for conbin to be large, so this one is probably worth fixing. pg_subscription hasn't had its TOAST table for quite as long, but presumably subpublications could be large enough to require out-of-line storage. pg_replication_origin, however, only has one varlena column: roname. Three out of the seven problem areas involve pg_replication_origin, but AFAICT that'd only ever be a problem if the name of your replication origin requires out-of-line storage. So... maybe we should just remove pg_replication_origin's TOAST table instead... -- nathan diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c index d0d1abda58..b8be124555 100644 --- a/src/backend/catalog/indexing.c +++ b/src/backend/catalog/indexing.c @@ -22,6 +22,7 @@ #include "catalog/index.h" #include "catalog/indexing.h" #include "executor/executor.h" +#include "miscadmin.h" #include "utils/rel.h" @@ -234,6 +235,14 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup) { CatalogIndexState indstate; + /* +* If we might need TOAST access, make sure the caller has set up a valid +* snapshot. +*/ + Assert(HaveRegisteredOrActiveSnapshot() || + !OidIsValid(heapRel->rd_rel->reltoastrelid) || + !IsNormalProcessingMode()); + CatalogTupleCheckConstraints(heapRel, tup); indstate = CatalogOpenIndexes(heapRel); @@ -256,6 +265,14 @@ void CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate) { + /* +* If we might need TOAST access, make sure the caller has set up a valid +* snapshot. +*/ + Assert(HaveRegisteredOrActiveSnapshot() || + !OidIsValid(heapRel->rd_rel->reltoastrelid) || + !IsNormalProcessingMode()); + CatalogTupleCheckConstraints(heapRel, tup); simple_heap_insert(heapRel, tup); @@ -273,6 +290,14 @@ void CatalogTuplesMultiInsertWithInfo(Relation heapRel, TupleTableSlot **slot, int ntuples, CatalogIndexState indstate) { + /* +* If we might need TOAST access, make sure the caller has set up a valid +* snapshot. +*/ + Assert(HaveRegisteredOrActiveSnapshot() || + !OidIsValid(heapRel->rd_rel->reltoastrelid) || + !IsNormalProcessingMode()); + /* Nothing to do */ if (ntuples <= 0) return; @@ -315,6 +340,14 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup) CatalogIndexState indstate; TU_UpdateIndexes updateIndexes = TU_All; + /* +* If we might need TOAST access, make sure the caller has set up a valid +* snapshot. +*/ + Assert(HaveRegisteredOrActiveSnapshot() || + !OidIsValid(heapRel->rd_rel->
Re: [PATCH] Support Int64 GUCs
On Tue, Sep 24, 2024 at 12:27:20PM +0300, Aleksander Alekseev wrote: >> It doesn't look like these *_age GUCs could benefit from being 64-bit, >> before 64-bit transaction ids get in. However, I think there are some >> better candidates. >> >> autovacuum_vacuum_threshold >> autovacuum_vacuum_insert_threshold >> autovacuum_analyze_threshold >> >> This GUCs specify number of tuples before vacuum/analyze. That could >> be more than 2^31. With large tables of small tuples, I can't even >> say this is always impractical to have values greater than 2^31. > > [...] > > I found a great candidate in src/test/modules/delay_execution: > > ``` > DefineCustomIntVariable("delay_execution.post_planning_lock_id", > "Sets the advisory lock ID to be > locked/unlocked after planning.", > ``` > > Advisory lock IDs are bigints [1]. I modified the module to use Int64's. > > I guess it may also answer Nathan's question. Hm. I'm not sure I find any of these to be particularly convincing examples of why we need int64 GUCs. Yes, the GUCs in question could potentially be set to higher values, but I've yet to hear of this being a problem in practice. We might not want to encourage such high values, either. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Mon, Sep 23, 2024 at 04:00:00PM +0300, Alexander Lakhin wrote: > I tested it with two code modifications (1st is to make each created > expression index TOASTed (by prepending 1M of spaces to the indexeprs > value) and 2nd to make each created index an expression index (by > modifying index_elem_options in gram.y) - both modifications are kludgy so > I don't dare to publish them) and found no other snapshot-related issues > during `make check-world`. Thanks. Here is an updated patch with tests and comments. I've also moved the calls to PushActiveSnapshot()/PopActiveSnapshot() to surround only the section of code where the snapshot is needed. Besides being more similar in style to other fixes I found, I think this is safer because much of this code is cautious to avoid deadlocks. For example, DefineIndex() has the following comment: /* * The snapshot subsystem could still contain registered snapshots that * are holding back our process's advertised xmin; in particular, if * default_transaction_isolation = serializable, there is a transaction * snapshot that is still active. The CatalogSnapshot is likewise a * hazard. To ensure no deadlocks, we must commit and start yet another * transaction, and do our wait before any snapshot has been taken in it. */ I carefully inspected all the code paths this patch touches, and I think I've got all the details right, but I would be grateful if someone else could take a look. -- nathan >From 3be62fac910e930f5635193ac60d1536b17e0d6d Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 23 Sep 2024 10:33:58 -0500 Subject: [PATCH v3 1/1] Ensure we have a snapshot when updating pg_index entries. Creating, reindexing, and dropping an index concurrently could entail accessing pg_index's TOAST table, which was recently added in commit b52c4fc3c0. These code paths start and commit their own transactions internally, but they do not always set an active snapshot. This rightfully leads to assertion failures and ERRORs when trying to access pg_index's TOAST table, such as the following: ERROR: cannot fetch toast data without an active snapshot To fix, push an active snapshot just before each section of code that might require accessing pg_index's TOAST table, and pop it shortly afterwards. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/a97d7401-e7c9-f771-6a00-037379f0a8bb%40gmail.com --- src/backend/catalog/index.c| 21 + src/backend/commands/indexcmds.c | 24 src/test/regress/expected/indexing.out | 15 +++ src/test/regress/sql/indexing.sql | 16 4 files changed, 76 insertions(+) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index b2b3ecb524..09d79b26b8 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -2276,9 +2276,17 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) */ WaitForLockers(heaplocktag, AccessExclusiveLock, true); + /* +* Updating pg_index might involve TOAST table access, so ensure we +* have a valid snapshot. +*/ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Finish invalidation of index and mark it as dead */ index_concurrently_set_dead(heapId, indexId); + PopActiveSnapshot(); + /* * Again, commit the transaction to make the pg_index update visible * to other sessions. @@ -2326,6 +2334,16 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) RelationForgetRelation(indexId); + /* +* Updating pg_index might involve TOAST table access, so ensure we have a +* valid snapshot. We only expect to get here without an active snapshot +* in the concurrent path. +*/ + if (concurrent) + PushActiveSnapshot(GetTransactionSnapshot()); + else + Assert(ActiveSnapshotSet()); + /* * fix INDEX relation, and check for expressional index */ @@ -2343,6 +2361,9 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) ReleaseSysCache(tuple); table_close(indexRelation, RowExclusiveLock); + if (concurrent) + PopActiveSnapshot(); + /* * if it has any expression columns, we might have stored statistics about * them. diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index f99c2d2dee..7d41cb73ab 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -1798,11 +1798,19 @@ DefineIndex(Oid tableId,
Re: miscellaneous pg_upgrade cleanup
On Mon, Sep 23, 2024 at 03:04:22PM +0200, Daniel Gustafsson wrote: > No objections to any of these changes, LGTM. Thanks for reviewing. I'll commit these once the v17 release freeze is over (since 0001 needs to be back-patched there). -- nathan
Re: Should rolpassword be toastable?
On Fri, Sep 20, 2024 at 12:27:41PM -0400, Tom Lane wrote: > Nitpick: the message should say "%d bytes" not "%d characters", > because we're counting bytes. Passes an eyeball check otherwise. Thanks for reviewing. I went ahead and committed 0002 since it seems like there's consensus on that one. I've attached a rebased version of 0001 with s/characters/bytes. -- nathan >From f039f02cc35d57862af64648d4152693e52fbee2 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 19 Sep 2024 20:59:10 -0500 Subject: [PATCH v4 1/1] place limit on password hash length --- src/backend/libpq/crypt.c | 60 ++ src/include/libpq/crypt.h | 10 + src/test/regress/expected/password.out | 7 +++ src/test/regress/sql/password.sql | 4 ++ 4 files changed, 63 insertions(+), 18 deletions(-) diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c index 629e51e00b..05d289977f 100644 --- a/src/backend/libpq/crypt.c +++ b/src/backend/libpq/crypt.c @@ -116,7 +116,7 @@ encrypt_password(PasswordType target_type, const char *role, const char *password) { PasswordType guessed_type = get_password_type(password); - char *encrypted_password; + char *encrypted_password = NULL; const char *errstr = NULL; if (guessed_type != PASSWORD_TYPE_PLAINTEXT) @@ -125,32 +125,56 @@ encrypt_password(PasswordType target_type, const char *role, * Cannot convert an already-encrypted password from one format to * another, so return it as it is. */ - return pstrdup(password); + encrypted_password = pstrdup(password); } - - switch (target_type) + else { - case PASSWORD_TYPE_MD5: - encrypted_password = palloc(MD5_PASSWD_LEN + 1); + switch (target_type) + { + case PASSWORD_TYPE_MD5: + encrypted_password = palloc(MD5_PASSWD_LEN + 1); - if (!pg_md5_encrypt(password, role, strlen(role), - encrypted_password, &errstr)) - elog(ERROR, "password encryption failed: %s", errstr); - return encrypted_password; + if (!pg_md5_encrypt(password, role, strlen(role), + encrypted_password, &errstr)) + elog(ERROR, "password encryption failed: %s", errstr); + break; - case PASSWORD_TYPE_SCRAM_SHA_256: - return pg_be_scram_build_secret(password); + case PASSWORD_TYPE_SCRAM_SHA_256: + encrypted_password = pg_be_scram_build_secret(password); + break; - case PASSWORD_TYPE_PLAINTEXT: - elog(ERROR, "cannot encrypt password with 'plaintext'"); + case PASSWORD_TYPE_PLAINTEXT: + elog(ERROR, "cannot encrypt password with 'plaintext'"); + break; + } } + Assert(encrypted_password); + /* -* This shouldn't happen, because the above switch statements should -* handle every combination of source and target password types. +* Valid password hashes may be very long, but we don't want to store +* anything that might need out-of-line storage, since de-TOASTing won't +* work during authentication because we haven't selected a database yet +* and cannot read pg_class. 256 bytes should be more than enough for all +* practical use, so fail for anything longer. */ - elog(ERROR, "cannot encrypt password to requested type"); - return NULL;/* keep compiler quiet */ + if (encrypted_password && /* keep compiler quiet */ + strlen(encrypted_password) > MAX_ENCRYPTED_PASSWORD_LEN) + { + /* +* We don't expect any of our own hashing routines to produce hashes +* that are too long. +*/ + Assert(guessed_type != PASSWORD_TYPE_PLAINTEXT); + + ereport(ERROR, + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), +errmsg("encrypted password is too long"), +errdetail("Encrypted passwords must be no longer than %d bytes.", + MAX_
Re: pg_checksums: Reorder headers in alphabetical order
On Sat, Sep 21, 2024 at 02:48:32PM +0900, Fujii Masao wrote: > On 2024/09/21 12:09, Tom Lane wrote: >> Fujii Masao writes: >> > I don´t have any objections to this commit, but I´d like to confirm >> > whether we really want to proactively reorder #include directives, >> > even for standard C library headers. >> >> I'm hesitant to do that. We can afford to insist that our own header >> files be inclusion-order-independent, because we have the ability to >> fix any problems that might arise. We have no ability to do something >> about it if the system headers on $random_platform have inclusion >> order dependencies. (In fact, I'm fairly sure there are already >> places in plperl and plpython where we know we have to be careful >> about inclusion order around those languages' headers.) >> >> So I would tread pretty carefully around making changes of this >> type, especially in long-established code. I have no reason to >> think that the committed patch will cause any problems, but >> I do think it's mostly asking for trouble. > > Sounds reasonable to me. Oh, sorry. I thought it was project policy to keep these alphabetized, and I was unaware of the past problems with system header ordering. I'll keep this in mind in the future. -- nathan
Re: pg_checksums: Reorder headers in alphabetical order
On Fri, Sep 20, 2024 at 01:56:16PM -0500, Nathan Bossart wrote: > On Fri, Sep 20, 2024 at 07:20:15PM +0200, Michael Banck wrote: >> I noticed two headers are not in alphabetical order in pg_checkums.c, >> patch attached. > > This appears to be commit 280e5f1's fault. Will fix. Committed, thanks! -- nathan
Re: pg_checksums: Reorder headers in alphabetical order
On Fri, Sep 20, 2024 at 07:20:15PM +0200, Michael Banck wrote: > I noticed two headers are not in alphabetical order in pg_checkums.c, > patch attached. This appears to be commit 280e5f1's fault. Will fix. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Fri, Sep 20, 2024 at 07:00:00AM +0300, Alexander Lakhin wrote: > I've found another two paths to reach that condition: > CREATE INDEX CONCURRENTLY ON def (vec_quantizer(id, :'b')); > ERROR: cannot fetch toast data without an active snapshot > > REINDEX INDEX CONCURRENTLY def_vec_quantizer_idx; > (or REINDEX TABLE CONCURRENTLY def;) > TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: > "toast_internals.c", Line: 668, PID: 2934502 Here's a (probably naive) attempt at fixing these, too. I'll give each path a closer look once it feels like we've identified all the bugs. > Perhaps it would make sense to check all CatalogTupleUpdate(pg_index, ...) > calls (I've found 10 such instances, but haven't checked them yet). Indeed. -- nathan >From b7432c42c0cea9c1aadba7c72f9ce2ba6e6407d2 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 20 Sep 2024 11:48:52 -0500 Subject: [PATCH v2 1/1] fix failed assertions due to pg_index's TOAST table --- src/backend/catalog/index.c | 5 + src/backend/commands/indexcmds.c | 9 + 2 files changed, 14 insertions(+) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index b2b3ecb524..2e378ef4ef 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -2255,6 +2255,7 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); + PushActiveSnapshot(GetTransactionSnapshot()); /* * Now we must wait until no running transaction could be using the @@ -2283,8 +2284,10 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) * Again, commit the transaction to make the pg_index update visible * to other sessions. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); + PushActiveSnapshot(GetTransactionSnapshot()); /* * Wait till every transaction that saw the old index state has @@ -2387,6 +2390,8 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) { UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock); UnlockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock); + + PopActiveSnapshot(); } } diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index f99c2d2dee..36318c81ea 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -1798,11 +1798,15 @@ DefineIndex(Oid tableId, PROGRESS_CREATEIDX_PHASE_WAIT_3); WaitForOlderSnapshots(limitXmin, true); + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Index can now be marked valid -- update its pg_index entry */ index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID); + PopActiveSnapshot(); + /* * The pg_index update will cause backends (including this one) to update * relcache entries for the index itself, but we should also send a @@ -4236,6 +4240,8 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein */ set_indexsafe_procflags(); + PushActiveSnapshot(GetTransactionSnapshot()); + forboth(lc, indexIds, lc2, newIndexIds) { ReindexIndexInfo *oldidx = lfirst(lc); @@ -4280,8 +4286,10 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein } /* Commit this transaction and make index swaps visible */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); + PushActiveSnapshot(GetTransactionSnapshot()); /* * While we could set PROC_IN_SAFE_IC if all indexes qualified, there's no @@ -4316,6 +4324,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein } /* Commit this transaction to make the updates visible. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); -- 2.39.5 (Apple Git-154)
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Fri, Sep 20, 2024 at 08:16:24AM +0900, Michael Paquier wrote: > Perhaps the reason why these snapshots are pushed should be documented > with a comment? Definitely. I'll add those once we are more confident that we've identified all the bugs. -- nathan
Re: Should rolpassword be toastable?
On Fri, Sep 20, 2024 at 10:06:28AM -0400, Jonathan S. Katz wrote: > On 9/20/24 1:23 AM, Michael Paquier wrote: >> Not sure. Is this really something we absolutely need? Sure, this >> generates a better error when inserting a record too long to >> pg_authid, but removing the toast relation is enough to avoid the >> problems one would see when authenticating. Not sure if this argument >> is enough to count as an objection, just sharing some doubts :) > > The errors from lack of TOAST are confusing to users. Why can't we have a > user friendly error here? If I wanted to argue against adding a user-friendly error, I'd point out that it's highly unlikely anyone is actually trying to use super long hashes unless they are trying to break things, and it's just another arbitrary limit that we'll need to maintain/enforce. But on the off-chance that someone is building a custom driver that generates long hashes for whatever reason, I'd imagine that a clear error would be more helpful than "row is too big." Here is a v3 patch set that fixes the test comment and a compiler warning in cfbot. -- nathan >From bb3aad9105b1997bc088403437ac6294e22809d9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 19 Sep 2024 20:59:10 -0500 Subject: [PATCH v3 1/2] place limit on password hash length --- src/backend/libpq/crypt.c | 60 ++ src/include/libpq/crypt.h | 10 + src/test/regress/expected/password.out | 7 +++ src/test/regress/sql/password.sql | 4 ++ 4 files changed, 63 insertions(+), 18 deletions(-) diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c index 629e51e00b..753e5c11da 100644 --- a/src/backend/libpq/crypt.c +++ b/src/backend/libpq/crypt.c @@ -116,7 +116,7 @@ encrypt_password(PasswordType target_type, const char *role, const char *password) { PasswordType guessed_type = get_password_type(password); - char *encrypted_password; + char *encrypted_password = NULL; const char *errstr = NULL; if (guessed_type != PASSWORD_TYPE_PLAINTEXT) @@ -125,32 +125,56 @@ encrypt_password(PasswordType target_type, const char *role, * Cannot convert an already-encrypted password from one format to * another, so return it as it is. */ - return pstrdup(password); + encrypted_password = pstrdup(password); } - - switch (target_type) + else { - case PASSWORD_TYPE_MD5: - encrypted_password = palloc(MD5_PASSWD_LEN + 1); + switch (target_type) + { + case PASSWORD_TYPE_MD5: + encrypted_password = palloc(MD5_PASSWD_LEN + 1); - if (!pg_md5_encrypt(password, role, strlen(role), - encrypted_password, &errstr)) - elog(ERROR, "password encryption failed: %s", errstr); - return encrypted_password; + if (!pg_md5_encrypt(password, role, strlen(role), + encrypted_password, &errstr)) + elog(ERROR, "password encryption failed: %s", errstr); + break; - case PASSWORD_TYPE_SCRAM_SHA_256: - return pg_be_scram_build_secret(password); + case PASSWORD_TYPE_SCRAM_SHA_256: + encrypted_password = pg_be_scram_build_secret(password); + break; - case PASSWORD_TYPE_PLAINTEXT: - elog(ERROR, "cannot encrypt password with 'plaintext'"); + case PASSWORD_TYPE_PLAINTEXT: + elog(ERROR, "cannot encrypt password with 'plaintext'"); + break; + } } + Assert(encrypted_password); + /* -* This shouldn't happen, because the above switch statements should -* handle every combination of source and target password types. +* Valid password hashes may be very long, but we don't want to store +* anything that might need out-of-line storage, since de-TOASTing won't +* work during authentication because we haven't selected a database yet +* and cannot read pg_class. 256 bytes should be more than enough for all +* practical use, so fail for anything longer. */ - elog(ERROR, "cannot encrypt password to requested type"); - return NULL;
Re: Should rolpassword be toastable?
On Thu, Sep 19, 2024 at 07:37:55PM -0400, Jonathan S. Katz wrote: > On 9/19/24 6:14 PM, Tom Lane wrote: >> Nathan Bossart writes: >> > Oh, actually, I see that we are already validating the hash, but you can >> > create valid SCRAM-SHA-256 hashes that are really long. > > You _can_, but it's up to a driver or a very determined user to do this, as > it involves creating a very long salt. I can't think of any reason to support this, unless we want Alexander to find more bugs. >> So putting an >> > arbitrary limit (patch attached) is probably the correct path forward. I'd >> > also remove pg_authid's TOAST table while at it. >> >> Shouldn't we enforce the limit in every case in encrypt_password, >> not just this one? (I do agree that encrypt_password is an okay >> place to enforce it.) Yeah, that seems like a good idea. I've attached a more fleshed-out patch set that applies the limit in all cases. > +1; if there's any breakage, my guess is it would be on very long plaintext > passwords, but that would be from a very old upgrade? IIUC there's zero support for plain-text passwords in newer versions, and any that remain in older clusters will be silently converted to a hash by pg_upgrade. >> I think you will get pushback from a limit of 256 bytes --- I seem >> to recall discussion of actual use-cases where people were using >> strings of a couple of kB. Whatever the limit is, the error message >> had better cite it explicitly. > > I think it's OK to be a bit generous with the limit. Also, currently oru > hashes are 256-bit (I know the above says byte), but this could increase > should we support larger hashes. Hm. Are you thinking of commit 67a472d? That one removed the password length restrictions in client-side code and password message packets, which I think is entirely separate from the lengths of the hashes stored in rolpassword. >> Also, the ereport call needs an errcode. >> ERRCODE_PROGRAM_LIMIT_EXCEEDED is probably suitable. This is added in v2. -- nathan >From 0493783d99195080f5ef48f8b5d749b2a3979be6 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 19 Sep 2024 20:59:10 -0500 Subject: [PATCH v2 1/2] place limit on password hash length --- src/backend/libpq/crypt.c | 56 ++ src/include/libpq/crypt.h | 9 + src/test/regress/expected/password.out | 7 src/test/regress/sql/password.sql | 4 ++ 4 files changed, 59 insertions(+), 17 deletions(-) diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c index 629e51e00b..2cdc977bea 100644 --- a/src/backend/libpq/crypt.c +++ b/src/backend/libpq/crypt.c @@ -125,32 +125,54 @@ encrypt_password(PasswordType target_type, const char *role, * Cannot convert an already-encrypted password from one format to * another, so return it as it is. */ - return pstrdup(password); + encrypted_password = pstrdup(password); } - - switch (target_type) + else { - case PASSWORD_TYPE_MD5: - encrypted_password = palloc(MD5_PASSWD_LEN + 1); + switch (target_type) + { + case PASSWORD_TYPE_MD5: + encrypted_password = palloc(MD5_PASSWD_LEN + 1); - if (!pg_md5_encrypt(password, role, strlen(role), - encrypted_password, &errstr)) - elog(ERROR, "password encryption failed: %s", errstr); - return encrypted_password; + if (!pg_md5_encrypt(password, role, strlen(role), + encrypted_password, &errstr)) + elog(ERROR, "password encryption failed: %s", errstr); + break; - case PASSWORD_TYPE_SCRAM_SHA_256: - return pg_be_scram_build_secret(password); + case PASSWORD_TYPE_SCRAM_SHA_256: + encrypted_password = pg_be_scram_build_secret(password); + break; - case PASSWORD_TYPE_PLAINTEXT: - elog(ERROR, "cannot encrypt password with 'plaintext'"); + case PASSWORD_TYPE_PLAINTEXT: + elog(ERROR, "cannot encrypt password with 'plaintext'"); + encrypted_password = palloc(0); /* keep compiler quiet */ + break; + } } /* -
Re: Should rolpassword be toastable?
On Thu, Sep 19, 2024 at 12:44:32PM -0500, Nathan Bossart wrote: > On Thu, Sep 19, 2024 at 10:31:15AM -0400, Tom Lane wrote: >> We could put an arbitrary limit (say, half of BLCKSZ) on the length of >> passwords. > > Something like that could be good enough. I was thinking about actually > validating that the hash had the correct form, but that might be a little > more complex than is warranted here. Oh, actually, I see that we are already validating the hash, but you can create valid SCRAM-SHA-256 hashes that are really long. So putting an arbitrary limit (patch attached) is probably the correct path forward. I'd also remove pg_authid's TOAST table while at it. -- nathan diff --git a/src/backend/libpq/crypt.c b/src/backend/libpq/crypt.c index 629e51e00b..a723e8219a 100644 --- a/src/backend/libpq/crypt.c +++ b/src/backend/libpq/crypt.c @@ -121,6 +121,19 @@ encrypt_password(PasswordType target_type, const char *role, if (guessed_type != PASSWORD_TYPE_PLAINTEXT) { + /* +* Valid SCRAM-SHA-256 hashes can have very long "iterations" and +* "salt" fields, but we don't want to store anything that might get +* TOASTed, since de-TOASTing won't work during authentication because +* we haven't selected a database yet and cannot read pg_class. 256 +* bytes should be more than enough for all practical use, so fail for +* anything longer. +*/ + if (guessed_type == PASSWORD_TYPE_SCRAM_SHA_256 && + strlen(password) > 256) + ereport(ERROR, + (errmsg("SCRAM-SHA-256 hash is too long"))); + /* * Cannot convert an already-encrypted password from one format to * another, so return it as it is.
Re: Partitioned tables and [un]loggedness
On Thu, Sep 19, 2024 at 10:03:09AM +0200, Alvaro Herrera wrote: > It looks to me like these cases could be modified to accept only > ATT_PARTITIONED_TABLE, not ATT_TABLE anymore. The ATT_TABLE cases are > useless anyway, because they're rejected by transformPartitionCmd. +1. If anything, this makes the error messages more consistent. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Thu, Sep 19, 2024 at 12:00:00PM +0300, Alexander Lakhin wrote: > I've discovered that Jonathan's initial script: > CREATE TABLE def (id int); > SELECT array_agg(n) b FROM generate_series(1,10_000) n \gset > CREATE OR REPLACE FUNCTION vec_quantizer (a int, b int[]) RETURNS bool > AS $$ SELECT true $$ LANGUAGE SQL IMMUTABLE; > CREATE INDEX ON def (vec_quantizer(id, :'b')); > > completed with: > DROP INDEX CONCURRENTLY def_vec_quantizer_idx; > > triggers an assertion failure: > TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: > "toast_internals.c", Line: 668, PID: 3723372 Ha, that was fast. The attached patch seems to fix the assertion failures. It's probably worth checking if any of the adjacent code paths are affected, too. -- nathan diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index b2b3ecb524..2e378ef4ef 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -2255,6 +2255,7 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); + PushActiveSnapshot(GetTransactionSnapshot()); /* * Now we must wait until no running transaction could be using the @@ -2283,8 +2284,10 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) * Again, commit the transaction to make the pg_index update visible * to other sessions. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); + PushActiveSnapshot(GetTransactionSnapshot()); /* * Wait till every transaction that saw the old index state has @@ -2387,6 +2390,8 @@ index_drop(Oid indexId, bool concurrent, bool concurrent_lock_mode) { UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock); UnlockRelationIdForSession(&indexrelid, ShareUpdateExclusiveLock); + + PopActiveSnapshot(); } }
Re: Should rolpassword be toastable?
On Thu, Sep 19, 2024 at 10:31:15AM -0400, Tom Lane wrote: > Nathan Bossart writes: >> Hm. It does seem like there's little point in giving pg_authid a TOAST >> table, as rolpassword is the only varlena column, and it obviously has >> problems. But wouldn't removing it just trade one unhelpful internal error >> when trying to log in for another when trying to add a really long password >> hash (which hopefully nobody is really trying to do in practice)? I wonder >> if we could make this a little more user-friendly. > > We could put an arbitrary limit (say, half of BLCKSZ) on the length of > passwords. Something like that could be good enough. I was thinking about actually validating that the hash had the correct form, but that might be a little more complex than is warranted here. -- nathan
Re: Should rolpassword be toastable?
On Thu, Sep 19, 2024 at 06:00:00AM +0300, Alexander Lakhin wrote: > 23.09.2023 21:00, Alexander Lakhin wrote: >> So for now only pg_authid is worthy of condemnation, AFAICS. > > Let me remind you of this issue in light of b52c4fc3c. > Yes, it's opposite, but maybe it makes sense to fix it now in the hope that > ~1 year of testing will bring something helpful for both changes. Hm. It does seem like there's little point in giving pg_authid a TOAST table, as rolpassword is the only varlena column, and it obviously has problems. But wouldn't removing it just trade one unhelpful internal error when trying to log in for another when trying to add a really long password hash (which hopefully nobody is really trying to do in practice)? I wonder if we could make this a little more user-friendly. -- nathan
Re: First draft of PG 17 release notes
On Wed, Sep 18, 2024 at 05:33:18PM -0400, Bruce Momjian wrote: > On Tue, Sep 17, 2024 at 10:01:28AM +0200, Jelte Fennema-Nio wrote: >> Another new API that is useful for extension authors is the following >> one (I'm obviously biased since I'm the author, and I don't know if >> there's still time): >> >> Add macros for looping through a List without a ListCell. > > Can someone else comment on the idea of adding this release note item? > I don't feel confident in my ability to evaluate this. I obviously did > not see it as significant the first time. I'm not sure precisely what criteria you use to choose what goes in the release notes, but this one seems like a judgement call to me. My initial reaction is that it shouldn't be included, but I do see some items with a similar scope, such as "Remove some SPI macros." -- nathan
Re: detoast datum into the given buffer as a optimization.
On Wed, Sep 18, 2024 at 05:35:56PM +0800, Andy Fan wrote: > Currently detoast_attr always detoast the data into a palloc-ed memory > and then if user wants the detoast data in a different memory, user has to > copy them, I'm thinking if we could provide a buf as optional argument for > detoast_attr to save such wastage. > > [...] > > What do you think? My first thought is that this seems reasonable if there are existing places where we are copying the data out of the palloc'd memory, but otherwise it might be more of a prerequisite patch for the other things you mentioned. -- nathan
Re: Track the amount of time waiting due to cost_delay
On Thu, Sep 05, 2024 at 04:59:54AM +, Bertrand Drouvot wrote: > Please find attached v6, a mandatory rebase due to catversion bump conflict. > I'm removing the catversion bump from the patch as it generates too frequent > conflicts (just mention it needs to be done in the commit message). v6 looks generally reasonable to me. I think the nap_time_since_last_report variable needs to be marked static, though. One thing that occurs to me is that this information may not be particularly useful when parallel workers are used. Without parallelism, it's easy enough to figure out the percentage of time that your VACUUM is spending asleep, but when there are parallel workers, it may be hard to deduce much of anything from the value. I'm not sure that this is a deal-breaker for the patch, though, if for no other reason than it'll most likely be used for autovacuum, which doesn't use parallel vacuum yet. If there are no other concerns, I'll plan on committing this one soon after a bit of editorialization. -- nathan
Re: Large expressions in indexes can't be stored (non-TOASTable)
On Wed, Sep 18, 2024 at 10:54:56AM -0400, Tom Lane wrote: > Nathan Bossart writes: >> Any objections to committing this? > > Nope. Committed. I waffled on whether to add a test for system indexes that used pg_index's varlena columns, but I ended up leaving it out. I've attached it here in case anyone thinks we should add it. -- nathan diff --git a/src/test/regress/expected/misc_sanity.out b/src/test/regress/expected/misc_sanity.out index 45e7492877..2af66cf648 100644 --- a/src/test/regress/expected/misc_sanity.out +++ b/src/test/regress/expected/misc_sanity.out @@ -60,6 +60,31 @@ ORDER BY 1, 2; pg_largeobject_metadata | lomacl| aclitem[] (9 rows) +-- pg_index +-- Look for system indexes that use varlena columns in pg_index. There +-- shouldn't be any to avoid circularity hazards with pg_index's TOAST table. +-- +-- The first query returns the varlena columns in pg_index. Each should be +-- listed in the second query's WHERE clause. +SELECT attname +FROM pg_attribute +WHERE attstorage != 'p' AND + attrelid = 'pg_catalog.pg_index'::regclass +ORDER BY 1; + attname +-- + indexprs + indpred +(2 rows) + +SELECT relname +FROM pg_class c JOIN pg_index i ON indexrelid = c.oid +WHERE c.oid < 16384 AND + (i.indexprs IS NOT NULL OR i.indpred IS NOT NULL); + relname +- +(0 rows) + -- system catalogs without primary keys -- -- Current exceptions: diff --git a/src/test/regress/sql/misc_sanity.sql b/src/test/regress/sql/misc_sanity.sql index 16f3a7c0c1..3e87c50b84 100644 --- a/src/test/regress/sql/misc_sanity.sql +++ b/src/test/regress/sql/misc_sanity.sql @@ -53,6 +53,26 @@ WHERE c.oid < 16384 AND ORDER BY 1, 2; +-- pg_index + +-- Look for system indexes that use varlena columns in pg_index. There +-- shouldn't be any to avoid circularity hazards with pg_index's TOAST table. +-- +-- The first query returns the varlena columns in pg_index. Each should be +-- listed in the second query's WHERE clause. + +SELECT attname +FROM pg_attribute +WHERE attstorage != 'p' AND + attrelid = 'pg_catalog.pg_index'::regclass +ORDER BY 1; + +SELECT relname +FROM pg_class c JOIN pg_index i ON indexrelid = c.oid +WHERE c.oid < 16384 AND + (i.indexprs IS NOT NULL OR i.indpred IS NOT NULL); + + -- system catalogs without primary keys -- -- Current exceptions:
Re: Partitioned tables and [un]loggedness
On Wed, Sep 18, 2024 at 10:17:47AM -0500, Nathan Bossart wrote: > Could we also use ATT_PARTITIONED_TABLE to remove the partitioned table > check in ATExecAddIndexConstraint()? Eh, never mind. That ends up being gross because you have to redo the relation type check in a few places. -- nathan diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 9966d1f53c..777de48c69 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -4922,7 +4922,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd, pass = AT_PASS_ADD_CONSTR; break; case AT_AddIndexConstraint: /* ADD CONSTRAINT USING INDEX */ - ATSimplePermissions(cmd->subtype, rel, ATT_TABLE | ATT_PARTITIONED_TABLE); + ATSimplePermissions(cmd->subtype, rel, ATT_TABLE); /* This command never recurses */ /* No command-specific prep needed */ pass = AT_PASS_ADD_INDEXCONSTR; @@ -5583,6 +5583,7 @@ ATParseTransformCmd(List **wqueue, AlteredTableInfo *tab, Relation rel, pass = AT_PASS_ADD_INDEX; break; case AT_AddIndexConstraint: + ATSimplePermissions(cmd2->subtype, rel, ATT_TABLE); /* This command never recurses */ /* No command-specific prep needed */ pass = AT_PASS_ADD_INDEXCONSTR; @@ -5596,6 +5597,7 @@ ATParseTransformCmd(List **wqueue, AlteredTableInfo *tab, Relation rel, case CONSTR_PRIMARY: case CONSTR_UNIQUE: case CONSTR_EXCLUSION: + ATSimplePermissions(cmd2->subtype, rel, ATT_TABLE); pass = AT_PASS_ADD_INDEXCONSTR; break; default: @@ -9208,15 +9210,6 @@ ATExecAddIndexConstraint(AlteredTableInfo *tab, Relation rel, Assert(OidIsValid(index_oid)); Assert(stmt->isconstraint); - /* -* Doing this on partitioned tables is not a simple feature to implement, -* so let's punt for now. -*/ - if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), -errmsg("ALTER TABLE / ADD CONSTRAINT USING INDEX is not supported on partitioned tables"))); - indexRel = index_open(index_oid, AccessShareLock); indexName = pstrdup(RelationGetRelationName(indexRel)); diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index cf6eac5734..bdb7731543 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -1593,7 +1593,8 @@ DROP TABLE cwi_test; CREATE TABLE cwi_test(a int) PARTITION BY hash (a); create unique index on cwi_test (a); alter table cwi_test add primary key using index cwi_test_a_idx ; -ERROR: ALTER TABLE / ADD CONSTRAINT USING INDEX is not supported on partitioned tables +ERROR: ALTER action ADD CONSTRAINT cannot be performed on relation "cwi_test" +DETAIL: This operation is not supported for partitioned tables. DROP TABLE cwi_test; -- PRIMARY KEY constraint cannot be backed by a NULLS NOT DISTINCT index CREATE TABLE cwi_test(a int, b int);
Re: Partitioned tables and [un]loggedness
On Tue, Sep 10, 2024 at 09:42:31AM +0900, Michael Paquier wrote: > On Mon, Sep 09, 2024 at 03:56:14PM +0900, Michael Paquier wrote: >> How about inventing a new ATT_PARTITIONED_TABLE and make a clean split >> between both relkinds? I'd guess that blocking both SET LOGGED and >> UNLOGGED for partitioned tables is the best move, even if it is >> possible to block only one or the other, of course. > > I gave it a try, and while it is much more invasive, it is also much > more consistent with the rest of the file. This looks reasonable to me. Could we also use ATT_PARTITIONED_TABLE to remove the partitioned table check in ATExecAddIndexConstraint()? -- nathan