Re: Changing the state of data checksums in a running cluster
Hi,

On Mon, Sep 30, 2024 at 11:21:30PM +0200, Daniel Gustafsson wrote:
> > Yeah, I think a view like pg_stat_progress_checksums would work.
>
> Added in the attached version. It probably needs some polish (the docs for
> sure do) but it's at least a start.

Just a nitpick, but we call it data_checksums almost everywhere, while the
new view is called pg_stat_progress_datachecksums. I think
pg_stat_progress_data_checksums would look better, even if it gets quite
long.

Michael
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
Hi,

On Tue, Oct 24, 2023 at 11:42:15AM -0400, Tom Lane wrote:
> Christoph Berg writes:
> > Anyway, if this doesn't raise any "oh we didn't think of this"
> > concerns, we'll just remove the old operators in pgsphere.
>
> Well, the idea was exactly to forbid that sort of setup.
> However, if we get sufficient pushback maybe we should
> reconsider --- for example, maybe it'd be sane to enforce
> the restriction in ALTER but not CREATE?
>
> I'm inclined to wait and see if there are more complaints.

FWIW, rdkit also fails, but that seems to be an ancient thing as well:

https://github.com/rdkit/rdkit/issues/7843

I guess there's no way to make that error a bit more helpful, like
printing out the offending SQL command, presumably because we are
loading an extension?

Michael
Re: pg_checksums: Reorder headers in alphabetical order
On Fri, Sep 20, 2024 at 03:20:16PM -0500, Nathan Bossart wrote:
> On Fri, Sep 20, 2024 at 01:56:16PM -0500, Nathan Bossart wrote:
> > On Fri, Sep 20, 2024 at 07:20:15PM +0200, Michael Banck wrote:
> >> I noticed two headers are not in alphabetical order in pg_checksums.c,
> >> patch attached.
> >
> > This appears to be commit 280e5f1's fault. Will fix.

Oops, that was my fault then :)

> Committed, thanks!

Thanks!

Michael
pg_checksums: Reorder headers in alphabetical order
Hi,

I noticed two headers are not in alphabetical order in pg_checksums.c,
patch attached.

Michael

>From e4d6d6503b4c14685424c6a04c5eb2ae29024bf6 Mon Sep 17 00:00:00 2001
From: Michael Banck
Date: Fri, 20 Sep 2024 19:17:43 +0200
Subject: [PATCH] Reorder headers in pg_checksums.c in alphabetical order

---
 src/bin/pg_checksums/pg_checksums.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 68a68eb0fa..f5f7ff1045 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -16,8 +16,8 @@
 #include
 #include
-#include
 #include
+#include
 #include

 #include "common/controldata_utils.h"
--
2.39.2
Re: Proposal to Enable/Disable Index using ALTER INDEX
Hi,

On Tue, Sep 10, 2024 at 10:16:34AM +1200, David Rowley wrote:
> On Tue, 10 Sept 2024 at 09:39, Shayon Mukherjee wrote:
> > Adding and removing indexes is a common operation in PostgreSQL. On
> > larger databases, however, these operations can be
> > resource-intensive. When evaluating the performance impact of one or
> > more indexes, dropping them might not be ideal since as a user you
> > may want a quicker way to test their effects without fully
> > committing to removing & adding them back again, which can be a
> > time-consuming operation on larger tables.
> >
> > Proposal:
> > I propose adding an ALTER INDEX command that allows for enabling or
> > disabling an index globally. This could look something like:
> >
> > ALTER INDEX index_name ENABLE;
> > ALTER INDEX index_name DISABLE;
> >
> > A disabled index would still receive updates and enforce constraints
> > as usual but would not be used for queries. This allows users to
> > assess whether an index impacts query performance before deciding to
> > drop it entirely.
>
> I personally think having some way to alter an index to stop it from
> being used in query plans would be very useful for the reasons you
> mentioned. I don't have any arguments against the syntax you've
> proposed. We'd certainly have to clearly document that constraints
> are still enforced. Perhaps there is some other syntax which would
> self-document slightly better. I just can't think of it right now.
>
> > Implementation:
> > To keep this simple, I suggest toggling the indisvalid flag in
> > pg_index during the enable/disable operation.
>
> That's not a good idea as it would allow ALTER INDEX ... ENABLE; to be
> used to make valid a failed concurrently created index. I think this
> would need a new flag and everywhere in the planner would need to be
> adjusted to ignore indexes when that flag is false.

How about the indislive flag instead? I haven't looked at the code, but
from the documentation ("If false, the index is in process of being
dropped, and should be ignored for all purposes") it sounds like we may
be able to piggy-back on that instead?

Michael
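[Editor's note: the mechanics discussed above can be sketched in a few lines. This is a toy model, not PostgreSQL planner code; indisvalid and indislive mirror real pg_index columns, while "enabled" is the hypothetical new flag that ALTER INDEX ... DISABLE would clear.]

```python
# Toy model of the planner-side check under discussion; not PostgreSQL code.
# Writes would still maintain a disabled index and its constraints would
# still be enforced; only candidate-path generation would skip it.

def usable_for_queries(index):
    # A failed CREATE INDEX CONCURRENTLY leaves indisvalid = false, which is
    # why reusing that flag for ENABLE/DISABLE is problematic: ALTER INDEX
    # ... ENABLE could then "validate" a broken index. Hence the separate,
    # hypothetical "enabled" flag here.
    return index["indisvalid"] and index["indislive"] and index["enabled"]

indexes = [
    {"name": "idx_live",     "indisvalid": True,  "indislive": True, "enabled": True},
    {"name": "idx_disabled", "indisvalid": True,  "indislive": True, "enabled": False},
    {"name": "idx_failed",   "indisvalid": False, "indislive": True, "enabled": True},
]

# Only idx_live survives as a planner candidate.
candidates = [i["name"] for i in indexes if usable_for_queries(i)]
print(candidates)  # ['idx_live']
```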
Re: First draft of PG 17 release notes
On Thu, Sep 05, 2024 at 09:51:25PM -0400, Bruce Momjian wrote:
> On Tue, Sep 3, 2024 at 10:44:01AM -0500, Nathan Bossart wrote:
> > While freely acknowledging that I am biased because I wrote it, I am a bit
> > surprised to see the DSM registry left out of the release notes (commit
> > 8b2bcf3, docs are here [0]). This feature is intended to allow modules to
> > allocate shared memory after startup, i.e., without requiring the module to
> > be loaded via shared_preload_libraries. IMHO that is worth mentioning.
> >
> > [0] https://www.postgresql.org/docs/devel/xfunc-c.html#XFUNC-SHARED-ADDIN-AFTER-STARTUP
>
> That seems more infrastructure/extension author stuff which isn't
> normally mentioned in the release notes.

If I understand the feature correctly, it allows extensions to simply be
CREATEd without having to add them to shared_preload_libraries first,
i.e. saving the organization an instance restart/downtime. That seems
important enough for end-users to know, even if they will need to wait
for extension authors to catch up to this (but I guess a lot will).

Michael
Re: Use streaming read API in ANALYZE
Hi,

On Thu, Sep 05, 2024 at 09:12:07PM +1200, Thomas Munro wrote:
> On Thu, Sep 5, 2024 at 6:45 PM Mats Kindahl wrote:
> > Making the ReadStream API non-opaque (that is, moving the definition
> > to the header file) would at least solve our problem (unless I am
> > mistaken). However, I am ignorant about long-term plans which might
> > affect this, so there might be a good reason to revert it for
> > reasons I am not aware of.
>
> The second thing.

I am a bit confused about the status of this thread. Robert mentioned
RC1, so I guess it pertains to v17, but I don't see it on the open item
wiki list?

Does the above mean you are going to revert it for v17, Thomas? And if
so, what exactly? The ANALYZE changes on top of the streaming read API,
or something else about that API that is being discussed on this thread?

I am also asking because this feature (i.e. "Use streaming read API in
ANALYZE") is mentioned in the release announcement, which was just
frozen for translations.

Michael
Re: allowing extensions to control planner behavior
Hi,

On Tue, Aug 27, 2024 at 03:11:15PM -0400, Robert Haas wrote:
> Third, I think there's simply a lack of critical mass in terms of our
> planner hooks. While the ability to add hypothetical indexes has some
> use, the ability to remove indexes from consideration is probably
> significantly more useful.

JFTR, hypopg can also mask away/hide indexes since version 1.4.0:

https://github.com/HypoPG/hypopg/commit/351f14a79daae8ab57339d2367d7f2fc639041f7

I haven't looked closely at the implementation though, and maybe you
meant something else entirely in the above.

Michael
Re: Enable data checksums by default
On Thu, Aug 15, 2024 at 09:49:04AM +0200, Jakub Wartak wrote:
> On Wed, Aug 7, 2024 at 4:18 PM Greg Sabino Mullane wrote:
> >
> > On Wed, Aug 7, 2024 at 4:43 AM Michael Banck wrote:
> >>
> >> I think the last time we discussed this the consensus was that the
> >> computational overhead of computing the checksums is pretty small for
> >> most systems (so the above change seems warranted regardless of whether
> >> we switch the default), but turning on wal_compression also turns on
> >> wal_log_hints, which can increase WAL by quite a lot. Maybe this is
> [..]
> >
> > Yeah, that seems something beyond this patch? Certainly we should
> > mention wal_compression in the release notes if the default changes.
> > I mean, I feel wal_log_hints should probably default to on as well,
> > but I've honestly never really given it much thought because my
> > fingers are trained to type "initdb -k". I've been using data
> > checksums for roughly a decade now. I think the only time I've NOT
> > used checksums was when I was doing checksum overhead measurements,
> > or hacking on the pg_checksums program.
>
> Maybe I don't understand something, but just to be clear:
> wal_compression (mentioned above) is not turning wal_log_hints on,
> just the wal_log_hints needs to be on when using data checksums
> (implicitly, by the XLogHintBitIsNeeded() macro). I suppose Michael
> was thinking about the wal_log_hints earlier (?)

Uh, I am pretty sure I meant to say "turning on data_checksums also
turns on wal_log_hints", sorry about the confusion.

I guess the connection is that if you turn on wal_log_hints (either
directly or via data_checksums), the number of FPIs goes up (possibly
significantly), and enabling wal_compression could (partly) remedy
that. But I agree with Greg that such a discussion is probably
out-of-scope for this default change.

Michael
Re: Improve error message for ICU libraries if pkg-config is absent
Hi,

On Wed, Aug 14, 2024 at 06:05:19PM -0700, Jeff Davis wrote:
> That looks good to me. Does anyone have a different opinion? If not,
> I'll go ahead and commit (but not backport) this change.

What is the rationale for not backpatching this? The error message
changes, but we do not translate configure output, so is it a
problem/project policy to never change configure output in the
back-branches?

If the change is too invasive, would something like the initial patch I
suggested (i.e., in the absence of pkg-config, add a more targeted
error message) be acceptable for the back-branches?

Michael
Re: Improve error message for ICU libraries if pkg-config is absent
Hi,

adding Jeff to CC as he changed the way ICU configure detection was done
in fcb21b3.

On Fri, Aug 09, 2024 at 11:59:12AM +0300, Heikki Linnakangas wrote:
> On 09/08/2024 11:16, Michael Banck wrote:
> > Hi,
> >
> > Holger Jakobs complained in pgsql-admin that in v17, if you have the ICU
> > development libraries installed but not pkg-config, you get a somewhat
> > unhelpful error message about ICU not being present:
> >
> > |checking for pkg-config... no
> > |checking whether to build with ICU support... yes
> > |checking for icu-uc icu-i18n... no
> > |configure: error: ICU library not found
> > |If you have ICU already installed, see config.log for details on the
> > |failure. It is possible the compiler isn't looking in the proper directory.
> > |Use --without-icu to disable ICU support.
> >
> > The attached patch improves things to this:
> >
> > |checking for pkg-config... no
> > |checking whether to build with ICU support... yes
> > |configure: error: ICU library not found
> > |The ICU library could not be found because pkg-config is not available, see
> > |config.log for details on the failure. If ICU is installed, the variables
> > |ICU_CFLAGS and ICU_LIBS can be set explicitly in this case, or use
> > |--without-icu to disable ICU support.
>
> Hmm, if that's a good change, shouldn't we do it for all libraries that we
> try to find with pkg-config?

Hrm, probably. I think the main difference is that libicu is checked by
default (actually since v16, see below), but the others are not, so
maybe it is less of a problem there?

So I had a further look, and the only other pkg-config checks seem to be
xml2, lz4 and zstd. Of those, lz4 and zstd do not have a custom
AC_MSG_ERROR, so there you get something more helpful like this:

|checking for pkg-config... no
[...]
|checking whether to build with LZ4 support... yes
|checking for liblz4... no
|configure: error: in `/home/mbanck/repos/postgres/postgresql/build':
|configure: error: The pkg-config script could not be found or is too old. Make sure it
|is in your PATH or set the PKG_CONFIG environment variable to the full
|path to pkg-config.
|
|Alternatively, you may set the environment variables LZ4_CFLAGS
|and LZ4_LIBS to avoid the need to call pkg-config.
|See the pkg-config man page for more details.
|
|To get pkg-config, see <http://pkg-config.freedesktop.org/>.
|See `config.log' for more details

The XML check makes the pkg-config error a no-op because there is
xml2-config, which is usually used:

| if test -z "$XML2_CONFIG" -a -n "$PKG_CONFIG"; then
|   PKG_CHECK_MODULES(XML2, [libxml-2.0 >= 2.6.23],
|     [have_libxml2_pkg_config=yes], [# do nothing])
[...]
|if test "$with_libxml" = yes ; then
|  AC_CHECK_LIB(xml2, xmlSaveToBuffer, [], [AC_MSG_ERROR([library 'xml2' (version >= 2.6.23) is required for XML support])])
|fi

So if both pkg-config and libxml2-dev are missing, this results in:

|checking for pkg-config... no
[...]
|checking whether to build with XML support... yes
|checking for xml2-config... no
[...]
|checking for xmlSaveToBuffer in -lxml2... no
|configure: error: library 'xml2' (version >= 2.6.23) is required for XML support

Whereas if only pkg-config is missing, configure goes through fine:

|checking for pkg-config... no
[...]
|checking whether to build with XML support... yes
|checking for xml2-config... /usr/bin/xml2-config
[...]
|checking for xmlSaveToBuffer in -lxml2... yes

So to summarize, I think the others are fine.

> I'm surprised the pkg.m4 module doesn't provide a nice error message already
> if pkg-config is not found. I can see some messages like that in pkg.m4. Why
> are they not printed?
>
> > Also, this should be backpatched to v17 if accepted.
>
> Did something change here in v17?

I was mistaken, things changed in v16, when ICU was checked for by
default and the explicit error message was added. Before, ICU behaved
like LZ4/ZSTD do now, i.e. if you added --with-icu explicitly you would
get the error about pkg-config not being there.

So maybe the better change is to just remove the explicit error message
again and depend on PKG_CHECK_MODULES erroring out helpfully? The
downside would be that the hint about specifying --without-icu to get
around this would disappear, but I think somebody building from source
can figure that out more easily than the subtle issue that pkg-config
is not installed. This would lead to this:

|checking for pkg-config... no
|checking whether to build with ICU support... yes
|checking for icu-uc icu-i18n... no
|configure: error: in `/home/mbanck/repos/postgres/postgresql/build':
|configure: error: The pkg-config script could not be found or is too old. Make sure it
|is in your PATH or set the PKG_CONFIG environment variable to the full
|path to pkg-config.
Re: Improve error message for ICU libraries if pkg-config is absent
Meh, forgot the attachment.

Also, this should be backpatched to v17 if accepted.

Michael

>From b696949180437a3c7307ac0509cba54828b44259 Mon Sep 17 00:00:00 2001
From: Michael Banck
Date: Fri, 9 Aug 2024 10:13:27 +0200
Subject: [PATCH] Improve configure error for ICU libraries if pkg-config is
 absent.

If pkg-config is not installed, the ICU libraries cannot be found, but
this was not mentioned in the error message and might lead to confusion
about the actual problem. To improve this, add an additional error
message for the case that pkg-config is not available.

Reported-by: Holger Jakobs
Discussion: https://www.postgresql.org/message-id/ccd579ed-4949-d3de-ab13-9e6456fd2caf%40jakobs.com
---
 configure    | 7 +++++++
 configure.ac | 7 +++++++
 2 files changed, 14 insertions(+)

diff --git a/configure b/configure
index 4f3aa44756..b3a2774f1b 100755
--- a/configure
+++ b/configure
@@ -8094,6 +8094,13 @@ $as_echo "$with_icu" >&6; }

 if test "$with_icu" = yes; then
+  if test -z "$PKG_CONFIG"; then
+    as_fn_error $? "ICU library not found
+The ICU library could not be found because pkg-config is not available, see
+config.log for details on the failure. If ICU is installed, the variables
+ICU_CFLAGS and ICU_LIBS can be set explicitly in this case, or use
+--without-icu to disable ICU support." "$LINENO" 5
+  fi

 pkg_failed=no
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for icu-uc icu-i18n" >&5
diff --git a/configure.ac b/configure.ac
index 049bc01491..18472a464a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -829,6 +829,13 @@ AC_MSG_RESULT([$with_icu])
 AC_SUBST(with_icu)

 if test "$with_icu" = yes; then
+  if test -z "$PKG_CONFIG"; then
+    AC_MSG_ERROR([ICU library not found
+The ICU library could not be found because pkg-config is not available, see
+config.log for details on the failure. If ICU is installed, the variables
+ICU_CFLAGS and ICU_LIBS can be set explicitly in this case, or use
+--without-icu to disable ICU support.])
+  fi
   PKG_CHECK_MODULES(ICU, icu-uc icu-i18n, [],
                     [AC_MSG_ERROR([ICU library not found
 If you have ICU already installed, see config.log for details on the
--
2.39.2
Improve error message for ICU libraries if pkg-config is absent
Hi,

Holger Jakobs complained in pgsql-admin that in v17, if you have the ICU
development libraries installed but not pkg-config, you get a somewhat
unhelpful error message about ICU not being present:

|checking for pkg-config... no
|checking whether to build with ICU support... yes
|checking for icu-uc icu-i18n... no
|configure: error: ICU library not found
|If you have ICU already installed, see config.log for details on the
|failure. It is possible the compiler isn't looking in the proper directory.
|Use --without-icu to disable ICU support.

The attached patch improves things to this:

|checking for pkg-config... no
|checking whether to build with ICU support... yes
|configure: error: ICU library not found
|The ICU library could not be found because pkg-config is not available, see
|config.log for details on the failure. If ICU is installed, the variables
|ICU_CFLAGS and ICU_LIBS can be set explicitly in this case, or use
|--without-icu to disable ICU support.

Michael
Re: Enable data checksums by default
On Thu, Aug 08, 2024 at 12:11:38PM +0200, Peter Eisentraut wrote:
> So I think we need to think through the upgrade experience a bit more.
> Unfortunately, pg_checksums hasn't gotten to the point that we were perhaps
> once hoping for that you could enable checksums on a live system. I'm
> thinking pg_upgrade could have a mode where it adds the checksum during the
> upgrade as it copies the files (essentially a subset of pg_checksums). I
> think that would be useful for that middle tier of users who just want a
> good default experience.

Well, that, or, as a first less ambitious step, pg_upgrade could carry
over the data_checksums setting from the old to the new instance by
essentially disabling it via pg_checksums -d (which is fast) if the
current default (off) is set on the old instance and the new instance
was created with the new one (checksums on).

It should probably include a warning or something in that case, though I
guess a lot of users will read right past it. But at least they are not
worse off than before.

Michael
Re: Enable data checksums by default
Hi,

On Tue, Aug 06, 2024 at 06:46:52PM -0400, Greg Sabino Mullane wrote:
> Please find attached a patch to enable data checksums by default.
>
> Currently, initdb only enables data checksums if passed the
> --data-checksums or -k argument. There was some hesitation years ago when
> this feature was first added, leading to the current situation where the
> default is off. However, many years later, there is wide consensus that
> this is an extraordinarily safe, desirable setting. Indeed, most (if not
> all) of the major commercial and open source Postgres systems currently
> turn this on by default. I posit you would be hard-pressed to find many
> systems these days in which it has NOT been turned on. So basically we have
> a de-facto standard, and I think it's time we flipped the switch to make it
> on by default.
[...]
> Yes, I am aware of the previous discussions on this, but the world moves
> fast - wal compression is better than in the past, vacuum is better now,
> and data-checksums being on is such a complete default in the wild, it
> feels weird and a disservice that we are not running all our tests like
> that.

I agree.

Some review on the patch:

> diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
> index bdd613e77f..511f489d34 100644
> --- a/doc/src/sgml/ref/initdb.sgml
> +++ b/doc/src/sgml/ref/initdb.sgml
> @@ -267,12 +267,14 @@ PostgreSQL documentation
>
>         Use checksums on data pages to help detect corruption by the
>         I/O system that would otherwise be silent. Enabling checksums
> -       may incur a noticeable performance penalty. If set, checksums
> +       may incur a small performance penalty. If set, checksums
>         are calculated for all objects, in all databases. All checksum

I think the last time we discussed this the consensus was that the
computational overhead of computing the checksums is pretty small for
most systems (so the above change seems warranted regardless of whether
we switch the default), but turning on wal_compression also turns on
wal_log_hints, which can increase WAL by quite a lot. Maybe this is
covered elsewhere in the documentation (I just looked at the patch),
but if not, it probably should be added here as a word of caution.

>         failures will be reported in the
>         pg_stat_database view.
>         See for details.
> +       As of version 18, checksums are enabled by default. They can be
> +       disabled by use of --no-data-checksums.

I think we usually do not mention when a feature was added/changed, do
we? So I'd just write "(default: enabled)" or whatever the style of the
surrounding options is.

> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index f00718a015..ce7d3e99e5 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -164,7 +164,7 @@ static bool noinstructions = false;
>  static bool do_sync = true;
>  static bool sync_only = false;
>  static bool show_setting = false;
> -static bool data_checksums = false;
> +static bool data_checksums = true;
>  static char *xlog_dir = NULL;
>  static int wal_segment_size_mb = (DEFAULT_XLOG_SEG_SIZE) / (1024 * 1024);
>  static DataDirSyncMethod sync_method = DATA_DIR_SYNC_METHOD_FSYNC;
> @@ -3121,6 +3121,7 @@ main(int argc, char *argv[])
>  	{"waldir", required_argument, NULL, 'X'},
>  	{"wal-segsize", required_argument, NULL, 12},
>  	{"data-checksums", no_argument, NULL, 'k'},
> +	{"no-data-checksums", no_argument, NULL, 20},

Does it make sense to add -K (capital k) as a shortcut for this? I
think this is how we distinguish on/off for pg_dump (-t/-T etc.), but
maybe that is not wider project policy.

Michael
Re: Lock-free compaction. Why not?
Hi,

On Mon, Jul 22, 2024 at 08:39:23AM -0400, Robert Haas wrote:
> What the extensions that are out there seem to do is, as I understand
> it, an online table rewrite with concurrent change capture, and then
> you apply the changes to the output table afterward. That has the
> problem that if the changes are happening faster than you can apply
> them, the operation does not terminate. But, enough people seem to be
> happy with this kind of solution that we should perhaps look harder at
> doing something along these lines in core.

I believe this is being discussed here:

https://commitfest.postgresql.org/49/5117/
https://www.postgresql.org/message-id/5186.1706694913%40antos

Michael
Re: why there is not VACUUM FULL CONCURRENTLY?
On Mon, Jul 22, 2024 at 01:23:03PM +0500, Kirill Reshke wrote:
> Also, should we create a cf entry for this thread already?

I was wondering about this as well, but there is one for the upcoming
commitfest already:

https://commitfest.postgresql.org/49/5117/

Michael
Re: Set log_lock_waits=on by default
Hi,

this patch is still on the table, though for v18 now.

Nathan mentioned up-thread that he was hesitant to commit this without
further discussion. Laurenz, Stephen and I are +1 on this, but when it
comes to committers, only Tom has chimed in so far, and he was -1. Are
there any others who have an opinion on this?

On Tue, Feb 06, 2024 at 12:29:10PM -0500, Tom Lane wrote:
> Nathan Bossart writes:
> > It looks like there are two ideas:
> >
> > * Separate log_lock_waits from deadlock_timeout. It looks like this idea
> > has a decent amount of support, but I didn't see any patch for it so far.
>
> I think the support comes from people who have not actually looked at
> the code. The reason they are not separate is that the same timeout
> service routine does both things. To pull them apart, you would have
> to (1) detangle that code and (2) incur the overhead of two timeout
> events queued for every lock wait. It's not clear to me that it's
> worth it. In some sense, deadlock_timeout is exactly the length of
> time after which you want to get concerned.
>
> > IMHO this is arguably a prerequisite for setting log_lock_waits on by
> > default, as we could then easily set it higher by default to help address
> > concerns about introducing too much noise in the logs.
>
> Well, that's the question --- would it be sane to enable
> log_lock_waits by default if we don't separate them?

I think it would be; I have not seen people change the value of
deadlock_timeout so far, and I think 1s is a reasonably long time for a
default lock wait to be reported.

> > * Set log_lock_waits on by default. The folks on this thread seem to
> > support this idea, but given the lively discussion for enabling
> > log_checkpoints by default [0], I'm hesitant to commit something like
> > this without further community discussion.
>
> I was, and remain, of the opinion that that was a bad idea that
> we'll eventually revert, just like we previously got rid of most
> inessential log chatter in the default configuration.

I somewhat agree here, in the sense that log_checkpoints is really only
useful for heavily-used servers. But this is a digression: because
log_checkpoints emits log lines periodically while log_lock_waits only
emits them on application conflicts (and arguably application bugs), I
do not think the latter would be in the "inessential log chatter"
group, similar to how SQL errors are not in that group either.

Michael
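[Editor's note: Tom's point about the shared timeout service routine can be illustrated with a toy model. This is a sketch, not the actual PostgreSQL timeout code; only the one-second default for deadlock_timeout is taken from the discussion above.]

```python
# Toy model of why log_lock_waits and the deadlock check are coupled: a
# single timer fires once per lock wait, after deadlock_timeout, and its
# handler does both jobs. Separating the two settings would require queueing
# a second timeout event per lock wait. Not actual PostgreSQL code.

def on_lock_wait_timer(log_lock_waits, deadlock_timeout=1.0):
    """Return the actions the single timeout handler performs when it fires,
    deadlock_timeout seconds into a lock wait."""
    actions = ["check for deadlock"]  # always done
    if log_lock_waits:
        # Piggy-backs on the same timer, so the log line necessarily appears
        # at the same moment as the deadlock check (1s by default).
        actions.append("log 'process still waiting for lock'")
    return actions

print(on_lock_wait_timer(log_lock_waits=True))
```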
Re: Fix the description of GUC "max_locks_per_transaction" and "max_pred_locks_per_transaction" in guc_table.c
Hi,

On Fri, Apr 07, 2023 at 01:32:22PM -0400, Tom Lane wrote:
> "wangw.f...@fujitsu.com" writes:
> > On Tues, Apr 4, 2023 at 23:48 PM Tom Lane wrote:
> >> I like the "per eligible process" wording, at least for guc_tables.c;
> >> or maybe it could be "per server process"? That would be more
> >> accurate and not much longer than what we have now.
>
> > Thanks both for sharing your opinions.
> > I agree that verbose descriptions make maintenance difficult.
> > For consistency, I unified the formulas in guc_tables.c and pg-doc into
> > the same suggested short formula. Attach the new patch.
>
> After studying this for awhile, I decided "server process" is probably
> the better term --- people will have some idea what that means, while
> "eligible process" is not a term we use anywhere else. Pushed with
> that change and some minor other wordsmithing.

I stumbled upon this change while looking at the documentation,
searching for guidance on what max_locks_per_transaction should be set
to (or rather, for a pointer that max_locks_per_transaction is not
actually "per transaction", but a shared pool of roughly
max_locks_per_transaction * max_connections locks).

While I agree that the exact formula is too verbose, I find the current
wording ("per server process or prepared transaction") misleading; I
can see how somebody reads that as a dynamic limit based on the current
number of running server processes or prepared transactions, not
something that is allocated at server start based on the respective
GUCs.

I don't have a good alternative wording for now, but I wanted to point
out that the current wording does not seem to imply that
max_{connections,prepared_transactions} are at play at all. Probably
the GUC description cannot be made much clearer without making it too
verbose, but I think the description in config.sgml has more leeway to
get a mention of max_connections back.

Michael
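[Editor's note: for illustration, the sizing the message above alludes to can be written out. This is a sketch using the formula from the PostgreSQL documentation; the values shown are the stock defaults of the GUCs involved.]

```python
# Sketch of the shared lock-table sizing discussed above: the pool is
# allocated at server start with room for roughly
#     max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lockable objects, so no single transaction is actually limited to
# max_locks_per_transaction, despite what the GUC name suggests.

def shared_lock_table_slots(max_locks_per_transaction=64,
                            max_connections=100,
                            max_prepared_transactions=0):
    return max_locks_per_transaction * (max_connections + max_prepared_transactions)

# With stock defaults a single backend could take thousands of locks by
# itself, as long as the shared pool is not exhausted overall.
print(shared_lock_table_slots())  # 6400
```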
Re: New GUC autovacuum_max_threshold ?
Hi,

On Fri, Apr 26, 2024 at 10:18:00AM +0200, Laurenz Albe wrote:
> On Fri, 2024-04-26 at 09:35 +0200, Frédéric Yhuel wrote:
> > Le 26/04/2024 à 04:24, Laurenz Albe a écrit :
> > > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:
> > > > I believe that the underlying problem here can be summarized in this
> > > > way: just because I'm OK with 2MB of bloat in my 10MB table doesn't
> > > > mean that I'm OK with 2TB of bloat in my 10TB table. One reason for
> > > > this is simply that I can afford to waste 2MB much more easily than I
> > > > can afford to waste 2TB -- and that applies both on disk and in
> > > > memory.
> > >
> > > I don't find that convincing. Why are 2TB of wasted space in a 10TB
> > > table worse than 2TB of wasted space in 100 tables of 100GB each?
> >
> > Good point, but another way of summarizing the problem would be that the
> > autovacuum_*_scale_factor parameters work well as long as we have a more
> > or less evenly distributed access pattern in the table.
> >
> > Suppose my very large table gets updated only for its 1% most recent
> > rows. We probably want to decrease autovacuum_analyze_scale_factor and
> > autovacuum_vacuum_scale_factor for this one.
> >
> > Partitioning would be a good solution, but IMHO postgres should be able
> > to handle this case anyway, ideally without per-table configuration.
>
> I agree that you may well want autovacuum and autoanalyze treat your large
> table differently from your small tables.
>
> But I am reluctant to accept even more autovacuum GUCs. It's not like
> we don't have enough of them, rather the opposite. You can slap on more
> GUCs to treat more special cases, but we will never reach the goal of
> having a default that will make everybody happy.
>
> I believe that the defaults should work well in moderately sized databases
> with moderate usage characteristics. If you have large tables or a high
> number of transactions per second, you can be expected to make the effort
> and adjust the settings for your case. Adding more GUCs makes life *harder*
> for the users who are trying to understand and configure how autovacuum works.

Well, I disagree to some degree. I agree that the defaults should work
well in moderately sized databases with moderate usage characteristics.
But I also think we can do better than telling DBAs that they have to
manually fine-tune autovacuum for large tables (frequently implementing
by hand what this patch proposes, namely setting
autovacuum_vacuum_scale_factor to 0 and autovacuum_vacuum_threshold to
a high number), as this is cumbersome and needs adult supervision that
is not always available. Of course, it would be great if we just
slapped some AI into the autovacuum launcher that figures things out
automagically, but I don't think we are there yet.

So this proposal (probably along with a higher default threshold than
50, but IMO less than what Robert and Nathan suggested) sounds like a
step forward to me. DBAs can set the threshold lower if they want, or
maybe we can just turn it off by default if we cannot agree on a sane
default, but I think this (using the simplified formula from Nathan) is
a good approach that takes some pain away from autovacuum tuning and
reserves it for the really difficult cases.

Michael
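[Editor's note: to make the numbers behind Robert's 10MB-vs-10TB framing concrete, here is a sketch of a capped trigger along the lines being discussed. The uncapped formula is the documented autovacuum trigger; the cap parameter name comes from the thread title, and its default value here is purely illustrative, not the patch's exact behavior.]

```python
# Sketch of the idea under discussion: keep the existing scale-factor based
# vacuum trigger, but cap it so very large tables are vacuumed before they
# accumulate a huge absolute amount of bloat. The 100_000_000 cap default is
# an assumption made up for this example.

def vacuum_trigger(reltuples,
                   autovacuum_vacuum_threshold=50,
                   autovacuum_vacuum_scale_factor=0.2,
                   autovacuum_max_threshold=100_000_000):
    # Documented formula: vacuum once dead tuples exceed base + scale * reltuples.
    uncapped = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
    return min(uncapped, autovacuum_max_threshold)

# A 10-billion-row table would today need ~2 billion dead tuples before
# autovacuum kicks in; with the cap it triggers at 100 million. A small
# table is unaffected by the cap.
print(vacuum_trigger(10_000_000_000))  # 100000000 (capped)
print(vacuum_trigger(1_000_000))       # 200050.0 (uncapped)
```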
Re: New GUC autovacuum_max_threshold ?
Hi,

On Fri, Apr 26, 2024 at 04:24:45AM +0200, Laurenz Albe wrote:
> On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote:
> > Another reason, at least in existing releases, is that at some
> > point index vacuuming hits a wall because we run out of space for dead
> > tuples. We *most definitely* want to do index vacuuming before we get
> > to the point where we're going to have to do multiple cycles of index
> > vacuuming.
>
> That is more convincing. But do we need a GUC for that? What about
> making a table eligible for autovacuum as soon as the number of dead
> tuples reaches 90% of what you can hold in "autovacuum_work_mem"?

Due to the improvements in v17, this would basically never trigger
according to my understanding, or at least only after an excessive
amount of bloat has accumulated.

Michael
Re: [PATCH] Exponential backoff for auth_delay
Hi, On Wed, Mar 20, 2024 at 11:22:12PM +0100, Daniel Gustafsson wrote: > > On 20 Mar 2024, at 22:21, Jacob Champion > > wrote: > > > > On Wed, Mar 20, 2024 at 2:15 PM Jacob Champion > > wrote: > >> I think solutions for case 1 and case 2 are necessarily at odds under > >> the current design, if auth_delay relies on slot exhaustion to do its > >> work effectively. Weakening that on purpose doesn't make much sense to > >> me; if a DBA is uncomfortable with the DoS implications then I'd argue > >> they need a different solution. (Which we could theoretically > >> implement, but it's not my intention to sign you up for that. :D ) > > > > The thread got quiet, and I'm nervous that I squashed it unintentionally. :/ > > > > Is there consensus on whether the backoff is useful, even without the > > host tracking? (Or, alternatively, is the host tracking helpful in a > > way I'm not seeing?) Failing those, is there a way forward that could > > make it useful in the future? > > I actually wrote more or less the same patch with rudimentary attacker > fingerprinting, and after some off-list discussion decided to abandon it for > the reasons discussed in this thread. It's unlikely to protect against the > attackers we wan't to protect the cluster against since they won't wait for > the > delay anyways. I have marked the patch "Returned with Feedback" now. Maybe I will get back to this for v18, but it was clearly not ready for v17. Michael
Re: Security lessons from liblzma
Hi, On Sun, Mar 31, 2024 at 01:05:40PM -0400, Joe Conway wrote: > But it has always bothered me how many patches get applied to the upstream > tarballs by the OS package builders. Some of them, e.g. glibc on RHEL 7, > include more than 1000 patches that you would have to manually vet if you > cared enough and had the skills. Last time I looked at the openssl package > sources it was similar in volume and complexity. They might as well be > called forks if everyone were being honest about it... I think this is more an artifact of how RHEL development works, i.e. trying to ship the same major version of glibc for 10 years while still fixing lots of bugs and adding some of the performance improvements your larger customers ask for. So I guess a lot of those 1000 patches are just cherry-picks / backports of upstream commits from newer releases. I guess it would be useful to maybe have another look at the patches that are being applied for apt/yum.postgresql.org for the 18 release cycle, but I do not think those are a security problem. Not sure about RPM builds, but at least in theory the APT builds should be reproducible. What would be a significant gain in security/trust would be an easy service/recipe for verifying reproducibility: (i) by independently building packages (and maybe the more popular extensions) and comparing them to the {apt,yum}.postgresql.org repository packages, and (ii) by being able to build the release tarballs reproducibly. Michael
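The package-comparison part of such a recipe boils down to a bit-for-bit checksum check; a minimal sketch, assuming one already has both the repository package and an independently rebuilt one on disk (file paths would be hypothetical):

```python
import hashlib

# Minimal sketch: compare an independently rebuilt package against the
# repository package bit-for-bit via SHA-256. Streaming in chunks keeps
# memory use constant for large packages.
def checksums_match(path_a, path_b):
    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()
    return sha256_of(path_a) == sha256_of(path_b)
```

A real verification service would additionally have to pin the build environment (toolchain versions, SOURCE_DATE_EPOCH and the like), since reproducibility is only meaningful relative to a documented environment.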
Re: Security lessons from liblzma
On Sat, Mar 30, 2024 at 09:52:47PM -0400, Bruce Momjian wrote: > On Sat, Mar 30, 2024 at 07:54:00PM -0400, Joe Conway wrote: > > Virtually every RPM source, including ours, contains out of tree patches > > that get applied on top of the release tarball. At least for the PGDG > > packages, it would be nice to integrate them into our git repo as build > > options or whatever so that the packages could be built without any patches > > applied to it. Add a tarball that is signed and traceable back to the git > > tag, and we would be in a much better place than we are now. > > How would someone access the out-of-tree patches? I think Debian > includes the patches in its source tarball. If you ask where they are maintained, the answer is here: https://salsa.debian.org/postgresql/postgresql/-/tree/17/debian/patches?ref_type=heads the other major versions have their own branch. Michael
Re: pg_upgrade failing for 200+ million Large Objects
Hi, On Wed, Mar 27, 2024 at 10:53:51AM +0100, Laurenz Albe wrote: > On Wed, 2024-03-27 at 10:20 +0100, Michael Banck wrote: > > Also, is there a chance this is going to be back-patched? I guess it > > would be enough if the upgrade target is v17 so it is less of a concern, > > but it would be nice if people with millions of large objects are not > > stuck until they are ready to upgrade to v17. > > It is a quite invasive patch, and it adds new features (pg_restore in > bigger transaction batches), so I think this is not for backpatching, > desirable as it may seem from the usability angle. Right, I forgot about those changes, makes sense. Michael
Re: pg_upgrade failing for 200+ million Large Objects
Hi, On Sat, Mar 16, 2024 at 06:46:15PM -0400, Tom Lane wrote: > Laurenz Albe writes: > > On Fri, 2024-03-15 at 19:18 -0400, Tom Lane wrote: > >> This patch seems to have stalled out again. In hopes of getting it > >> over the finish line, I've done a bit more work to address the two > >> loose ends I felt were probably essential to deal with: > > > Applies and builds fine. > > I didn't scrutinize the code, but I gave it a spin on a database with > > 15 million (small) large objects. I tried pg_upgrade --link with and > > without the patch on a debug build with the default configuration. > > Thanks for looking at it! > > > Without the patch: > > Runtime: 74.5 minutes > > > With the patch: > > Runtime: 70 minutes > > Hm, I'd have hoped for a bit more runtime improvement. I also think that this is quite a large runtime for pg_upgrade, but the more important savings should be the memory usage. > But perhaps not --- most of the win we saw upthread was from > parallelism, and I don't think you'd get any parallelism in a > pg_upgrade with all the data in one database. (Perhaps there is more > to do there later, but I'm still not clear on how this should interact > with the existing cross-DB parallelism; so I'm content to leave that > question for another patch.) What is the status of this? In the commitfest, this patch is marked as "Needs Review" with Nathan as reviewer - Nathan, were you going to take another look at this or was your mail from January 12th a full review? My feeling is that this patch is "Ready for Committer" and it is Tom's call to commit it during the next days or not. I am +1 that this is an important feature/bug fix to have. Because we have customers stuck on older versions due to their pathological large objects usage, I did some benchmarks (just doing pg_dump, not pg_upgrade) a while ago which were also very promising; however, I lost the exact numbers/results. I am happy to do further tests if that is required for this patch to go forward. 
Also, is there a chance this is going to be back-patched? I guess it would be enough if the upgrade target is v17 so it is less of a concern, but it would be nice if people with millions of large objects are not stuck until they are ready to upgrade to v17. Michael
Re: Possibility to disable `ALTER SYSTEM`
Hi, On Wed, Mar 20, 2024 at 08:11:32PM +0100, Magnus Hagander wrote: > (And FWIW also already solved on debian-based platforms for example, > which put the main config files in /etc with postgres only having read > permissions on them JFTR - Debian/Ubuntu keep postgresql.conf under /etc/postgresql, but that directory is owned by the postgres user by default and it can change the configuration files (if that wasn't the case, external tools like Patroni that run under the postgres user and manage postgresql.conf would work much less easily on them). Michael
Re: Possibility to disable `ALTER SYSTEM`
Hi, On Tue, Mar 19, 2024 at 10:51:50AM -0400, Tom Lane wrote: > Heikki Linnakangas writes: > > Perhaps we could make that even better with a GUC though. I propose a > > GUC called 'configuration_managed_externally = true / false". If you set > > it to true, we prevent ALTER SYSTEM and make the error message more > > definitive: > > > postgres=# ALTER SYSTEM SET wal_level TO minimal; > > ERROR: configuration is managed externally > > > As a bonus, if that GUC is set, we could even check at server startup > > that all the configuration files are not writable by the postgres user, > > and print a warning or refuse to start up if they are. > > I like this idea. The "bonus" is not optional though, because > setting the files' ownership/permissions is the only way to be > sure that the prohibition is even a little bit bulletproof. Isn't this going to break pgbackrest restore then, which (AIUI, and was mentioned upthread) writes recovery configs into postgresql.auto.conf? Or do I misunderstand the proposal? I think it would be awkward if only root users are able to run pgbackrest restore. I have added David to the CC list to make him aware of this, in case he was not following this thread. The other candidate for breakage that was mentioned was pg_basebackup -R, but I guess that could be worked around. Michael
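The "bonus" startup check mentioned above could be quite simple in principle; here is a sketch of the permission-bit part only (an illustration, not how PostgreSQL would actually implement it -- ownership, group membership, ACLs and included config files would all need more careful handling):

```python
import os
import stat

# Sketch: with configuration managed externally, refuse to start (or
# warn) if any configuration file carries any write permission bit.
# Real code would also have to check ownership, since a file owned by
# the server user can simply be chmod'ed back to writable.
def config_files_read_only(paths):
    for path in paths:
        mode = os.stat(path).st_mode
        if mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH):
            return False
    return True
```

This also illustrates the pgbackrest concern: if postgresql.auto.conf must be read-only for the server user, a restore tool running as that user can no longer write recovery settings into it.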
Re: Reports on obsolete Postgres versions
On Fri, Mar 15, 2024 at 11:17:53AM +0100, Daniel Gustafsson wrote: > > On 14 Mar 2024, at 16:48, Peter Eisentraut wrote: > > One could instead, for example, describe those as "maintenance releases": > > That might indeed be a better name for what we provide. +1
Re: Reports on obsolete Postgres versions
Hi, On Mon, Mar 11, 2024 at 05:17:13PM -0400, Bruce Momjian wrote: > On Mon, Mar 11, 2024 at 04:12:04PM -0500, Nathan Bossart wrote: > > On Mon, Mar 11, 2024 at 04:37:49PM -0400, Bruce Momjian wrote: > > > https://www.postgresql.org/support/versioning/ > > > > > > This web page should correct the idea that "upgrades are more risky than > > > staying with existing versions". Is there more we can do? Should we > > > have a more consistent response for such reporters? > > > > I've read that the use of the term "minor release" can be confusing. While > > the versioning page clearly describes what is eligible for a minor release, > > not everyone reads it, so I suspect that many folks think there are new > > features, etc. in minor releases. I think a "minor release" of Postgres is > > more similar to what other projects would call a "patch version." > > Well, we do say: > > While upgrading will always contain some level of risk, PostgreSQL > minor releases fix only frequently-encountered bugs, security issues, > and data corruption problems to reduce the risk associated with > upgrading. For minor releases, the community considers not upgrading to > be riskier than upgrading. > > but that is far down the page. Do we need to improve this? I liked the statement from Laurenz a while ago on his blog (paraphrased): "Upgrading to the latest patch release does not require application testing or recertification". I am not sure we want to put that into the official page (or maybe tone down/qualify it a bit), but I think a lot of users stay on older minor versions because they dread their internal testing policies. The other thing that could maybe be made a bit better is the fantastic patch release schedule, which however is buried in the "developer roadmap". I can see how this was useful years ago, but I think this page should be moved to the end-user part of the website, and maybe (also) integrated into the support/versioning page? Michael
Re: [DOC] Add detail regarding resource consumption wrt max_connections
Hi, On Sun, Mar 10, 2024 at 09:58:25AM -0400, Robert Treat wrote: > On Fri, Mar 8, 2024 at 10:47 AM Michael Banck wrote: > > On Fri, Jan 12, 2024 at 10:14:38PM +, Cary Huang wrote: > > > I think it is good to warn the user about the increased allocation of > > > memory for certain parameters so that they do not abuse it by setting > > > it to a huge number without knowing the consequences. > > > > Right, and I think it might be useful to log (i.e. at LOG not DEBUG3 > > level, with a nicer message) the amount of memory we allocate on > > startup, that is just one additional line per instance lifetime but > > might be quite useful to admins. Or maybe two lines if we log whether we > > could allocate it as huge pages or not as well: > > > > |2024-03-08 16:46:13.117 CET [237899] DEBUG: invoking > > IpcMemoryCreate(size=145145856) > > |2024-03-08 16:46:13.117 CET [237899] DEBUG: mmap(146800640) with > > MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory > > > > If we were going to add these details (and I very much like the idea), > I would advocate that we put it somewhere more permanent than a single > log entry at start-up. Given that database up-times easily run months > and sometimes years, it is hard to imagine we'd always have access to > the log files to figure this out on any actively running systems. Well actually, those two numbers are already available at runtime, via the shared_memory_size and (from 17 on) huge_pages_status GUCs. So this would be geared at admins that keep logs in long-term storage and want to know what the numbers were a while ago. Maybe it is not that interesting, but I think one or two lines at startup would not hurt. Michael
Re: [DOC] Add detail regarding resource consumption wrt max_connections
Hi, On Fri, Jan 12, 2024 at 10:14:38PM +, Cary Huang wrote: > I think it is good to warn the user about the increased allocation of > memory for certain parameters so that they do not abuse it by setting > it to a huge number without knowing the consequences. Right, and I think it might be useful to log (i.e. at LOG not DEBUG3 level, with a nicer message) the amount of memory we allocate on startup, that is just one additional line per instance lifetime but might be quite useful to admins. Or maybe two lines if we log whether we could allocate it as huge pages or not as well: |2024-03-08 16:46:13.117 CET [237899] DEBUG: invoking IpcMemoryCreate(size=145145856) |2024-03-08 16:46:13.117 CET [237899] DEBUG: mmap(146800640) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory > It is true that max_connections can increase the size of proc array > and other resources, which are allocated in the shared buffer, which > also means less shared buffer to perform regular data operations. AFAICT, those resources are allocated on top of shared_buffers, i.e. the total allocated memory is shared_buffers + (some resources) * max_connections + (other resources) * other_factors. > Instead of stating that higher max_connections results in higher > allocation, It may be better to tell the user that if the value needs > to be set much higher, consider increasing the "shared_buffers" > setting as well. Only if what you say above is true and I am at fault. Michael
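In other words, the allocation described above behaves roughly like the following back-of-the-envelope model -- the per-connection and fixed-overhead figures here are invented illustration values, not PostgreSQL's real accounting, which depends on version and build options:

```python
# Toy model of the allocation described above: total shared memory
# grows linearly with max_connections ON TOP of shared_buffers -- it
# does not eat into the buffer pool. The 64 kB per-connection cost and
# the fixed overhead are made-up values for illustration only.
def estimated_shared_memory(shared_buffers, max_connections,
                            per_connection=64 * 1024,
                            fixed_overhead=8 * 1024 * 1024):
    return shared_buffers + max_connections * per_connection + fixed_overhead
```

On a running server the authoritative number is available via the shared_memory_size GUC (and, from v17 on, huge_pages_status shows whether huge pages were used).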
Re: [PATCH] Exponential backoff for auth_delay
Hi, On Wed, Mar 06, 2024 at 09:34:37PM +0100, Tomas Vondra wrote: > On 3/6/24 19:24, Jacob Champion wrote: > > On Wed, Mar 6, 2024 at 8:10 AM Nathan Bossart > > wrote: > >> Assuming you are referring to the case where we run out of free slots in > >> acr_array, I'm not sure I see that as desirable behavior. Once you run out > >> of slots, all failed authentication attempts are now subject to the maximum > >> delay, which is arguably a denial-of-service scenario, albeit not a > >> particularly worrisome one. > > > > Maybe I've misunderstood the attack vector, but I thought the point of > > the feature was to deny service when the server is under attack. If we > > don't deny service, what does the feature do? I think there are two attack vectors under discussion: 1. Somebody tries to brute-force a password. The original auth_delay delays auth for a bit every time authentication fails. If you configure the delay to be very small, maybe it does not bother the attacker too much. If you configure it to be long enough, legitimate users might be annoyed when typoing their password. The suggested feature tries to help here by initially delaying authentication just a bit and then gradually increasing the delay. 2. Somebody tries to denial-of-service a server by exhausting all (remaining) available connections with failed authentication requests that are being delayed. For this attack, the suggested feature is hurting more than doing good as it potentially keeps a failed authentication attempt's connection hanging around longer than before and makes it easier to denial-of-service a server in this way. In order to at least not make case 2 worse with exponential backoff, we could maybe disable it (and just wait for auth_delay.milliseconds) once MAX_CONN_RECORDS is full. In addition, maybe MAX_CONN_RECORDS should be some fraction of max_connections, like 25%? 
> > And I may have introduced a red herring in talking about the number of > > hosts, because an attacker operating from a single host is under no > > obligation to actually wait for the authentication delay. Once we hit > > some short timeout, we can safely assume the password is wrong, > > abandon the request, and open up a new connection. That is a valid point. Maybe this could be averted if we either outright deny even a successful authentication request if the host it originates from has a max_milliseconds delay on file (i.e. has been trying to brute-force the password for a while) or at least delay a successful authentication request for some delay, if the host it originates on has a max_milliseconds delay on file (assuming it will close the connection beforehand as it thinks the password guess was wrong)? > > It seems like the thing limiting our attack is the number of > > connection slots, not MAX_CONN_RECORDS. Am I missing something > > crucial? > > Doesn't this mean this approach (as implemented) doesn't actually work > as a protection against this sort of DoS? > > If I'm an attacker, and I can just keep opening new connections for each > auth request, am I even subject to any auth delay? Yeah, but see above. > ISTM the problem lies in the fact that we apply the delay only *after* > the failed auth attempt. Which makes sense, because until now we didn't > have any state with information for new connections. But with the new > acr_array, we could check that first, and do the delay before trying to > athenticate, no? I don't think we have a hook for that yet, do we? Michael
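For reference, the delay progression under discussion can be simulated in a few lines; this is a simplified model of the proposed patch's per-host logic (the parameter names mirror the proposed GUCs), not the patch itself:

```python
# Toy simulation of the per-host exponential backoff: the delay starts
# at auth_delay.milliseconds, doubles on each consecutive failure from
# the same host, and is clamped to auth_delay.max_milliseconds. A
# successful authentication would reset the host's delay to the base
# value again.
def delays_for_failures(n_failures, milliseconds=1000, max_milliseconds=10000):
    delays = []
    delay = milliseconds
    for _ in range(n_failures):
        delays.append(min(delay, max_milliseconds))
        delay *= 2
    return delays

print(delays_for_failures(6))  # -> [1000, 2000, 4000, 8000, 10000, 10000]
```

This also makes the DoS trade-off visible: once the clamp is reached, every further failed attempt from that host ties up a connection slot for the full maximum delay.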
Re: [PATCH] Exponential backoff for auth_delay
Hi, On Mon, Mar 04, 2024 at 03:50:07PM -0500, Robert Haas wrote: > I agree that two GUCs here seems to be one more than necessary, but I > wonder whether we couldn't just say 0 means no exponential backoff and > any other value is the maximum time. Alright, I have changed it so that auth_delay.milliseconds and auth_delay.max_milliseconds are the only GUCs, their default being 0. If the latter is 0, the former's value is always taken. If the latter is non-zero and larger than the former, exponential backoff is applied with the latter's value as maximum delay. If the latter is smaller than the former then auth_delay just sets the delay to the latter, I don't think this is problem or confusing, or should this be considered a misconfiguration? > The idea that 0 means unlimited doesn't seem useful in practice. Yeah, that was more how it was coded than a real policy decision, so let's do away with it. V5 attached. Michael >From 3563d77061480b7e022255b968a39086b0cc8814 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v5] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds the new auth_delay.max_milliseconds GUC. If set (its default is 0), auth_delay adds exponential backoff with this GUC's value as maximum delay. The exponential backoff is tracked per remote host and doubled for every failed login attempt (i.e., wrong password, not just missing pg_hba line or database) and reset to auth_delay.milliseconds after a successful authentication or when no authentication attempts have been made for 5*max_milliseconds from that host. 
Authors: Michael Banck, based on an earlier patch by 成之焕 Reviewed-by: Abhijit Menon-Sen, Tomas Vondra Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com --- contrib/auth_delay/auth_delay.c | 216 ++- doc/src/sgml/auth-delay.sgml | 31 - src/tools/pgindent/typedefs.list | 1 + 3 files changed, 244 insertions(+), 4 deletions(-) diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c index ff0e1fd461..5fb123d133 100644 --- a/contrib/auth_delay/auth_delay.c +++ b/contrib/auth_delay/auth_delay.c @@ -14,24 +14,50 @@ #include #include "libpq/auth.h" +#include "miscadmin.h" #include "port.h" +#include "storage/dsm_registry.h" +#include "storage/ipc.h" +#include "storage/lwlock.h" +#include "storage/shmem.h" #include "utils/guc.h" #include "utils/timestamp.h" PG_MODULE_MAGIC; +#define MAX_CONN_RECORDS 100 + /* GUC Variables */ static int auth_delay_milliseconds = 0; +static int auth_delay_max_milliseconds = 0; /* Original Hook */ static ClientAuthentication_hook_type original_client_auth_hook = NULL; +typedef struct AuthConnRecord +{ + char remote_host[NI_MAXHOST]; + double sleep_time; /* in milliseconds */ + TimestampTz last_failed_auth; +} AuthConnRecord; + +static shmem_startup_hook_type shmem_startup_next = NULL; +static AuthConnRecord *acr_array = NULL; + +static AuthConnRecord *auth_delay_find_acr_for_host(char *remote_host); +static AuthConnRecord *auth_delay_find_free_acr(void); +static double auth_delay_increase_delay_after_failed_conn_auth(Port *port); +static void auth_delay_cleanup_conn_record(Port *port); +static void auth_delay_expire_conn_records(Port *port); + /* * Check authentication */ static void auth_delay_checks(Port *port, int status) { + double delay = auth_delay_milliseconds; + /* * Any other plugins which use ClientAuthentication_hook. 
*/ @@ -39,20 +65,190 @@ auth_delay_checks(Port *port, int status) original_client_auth_hook(port, status); /* - * Inject a short delay if authentication failed. + * We handle both STATUS_ERROR and STATUS_OK - the third option + * (STATUS_EOF) is disregarded. + * + * In case of STATUS_ERROR we inject a short delay, optionally with + * exponential backoff. + */ + if (status == STATUS_ERROR) + { + if (auth_delay_max_milliseconds > 0) + { + /* + * Delay by 2^n seconds after each authentication failure from a + * particular host, where n is the number of consecutive + * authentication failures. + */ + delay = auth_delay_increase_delay_after_failed_conn_auth(port); + + /* + * Clamp delay to a maximum of auth_delay_max_milliseconds. + */ + delay = Min(delay, auth_delay_max_milliseconds); + } + + if (delay > 0) + { + elog(DEBUG1, "Authentication delayed for %g seconds due to auth_delay", delay / 1000.0); + pg_usleep(1000L * (long) delay); + } + + /* + * Expire delays from other hosts after auth_delay_max_milliseconds * + * 5. + */ + auth_delay_expire_conn_records(port); + } + + /
Re: [PATCH] Exponential backoff for auth_delay
Hi, On Wed, Feb 21, 2024 at 10:26:11PM +0100, Tomas Vondra wrote: > Thanks for the patch. I took a closer look at v3, so let me share some > review comments. Please push back if you happen to disagree with some of > it, some of this is going to be more a matter of personal preference. Thanks! As my patch was based on a previous patch, some of the decisions were carry-overs I am not overly attached to. > 1) I think it's a bit weird to have two options specifying amount of > time, but one is in seconds and one in milliseconds. Won't that be > unnecessarily confusing? Could we do both in milliseconds? Alright, I changed that. See below for a discussion about the GUCs in general. > 2) The C code defines the GUC as auth_delay.exponential_backoff, but the > SGML docs say auth_delay.exp_backoff. Right, an oversight from the last version where the GUC name got changed but I forgot to change the documentation, fixed. > 3) Do we actually need the exponential_backoff GUC? Wouldn't it be > sufficient to have the maximum value, and if it's -1 treat that as no > backoff? That is a good question, I guess that makes sense. The next question then is: what should the default for (now) auth_delay.max_milliseconds be in this case, -1? Or do we say that as the default for auth_delay.milliseconds is 0 anyway, why would somebody not want exponential backoff when they switch it on (and keep it at the current 10s/1ms)? I have not changed that for now, pending further input. > 4) I think the SGML docs should more clearly explain that the delay is > initialized to auth_delay.milliseconds, and reset to this value after > successful authentication. The wording kinda implies that, but it's not > quite clear I think. Ok, I added some text to that end. I also added a note that auth_delay.max_milliseconds will mean that the delay doubling never stops. 
> 4) I've been really confused by the change from > > if (status != STATUS_OK) >to > if (status == STATUS_ERROR) > > in auth_delay_checks, until I realized those two codes are not the only > ones, and we intentionally ignore STATUS_EOF. I think it'd be good to > mention that in a comment, it's not quite obvious (I only realized it > because the e-mail mentioned it). Yeah I agree, I tried to explain that now. > 5) I kinda like the custom that functions in a contrib module start with > a clear common prefix, say auth_delay_ in this case. Matter of personal > preference, ofc. Ok, I changed the functions to have an auth_delay_ prefix throughout.. > 6) I'm a bit skeptical about some acr_array details. Firstly, why would > 50 entries be enough? Seems like a pretty low and arbitrary number ... > Also, what happens if someone attempts to authenticate, fails to do so, > and then disconnects and never tries again? Or just changes IP? Seems > like the entry will remain in the array forever, no? Yeah, that is how v3 of this patch worked. I have changed that now, see below. > Seems like a way to cause a "limited" DoS - do auth failure from 50 > different hosts, to fill the table, and now everyone has to wait the > maximum amount of time (if they fail to authenticate). Right, though the problem would only exist on authentication failures, so it is really rather limited. > I think it'd be good to consider: > > - Making the acr_array a hash table, and larger than 50 entries (I > wonder if that should be dynamic / customizable by GUC?). I would say a GUC should be overkill for this as this would mostly be an implementation detail. More generally, I think now that entries are expired (see below), this should be less of a concern, so I have not changed this to a hash table for now but doubled MAX_CONN_RECORDS to 100 entries. > - Make sure the entries are eventually expired, based on time (for > example after 2*max_delay?). 
I went with 5*max_milliseconds - the AuthConnRecord struct now has a last_failed_auth timestamp member; if we increase the delay for a host, we check if any other host expired in the meantime and remove it if so. > - It would be a good idea to log something when we get into the "full > table" and start delaying everyone by max_delay_seconds. (How would > anyone know that's what's happening, seems rather confusing.) Right, I added a log line for that. However, I made it LOG instead of WARNING as I don't think the client would ever see it, would he? Attached is v4 with the above changes. Cheers, Michael >From 6520fb1e768bb8bc26cafad014d08d7e9ac7f12a Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v4] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds
Re: Remove AIX Support (was: Re: Relation bulk write facility)
Hi, On Thu, Feb 29, 2024 at 12:57:31AM -0800, Andres Freund wrote: > On 2024-02-29 09:13:04 +0100, Michael Banck wrote: > > The commit message says there is not a lot of user demand and that might > > be right, but contrary to other fringe OSes that got removed like HPPA > > or Irix, I believe Postgres on AIX is still used in production and if > > so, probably in a mission-critical manner at some old-school > > institutions (in fact, one of our customers does just that) and not as a > > thought-experiment. It is probably well-known among Postgres hackers > > that AIX support is problematic/a burden, but the current users might > > not be aware of this. > > Then these users should have paid somebody to actually do maintenance work on > the AIX support, so it doesn't regularly stand in the way of implementing > various things. Right, absolutely. But: did we ever tell them to do that? I don't think it's reasonable to expect them to follow -hackers and jump in when somebody grumbles about AIX being a burden somewhere deep down a thread... Michael
Remove AIX Support (was: Re: Relation bulk write facility)
Hi, On Sat, Feb 24, 2024 at 01:29:36PM -0800, Andres Freund wrote: > Let's just drop AIX. This isn't the only alignment issue we've found and the > solution for those isn't so much a fix as forcing everyone to carefully only > look into one direction and not notice the cliffs to either side. While I am not against dropping AIX (and certainly won't step up to maintain it just for fun), I don't think burying this inside some "Relation bulk write facility" thread is helpful; I have changed the thread title as a first step. The commit message says there is not a lot of user demand and that might be right, but contrary to other fringe OSes that got removed like HPPA or Irix, I believe Postgres on AIX is still used in production and if so, probably in a mission-critical manner at some old-school institutions (in fact, one of our customers does just that) and not as a thought-experiment. It is probably well-known among Postgres hackers that AIX support is problematic/a burden, but the current users might not be aware of this. Not sure what to do about this (especially now that this has been committed), maybe there should have been a public deprecation notice first for v17... On the other hand, that might not work if important features like direct-IO would have to be bumped from v17 just because of AIX. I posted about this on Twitter and Mastodon to see whether anybody complains and did not get a lot of feedback. In any case, users will have a couple of years to migrate as usual if they upgrade to v16. Michael
Re: Oom on temp (un-analyzed table caused by JIT) V16.1 [ NOT Fixed ]
Hi, On Wed, Jan 24, 2024 at 02:50:52PM -0500, Kirk Wolak wrote: > On Mon, Jan 22, 2024 at 1:30 AM Kirk Wolak wrote: > > On Fri, Jan 19, 2024 at 7:03 PM Daniel Gustafsson wrote: > >> > On 19 Jan 2024, at 23:09, Kirk Wolak wrote: > > Thank you, that made it possible to build and run... > > UNFORTUNATELY this has a CLEAR memory leak (visible in htop) > > I am watching it already consuming 6% of my system memory. > > > Daniel, > In the previous email, I made note that once the JIT was enabled, the > problem exists in 17Devel. > I re-included my script, which forced the JIT to be used... > > I attached an updated script that forced the settings. > But this is still leaking memory (outside of the > pg_backend_memory_context() calls). > Probably because it's at the LLVM level? And it does NOT happen from > planning/opening the query. It appears I have to fetch the rows to > see the problem. I had a look at this (and blogged about it here[1]) and was also wondering what was going on with 17devel and the recent back-branch releases, cause I could also reproduce those continuing memory leaks. Adding some debug logging when llvm_inline_reset_caches() is called solves the mystery: as you are calling a function, the fix that is in 17devel and the back-branch releases is not applicable and only after the function returns llvm_inline_reset_caches() is being called (as llvm_jit_context_in_use_count is greater than zero, presumably, so it never reaches the call-site of llvm_inline_reset_caches()). If you instead run your SQL in a DO-loop (as in the blog post) and not as a PL/PgSQL function, you should see that it no longer leaks. This might be obvious to some (and Andres mentioned it in https://www.postgresql.org/message-id/20210421002056.gjd6rpe6toumiqd6%40alap3.anarazel.de) but it took me a while to figure out/find. Michael [1] https://www.credativ.de/en/blog/postgresql-en/quick-benchmark-postgresql-2024q1-release-performance-improvements/
Re: Use of backup_label not noted in log
Hi, On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote: > I would still advocate for a back patch here. It is frustrating to get logs > from users that just say: > > LOG: invalid checkpoint record > PANIC: could not locate a valid checkpoint record > > It would be very helpful to know what the checkpoint record LSN was in this > case. I agree. Michael
Re: Set log_lock_waits=on by default
Hi, On Thu, Jan 11, 2024 at 03:24:55PM +0100, Michael Banck wrote: > On Thu, Dec 21, 2023 at 02:29:05PM +0100, Laurenz Albe wrote: > > Here is a patch to implement this. > > Being stuck behind a lock for more than a second is almost > > always a problem, so it is reasonable to turn this on by default. > > I also think that this should be set to on. > > I had a look at the patch and it works fine. > > Regarding the documentation, maybe the back-reference at > deadlock_timeout could be made a bit more explicit then as well, as in > the attached patch, but this is mostly bikeshedding. I've marked it ready for committer now, as the above really is bikeshedding. Michael
Re: [PATCH] Exponential backoff for auth_delay
On Fri, Jan 19, 2024 at 03:08:36PM +0100, Michael Banck wrote: > I went through your comments (a lot of which pertained to the original > larger patch I took code from), attached is a reworked version 2. Oops, we are supposed to be at version 3, attached. Michael >From e60cdca73d384fa467f8e6becdd85d9a0c530b60 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v3] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds two new GUCs for auth_delay, exponential_backoff and max_seconds. The former controls whether exponential backoff should be used or not, the latter sets a maximum delay (default is 10s) in case exponential backoff is active. The exponential backoff is tracked per remote host and doubled for every failed login attempt (i.e., wrong password, not just missing pg_hba line or database) and reset to auth_delay.milliseconds after a successful authentication from that host. This patch is partly based on a larger (but ultimately rejected) patch by 成之焕. 
Authors: Michael Banck, 成之焕 Reviewed-by: Abhijit Menon-Sen Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com --- contrib/auth_delay/auth_delay.c | 197 ++- doc/src/sgml/auth-delay.sgml | 48 +++- src/tools/pgindent/typedefs.list | 1 + 3 files changed, 243 insertions(+), 3 deletions(-) diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c index ff0e1fd461..06d2f5f280 100644 --- a/contrib/auth_delay/auth_delay.c +++ b/contrib/auth_delay/auth_delay.c @@ -14,24 +14,49 @@ #include #include "libpq/auth.h" +#include "miscadmin.h" #include "port.h" +#include "storage/ipc.h" +#include "storage/lwlock.h" +#include "storage/shmem.h" #include "utils/guc.h" #include "utils/timestamp.h" PG_MODULE_MAGIC; +#define MAX_CONN_RECORDS 50 + /* GUC Variables */ static int auth_delay_milliseconds = 0; +static bool auth_delay_exp_backoff = false; +static int auth_delay_max_seconds = 0; /* Original Hook */ static ClientAuthentication_hook_type original_client_auth_hook = NULL; +typedef struct AuthConnRecord +{ + char remote_host[NI_MAXHOST]; + double sleep_time; /* in milliseconds */ +} AuthConnRecord; + +static shmem_startup_hook_type shmem_startup_next = NULL; +static shmem_request_hook_type shmem_request_next = NULL; +static AuthConnRecord *acr_array = NULL; + +static AuthConnRecord *find_acr_for_host(char *remote_host); +static AuthConnRecord *find_free_acr(void); +static double increase_delay_after_failed_conn_auth(Port *port); +static void cleanup_conn_record(Port *port); + /* * Check authentication */ static void auth_delay_checks(Port *port, int status) { + double delay = auth_delay_milliseconds; + /* * Any other plugins which use ClientAuthentication_hook. */ @@ -41,10 +66,146 @@ auth_delay_checks(Port *port, int status) /* * Inject a short delay if authentication failed. 
*/ - if (status != STATUS_OK) + if (status == STATUS_ERROR) { - pg_usleep(1000L * auth_delay_milliseconds); + if (auth_delay_exp_backoff) + { + /* + * Delay by 2^n seconds after each authentication failure from a + * particular host, where n is the number of consecutive + * authentication failures. + */ + delay = increase_delay_after_failed_conn_auth(port); + + /* + * Clamp delay to a maximum of auth_delay_max_seconds. + */ + if (auth_delay_max_seconds > 0) { +delay = Min(delay, 1000L * auth_delay_max_seconds); + } + } + + if (delay > 0) + { + elog(DEBUG1, "Authentication delayed for %g seconds due to auth_delay", delay / 1000.0); + pg_usleep(1000L * (long) delay); + } } + + /* + * Remove host-specific delay if authentication succeeded. + */ + if (status == STATUS_OK) + cleanup_conn_record(port); +} + +static double +increase_delay_after_failed_conn_auth(Port *port) +{ + AuthConnRecord *acr = NULL; + + acr = find_acr_for_host(port->remote_host); + + if (!acr) + { + acr = find_free_acr(); + + if (!acr) + { + /* + * No free space, MAX_CONN_RECORDS reached. Wait for the + * configured maximum amount. + */ + return 1000L * auth_delay_max_seconds; + } + strcpy(acr->remote_host, port->remote_host); + } + if (acr->sleep_time == 0) + acr->sleep_time = (double) auth_delay_milliseconds; + else + acr->sleep_time *= 2; + + return acr->sleep_time; +} + +static AuthConnRecord * +find_acr_for_host(char *remote_host) +{ + int i; + + for (i = 0; i < MAX_CONN_RECORDS; i++) + { + if (strcmp(acr_array[i].remote_host, remote_host) == 0) + return &acr_array[i]; + } + + return NULL; +} + +static AuthConnRecord * +find_free_acr(void) +{ + int i; + + for (i = 0; i < MAX_CONN_RECORDS; i++
Re: [PATCH] Exponential backoff for auth_delay
t;. Done. > (Parameter indentation doesn't match the earlier block.) I noticed that as well, but pgindent keeps changing it back to this, not sure why... > I'm not able to make up my mind if I think 10s is a good default or not. > In practice, it means that after the first three consecutive failures, > we'll delay by 10s for every subsequent failure. That sounds OK. But is > is much more useful than, for example, delaying the first three failures > by auth_delay_milliseconds and then jumping straight to max_seconds? What I had in mind is that admins would lower auth_delay.milliseconds to something like 100 or 125 when exponential_backoff is on, so that the first few (possibley honest) auth failures do not get an annoying 1 seconds penalty, but later ones then wil. In that case, 10 seconds is probably ok cause you'd need more than a handful of auth failures. I added a paragraph to the documentation to this end. > I can't really imagine wanting to increase max_seconds to, say, 128 and > keep a bunch of backends sleeping while someone's trying to brute-force > a password. And with a reasonably short max_seconds, I'm not sure if > having the backoff be _exponential_ is particularly important. > > Or maybe because this is a contrib module, we don't have to think about > it to that extent? Well, not sure. I think something like 10 seconds should be fine for most brute-force attacks in practise, and it is configurable (and turned off by default). > > diff --git a/doc/src/sgml/auth-delay.sgml b/doc/src/sgml/auth-delay.sgml > > index 0571f2a99d..2ca9528011 100644 > > --- a/doc/src/sgml/auth-delay.sgml > > +++ b/doc/src/sgml/auth-delay.sgml > > @@ -16,6 +16,17 @@ > >connection slots. > > > > > > + > > + It is optionally possible to let auth_delay wait > > longer > > + for each successive authentication failure from a particular remote > > host, if > > + the configuration parameter auth_delay.exp_backoff is > > + active. 
Once an authentication succeeded from a remote host, the > > + authentication delay is reset to the value of > > + auth_delay.milliseconds for this host. The parameter > > + auth_delay.max_seconds sets an upper bound for the > > delay > > + in this case. > > + > > How about something like this… > > If you enable exponential_backoff, auth_delay will double the delay > after each consecutive authentication failure from a particular > host, up to the given max_seconds (default: 10s). If the host > authenticates successfully, the delay is reset. Done, mostly. > > + > > + > > + auth_delay.max_seconds (integer) > > + > > + auth_delay.max_seconds configuration > > parameter > > + > > + > > + > > + > > + How many seconds to wait at most if exponential backoff is active. > > + Setting this parameter to 0 disables it. The default is 10 seconds. > > + > > + > > + > > I suggest "The maximum delay, in seconds, when exponential backoff is > enabled." Done. Michael >From e60cdca73d384fa467f8e6becdd85d9a0c530b60 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v2] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds two new GUCs for auth_delay, exponential_backoff and max_seconds. The former controls whether exponential backoff should be used or not, the latter sets an maximum delay (default is 10s) in case exponential backoff is active. The exponential backoff is tracked per remote host and doubled for every failed login attempt (i.e., wrong password, not just missing pg_hba line or database) and reset to auth_delay.milliseconds after a successful authentication from that host. This patch is partly based on a larger (but ultimately rejected) patch by 成之焕. 
Authors: Michael Banck, 成之焕 Reviewed-by: Abhijit Menon-Sen Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com --- contrib/auth_delay/auth_delay.c | 197 ++- doc/src/sgml/auth-delay.sgml | 48 +++- src/tools/pgindent/typedefs.list | 1 + 3 files changed, 243 insertions(+), 3 deletions(-) diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c index ff0e1fd461..06d2f5f280 100644 --- a/contrib/auth_delay/auth_delay.c +++ b/contrib/auth_delay/auth_delay.c @@ -14,24 +14,49 @@ #include #include "libpq/auth.h" +#include "miscadmin.h" #include "port.h" +#include "storage/ipc.h" +#include
Re: Oom on temp (un-analyzed table caused by JIT) V16.1 [Fixed Already]
Hi, On Fri, Jan 19, 2024 at 10:48:12AM +0100, Daniel Gustafsson wrote: > This does bring up an interesting point, I don't think there is a way > for a user to know whether the server is jit enabled or not (apart > from explaining a query with costs adjusted but that's not all that > userfriendly). Maybe we need a way to reliably tell if JIT is active > or not. I thought this is what pg_jit_available() is for? postgres=> SHOW jit; jit - on (1 Zeile) postgres=> SELECT pg_jit_available(); pg_jit_available -- f (1 Zeile) Michael
Re: [PATCH] New predefined role pg_manage_extensions
Hi, On Fri, Jan 12, 2024 at 04:13:27PM +0100, Jelte Fennema-Nio wrote: > But I'm not sure that such a pg_manage_extensions role would have any > fewer permissions than superuser in practice. Note that just being able to create an extension does not give blanket permission to use it. I did a few checks with things I thought might be problematic like adminpack or plpython3u, and a pg_manage_extensions user is not allowed to call those functions or use the untrusted language. > Afaik many extensions that are not marked as trusted, are not trusted > because they would allow fairly trivial privilege escalation to > superuser if they were. While that might be true (or we err on the side of caution), I thought the rationale was more that they either disclose more information about the database server than we want to disclose to ordinary users, or that they allow access to the file system etc. I think if we have extensions in contrib that trivially allow non-superusers to become superusers just by being installed, that should be a bug and be fixed by making it impossible for ordinary users to use those extensions without being granted some access to them in addition. After all, socially engineering a DBA into installing an extension due to user demand would be a thing anyway (even if most DBAs might reject it) and at least DBAs should be aware of the specific risks of a particular extension probably? Michael
[PATCH] New predefined role pg_manage_extensions
Hi, I propose to add a new predefined role to Postgres, pg_manage_extensions. The idea is that it allows Superusers to delegate the rights to create, update or delete extensions to other roles, even if those extensions are not trusted or those users are not the database owner. I have attached a WIP patch for this. Thoughts? Michael >From 59497e825184f0de30a18573ffd7d331be3b233d Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Fri, 12 Jan 2024 13:56:59 +0100 Subject: [PATCH] Add new pg_manage_extensions predefined role. This allows any role that is granted this new predefined role to CREATE, UPDATE or DROP extensions, no matter whether they are trusted or not. --- doc/src/sgml/user-manag.sgml | 5 + src/backend/commands/extension.c | 11 ++- src/include/catalog/pg_authid.dat | 5 + 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml index 92a299d2d3..ebb82801ec 100644 --- a/doc/src/sgml/user-manag.sgml +++ b/doc/src/sgml/user-manag.sgml @@ -693,6 +693,11 @@ DROP ROLE doomed_role; database to issue CREATE SUBSCRIPTION. + + pg_manage_extensions + Allow creating, removing or updating extensions, even if the + extensions are untrusted or the user is not the database owner. + diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index 226f85d0e3..71481d9a73 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -882,13 +882,14 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, ListCell *lc2; /* - * Enforce superuser-ness if appropriate. We postpone these checks until - * here so that the control flags are correctly associated with the right + * Enforce superuser-ness/membership of the pg_manage_extensions + * predefined role if appropriate. We postpone these checks until here + * so that the control flags are correctly associated with the right * script(s) if they happen to be set in secondary control files. 
*/ if (control->superuser && !superuser()) { - if (extension_is_trusted(control)) + if (extension_is_trusted(control) || has_privs_of_role(GetUserId(), ROLE_PG_MANAGE_EXTENSIONS)) switch_to_superuser = true; else if (from_version == NULL) ereport(ERROR, @@ -897,7 +898,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, control->name), control->trusted ? errhint("Must have CREATE privilege on current database to create this extension.") - : errhint("Must be superuser to create this extension."))); + : errhint("Only roles with privileges of the \"%s\" role are allowed to create this extension.", "pg_manage_extensions"))); else ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), @@ -905,7 +906,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, control->name), control->trusted ? errhint("Must have CREATE privilege on current database to update this extension.") - : errhint("Must be superuser to update this extension."))); + : errhint("Only roles with privileges of the \"%s\" role are allowed to update this extension.", "pg_manage_extensions"))); } filename = get_extension_script_filename(control, from_version, version); diff --git a/src/include/catalog/pg_authid.dat b/src/include/catalog/pg_authid.dat index 82a2ec2862..ac70603d26 100644 --- a/src/include/catalog/pg_authid.dat +++ b/src/include/catalog/pg_authid.dat @@ -94,5 +94,10 @@ rolcreaterole => 'f', rolcreatedb => 'f', rolcanlogin => 'f', rolreplication => 'f', rolbypassrls => 'f', rolconnlimit => '-1', rolpassword => '_null_', rolvaliduntil => '_null_' }, +{ oid => '8801', oid_symbol => 'ROLE_PG_MANAGE_EXTENSIONS', + rolname => 'pg_manage_extensions', rolsuper => 'f', rolinherit => 't', + rolcreaterole => 'f', rolcreatedb => 'f', rolcanlogin => 'f', + rolreplication => 'f', rolbypassrls => 'f', rolconnlimit => '-1', + rolpassword => '_null_', rolvaliduntil => '_null_' }, ] -- 2.39.2
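Assuming the patch above were applied, usage would look roughly like this (the role name ext_admin is just an example):

```sql
-- As superuser: delegate extension management to a dedicated role.
CREATE ROLE ext_admin LOGIN;
GRANT pg_manage_extensions TO ext_admin;

-- As ext_admin: now allowed even for untrusted extensions ...
CREATE EXTENSION plpython3u;
-- ... but, as discussed above, being able to install an extension does
-- not by itself grant permission to use its functions or languages.
```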
Re: plpgsql memory leaks
Hi, On Fri, Jan 12, 2024 at 01:35:24PM +0100, Pavel Stehule wrote: > pá 12. 1. 2024 v 11:54 odesílatel Michael Banck napsal: > > Which version of Postgres is this and on which platform/distribution? > > It was tested on master branch (pg 17) on Fedora 39 > > > Did you try keep jit on but set jit_inline_above_cost to 0? > > > > The back-branches have a fix for the above case, i.e. llvmjit memleaks > > that can be worked-around by setting jit_inline_above_cost=0. I got that wrong, it needs to be -1 to disable it. But if you are already running the master branch, it is probably a separate issue. Michael
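So the corrected work-around for the back branches would be (sketch):

```sql
-- -1 disables inlining entirely; 0 would instead inline everything.
SET jit_inline_above_cost = -1;
```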
Re: plpgsql memory leaks
Hi, On Fri, Jan 12, 2024 at 11:02:14AM +0100, Pavel Stehule wrote: > pá 12. 1. 2024 v 10:27 odesílatel Pavel Stehule > napsal: > > > Hi > > > > I have reported very memory expensive pattern: [...] > > takes lot of megabytes of memory too. > > The megabytes leaks are related to JIT. With JIT off the memory consumption > is significantly less although there are some others probably. I cannot readily reproduce this. Which version of Postgres is this and on which platform/distribution? Did you try keeping jit on but setting jit_inline_above_cost to 0? The back-branches have a fix for the above case, i.e. llvmjit memleaks that can be worked around by setting jit_inline_above_cost=0. Michael
Re: Set log_lock_waits=on by default
Hi, On Thu, Dec 21, 2023 at 02:29:05PM +0100, Laurenz Albe wrote: > Here is a patch to implement this. > Being stuck behind a lock for more than a second is almost > always a problem, so it is reasonable to turn this on by default. I also think that this should be set to on. I had a look at the patch and it works fine. Regarding the documentation, maybe the back-reference at deadlock_timeout could be made a bit more explicit then as well, as in the attached patch, but this is mostly bikeshedding. Michael diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 6d975ef7ab..7f9bdea50b 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -10185,11 +10185,12 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' -When is set, -this parameter also determines the amount of time to wait before -a log message is issued about the lock wait. If you are trying -to investigate locking delays you might want to set a shorter than -normal deadlock_timeout. +When is set to +on (which is the default), this parameter also +determines the amount of time to wait before a log message is issued +about the lock wait. If you are trying to investigate locking delays +you might want to set a shorter than normal +deadlock_timeout.
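For illustration, the resulting postgresql.conf settings under the proposed default would look like this (the 200ms value is only an example of a "shorter than normal" setting):

```
log_lock_waits = on        # proposed new default
deadlock_timeout = 200ms   # shorter than the 1s default, so lock waits
                           # are logged (and deadlocks checked) sooner
```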
Re: [PATCH] Exponential backoff for auth_delay
Hi, On Wed, Dec 27, 2023 at 05:19:54PM +0100, Michael Banck wrote: > This patch adds exponential backoff so that one can choose a small > initial value which gets doubled for each failed authentication attempt > until a maximum wait time (which is 10s by default, but can be disabled > if so desired). Here is a new version, hopefully fixing warnings in the documentation build, per cfbot. Michael >From 579f6ce8f968464af06a4695b7a3b66ee94716c8 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v2] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds two new GUCs for auth_delay, exp_backoff and max_seconds. The former controls whether exponential backoff should be used or not, the latter sets a maximum delay (default is 10s) in case exponential backoff is active. The exponential backoff is tracked per remote host and doubled for every failed login attempt (i.e., wrong password, not just missing pg_hba line or database) and reset to auth_delay.milliseconds after a successful authentication from that host. This patch is partly based on a larger (but ultimately rejected) patch by 成之焕. 
Authors: Michael Banck, 成之焕 Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com --- contrib/auth_delay/auth_delay.c | 202 ++- doc/src/sgml/auth-delay.sgml | 41 +++ src/tools/pgindent/typedefs.list | 1 + 3 files changed, 243 insertions(+), 1 deletion(-) diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c index 8d6e4d2778..95e56db6ec 100644 --- a/contrib/auth_delay/auth_delay.c +++ b/contrib/auth_delay/auth_delay.c @@ -14,24 +14,50 @@ #include #include "libpq/auth.h" +#include "miscadmin.h" #include "port.h" +#include "storage/ipc.h" +#include "storage/shmem.h" #include "utils/guc.h" #include "utils/timestamp.h" PG_MODULE_MAGIC; +#define MAX_CONN_RECORDS 50 + /* GUC Variables */ static int auth_delay_milliseconds = 0; +static bool auth_delay_exp_backoff = false; +static int auth_delay_max_seconds = 0; /* Original Hook */ static ClientAuthentication_hook_type original_client_auth_hook = NULL; +typedef struct AuthConnRecord +{ + char remote_host[NI_MAXHOST]; + bool used; + double sleep_time; /* in milliseconds */ +} AuthConnRecord; + +static shmem_startup_hook_type shmem_startup_next = NULL; +static shmem_request_hook_type shmem_request_next = NULL; +static AuthConnRecord *acr_array = NULL; + +static AuthConnRecord *find_conn_record(char *remote_host, int *free_index); +static double record_failed_conn_auth(Port *port); +static double find_conn_max_delay(void); +static void record_conn_failure(AuthConnRecord *acr); +static void cleanup_conn_record(Port *port); + /* * Check authentication */ static void auth_delay_checks(Port *port, int status) { + double delay; + /* * Any other plugins which use ClientAuthentication_hook. */ @@ -43,8 +69,150 @@ auth_delay_checks(Port *port, int status) */ if (status != STATUS_OK) { - pg_usleep(1000L * auth_delay_milliseconds); + if (auth_delay_exp_backoff) + { + /* + * Exponential backoff per remote host. 
+ */ + delay = record_failed_conn_auth(port); + if (auth_delay_max_seconds > 0) +delay = Min(delay, 1000L * auth_delay_max_seconds); + } + else + delay = auth_delay_milliseconds; + if (delay > 0) + { + elog(DEBUG1, "Authentication delayed for %g seconds", delay / 1000.0); + pg_usleep(1000L * (long) delay); + } + } + else + { + cleanup_conn_record(port); + } +} + +static double +record_failed_conn_auth(Port *port) +{ + AuthConnRecord *acr = NULL; + int j = -1; + + acr = find_conn_record(port->remote_host, &j); + + if (!acr) + { + if (j == -1) + + /* + * No free space, MAX_CONN_RECORDS reached. Wait as long as the + * largest delay for any remote host. + */ + return find_conn_max_delay(); + acr = &acr_array[j]; + strcpy(acr->remote_host, port->remote_host); + acr->used = true; + elog(DEBUG1, "new connection: %s, index: %d", acr->remote_host, j); + } + + record_conn_failure(acr); + return acr->sleep_time; +} + +static AuthConnRecord * +find_conn_record(char *remote_host, int *free_index) +{ + int i; + + *free_index = -1; + for (i = 0; i < MAX_CONN_RECORDS; i++) + { + if (!acr_array[i].used) + { + if (*free_index == -1) +/* record unused element */ +*free_index = i; + continue; + } + if (strcmp(acr_array[i].remote_host, remote_host) == 0) + return &acr_array[i]; + } + + return NULL; +} + +static double +find_conn_max_delay(void) +{ + int i; + double max_delay = 0.0; + + + for (i = 0; i < MAX_CONN_RECORDS; i++) + { + if (acr_array[i].used && ac
[PATCH] Exponential backoff for auth_delay
Hi, we had a conversation with a customer about security compliance a while ago and one thing they were concerned about was avoiding brute-force attempts for remote password guessing. This should not be a big concern if reasonably secure passwords are used and increasing SCRAM iteration count can also help, but generally auth_delay is recommended for this if there are concerns. This patch adds exponential backoff so that one can choose a small initial value which gets doubled for each failed authentication attempt until a maximum wait time (which is 10s by default, but can be disabled if so desired). Currently, this patch tracks remote hosts but not users, the idea being that a remote attacker likely tries several users from a particular host, but this could in theory be extended to users if there are concerns. The patch is partly based on an earlier, more ambitious attempt at extending auth_delay by 成之焕 from a year ago: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com Michael >From 4c964c866010bbdbeee9f0b2a755d97c91c5c091 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Wed, 27 Dec 2023 15:55:39 +0100 Subject: [PATCH v1] Add optional exponential backoff to auth_delay contrib module. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This adds two new GUCs for auth_delay, exp_backoff and max_seconds. The former controls whether exponential backoff should be used or not, the latter sets a maximum delay (default is 10s) in case exponential backoff is active. The exponential backoff is tracked per remote host and doubled for every failed login attempt (i.e., wrong password, not just missing pg_hba line or database) and reset to auth_delay.milliseconds after a successful authentication from that host. This patch is partly based on a larger (but ultimately rejected) patch by 成之焕. 
Authors: Michael Banck, 成之焕 Discussion: https://postgr.es/m/ahwaxacqiwivoehs5yejpqog.1.1668569845751.hmail.zhch...@ceresdata.com --- contrib/auth_delay/auth_delay.c | 202 ++- doc/src/sgml/auth-delay.sgml | 41 +++ src/tools/pgindent/typedefs.list | 1 + 3 files changed, 243 insertions(+), 1 deletion(-) diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c index 8d6e4d2778..95e56db6ec 100644 --- a/contrib/auth_delay/auth_delay.c +++ b/contrib/auth_delay/auth_delay.c @@ -14,24 +14,50 @@ #include #include "libpq/auth.h" +#include "miscadmin.h" #include "port.h" +#include "storage/ipc.h" +#include "storage/shmem.h" #include "utils/guc.h" #include "utils/timestamp.h" PG_MODULE_MAGIC; +#define MAX_CONN_RECORDS 50 + /* GUC Variables */ static int auth_delay_milliseconds = 0; +static bool auth_delay_exp_backoff = false; +static int auth_delay_max_seconds = 0; /* Original Hook */ static ClientAuthentication_hook_type original_client_auth_hook = NULL; +typedef struct AuthConnRecord +{ + char remote_host[NI_MAXHOST]; + bool used; + double sleep_time; /* in milliseconds */ +} AuthConnRecord; + +static shmem_startup_hook_type shmem_startup_next = NULL; +static shmem_request_hook_type shmem_request_next = NULL; +static AuthConnRecord *acr_array = NULL; + +static AuthConnRecord *find_conn_record(char *remote_host, int *free_index); +static double record_failed_conn_auth(Port *port); +static double find_conn_max_delay(void); +static void record_conn_failure(AuthConnRecord *acr); +static void cleanup_conn_record(Port *port); + /* * Check authentication */ static void auth_delay_checks(Port *port, int status) { + double delay; + /* * Any other plugins which use ClientAuthentication_hook. */ @@ -43,8 +69,150 @@ auth_delay_checks(Port *port, int status) */ if (status != STATUS_OK) { - pg_usleep(1000L * auth_delay_milliseconds); + if (auth_delay_exp_backoff) + { + /* + * Exponential backoff per remote host. 
+ */ + delay = record_failed_conn_auth(port); + if (auth_delay_max_seconds > 0) +delay = Min(delay, 1000L * auth_delay_max_seconds); + } + else + delay = auth_delay_milliseconds; + if (delay > 0) + { + elog(DEBUG1, "Authentication delayed for %g seconds", delay / 1000.0); + pg_usleep(1000L * (long) delay); + } + } + else + { + cleanup_conn_record(port); + } +} + +static double +record_failed_conn_auth(Port *port) +{ + AuthConnRecord *acr = NULL; + int j = -1; + + acr = find_conn_record(port->remote_host, &j); + + if (!acr) + { + if (j == -1) + + /* + * No free space, MAX_CONN_RECORDS reached. Wait as long as the + * largest delay for any remote host. + */ + return find_conn_max_delay(); + acr = &acr_array[j]; + strcpy(acr->remote_host, port->remote_host); + acr->used = true; + elog(DEBUG1, "new connection: %s, index: %d", acr->remote_host, j); + } + + record_conn_failure(ac
Re: Add --check option to pgindent
Hi, On Tue, Dec 12, 2023 at 11:18:59AM +0100, Daniel Gustafsson wrote: > > On 12 Dec 2023, at 01:09, Tristan Partin wrote: > > > > Not sold on the name, but --check is a combination of --silent-diff > > and --show-diff. I envision --check mostly being used in CI > > environments. I recently came across a situation where this behavior > > would have been useful. Without --check, you're left to capture the > > output of --show-diff and exit 2 if the output isn't empty by > > yourself. > > I wonder if we should model this around the semantics of git diff to > keep it similar to other CI jobs which often use git diff? git diff > --check means "are there conflicts or issues" which isn't really > comparable to here, git diff --exit-code however is pretty much > exactly what this is trying to accomplish. > > That would make pgindent --show-diff --exit-code exit with 1 if there > were diffs and 0 if there are no diffs. To be honest, I find that rather convoluted; contrary to "git diff", I believe the primary action of pgindent is not to show diffs, so I find the proposed --check option to be entirely reasonable from a UX perspective. On the other hand, tying a "does this need re-indenting?" question to a "--show-diff --exit-code" option combination is not very obvious (to me, at least). Cheers, Michael
Re: [HACKERS] pg_upgrade vs vacuum_cost_delay
Hi, On Fri, Nov 24, 2023 at 12:17:56PM +0100, Magnus Hagander wrote: > On Fri, Nov 24, 2023 at 11:21 AM Michael Banck wrote: > > On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote: > > > + Non-zero values of > > > + vacuum_cost_delay will delay statistics > > > generation. > > > > Now I wonder wheter vacuumdb maybe should have an option to explicitly > > force vacuum_cost_delay to 0 (I don't think it has?)? > > That's exactly what I proposed, isn't it? :) You're right, I somehow only saw your mail after I had already sent mine. To make up for this, I created a patch that implements our proposals, see attached. Michael >From b4a6e23f297994a237e35b9f96d2b806a17ba7a8 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Fri, 24 Nov 2023 13:00:31 +0100 Subject: [PATCH] Add --no-cost-delay option to vacuumdb. This sets the vacuum_cost_delay parameter to 0 and overrides the system setting (which is 0 by default as well). This can be useful in scripts where the actual value of vacuum_cost_delay is not known and vacuumdb should operate as fast as possible. If --no-cost-delay is selected in addition to --analyze-in-stages, all stages are always analyzed without cost-based vacuum. Otherwise, the previous behaviour of only overriding the system setting of vacuum_cost_delay for the first stage is retained. 
--- src/bin/scripts/vacuumdb.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c index dd0d51659b..67373c6506 100644 --- a/src/bin/scripts/vacuumdb.c +++ b/src/bin/scripts/vacuumdb.c @@ -47,6 +47,7 @@ typedef struct vacuumingOptions bool process_toast; bool skip_database_stats; char *buffer_usage_limit; + bool no_cost_delay; } vacuumingOptions; /* object filter options */ @@ -127,6 +128,7 @@ main(int argc, char *argv[]) {"no-process-toast", no_argument, NULL, 11}, {"no-process-main", no_argument, NULL, 12}, {"buffer-usage-limit", required_argument, NULL, 13}, + {"no-cost-delay", no_argument, NULL, 14}, {NULL, 0, NULL, 0} }; @@ -157,6 +159,7 @@ main(int argc, char *argv[]) vacopts.do_truncate = true; vacopts.process_main = true; vacopts.process_toast = true; + vacopts.no_cost_delay = false; pg_logging_init(argv[0]); progname = get_progname(argv[0]); @@ -274,6 +277,9 @@ main(int argc, char *argv[]) case 13: vacopts.buffer_usage_limit = escape_quotes(optarg); break; + case 14: +vacopts.no_cost_delay = true; +break; default: /* getopt_long already emitted a complaint */ pg_log_error_hint("Try \"%s --help\" for more information.", progname); @@ -508,6 +514,11 @@ vacuum_one_database(ConnParams *cparams, "SET default_statistics_target=10; RESET vacuum_cost_delay;", "RESET default_statistics_target;" }; + const char *stage_commands_no_cost_delay[] = { + "SET default_statistics_target=1; SET vacuum_cost_delay=0;", + "SET default_statistics_target=10; SET vacuum_cost_delay=0", + "RESET default_statistics_target; SET vacuum_cost_delay=0" + }; const char *stage_messages[] = { gettext_noop("Generating minimal optimizer statistics (1 target)"), gettext_noop("Generating medium optimizer statistics (10 targets)"), @@ -799,10 +810,22 @@ vacuum_one_database(ConnParams *cparams, * ourselves before setting up the slots. 
*/ if (stage == ANALYZE_NO_STAGE) - initcmd = NULL; + { + /* Switch off vacuum_cost_delay, if requested */ + if (vacopts->no_cost_delay) + { + initcmd = "SET vacuum_cost_delay=0;"; + executeCommand(conn, initcmd, echo); + } + else + initcmd = NULL; + } else { - initcmd = stage_commands[stage]; + if (vacopts->no_cost_delay) + initcmd = stage_commands_no_cost_delay[stage]; + else + initcmd = stage_commands[stage]; executeCommand(conn, initcmd, echo); } @@ -1171,6 +1194,8 @@ help(const char *progname) printf(_(" --no-process-main skip the main relation\n")); printf(_(" --no-process-toast skip the TOAST table associated with the table to vacuum\n")); printf(_(" --no-truncate don't truncate empty pages at the end of the table\n")); + printf(_(" --no-cost-delay force cost-based vacuum to off, overriding the\n" + " vacuum_cost_delay configuration parameter\n")); printf(_(" -n, --schema=SCHEMA vacuum tables in the specified schema(s) only\n")); printf(_(" -N, --exclude-schema=SCHEMA do not vacuum tables in the specified schema(s)\n")); printf(_(" -P, --parallel=PARALLEL_WORKERS use this many background workers for vacuum, if available\n")); -- 2.39.2
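The stage handling the diff above implements can be sketched as follows. This is an illustrative Python model of the patch's C logic, not vacuumdb code: the function name and the None-for-ANALYZE_NO_STAGE convention are mine, and trailing semicolons are normalized.

```python
# SQL issued before each --analyze-in-stages stage, following the patch:
# without --no-cost-delay, only stage 1 forces vacuum_cost_delay=0.
STAGE_COMMANDS = [
    "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
    "SET default_statistics_target=10; RESET vacuum_cost_delay;",
    "RESET default_statistics_target;",
]
# With --no-cost-delay, every stage runs with vacuum_cost_delay=0.
STAGE_COMMANDS_NO_COST_DELAY = [
    "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
    "SET default_statistics_target=10; SET vacuum_cost_delay=0;",
    "RESET default_statistics_target; SET vacuum_cost_delay=0;",
]

def init_command(stage, no_cost_delay):
    """Return the setup SQL for a stage (None = plain vacuum, no setup)."""
    if stage is None:  # corresponds to ANALYZE_NO_STAGE in the patch
        return "SET vacuum_cost_delay=0;" if no_cost_delay else None
    commands = STAGE_COMMANDS_NO_COST_DELAY if no_cost_delay else STAGE_COMMANDS
    return commands[stage]
```

The key behavioural difference is visible in stages 2 and 3: the existing code RESETs vacuum_cost_delay back to the system value after the first stage, while the new option keeps it pinned to 0 throughout.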
Re: [HACKERS] pg_upgrade vs vacuum_cost_delay
Hi,

On Wed, Nov 22, 2023 at 11:23:34PM -0500, Bruce Momjian wrote:
> + Non-zero values of
> + vacuum_cost_delay will delay statistics generation.

Now I wonder whether vacuumdb maybe should have an option to explicitly
force vacuum_cost_delay to 0 (I don't think it has?)?

Michael
Re: Version 14/15 documentation Section "Alter Default Privileges"
Hi, On Tue, Nov 07, 2023 at 05:30:20PM -0500, Bruce Momjian wrote: > On Mon, Nov 6, 2023 at 09:53:50PM +0100, Laurenz Albe wrote: > > + > > + There is no way to change the default privileges for objects created by > > + arbitrary roles. You have run ALTER DEFAULT > > PRIVILEGES > > I find the above sentence odd. What is its purpose? I guess it is to address the main purpose of this patch/confusion with users: they believe setting DEFAULT PRIVILEGES will set grants accordingly for all objects created in the future, no matter who creates them. So hammering in that this is not the case seems fine from my side (modulo the "You have to run" typo). Michael
Re: Version 14/15 documentation Section "Alter Default Privileges"
Hi,

On Sat, Nov 04, 2023 at 09:14:01PM -0400, Bruce Momjian wrote:
> + There is no way to change the default privileges for objects created by
> + any role. You have run ALTER DEFAULT PRIVILEGES for all
> + roles that can create objects whose default privileges should be modified.

That second sentence is broken; it should be "You have to run [...]", I
think.

Michael
Re: [patch] pg_basebackup: mention that spread checkpoints are the default in --help
Hi, On Tue, Oct 31, 2023 at 04:59:24PM +0530, Shlok Kyal wrote: > I went through the Cfbot for this patch and found out that the build > is failing with the following error (Link: > https://cirrus-ci.com/task/4648506929971200?logs=build#L1217): Oops, sorry. Attached is a working third version of this patch. Michael >From bc9eb46a49ee514236aabe42d9689a7c35b5bcd7 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Thu, 19 Oct 2023 11:37:11 +0200 Subject: [PATCH v3] pg_basebackup: Mention that spread checkpoints are the default in --help --- src/bin/pg_basebackup/pg_basebackup.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index f32684a8f2..b0ac77b4ce 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -407,7 +407,8 @@ usage(void) printf(_(" -Z, --compress=nonedo not compress tar output\n")); printf(_("\nGeneral options:\n")); printf(_(" -c, --checkpoint=fast|spread\n" - " set fast or spread checkpointing\n")); + " set fast or spread checkpointing\n" + " (default: spread)\n")); printf(_(" -C, --create-slot create replication slot\n")); printf(_(" -l, --label=LABEL set backup label\n")); printf(_(" -n, --no-clean do not clean up after errors\n")); -- 2.39.2
Re: Version 14/15 documentation Section "Alter Default Privileges"
Hi, On Fri, Oct 27, 2023 at 05:49:42PM +0200, Laurenz Albe wrote: > On Fri, 2023-10-27 at 11:34 +0200, Michael Banck wrote: > > On Fri, Oct 27, 2023 at 09:03:04AM +0200, Laurenz Albe wrote: > > > On Fri, 2022-11-04 at 10:49 +0100, Laurenz Albe wrote: > > > > On Thu, 2022-11-03 at 11:32 +0100, Laurenz Albe wrote: > > > > > On Wed, 2022-11-02 at 19:29 +, David Burns wrote: > > > > > > Some additional clarity in the versions 14/15 documentation > > > > > > would be helpful specifically surrounding the "target_role" > > > > > > clause for the ALTER DEFAULT PRIVILEGES command. To the > > > > > > uninitiated, the current description seems vague. Maybe > > > > > > something like the following would help: > > > > > > > > After some more thinking, I came up with the attached patch. > > > > > I think something like this is highly useful because I have also seen > > people very confused why default privileges are not applied. > > > > However, maybe it could be made even clearer if also the main > > description is amended, like > > > > "You can change default privileges only for objects that will be created > > by yourself or by roles that you are a member of (via target_role)." > > > > or something. > > True. I have done that in the attached patch. > In this patch, it is mentioned *twice* that ALTER DEFAULT PRIVILEGES > only affects objects created by the current user. I thought that > would not harm, but if it is too redundant, I can remove the second > mention. I think it is fine, and I have marked the patch as ready-for-committer. I think it should be applied to all branches, not just 14/15 as mentioned in the subject. Michael
Re: Version 14/15 documentation Section "Alter Default Privileges"
Hi, On Fri, Oct 27, 2023 at 09:03:04AM +0200, Laurenz Albe wrote: > On Fri, 2022-11-04 at 10:49 +0100, Laurenz Albe wrote: > > On Thu, 2022-11-03 at 11:32 +0100, Laurenz Albe wrote: > > > On Wed, 2022-11-02 at 19:29 +, David Burns wrote: > > > > > > > Some additional clarity in the versions 14/15 documentation > > > > would be helpful specifically surrounding the "target_role" > > > > clause for the ALTER DEFAULT PRIVILEGES command. To the > > > > uninitiated, the current description seems vague. Maybe > > > > something like the following would help: > > > > After some more thinking, I came up with the attached patch. > > I'm sending a reply to the hackers list, so that I can add the patch > to the commitfest. I think something like this is highly useful because I have also seen people very confused why default privileges are not applied. However, maybe it could be made even clearer if also the main description is amended, like "You can change default privileges only for objects that will be created by yourself or by roles that you are a member of (via target_role)." or something. Michael
Re: [patch] pg_basebackup: mention that spread checkpoints are the default in --help
Hi,

On Wed, Oct 25, 2023 at 04:36:41PM +0200, Peter Eisentraut wrote:
> On 19.10.23 11:39, Michael Banck wrote:
> > Hi,
> >
> > I believed that spread (not fast) checkpoints are the default in
> > pg_basebackup, but noticed that --help does not specify which is which -
> > contrary to the reference documentation.
> >
> > So I propose the small attached patch to clarify that.
>
> >   printf(_(" -c, --checkpoint=fast|spread\n"
> > -          " set fast or spread checkpointing\n"));
> > +          " set fast or spread (default) checkpointing\n"));
>
> Could we do like
>
>  -c, --checkpoint=fast|spread
>      set fast or spread checkpointing
>      (default: spread)
>
> This seems to be easier to read.

Yeah, we could do that. But then again the question pops up what to do
about the other option that mentions defaults (-F) and the others which
have a default but it is not spelt out yet (-X, -Z at least) (output is
still from v15, additional options have been added since):

 -F, --format=p|t       output format (plain (default), tar)
 -X, --wal-method=none|fetch|stream
                        include required WAL files with specified method
 -Z, --compress=0-9     compress tar output with given compression level

So, my personal opinion is that we should really document -c because it
is quite user-noticeable compared to the others. So attached is a new
version with just your proposed change for now.
Michael >From 817f71f2eaa8814a30320bc1ef97c1ec8a95f083 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Thu, 19 Oct 2023 11:37:11 +0200 Subject: [PATCH v2] pg_basebackup: Mention that spread checkpoints are the default in --help --- src/bin/pg_basebackup/pg_basebackup.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 1a8cef345d..9957fb4f54 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -408,6 +408,7 @@ usage(void) printf(_("\nGeneral options:\n")); printf(_(" -c, --checkpoint=fast|spread\n" " set fast or spread checkpointing\n")); + " (default: spread)\n")); printf(_(" -C, --create-slot create replication slot\n")); printf(_(" -l, --label=LABEL set backup label\n")); printf(_(" -n, --no-clean do not clean up after errors\n")); -- 2.39.2
Re: missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 packages)
Hi, On Sun, Oct 22, 2023 at 11:00:07AM +0200, André Verwijs wrote: > missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 > packages) > > --- > > YaST2 conflicts list - generated 2023-10-22 10:30:07 > > there is no package providing 'libldap_r-2.4.so.2())(64bit)' required by > installing postgresql16-server-16.0-1PGDG.sles15.x86_64 Those are the packages from zypp.postgresql.org, right? There is a link to the issue tracker at https://redmine.postgresql.org/projects/pgrpms/ from the home page, I think it would best to report the problem there. Michael
Re: [patch] pg_basebackup: mention that spread checkpoints are the default in --help
Hi, On Thu, Oct 19, 2023 at 04:21:19PM +0300, Aleksander Alekseev wrote: > > I believed that spread (not fast) checkpoints are the default in > > pg_basebackup, but noticed that --help does not specify which is which - > > contrary to the reference documentation. > > > > So I propose the small attached patch to clarify that. > > You are right and I believe this is a good change. > > Maybe we should also display the defaults for -X, > --manifest-checksums, etc for consistency. Hrm right, but those have multiple options and they do not enumerate them in the help string as do -F and -c - not sure what general project policy here is for mentioning defaults in --help, I will check some of the other commands. Michael
[patch] pg_basebackup: mention that spread checkpoints are the default in --help
Hi, I believed that spread (not fast) checkpoints are the default in pg_basebackup, but noticed that --help does not specify which is which - contrary to the reference documentation. So I propose the small attached patch to clarify that. Michael >From 2fc49eae5ccc82e144c3f683689757e014e331bd Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Thu, 19 Oct 2023 11:37:11 +0200 Subject: [PATCH] pg_basebackup: Mention that spread checkpoints are the default in --help --- src/bin/pg_basebackup/pg_basebackup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 1a8cef345d..f2cf38a773 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -407,7 +407,7 @@ usage(void) printf(_(" -Z, --compress=nonedo not compress tar output\n")); printf(_("\nGeneral options:\n")); printf(_(" -c, --checkpoint=fast|spread\n" - " set fast or spread checkpointing\n")); + " set fast or spread (default) checkpointing\n")); printf(_(" -C, --create-slot create replication slot\n")); printf(_(" -l, --label=LABEL set backup label\n")); printf(_(" -n, --no-clean do not clean up after errors\n")); -- 2.39.2
Re: Rename backup_label to recovery_control
On Mon, Oct 16, 2023 at 11:15:53AM -0400, David Steele wrote:
> On 10/16/23 10:19, Robert Haas wrote:
> > We got rid of exclusive backup mode. We replaced pg_start_backup
> > with pg_backup_start.
>
> I do think this was an improvement. For example it allows us to do
> [1], which I believe is a better overall solution to the problem of
> torn reads of pg_control. With exclusive backup we would not have this
> option.

Well, maybe, but it also means that third-party (i.e. not
Postgres-specific) backup tools seem to only support Postgres up to
version 14, as they cannot deal with non-exclusive mode - they are used
to a simple pre/post-script approach.

Not sure what to do about this, but as people/companies start moving to
15, I am afraid we will get people complaining about this. I think
having exclusive mode still be the default for pg_start_backup() (albeit
deprecated) in one release and then dropping it in the next was too
fast.

Or is somebody helping those "enterprise" backup solutions along in
implementing non-exclusive Postgres backups?

Michael
Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?
Hi, On Mon, Jul 10, 2023 at 02:37:24PM -0700, Nikolay Samokhvalov wrote: > On Mon, Jul 10, 2023 at 2:02 PM Michael Banck wrote: > > > +++ b/doc/src/sgml/ref/pgupgrade.sgml > > > @@ -380,22 +380,28 @@ NET STOP postgresql-&majorversion; > > > > > > > > > > > > - Streaming replication and log-shipping standby servers can > > > + Streaming replication and log-shipping standby servers must > > > remain running until a later step. > > > > > > > > > > > > - > > > + > > > > > > > > > - If you are upgrading standby servers using methods outlined in > > section > > - linkend="pgupgrade-step-replicas"/>, verify that the old standby > > > - servers are caught up by running > > pg_controldata > > > - against the old primary and standby clusters. Verify that the > > > - Latest checkpoint location values match in all > > clusters. > > > - (There will be a mismatch if old standby servers were shut down > > > - before the old primary or if the old standby servers are still > > running.) > > > + If you are upgrading standby servers using methods outlined in > > > + , > > > > You dropped the "section" before the xref, I think that should be kept > > around. > > Seems to be a problem in discussing source code that looks quite different > than the final result. > > In the result – the docs – we currently have "section Step 9", looking > weird. I still think it's good to remove it. We also have "in Step 17 > below" (without the extra word "section") in a different place on the same > page. Ok. > > > +ensure that they were > > running when > > > + you shut down the primaries in the previous step, so all the > > latest changes > > > > You talk of primaries in plural here, that is a bit weird for PostgreSQL > > documentation. > > The same docs already discuss two primaries ("8. Stop both primaries"), but > I agree it might look confusing if you read only a part of the doc, jumping > into middle of it, like I do all the time when using the docs in "check the > reference" style. [...] 
> > I think this should be something like "ensure both that the primary is
> > shut down and that the standbys have received all the changes".
>
> Well, we have two primary servers – old and new. I tried to clarify it in
> the new version.

Yeah, sorry about that; I think I should first have had coffee and/or
slept over this review before sending it.

Maybe one reason why I was confused is because I consider a "primary"
more like a full server/VM, not necessarily a database instance (though
one could of course have a primary/secondary pair on the same host).
Possibly "primary instances" or something might be clearer, but I think
I should re-read the whole section first before commenting further.

Michael
Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?
Hi, On Mon, Jul 10, 2023 at 01:36:39PM -0700, Nikolay Samokhvalov wrote: > On Fri, Jul 7, 2023 at 6:31 AM Stephen Frost wrote: > > * Nikolay Samokhvalov (n...@postgres.ai) wrote: > > > But this can happen with anyone who follows the procedure from the docs > > as > > > is and doesn't do any additional steps, because in step 9 "Prepare for > > > standby server upgrades": > > > > > > 1) there is no requirement to follow specific order to shut down the > > nodes > > >- "Streaming replication and log-shipping standby servers can remain > > > running until a later step" should probably be changed to a > > > requirement-like "keep them running" > > > > Agreed that it would be good to clarify that the primary should be shut > > down first, to make sure everything written by the primary has been > > replicated to all of the replicas. > > Thanks! > > Here is a patch to fix the existing procedure description. Thanks for that! > I agree with Andrey – without it, we don't have any good way to upgrade > large clusters in short time. Default rsync mode (without "--size-only") > takes a lot of time too, if the load is heavy. > > With these adjustments, can "rsync --size-only" remain in the docs as the > *fast* and safe method to upgrade standbys, or there are still some > concerns related to corruption risks? I hope somebody can answer that definitively, but I read Stephen's mail to indicate that this procedure should be safe in principle (if you know what you are doing). > From: Nikolay Samokhvalov > Date: Mon, 10 Jul 2023 20:07:18 + > Subject: [PATCH] Improve major upgrade docs Maybe mention standby here, like "Improve major upgrade documentation for standby servers". > +++ b/doc/src/sgml/ref/pgupgrade.sgml > @@ -380,22 +380,28 @@ NET STOP postgresql-&majorversion; > > > > - Streaming replication and log-shipping standby servers can > + Streaming replication and log-shipping standby servers must > remain running until a later step. 
> > > > - > + > > > - If you are upgrading standby servers using methods outlined in section > - linkend="pgupgrade-step-replicas"/>, verify that the old standby > - servers are caught up by running > pg_controldata > - against the old primary and standby clusters. Verify that the > - Latest checkpoint location values match in all clusters. > - (There will be a mismatch if old standby servers were shut down > - before the old primary or if the old standby servers are still running.) > + If you are upgrading standby servers using methods outlined in > + , You dropped the "section" before the xref, I think that should be kept around. > +ensure that they were > running when > + you shut down the primaries in the previous step, so all the latest > changes You talk of primaries in plural here, that is a bit weird for PostgreSQL documentation. > + and the shutdown checkpoint record were received. You can verify this > by running > + pg_controldata against the old primary and > standby > + clusters. The Latest checkpoint location values must > match in all > + nodes. A mismatch might occur if old standby servers were shut down > before > + the old primary. To fix a mismatch, start all old servers and return to > the > + previous step; proceeding with mismatched > + Latest checkpoint location may lead to standby > corruption. > + > + > + > Also, make sure wal_level is not set to > minimal in the postgresql.conf > file on the > new primary cluster. > @@ -497,7 +503,6 @@ pg_upgrade.exe > linkend="warm-standby"/>) standby servers, you can follow these steps to > quickly upgrade them. You will not be running > pg_upgrade on > the standby servers, but rather rsync on the > primary. > - Do not start any servers yet. > > > > @@ -508,6 +513,15 @@ pg_upgrade.exe > is running. 
> > > + > + Before running rsync, to avoid standby corruption, it is absolutely > + critical to ensure that both primaries are shut down and standbys > + have received the last changes (see linkend="pgupgrade-step-prepare-standbys"/>). I think this should be something like "ensure both that the primary is shut down and that the standbys have received all the changes". > + Standbys can be running at this point or fully stopped. "or be fully stopped." I think. > + If they > + are still running, you can stop, upgrade, and start them one by one; > this > + can be useful to keep the cluster open for read-only transactions. > + Maybe this is clear from the context, but "upgrade" in the above should maybe more explicitly refer to the rsync method or else people might think one can run pg_upgrade on them after all? Michael
Re: Stampede of the JIT compilers
Hi,

On Sat, Jun 24, 2023 at 01:54:53PM -0400, Tom Lane wrote:
> I don't know whether raising the default would be enough to fix that
> in a nice way, and I certainly don't pretend to have a specific value
> to offer. But it's undeniable that we have a serious problem here,
> to the point where JIT is a net negative for quite a few people.

Some further data: to my knowledge, most major managed Postgres
providers disable JIT for their users. Azure certainly does, but I don't
have a Google Cloud SQL or RDS instance running right now to verify
their settings. I do seem to remember that they did as well though, at
least a while back.

Michael
Re: deb’s pg_upgradecluster(1) vs streaming replication
Hi,

On Sat, Jun 17, 2023 at 07:10:23PM -0400, James Cloos wrote:
> Has anyone recently tried updating a streaming replication cluster using
> debian's pg_upgradecluster(1) on each node?

Note that the word "cluster" in pg_upgradecluster refers to a single
Postgres instance, a.k.a. a cluster of databases. It is not designed to
upgrade streaming replication clusters.

> Did things work well?
>
> My last attempt (11 to 13, as I recall) had issues and I had to drop and
> re-install the db on the secondaries.
>
> I'd like to avoid that this time...
>
> Should I expect things to work easily?

No, you need to either rebuild the secondaries or use the rsync method
from the documentation to resync them. The latter is complicated and
iffy though, and not generally recommended, I believe.

Michael
Re: Amcheck verification of GiST and GIN
Hi,

On Thu, Feb 02, 2023 at 12:56:47PM -0800, Nikolay Samokhvalov wrote:
> On Thu, Feb 2, 2023 at 12:43 PM Peter Geoghegan wrote:
> > I think that that problem should be solved at a higher level, in the
> > program that runs amcheck. Note that pg_amcheck will already do this
> > for B-Tree indexes.
>
> That's a great tool, and it's great it supports parallelization, very useful
> on large machines.

Right, but unfortunately not an option on managed services. It's clear
that this restriction should not be a general guideline for Postgres
development, but it makes the amcheck extension (which is now shipped
everywhere due to being in-core, I believe) somewhat less useful for the
use case of checking your whole database for corruption.

Michael
Re: Support load balancing in libpq
Hi,

On Tue, Nov 29, 2022 at 02:57:08PM +0000, Jelte Fennema wrote:
> Attached is a new version with the tests cleaned up a bit (more
> comments mostly).
>
> @Michael, did you have a chance to look at the last version? Because I
> feel that the patch is pretty much ready for a committer to look at,
> at this point.

I had another look; it still applies and tests pass. I still find the
distribution over three hosts a bit skewed (even when OpenSSL is
enabled, which wasn't the case when I first tested it), but couldn't
find a general fault and it worked well enough in my testing.

I wonder whether making the parameter a boolean will paint us into a
corner, and whether maybe additional modes could be envisioned in the
future, but I can't think of any right now (you can pretty neatly
restrict load-balancing to standbys by setting
target_session_attrs=standby in addition). Maybe a way to change the
behaviour if a DNS hostname is backed by multiple entries?

Some further (mostly nitpicking) comments on the patch:

> From 6e20bb223012b666161521b5e7249c066467a5f3 Mon Sep 17 00:00:00 2001
> From: Jelte Fennema
> Date: Mon, 12 Sep 2022 09:44:06 +0200
> Subject: [PATCH v5] Support load balancing in libpq
>
> Load balancing connections across multiple read replicas is a pretty
> common way of scaling out read queries. There are two main ways of doing
> so, both with their own advantages and disadvantages:
> 1. Load balancing at the client level
> 2. Load balancing by connecting to an intermediary load balancer
>
> Both JDBC (Java) and Npgsql (C#) already support client level load
> balancing (option #1). This patch implements client level load balancing
> for libpq as well. To stay consistent with the JDBC and Npgsql part of
> the ecosystem, a similar implementation and name for the option are
> used.

I still think all of the above has no business in the commit message,
though maybe the first paragraph can stay as introduction.
The given hosts are randomly shuffled, before resolving them > one-by-one. > 2. Once a host its addresses get resolved, those addresses are shuffled, > before trying to connect to them one-by-one. That's fine. What should be in the commit message is at least a mention of what the new connection parameter is called and possibly what is done to accomplish it. But the committer will pick this up if needed I guess. > diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml > index f9558dec3b..6ce7a0c9cc 100644 > --- a/doc/src/sgml/libpq.sgml > +++ b/doc/src/sgml/libpq.sgml > @@ -1316,6 +1316,54 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname > > > > + xreflabel="load_balance_hosts"> > + load_balance_hosts > + > + > +Controls whether the client load balances connections across hosts > and > +adresses. The default value is 0, meaning off, this means that hosts > are > +tried in order they are provided and addresses are tried in the order > +they are received from DNS or a hosts file. If this value is set to > 1, > +meaning on, the hosts and addresses that they resolve to are tried in > +random order. Subsequent queries once connected will still be sent to > +the same server. Setting this to 1, is mostly useful when opening > +multiple connections at the same time, possibly from different > machines. > +This way connections can be load balanced across multiple Postgres > +servers. > + > + > +When providing multiple hosts, these hosts are resolved in random > order. > +Then if that host resolves to multiple addresses, these addresses are > +connected to in random order. Only once all addresses for a single > host > +have been tried, the addresses for the next random host will be > +resolved. This behaviour can lead to non-uniform address selection in > +certain cases. Such as when not all hosts resolve to the same number > of > +addresses, or when multiple hosts resolve to the same address. 
So if > you > +want uniform load balancing, this is something to keep in mind. > However, > +non-uniform load balancing also has usecases, e.g. providing the > +hostname of a larger server multiple times in the host string so it > gets > +more requests. > + > + > +When using this setting it's recommended to also configure a > reasonable > +value for . Because > then, > +if one of the nodes that are used for load balancing is not > responding, > +a new node will be tried. > + > + > + I think this whole section is generally fine, but needs some copyediting. > + > + random_seed > + > + > +Sets the random seed that is used by linkend="libpq-load-balance-hosts"/> > +to randomize the host order. This option is mo
Re: pg_basebackup --create-slot-if-not-exists?
Hi,

On Wed, Sep 21, 2022 at 09:12:04PM -0400, Tom Lane wrote:
> In any case I agree with the point that --create-slot seems
> rather obsolete. If you are trying to resume in a previous
> replication stream (which seems like the point of persistent
> slots) then the slot had better already exist. If you are
> satisfied with just starting replication from the current
> instant, then a temp slot seems like what you want.

One advantage of using a permanent slot is that it gets written into the
recovery configuration when you use --write-recovery-conf, so you only
need to start the standby after initial bootstrap to have it connect
using the slot. Not sure that's worth keeping it around, but it makes
automating things somewhat simpler, I guess.

I do somewhat agree with the thread starter that
--create-slot-if-not-exists would make things even easier, but in the
light of your concerns regarding security it's probably not the best
idea and would make things even more convoluted than they are now.

Michael
Re: [EXTERNAL] Re: Support load balancing in libpq
Hi,

On Mon, Sep 12, 2022 at 02:16:56PM +0000, Jelte Fennema wrote:
> Attached is an updated patch with the following changes:
> 1. rebased (including solved merge conflict)
> 2. fixed failing tests in CI
> 3. changed the commit message a little bit
> 4. addressed the two remarks from Michael
> 5. changed the prng_state from a global to a connection level value for
> thread-safety
> 6. use pg_prng_uint64_range

Thanks! I tested this some more, and found it somewhat surprising that,
at least when looking at it on a microscopic level, some hosts are
chosen more often than the others for a while. I basically ran

while true; do psql -At "host=pg1,pg2,pg3 load_balance_hosts=1" -c "SELECT inet_server_addr()"; sleep 1; done

and the initial output was:

10.0.3.109
10.0.3.109
10.0.3.240
10.0.3.109
10.0.3.109
10.0.3.240
10.0.3.109
10.0.3.240
10.0.3.240
10.0.3.240
10.0.3.240
10.0.3.109
10.0.3.240
10.0.3.109
10.0.3.109
10.0.3.240
10.0.3.240
10.0.3.109
10.0.3.60

I.e. the second host (pg2/10.0.3.60) was only hit after 19 iterations.
Once significantly more than a hundred iterations are run, the hosts
somewhat even out, but it is maybe surprising to users (columns are
cumulative hits after 50, 100, 250, 500, 1000 and 10000 connections):

             50  100  250  500  1000  10000
10.0.3.60     9   24   77  165   328   3317
10.0.3.109   25   42   88  178   353   3372
10.0.3.240   16   34   85  157   319   3311

Or maybe my test setup is skewed? When I choose a two-second timeout
between psql calls, I get a more even distribution initially, but it
then diverges after 100 iterations:

             50  100  250  500  1000
10.0.3.60    19   36   98  199   374
10.0.3.109   13   33   80  150   285
10.0.3.240   18   31   72  151   341

Could just be bad luck...
I also switched one host to have two IP addresses in /etc/hosts:

10.0.3.109 pg1
10.0.3.60 pg1
10.0.3.240 pg3

And this resulted in the following (one-second timeout again):

First run:

             50  100  250  500  1000
10.0.3.60    10   18   56  120   255
10.0.3.109   14   30   67  139   278
10.0.3.240   26   52  127  241   467

Second run:

             50  100  250  500  1000
10.0.3.60    20   31   77  138   265
10.0.3.109    9   20   52  116   245
10.0.3.240   21   49  121  246   490

So it looks like it load-balances between pg1 and pg3, and not between
the three IPs - is this expected? If I switch from "host=pg1,pg3" to
"host=pg1,pg1,pg3", each IP address is hit roughly equally. So I guess
this is how it should work, but in that case I think the documentation
should be more explicit about what is to be expected if a host has
multiple IP addresses or hosts are specified multiple times in the
connection string.

> > Maybe my imagination is not so great, but what else than hosts could we
> > possibly load-balance? I don't mind calling it load_balance, but I also
> > don't feel very strongly one way or the other and this is clearly
> > bikeshed territory.
>
> I agree, which is why I called it load_balance in my original patch.
> But I also think it's useful to match the naming for the already
> existing implementations in the PG ecosystem around this.
> But like you I don't really feel strongly either way. It's a tradeoff
> between short name and consistency in the ecosystem.

I don't think consistency is an extremely valid concern. As a
counterpoint, pgJDBC had targetServerType some time before Postgres, and
the libpq parameter was then named somewhat differently when it was
introduced, namely target_session_attrs.

> > If I understand correctly, you've added DNS-based load balancing on top
> > of just shuffling the provided hostnames. This makes sense if a
> > hostname is backed by more than one IP address in the context of load
> > balancing, but it also complicates the patch. So I'm wondering how much
> > shorter the patch would be if you leave that out for now?
> Yes, that's correct and indeed the patch would be simpler without it, i.e. all
> the addrinfo changes would become unnecessary. But IMHO the behaviour of
> the added option would be very unexpected if it didn't load balance across
> multiple IPs in a DNS record. libpq currently makes no real distinction in
> handling of provided hosts and handling of their resolved IPs. If load
> balancing would only apply to the host list, that would start making a
> distinction between the two.

Fair enough, I agree.

> Apart from that, the load balancing across IPs is one of the main reasons
> for my interest in this patch. The reason is that it allows expanding or
> reducing the number of nodes that are being load balanced across
> transparently to the application. Which means that there's no need to
> re-deploy applications with new connection strings when changing the
> number of hosts.

That's a good point as well.

> > On the other hand, I believe pgJDBC keeps track of which hosts are up or
> > down and only load balances among the ones whi
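The observation above - that load balancing happens between pg1 and pg3, not between the three addresses - can be modeled as host-level shuffling: each host gets weight 1/2, and a multi-address host's addresses then share that host's slot. A toy model (the DNS map mirrors the /etc/hosts setup from the test; this is not libpq's actual resolution code):

```python
import random
from collections import Counter

# Hypothetical DNS map mirroring the /etc/hosts setup above:
# pg1 resolves to two addresses, pg3 to one.
DNS = {"pg1": ["10.0.3.109", "10.0.3.60"], "pg3": ["10.0.3.240"]}

def served_address(hosts, rng):
    """Shuffle the *host* list (not the resolved addresses), then model
    the connection as landing on one of the first host's addresses."""
    order = hosts[:]
    rng.shuffle(order)
    return rng.choice(DNS[order[0]])

rng = random.Random(7)
counts = Counter(served_address(["pg1", "pg3"], rng) for _ in range(4000))
# Expectation: 10.0.3.240 carries about half the connections (pg3's whole
# slot), while pg1's two addresses split the other half between them.
```

This reproduces the roughly 2:1 ratio between pg3's single address and each of pg1's two addresses; listing "host=pg1,pg1,pg3" restores equal weights, which matches what was observed.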
Re: [EXTERNAL] Re: Support load balancing in libpq
Hi,

On Wed, Sep 14, 2022 at 05:53:48PM +0300, Maxim Orlov wrote:
> > Also, IMO, the solution must have a fallback mechanism if the
> > standby/chosen host isn't reachable.
>
> Yeah, I think it should. I'm not insisting on a particular name of options
> here, but in my view, the overall idea may be next:
> - we have two libpq's options: "load_balance_hosts" and "failover_timeout";
> - the "load_balance_hosts" should be "sequential" or "random";
> - the "failover_timeout" is a time period, within which, if connection to
>   the server is not established, we switch to the next address or host.

Isn't this exactly what connect_timeout is providing? In my tests, it worked exactly as I would expect, i.e. after connect_timeout seconds, libpq was re-shuffling and going for another host. If you specify only one host (or all are down), you get an error.

In any case, I am not sure what failover has to do with it if we are talking about initiating connections - usually failover is for already established connections that suddenly go away for one reason or another. Or maybe I'm just not understanding what you're getting at?

> While writing this text, I start thinking that load balancing is a
> combination of two parameters above.

So I guess what you are saying is that if load_balance_hosts is set, not setting connect_timeout would be a hazard, because it would stall the connection attempt even though other hosts would be available. That's right, but I guess it's already a hazard if you put multiple hosts in there and the connection is not immediately failed (because the host doesn't exist or it rejects the connection) but stalled by a firewall for one host, while other hosts later on in the list would be happy to accept connections. So maybe this is something to think about, but just changing the default of connect_timeout to something else when load balancing is on would be very surprising.
In any case, I don't think this absolutely needs to be addressed by the initial feature, it could be expanded upon later on if needed, the feature is useful on its own, along with connect_timeout. Michael
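To illustrate the interaction being discussed - connect_timeout bounding each host attempt so a stalled host only delays, rather than defeats, falling through to the next one - here is a language-agnostic sketch of that loop (the `connect` callable is a stand-in supplied by the caller, not a libpq API):

```python
def connect_any(hosts, connect, timeout):
    """Try each host in order, giving each at most `timeout` seconds
    (mirroring how a per-host connect_timeout behaves); return the first
    successful connection, or raise after exhausting the whole list."""
    errors = []
    for host in hosts:
        try:
            return host, connect(host, timeout)
        except TimeoutError as exc:
            errors.append((host, exc))
    raise ConnectionError(f"all hosts failed: {errors}")

def fake_connect(host, timeout):
    # Stand-in for a real TCP connect: pg1 stalls (e.g. a firewall
    # silently dropping packets), pg2 answers immediately.
    if host == "pg1":
        raise TimeoutError("connection timed out")
    return f"connection-to-{host}"

host, conn = connect_any(["pg1", "pg2"], fake_connect, timeout=2)
```

The stalled pg1 costs at most one timeout period before pg2 is tried; with no timeout set, the loop would hang on pg1 indefinitely - which is the hazard described above.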
Re: Support load balancing in libpq
Hi,

the patch no longer applies cleanly, please rebase (it's trivial). I don't like the provided commit message very much; I think the discussion about pgJDBC having had load balancing for a while belongs elsewhere.

On Wed, Jun 22, 2022 at 07:54:19AM +, Jelte Fennema wrote:
> I tried to stay in line with the naming of this same option in JDBC and
> Npgsql, where it's called "loadBalanceHosts" and "Load Balance Hosts"
> respectively. So, actually, to be more in line, the option for libpq
> should be called "load_balance_hosts" (not "loadbalance" like in the
> previous patch). I attached a new patch with the name of the option
> changed to this.

Maybe my imagination is not so great, but what else than hosts could we possibly load-balance? I don't mind calling it load_balance, but I also don't feel very strongly one way or the other and this is clearly bikeshed territory.

> I also don't think the name is misleading. Randomization of hosts will
> automatically result in balancing the load across multiple hosts. This is
> assuming more than a single connection is made using the connection
> string, either on the same client node or on different client nodes. I
> think that is a fair assumption to make. Also note that this patch does
> not load balance queries, it load balances connections. This is because
> libpq works at the connection level, not query level, due to session
> level state.

I agree. Also, I think the scope is ok for a first implementation (but see below). You or others could possibly further enhance the algorithm in the future, but it seems to be useful as-is.

> I agree it is indeed fairly simplistic load balancing.

If I understand correctly, you've added DNS-based load balancing on top of just shuffling the provided hostnames. This makes sense if a hostname is backed by more than one IP address in the context of load balancing, but it also complicates the patch.
So I'm wondering how much shorter the patch would be if you leave that out for now?

On the other hand, I believe pgJDBC keeps track of which hosts are up or down and only load balances among the ones which are up (maybe rechecking after a timeout? I don't remember); is this something you're doing, or did you consider it?

Some quick remarks on the patch:

 	/* OK, scan this addrlist for a working server address */
-	conn->addr_cur = conn->addrlist;
 	reset_connection_state_machine = true;
 	conn->try_next_host = false;

The comment might need to be updated.

+	int			naddr;			/* # of addrs returned by getaddrinfo */

This is spelt "number of" in several other places in the file, and we still have enough space to spell it out here as well without a line-wrap.

Michael
Re: PostgreSQL 15 release announcement draft
Hi,

On Tue, Aug 30, 2022 at 03:58:48PM -0400, Jonathan S. Katz wrote:
> ### Other Notable Changes
>
> PostgreSQL server-level statistics are now collected in shared memory,
> eliminating the statistics collector process and writing these stats to disk.
> PostgreSQL 15 also revokes the `CREATE` permission from all users except a
> database owner from the `public` (or default) schema.

It's a bit weird to lump those two in the same paragraph, but ok. However, I think the "and writing these stats to disk." might not be very clear to people not familiar with the feature; they might think writing stats to disk is part of the new feature. So I propose "as well as writing these stats to disk" instead, or something along those lines.

Michael
[PATCH] Allow usage of archive .backup files as backup_label
Hi, The .backup files written to the archive (if archiving is on) are very similar to the backup_label that's written/returned by pg_stop_backup()/pg_backup_stop(), they just have a few extra lines about the end of backup process that are missing from backup_label. The parser in xlogrecovery.c however barfs on them because it does not expect the additional STOP WAL LOCATION on line 2. The attached makes parsing this line optional, so that one can use those .backup files in place of backup_label. This is e.g. useful if the backup_label got lost or the output of pg_stop_backup() was not captured. Michael -- Team Lead PostgreSQL Project Manager Tel.: +49 2166 9901-171 Mail: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Our handling of personal data is subject to: https://www.credativ.de/en/contact/privacy/ >From 00b1b245f74a0496a4d60cfafff92735dbe73d22 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Mon, 22 Aug 2022 16:20:14 +0200 Subject: [PATCH] Allow usage of archive .backup files as backup_label. This lets the backup_label parser not bail if STOP WAL LOCATION is encountered, which is the only meaningful difference between an archive .backup file and a backup_label file, thus allowing to just copy the corresponding .backup file from the archive as backup_label, in case the backup_label file got lost or was never recorded. 
---
 src/backend/access/transam/xlogrecovery.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index a59a0e826b..a95946e391 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1149,6 +1149,7 @@ read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID *backupLabelTLI,
 				  bool *backupEndRequired, bool *backupFromStandby)
 {
 	char		startxlogfilename[MAXFNAMELEN];
+	char		stopxlogfilename[MAXFNAMELEN];
 	TimeLineID	tli_from_walseg,
 				tli_from_file;
 	FILE	   *lfp;
@@ -1183,7 +1184,10 @@ read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID *backupLabelTLI,
 	/*
 	 * Read and parse the START WAL LOCATION and CHECKPOINT lines (this code
 	 * is pretty crude, but we are not expecting any variability in the file
-	 * format).
+	 * format). Also allow STOP WAL LOCATION to be in the file. This line does
+	 * not appear in backup_label, but it is written to the corresponding
+	 * .backup file and allows users to rename or copy that file to
+	 * backup_label without further editing.
 	 */
 	if (fscanf(lfp, "START WAL LOCATION: %X/%X (file %08X%16s)%c",
 			   &hi, &lo, &tli_from_walseg, startxlogfilename, &ch) != 5 || ch != '\n')
@@ -1192,6 +1196,11 @@ read_backup_label(XLogRecPtr *checkPointLoc, TimeLineID *backupLabelTLI,
 				 errmsg("invalid data in file \"%s\"", BACKUP_LABEL_FILE)));
 	RedoStartLSN = ((uint64) hi) << 32 | lo;
 	RedoStartTLI = tli_from_walseg;
+	if (fscanf(lfp, "STOP WAL LOCATION: %X/%X (file %*08X%16s)%c",
+			   &hi, &lo, stopxlogfilename, &ch) == 4)
+		ereport(DEBUG1,
+				(errmsg_internal("stop wal location %X/%X in file \"%s\"",
+								 hi, lo, BACKUP_LABEL_FILE)));
 	if (fscanf(lfp, "CHECKPOINT LOCATION: %X/%X%c", &hi, &lo, &ch) != 3 || ch != '\n')
 		ereport(FATAL,
-- 
2.30.2
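For illustration, the relaxed parsing rule the patch implements can be sketched like this: START WAL LOCATION is required, a STOP WAL LOCATION line (present only in archived .backup files) is tolerated, and parsing then continues with CHECKPOINT LOCATION. A simplified Python stand-in for the C code - the sample label contents are made up:

```python
import re

def read_backup_label(lines):
    """Parse the leading lines of a backup_label (or archived .backup)
    file and return (start_lsn, checkpoint_lsn) as 64-bit integers.
    An optional STOP WAL LOCATION line between the two is skipped."""
    it = iter(lines)
    m = re.match(r"START WAL LOCATION: ([0-9A-F]+)/([0-9A-F]+) \(file [0-9A-F]{24}\)", next(it))
    if not m:
        raise ValueError("invalid data in backup label")
    start = (int(m.group(1), 16) << 32) | int(m.group(2), 16)
    line = next(it)
    if line.startswith("STOP WAL LOCATION:"):  # tolerated, as in the patch
        line = next(it)
    m = re.match(r"CHECKPOINT LOCATION: ([0-9A-F]+)/([0-9A-F]+)", line)
    if not m:
        raise ValueError("invalid data in backup label")
    checkpoint = (int(m.group(1), 16) << 32) | int(m.group(2), 16)
    return start, checkpoint

label = [
    "START WAL LOCATION: 0/2000028 (file 000000010000000000000002)",
    "CHECKPOINT LOCATION: 0/2000060",
]
# The same content as it would appear in an archive .backup file:
backup_file = label[:1] + ["STOP WAL LOCATION: 0/2000138 (file 000000010000000000000002)"] + label[1:]
assert read_backup_label(label) == read_backup_label(backup_file)
```

With the extra line skipped, both file variants parse to the same start and checkpoint locations, which is exactly what lets a .backup file be dropped in as backup_label.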
Re: Add version and data directory to initdb output
Hi, On Thu, Apr 21, 2022 at 10:18:56AM -0400, Tom Lane wrote: > And as for the version, if you want that you can get it from "initdb > --version". I assumed the point in stamping the version in the output was that people might want to pipe it to some logfile and then later on, when they found some issues, be able to go back and know what version was used when initializing this data directory. Michael
Re: PATCH: add "--config-file=" option to pg_rewind
Hi,

On Thursday, 10.03.2022 at 15:28 +0100, Gunnar "Nick" Bluth wrote:
> On 10.03.22 at 14:43, Michael Banck wrote:
> > some minor comments, I didn't look at the added test and I did not
> > test the patch at all, but (as part of the Debian/Ubuntu packaging
> > team) I think this patch is really important:
>
> Much appreciated!
>
> [...]
>
> Fixed.
>
> [...]

Thanks for the updated patch. The patch applies, make is ok, make check is ok, the pg_rewind TAP tests are ok.

I did some tests now with Patroni 2.1.3 and the attached patch applied. The following test was made:

1. Deploy a 3-node (pg1, pg2, pg3) Patroni cluster on Debian unstable running postgresql-15 (approx. master)
2. Switch on archive_mode, and set archive_command and restore_command
3. Switchover so that pg1 is the primary (if not already)
4. Kill pg1 hard
5. Wait till a new leader has taken over and the (now 2-node) cluster is healthy again
6. Restart pg1

Without this patch:

|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,084 INFO: running pg_rewind from pg3
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,102 INFO: running pg_rewind from dbname=postgres user=postgres_rewind host=10.0.3.227 port=5432 target_session_attrs=read-write
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,135 INFO: pg_rewind exit code=1
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,136 INFO: stdout=
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,136 INFO: stderr=postgres: could not access the server configuration file "/var/lib/postgresql/15/main/postgresql.conf": No such file or directory
|Apr 03 19:09:01 pg1 patroni@15-main[99]: no data was returned by command "/usr/lib/postgresql/15/bin/postgres -D /var/lib/postgresql/15/main -C restore_command"
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,149 ERROR: Failed to rewind from healty master: pg3
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,149 WARNING: remove_data_directory_on_diverged_timelines is set. removing...
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,149 INFO: Removing data directory: /var/lib/postgresql/15/main
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,245 INFO: Lock owner: pg3; I am pg1
|Apr 03 19:09:01 pg1 patroni@15-main[99]: 2022-04-03 19:09:01,248 INFO: trying to bootstrap from leader 'pg3'

So pg_rewind fails and Patroni does a re-bootstrap.

With this patch:

|Apr 03 19:12:35 pg1 patroni@15-main[99]: 2022-04-03 19:12:35,576 INFO: running pg_rewind from pg2
|Apr 03 19:12:35 pg1 patroni@15-main[99]: 2022-04-03 19:12:35,590 INFO: running pg_rewind from dbname=postgres user=postgres_rewind host=10.0.3.145 port=5432 target_session_attrs=read-write
|Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,147 INFO: pg_rewind exit code=0
|Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,148 INFO: stdout=
|Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,148 INFO: stderr=pg_rewind: servers diverged at WAL location 0/1D000180 on timeline 38
|Apr 03 19:12:37 pg1 patroni@15-main[99]: pg_rewind: rewinding from last common checkpoint at 0/1D000108 on timeline 38
|Apr 03 19:12:37 pg1 patroni@15-main[99]: pg_rewind: Done!
|Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,151 WARNING: Postgresql is not running.
|Apr 03 19:12:37 pg1 patroni@15-main[99]: 2022-04-03 19:12:37,151 INFO: Lock owner: pg2; I am pg1

Here, pg_rewind is called and works fine. So I think this works as intended, and I'm marking it Ready for Committer.

Michael

--- /usr/lib/python3/dist-packages/patroni/postgresql/__init__.py	2022-02-18 13:16:15.0 +
+++ __init__.py	2022-04-03 19:17:29.952665383 +
@@ -798,7 +798,8 @@
         return True
 
     def get_guc_value(self, name):
-        cmd = [self.pgcommand('postgres'), '-D', self._data_dir, '-C', name]
+        cmd = [self.pgcommand('postgres'), '-D', self._data_dir, '-C', name,
+               '--config-file={}'.format(self.config.postgresql_conf)]
         try:
             data = subprocess.check_output(cmd)
             if data:
--- /usr/lib/python3/dist-packages/patroni/postgresql/rewind.py	2022-02-18 13:16:15.0 +
+++ rewind.py	2022-04-03 19:21:14.479726127 +
@@ -314,6 +314,7 @@
         cmd = [self._postgresql.pgcommand('pg_rewind')]
         if self._
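For a caller such as Patroni, the new option amounts to appending one argument whenever the configuration lives outside the data directory (as in the Debian layout). A hypothetical sketch of assembling such an invocation - the helper name and paths are illustrative, not Patroni's actual code:

```python
def build_rewind_cmd(pg_rewind, target_pgdata, source_conninfo, config_file=None):
    """Assemble a pg_rewind command line; with the patch applied, an
    external postgresql.conf can be named via --config-file so that
    -c/--restore-target-wal can extract restore_command from it."""
    cmd = [pg_rewind,
           "--target-pgdata", target_pgdata,
           "--source-server", source_conninfo,
           "--restore-target-wal"]
    if config_file is not None:
        cmd.append("--config-file=" + config_file)
    return cmd

cmd = build_rewind_cmd(
    "/usr/lib/postgresql/15/bin/pg_rewind",
    "/var/lib/postgresql/15/main",
    "host=10.0.3.145 user=postgres_rewind dbname=postgres",
    config_file="/etc/postgresql/15/main/postgresql.conf",
)
```

Without the last argument, the `-C restore_command` probe fails on Debian-style installs exactly as shown in the "without this patch" log above.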
Re: Add 64-bit XIDs into PostgreSQL 15
Hi,

On Mon, Mar 14, 2022 at 01:32:04PM +0300, Aleksander Alekseev wrote:
> IMO v16-0001 and v16-0002 are in pretty good shape and are as much as
> we are going to deliver in PG15. I'm going to change the status of the
> CF entry to "Ready for Committer" somewhere this week unless someone
> believes v16-0001 and/or v16-0002 shouldn't be merged.

Not sure, but if you want more people to look at them, it would probably be best to start a new thread with just the v15-target patches. Right now, one has to download your tarball, extract it and look at the patches in there. I hope v16-0001 and v16-0002 are small enough (I didn't do the above) that they can just be attached normally?

Michael
Re: PATCH: add "--config-file=" option to pg_rewind
Heya,

some minor comments; I didn't look at the added test and I did not test the patch at all, but (as part of the Debian/Ubuntu packaging team) I think this patch is really important:

On Tue, Mar 01, 2022 at 12:35:27PM +0100, Gunnar "Nick" Bluth wrote:
> diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
> index 33e6bb64ad..d0278e9b46 100644
> --- a/doc/src/sgml/ref/pg_rewind.sgml
> +++ b/doc/src/sgml/ref/pg_rewind.sgml
> @@ -241,6 +241,18 @@ PostgreSQL documentation
> + --config-file= class="parameter">filepath
> +
> + When using the -c / --restore-target-wal option, the restore_command
> + is extracted from the configuration of the target cluster. If the postgresql.conf
> + of that cluster is not in the target pgdata directory (or if you want to use an alternative config),

I think we usually just say "data directory"; pgdata was previously only used as part of the option name, not like this in the text.

> diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
> index b39b5c1aac..294353a9d5 100644
> --- a/src/bin/pg_rewind/pg_rewind.c
> +++ b/src/bin/pg_rewind/pg_rewind.c
> @@ -61,6 +61,7 @@
>  char	   *datadir_target = NULL;
>  char	   *datadir_source = NULL;
>  char	   *connstr_source = NULL;
>  char	   *restore_command = NULL;
> +char	   *config_file = NULL;
>
>  static bool debug = false;
>  bool		showprogress = false;
> @@ -87,6 +88,7 @@ usage(const char *progname)
>  	printf(_("Options:\n"));
>  	printf(_("  -c, --restore-target-wal       use restore_command in target configuration to\n"
>  			 "                                 retrieve WAL files from archives\n"));
> +	printf(_("      --config-file=FILE         path to postgresql.conf if outside target-pgdata (for -c)\n"));
>  	printf(_("  -D, --target-pgdata=DIRECTORY  existing data directory to modify\n"));
>  	printf(_("      --source-pgdata=DIRECTORY  source data directory to synchronize with\n"));
>  	printf(_("      --source-server=CONNSTR    source server to synchronize with\n"));

target-pgdata is the name of the option, but maybe it is useful to spell out "target data directory" here, even if that means we spill over to two lines (which we might have to do anyway, that new line is pretty long).

> @@ -121,6 +123,7 @@ main(int argc, char **argv)
>  		{"no-sync", no_argument, NULL, 'N'},
>  		{"progress", no_argument, NULL, 'P'},
>  		{"debug", no_argument, NULL, 3},
> +		{"config-file", required_argument, NULL, 5},

Not sure what our policy is, but the way the numbered options are 1, 2, 4, 3, 5 after this is weird; this was already off before this patch though.

>  		{NULL, 0, NULL, 0}
>  	};
>  	int			option_index;
> @@ -205,6 +208,10 @@ main(int argc, char **argv)
>  			case 4:
>  				no_ensure_shutdown = true;
>  				break;
> +
> +			case 5:
> +				config_file = pg_strdup(optarg);
> +				break;
>  		}
>  	}
> @@ -1060,7 +1067,11 @@ getRestoreCommand(const char *argv0)
>  	/* add -D switch, with properly quoted data directory */
>  	appendPQExpBufferStr(postgres_cmd, " -D ");
>  	appendShellString(postgres_cmd, datadir_target);
> -
> +	if (config_file != NULL)
> +	{
> +		appendPQExpBufferStr(postgres_cmd, " --config_file=");
> +		appendShellString(postgres_cmd, config_file);
> +	}
>  	/* add -C switch, for restore_command */

You removed an empty line here. Maybe rather prepend this with a comment on what's going on, and have an empty line before and afterwards.

>  	appendPQExpBufferStr(postgres_cmd, " -C restore_command");
> @@ -1138,6 +1149,11 @@ ensureCleanShutdown(const char *argv0)
>  	/* add set of options with properly quoted data directory */
>  	appendPQExpBufferStr(postgres_cmd, " --single -F -D ");
>  	appendShellString(postgres_cmd, datadir_target);
> +	if (config_file != NULL)
> +	{
> +		appendPQExpBufferStr(postgres_cmd, " --config_file=");
> +		appendShellString(postgres_cmd, config_file);
> +	}
>
>  	/* finish with the database name, and a properly quoted redirection */

Kind of the same here.
Cheers,

Michael
Re: Observability in Postgres
Hi,

On Wed, Feb 16, 2022 at 12:48:11AM -0500, Greg Stark wrote:
> But on further thought I think what you're asking is whether there are
> standard database metrics and standard names for them. A lot of this
> work has already been done with pg_exporter but it is worth looking at
> other database software and see if there are opportunities to
> standardize metrics naming across databases.

Can you clarify what exactly you mean by "pg_exporter"? I think you mentioned it upthread as well. https://github.com/Vonng/pg_exporter (90 GH stars, never heard of it)? https://github.com/prometheus-community/postgres_exporter (1.6k GH stars)? Something else?

Michael
Re: Database-level collation version tracking
On Mon, Feb 07, 2022 at 04:44:24PM +0100, Peter Eisentraut wrote:
> On 07.02.22 11:29, Julien Rouhaud wrote:
> > - that's not really something new with this patch, but should we output the
> >   collation version info or mismatch info in \l / \dO?
>
> It's a possibility. Perhaps there is a question of performance if we show
> it in \l and people have tons of databases and they have to make a locale
> call for each one. As you say, it's more an independent feature, but it's
> worth looking into.

Ok, but \l+ shows, among other things, the database size, so I guess showing this at that level should be fine too? (Or is that already the case?)

Michael
Re: Release notes for February minor releases
On Fri, Feb 04, 2022 at 04:35:07PM -0500, Tom Lane wrote:
> Michael Banck writes:
> > On Fri, Feb 04, 2022 at 02:58:59PM -0500, Tom Lane wrote:
> >> + If this seems to have affected a table, REINDEX
> >> + should repair the damage.
>
> > I don't think this is very helpful to the reader, are their indexes
> > corrupt or not? If we can't tell them a specific command to run to
> > check, can we at least mention that running amcheck would detect that
> > (if it actually does)? Otherwise, I guess the only way to be sure is to
> > just reindex every index? Or is this at least specific to b-trees?
>
> Yeah, I wasn't too happy with that advice either, but it seems like the
> best we can do [1]. I don't think we should advise blindly reindexing
> your whole installation, because it's a very low-probability bug.

Right, ok. I wonder whether it makes sense to at least hint at the low probability then; I guess if you know Postgres well you can deduce from "when the last transaction that could see the tuple ends while the page is being pruned" that it is a low-probability corner case, but I fear lots of users will be unable to gauge the chances they got hit by this bug and will just blindly assume they are affected (and/or ask around). I just woke up, so I don't have any good wording suggestions yet.

Michael
Re: Release notes for February minor releases
On Fri, Feb 04, 2022 at 02:58:59PM -0500, Tom Lane wrote:
> I've pushed the first draft for $SUBJECT at
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ab22eea83169c8d0eb15050ce61cbe3d7dae4de6
>
> Please send comments/corrections by Sunday.

> + Fix corruption of HOT chains when a RECENTLY_DEAD tuple changes
> + state to fully DEAD during page pruning (Andres Freund)
> +
> + This happens when the last transaction that could see
> + the tuple ends while the page is being pruned. It was then possible
> + to remove a tuple that is pointed to by a redirect item elsewhere on
> + the page. While that causes no immediate problem, when the item slot
> + is re-used by some new tuple, that tuple would be thought to be part
> + of the pre-existing HOT chain, creating a form of index corruption.

Well, ouchy.

> + If this seems to have affected a table, REINDEX
> + should repair the damage.

I don't think this is very helpful to the reader: are their indexes corrupt or not? If we can't tell them a specific command to run to check, can we at least mention that running amcheck would detect that (if it actually does)? Otherwise, I guess the only way to be sure is to just reindex every index? Or is this at least specific to b-trees?

> + Enforce standard locking protocol for TOAST table updates, to prevent
> + problems with REINDEX CONCURRENTLY (Michael Paquier)
> +
> + If applied to a TOAST table or TOAST table's index, REINDEX
> + CONCURRENTLY tended to produce a corrupted index. This
> + happened because sessions updating TOAST entries released
> + their ROW EXCLUSIVE locks immediately, rather
> + than holding them until transaction commit as all other updates do.
> + The fix is to make TOAST updates hold the table lock according to the
> + normal rule. Any existing corrupted indexes can be repaired by
> + reindexing again.

Same, but at least here the admin can cut it down to only those indexes which were added concurrently.
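Regarding the amcheck suggestion above: the amcheck extension's bt_index_parent_check() can verify btree index structure (and, with heapallindexed, agreement between index and heap). Whether it catches this particular corruption is exactly the open question raised here, so treat the following as a starting point, not a guarantee. A small sketch that only generates the SQL to run:

```python
def amcheck_statements(indexes):
    """Yield SQL statements that run amcheck's bt_index_parent_check()
    (with heapallindexed for the most thorough, and slowest, check)
    over a list of btree index names.  Only statement generation is
    shown; executing them against a server is left to the caller."""
    yield "CREATE EXTENSION IF NOT EXISTS amcheck;"
    for idx in indexes:
        yield f"SELECT bt_index_parent_check('{idx}', heapallindexed => true);"

stmts = list(amcheck_statements(["pgbench_accounts_pkey"]))
```

Feeding the generated statements to psql (or any driver) checks each listed index; an ERROR from bt_index_parent_check indicates corruption, in which case a targeted REINDEX of that index is the fix.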
Michael
Re: XTS cipher mode for cluster file encryption
Hi,

On Tue, Feb 01, 2022 at 01:07:36PM -0500, Stephen Frost wrote:
> On Tue, Feb 1, 2022 at 12:50 Bruce Momjian wrote:
> > On Tue, Feb 1, 2022 at 07:45:06AM +0100, Antonin Houska wrote:
> > > > With pg_upgrade modified to preserve the relfilenode, tablespace
> > > > oid, and database oid, we are now closer to implementing cluster
> > > > file encryption using XTS. I think we have a few steps left:
> > > >
> > > > 1. modify temporary file I/O to use a more centralized API
> > > > 2. modify the existing cluster file encryption patch to use XTS with a
> > > >    IV that uses more than the LSN
> > > > 3. add XTS regression test code like CTR
> > > > 4. create WAL encryption code using CTR
> > > >
> > > > If we can do #1 in PG 15 I think I can have #2 ready for PG 16 in July.
> > > > The feature wiki page is:
> > > >
> > > > https://wiki.postgresql.org/wiki/Transparent_Data_Encryption
> > > >
> > > > Do people want to advance this feature forward?
> > >
> > > I confirm that we (Cybertec) do and that we're ready to spend more
> > > time on the community implementation.
> >
> > Well, I sent an email a week ago asking if people want to advance this
> > feature forward, and so far you are the only person to reply, which I
> > think means there isn't enough interest in this feature to advance it.
>
> This confuses me. Clearly there's plenty of interest, but asking on hackers
> in a deep old sub thread isn't a terribly good way to judge that. Yet even
> when there is an active positive response, you argue that there isn't
> enough.

Even more so because Antonin not only replied as an individual, but in the name of a whole company developing Postgres in general and TDE in particular.

Michael
Re: CREATEROLE and role ownership hierarchies
Hi,

On Monday, 31.01.2022 at 09:18 -0800, Mark Dilger wrote:
> > On Jan 31, 2022, at 12:43 AM, Michael Banck <michael.ba...@credativ.de> wrote:
> > Ok, sure. I think this topic is hugely important and as I read the
> > patch anyway, I added some comments, but yeah, we need to figure out
> > the fundamentals first.
>
> Right. Perhaps some background on this patch series will help.

[...]

Thanks a lot!

> If we don't go with backwards compatibility, then CREATEROLE would only
> allow you to create a new role, but not to give that role LOGIN, nor
> CREATEDB, etc. You'd need to also have admin option on those things. To
> create a role that can give those things away, you'd need to run
> something like:
>
> CREATE ROLE michael
>     CREATEROLE WITH ADMIN OPTION        -- can further give away "createrole"
>     CREATEDB WITH ADMIN OPTION          -- can further give away "createdb"
>     LOGIN WITH ADMIN OPTION             -- can further give away "login"
>     NOREPLICATION WITHOUT ADMIN OPTION  -- this would be implied anyway
>     NOBYPASSRLS WITHOUT ADMIN OPTION    -- this would be implied anyway
>     CONNECTION LIMIT WITH ADMIN OPTION  -- can specify connection limits
>     PASSWORD WITH ADMIN OPTION          -- can specify passwords
>     VALID UNTIL WITH ADMIN OPTION       -- can specify expiration

Those last three don't work for me:

postgres=# CREATE ROLE admin3 VALID UNTIL WITH ADMIN OPTION;
ERROR:  syntax error at or near "WITH"
postgres=# CREATE ROLE admin3 PASSWORD WITH ADMIN OPTION;
ERROR:  syntax error at or near "WITH"
postgres=# CREATE ROLE admin3 CONNECTION LIMIT WITH ADMIN OPTION;
ERROR:  syntax error at or near "WITH"

> (I'm on the fence about the phrase "WITH ADMIN OPTION" vs. the phrase
> "WITH GRANT OPTION".)
>
> Even then, when "michael" creates new roles, if he wants to be able
> to further administer those roles, he needs to remember to give
> himself ADMIN membership in that role at creation time. After the
> role is created, if he doesn't have ADMIN, he can't give it to
> himself.
> So, at create time, he needs to remember to do this:
>
> SET ROLE michael;
> CREATE ROLE mark ADMIN michael;

What would happen if ADMIN was implicit if michael is a non-superuser and there's no ADMIN in the CREATE ROLE statement? It would be backwards-compatible; one could still let somebody else be ADMIN, but ISTM a CREATEROLE role could no longer admin a role that already existed previously/it didn't create/didn't get assigned admin for (e.g. the predefined roles). I.e. (responding to what you wrote much further below), the CREATEROLE role would no longer be ADMIN for all roles, just automatically for the ones it created.

> But that's still a bit strange, because "ADMIN michael" means that
> michael can grant other roles membership in "mark", not that michael
> can, for example, change mark's password.

Yeah, changing a password is one of the important tasks of a delegated role admin, if no superusers are around.

> If we don't want CREATEROLE to imply that you can mess around with
> arbitrary roles (rather than only roles that you created or have been
> transferred control over) then we need the concept of role
> ownership. This patch doesn't go that far, so for now, only
> superusers can do those things. Assuming some form of this patch is
> acceptable, the v9 series will resurrect some of the pre-v7 logic for
> role ownership and say that the owner can do those things.

> > Ok, so what I would have needed to do in the above in order to have
> > "admin2" and "admin" be the same as far as creating login users is (I
> > believe):
> >
> > ALTER ROLE admin2 CREATEROLE LOGIN WITH ADMIN OPTION;
>
> Yes, though it's more likely admin2 would have been created with these
> privileges to begin with, if the creator intended admin2 to do such
> things.

Right, maybe people just have to adjust to the new way. It still feels strange that whatever you do at role creation time is more meaningful than when altering a role.
> > > By the way, is there now even a way to add admpassword to a role > > after it got created? > > > > postgres=# SET ROLE admin2; > > SET > > postgres=> \password test > > Enter new password for user "test": > > Enter it again: > > ERROR: must have admin on password to change password attribute > > postgres=> RESET ROLE; > > RESET > > postgres=# ALTER ROLE admin2 PASSWORD WITH ADMIN OPTION; > > ERROR: syntax error at or near "WITH" > > UPDATE pg_authid SET admpassword = 't' WHERE rolname = 'admin2'; > > UPDATE 1 > > postgres=# SET
Re: CREATEROLE and role ownership hierarchies
Hi, Am Sonntag, dem 30.01.2022 um 17:11 -0800 schrieb Mark Dilger: > > On Jan 30, 2022, at 2:38 PM, Michael Banck < > > michael.ba...@credativ.de> wrote: > > > The attached WIP patch attempts to solve most of the CREATEROLE > > I'm mostly looking for whether the general approach in this Work In > Progress patch is acceptable, so I was a bit sloppy with whitespace > and such Ok, sure. I think this topic is hugely important and as I read the patch anyway, I added some comments, but yeah, we need to figure out the fundamentals first. > > > One thing I noticed (and which will likely make DBAs grumpy) is that it > > seems being able to create users (as opposed to non-login roles/groups) > > depends on when you get the CREATEROLE attribute (on role creation or > > later), viz: > > > > postgres=# CREATE USER admin CREATEROLE; > > CREATE ROLE > > postgres=# SET ROLE admin; > > SET > > postgres=> CREATE USER testuser; -- this works > > CREATE ROLE > > postgres=> RESET ROLE; > > RESET > > postgres=# CREATE USER admin2; > > CREATE ROLE > > postgres=# ALTER ROLE admin2 CREATEROLE; -- we get CREATEROLE after the fact > > ALTER ROLE > > postgres=# SET ROLE admin2; > > SET > > postgres=> CREATE USER testuser2; -- bam > > ERROR: must have grant option on LOGIN privilege to create login users > > postgres=# SELECT rolname, admcreaterole, admcanlogin FROM > > pg_authid > > WHERE rolname LIKE 'admin%'; > > rolname | admcreaterole | admcanlogin > > -+---+- > > admin | t | t > > admin2 | f | f > > (2 rows) > > > > Is that intentional? If it is, I think it would be nice if this > > could be > > changed, unless I'm missing some serious security concerns or so. > > It's intentional, but part of what I wanted review comments about. > The issue is that historically: > > CREATE USER michael CREATEROLE > > meant that you could go on to do things like create users with LOGIN > privilege. 
> I could take that away, which would be a backwards > compatibility break, or I can do the weird thing this patch does. Or > I could have your > > ALTER ROLE admin2 CREATEROLE; > > also grant the other privileges like LOGIN unless you explicitly say > otherwise with a bunch of explicit WITHOUT ADMIN OPTION clauses. > Finding out which of those this is preferred was a big part of why I > put this up for review. Thanks for calling it out in under 24 hours! Ok, so what I would have needed to do in the above in order to have "admin2" and "admin" be the same as far as creating login users is (I believe): ALTER ROLE admin2 CREATEROLE LOGIN WITH ADMIN OPTION; I think it would be nice to just have this part as the default, if possible. I.e. CREATEROLE and HASLOGIN are historically so much intertwined that I think the above should be implicit (again, if that is possible); I don't care and/or haven't made up my mind about any of the other options so far... Ok, so now that I had another look, I see we are opening Pandora's box: For any of the other things a role admin would like to do (change password, change conn limit), one would have to go with this weird disconnect between CREATE USER admin CREATEROLE and ALTER USER admin2 CREATEROLE [massive list of WITH ADMIN OPTION], and then I'm not sure where we stop. By the way, is there now even a way to add admpassword to a role after it got created? 
postgres=# SET ROLE admin2; SET postgres=> \password test Enter new password for user "test": Enter it again: ERROR: must have admin on password to change password attribute postgres=> RESET ROLE; RESET postgres=# ALTER ROLE admin2 PASSWORD WITH ADMIN OPTION; ERROR: syntax error at or near "WITH" UPDATE pg_authid SET admpassword = 't' WHERE rolname = 'admin2'; UPDATE 1 postgres=# SET ROLE admin2; SET postgres=> \password test Enter new password for user "test": Enter it again: postgres=> However, the next thing is: postgres=# SET ROLE admin; SET postgres=> CREATE GROUP testgroup; CREATE ROLE postgres=> GRANT testgroup TO test; ERROR: must have admin option on role "testgroup" First off, what does "admin option" mean on a role? I then tried this: postgres=# CREATE USER admin3 CREATEROLE WITH ADMIN OPTION; CREATE ROLE postgres=# SET ROLE admin3; SET postgres=> CREATE USER test3; CREATE ROLE postgres=> CREATE GROUP testgroup3; CREATE ROLE postgres=> GRANT testgroup3 TO test3; ERROR: must have admin option on role "testgroup3" So I created both user and group, I have the CREATEROLE pri
Re: CREATEROLE and role ownership hierarchies
dcreaterole || dcreatedb || dcanlogin || > dconnlimit || > - drolemembers || dvalidUntil || !dpassword || roleid != > GetUserId()) > - ereport(ERROR, > - > (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > - errmsg("permission denied"))); > - } > + > + /* To mess with replication roles, must have admin on REPLICATION */ > + if ((authform->rolreplication || disreplication) && > + !may_admin_replication_privilege(GetUserId())) > + ereport(ERROR, > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must have admin on replication to > alter replication roles or change replication attribute"))); "have admin" sounds a bit weird, but I understand the error message is too long already to spell out "must have admin option"? Or am I mistaken and "admin" is what it's actually called (same for the ones below)? Also, I think those role options are usually capitalized like REPLICATION in other error messages. > diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y > index b5966712ce..7503d3ead6 100644 > --- a/src/backend/parser/gram.y > +++ b/src/backend/parser/gram.y > @@ -1131,67 +1140,111 @@ AlterOptRoleElem: [...] > + | VALID ALWAYS opt_admin_spec > + { > + RoleElem *n = makeNode(RoleElem); > + n->elem = makeDefElem("validUntil", > (Node *)makeString("always"), @1); > + n->admin_spec = $3; > + $$ = (Node *)n; This one is from another patch as well I think. > } > /* Supported but not documented for roles, for use by > ALTER GROUP. */ > - | USER role_list > + | USER role_list opt_admin_spec > { > - $$ = makeDefElem("rolemembers", (Node > *)$2, @1); > + RoleElem *n = makeNode(RoleElem); > + n->elem = makeDefElem("rolemembers", > (Node *)$2, @1); > + n->admin_spec = $3; > + $$ = (Node *)n; > } > - | IDENT > + | IDENT opt_admin_spec > { > /* >* We handle identifiers that aren't > parser keywords with >* the following special-case codes, to > avoid bloating the >* size of the main parser. 
>*/ > + RoleElem *n = makeNode(RoleElem); > + > + /* > + * Record whether the user specified > WITH GRANT OPTION. WITH ADMIN OPTION rather? > + * Note that for some privileges this > is always implied, > + * such as SUPERUSER, but we don't > reflect that here. > + */ > + n->admin_spec = $2; > + > diff --git a/src/include/catalog/pg_authid.dat > b/src/include/catalog/pg_authid.dat > index 6c28119fa1..4829a6dbd2 100644 > --- a/src/include/catalog/pg_authid.dat > +++ b/src/include/catalog/pg_authid.dat > @@ -22,67 +22,93 @@ > { oid => '10', oid_symbol => 'BOOTSTRAP_SUPERUSERID', >rolname => 'POSTGRES', rolsuper => 't', rolinherit => 't', >rolcreaterole => 't', rolcreatedb => 't', rolcanlogin => 't', > - rolreplication => 't', rolbypassrls => 't', rolconnlimit => '-1', > + rolreplication => 't', rolbypassrls => 't', adminherit => 't', > admcreaterole => 't', > + admcreatedb => 't', admcanlogin => 't', admreplication => 't', > admbypassrls => 't', > + admconnlimit => 't', admpassword => 't', admvaliduntil => 't', > rolconnlimit => '-1', >rolpassword => '_null_', rolvaliduntil =>
Re: [PATCH] New default role allowing to change per-role/database settings
Hi, Am Mittwoch, dem 08.09.2021 um 07:38 + schrieb shinya11.k...@nttdata.com: > > Thanks for letting me know, I've attached a rebased v4 of this > > patch, no other changes. Please find attached another rebase, sorry it took so long. > I tried it, but when I used set command, tab completion did not work > properly and an error occurred. It's been a while, but I believe the patch only changes the ALTER ROLE command, not the SET/SHOW commands. I thought that was more generally useful, can you explain the SET use-case?
"log_duration" So this would work: postgres=> SHOW ROLE; role -- rolesetadmin (1 row) postgres=> \du List of roles Role name | Attributes | Member of --++--- postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {} rolesetadmin | Cannot login | {pg_change_role_settings} test | Cannot login | {} postgres=> ALTER ROLE test SET log_statement='all'; ALTER ROLE postgres=> \drds List of settings Role | Database | Settings --+--+--- test | | log_statement=all (1 row) I am not sure if there is anything to be done about tab completion, can you clarify here? Michael -- Michael Banck Teamleiter PostgreSQL-Team Projektleiter Tel.: +49 2166 9901-171 E-Mail: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz From 4144cd5ccaa074b04dc4cbec3689f91e19f7f138 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Sat, 29 Jan 2022 11:44:57 +0100 Subject: [PATCH v5] Add new PGC_ADMINSET guc context and pg_change_role_settings predefined role. This commit introduces `administrator' as a middle ground between `superuser' and `user' for guc contexts. Those settings initially include all previously `superuser'-context settings from the 'Reporting and Logging' ('What/When to Log' but not 'Where to Log') and both 'Statistics' categories. Further settings
Re: should we enable log_checkpoints out of the box?
Am Freitag, dem 05.11.2021 um 11:27 -0300 schrieb Alvaro Herrera: > On 2021-Nov-03, Jan Wieck wrote: > > On 11/3/21 09:09, Robert Haas wrote: > > > > > For better or for worse, the distinction between ERROR, FATAL, > > > and > > > PANIC is entirely based on what we do after printing the message, > > > and NOT on how serious the message is. > > > > THAT is a real problem with our error handling and logging system. > > Agreed. Well that, and the fact that those distinctions are only made for user-facing events, whereas it seems to me we only distinguish between LOG and PANIC for server-facing events; maybe we need one or more additional levels here in order to make it easier for admins to see the really bad things that are happening? Michael -- Michael Banck Teamleiter PostgreSQL-Team Projektleiter Tel.: +49 2166 9901-171 E-Mail: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Re: should we enable log_checkpoints out of the box?
Hi, On Tue, Nov 02, 2021 at 11:55:23AM -0400, Robert Haas wrote: > I think shipping with log_checkpoints=on and > log_autovacuum_min_duration=10m or so would be one of the best things > we could possibly do to allow ex-post-facto troubleshooting of > system-wide performance issues. I don't disagree, but while we're at it, I'd like to throw log_lock_waits into the ring as well. IME, once you get to a situation where it pops up, the overall log volume usually dwarfs its output, but it's pretty important for being able to answer the "why was that query slow 5 days ago" question. Michael -- Michael Banck Team Lead PostgreSQL Project Manager Tel.: +49 2166 9901-171 Mail: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Our handling of personal data is subject to: https://www.credativ.de/en/contact/privacy/
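For reference, the settings discussed in this thread combine into a postgresql.conf fragment like the following sketch (the values are illustrative, not tuning advice):

```
# Logging settings discussed above; values are illustrative.
log_checkpoints = on
log_autovacuum_min_duration = 10min  # log autovacuum runs taking 10 minutes or more
log_lock_waits = on                  # log lock waits exceeding deadlock_timeout (1s by default)
```

Note that log_lock_waits only reports waits longer than deadlock_timeout, so its log volume stays low on healthy systems.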
Re: should we enable log_checkpoints out of the box?
Hi, On Sun, Oct 31, 2021 at 01:16:33PM -0700, Andres Freund wrote: > On 2021-10-31 15:43:57 -0400, Tom Lane wrote: > > Andres Freund writes: > > > On 2021-10-31 10:59:19 -0400, Tom Lane wrote: > > >> No DBA would be likely to consider it as anything but log spam. > > > > > I don't agree at all. No postgres instance should be run without > > > log_checkpoints enabled. Performance is poor if checkpoints are > > > triggered by anything but time, and that can only be diagnosed if > > > log_checkpoints is on. > > > > This is complete nonsense. > > Shrug. It's based on many years of doing or being around people doing > postgres support escalation shifts. And it's not like log_checkpoints > incurs meaningful overhead or causes that much log volume. It could be a bit of reverse survivorship bias, i.e., you're only seeing the pathological cases, while 99% of the Postgres installations out there just hum along fine without anybody ever having to look at the checkpoint entries in the log. But yeah, for serious production instances, it makes sense to turn that option on, but I'm not convinced it should be the default. To put another option on the table: maybe a compromise could be to log xlog checkpoints unconditionally, and the (checkpoint_timeout) time ones only if log_checkpoints is set (maybe with some exponential backoff to avoid log spam)? Michael -- Michael Banck Team Lead PostgreSQL Project Manager Tel.: +49 2166 9901-171 Mail: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Our handling of personal data is subject to: https://www.credativ.de/en/contact/privacy/
Re: Patching documentation of ALTER TABLE re column type changes on binary-coercible fields
Hi, I've stumbled over this topic today, and found your patch. On Thu, Jan 23, 2020 at 11:01:36PM -0800, Mike Lissner wrote: > Enclosed please find a patch to tweak the documentation of the ALTER TABLE > page. I believe this patch is ready to be applied to master and backported > all the way to 9.2. > > On the ALTER TABLE page, it currently notes that if you change the type of > a column, even to a binary coercible type: > > > any indexes on the affected columns must still be rebuilt. > > It appears this hasn't been true for about eight years, since 367bc426a. > > Here's the discussion of the topic from earlier today and yesterday: > > https://www.postgresql.org/message-id/flat/CAMp9%3DExXtH0NeF%2BLTsNrew_oXycAJTNVKbRYnqgoEAT01t%3D67A%40mail.gmail.com > > I haven't run tests, but I presume they'll be unaffected by a documentation > change. > > I've made an effort to follow the example of other people's patches I > looked at, but I haven't contributed here before. Happy to take another > stab at this if this doesn't hit the mark — though I hope it does. I love > and appreciate Postgresql and hope that I can do my little part to make it > better. > > For the moment, I haven't added this to commitfest. I don't know what it > is, but I suspect this is small enough somebody will just pick it up. > > Mike > Index: doc/src/sgml/ref/alter_table.sgml > IDEA additional info: > Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP > <+>UTF-8 > === > --- doc/src/sgml/ref/alter_table.sgml (revision > 6de7bcb76f6593dcd107a6bfed645f2142bf3225) > +++ doc/src/sgml/ref/alter_table.sgml (revision > 9a813e0896e828900739d95f78b5e4be10dac365) > @@ -1225,10 +1225,9 @@ > existing column, if the USING clause does not change > the column contents and the old type is either binary coercible to the > new > type or an unconstrained domain over the new type, a table rewrite is not > -needed; but any indexes on the affected columns must still be rebuilt. 
> -Table and/or index rebuilds may take a > -significant amount of time for a large table; and will temporarily > require > -as much as double the disk space. > +needed. Table and/or index rebuilds may take a significant amount of time > +for a large table; and will temporarily require as much as double the > disk > +space. > In general, I find the USING part in that paragraph a bit confusing; I think the main use case for ALTER COLUMN ... TYPE is without it. So I would suggest separating the two and making the general case (table and index rewrites are not needed if the type is binary coercible) its own sentence, without having USING in it. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
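To make the documented behavior concrete, here is a hedged psql sketch (table and index names are made up): converting varchar to text is binary coercible, so the statement changes only the catalogs, with neither a table rewrite nor an index rebuild:

```sql
-- Hypothetical example: varchar -> text is binary coercible, so this
-- ALTER neither rewrites the table nor rebuilds the index.
CREATE TABLE t (c varchar(20));
CREATE INDEX t_c_idx ON t (c);
ALTER TABLE t ALTER COLUMN c TYPE text;
```

On a large table this completes almost instantly, which is the point the documentation fix above is about.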
Re: add operator ^= to mean not equal (like != and <>)
Hi, On Tue, Aug 10, 2021 at 11:13:03AM +0200, Daniel Gustafsson wrote: > > On 10 Aug 2021, at 11:10, Andreas Karlsson wrote: > > What is he reason you want to add ^= is there any other databases > > which uses ^= for inequality? > > I assume it's because of Oracle compatibility which AFAIK is the only > database supporting ^=. DB2 also supports (or supported) it, but it's deprecated: https://www.ibm.com/docs/en/db2/9.7?topic=predicates-basic-predicate We encountered it in at least one customer setting, so we added it to db2fce: https://github.com/credativ/db2fce/blob/master/db2fce.sql#L588 Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
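For anyone who needs ^= today, it can be supplied by an extension rather than in core; a rough sketch along the lines of what db2fce does (the helper function name here is made up for illustration):

```sql
-- Hedged sketch: define ^= as an alias for <> via a polymorphic helper.
CREATE FUNCTION not_equal(anyelement, anyelement) RETURNS boolean
    LANGUAGE sql IMMUTABLE AS 'SELECT $1 <> $2';

CREATE OPERATOR ^= (
    LEFTARG  = anyelement,
    RIGHTARG = anyelement,
    FUNCTION = not_equal
);
```

After that, expressions like `a ^= b` resolve for any type that has a `<>` operator, without any changes to the core grammar.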
Re: Commitfest overflow
On Tue, Aug 03, 2021 at 11:55:50AM -0400, Bruce Momjian wrote: > On Tue, Aug 3, 2021 at 04:53:40PM +0100, Simon Riggs wrote: > > There are 273 patches in the queue for the Sept Commitfest already, so > > it seems clear the queue is not being cleared down each CF as it was > > before. We've been trying hard, but it's overflowing. > > > > Of those, about 50 items have been waiting more than one year, and > > about 25 entries waiting for more than two years. > > > > One problem is that many entries don't have official reviewers, though > > many people have commented on Hackers, making it difficult to track > > down items that actually need work. That wastes a lot of time for > > would-be reviewers (or at least, it has done for me). > > > > Please can there be some additional coordination to actively clear > > down this problem? I won't attempt to come up with ideas to do this, > > but others may wish to suggest ways that avoid Committer burn-out? > > > > I will be happy to volunteer to be part of an organized approach that > > spreads out the work. > > I wonder if our lack of in-person developer meetings is causing some of > our issues to not get closed. Well, or the lack of virtual developer meetings? Sure, in-person meetings are difficult now, but OTOH virtual meetings are very common these days and maybe could be tried as quarterly developer meetings or so and have the advantage that, modulo timezone problems, every (invited) developer could attend. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Re: [PATCH] New default role allowing to change per-role/database settings
Hi, Am Mittwoch, den 14.07.2021, 21:29 +0530 schrieb vignesh C: > On Wed, Apr 7, 2021 at 5:23 PM Michael Banck > wrote: > > Hi, > > > > Am Dienstag, den 06.04.2021, 15:37 +0200 schrieb Michael Banck: > > > Am Montag, den 05.04.2021, 14:33 -0400 schrieb Stephen Frost: > > > > Should drop the 'DEFAULT_' to match the others since the rename to > > > > 'predefined' roles went in. > > > > > > Right, will send a rebased patch ASAP. > > > > Here's a rebased version that also removes the ALTER DATABASE SET > > capability from this new predefined role and adds some documentation. > > The patch does not apply on Head anymore, could you rebase and post a > patch. I'm changing the status to "Waiting for Author". Thanks for letting me know, I've attached a rebased v4 of this patch, no other changes. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 E-Mail: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer, Geoff Richardson, Peter Lilley Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz From 144306ed2835d33b2c1b2f5c7b5c247c41d80df0 Mon Sep 17 00:00:00 2001 From: Michael Banck Date: Tue, 15 Dec 2020 20:53:59 +0100 Subject: [PATCH v4] Add new PGC_ADMINSET guc context and pg_change_role_settings predefined role. This commit introduces `administrator' as a middle ground between `superuser' and `user' for guc contexts. Those settings initially include all previously `superuser'-context settings from the 'Reporting and Logging' ('What/When to Log' but not 'Where to Log') and both 'Statistics' categories. Further settings could be added in follow-up commits. Also, a new predefined role pg_change_role_settings is introduced. 
This allows administrators (that are not superusers) that have been granted this role to change the above PGC_ADMINSET settings on a per-role basis like "ALTER ROLE [...] SET log_statement". The rationale is to allow certain settings that affect logging to be changed in setups where the DBA has access to the database log, but is not a full superuser. --- doc/src/sgml/ref/alter_role.sgml | 11 +++--- doc/src/sgml/user-manag.sgml | 7 src/backend/commands/user.c | 7 +++- src/backend/utils/misc/guc.c | 61 +++ src/include/catalog/pg_authid.dat | 5 +++ src/include/utils/guc.h | 1 + 6 files changed, 62 insertions(+), 30 deletions(-) diff --git a/doc/src/sgml/ref/alter_role.sgml b/doc/src/sgml/ref/alter_role.sgml index 5aa5648ae7..d332c29c71 100644 --- a/doc/src/sgml/ref/alter_role.sgml +++ b/doc/src/sgml/ref/alter_role.sgml @@ -115,11 +115,12 @@ ALTER ROLE { role_specification | A Superusers can change anyone's session defaults. Roles having - CREATEROLE privilege can change defaults for non-superuser - roles. Ordinary roles can only set defaults for themselves. - Certain configuration variables cannot be set this way, or can only be - set if a superuser issues the command. Only superusers can change a setting - for all roles in all databases. + CREATEROLE privilege or which are a member of the + pg_change_role_settings predefined role can change + defaults for non-superuser roles. Ordinary roles can only set defaults for + themselves. Certain configuration variables cannot be set this way, or can + only be set if a superuser issues the command. Only superusers can change a + setting for all roles in all databases. diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml index fe0bdb7599..48a7aa8cc2 100644 --- a/doc/src/sgml/user-manag.sgml +++ b/doc/src/sgml/user-manag.sgml @@ -541,6 +541,13 @@ DROP ROLE doomed_role; Read all configuration variables, even those normally visible only to superusers. 
+ + pg_change_role_settings + Allow changing role-based settings for all non-superuser roles + with ALTER ROLE rolename SET + varname TO + value. + pg_read_all_stats Read all pg_stat_* views and use various statistics related extensions, diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c index aa69821be4..652f8aca21 100644 --- a/src/backend/commands/user.c +++ b/src/backend/commands/user.c @@ -876,7 +876,8 @@ AlterRoleSet(AlterRoleSetStmt *stmt) /* * To mess with a superuser you gotta be superuser; else you need - * createrole, or just want to change your own settings + * createrole, the pg_change_role_settings default role, or just want + * to change your own settings. */ if (rolef
Failover messages in Timeline History
Hi, if a failover (or probably a switchover, at least in the way Patroni does it) occurs, the timeline history (e.g. via "patronictl history"[1]) seems to read "no recovery target specified". That's correct, of course, from a PITR perspective, but for the (possibly more common?) promotion-of-a-standby-due-to-failover/switchover case it is rather misleading. I wonder whether it could be made more informative; like "no recovery target or failover" or "(standby) promotion without recovery target"? Michael [1] root@pg1:~# patronictl -c /etc/patroni/13-main.yml history | head ++--+--+--+ | TL | LSN | Reason | Timestamp | ++--+--+--+ | 1 | 83886296 | no recovery target specified | 2021-06-18T20:04:11.645437+00:00 | | 2 | 83886928 | no recovery target specified | 2021-06-18T20:08:45.820304+00:00 | | 3 | 83887384 | no recovery target specified | 2021-06-19T05:57:50.431980+00:00 | | 4 | 83887840 | no recovery target specified | 2021-06-19T08:32:55.527975+00:00 | | 5 | 84017040 | no recovery target specified | 2021-06-19T12:05:40.495982+00:00 | | 6 | 84019264 | no recovery target specified | 2021-06-19T15:51:49.983987+00:00 | | 7 | 84135720 | no recovery target specified | 2021-06-20T03:46:22.775851+00:00 | -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
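For context, the "Reason" column above is read straight from the timeline history files that the new primary writes at promotion: each line holds the parent timeline ID, the LSN of the switchpoint, and the reason string. A sketch of what a file such as pg_wal/00000002.history might contain (the LSN is made up):

```
1	0/5000058	no recovery target specified
```

So changing the wording would mean changing the reason string PostgreSQL writes into these files when a standby is promoted without a recovery target.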
Re: Commitfest app vs. pgsql-docs
On Thu, May 20, 2021 at 09:39:13AM +0900, Michael Paquier wrote: > On Wed, May 19, 2021 at 03:35:00PM -0400, Tom Lane wrote: > > Magnus Hagander writes: > >> Changing that to look globally can certainly be done. It takes a bit > >> of work I think, as there are no API endpoints today that will do > >> that, but those could be added. > > > > Ah. Personally, I'd settle for it checking -hackers, -docs and -bugs. > > Perhaps there's some case for -general as well. > > FWIW, I have seen cases for -general in the past. I was under the impression that posting patches to -hackers meant an implicit acknowledgement that this code can be used by the Postgres project under the Postgres license and the PGDG copyright. Is this the same for all lists, and/or does this need to be amended then somehow (or am I getting this totally wrong)? I assume the point of cross-linking patches to the commitfest is to get them into Postgres after all. Also, I'd have expected that any meaningful patch surfacing on -general would be cross-posted to -hackers anyway (less/not so for -bugs and -docs). Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Re: Granting control of SUSET gucs to non-superusers
Hi, On Fri, Apr 30, 2021 at 04:19:22PM -0700, Mark Dilger wrote: > PostgreSQL defines a number of GUCs that can only be set by > superusers. I would like to support granting privileges on subsets of > these to non-superuser roles, inspired by Stephen Frost's recent work > on pg_read_all_data and pg_write_all_data roles. > > The specific use case motivating this work is that of a PostgreSQL > service provider. The provider guarantees certain aspects of the > service, such as periodic backups, replication, uptime, availability, > etc., while making no guarantees of other aspects, such as performance > associated with the design of the schema or the queries executed. The > provider should be able to grant to the tenant privileges to set any > GUC which cannot be used to "escape the sandbox" and interfere with > the handful of metrics being guaranteed. Given that the guarantees > made by one provider may differ from those made by another, the exact > set of GUCs which the provider allows the tenant to control may > differ. > > By my count, there are currently 50 such GUCs, already broken down > into 15 config groups. Creating a single new role pg_set_all_gucs > seems much too coarse a control, but creating 50 new groups may be > excessive. We could certainly debate which GUCs could be used to > escape the sandbox vs. which ones could not, but I would prefer a > design that allows the provider to make that determination. The patch > I would like to submit would only give the provider the mechanism for > controlling these things, but would not make the security choices for > them. > > Do folks think it would make sense to create a role per config group? > Adding an extra 15 default roles seems high to me, but organizing the > feature this way would make the roles easier to document, because > there would be a one-to-one correlation between the roles and the > config groups. 
> > I have a WIP patch that I'm not attaching, but if I get any feedback, > I might be able to adjust the patch before the first version posted. > The basic idea is that it allows things like: > > GRANT pg_set_stats_monitoring TO tenant_role; > > And then tenant_role could, for example > > SET log_parser_stats TO off; Just saying, I've proposed something very similar, albeit for a narrower scope (mostly the Reporting and Logging category) here: https://www.postgresql.org/message-id/flat/c2ee39152957af339ae6f3e851aef09930dd2faf.ca...@credativ.de Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Sascha Heuer Unser Umgang mit personenbezogenen Daten unterliegt folgenden Bestimmungen: https://www.credativ.de/datenschutz
Re: Problems around compute_query_id
Hi,

On Mon, Apr 12, 2021 at 02:56:59PM +0800, Julien Rouhaud wrote:
> On Mon, Apr 12, 2021 at 03:12:40PM +0900, Michael Paquier wrote:
> > Fujii-san has reported on Twitter that enabling the computation of
> > query IDs does not work properly with log_statement as the query ID is
> > calculated at parse analyze time and the query is logged before that.
> > As far as I can see, that's really a problem as any queries logged are
> > combined with a query ID of 0, and log parsers would not really be
> > able to use that, even if the information provided by for example
> > log_duration gives the computed query ID present in pg_stat_activity.
>
> I don't see any way around that. The goal of log_statement is to log all
> syntactically valid queries, including otherwise invalid queries. For
> instance, it would log
>
> SELECT notacolumn;
>
> and there's no way to compute a queryid in that case. I think that
> log_statement already causes some issues with log parsers. At least
> pgbadger recommends to avoid using that:
>
> "Do not enable log_statement as its log format will not be parsed by
> pgBadger."
>
> I think we should simply document that %Q is not compatible with
> log_statement.

What about log_statement_sample_rate? Does compute_query_id have the
same problem with that?


Michael
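To illustrate the interaction being discussed: with settings like the fragment below, lines emitted by log_statement are written at parse time, before the query ID has been computed, so %Q renders as 0 there, while lines written at query completion (e.g. via log_min_duration_statement) can carry the real ID. A sketch of a postgresql.conf fragment; compute_query_id and the %Q escape are the PostgreSQL 14 features under discussion:

```
# postgresql.conf (PostgreSQL 14 devel)
compute_query_id = on
log_line_prefix = '%m [%p] %u@%d QID=%Q '  # %Q: query ID, 0 if not yet computed
log_statement = 'all'               # logged at parse time -> QID=0 here
log_min_duration_statement = 0      # logged at completion -> real query ID
```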
Re: [PATCH] New default role allowing to change per-role/database settings
Hi,

On Tuesday, April 6, 2021 at 15:37 +0200, Michael Banck wrote:
> On Monday, April 5, 2021 at 14:33 -0400, Stephen Frost wrote:
> > Should drop the 'DEFAULT_' to match the others since the rename to
> > 'predefined' roles went in.
>
> Right, will send a rebased patch ASAP.

Here's a rebased version that also removes the ALTER DATABASE SET
capability from this new predefined role and adds some documentation.


Michael

From a581d62db1751ba94d800ea2c6c6dfb04e4bcbf7 Mon Sep 17 00:00:00 2001
From: Michael Banck
Date: Tue, 15 Dec 2020 20:53:59 +0100
Subject: [PATCH v3] Add new PGC_ADMINSET guc context and
 pg_change_role_settings predefined role.

This commit introduces `administrator' as a middle ground between
`superuser' and `user' for GUC contexts.

Those settings initially include all previously `superuser'-context
settings from the 'Reporting and Logging' ('What/When to Log' but not
'Where to Log') and both 'Statistics' categories. Further settings
could be added in follow-up commits.

Also, a new predefined role pg_change_role_settings is introduced.
This allows administrators (that are not superusers) that have been
granted this role to change the above PGC_ADMINSET settings on a
per-role basis like "ALTER ROLE [...] SET log_statement".

The rationale is to allow certain settings that affect logging to be
changed in setups where the DBA has access to the database log, but is
not a full superuser.
---
 doc/src/sgml/ref/alter_role.sgml  | 11 +++---
 doc/src/sgml/user-manag.sgml      |  7
 src/backend/commands/user.c       |  7 ++--
 src/backend/utils/misc/guc.c      | 56 +++
 src/include/catalog/pg_authid.dat |  5 +++
 src/include/utils/guc.h           |  1 +
 6 files changed, 59 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/ref/alter_role.sgml b/doc/src/sgml/ref/alter_role.sgml
index 5aa5648ae7..d332c29c71 100644
--- a/doc/src/sgml/ref/alter_role.sgml
+++ b/doc/src/sgml/ref/alter_role.sgml
@@ -115,11 +115,12 @@ ALTER ROLE { role_specification | A
    Superusers can change anyone's session defaults. Roles having
-   CREATEROLE privilege can change defaults for non-superuser
-   roles. Ordinary roles can only set defaults for themselves.
-   Certain configuration variables cannot be set this way, or can only be
-   set if a superuser issues the command. Only superusers can change a setting
-   for all roles in all databases.
+   CREATEROLE privilege or which are a member of the
+   pg_change_role_settings predefined role can change
+   defaults for non-superuser roles. Ordinary roles can only set defaults for
+   themselves. Certain configuration variables cannot be set this way, or can
+   only be set if a superuser issues the command. Only superusers can change a
+   setting for all roles in all databases.

diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index fe0bdb7599..48a7aa8cc2 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -541,6 +541,13 @@ DROP ROLE doomed_role;
    Read all configuration variables, even those normally visible only to
    superusers.
+
+   pg_change_role_settings
+   Allow changing role-based settings for all non-superuser roles
+   with ALTER ROLE rolename SET
+   varname TO
+   value.
+
    pg_read_all_stats
    Read all pg_stat_* views and use various statistics related extensions,

diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index a8c5188ebc..3039c0d69c 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -936,7 +936,8 @@ AlterRoleSet(AlterRoleSetStmt *stmt)
 	/*
 	 * To mess with a superuser you gotta be superuser; else you need
-	 * createrole, or just want to change your own settings
+	 * createrole, the pg_change_role_settings default role, or just want
+	 * to change your own settings.
 	 */
 	if (roleform->rolsuper)
 	{
@@ -947,7 +948,9 @@ AlterRoleSet(AlterRoleSetStmt *stmt)
 	}
 	else
 	{
-		if (!have_createrole_privilege() && roleid != GetUserId())
+		if (!have_createrole_privilege() &&
+			roleid != GetUserId() &&
+			!is_member_of_role(GetUserId(), ROLE_PG_CHANGE_ROLE_SETTINGS))
 			ereport(ERROR,
 					(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 					 errmsg("permission denied")));
diff --git a/src/backend/utils/misc/guc.c b/src/backend/ut
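Assuming the patch above is applied, usage would look roughly like the following sketch. Note that pg_change_role_settings and the PGC_ADMINSET set of GUCs are introduced by this patch and are not part of stock PostgreSQL; the role and database names are made up for illustration:

```sql
-- As superuser: delegate role-level logging/statistics settings
CREATE ROLE log_admin LOGIN;
GRANT pg_change_role_settings TO log_admin;

-- As log_admin (not a superuser): now permitted for PGC_ADMINSET GUCs
ALTER ROLE app_user SET log_statement = 'all';
ALTER ROLE app_user SET log_min_duration_statement = 250;

-- Still denied: the v3 patch removed the database-wide capability
-- ALTER DATABASE appdb SET log_statement = 'all';  -- permission denied
```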