Re: New PostgreSQL Contributors
Congratulations to all! This seems the sort of thing we should crosspost to -general and/or -announce, no? Cheers, Greg
Re: Truncate logs by max_log_size
On Fri, Sep 27, 2024 at 6:37 AM Andrey M. Borodin wrote:

> Consider max_log_size = 10Mb. The perspective might look very different.
> It's not about WHERE anymore. It's a guard against heavy abuse.

Can you elaborate on this? Do you mean someone purposefully writing large entries to your log file?
Re: Changing the default random_page_cost value
On Fri, Sep 27, 2024 at 12:03 PM Dagfinn Ilmari Mannsåker wrote:

> It might also be worth mentioning cloudy block storage (e.g. AWS' EBS),
> which is typically backed by SSDs, but has extra network latency.

That seems a little too in the weeds for me, but wording suggestions are welcome. To get things moving forward, I made a doc patch which changes a few things, namely:

* Mentions the distinction between ssd and hdd right up front.
* Moves the tablespace talk to the very end, as tablespace use is getting rarer (again, thanks in part to ssds)
* Mentions the capability to set per-database and per-role since we mention per-tablespace.
* Removes a lot of the talk of caches and justifications for the 4.0 setting. While those are interesting, I've been tuning this parameter for many years and never really cared about the "90% cache rate". The proof is in the pudding: rpc is the canonical "try it and see" parameter. Tweak. Test. Repeat.

Cheers, Greg

0001-Lower-random_page_cost-default-to-1.2-and-update-docs-about-it.patch
Description: Binary data
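[Editor's note: for readers following along, the per-database, per-role, and per-tablespace settings mentioned above use standard commands; the database, role, and tablespace names below are made up for illustration, and the values are examples rather than recommendations.]

```sql
-- Session level: try a lower value, then re-run EXPLAIN ANALYZE
SET random_page_cost = 1.2;

-- Per-database
ALTER DATABASE mydb SET random_page_cost = 1.2;

-- Per-role
ALTER ROLE reporting_user SET random_page_cost = 1.2;

-- Per-tablespace (e.g. a tablespace still on spinning disk)
ALTER TABLESPACE hdd_space SET (random_page_cost = 4.0);
```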
Changing the default random_page_cost value
tl;dr let's assume SSDs are popular and HDDs are the exception, and flip our default. As I write this email, it's the year 2024. I think it is time we lowered our "default" setting of random_page_cost (as set in postgresql.conf.sample and the docs). Even a decade ago, the current default of 4 was considered fairly conservative and was often lowered. The git log shows that this value was last touched in 2006, during the age of spinning metal. We are now in a new era, the age of SSDs, and thus we should lower this default value to reflect the fact that the vast majority of people using Postgres these days are doing so on solid state drives. We tend to stay ultra-conservative in all of our settings, but we also need to recognize when there has been a major shift in the underlying hardware - and in the calculations that our defaults are based on. Granted, there are other factors involved, and yes, perhaps we should tweak some of the similar settings as well, but random_page_cost is the one setting most out of sync with today's hardware realities. So I'll be brave and throw a number out there: 1.2. And change our docs to say something like "if you are using an older hard disk drive technology, you may want to try raising rpc" to replace our fairly-hidden note about SSDs buried in the last sentence - of the fourth paragraph - of the rpc docs. Real data about performance on today's SSDs is welcome, and/or some way to generate a more accurate default. Cheers, Greg
Re: Why mention to Oracle ?
> And if we create a page like > https://www.postgresql.org/about/featurematrix/ > But instead of Postgres versions we have other vendors. > This sounds like something that would fit well on the Postgres wiki: https://wiki.postgresql.org/ Cheers, Greg
Re: Exit walsender before confirming remote flush in logical replication
Thanks for everyone's work on this, I am very interested in it getting into a release. What is the status of this? My use case is Patroni - when it needs to do a failover, it shuts down the primary. However, large transactions can cause it to stay in the "shutting down" state for a long time, which means your entire HA system is now non-functional. I like the idea of a new flag. I'll test this out soon if the original authors want to make a rebased patch. This thread is old, so if I don't hear back in a bit, I'll create and test a new one myself. :) Cheers, Greg
Re: Jargon and acronyms on this mailing list
On Tue, Sep 3, 2024 at 11:50 AM Nathan Bossart wrote:

> Do you think these acronyms make it difficult for some to contribute to
> Postgres? I've always felt that they were pretty easy to figure out and a
> nice way to save some typing for common phrases, but I'm not sure it's ever
> really been discussed.

I do think it raises the bar a bit, especially for non-native-English speakers. Granted, the learning curve is not super high, and context plus web searching can usually get people there, but the lists are dense enough already, so I wanted to help people out. Also, mailing lists in general are a pretty foreign concept to young developers, and AFAICT [1], not all the acronyms have crossed over to the texting world.

Cheers, Greg

[1] See what I did there? [2]
[2] Do people still learn about and use footnotes?
Jargon and acronyms on this mailing list
I normally wouldn't mention my blog entries here, but this one was about the hackers mailing list, so wanted to let people know about it in case you don't follow Planet Postgres. I scanned the last year's worth of posts and gathered the most used acronyms and jargon. The most commonly used acronym was IMO (in my opinion), followed by FWIW (for what it's worth), and IIUC (if I understand correctly). The complete list can be found in the post below, I'll refrain from copying everything here. https://www.crunchydata.com/blog/understanding-the-postgres-hackers-mailing-list Cheers, Greg
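[Editor's note: for the curious, a scan like the one described above needs nothing fancy. Here is a minimal sketch of the counting approach; the regex and the acronym list are simplified assumptions, not the actual script behind the blog post.]

```python
import re
from collections import Counter

def count_acronyms(text, known=frozenset({"IMO", "FWIW", "IIUC", "AFAICT", "HTH"})):
    """Count occurrences of known all-caps acronyms in a block of text."""
    # Grab standalone runs of 2-6 capital letters, then keep only known ones
    words = re.findall(r"\b[A-Z]{2,6}\b", text)
    return Counter(w for w in words if w in known)

sample = "IMO this is fine. FWIW, IIUC the patch works. IMO it ships."
print(count_acronyms(sample))  # Counter({'IMO': 2, 'FWIW': 1, 'IIUC': 1})
```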
Re: Better error message when --single is not the first arg to postgres executable
On Mon, Aug 26, 2024 at 11:43 AM Nathan Bossart wrote:

> On Sun, Aug 25, 2024 at 01:14:36PM -0400, Greg Sabino Mullane wrote:
> > I'm not happy about making this and the const char[] change their structure
> > based on the ifdefs - could we not just leave forkchild in? Their usage is
> > already protected by the ifdefs in the calling code.
>
> Here's an attempt at this.

Looks great, thank you.
Re: Enable data checksums by default
On Mon, Aug 26, 2024 at 3:46 PM Nathan Bossart wrote:

> Should we error if both --data-checksums and --no-data-checksums are
> specified? IIUC with 0001, we'll use whichever is specified last.

Hmmm, that is a good question. We have never (to my recollection) flipped a default quite like this before. I'm inclined to leave it as "last one wins", as I can see automated systems appending their desired selection to the end of the arg list, and expecting it to work.

> nitpick: these 4 patches are small enough that they could likely be
> combined and committed together.

This was split per request upthread, which I do agree with.

> I think it's fair to say we should make the pg_upgrade experience nicer
> once the default changes, but IMHO that needn't block actually changing the
> default.

+1

Cheers, Greg
Re: Better error message when --single is not the first arg to postgres executable
I'm not opposed to this new method, as long as the error code improves. :)

+typedef enum Subprogram
+{
+	SUBPROGRAM_CHECK,
+	SUBPROGRAM_BOOT,
+#ifdef EXEC_BACKEND
+	SUBPROGRAM_FORKCHILD,
+#endif

I'm not happy about making this and the const char[] change their structure based on the ifdefs - could we not just leave forkchild in? Their usage is already protected by the ifdefs in the calling code. Heck, we could put SUBPROGRAM_FORKCHILD first in the list, keep the ifdef in parse_subprogram, and start regular checking with i = 1; This would reduce it to a single #ifdef.

Cheers, Greg
Re: Enable data checksums by default
Rebased and reworked patches attached: v2-0001-Add-new-initdb-argument-no-data-checksums-to-force-checksums-off.patch - Adds --no-data-checksums to initdb.c, adjusts --help and sgml docs, adds simple tests v2-0002-Allow-tests-to-force-checksums-off-when-calling-init.patch - Adjusts the Perl tests to use the new flag as needed v2-0003-Change-initdb-to-default-to-using-data-checksums.patch - Flips the boolean to true, adjusts the tests to match it, tweaks docs v2-0004-Tweak-docs-to-reduce-possible-impact-of-data-checksums.patch - Changes "noticeable" penalty to "small" penalty Cheers, Greg v2-0004-Tweak-docs-to-reduce-possible-impact-of-data-checksums.patch Description: Binary data v2-0001-Add-new-initdb-argument-no-data-checksums-to-force-checksums-off.patch Description: Binary data v2-0003-Change-initdb-to-default-to-using-data-checksums.patch Description: Binary data v2-0002-Allow-tests-to-force-checksums-off-when-calling-init.patch Description: Binary data
Re: [PATCH] GROUP BY ALL
On Tue, Jul 23, 2024 at 9:37 AM David Christensen wrote: > I'm not married to the exact syntax of this feature; anything else short > and consistent could work if `ALL` is considered to potentially > gain a different interpretation in the future. > GROUP BY * Just kidding. But a big +1 to the whole concept. It would have been extraordinarily useful over the years. Cheers, Greg
Re: Normalize queries starting with SET for pg_stat_statements
Now that I've spent some time away from this, I'm reconsidering why we are going through all the trouble of semi-jumbling SET statements. Maybe we just keep it simple and everything becomes "SET myvar = $1" or even "SET myvar" full stop? I'm having a hard time finding a real-world situation in which we need to distinguish different SET/RESET items within pg_stat_statements. Cheers, Greg
Re: Enable data checksums by default
On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut wrote:

> My understanding was that the reason for some hesitation about adopting
> data checksums was the performance impact. Not the checksumming itself,
> but the overhead from hint bit logging. The last time I looked into that,
> you could get performance impacts on the order of 5% tps. Maybe that's
> acceptable, and you of course can turn it off if you want the extra
> performance. But I think this should be discussed in this thread.

Fair enough. I think the performance impact is acceptable, as evidenced by the large number of people that turn it on. And it is easy enough to turn it off again, either via --no-data-checksums or pg_checksums --disable. I've come across people who have regretted not throwing a -k into their initial initdb, but have not yet come across someone who has the opposite regret. When I did some measurements some time ago, I found numbers much less than 5%, but of course it depends on a lot of factors.

> About the claim that it's already the de-facto standard. Maybe that is
> approximately true for "serious" installations. But AFAICT, the popular
> packagings don't enable checksums by default, so there is likely a
> significant middle tier between "just trying it out" and serious
> production use that don't have it turned on.

I would push back on that "significant" a good bit. The number of Postgres installations in the cloud is very likely to dwarf the total package installations. Maybe not 10 years ago, but now? Maybe someone from Amazon can share some numbers. Not that we have any way to compare against package installs :) But anecdotally the number of people who mention RDS etc. on the various fora has exploded.

> For those uses, this change would render pg_upgrade useless for upgrades
> from an old instance with default settings to a new instance with default
> settings. And then users would either need to re-initdb with checksums
> turned back off, or I suppose run pg_checksums on the old instance before
> upgrading? This is significant additional complication.

Meh, re-running initdb with --no-data-checksums seems a fairly low hurdle.

> And packagers who have built abstractions on top of pg_upgrade (such as
> Debian pg_upgradecluster) would also need to implement something to manage
> this somehow.

How does it deal with clusters with checksums enabled now?

> I'm thinking pg_upgrade could have a mode where it adds the checksum
> during the upgrade as it copies the files (essentially a subset
> of pg_checksums). I think that would be useful for that middle tier of
> users who just want a good default experience.

Hm... might be a bad experience if it forces a switch out of --link mode. Perhaps a warning at the end of pg_upgrade that suggests running pg_checksums on your new cluster if you want to enable checksums?

Cheers, Greg
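[Editor's note: to make that "run pg_checksums afterwards" suggestion concrete, the post-upgrade step is a short operation. A sketch, assuming a hypothetical v18 cluster at an illustrative path; pg_checksums requires the cluster to be cleanly shut down first.]

```
# After pg_upgrade completes, and before starting the new cluster:
pg_checksums --enable --progress -D /var/lib/postgresql/18/data

# Verify the result:
pg_checksums --check -D /var/lib/postgresql/18/data
```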
Re: Subscription to Postgres Releases via ICS
Sounds like a good idea. You probably want to ask on pgsql-www. I imagine it would have to be coded against this: https://git.postgresql.org/gitweb/?p=pgweb.git A working patch would probably be nice to get things started, but I recognize it's a little bit of chicken-and-egg. Cheers, Greg
Re: Enable data checksums by default
Thank you for the feedback. Please find attached three separate patches. One to add a new flag to initdb (--no-data-checksums), one to adjust the tests to use this flag as needed, and the final to make the actual switch of the default value (along with tests and docs). Cheers, Greg 0001-Add-new-initdb-argument-no-data-checksums-to-force-checksums-off.patch Description: Binary data 0002-Allow-tests-to-force-checksums-off-when-calling-init.patch Description: Binary data 0003-Change-initdb-to-default-to-using-data-checksums.patch Description: Binary data
Re: Enable data checksums by default
On Wed, Aug 7, 2024 at 4:43 AM Michael Banck wrote:

> I think the last time we discussed this the consensus was that
> computational overhead of computing the checksums is pretty small for
> most systems (so the above change seems warranted regardless of whether
> we switch the default), but turning on wal_compression also turns on
> wal_log_hints, which can increase WAL by quite a lot. Maybe this is
> covered elsewhere in the documentation (I just looked at the patch), but
> if not, it probably should be added here as a word of caution.

Yeah, that seems like something beyond this patch? Certainly we should mention wal_compression in the release notes if the default changes. I mean, I feel wal_log_hints should probably default to on as well, but I've honestly never really given it much thought because my fingers are trained to type "initdb -k". I've been using data checksums for roughly a decade now. I think the only time I've NOT used checksums was when I was doing checksum overhead measurements, or hacking on the pg_checksums program.

> I think we usually do not mention when a feature was added/changed, do
> we? So I'd just write "(default: enabled)" or whatever is the style of
> the surrounding options.

+1

> > + {"no-data-checksums", no_argument, NULL, 20},
>
> Does it make sense to add -K (capital k) as a short-cut for this? I
> think this is how we distinguish on/off for pg_dump (-t/-T etc.) but
> maybe that is not wider project policy.

I'd rather not. Better to keep it explicit rather than some other weird letter that has no mnemonic value.

Cheers, Greg
Enable data checksums by default
Please find attached a patch to enable data checksums by default. Currently, initdb only enables data checksums if passed the --data-checksums or -k argument. There was some hesitation years ago when this feature was first added, leading to the current situation where the default is off. However, many years later, there is wide consensus that this is an extraordinarily safe, desirable setting. Indeed, most (if not all) of the major commercial and open source Postgres systems currently turn this on by default. I posit you would be hard-pressed to find many systems these days in which it has NOT been turned on. So basically we have a de-facto standard, and I think it's time we flipped the switch to make it on by default. The patch is simple enough: literally flipping the boolean inside of initdb.c, and adding a new argument '--no-data-checksums' for those instances that truly want the old behavior. One place that still needs the old behavior is our internal tests for pg_checksums and pg_amcheck, so I added a new argument to init() in PostgreSQL/Test/Cluster.pm to allow those to still pass their tests. This is just the default - people are still more than welcome to turn it off with the new flag. The pg_checksums program is another option that actually argues for having the default "on", as switching to "off" once initdb has been run is trivial. Yes, I am aware of the previous discussions on this, but the world moves fast - wal compression is better than in the past, vacuum is better now, and data-checksums being on is such a complete default in the wild, it feels weird and a disservice that we are not running all our tests like that. 
Cheers, Greg

From 12ce067f5ba64414d1d14c5f2e763d04cdfacd13 Mon Sep 17 00:00:00 2001
From: Greg Sabino Mullane
Date: Tue, 6 Aug 2024 18:18:56 -0400
Subject: [PATCH] Make initdb enable data checksums by default
---
 doc/src/sgml/ref/initdb.sgml              |  4 ++-
 src/bin/initdb/initdb.c                   |  6 -
 src/bin/initdb/t/001_initdb.pl            | 31 +--
 src/bin/pg_amcheck/t/003_check.pl         |  2 +-
 src/bin/pg_amcheck/t/004_verify_heapam.pl |  2 +-
 src/bin/pg_checksums/t/002_actions.pl     |  2 +-
 src/test/perl/PostgreSQL/Test/Cluster.pm  |  7 +
 7 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index bdd613e77f..511f489d34 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -267,12 +267,14 @@ PostgreSQL documentation
 Use checksums on data pages to help detect corruption by the
 I/O system that would otherwise be silent. Enabling checksums
-may incur a noticeable performance penalty. If set, checksums
+may incur a small performance penalty. If set, checksums
 are calculated for all objects, in all databases. All checksum
 failures will be reported in the
 pg_stat_database view.
 See for details.
+As of version 18, checksums are enabled by default. They can be
+disabled by use of --no-data-checksums.

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index f00718a015..ce7d3e99e5 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -164,7 +164,7 @@
 static bool noinstructions = false;
 static bool do_sync = true;
 static bool sync_only = false;
 static bool show_setting = false;
-static bool data_checksums = false;
+static bool data_checksums = true;
 static char *xlog_dir = NULL;
 static int wal_segment_size_mb = (DEFAULT_XLOG_SEG_SIZE) / (1024 * 1024);
 static DataDirSyncMethod sync_method = DATA_DIR_SYNC_METHOD_FSYNC;
@@ -3121,6 +3121,7 @@ main(int argc, char *argv[])
 	{"waldir", required_argument, NULL, 'X'},
 	{"wal-segsize", required_argument, NULL, 12},
 	{"data-checksums", no_argument, NULL, 'k'},
+	{"no-data-checksums", no_argument, NULL, 20},
 	{"allow-group-access", no_argument, NULL, 'g'},
 	{"discard-caches", no_argument, NULL, 14},
 	{"locale-provider", required_argument, NULL, 15},
@@ -3319,6 +3320,9 @@ main(int argc, char *argv[])
 	if (!parse_sync_method(optarg, &sync_method))
 		exit(1);
 	break;
+case 20:
+	data_checksums = false;
+	break;
 default:
 	/* getopt_long already emitted a complaint */
 	pg_log_error_hint("Try \"%s --help\" for more information.", progname);
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 06a35ac0b7..64395ec531 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -69,16 +69,11 @@
 mkdir $datadir;
 }
 }
-# Control file should tell that data checksums are disabled by default.
+# Control file should tell that data checksums are enabled by default.
 command_like(
 [ 'pg_c
Normalize queries starting with SET for pg_stat_statements
I saw a database recently where some app was inserting the source port into the application_name field, which meant that pg_stat_statements.max was quickly reached and queries were simply pouring in and out of pg_stat_statements, dominated by some "SET application_name = 'myapp 10.0.0.1:1234'" calls. Which got me thinking, is there really any value to having non-normalized 'SET application_name' queries inside of pg_stat_statements? Or any SET stuff, for that matter? Attached please find a small proof-of-concept for normalizing/de-jumbling certain SET queries. Because we only want to cover the VAR_SET_VALUE parts of VariableSetStmt, a custom jumble func was needed. There are a lot of funky SET things inside of gram.y as well that don't do the standard SET X = Y formula (e.g. SET TIME ZONE, SET SCHEMA). I tried to handle those as best I could, and carved a couple of exceptions for time zones and xml. I'm not sure where else to possibly draw lines. Obviously calls to time zone have a small and finite pool of possible values, so easy enough to exclude them, while things like application_name and work_mem are fairly infinite, so great candidates for normalizing. One could argue for simply normalizing everything, as SET is trivially fast for purposes of performance tracking via pg_stat_statements, so who cares if we don't have the exact string? That's what regular logging is for, after all. Most importantly, less unique queryids means less chance that errant SETs will crowd out the more important stuff. 
In summary, we want to change this:

SELECT calls, query from pg_stat_statements where query ~ 'set' order by 1;

 1 | set application_name = 'alice'
 1 | set application_name = 'bob'
 1 | set application_name = 'eve'
 1 | set application_name = 'mallory'

to this:

SELECT calls, query from pg_stat_statements where query ~ 'set' order by 1;

 4 | set application_name = $1

I haven't updated the regression tests yet, until we reach a consensus on how thorough the normalizing should be. But there is a new test to exercise the changes in gram.y.

Cheers, Greg

0001-Normalize-queries-starting-with-SET.patch
Description: Binary data
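[Editor's note: the effect of the proposed normalization can be mimicked outside the server. A toy Python sketch follows; this regex is a gross simplification of what the jumbling code actually does, and handles only the simple "SET name = value" form.]

```python
import re

def normalize_set(query):
    """Replace the value in a simple "SET name = value" statement with $1,
    roughly mimicking pg_stat_statements normalization of SET commands."""
    return re.sub(
        r"(?i)^(set\s+[\w.]+\s*=\s*).+$",  # keep "SET name = ", drop the value
        r"\g<1>$1",
        query.strip(),
    )

print(normalize_set("set application_name = 'myapp 10.0.0.1:1234'"))
# set application_name = $1
```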
Re: Set log_lock_waits=on by default
> > Are there any others who have an opinion on this? Big +1 to having it on by default. It's already one of the first things I turn on by default on any system I come across. The log spam is minimal, compared to all the other stuff that ends up in there. And unlike most of that stuff, this is output you generally want and need, when problems start occurring. Cheers, Greg
Re: Send duration output to separate log files
On Thu, Jul 11, 2024 at 6:47 AM Alastair Turner wrote:

> The other category of logging which would benefit from a separate file is
> audit. It also can create massive volumes of log content. Splitting audit
> information off into a separate file for use by a separate team or function
> is also a request I have heard from some financial institutions adopting
> Postgres. With audit being provided by an extension, this would become
> quite an intrusive change.

Thanks for the feedback. I agree pgaudit is another thing that can create massive log files, and should be solved at some point. However, I wanted to keep this patch self-contained to in-core stuff. And pgaudit is already an odd duck, in that it puts CSV into your stderr stream (and into your json!). Ideally it would put a single CSV stream into a separate csv file. Perhaps even something that did not necessarily live in log_directory.

> Would an extension be able to safely modify the message_type field you have
> added using emit_log_hook? If so, the field becomes more of a log
> destination label than a type marker. If an extension could hook into the
> log file creation/rotation process, that would be a nice basis for enabling
> extensions (I'm particularly thinking of pgAudit) to manage separate
> logging destinations.

Yes, I had more than duration in mind when I created errmessagetype. A hook to set it would be the obvious next step, and then some sort of way of mapping that to arbitrary log files. But I see that as mostly orthogonal to this patch (and certainly a much larger endeavor). (wades in anyways) I'm not sure about hooking into the log rotation process so much as registering something on startup, then letting Postgres handle all the log files in the queue. Although as I alluded to above, sometimes having large log files NOT live in the data directory (or more specifically, not hang out with the log_directory crowd) could be a plus for space, efficiency, and security reasons.
That makes log rotation harder, however. And do we / should we put extension-driven logfiles into current_logfiles? Do we still fall back to stderr even for extension logs? Lots to ponder. :) Cheers, Greg
Re: Logging which local address was connected to in log_line_prefix
Thanks for the review. Please find attached a new version with proper tabs and indenting. Cheers, Greg 0002-Add-local-address-to-log_line_prefix.patch Description: Binary data
Send duration output to separate log files
Please find attached a patch to allow for durations to optionally be sent to separate log files. In other words, rather than cluttering up our postgres202007.log file with tons of output from log_min_duration_statement, duration lines are sent instead to the file postgres202007.duration. Over the years, durations have been the number one cause of very large log files, in which the more "important" items get buried in the noise. Also, programs that are scanning for durations typically do not care about the normal, non-duration output. Some people have a policy of logging everything, which in effect means setting log_min_duration_statement to 0, which in turn makes your log files nearly worthless for spotting non-duration items. This feature will also be very useful for those who need to temporarily turn on log_min_duration_statement, for some quick auditing of exactly what is being run on their database. When done, you can move or remove the duration file without messing up your existing log stream. This only covers the case when both the duration and statement are set on the same line. In other words, log_min_duration_statement output, but not log_duration (which is best avoided anyway). It also requires logging_collector to be on, obviously. Details: The edata structure is expanded to have a new message_type, with a matching function errmessagetype() created. [include/utils/elog.h] [backend/utils/elog.c] Any errors that have both a duration and a statement are marked via errmessagetype() [backend/tcop/postgres.c] A new GUC named "log_duration_destination" is created, which supports any combination of stderr, csvlog, and jsonlog. It does not need to match log_destination, in order to support different use cases. For example, the user wants durations sent to a CSV file for processing by some other tool, but still wants everything else going to a normal text log file. 
Code: [include/utils/guc_hooks.h] [backend/utils/misc/guc_tables.c]
Docs: [sgml/config.sgml] [backend/utils/misc/postgresql.conf.sample]

Create a new flag called PIPE_PROTO_DEST_DURATION [include/postmaster/syslogger.h]

Create new flags: LOG_DESTINATION_DURATION, LOG_DESTINATION_DURATION_CSV, LOG_DESTINATION_DURATION_JSON [include/utils/elog.h]

Routing and mapping LOG_DESTINATION to PIPE_PROTO [backend/utils/error/elog.c]

Minor rerouting when using alternate forms [backend/utils/error/csvlog.c] [backend/utils/error/jsonlog.c]

Create new filehandles, do log rotation, map PIPE_PROTO to LOG_DESTINATION. Rotation and entry into the "current_logfiles" file are the same as for existing log files. The new names/suffixes are duration, duration.csv, and duration.json. [backend/postmaster/syslogger.c]

Testing to ensure combinations of log_destination and log_duration_destination work as intended [bin/pg_ctl/meson.build] [bin/pg_ctl/t/005_log_duration_destination.pl]

Questions I've asked along the way, and perhaps others might as well:

What about logging other things? Why just duration?

Duration logging is a very unique event in our logs. There is nothing quite like it - it's always client-driven, yet automatically generated. And it can be extraordinarily verbose. Removing it from the existing logging stream has no particular downsides. Almost any other class of log message would likely meet resistance as far as moving it to a separate log file, with good reason.

Why not build a more generic log filtering case?

I looked into this, but it would be a large undertaking, given the way our logging system works. And as per above, I don't think the pain would be worth it, as duration covers 99% of the use cases for separate logs. Certainly, nothing else other than a recurring ERROR from the client can cause massive bloat in the size of the files.
(There is a nearby patch to exclude certain errors from the log file as a way to mitigate the error spam - I don't like that idea, but should mention it here as another effort to keep the log files a manageable size.)

Why not use an extension for this?

I did start this as an extension, but it only goes so far. We can use emit_log_hook, but it requires careful coordination of open filehandles, has to do inefficient regex matching of every log message, and cannot do things like log rotation.

Why not bitmap PIPE_PROTO *and* LOG_DESTINATION?

I tried to do both as simple bitmaps (i.e. duration+csv = duration.csv), and not have to use e.g. LOG_DESTINATION_DURATION_CSV, but size_rotation_for ruined it for me. Since our PIPE always sends one thing at a time, a single new flag enables it to stay as a clean bits8 type.

What about Windows?

Untested. I don't have access to a Windows build, but I think in theory it should work fine.

Cheers, Greg

From c6d19d07cfa7eac51e5ead79f1c1b230b5b2d364 Mon Sep 17 00:00:00 2001
From: Greg Sabino Mullane
Date: Tue, 2 Jul 2024 15:13:44 -0400
Subject: [PATCH] Add new parameter log_duration_destination

This allows writing
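[Editor's note: to make the intended end state of this proposal concrete, a hypothetical postgresql.conf snippet for the "durations to CSV, everything else to a normal text log" use case described above. log_duration_destination is the new GUC proposed by this patch, not an existing setting, and the values are illustrative.]

```
logging_collector = on
log_destination = 'stderr'
log_min_duration_statement = 250ms
log_duration_destination = 'csvlog'
```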
Re: Better error message when --single is not the first arg to postgres executable
If I am reading your patch correctly, we have lost the behavior of least surprise in which the first "meta" argument overrides all others:

$ bin/postgres --version --boot --extrastuff
postgres (PostgreSQL) 16.2

What about just inlining --version and --help e.g.

else if (strcmp(argv[i], "--version") == 0 || strcmp(argv[i], "-V") == 0)
{
	fputs(PG_BACKEND_VERSIONSTR, stdout);
	exit(0);
}

I'm fine with being more persnickety about the other options; they are much rarer and not unixy.

> However, there's a complication:
> ...
> This remaining discrepancy might be okay, but I was really hoping to reduce
> the burden on users to figure out the correct ordering of options. The
> situations in which I've had to use single-user mode are precisely the
> situations in which I'd rather not have to spend time learning these kinds
> of details.

Yes, that's unfortunate. But I'd be okay with the db-last requirement as long as the error message is sane and points one in the right direction.

Cheers, Greg
Re: cost delay brainstorming
On Mon, Jun 17, 2024 at 3:39 PM Robert Haas wrote: > So, a very simple algorithm would be: If the maximum number of workers > have been running continuously for more than, say, > 10 minutes, assume we're falling behind Hmm, I don't know about the validity of this. I've seen plenty of cases where we hit the max workers but all is just fine. On the other hand, I don't have an alternative trigger point yet. But I do overall like the idea of dynamically changing the delay. And agree it is pretty conservative. > 2. If we decided to gradually increase the rate of vacuuming instead of > just removing the throttling all at once, what formula would we use > and why would that be the right idea? Well, since the idea of disabling the delay is on the table, we could raise the cost every minute by X% until we effectively reach an infinite cost / zero delay situation. I presume this would only affect currently running vacs, and future ones would get the default cost until things get triggered again? Cheers, Greg
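[Editor's note: to make the "raise the cost every minute by X%" idea concrete, here is a rough sketch in Python. The starting value, growth rate, and cap are all made up for illustration; this is not how autovacuum is actually implemented.]

```python
def ramped_cost_limit(base_limit, minutes_behind, growth=0.10, cap=10000):
    """Return an escalated vacuum_cost_limit.

    Starting from base_limit, increase by `growth` (10%) for every minute
    we believe autovacuum has been falling behind, up to `cap`, which
    effectively approximates "no throttling at all".
    """
    limit = base_limit * ((1 + growth) ** minutes_behind)
    return min(int(limit), cap)

# With the default vacuum_cost_limit of 200:
print(ramped_cost_limit(200, 0))   # 200: not behind, default behavior
print(ramped_cost_limit(200, 10))  # 518: ten minutes behind, ~2.6x the rate
print(ramped_cost_limit(200, 60))  # hits the cap: effectively unthrottled
```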
Re: RFC: adding pytest as a supported test framework
On Fri, Jun 14, 2024 at 5:09 PM Jelte Fennema-Nio wrote: > Test::More on the other hand, while indeed still maintained, it's > definitely not getting significant new feature development or > improvements[2]. Especially when comparing it to pytest[3]. > That's fair, although it's a little hard to tell if the lack of new features is because they are not needed for a stable, mature project, or because few people are asking for and developing new features. Probably a bit of both. But I'll be the first to admit Perl is dying; I just don't know what should replace it (or how - or when). Python has its quirks, but all languages do, and your claim that it will encourage more and easier test writing by developers is a good one. Cheers, Greg
Re: RFC: adding pytest as a supported test framework
On Sat, Jun 15, 2024 at 12:48 PM Jelte Fennema-Nio wrote:
> Afaict, there's a significant part of our current community who feel the
> same way (and I'm pretty sure every sub-30 year old person who
> newly joins the community would feel the exact same way too).

Those young-uns are also the same group who hold their nose when coding in C, and are always clamoring for rewriting Postgres in Rust. And before that, C++. And next year, some other popular language that is clearly better and more popular than C.

> And I agree with Robert that Python seems like the best choice for this
> other language, given its current popularity level. But as I said
> before, I'm open to other languages as well.

Despite my previous posts, I am open to other languages too, including Python, but the onus is really on the new language promoters to prove that the very large amount of time and trouble is worth it, and worth it for language X.

Cheers,
Greg
Re: RFC: adding pytest as a supported test framework
On Thu, Jun 13, 2024 at 1:08 PM Jelte Fennema-Nio wrote: > But Perl is at the next level of unmaintained infrastructure. It is > actually clear how you can contribute to it, but still no new > community members actually want to contribute to it. Also, it's not > only unmaintained by us but it's also pretty much unmaintained by the > upstream community. I am not happy with the state of Perl, as it has made some MAJOR missteps along the way, particularly in the last 5 years. But can we dispel this strawman? There is a difference between "unpopular" and "unmaintained". The latest version of Perl was released May 20, 2024. The latest release of Test::More was April 25, 2024. Both are heavily used. Just not as heavily as they used to be. :) Cheers, Greg
Re: RFC: adding pytest as a supported test framework
On Thu, Jun 13, 2024 at 9:38 AM Robert Haas wrote: > I agree with you, but I'm skeptical that solving it will be as easy as > switching to Python. For whatever reason, it seems like every piece of > infrastructure that the PostgreSQL community has suffers from severe > neglect. Literally everything I know of either has one or maybe two > very senior hackers maintaining it, or no maintainer at all. ... > All of this stuff is critical project infrastructure and yet it feels like > nobody wants to work on > it. I feel at least some of this is a visibility / marketing problem. I've not seen any dire requests for help come across on the lists, nor things on the various todos/road maps/ blog posts people make from time to time. If I had, I would have jumped in. And for the record, I'm very proficient with Perl. Cheers, Greg
Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
On Wed, Jun 5, 2024 at 9:03 PM Robert Haas wrote: > It's a funny use of "max" and "min", because the max is really what we're > trying to do and the min is what we end > up with, and those terms don't necessarily bring those ideas to mind. requested_protocol_version and minimum_protocol_version?
Re: Better error message when --single is not the first arg to postgres executable
On Wed, Jun 5, 2024 at 3:18 PM Nathan Bossart wrote: > Could we remove the requirement that --single must be first? I'm not > thrilled about adding a list of "must be first" options that needs to stay > updated, but given this list probably doesn't change too frequently, maybe > that's still better than a more invasive patch to allow specifying these > options in any order... > It would be nice, and I briefly looked into removing the "first" requirement, but src/backend/tcop/postgres.c for one assumes that --single is always argv[1], and it seemed not worth the extra effort to make it work for argv[N] instead of argv[1]. I don't mind it being the first argument, but that confusing error message needs to go. Thanks, Greg
Better error message when --single is not the first arg to postgres executable
Please find attached a quick patch to prevent this particularly bad error message when running "postgres" and making the common mistake of forgetting to put the "--single" option first because you added an earlier arg (esp. datadir).

Current behavior:

$ ~/pg/bin/postgres -D ~/pg/data --single
2024-06-05 18:30:40.296 GMT [22934] FATAL:  --single requires a value

Improved behavior:

$ ~/pg/bin/postgres -D ~/pg/data --single
--single must be first argument.

I applied it for all the "first arg only" flags (boot, check, describe-config, and fork), as they suffer the same fate.

Cheers,
Greg

0001-Give-a-more-accurate-error-message-if-single-not-number-one.patch
Description: Binary data
Re: Logging which local address was connected to in log_line_prefix
Peter, thank you for the feedback. Attached is a new patch with "address" rather than "interface", plus a new default of "local" if there is no address. I also removed the questionable comment, and updated the commitfest title.

Cheers,
Greg

From bfa69fc2fffcb29dee0c6acfa4fc3749f987b272 Mon Sep 17 00:00:00 2001
From: Greg Sabino Mullane
Date: Fri, 24 May 2024 11:25:48 -0400
Subject: [PATCH] Add local address to log_line_prefix

---
 doc/src/sgml/config.sgml                      |  5
 src/backend/utils/error/elog.c                | 25 +++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 3 files changed, 31 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb..d0b5e4d9ea 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7470,6 +7470,11 @@ local0.*/var/log/postgresql
       Remote host name or IP address
       yes
+
+      %L
+      Local address
+      yes
+
       %b
       Backend type

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index d91a85cb2d..b1525d901c 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -67,6 +67,7 @@
 #endif

 #include "access/xact.h"
+#include "common/ip.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
@@ -79,6 +80,7 @@
 #include "storage/ipc.h"
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
+#include "utils/builtins.h"
 #include "utils/guc_hooks.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -3023,6 +3025,29 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata)
 				appendStringInfoSpaces(buf,
 									   padding > 0 ? padding : -padding);
 				break;
+			case 'L':
+				if (MyProcPort
+					&& (MyProcPort->laddr.addr.ss_family == AF_INET
+						|| MyProcPort->laddr.addr.ss_family == AF_INET6))
+				{
+					Port	   *port = MyProcPort;
+					char		local_host[NI_MAXHOST];
+
+					local_host[0] = '\0';
+
+					if (0 == pg_getnameinfo_all(&port->laddr.addr, port->laddr.salen,
+												local_host, sizeof(local_host),
+												NULL, 0,
+												NI_NUMERICHOST | NI_NUMERICSERV))
+						appendStringInfo(buf, "%s", local_host);
+				}
+				else
+					appendStringInfo(buf, "[local]");
+
+				break;
 			case 'r':
 				if (MyProcPort && MyProcPort->remote_host)
 				{

diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 83d5df8e46..85a9c59116 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -587,6 +587,7 @@
 #   %d = database name
 #   %r = remote host and port
 #   %h = remote host
+#   %L = local address
 #   %b = backend type
 #   %p = process ID
 #   %P = process ID of parallel group leader
--
2.30.2
Re: commitfest.postgresql.org is no longer fit for purpose
On Fri, May 17, 2024 at 7:11 AM Tomas Vondra wrote:
> It does seem to me a part of the solution needs to be helping to get
> those patches reviewed. I don't know how to do that, but perhaps there's
> a way to encourage people to review more stuff, or review stuff from a
> wider range of contributors. Say by treating reviews more like proper
> contributions.

This is a huge problem. I've been in the situation before where I had some cycles to do a review, but actually finding one to review is super-difficult. You simply cannot tell without clicking on the link and wading through the email thread. Granted, it's easy as an occasional reviewer to simply disregard potential patches if the email thread is over a certain size, but it's still a lot of work. Having some sort of summary/status field would be great, even if not everything was labelled.

It would also be nice if simpler patches were NOT picked up by experienced hackers, as we want to encourage new/inexperienced people, and having some "easy to review" patches available will help people gain confidence and grow.

> Long time ago there was a "rule" that people submitting patches are
> expected to do reviews. Perhaps we should be more strict about this.

Big -1. How would we even be more strict about this? Public shaming? Withholding a commit?

Cheers,
Greg
Re: Logging which interface was connected to in log_line_prefix
Thank you for taking the time to review this. I've attached a new rebased version, which has no significant changes. > There is a comment in the patch that states: > > /* We do not need clean_ipv6_addr here: just report verbatim */ > > I am not quite sure what it means, but I am guessing it means that the > patch does not need to format the IPv6 addresses in any specific way. Yes, basically correct. There is a kluge (their word, not mine) in utils/adt/network.c to strip the zone - see the comment for the clean_ipv6_addr() function in that file. I added the patch comment in case some future person wonders why we don't "clean up" the ipv6 address, like other places in the code base do. We don't need to pass it back to anything else, so we can simply output the correct version, zone and all. Cheers, Greg add_local_interface_to_log_line_prefix.v2.patch Description: Binary data
Re: Tarball builds in the new world order
On Tue, Apr 23, 2024 at 6:06 PM Tom Lane wrote: > This change seems like a good thing anyway for anyone who's tempted > to use "make dist" manually, since they wouldn't necessarily want > to package HEAD either. Now, if we just do it exactly like that > then trying to "make dist" without setting PG_COMMIT_HASH will > fail, since "git archive" has no default for its > argument. I can't quite decide if that's a good thing, or if we > should hack the makefile a little further to allow PG_COMMIT_HASH > to default to HEAD. > Just having it fail seems harsh. What if we had plain "make dist" at least output a friendly hint about "please specify a hash"? That seems better than an implicit HEAD default, as they can manually set it to HEAD themselves per the hint. Cheers, Greg
Re: psql: Greatly speed up "\d tablename" when not using regexes
Patch looks good to me. Great idea overall, that forced regex has always bugged me. + char *regexChars = "|*+?()[]{}.^$\\"; One super minor optimization is that we technically do not need to scan for ')' and ']'. If they appear without their partner, the query will fail anyway. :) Cheers, Greg
Re: Security lessons from liblzma
> > It would be better if we created the required test files as part of the > test run. (Why not? Too slow?) Alternatively, I have been thinking > that maybe we could make the output more reproducible by messing with > whatever random seed OpenSSL uses. Or maybe use a Python library to > create the files. Some things to think about. > I think this last idea is the way to go. I've hand-crafted GIF images and PGP messages in the past; surely we have enough combined brain power around here to craft our own SSL files? It may even be a wheel that someone has invented already. Cheers, Greg
Re: Reports on obsolete Postgres versions
On Thu, Apr 4, 2024 at 2:23 PM Bruce Momjian wrote:
> -end-of-life (EOL) and no longer supported.
> +after its initial release. After this, a final minor version will be released
> +and the software will then be unsupported (end-of-life).

Would be a shame to lose the EOL acronym.

> +Such upgrades might require manual changes to complete so always read
> +the release notes first.

Proposal: "Such upgrades might require additional steps, so always read the release notes first."

> I went with "frequently-encountered and low risk bugs".

But neither of those classifications are really true. Especially the "low risk" part - I could see various ways a reader could wrongly interpret that.

Cheers,
Greg
Re: On disable_cost
On Wed, Apr 3, 2024 at 3:21 PM Robert Haas wrote: > It's also pretty clear to me that the fact that enable_indexscan > and enable_indexonlyscan work completely differently from each other > is surprising at best, wrong at worst, but here again, what this patch > does about that is not above reproach. Yes, that is wrong, surely there is a reason we have two vars. Thanks for digging into this: if nothing else, the code will be better for this discussion, even if we do nothing for now with disable_cost. Cheers, Greg
Re: On disable_cost
On Mon, Apr 1, 2024 at 7:54 PM Robert Haas wrote:
> What I think we're mostly doing in the regression tests is shutting
> off every relevant type of plan except one. I theorize that what we
> actually want to do is tell the planner what we do want to happen,
> rather than what we don't want to happen, but we've got this weird set
> of GUCs that do the opposite of that and we're super-attached to them
> because they've existed forever.

So rather than listing all the things we don't want to happen, we need a way to force (nay, highly encourage) a particular solution. As our costing is based on positive numbers, what if we did something like this in costsize.c?

Cost		disable_cost = 1.0e10;
Cost		promotion_cost = 1.0e10; /* or higher or lower, depending on how strongly we want to "beat" disable_cost's effects */

...

if (!enable_seqscan)
    startup_cost += disable_cost;
else if (promote_seqscan)
    startup_cost -= promotion_cost; /* or replace "promote" with "encourage"? */

Cheers,
Greg
Re: Possibility to disable `ALTER SYSTEM`
> > The purpose of the setting is to prevent accidental > modifications via ALTER SYSTEM in environments where The emphasis on 'accidental' seems a bit heavy here, and odd. Surely, just "to prevent modifications via ALTER SYSTEM in environments where..." is enough? Cheers, Greg
Re: Adding comments to help understand psql hidden queries
On Fri, Mar 22, 2024 at 11:39 AM David Christensen wrote: > I think it's easier to keep the widths balanced than constant (patch > version included here) Yeah, I'm fine with that, especially because nobody is translating it, nor are they likely to, to be honest. Cheers, Greg
Re: Adding comments to help understand psql hidden queries
On Thu, Mar 21, 2024 at 6:20 PM Peter Eisentraut wrote: > lines are supposed to align vertically. With your patch, the first line > would have variable length depending on the command. > Yes, that is a good point. Aligning those would be quite tricky, what if we just kept a standard width for the closing query? Probably the 24 stars we currently have to match "QUERY", which it appears nobody has changed for translation purposes yet anyway. (If I am reading the code correctly, it would be up to the translators to maintain the vertical alignment). Cheers, Greg
Re: Avoiding inadvertent debugging mode for pgbench
My mistake. Attached please find version 3, which should hopefully make cfbot happy again. pgbench.dash.d.or.not.dash.d.v3.patch Description: Binary data
Re: Possibility to disable `ALTER SYSTEM`
> > As a bonus, if that GUC is set, we could even check at server startup that
> all the configuration files are not writable by the postgres user,
> and print a warning or refuse to start up if they are.

Ugh, please let's not do this. This was bouncing around in my head last night, and this is really a quite radical change - especially just to handle the given ask, which is to prevent a specific command from running. Not implement a brand new security system.

There are so many ways this could go wrong if we start having separate permissions for some of our files. In addition to backups and other tools that need to write to the conf files as the postgres user, what about systems that create a new cluster automatically e.g. Patroni? It will now need elevated privs just to create the conf files and assign the new ownership to them. Lots of moving pieces there and ways things could go wrong.

So a big -1 from me, as they say. :)

Cheers,
Greg
Re: Avoiding inadvertent debugging mode for pgbench
Rebased version attached (v2), with another sentence in the sgml to explain the optional use of -d Cheers, Greg pgbench.dash.d.or.not.dash.d.v2.patch Description: Binary data
Re: Possibility to disable `ALTER SYSTEM`
On Tue, Mar 19, 2024 at 12:05 PM Tom Lane wrote: > If you aren't willing to build a solution that blocks off mods > using COPY TO FILE/PROGRAM and other readily-available-to-superusers > tools (plpythonu for instance), I think you shouldn't bother asking > for a feature at all. Just trust your superusers. > There is a huge gap between using a well-documented standard tool like ALTER SYSTEM and going out of your way to modify the configuration files through trickery. I think we need to only solve the former as in "hey, please don't do that because your changes will be overwritten" Cheers, Greg
Re: Possibility to disable `ALTER SYSTEM`
Going to agree with Robert Treat here about an extension being a great solution. I resisted posting earlier as I wanted to see how this all pans out, but I wrote a quick little POC extension some months ago that does the disabling and works well (and cannot be easily worked around).

On Mon, Mar 18, 2024 at 4:59 PM Robert Haas wrote:
> I think that all of this is true except for (c). I think we'd need a
> new hook to make it work.

Seems we can just use ProcessUtility and:

if (IsA(parsetree, AlterSystemStmt))
{
    ereport(ERROR, ...

> When we know that a feature is
> widely-needed, it's better to have one good implementation of it in
> core than several perhaps not-so-good implementations out of core.

Meh, maybe. This one seems pretty dirt simple. Granted, I have expanded my original POC to allow *some* things to be changed by ALTER SYSTEM, but the original use case warrants a very small extension.

> That allows us to focus all of our efforts on that one implementation
> instead of splitting them across several -- which is the whole selling
> point of open source, really -- and it makes it easier for users who
> want the feature to get access to it.

Well, yeah, but they have to wait until version 18 at best, while an extension can run on any current version and probably be pretty future-proof as well.

Cheers,
Greg
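For the archives, here is a fuller sketch of that POC shape. To be clear, this is illustrative, not the actual extension; it is a fragment that only builds against the server headers, and the ProcessUtility_hook signature shown matches recent branches but has changed slightly across major versions, so a version-spanning extension needs a little #if surgery:

```c
/* Sketch of a shared_preload_libraries extension that blocks ALTER SYSTEM.
 * Names and error wording here are illustrative. */
#include "postgres.h"
#include "fmgr.h"
#include "tcop/utility.h"

PG_MODULE_MAGIC;

static ProcessUtility_hook_type prev_ProcessUtility = NULL;

static void
block_alter_system(PlannedStmt *pstmt, const char *queryString,
				   bool readOnlyTree, ProcessUtilityContext context,
				   ParamListInfo params, QueryEnvironment *queryEnv,
				   DestReceiver *dest, QueryCompletion *qc)
{
	if (IsA(pstmt->utilityStmt, AlterSystemStmt))
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("ALTER SYSTEM is not currently allowed")));

	/* Chain to the previous hook, or the standard implementation */
	if (prev_ProcessUtility)
		prev_ProcessUtility(pstmt, queryString, readOnlyTree, context,
							params, queryEnv, dest, qc);
	else
		standard_ProcessUtility(pstmt, queryString, readOnlyTree, context,
								params, queryEnv, dest, qc);
}

void
_PG_init(void)
{
	prev_ProcessUtility = ProcessUtility_hook;
	ProcessUtility_hook = block_alter_system;
}
```

The whole thing really is about this small, which is the point: the hard part of the thread is policy, not mechanism.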
Re: Reports on obsolete Postgres versions
On Mon, Mar 11, 2024 at 4:38 PM Bruce Momjian wrote: > https://www.postgresql.org/support/versioning/ > > This web page should correct the idea that "upgrades are more risky than > staying with existing versions". Is there more we can do? Should we have > a more consistent response for such reporters? > It could be helpful to remove this sentence: "Upgrading to a minor release does not normally require a dump and restore" While technically true, "not normally" is quite the understatement, as the true answer is "never" or at least "not in the last few decades". I have a hard time even imagining a scenario that would require a minor revision to do a dump and restore - surely, that in itself would warrant a major release? > It would be a crazy idea to report something in the logs if a major > version is run after a certain date, since we know the date when major > versions will become unsupported. > Could indeed be useful to spit something out at startup. Heck, even minor versions are fairly regular now. Sure would be nice to be able to point a client at the database and say "See? Even Postgres itself thinks you should upgrade from 11.3!!" (totally made up example, not at all related to an actual production system /sarcasm) Cheers, Greg
Logging which interface was connected to in log_line_prefix
Someone on -general was asking about this, as they are listening on multiple IPs and would like to know which exact one clients were hitting. I took a quick look and we already have that information, so I grabbed some stuff from inet_server_addr and added it as part of a "%L" (for 'local interface'). Quick patch / POC attached. Cheers, Greg add_local_interface_to_log_line_prefix.v1.patch Description: Binary data
Re: Reducing the log spam
On Tue, Mar 5, 2024 at 7:55 AM Laurenz Albe wrote: > My experience from the field is that a lot of log spam looks like > > database/table/... "xy" does not exist > duplicate key value violates unique constraint "xy" Forcibly hiding those at the Postgres level seems a heavy hammer for what is ultimately an application problem. Tell me about a system that logs different classes of errors to different log files, and I'm interested again. Cheers, Greg
Avoiding inadvertent debugging mode for pgbench
Attached please find a patch to adjust the behavior of the pgbench program and make it behave like the other programs that connect to a database (namely, psql and pg_dump). Specifically, add support for using -d and --dbname to specify the name of the database. This means that -d can no longer be used to turn on debugging mode, and the long option --debug must be used instead. This removes a long-standing footgun, in which people assume that the -d option behaves the same as other programs. Indeed, because it takes no arguments, and because the first non-option argument is the database name, it still appears to work. However, people then wonder why pgbench is so darn verbose all the time! :) This is a breaking change, but fixing it this way seems to have the least total impact, as the number of people using the debug mode of pgbench is likely quite small. Further, those already using the long option are unaffected, and those using the short one simply need to replace '-d' with '--debug', arguably making their scripts a little more self-documenting in the process. Cheers, Greg pgbench.dash.d.or.not.dash.d.v1.patch Description: Binary data
Re: An improved README experience for PostgreSQL
+1 on the general idea. Maybe make that COPYRIGHT link go to an absolute URI, like all the other links, in case this file gets copied somewhere? Perhaps point it to https://www.postgresql.org/about/licence/ Cheers, Greg
Re: What about Perl autodie?
> > 2. Don't wait, migrate them all now. This would mean requiring > Perl 5.10.1 or later to run the TAP tests, even in back branches. > #2 please. For context, meson did not even exist in 2009. Cheers, Greg
Re: What about Perl autodie?
On Wed, Feb 7, 2024 at 9:05 AM Peter Eisentraut wrote: > I came across the Perl autodie pragma > (https://perldoc.perl.org/autodie). This seems pretty useful; is this > something we can use? Any drawbacks? Any minimum Perl version? Big +1 No drawbacks. I've been using it heavily for many, many years. Came out in 5.10.1, which should be available everywhere at this point (2009 was the year of release) Cheers, Greg
Adding comments to help understand psql hidden queries
The --echo-hidden flag in psql is used to show people the way psql performs its magic for its backslash commands. None of them has more magic than "\d relation", but it suffers from needing a lot of separate queries to gather all of the information it needs. Unfortunately, those queries can get overwhelming, and it can be hard to figure out which one does what, especially for those not already very familiar with the system catalogs.

Attached is a patch to add a small SQL comment to the top of each SELECT query inside describeOneTableDetails. All other functions use a single query, and thus need no additional context. But "\d mytable" has the potential to run over a dozen SQL queries! The new format looks like this:

/********* QUERY **********/
/* Get information about row-level policies */
SELECT pol.polname, pol.polpermissive,
  CASE WHEN pol.polroles = '{0}' THEN NULL ELSE pg_catalog.array_to_string(array(select rolname from pg_catalog.pg_roles where oid = any (pol.polroles) order by 1),',') END,
  pg_catalog.pg_get_expr(pol.polqual, pol.polrelid),
  pg_catalog.pg_get_expr(pol.polwithcheck, pol.polrelid),
  CASE pol.polcmd
    WHEN 'r' THEN 'SELECT'
    WHEN 'a' THEN 'INSERT'
    WHEN 'w' THEN 'UPDATE'
    WHEN 'd' THEN 'DELETE'
    END AS cmd
FROM pg_catalog.pg_policy pol
WHERE pol.polrelid = '134384' ORDER BY 1;
/**************************/

Cheers,
Greg

psql.echo.hidden.comments.v1.patch
Description: Binary data
Adding ordering to list of available extensions
Please find attached a patch to provide some basic ordering to the system views pg_available_extensions and pg_available_extension_versions. It is sorely tempting to add ORDER BYs to many of the other views in that file, but I understand that would be contentious as there are reasons for not adding an ORDER BY. However, in the case of pg_available_extensions, it's a very, very small resultset, with an obvious default ordering, and extremely unlikely to be a part of a larger complex query. It's much more likely people like myself are just doing a "SELECT * FROM pg_available_extensions" and then get annoyed at the random ordering. Cheers, Greg show_extensions_in_natural_order.pg.patch Description: Binary data
Re: [PATCH] Add inline comments to the pg_hba_file_rules view
Also a reluctant -1, as the comment-at-EOL style is very rare in my experience over the years of seeing many a pg_hba file.
Re: Make --help output fit within 80 columns per line
On Fri, Sep 15, 2023 at 11:11 AM torikoshia wrote:
> I do not intend to adhere to this rule(my terminals are usually bigger
> than 80 chars per line), but wouldn't it be a not bad direction to use
> 80 characters for all commands?

Well, that's the question du jour, isn't it? The 80 character limit is based on punch cards, and really has no place in modern systems. While gnu systems are stuck in the past, many other ones have moved on to more sensible defaults:

$ wget --help | wc -L
110
$ gcloud --help | wc -L
122
$ yum --help | wc -L
122

git is an interesting one, as they force things through a pager for their help, but if you look at their raw help text files, they have plenty of times they go past 80 when needed:

$ wc -L git/Documentation/git-*.txt | sort -g | tail -20
 109 git-filter-branch.txt
 109 git-rebase.txt
 116 git-diff-index.txt
 116 git-http-fetch.txt
 117 git-restore.txt
 122 git-checkout.txt
 122 git-ls-tree.txt
 129 git-init-db.txt
 131 git-push.txt
 132 git-update-ref.txt
 142 git-maintenance.txt
 144 git-interpret-trailers.txt
 146 git-cat-file.txt
 148 git-repack.txt
 161 git-config.txt
 162 git-notes.txt
 205 git-stash.txt
 251 git-submodule.txt

So in summary, I think 80 is a decent soft limit, but let's not stress out about some lines going over that, and make a hard limit of perhaps 120. See also: https://hilton.org.uk/blog/source-code-line-length

Cheers,
Greg

P.S. I know this won't change anything right away, but it will get the conversation started, so we can escape the inertia of punch cards / VT100 terminals someday. :)
Re: Make --help output fit within 80 columns per line
On Tue, Jul 4, 2023 at 9:47 PM torikoshia wrote: > Since it seems preferable to have consistent line break policy and some > people use 80-column terminal, wouldn't it be better to make all commands > in 80 > columns per line? > All this seems an awful lot of work to support this mythical 80-column terminal user. It's 2023, perhaps it's time to widen the default assumption past 80 characters? Cheers, Greg
Re: Possibility to disable `ALTER SYSTEM`
Seems to be some resistance to getting this in core, so why not just use an extension? I was able to create a quick POC to do just that. Hook into PG and look for AlterSystemStmt, throw a "Sorry, ALTER SYSTEM is not currently allowed" error. Put into shared_preload_libraries and you're done. As a bonus, works on all supported versions, so no need to wait for Postgres 17 - or Postgres 18/19 given the feature drift this thread is experiencing :) Cheers, Greg
Re: Prevent psql \watch from running queries that return no rows
Thank you for the feedback, everyone. Attached is version 4 of the patch, featuring a few tests and minor rewordings. Cheers, Greg psql_watch_exit_on_zero_rows_v4.patch Description: Binary data
Re: Improve pg_stat_statements by making jumble handle savepoint names better
On Mon, Jul 24, 2023 at 6:46 PM Michael Paquier wrote:
> Shouldn't this new field be marked as query_jumble_location

Yes, it should. I had some trouble getting it to work that way in the first place, but now I realize it was just my unfamiliarity with this part of the code. So thanks for the hint: v2 of the patch is much simplified by adding two attributes to the TransactionStmt node. I've also added some tests per your suggestion.

Unrelated to this patch, I'm struggling with meson testing. Why doesn't this update the postgres test binary?:

meson test --suite pg_stat_statements

It runs "ninja" as expected, but it does not put a new build/tmp_install/home/greg/pg/17/bin/postgres in place until I do a "meson test"

Cheers,
Greg

savepoint_jumble.v2.patch
Description: Binary data
Improve pg_stat_statements by making jumble handle savepoint names better
Please find attached a patch to jumble the savepoint name, to prevent certain transactional commands from filling up pg_stat_statements. This has been a problem with some busy systems that use django, which likes to wrap everything in uniquely named savepoints. Soon, over 50% of your pg_stat_statements buffer is filled with savepoint stuff, pushing out the more useful queries. As each query is unique, it looks like this:

postgres=# select calls, query, queryid from pg_stat_statements where query ~ 'save|release|rollback' order by 2;

 calls | query                                         | queryid
-------+-----------------------------------------------+----------------------
     1 | release b900150983cd24fb0d6963f7d28e17f7      |  8797482500264589878
     1 | release ed9407630eb1000c0f6b63842defa7de      | -9206510099095862114
     1 | rollback                                      | -2049453941623996126
     1 | rollback to c900150983cd24fb0d6963f7d28e17f7  | -5335832667999552746
     1 | savepoint b900150983cd24fb0d6963f7d28e17f7    |     -117254996647181
     1 | savepoint c47bce5c74f589f4867dbd57e9ca9f80    |   355123032993044571
     1 | savepoint c900150983cd24fb0d6963f7d28e17f7    | -5921314469994822125
     1 | savepoint d8f8e0260c64418510cefb2b06eee5cd    |  -981090856656063578
     1 | savepoint ed9407630eb1000c0f6b63842defa7de    |   -25952890433218603

As the actual name of the savepoint is not particularly useful, the patch will basically ignore the savepoint name and allow things to be collapsed:

 calls | query          | queryid
-------+----------------+----------------------
     2 | release $1     | -7998168840889089775
     1 | rollback       |  3749380189022910195
     1 | rollback to $1 | -1816677871228308673
     5 | savepoint $1   |  6160699978368237767

Without the patch, the only solution is to keep raising pg_stat_statements.max to larger and larger values to compensate for the pollution of the statement pool.

Cheers,
Greg

savepoint_jumble.v1.patch
Description: Binary data
Re: Forgive trailing semicolons inside of config files
On Tue, Jul 11, 2023 at 11:04 AM Isaac Morland wrote:
> Please, no!
>
> There is no end to accepting sloppy syntax. What next, allow "SET
> random_page_cost = 2.5;" (with or without semicolon) in config files?

Well yes, there is an end. A single, trailing semicolon. Full stop. It's not a slippery slope in which we end up asking the AI parser to interpret our haikus to derive the actual value. The postgresql.conf file is not some finicky YAML/JSON beast - we already support some looseness in quoting or not quoting values, optional whitespace, etc. Think of the trailing semicolon as whitespace, if you like. You can see from the patch that this does not replace EOL/EOF.

> I'd be more interested in improvements in visibility of errors. For
> example, maybe if I try to start the server and there is a config file
> problem, I could somehow get a straightforward error message right in the
> terminal window complaining about the line of the configuration which is
> wrong.

That ship has long since sailed. We already send a detailed error message with the line number, but in today's world of "service start", "systemctl start", and higher level of control such as Patroni and Kubernetes, getting things to show in a terminal window isn't happening. We can't work around 2>&1.

> Or maybe there could be a "check configuration" subcommand which checks
> the configuration.

There are things inside of Postgres once it has started, but yeah, something akin to visudo would be nice for editing config files.

> But I think it would be way more useful than a potentially never-ending
> series of patches to liberalize the config parser.

It's a single semicolon, not a sign of the parser apocalypse. I've no plans for future enhancements, but if they do no harm and make Postgres more user friendly, I will support them.

Cheers,
Greg
Forgive trailing semicolons inside of config files
This has been a long-standing annoyance of mine. Who hasn't done something like this?

psql> SET random_page_cost = 2.5;

(do some stuff, realize that rpc was too high)

Let's put that inside of postgresql.conf:

#--
# CUSTOMIZED OPTIONS
#--

# Add settings for extensions here
random_page_cost = 2.5;

Boom! Server will not start. Surely we can be a little more liberal in what we accept? The attached patch allows a single trailing semicolon to be silently discarded. As this parsing happens before the logging collector starts up, the error about the semicolon is often buried somewhere in a separate logfile or journald - so let's just allow Postgres to start up, since there is no ambiguity about what random_page_cost (or any other GUC) is meant to be set to. I also considered doing an additional ereport(LOG) when we find one, but on reflection it seemed better to simply ignore it. Cheers, Greg

postgresql.conf_allow_trailing_semicolon.v1.patch
Description: Binary data
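The forgiving behavior is easy to model outside the server. Here is a hedged Python sketch of the tolerant parse step; the real patch touches the guc-file lexer in C, so the function below is illustrative only:

```python
def parse_setting(line: str):
    """Parse one 'name = value' config line, forgiving a single trailing
    semicolon. Returns (name, value), or None for comments/blank lines.
    This mirrors the *idea* of the patch only; the real grammar lives in
    the server's guc-file lexer.
    """
    line = line.split('#', 1)[0].strip()   # drop comments and whitespace
    if not line:
        return None
    if line.endswith(';'):
        line = line[:-1].rstrip()          # forgive exactly one semicolon
    name, sep, value = line.partition('=')
    if not sep:
        raise ValueError(f"syntax error in config line: {line!r}")
    return name.strip(), value.strip().strip('"\'')

print(parse_setting("random_page_cost = 2.5;"))   # ('random_page_cost', '2.5')
```

Note that only one semicolon is stripped, matching the "single, trailing semicolon, full stop" scope of the patch; a doubled semicolon would still be a syntax error in the real lexer.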
Re: Prevent psql \watch from running queries that return no rows
Thanks for the feedback! On Wed, Jul 5, 2023 at 5:51 AM Daniel Gustafsson wrote: > > The comment on ExecQueryAndProcessResults() needs to be updated with an > explanation of what this parameter is. > I added a comment in the place where min_rows is used, but not sure what you mean by adding it to the main comment at the top of the function? None of the other args are explained there, even the non-intuitive ones (e.g. svpt_gone_p) - return cancel_pressed ? 0 : success ? 1 : -1; > + return (cancel_pressed || return_early) ? 0 : success ? 1 : -1; > > I think this is getting tangled up enough that it should be replaced with > separate if() statements for the various cases. > Would like to hear others weigh in, I think it's still only three states plus a default, so I'm not convinced it warrants multiple statements yet. :) + HELP0(" \\watch [[i=]SEC] [c=N] [m=ROW]\n"); > + HELP0(" execute query every SEC seconds, > up to N times\n"); > + HELP0(" stop if less than ROW minimum > rows are rerturned\n"); > > "less than ROW minimum rows" reads a bit awkward IMO, how about calling it > [m=MIN] and describe as "stop if less than MIN rows are returned"? Also, > there > is a typo: s/rerturned/returned/. > Great idea: changed and will attach a new patch Cheers, Greg psql_watch_exit_on_zero_rows_v3.patch Description: Binary data
Re: Bytea PL/Perl transform
> > So I decided to propose a simple transform extension to pass bytea as > native Perl octet strings.

Quick review, mostly housekeeping things:

* Needs a rebase, minor failure on Mkvcbuild.pm
* Code needs standardized formatting, esp. bytea_plperl.c
* Needs to be meson-i-fied (i.e. add a "meson.build" file)
* Do all of these transforms need to be their own contrib modules? So much duplicated code across contrib/*_plperl already (and *plpython too for that matter) ...

Cheers, Greg
Re: Bypassing shared_buffers
On Thu, Jun 15, 2023 at 4:16 AM Vladimir Churyukin wrote: > We're trying to see what is the worst performance in terms of I/O, i.e. >> when the database just started up or the data/indexes being queried are not >> cached at all. > > You could create new tables that are copies of the existing ones (CREATE TABLE foo as SELECT * FROM ...), create new indexes, and run a query on those. Use schemas and search_path to keep the queries the same. No restart needed! (just potentially lots of I/O, time, and disk space :) Don't forget to do explain (analyze, buffers) to double check things.
Re: Let's make PostgreSQL multi-threaded
On Thu, Jun 8, 2023 at 8:44 AM Hannu Krosing wrote: > Do we have any statistics for the distribution of our user base ? > > My gut feeling says that for performance-critical use the non-Linux is > in low single digits at best. > Stats are probably not possible, but based on years of consulting, as well as watching places like SO, Slack, IRC, etc. over the years, IMO that's a very accurate gut feeling. I'd hazard 1% or less for non-Linux systems. Cheers, Greg
Re: Prevent psql \watch from running queries that return no rows
On Sat, Jun 3, 2023 at 5:58 PM Michael Paquier wrote: > > Wouldn't something like a target_rows be more flexible? You could use > this parameter with a target number of rows to expect, zero being one > choice in that. > Thank you! That does feel better to me. Please see attached a new v2 patch that uses a min_rows=X syntax (defaults to 0). Also added some help.c changes. Cheers, Greg psql_watch_exit_on_zero_rows_v2.patch Description: Binary data
Prevent psql \watch from running queries that return no rows
Attached is a patch to allow a new behavior for the \watch command in psql. When enabled, this instructs \watch to stop running once the query returns zero rows. The use case is the scenario in which you are watching the output of something long-running such as pg_stat_progress_create_index, but once it finishes you don't need thousands of runs showing empty rows from the view. This adds a new argument "zero" to the existing i=SEC and c=N arguments.

Notes:

* Not completely convinced of the name "zero" (better than "stop_when_no_rows_returned"). Considered adding a new x=y argument, or overloading c (c=-1), but neither seemed very intuitive. On the other hand, it's tempting to stick to a single method moving forward, although this is a boolean option, not an x=y one like the other two.
* Did not update help.c on purpose - no need to make \watch span two lines there.
* Considered leaving early (i.e. not displaying the last empty result), but it seemed better to show the final empty result as an explicit confirmation as to why it stopped.
* Quick way to test:
  select * from pg_stat_activity where backend_start > now() - '20 seconds'::interval;
  \watch zero

Cheers, Greg

psql_watch_exit_on_zero_rows_v1.patch
Description: Binary data
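The proposed control flow is simple to sketch. Below is a hedged Python simulation of the \watch loop with the later-agreed min_rows behavior; the real implementation is in psql's C code, and `run_query` is a stand-in for executing the query:

```python
import itertools

def watch(run_query, count=None, min_rows=0):
    """Re-run a query until `count` runs elapse or a result has fewer
    than `min_rows` rows. A toy model of psql's \\watch with the
    proposed m=MIN option; not the actual psql code, and the sleep
    between runs is omitted.
    """
    results = []
    for i in itertools.count():
        if count is not None and i >= count:
            break
        rows = run_query()
        results.append(rows)      # the final short result is still shown
        if len(rows) < min_rows:  # e.g. m=1 stops once a run comes back empty
            break
    return results

# Simulated pg_stat_progress_create_index output: two rows, then done.
feed = iter([[("idx", "building")], [("idx", "building")], []])
runs = watch(lambda: next(feed), min_rows=1)
print(len(runs))   # 3 — the empty third run is displayed, then we stop
```

Note the deliberate choice from the post: the terminating empty result is appended before breaking, so the user sees an explicit confirmation of why \watch stopped.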
Re: PostgreSQL 14 press release draft
Some super quick nitpicks; feel free to ignore/apply/laugh off. ... administrators to deploy their data-backed applications. PostgreSQL continues to add innovations on complex data types, including more conveniences for accessing JSON and support for noncontiguous ranges of data. This latest release also adds to PostgreSQL's trend on improvements for high performance and distributed data workloads, with advances in support for connection concurrency, high-write workloads, query parallelism and logical replication. >> add innovations on complex data types -> add innovations to? "on" sounds odd >> comma after "query parallelism" please now works. This aligns PostgreSQL with commonly recognized syntax for retrieving information from JSON data. The subscripting framework added to PostgreSQL 14 can be generally extended to other nested data structures, and is also applied to the `hstore` data type in this release. >> with commonly recognized syntax -> with the commonly recognized syntax >> hyperlink hstore? [Range types](https://www.postgresql.org/docs/14/rangetypes.html), also first released in PostgreSQL 9.2, now have support for noncontiguous ranges through the introduction of the "[multirange]( https://www.postgresql.org/docs/14/rangetypes.html#RANGETYPES-BUILTIN)". A multirange is an ordered list of ranges that are nonoverlapping, which allows for developers to write simpler queries for dealing with complex sequences of >> introduction of the multirange -> introduction of the multirange type >> which allows for developers to write -> which lets developers write on the recent improvements to the overall management of B-tree indexes by >> to the overall management -> to the management operations. As this is a client-side feature, you can use pipeline mode with any modern PostgreSQL database so long as you use the version 14 client. 
>> more complicated than "version 14 client", more like - if the application has explicit >> support for it and was compiled via libpq against PG 14. >> Just don't want to overpromise here [Foreign data wrappers]( https://www.postgresql.org/docs/14/sql-createforeigndatawrapper.html), which are used for working with federated workloads across PostgreSQL and other >> which are used for -> used for In addition to supporting query parallelism, `postgres_fdw` can now also bulk insert data on foreign tables and import table partitions with the [`IMPORT FOREIGN SCHEMA`]( https://www.postgresql.org/docs/14/sql-importforeignschema.html) directive. >> can now also bulk insert -> can now bulk insert >> on foreign tables and import -> on foreign tables, and can import PostgreSQL 14 extends its performance gains to the [vacuuming]( https://www.postgresql.org/docs/14/routine-vacuuming.html) system, including optimizations for reducing overhead from B-Trees. >> B-Tree or B-tree - pick one (latter used earlier in this doc) lets you uniquely track a query through several PostgreSQL systems, including [`pg_stat_activity`]( https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW ), [`EXPLAIN VERBOSE`](https://www.postgresql.org/docs/14/sql-explain.html), and through several logging functions. >> "several PostgreSQL systems" sounds weird. >> "several logging functions" - not sure what this means parallel queries when using the `RETURN QUERY` directive, and enabling >> directive -> command now benefit from incremental sorts, a feature that was introduced in [PostgreSQL 13]( https://www.postgresql.org/about/news/postgresql-13-released-2077/). >> that was introduced in -> introduced in function. 
This release also adds the SQL conforming [`SEARCH`]( https://www.postgresql.org/docs/14/queries-with.html#QUERIES-WITH-SEARCH) and [`CYCLE`]( https://www.postgresql.org/docs/14/queries-with.html#QUERIES-WITH-CYCLE) directives to help with ordering and cycle detection for recursive [common table expressions]( https://www.postgresql.org/docs/14/queries-with.html#QUERIES-WITH-RECURSIVE ). >> directives -> clauses PostgreSQL 14 makes it convenient to assign read-only and write-only privileges >> convenient -> easy companies and organizations. Built on over 30 years of engineering, starting at >> companies -> companies, >> we claim 25 years at the top, 30 here. That's 5 non-open-source years? :)
Re: Add option --drop-cascade for pg_dump/restore
On Wed, Aug 11, 2021 at 10:53 PM Wu Haotian wrote: > Maybe we can add checks like "option --clean requires plain text format"? > If so, should I start a new mail thread for this? > Shrug. To me, that seems related enough it could go into the existing patch/thread. Cheers, Greg
Re: make MaxBackends available in _PG_init
On Wed, Aug 11, 2021 at 10:08 AM Tom Lane wrote: > You must not have enabled EXEC_BACKEND properly. It's a compile-time > #define that affects multiple modules, so it's easy to get wrong. > The way I usually turn it on is > Thank you. I was able to get it all working, and withdraw any objections to that bit of the patch. :) Cheers, Greg
Re: make MaxBackends available in _PG_init
On Mon, Aug 9, 2021 at 8:22 PM Bossart, Nathan wrote: > > Is this going to get tripped by a call from restore_backend_variables? > > I ran 'make check-world' with EXEC_BACKEND with no problems, so I > don't think so. > v3 looks good, but I'm still not sure how to test the bit mentioned above. I'm not familiar with this part of the code (SubPostmasterMain etc.), but running make check-world with EXEC_BACKEND does not seem to execute that code, as I added exit(1) to restore_backend_variables() and the tests still ran fine. Further digging shows that even though the #ifdef EXEC_BACKEND path is triggered, no --fork argument was being passed. Is there something else one needs to provide to force that --fork (see line 189 of src/backend/main/main.c) when testing? Cheers, Greg
Re: Add option --drop-cascade for pg_dump/restore
On Fri, Jul 16, 2021 at 9:40 AM Tom Lane wrote: > That would require pg_restore to try to edit the DROP commands during > restore, which sounds horribly fragile. I'm inclined to think that > supporting this option only during initial dump is safer. > Safer, but not nearly as useful. Maybe see what the OP (Wu Haotian) can come up with as a first implementation? Cheers, Greg
Re: Avoid stuck of pbgench due to skipped transactions
Apologies, just saw this. I found no problems, those "failures" were just me missing checkboxes on the commitfest interface. +1 on the patch. Cheers, Greg
Re: make MaxBackends available in _PG_init
On Sat, Aug 7, 2021 at 2:01 PM Bossart, Nathan wrote: > Here is a rebased version of the patch. > Giving this a review. Patch applies cleanly and `make check` works as of e12694523e7e4482a052236f12d3d8b58be9a22c Overall looks very nice and tucks MaxBackends safely away. I have a few suggestions: > + size = add_size(size, mul_size(GetMaxBackends(0), sizeof(BTOneVacInfo))); The use of GetMaxBackends(0) looks weird - can we add another constant in there for the "default" case? Or just have GetMaxBackends() work? > + GMB_MAX_PREPARED_XACTS = 1 << 0, /* include max_prepared_xacts */ s/include/add in/ > +typedef enum GMBOption > +{ > + GMB_MAX_PREPARED_XACTS = 1 << 0, /* include max_prepared_xacts */ > + GMB_NUM_AUXILIARY_PROCS = 1 << 1 /* include NUM_AUXILIARY_PROCS */ > +} GMBOption; Is a typedef enum really needed? As opposed to something like this style: #define WL_LATCH_SET (1 << 0) #define WL_SOCKET_READABLE (1 << 1) #define WL_SOCKET_WRITEABLE (1 << 2) > - (MaxBackends + max_prepared_xacts + 1)); > + (GetMaxBackends(GMB_MAX_PREPARED_XACTS) + 1)); This is a little confusing - there is no indication to the reader that this is an additive function. Perhaps a little more intuitive name: > + (GetMaxBackends(GMB_PLUS_MAX_PREPARED_XACTS) + 1)); However, the more I went through this patch, the more the GetMaxBackends(0) nagged at me. The vast majority of the calls are with "0". 
I'd argue for just having no arguments at all, which removes a bit of code and actually makes things like this easier to read: Original change: > - uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts; > + uint32 TotalProcs = GetMaxBackends(GMB_NUM_AUXILIARY_PROCS | GMB_MAX_PREPARED_XACTS); Versus: > - uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts; > + uint32 TotalProcs = GetMaxBackends() + NUM_AUXILIARY_PROCS + max_prepared_xacts; > + * This must be called after modules have had the change to alter GUCs in > + * shared_preload_libraries, and before shared memory size is determined. s/change/chance/; > +void > +SetMaxBackends(int max_backends) > +{ > + if (MaxBackendsInitialized) > + elog(ERROR, "MaxBackends already initialized"); Is this going to get tripped by a call from restore_backend_variables? Cheers, Greg
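The style question in this review — an additive flags argument versus a plain no-argument accessor — can be illustrated compactly. A hedged Python model of the API under discussion (names mirror the patch, but the values and the language are purely illustrative; the real code is C in the server):

```python
from enum import IntFlag

# Illustrative stand-ins for the GUC-derived values in the patch.
MAX_BACKENDS = 100
MAX_PREPARED_XACTS = 10
NUM_AUXILIARY_PROCS = 5

class GMBOption(IntFlag):
    MAX_PREPARED_XACTS_FLAG = 1 << 0   # add in max_prepared_xacts
    NUM_AUXILIARY_PROCS_FLAG = 1 << 1  # add in NUM_AUXILIARY_PROCS

def get_max_backends(options: GMBOption = GMBOption(0)) -> int:
    """Additive accessor, modeling the patch's GetMaxBackends(flags)."""
    total = MAX_BACKENDS
    if options & GMBOption.MAX_PREPARED_XACTS_FLAG:
        total += MAX_PREPARED_XACTS
    if options & GMBOption.NUM_AUXILIARY_PROCS_FLAG:
        total += NUM_AUXILIARY_PROCS
    return total

# The review's point: the flag spelling hides the addition ...
total_procs = get_max_backends(GMBOption.MAX_PREPARED_XACTS_FLAG |
                               GMBOption.NUM_AUXILIARY_PROCS_FLAG)
# ... while the no-argument spelling keeps it explicit at the call site:
total_procs_explicit = (get_max_backends() + MAX_PREPARED_XACTS
                        + NUM_AUXILIARY_PROCS)
print(total_procs, total_procs_explicit)   # 115 115
```

Both spellings compute the same total; the argument in the review is purely about readability at the call site, and the second form keeps the arithmetic visible.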
Mention --enable-tap-tests in the TAP section page
Quick patch to add mention of the need for compiling with --enable-tap-tests on the TAP section page https://www.postgresql.org/docs/current/regress-tap.html Searching for the TAP tests often leads to this page, but there is no easy link to, or mention of, the fact that the sample invocations will not work without that special configure flag. Cheers, Greg mention-enable-tap.patch Description: Binary data
Re: Speed up pg_checksums in cases where checksum already set
On Tue, Jun 29, 2021 at 2:59 AM Michael Paquier wrote: > Does that look fine to you? > Looks great, I appreciate the renaming. Cheers, Greg
Re: Remove useless int64 range checks on BIGINT sequence MINVALUE/MAXVALUE values
Those code comments look good.
Re: Avoid stuck of pbgench due to skipped transactions
The following review has been posted through the commitfest application:

make installcheck-world: tested, failed
Implements feature:      tested, failed
Spec compliant:          not tested
Documentation:           not tested

Looks fine to me, as a way of catching this edge case.
Re: Speed up pg_checksums in cases where checksum already set
On Fri, Jun 18, 2021 at 1:57 AM Michael Paquier wrote: > This doc addition is a bit confusing, as it could mean that each file > has just one single checksum. We could be more precise, say: > "When enabling checksums, each relation file block with a changed > checksum is rewritten in place." > Agreed, I like that wording. New patch attached. > Should we also mention that the sync happens even if no blocks are > rewritten based on the reasoning of upthread (aka we'd better do the > final flush as an interrupted pg_checksums may let a portion of the > files as not flushed)? > I don't know that we need to bother: the default is already to sync, and one has to go out of one's way using the -N argument to NOT sync, so I think it's a pretty safe assumption for everyone (except those who read the first version of my patch!) that syncing always happens. Cheers, Greg 003.pg_checksums.optimize.writes.and.always.sync.patch Description: Binary data
Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM
The following review has been posted through the commitfest application:

make installcheck-world: not tested
Implements feature:      not tested
Spec compliant:          not tested
Documentation:           not tested

Latest patch looks fine to me, to be clear.

The new status of this patch is: Ready for Committer
Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM
Should we say "currently has"?
Re: Speed up pg_checksums in cases where checksum already set
Newer version attached that adds a small documentation tweak as well. Cheers, Greg pg_checksums.optimize.writes.and.always.sync.patch Description: Binary data
Re: Speed up pg_checksums in cases where checksum already set
Fair enough; thanks for the feedback. Attached is a new version that does an unconditional sync (well, unless do_sync is false, a flag I am not particularly fond of). Cheers, Greg pg_checksums.optimize.writes.always.sync.patch Description: Binary data
Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM
s/Node/Note/ Other than that, +1 to the patch and +1 to backpatching. The new status of this patch is: Waiting on Author
Re: Add option --drop-cascade for pg_dump/restore
Overall the patch looks good, but I did notice a few small things:

1. In pg_dumpall.c, in the section /* Add long options to the pg_dump argument list */, we are now passing along the --drop-cascade option. However, --clean is not passed in, so any call to pg_dumpall using --drop-cascade fails at the pg_dump step. You'll note that --if-exists is not passed along either; because we are dropping the whole database, we don't need to have pg_dump worry about dropping objects at all. So I think that --drop-cascade should NOT be passed along from pg_dumpall to pg_dump.

2. I'm not even sure if --drop-cascade makes sense for pg_dumpall, as you cannot cascade global things like databases and roles.

3. In the file pg_backup_archiver.c, the patch does a stmtEnd = strstr(mark + strlen(buffer), ";"); and then spits out things "past" the semicolon as the final %s in the appendPQExpBuffer line. I'm not clear why: are we expecting more things to appear after the semicolon? Why not just append a "\n" manually as part of the previous %s?

Cheers, Greg

The new status of this patch is: Waiting on Author
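For context on point 3: the transformation being reviewed takes an archived DROP command and splices CASCADE in before its terminating semicolon. A hedged Python sketch of just that string manipulation (the real code is C in pg_backup_archiver.c; this is only the concept):

```python
def add_cascade(drop_stmt: str) -> str:
    """Insert CASCADE before the trailing semicolon of a DROP command.

    Toy model of the string edit under review in pg_backup_archiver.c;
    assumes a single well-formed statement ending in ';' (possibly
    followed by a newline, as in dump output).
    """
    head, sep, tail = drop_stmt.rpartition(';')
    if not sep:
        raise ValueError("expected a semicolon-terminated statement")
    return f"{head} CASCADE;{tail}"

print(add_cascade("DROP TABLE public.foo;"))
# DROP TABLE public.foo CASCADE;
```

Anything after the semicolon (here, usually just a newline) is carried through unchanged, which is the behavior the review is questioning the need for.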
Re: Reduce lock level for ALTER TABLE ... ADD CHECK .. NOT VALID
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:      tested, passed
Spec compliant:          tested, passed
Documentation:           tested, passed

Looks fine to me.

The new status of this patch is: Ready for Committer
Re: Speed up pg_checksums in cases where checksum already set
Thanks for the quick replies, everyone. On Wed, May 26, 2021 at 10:17 PM Michael Paquier wrote: > > -if (do_sync) > +if (do_sync && total_files_modified) > Here, I am on the edge. It could be an advantage to force a flush of > the data folder anyway, no? I was originally on the fence about including this as well, but since the database is shut down and already in a consistent state, there seems to be no advantage to syncing. Things are no better or worse than when we arrived. However, the real-world use case of running pg_checksums --enable and getting no changed blocks is probably fairly rare, so if there is a strong objection, I'm happy reverting to just (do_sync). (I'm not sure how cheap a sync is; I assume it's low impact as the database is shut down, so I guess it becomes a "might as well while we are here"?) On Wed, May 26, 2021 at 10:29 PM Justin Pryzby wrote: > In one of the checksum patches, there was an understanding that the pages > should be written even if the checksum is correct, to handle replicas. > ... > Does your patch complicate things for the "stop all the clusters before > switching them all" case? > I cannot imagine how it would, but, like Michael, I'm not really understanding the reasoning here. We only run when safely shut down, so no WAL or dirty buffers need concern us :). Of course, once the postmaster is up and running, fiddling with checksums becomes vastly more complicated, as evidenced by that thread. I'm happy sticking to and speeding up the offline version for now. Cheers, Greg
Speed up pg_checksums in cases where checksum already set
The attached patch makes an optimization to pg_checksums which prevents rewriting a block if its checksum is already what we expect. This can lead to much faster runs in cases where checksums are already set (e.g. enabled -> disabled -> enabled, an external helper process, interrupted runs, future parallel processes). There is also an effort to not sync the data directory if no changes were written. Finally, added a bit more output on how many files were actually changed, e.g.:

Checksum operation completed
Files scanned:   1236
Blocks scanned:  23283
Files modified:  38
Blocks modified: 19194
pg_checksums: syncing data directory
pg_checksums: updating control file
Checksums enabled in cluster

Cheers, Greg

pg_checksums.optimize.writes.patch
Description: Binary data
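The core of the optimization is a compare-before-write in the per-block loop. Here is a hedged Python sketch of the idea; the real tool operates on 8kB relation blocks in C, and `compute_checksum` below is a trivial stand-in, not Postgres' actual block checksum algorithm:

```python
def enable_checksums(blocks, compute_checksum):
    """Set the stored checksum on each (data, stored_checksum) block,
    skipping blocks already carrying the expected value. Toy model of
    the pg_checksums fast path; not the actual implementation.
    """
    out, modified = [], 0
    for data, stored in blocks:
        expected = compute_checksum(data)
        if stored != expected:         # only rewrite when needed
            stored = expected
            modified += 1
        out.append((data, stored))
    return out, modified

# Trivial stand-in checksum, for illustration only.
csum = lambda data: sum(data) & 0xFFFF

blocks = [(b"abc", csum(b"abc")), (b"def", 0), (b"ghi", 0)]
_, changed = enable_checksums(blocks, csum)
print(changed)   # 2 — the first block was already correct and is skipped
```

The `modified` counter is what drives the new "Files/Blocks modified" output, and a zero count is what justifies skipping the final data-directory sync.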
Re: psql \df choose functions by their arguments
I like the wildcard aspect, but I have a few issues with the patch:

* It doesn't respect some common abbreviations that work elsewhere (e.g. CREATE FUNCTION). So while "int4" works, "int" does not. Nor does "float", which thus requires the mandatory-double-quoted "double precision"
* Adding commas to the args, as returned by psql itself via \df, provides no matches.
* There seems to be no way (?) to limit the functions returned if they share a common root. The previous incantation allowed you to pull out foo(int) from foo(int, bigint). This was a big motivation for writing this patch.
* SQL error on \df foo a..b as well as one on \df foo (bigint bigint)

Cheers, Greg
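The first complaint — accepted aliases like int and float not matching — is essentially a canonicalization step before comparing against catalog type names. A hedged Python sketch of such a mapping (the alias table below is an illustrative subset; the authoritative mapping is in the server's type parsing, not in psql):

```python
# Illustrative subset of the SQL-alias -> internal-name mapping \df
# would need; the authoritative list lives in the backend.
TYPE_ALIASES = {
    "int": "int4", "integer": "int4",
    "smallint": "int2", "bigint": "int8",
    "float": "float8", "double precision": "float8",
    "real": "float4",
}

def canonical_type(name: str) -> str:
    """Map a user-typed type name to its internal catalog spelling."""
    name = name.strip().strip('"').lower()
    return TYPE_ALIASES.get(name, name)

def canonical_args(arglist: str) -> list[str]:
    """Normalize a user-supplied argument list like 'int, bigint'."""
    return [canonical_type(a) for a in arglist.split(',') if a.strip()]

print(canonical_args("int, bigint"))   # ['int4', 'int8']
```

With a step like this, \df foo int, bigint and \df foo int4, int8 would resolve to the same candidate signature, addressing both the abbreviation and the comma complaints.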
Re: psql \df choose functions by their arguments
Ha ha ha, my bad, I am not sure why I left those out. Here is a new patch with int2, int4, and int8. Thanks for the email. Cheers, Greg v6-psql-df-pick-function-by-type.patch Description: Binary data
Re: psql \df choose functions by their arguments
Thanks for the feedback: new version v5 (attached) has int8, plus the suggested code formatting. Cheers, Greg v5-psql-df-pick-function-by-type.patch Description: Binary data
Re: psql \df choose functions by their arguments
On Sat, Jan 2, 2021 at 1:56 AM Thomas Munro wrote: > ... > It looks like there is a collation dependency here that causes the > test to fail on some systems: > Thanks for pointing that out. I tweaked the function definitions to hopefully sidestep the ordering issue - attached is v4. Cheers, Greg v4-psql-df-pick-function-by-type.patch Description: Binary data
Re: psql \df choose functions by their arguments
Attached is the latest patch against HEAD - basically fixes a few typos. Cheers, Greg v3-psql-df-pick-function-by-type.patch Description: Binary data