Re: Minor cleanups in the SSL tests

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:27, Daniel Gustafsson wrote:

On 16 May 2024, at 11:43, Peter Eisentraut  wrote:



You might want to run your patch through pgperltidy.  The result doesn't look 
bad, but a bit different than what you had crafted.


Ugh, I thought I had but clearly had forgotten. Fixed now.


append_conf() opens and closes the file for each call.  It might be nice if it 
could accept a list.  Or you can just pass the whole block as one string, like 
it was done for pg_ident.conf before.


The attached v2 passes the whole block as a here-doc, which seemed like the best
option to retain readability of the config.


Works for me.





Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Peter Eisentraut

On 16.05.24 16:45, Tom Lane wrote:

Peter Eisentraut  writes:

In these cases, I think for
NotificationHash
ResourceOwnerData
WalSyncMethod
we can just get rid of the typedef.


I have no objection to dealing with NotificationHash as you have here.


ReadBuffersFlags shouldn't be an enum at all, because its values are
used as flag bits.


Yeah, that was bothering me too, but I went for the minimum delta.
I did think that a couple of integer macros would be a better idea,
so +1 for what you did here.


I committed this, and Michael took care of WaitEventExtension, so we 
should be all clear here.






Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Michael Paquier
On Fri, May 17, 2024 at 10:24:57AM +0900, Michael Paquier wrote:
> Yep.  I can handle that in 2~3 hours.

And done with 110eb4aefbad.  If there's anything else, feel free to
let me know.
--
Michael




Re: pg_combinebackup does not detect missing files

2024-05-16 Thread David Steele

On 4/25/24 00:05, Robert Haas wrote:

On Tue, Apr 23, 2024 at 7:23 PM David Steele  wrote:

I don't understand what you mean here. I thought we were in agreement
that verifying contents would cost a lot more. The verification that
we can actually do without much cost can only check for missing files
in the most recent backup, which is quite weak. pg_verifybackup is
available if you want more comprehensive verification and you're
willing to pay the cost of it.


I simply meant that it is *possible* to verify the output of
pg_combinebackup without explicitly verifying all the backups. There
would be overhead, yes, but it would be less than verifying each backup
individually. For my 2c that efficiency would make it worth doing
verification in pg_combinebackup, with perhaps a switch to turn it off
if the user is confident in their sources.


Hmm, can you outline the algorithm that you have in mind? I feel we've
misunderstood each other a time or two already on this topic, and I'd
like to avoid more of that. Unless you just mean what the patch I
posted does (check if anything from the final manifest is missing from
the corresponding directory), but that doesn't seem like verifying the
output.


Yeah, it seems you are right that it is not possible to verify the 
output in all cases.


However, I think allowing the user to optionally validate the input 
would be a good feature. Running pg_verifybackup as a separate step is 
going to be more expensive than verifying/copying at the same time. 
Even with storage tricks to copy ranges of data, pg_combinebackup is 
going to be aware of files that do not need to be verified for the current 
operation, e.g. old copies of free space maps.


Additionally, if pg_combinebackup is updated to work against tar.gz, 
which I believe will be important going forward, then there would be 
little penalty to verification since all the required data would be in 
memory at some point anyway. Though, if the file is compressed it might 
be redundant since compression formats generally include checksums.


One more thing occurs to me -- if data checksums are enabled then a 
rough and ready output verification would be to test the checksums 
during combine. Data checksums aren't very good but something should be 
triggered if a bunch of pages go wrong, especially since the block 
offset is part of the checksum. This would be helpful for catching 
combine bugs.


Regards,
-David




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 17.05.24 04:26, Robert Haas wrote:

I do *emphatically* think we need a parking lot.


People seem to like this idea; I'm not quite sure I follow it.  If you 
just want the automated patch testing, you can do that on your own by 
setting up github/cirrus for your own account.  If you keep emailing the 
public your patches just because you don't want to set up your private 
testing tooling, that seems a bit abusive?


Are there other reasons why developers might want their patches 
registered in a parking lot?


We also need to consider that the cfbot/cirrus resources are limited. 
Whatever changes we make, we should make sure that they are prioritized 
well.






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 17.05.24 00:13, Tom Lane wrote:

So a third status that encompasses the various other situations like
maybe forgotten by author, disagreements between author and reviewer,
process difficulties, needs some senior developer intervention, etc.
could be helpful.


Hmm, "forgotten by author" seems to generally turn into "this has been
in WOA state a long time".  Not sure we have a problem representing
that, only with a process for eventually retiring such entries.
Your other three examples all sound like "needs senior developer
attention", which could be a helpful state that's distinct from "ready
for committer".  It's definitely not the same as "Unclear".


Yeah, some fine-tuning might be required.  But I would be wary of 
over-designing too many new states at this point.  I think the key idea 
is that there ought to be a state that communicates "needs attention 
from someone other than author, reviewer, or committer".






Re: race condition when writing pg_control

2024-05-16 Thread Thomas Munro
The specific problem here is that LocalProcessControlFile() runs in
every launched child for EXEC_BACKEND builds.  Windows uses
EXEC_BACKEND, and Windows' NTFS file system is one of the two file
systems known to this list to have the concurrent read/write data
mashing problem (the other being ext4).




Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-16 Thread Peter Smith
Hi Kuroda-san,

I did not apply these v12* patches, but I have diff'ed all of them
with the previous v11* patches and confirmed that all of my previous
review comments now seem to be addressed.

I don't have any more comments to make at this time. Thanks!

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: [Buildfarm:63] Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread TAKATSUKA Haruka
Hello,

I'm a hamerkop maintainer.
Sorry I have missed the scm error for so long.

Today I switched scmrepo from git.postgresql.org/git/postgresql.git 
to github.com/postgres/postgres.git and successfully modernized
the build target code.

with best regards, Haruka Takatsuka


On Thu, 16 May 2024 16:18:23 -0400
Tom Lane  wrote:

> Thomas Munro  writes:
> > For citext_utf8, I pushed cff4e5a3.  Hamerkop runs infrequently, so
> > here's hoping for 100% green on master by Tuesday or so.
> 
> Meanwhile, back at the ranch, it doesn't seem that changed anything:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&dt=2024-05-16%2011%3A00%3A32
> 
> ... and now that I look more closely, the reason why it didn't
> change anything is that hamerkop is still building 0294df2
> on HEAD.  All its other branches are equally stuck at the
> end of March.  So this is a flat-out-broken animal, and I
> plan to just ignore it until its owner un-sticks it.
> (In particular, I think we shouldn't be in a hurry to push
> the patch discussed downthread.)
> 
> Andrew: maybe the buildfarm server could be made to flag
> animals building exceedingly old commits?  This is the second
> problem of this sort that I've noticed this month, and you
> really have to look closely to realize it's happening.
> 
>   regards, tom lane






Re: GUC names in messages

2024-05-16 Thread Peter Smith
On Thu, May 16, 2024 at 9:35 PM Peter Eisentraut  wrote:
>
> On 04.01.24 07:53, Peter Smith wrote:
> >> Now that I read this again, I think this is wrong.
> >>
> >> We should decide the quoting for a category, not the actual content.
> >> Like, quote all file names; do not quote keywords.
> >>
> >> This led to the attempted patch to decide the quoting of GUC parameter
> >> names dynamically based on the actual content, which no one really
> >> liked.  But then, to preserve consistency, we also need to be uniform in
> >> quoting GUC parameter names where the name is hardcoded.
> >>
> >
> > I agree. By attempting to define when to and when not to use quotes it
> > has become overcomplicated.
> >
> > Earlier in the thread, I counted how quotes were used in the existing
> > messages [5]; there were ~39 quoted and 164 not quoted. Based on that
> > we chose to stay with the majority, and leave all the unquoted ones so
> > only adding quotes "when necessary". In hindsight, that was probably
> > the wrong choice because it opened a can of worms about what "when
> > necessary" even means (e.g. what about underscores, mixed case etc).
> >
> > Certainly one simple rule "just quote everything" is easiest to follow.
>
> I've been going through the translation updates for PG17 these days and
> was led back around to this issue.  It seems we left it in an
> intermediate state that no one was really happy with and which is
> arguably as inconsistent or more so than before.
>
> I think we should accept your two patches
>
> v6-0001-GUC-names-docs.patch
> v6-0002-GUC-names-add-quotes.patch
>
> which effectively everyone was in favor of and which seem to be the most
> robust and sustainable solution.
>
> (The remaining three patches from the v6 set would be PG18 material at
> this point.)

Thanks very much for taking an interest in resurrecting this thread.

It was always my intention to come back to this when the dust had
settled on PG17. But it would be even better if the docs for the rule
"just quote everything", and anything else you deem acceptable, can be
pushed sooner.

Of course, there will still be plenty more to do for PG18, including
locating examples in newly pushed code for messages that have slipped
through the cracks during the last few months using different formats,
and other improvements, but those tasks should become easier if we can
get some of these v6 patches out of the way first.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Add recovery to pg_control and remove backup_label

2024-05-16 Thread David Steele

On 4/16/24 18:55, Stefan Fercot wrote:

Hi,

On Sunday, March 10th, 2024 at 4:47 AM, David Steele wrote:

I've had a new idea which may revive this patch. The basic idea is to
keep backup_label but also return a copy of pg_control from
pg_stop_backup(). This copy of pg_control would be safe from tears and
have a backupLabelRequired field set (as Andres suggested) so recovery
cannot proceed without the backup label.

So, everything will continue to work as it does now. But, backup
software can be enhanced to write the improved pg_control that is
guaranteed not to be torn and has protection against a missing backup label.

Of course, pg_basebackup will write the new backupLabelRequired field
into pg_control, but this way third party software can also gain
advantages from the new field.


Bump on this idea.

Given the discussion in [1], even if it obviously makes sense to improve the in 
core backup capabilities, the more we go in that direction, the more we'll rely 
on outside orchestration.
So IMHO it is also worth thinking about giving more leverage to such orchestration 
tools. In that sense, I really like the idea to extend the backup functions.


I have implemented this idea and created a new thread [1] for it. 
Hopefully it will address the concerns expressed in this thread.


Regards,
-David

[1] 
https://www.postgresql.org/message-id/e2636c5d-c031-43c9-a5d6-5e5c7e4c5514%40pgmasters.net





Return pg_control from pg_backup_stop().

2024-05-16 Thread David Steele

Hackers,

This is a greatly simplified implementation of the patch proposed in [1] 
and hopefully it addresses the concerns expressed there. Since the 
implementation is quite different it seemed like a new thread was 
appropriate, especially since the old thread title would be very 
misleading regarding the new functionality.


The basic idea is to harden recovery by returning a copy of pg_control 
from pg_backup_stop() that has a flag set to prevent recovery if the 
backup_label file is missing. Instead of backup software copying 
pg_control from PGDATA, it stores an updated version that is returned 
from pg_backup_stop(). This is better for the following reasons:


* The user can no longer remove backup_label and get what looks like a 
successful recovery (while almost certainly causing corruption). If 
backup_label is removed the cluster will not start. The user may try 
pg_resetwal, but that tool makes it pretty clear that corruption will 
result from its use.


* We don't need to worry about backup software seeing a torn copy of 
pg_control, since Postgres can safely read it out of memory and provide 
a valid copy via pg_backup_stop(). This solves torn reads without 
needing to write pg_control via a temp file, which may affect 
performance on a standby.


* For backup from standby, we no longer need to instruct the backup 
software to copy pg_control last. In fact the backup software should not 
copy pg_control from PGDATA at all.


These changes have no impact on current backup software and they are 
free to use the pg_control available from pg_stop_backup() or continue 
to use pg_control from PGDATA. Of course they will miss the benefits of 
getting a consistent copy of pg_control and the backup_label checking, 
but will be no worse off than before.


I'll register this in the July CF.

Regards,
-David

[1] 
https://www.postgresql.org/message-id/2daf8adc-8db7-4204-a7f2-a7e94e2bf...@pgmasters.net

From 531872dcdb09f5d2f89af44a3c04c9d2d6da89be Mon Sep 17 00:00:00 2001
From: David Steele 
Date: Fri, 17 May 2024 02:42:35 +
Subject: Return pg_control from pg_backup_stop().

Harden recovery by returning a copy of pg_control from pg_backup_stop() that has
a flag set to prevent recovery if the backup_label file is missing. Instead of
backup software copying pg_control from PGDATA, it stores an updated version
that is returned from pg_backup_stop(). This is better for the following
reasons:

* The user can no longer remove backup_label and get what looks like a
successful recovery (while almost certainly causing corruption). If backup_label
is removed the cluster will not start. The user may try pg_resetwal, but that
tool makes it pretty clear that corruption will result from its use.

* We don't need to worry about backup software seeing a torn copy of pg_control,
since Postgres can safely read it out of memory and provide a valid copy via
pg_backup_stop(). This solves torn reads without needing to write pg_control via
a temp file, which may affect performance on a standby.

* For backup from standby, we no longer need to instruct the backup software to
copy pg_control last. In fact the backup software should not copy pg_control 
from
PGDATA at all.

These changes have no impact on current backup software and they are free to use
the pg_control available from pg_stop_backup() or continue to use pg_control 
from
PGDATA. Of course they will miss the benefits of getting a consistent copy of
pg_control and the backup_label checking, but will be no worse off than before.

Control and catalog version bumps are required.
---
 doc/src/sgml/backup.sgml| 18 +-
 doc/src/sgml/func.sgml  | 10 ++-
 src/backend/access/transam/xlog.c   | 19 ++
 src/backend/access/transam/xlogfuncs.c  | 17 --
 src/backend/access/transam/xlogrecovery.c   | 10 ++-
 src/backend/backup/basebackup.c | 19 +++---
 src/backend/catalog/system_functions.sql|  2 +-
 src/backend/utils/misc/pg_controldata.c |  7 ++-
 src/bin/pg_controldata/pg_controldata.c |  2 +
 src/bin/pg_resetwal/pg_resetwal.c   |  1 +
 src/bin/pg_rewind/pg_rewind.c   |  1 +
 src/include/access/xlogbackup.h |  8 +++
 src/include/catalog/pg_control.h|  4 ++
 src/include/catalog/pg_proc.dat | 10 +--
 src/test/recovery/t/002_archiving.pl| 20 ++
 src/test/recovery/t/042_low_level_backup.pl | 67 -
 16 files changed, 182 insertions(+), 33 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 91da3c26ba..1b87a05dc5 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1010,9 +1010,12 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
  values. The second of these fields should be written to a file named
  backup_label in the root directory of the backup. The
  third field should be written to a file named
- tablespace_map unless the field 

Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Robert Haas
On Thu, May 16, 2024 at 5:46 PM Tom Lane  wrote:
> Right, so what can we do about that?  Does Needs Review state need to
> be subdivided, and if so how?

It doesn't really matter how many states we have if they're not set accurately.

> At this point it seems like there's consensus to have a "parking"
> section of the CF app, separate from the time-boxed CFs, and I hope
> somebody will go make that happen.  But I don't think that's our only
> issue, so we need to keep thinking about what should be improved.

I do *emphatically* think we need a parking lot. And a better
integration between commitfest.postgresql.org and the cfbot, too. It's
just ridiculous that more work hasn't been put into this. I'm not
faulting Thomas or anyone else -- I mean, it's not his job to build
the infrastructure the project needs any more than it is anyone else's
-- but for a project of the size and importance of PostgreSQL to take
years to make minor improvements to this kind of critical
infrastructure is kind of nuts. If we don't have enough volunteers,
let's go recruit some more and promise them cookies.

I think you're overestimating the extent to which the problem is "we
don't reject enough patches". That *is* a problem, but it seems we
have also started rejecting some patches that just never really got
much attention for no particularly good reason, while letting other
things linger that didn't really get that much more attention, or
which were objectively much worse ideas. I think that one place where
the process is breaking down is in the tacit assumption that the
person who wrote the patch wants to get it committed. In some cases,
people park things in the CF for CI runs without a strong intention of
pursuing them; in other cases, the patch author is in no kind of rush.

I think we need to move to a system where patches exist independent of
a CommitFest, and get CI run on them unless they get explicitly closed
or are too inactive or some other criterion that boils down to nobody
cares any more. Then, we need to get whatever subset of those patches
need to be reviewed in a particular CommitFest added to that
CommitFest. For example, imagine that the CommitFest is FORCIBLY empty
until a week before it starts. You can still register patches in the
system generally, but that just means they get CI runs, not that
they're scheduled to be reviewed. A week before the CommitFest,
everyone who has a patch registered in the system that still applies
gets an email saying "click here if you think this patch should be
reviewed in the upcoming CommitFest -- if you don't care about the
patch any more or it needs more work before other people review it,
don't click here". Then, the CommitFest ends up containing only the
things where the patch author clicked there during that week.

For bonus points, suppose we make it so that when you click the link,
it takes you to a box where you can type in a text comment that will
display in the app, explaining why your patch needs review: "Robert
says the wire protocol changes in this patch are wrong, but I think
he's full of hot garbage and want a second opinion!" (a purely
hypothetical example, I'm sure) If you put in a comment like this when
you register your patch for the CommitFest, it gets a sparkly gold
border and floats to the top of the list, or we mail you a Kit-Kat
bar, or something. I don't know.

The point is - we need a much better signal to noise ratio here. I bet
the number of patches in the CommitFest that actually need review is
something like 25% of the total. The rest are things that are just
parked there by a committer, or that the author doesn't care about
right now, or that are already being actively discussed, or where
there's not a clear way forward. We could create new statuses for all
of those states - "Parked", "In Hibernation," "Under Discussion," and
"Unclear" - but I think that's missing the point. What we really want
is to not see that stuff in the first place. It's a CommitFest, not
once-upon-a-time-I-wrote-a-patch-Fest.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-16 Thread Peter Smith
Hi, Here are my remaining review comments for the latest v11* patches.

//
v11-0001
//

No changes. No comments.

//
v11-0002
//

==
doc/src/sgml/ref/alter_subscription.sgml

2.1.
ALTER SUBSCRIPTION ... SET (failover = on|off) and
-   ALTER SUBSCRIPTION ... SET (two_phase = on|off)
+   ALTER SUBSCRIPTION ... SET (two_phase = off)

My other thread patch has already been pushed [1], so now you can
modify this to say "true|false" as previously suggested.


//
v11-0003
//

==
src/backend/commands/subscriptioncmds.c

3.1. AlterSubscription

+ subtwophase = LOGICALREP_TWOPHASE_STATE_DISABLED;
+ }
+ else
+ subtwophase = LOGICALREP_TWOPHASE_STATE_PENDING;
+
+
  /* Change system catalog acoordingly */
  values[Anum_pg_subscription_subtwophasestate - 1] =
- CharGetDatum(opts.twophase ?
- LOGICALREP_TWOPHASE_STATE_PENDING :
- LOGICALREP_TWOPHASE_STATE_DISABLED);
+ CharGetDatum(subtwophase);
  replaces[Anum_pg_subscription_subtwophasestate - 1] = true;

Sorry, I don't think that 'subtwophase' change is an improvement --
IMO the existing ternary code was fine as-is.

I only reported the excessive flag checking in the previous v10-0003
review because of some extra "if (!opts.twophase)" code but that was
caused by what you called "wrong git operations." and is already fixed
now.

//
v11-0004
//

==
src/sgml/catalogs.sgml

4.1.
+ 
+  
+   subforcealter bool
+  
+  
+   If true, then the ALTER
SUBSCRIPTION
+   can disable two_phase option, even if there are
+   uncommitted prepared transactions from when two_phase
+   was enabled
+  
+ 
+

I think this description should be changed to say what it *really*
does. IMO, the stuff about 'two_phase' is more like a side-effect.

SUGGESTION (this is just pseudo-code. You can add the CREATE
SUBSCRIPTION 'force_alter' link appropriately)

If true, then the ALTER SUBSCRIPTION command can
sometimes be forced to proceed instead of giving an error. See
force_alter parameter for details about when this might
be useful.

==
[1] 
https://github.com/postgres/postgres/commit/fa65a022db26bc446fb67ce1d7ac543fa4bb72e4

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: allow changing autovacuum_max_workers without restarting

2024-05-16 Thread Nathan Bossart
On Thu, May 16, 2024 at 04:37:10PM +, Imseih (AWS), Sami wrote:
> I thought 256 was a good enough limit. In practice, I doubt anyone will 
> benefit from more than a few dozen autovacuum workers. 
> I think 1024 is way too high to even allow.

WFM

> I don't think combining 1024 + 5 = 1029 is a good idea in docs.
> Breaking down the allotment and using the name of the constant 
> is much more clear.
> 
> I suggest 
> " max_connections + max_wal_senders + max_worker_processes + 
> AUTOVAC_MAX_WORKER_SLOTS + 5"
> 
> and in other places in the docs, we should mention the actual 
> value of AUTOVAC_MAX_WORKER_SLOTS. Maybe in the 
> below section?
> 
> Instead of:
> -() and allowed background
> +(1024) and allowed background
> 
> do something like:
> -() and allowed background
> +   AUTOVAC_MAX_WORKER_SLOTS  (1024) and allowed background
> 
> Also,  replace the 1024 here with AUTOVAC_MAX_WORKER_SLOTS.
> 
> +max_wal_senders,
> +plus max_worker_processes, plus 1024 for autovacuum
> +worker processes, plus one extra for each 16

Part of me wonders whether documenting the exact formula is worthwhile.
This portion of the docs is rather complicated, and I can't recall ever
having to do the arithmetic it describes.  Plus, see below...

> Also, Not sure if I am mistaken here, but the "+ 5" in the existing docs
> seems wrong.
>  
> If it refers to NUM_AUXILIARY_PROCS defined in 
> include/storage/proc.h, it should a "6"
> 
> #define NUM_AUXILIARY_PROCS 6
> 
> This is not a consequence of this patch, and can be dealt with
> In a separate thread if my understanding is correct.

Ha, I think it should actually be "+ 7"!  The value is calculated as

MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + 
max_wal_senders + 6

Looking at the history, this documentation tends to be wrong quite often.
In v9.2, the checkpointer was introduced, and these formulas were not
updated.  In v9.3, background worker processes were introduced, and the
formulas were still not updated.  Finally, in v9.6, it was fixed in commit
597f7e3.  Then, in v14, the archiver process was made an auxiliary process
(commit d75288f), making the formulas out-of-date again.  And in v17, the
WAL summarizer was added.

On top of this, IIUC you actually need even more semaphores if your system
doesn't support atomics, and from a quick skim this doesn't seem to be
covered in this documentation.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Potential stack overflow in incremental base backup

2024-05-16 Thread Thomas Munro
On Tue, Apr 16, 2024 at 4:10 AM Robert Haas  wrote:
> On Wed, Apr 10, 2024 at 9:55 PM Thomas Munro  wrote:
> > To rescue my initdb --rel-segsize project[1] for v18, I will have a go
> > at making that dynamic.  It looks like we don't actually need to
> > allocate that until we get to the GetFileBackupMethod() call, and at
> > that point we have the file size.  If I understand correctly,
> > statbuf.st_size / BLCKSZ would be enough, so we could embiggen our
> > block number buffer there if required, right?
>
> Yes.

Here is a first attempt at that.  Better factoring welcome.  New
observations made along the way: the current coding can exceed
MaxAllocSize and error out, or overflow 32 bit size_t and allocate
nonsense.  Do you think it'd be better to raise an error explaining
that, or silently fall back to full backup (tried that way in the
attached), or that + log messages?  Another option would be to use
huge allocations, so we only have to deal with that sort of question
for 32 bit systems (i.e. effectively hypothetical/non-relevant
scenarios), but I don't love that idea.

> ...
> I do understand that a 1GB segment size is not that big in 2024, and
> that filesystems with a 2GB limit are thought to have died out a long
> time ago, and I'm not against using larger segments. I do think,
> though, that increasing the segment size by 32768x in one shot is
> likely to be overdoing it.

My intuition is that the primary interesting lines to cross are at 2GB
and 4GB due to data type stuff.  I defend against the most basic
problem in my proposal: I don't let you exceed your off_t type, but
that doesn't mean we don't have off_t/ssize_t/size_t/long snafus
lurking in our code that could bite a 32 bit system with large files.
If you make it past those magic numbers and your tools are all happy,
I think you should be home free until you hit file system limits,
which are effectively unhittable on most systems except ext4's already
bemoaned 16TB limit AFAIK.  But all the same, I'm contemplating
limiting the range to 1TB in the first version, not because of general
fear of unknown unknowns, but just because it means we don't need to
use "huge" allocations for this known place, maybe until we can
address that.
From 30efb6d39c83e9d4f338e10e1b8944c64f8799c5 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Wed, 15 May 2024 17:23:21 +1200
Subject: [PATCH v1] Limit block number buffer size in incremental backup.

Previously, basebackup.c would allocate an array big enough to hold a
block number for every block in a full sized md.c segment file.  That
works out to 512kB by default, which should be no problem.  However,
users can change the segment size at compile time, and the required
space for the array could be many gigabytes, posing problems:

1. For segment sizes over 2TB, MaxAllocSize would be exceeded,
   raising an error.
2. For segment sizes over 8TB, size_t arithmetic would overflow on 32 bit
   systems, leading to the wrong size allocation.
3. For any very large segment size, it's non-ideal to allocate a huge
   amount of memory if you're not actually going to use it.

This isn't a fundamental fix for the high memory requirement of the
algorithm as coded, but it seems like a good idea to avoid those limits
with a fallback strategy, and defer allocating until we see how big the
segments actually are in the cluster being backed up, upsizing as
required.

These are mostly theoretical problems at the moment, because it is so
hard to change the segment size in practice that people don't do it.  A
new proposal would make it changeable at run time, hence interest in
tidying these rough edges up.

Discussion: https://postgr.es/m/CA%2BhUKG%2B2hZ0sBztPW4mkLfng0qfkNtAHFUfxOMLizJ0BPmi5%2Bg%40mail.gmail.com

diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 9a2bf59e84e..0703b77af94 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -34,6 +34,7 @@
 #include "pgstat.h"
 #include "pgtar.h"
 #include "port.h"
+#include "port/pg_bitutils.h"
 #include "postmaster/syslogger.h"
 #include "postmaster/walsummarizer.h"
 #include "replication/walsender.h"
@@ -45,6 +46,7 @@
 #include "storage/reinit.h"
 #include "utils/builtins.h"
 #include "utils/guc.h"
+#include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/relcache.h"
 #include "utils/resowner.h"
@@ -1198,13 +1200,7 @@ sendDir(bbsink *sink, const char *path, int basepathlen, bool sizeonly,
 	bool		isGlobalDir = false;
 	Oid			dboid = InvalidOid;
 	BlockNumber *relative_block_numbers = NULL;
-
-	/*
-	 * Since this array is relatively large, avoid putting it on the stack.
-	 * But we don't need it at all if this is not an incremental backup.
-	 */
-	if (ib != NULL)
-		relative_block_numbers = palloc(sizeof(BlockNumber) * RELSEG_SIZE);
+	size_t		relative_block_numbers_size = 0;
 
 	/*
 	 * Determine if the current path is a database directory that can contain
@@ -1477,6 +1473,7 @@ sendDir(bbsink *sink, 

Re: pg_sequence_last_value() for unlogged sequences on standbys

2024-05-16 Thread Nathan Bossart
Here is a rebased version of 0002, which I intend to commit once v18
development begins.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From e9cba5e4303c7fa5ad2d7d5deb23fe0b1c740b09 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 7 May 2024 14:35:34 -0500
Subject: [PATCH v5 1/1] Simplify pg_sequences a bit.

XXX: NEEDS CATVERSION BUMP
---
 src/backend/catalog/system_views.sql |  6 +-
 src/backend/commands/sequence.c  | 12 
 src/test/regress/expected/rules.out  |  5 +
 3 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 53047cab5f..b32e5c3170 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -176,11 +176,7 @@ CREATE VIEW pg_sequences AS
 S.seqincrement AS increment_by,
 S.seqcycle AS cycle,
 S.seqcache AS cache_size,
-CASE
-WHEN has_sequence_privilege(C.oid, 'SELECT,USAGE'::text)
-THEN pg_sequence_last_value(C.oid)
-ELSE NULL
-END AS last_value
+pg_sequence_last_value(C.oid) AS last_value
 FROM pg_sequence S JOIN pg_class C ON (C.oid = S.seqrelid)
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
 WHERE NOT pg_is_other_temp_schema(N.oid)
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 28f8522264..cd0e746577 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1783,21 +1783,17 @@ pg_sequence_last_value(PG_FUNCTION_ARGS)
 	/* open and lock sequence */
	init_sequence(relid, &elm, &seqrel);
 
-	if (pg_class_aclcheck(relid, GetUserId(), ACL_SELECT | ACL_USAGE) != ACLCHECK_OK)
-		ereport(ERROR,
-				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-				 errmsg("permission denied for sequence %s",
-						RelationGetRelationName(seqrel))));
-
 	/*
 	 * We return NULL for other sessions' temporary sequences.  The
 	 * pg_sequences system view already filters those out, but this offers a
 	 * defense against ERRORs in case someone invokes this function directly.
 	 *
 	 * Also, for the benefit of the pg_sequences view, we return NULL for
-	 * unlogged sequences on standbys instead of throwing an error.
+	 * unlogged sequences on standbys and for sequences for which we lack
+	 * USAGE or SELECT privileges instead of throwing an error.
 	 */
-	if (!RELATION_IS_OTHER_TEMP(seqrel) &&
+	if (pg_class_aclcheck(relid, GetUserId(), ACL_SELECT | ACL_USAGE) == ACLCHECK_OK &&
+		!RELATION_IS_OTHER_TEMP(seqrel) &&
 		(RelationIsPermanent(seqrel) || !RecoveryInProgress()))
 	{
 		Buffer		buf;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index ef658ad740..04b3790bdd 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1699,10 +1699,7 @@ pg_sequences| SELECT n.nspname AS schemaname,
 s.seqincrement AS increment_by,
 s.seqcycle AS cycle,
 s.seqcache AS cache_size,
-CASE
-WHEN has_sequence_privilege(c.oid, 'SELECT,USAGE'::text) THEN pg_sequence_last_value((c.oid)::regclass)
-ELSE NULL::bigint
-END AS last_value
+pg_sequence_last_value((c.oid)::regclass) AS last_value
FROM ((pg_sequence s
  JOIN pg_class c ON ((c.oid = s.seqrelid)))
  LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
-- 
2.25.1



Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Michael Paquier
On Thu, May 16, 2024 at 09:09:36PM -0400, Tom Lane wrote:
> WFM, and this is probably a place where we don't want to change the
> API in v17 and again in v18, so I agree with pushing now.
>
> Reminder though: beta1 release freeze begins Saturday.
> Not many hours left.

Yep.  I can handle that in 2~3 hours.
--
Michael


signature.asc
Description: PGP signature


Re: optimizing pg_upgrade's once-in-each-database steps

2024-05-16 Thread Nathan Bossart
On Thu, May 16, 2024 at 05:09:55PM -0700, Jeff Davis wrote:
> How much complexity do you avoid by using async instead of multiple
> processes?

If we didn't want to use async, my guess is we'd want to use threads to
avoid complicated IPC.  And if we followed pgbench's example for using
threads, it might end up at a comparable level of complexity, although I'd
bet that threading would be the more complex of the two.  It's hard to say
definitively without coding it up both ways, which might be worth doing.

> Also, did you consider connecting once to each database and running
> many queries? Most of those seem like just checks.

This was the idea behind 347758b.  It may be possible to do more along
these lines.  IMO parallelizing will still be useful even if we do combine
more of the steps.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: WAL record CRC calculated incorrectly because of underlying buffer modification

2024-05-16 Thread Michael Paquier
On Thu, May 16, 2024 at 03:54:52PM -0700, Jeff Davis wrote:
> On Mon, 2024-05-13 at 11:15 +1200, Thomas Munro wrote:
>> Perhaps a no-image, no-change registered buffer should not be
>> including an image, even for XLR_CHECK_CONSISTENCY?  It's actually
>> useless for consistency checking too I guess, this issue aside,
>> because it doesn't change anything so there is nothing to check.
> 
> I'm not convinced by that reasoning. Can't it check that nothing has
> changed?

That's something I did four weeks ago in the hash replay code
path, and having the image with XLR_CHECK_CONSISTENCY even if
REGBUF_NO_CHANGE was necessary because replay was setting up a LSN on
a REGBUF_NO_CHANGE page it should not have touched.

> I'd prefer that we find a way to get rid of REGBUF_NO_CHANGE and make
> all callers follow the rules than to find new special cases that depend
> on REGBUF_NO_CHANGE. See these messages here:
> 
> https://www.postgresql.org/message-id/b1f2d0f230d60fa8df33bb0b2af3236fde9d750d.camel%40j-davis.com
> 
> https://www.postgresql.org/message-id/CA%2BTgmoY%2BdagCyrMKau7UQeQU6w4LuVEu%2ByjsmJBoXKAo6XbUUA%40mail.gmail.com
> 
> In other words, we added REGBUF_NO_CHANGE for the call sites (only hash
> indexes) that don't follow the rules and where it wasn't easy to make
> them follow the rules. Now that we know of a concrete problem with the
> design, there's more upside to fixing it properly.

Yeah, agreed that getting rid of REGBUF_NO_CHANGE would be nice in the
final picture.  It still strikes me as a weird concept that WAL replay
for hash indexes logs full pages just to be able to lock them at
replay based on what's in the records.  :/
--
Michael




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Tom Lane
Michael Paquier  writes:
> On Thu, May 16, 2024 at 10:45:18AM -0400, Tom Lane wrote:
>> ... I think the enum should be nuked altogether, but
>> it's a bit late to be redesigning that for v17 perhaps.

> You're right, WaitEventExtension is better gone.  The only thing that
> matters is that we want to start computing the IDs assigned to the
> custom wait events for extensions with a number higher than the
> existing WAIT_EXTENSION to avoid interferences in pg_stat_activity, so
> this could be cleaned up as the attached.

WFM, and this is probably a place where we don't want to change the
API in v17 and again in v18, so I agree with pushing now.

Reminder though: beta1 release freeze begins Saturday.
Not many hours left.

regards, tom lane




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Michael Paquier
On Thu, May 16, 2024 at 10:45:18AM -0400, Tom Lane wrote:
> I am also quite confused by that.  It seems to be kind of an enum
> that is supposed to be extended at runtime, meaning that neither
> of the existing enum member values ought to be used as such, although
> either autoprewarm.c didn't get the memo or I misunderstand the
> intended usage.  NUM_BUILTIN_WAIT_EVENT_EXTENSION is possibly the
> most bizarre idea I've ever seen: what would a "built-in extension"
> event be exactly?  I think the enum should be nuked altogether, but
> it's a bit late to be redesigning that for v17 perhaps.

You're right, WaitEventExtension is better gone.  The only thing that
matters is that we want to start computing the IDs assigned to the
custom wait events for extensions with a number higher than the
existing WAIT_EXTENSION to avoid interferences in pg_stat_activity, so
this could be cleaned up as the attached.

The reason why autoprewarm.c does not have a custom wait event
assigned is that it does not make sense there: this would not show up
in pg_stat_activity.  I think that we should just switch back to
PG_WAIT_EXTENSION there and call it a day.

I can still clean up that in time for beta1, as in today's time.  But
that can wait, as well.  Thoughts?
--
Michael
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 080e92d1cf..1b735d4a0e 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -53,12 +53,6 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
  *
  * The ID retrieved can be used with pgstat_report_wait_start() or equivalent.
  */
-typedef enum
-{
-	WAIT_EVENT_EXTENSION = PG_WAIT_EXTENSION,
-	WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED,
-} WaitEventExtension;
-
 extern void WaitEventExtensionShmemInit(void);
 extern Size WaitEventExtensionShmemSize(void);
 
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 4ffcb10c8b..084a9dfdc2 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -89,8 +89,7 @@ typedef struct WaitEventExtensionCounterData
 static WaitEventExtensionCounterData *WaitEventExtensionCounter;
 
 /* first event ID of custom wait events for extensions */
-#define NUM_BUILTIN_WAIT_EVENT_EXTENSION	\
-	(WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)
+#define WAIT_EVENT_EXTENSION_INITIAL_ID	1
 
 /* wait event info for extensions */
 #define WAIT_EVENT_EXTENSION_INFO(eventId)	(PG_WAIT_EXTENSION | eventId)
@@ -129,7 +128,7 @@ WaitEventExtensionShmemInit(void)
 	if (!found)
 	{
 		/* initialize the allocation counter and its spinlock. */
-		WaitEventExtensionCounter->nextId = NUM_BUILTIN_WAIT_EVENT_EXTENSION;
+		WaitEventExtensionCounter->nextId = WAIT_EVENT_EXTENSION_INITIAL_ID;
 		SpinLockInit(&WaitEventExtensionCounter->mutex);
 	}
 
@@ -244,7 +243,7 @@ GetWaitEventExtensionIdentifier(uint16 eventId)
 	WaitEventExtensionEntryById *entry;
 
 	/* Built-in event? */
-	if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION)
+	if (eventId < WAIT_EVENT_EXTENSION_INITIAL_ID)
 		return "Extension";
 
 	/* It is a user-defined wait event, so lookup hash table. */
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 248b9914a3..1c8804dc43 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -226,7 +226,7 @@ autoprewarm_main(Datum main_arg)
 			(void) WaitLatch(MyLatch,
 			 WL_LATCH_SET | WL_EXIT_ON_PM_DEATH,
 			 -1L,
-			 WAIT_EVENT_EXTENSION);
+			 PG_WAIT_EXTENSION);
 		}
 		else
 		{
@@ -253,7 +253,7 @@ autoprewarm_main(Datum main_arg)
 			(void) WaitLatch(MyLatch,
 			 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
 			 delay_in_ms,
-			 WAIT_EVENT_EXTENSION);
+			 PG_WAIT_EXTENSION);
 		}
 
 		/* Reset the latch, loop. */




Re: open items

2024-05-16 Thread Michael Paquier
On Tue, May 14, 2024 at 09:52:35AM -0400, Robert Haas wrote:
> We are down to three open items, all of which have proposed fixes.
> That is great, but we need to keep things moving along, because
> according to https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items
> we are due to release beta1 on May 23. That means that a release
> freeze will be in effect from Saturday, May 18, which is four days
> from now. Since committing patches sometimes leads to unexpected
> surprises, it would be best if the proposed fixes were put into place
> sooner rather than later, to allow time for any adjustments that may
> be required.

As of this minute, the open item list is empty @-@.

Thanks all for the various resolutions, updates and commits!
--
Michael




Re: optimizing pg_upgrade's once-in-each-database steps

2024-05-16 Thread Jeff Davis
On Thu, 2024-05-16 at 16:16 -0500, Nathan Bossart wrote:
> I am posting this early to get thoughts on the general approach.  If
> we
> proceeded with this strategy, I'd probably create some generic
> tooling that
> each relevant step would provide a set of callback functions.

The documentation states:

"pg_dump -j uses multiple database connections; it connects to the
database once with the leader process and once again for each worker
job."

That might need to be adjusted.

How much complexity do you avoid by using async instead of multiple
processes?

Also, did you consider connecting once to each database and running
many queries? Most of those seem like just checks.

Regards,
Jeff Davis





Re: Underscore in positional parameters?

2024-05-16 Thread Michael Paquier
On Thu, May 16, 2024 at 08:41:11AM +0200, Peter Eisentraut wrote:
> On this specific patch, maybe reword "parameter too large" to "parameter
> number too large".

WFM here.

> Also, I was bemused by the use of atol(), which is notoriously unportable
> (sizeof(long)).  So I poked around and found more places that might need
> fixing.  I'm attaching a patch here with annotations too look at later.

Yeah atoXX calls have been funky in the tree for some time.  This
reminds this thread, somewhat:
https://www.postgresql.org/message-id/CALAY4q8be6Qw_2J%3DzOp_v1X-zfMBzvVMkAfmMYv%3DUkr%3D2hPcFQ%40mail.gmail.com

The issue is also that there is no "safe" parsing alternative for 64b
integers in the frontend (as you know long is 32b in Windows, which is
why I'd encourage ripping it out as much as we can).  This may be
better as a complementary of strtoint() in src/common/string.c.  Note
as well strtoint64() in pgbench.c.  I think I have a patch lying
around, actually.. 
--
Michael




Re: Fix src/test/subscription/t/029_on_error.pl test when wal_debug is enabled

2024-05-16 Thread Michael Paquier
On Thu, May 16, 2024 at 09:00:47AM +0530, Amit Kapila wrote:
> This can only be a problem if the apply worker generates more LOG
> entries with the required context but it won't do that unless there is
> an operation on the publisher to replicate. If we see any such danger
> then I agree this can break on some slow machines but as of now, I
> don't see any such race condition.

Perhaps, still I'm not completely sure if this assumption is going to
always stand for all the possible configurations we support.  So,
rather than coming back to this test again, I would choose to make the
test as stable as possible from the start.  That's what mapping with
the error message would offer when grabbing the LSN from the CONTEXT
part of it, because there's a one-one mapping between the expected
ERROR and its CONTEXT from which the information used by the test is
retrieved.
--
Michael




Re: Postgres and --config-file option

2024-05-16 Thread David G. Johnston
On Thu, May 16, 2024 at 4:11 PM Michael Paquier  wrote:

> On Thu, May 16, 2024 at 11:57:10AM +0300, Aleksander Alekseev wrote:
> > I propose my original v1 patch for correcting the --help output of
> > 'postgres' too. I agree with the above comments that corresponding
> > changes in v4 became somewhat unwieldy.
>
> Thanks for compiling the rest.
>
> -printf(_("  --NAME=VALUE   set run-time parameter\n"));
> +printf(_("  --NAME=VALUE   set run-time parameter, a shorter form of -c\n"));
>
> This part with cross-references in the output is still meh to me, for
> same reason as for the doc changes I've argued to discard upthread.
>

I'm fine with leaving these alone.  If we did change this I'd want to
change both, not just --NAME.


>  write_stderr("%s does not know where to find the server configuration file.\n"
> - "You must specify the --config-file or -D invocation "
> + "You must specify the --config-file (or equivalent -c) or -D invocation "
>
>
I would rather just do this:

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 3fb6803998..f827086489 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1828,8 +1828,8 @@ SelectConfigFiles(const char *userDoption, const char *progname)
 	else
 	{
 		write_stderr("%s does not know where to find the server configuration file.\n"
-					 "You must specify the --config-file or -D invocation "
-					 "option or set the PGDATA environment variable.\n",
+					 "You must specify either the config_file run-time parameter or "
+					 "provide the database directory\n",
 					 progname);
 		return false;
 	}

Both "run-time parameter" and "database directory" are words present in the
help and the user can find the correct argument if that is the option they
want to use.  The removal of the mention of the PGDATA environment variable
doesn't seem to be a great loss here.  This error message doesn't seem to
be the correct place to teach the user about all of their options so long
as they choose to read the documentation they learn about them there; and
we need not prescribe the specific means by which they supply either of
those pieces of information - which is the norm.  If someone simply runs
"postgres" at the command line this message and --help gives them
sufficient information to proceed.

David J.


Re: SQL:2011 application time

2024-05-16 Thread Jeff Davis
On Mon, 2024-05-13 at 12:11 +0200, Peter Eisentraut wrote:
> Some of these issues might be design flaws in the underlying
> mechanisms, 
> like range types and exclusion constraints.  Like, if you're supposed
> to 
> use this for scheduling but you can use empty ranges to bypass
> exclusion 
> constraints, how is one supposed to use this?

An empty range does not "bypass" an exclusion constraint. The
exclusion constraint has a documented meaning and it's enforced.

Of course there are situations where an empty range doesn't make a lot
of sense. For many domains zero doesn't make any sense, either.
Consider receiving an email saying "thank you for purchasing 0
widgets!". Check constraints seem like a reasonable way to prevent
those kinds of problems.

Regards,
Jeff Davis





Re: Postgres and --config-file option

2024-05-16 Thread Michael Paquier
On Thu, May 16, 2024 at 11:57:10AM +0300, Aleksander Alekseev wrote:
> I propose my original v1 patch for correcting the --help output of
> 'postgres' too. I agree with the above comments that corresponding
> changes in v4 became somewhat unwieldy.

Thanks for compiling the rest.

-printf(_("  --NAME=VALUE   set run-time parameter\n"));
+printf(_("  --NAME=VALUE   set run-time parameter, a shorter form of -c\n"));

This part with cross-references in the output is still meh to me, for
same reason as for the doc changes I've argued to discard upthread.

 write_stderr("%s does not know where to find the server configuration file.\n"
- "You must specify the --config-file or -D invocation "
+ "You must specify the --config-file (or equivalent -c) or -D invocation "

I can fall behind changing this one, still I'm not sure if this
proposal is the optimal choice.  Adding this option to --help makes
sense when applied to this error message, but that's incomplete in
regard with the other GUCs where this concept applies.  A different
approach would be to do nothing in --help and change the reference of
--config-file to -c config_name=VALUE, which would be in line with
--help.
--
Michael




Re: Docs: Always use true|false instead of sometimes on|off for the subscription options

2024-05-16 Thread Peter Smith
On Thu, May 16, 2024 at 10:42 PM David Rowley  wrote:
>
> On Thu, 16 May 2024 at 19:05, Peter Smith  wrote:
> >
> > On Thu, May 16, 2024 at 3:11 PM David Rowley  wrote:
> > > If you want to do this, what's the reason to limit it to just this one
> > > page of the docs?
> >
> > Yeah, I had a vested interest in this one place because I've been
> > reviewing the other thread [1] that was mentioned above. If that other
> > thread chooses "true|false" then putting "true|false" adjacent to
> > another "on|off" will look silly.
>
> OK, looking a bit further I see this option is new to v17.  After a
> bit more study of the sgml, I agree that it's worth changing this.
>
> I've pushed your patch.
>

Thanks very much for pushing. It was just a one-time thing -- I won't
go looking for more examples like it.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Requiring LLVM 14+ in PostgreSQL 18

2024-05-16 Thread Thomas Munro
On Fri, May 17, 2024 at 3:17 AM Nazir Bilal Yavuz  wrote:
> Actually, 32 bit builds are working but the Perl version needs to be
> updated to 'perl5.36-i386-linux-gnu' in .cirrus.tasks.yml. I changed
> 0001 with the working version of 32 bit builds [1] and the rest is the
> same. All tests pass now [2].

Ahh, right, thanks!  I will look at committing your CI/fixup patches.




Re: WAL record CRC calculated incorrectly because of underlying buffer modification

2024-05-16 Thread Jeff Davis
On Mon, 2024-05-13 at 11:15 +1200, Thomas Munro wrote:
> > > > > Perhaps a no-image, no-change registered buffer should not be
> > > > > including an image, even for XLR_CHECK_CONSISTENCY?  It's actually
> > > > > useless for consistency checking too I guess, this issue aside,
> > > > > because it doesn't change anything so there is nothing to check.

I'm not convinced by that reasoning. Can't it check that nothing has
changed?

> > > 
> > > Does it reproduce if you do this?
> > > 
> > > -   include_image = needs_backup || (info & XLR_CHECK_CONSISTENCY) != 0;
> > > +   include_image = needs_backup ||
> > > +       ((info & XLR_CHECK_CONSISTENCY) != 0 &&
> > > +        (regbuf->flags & REGBUF_NO_CHANGE) == 0);
> > 
> > No, it doesn't (at least with the latter, more targeted
> > reproducer).
> 
> OK so that seems like a candidate fix, but ...

...

> ... we'd need to figure out how to fix this in the back-branches too.
> One idea would be to back-patch REGBUF_NO_CHANGE, and another might
> be
> to deduce that case from other variables.  Let me CC a couple more
> people from this thread, which most recently hacked on this stuff, to
> see if they have insights:


Starting from the beginning, XLogRecordAssemble() calculates the CRC of
the record (technically just the non-header portions, but that's not
important here), including any backup blocks. Later,
CopyXLogRecordToWAL() copies that data into the actual xlog buffers. If
the data changes between those two steps, the CRC will be bad.

For most callers, the contents are exclusive-locked, so there's no
problem. For checksums, the data is copied out of shared memory into a
stack variable first, so no concurrent activity can change it. For hash
indexes, it tries to protect itself by passing REGBUF_NO_IMAGE.

There are two problems:

1. That implies another invariant that we aren't checking for: that
REGBUF_NO_CHANGE must be accompanied by REGBUF_NO_IMAGE. That doesn't
seem to be true for all callers, see XLogRegisterBuffer(1, wbuf,
wbuf_flags) in _hash_freeovflpage().

2. As you point out, REGBUF_NO_IMAGE still allows an image to be taken
if XLR_CHECK_CONSISTENCY is set, so we need to figure out what to do
there.

Can we take a step back and think harder about what hash indexes are
doing and if there's a better way? Maybe hash indexes need to take a
copy of the page, like in XLogSaveBufferForHint()?

I'd prefer that we find a way to get rid of REGBUF_NO_CHANGE and make
all callers follow the rules than to find new special cases that depend
on REGBUF_NO_CHANGE. See these messages here:

https://www.postgresql.org/message-id/b1f2d0f230d60fa8df33bb0b2af3236fde9d750d.camel%40j-davis.com

https://www.postgresql.org/message-id/CA%2BTgmoY%2BdagCyrMKau7UQeQU6w4LuVEu%2ByjsmJBoXKAo6XbUUA%40mail.gmail.com

In other words, we added REGBUF_NO_CHANGE for the call sites (only hash
indexes) that don't follow the rules and where it wasn't easy to make
them follow the rules. Now that we know of a concrete problem with the
design, there's more upside to fixing it properly.

Regards,
Jeff Davis





Re: Simplify documentation related to Windows builds

2024-05-16 Thread Michael Paquier
On Wed, May 15, 2024 at 11:25:34AM -0400, Robert Haas wrote:
> I think that we need to get a more definitive answer to the question
> of whether command-line editing works or not. I have the impression
> that it never has. If it's started working, we should establish that
> for certain and probably also establish what made it start working; if
> it works provided you do X, Y, or Z, we should establish what those
> things are.

Right.

> I'm cool with adding diff to the list of dependencies.

This makes sense also to me, still the patch is not completely right
because it has been adding diff in the list for build dependencies.
Perhaps this should just be a new list.

> I'd prefer to see us update the other links rather than delete them.

Okay.  I'm not sure where this patch is going, so I am going to
withdraw it for now.  The state of Windows is going to be a topic at
the next pgconf.dev, anyway.
--
Michael




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Melanie Plageman
On Thu, May 16, 2024 at 5:04 PM Tom Lane  wrote:
>
> Jacob Champion  writes:
> > ... Similar to how people currently use the
> > Reviewer field as a personal TODO list... it might be nice to
> > officially separate the ideas a bit.
>
> Oh, that's an independent pet peeve of mine.  Usually, if I'm
> looking over the CF list for a patch to review, I skip over ones
> that already show an assigned reviewer, because I don't want to
> step on that person's toes.  But it seems very common to put
> one's name down for review without any immediate intention of
> doing work.  Or to do a review and wander off, leaving the patch
> apparently being tended to but not really.  (And I confess I'm
> occasionally guilty of both things myself.)
>
> I think it'd be great if we could separate "I'm actively reviewing
> this" from "I'm interested in this".  As a bonus, adding yourself
> to the "interested" list would be a fine proxy for the thumbs-up
> or star markers mentioned upthread.
>
> If those were separate columns, we could implement some sort of
> aging scheme whereby somebody who'd not commented for (say)
> a week or two would get quasi-automatically moved from the "active
> reviewer" column to the "interested" column, whereupon it wouldn't
> be impolite for someone else to sign up for active review.

I really like the idea of an "interested" column of some sort. I think
having some idea of what patches have interest is independently
valuable and helps us distinguish patches that no committer was
interested enough to work on and patches that no one thinks is a good
idea at all.

As for having multiple categories of reviewer, it's almost like we
need someone to take responsibility for shifting the patch to the next
state -- where the next state isn't necessarily "committed". We tend
to wait and assign a committer if the patch is actually committable
and will get committed. Lots of people review a patch without claiming
that were the author to address all of the feedback, the patch would
be committable. It might be helpful if someone could sign up to
shepherd the patch to its next state -- regardless of what that state
is.

- Melanie




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Peter Eisentraut  writes:
> On 16.05.24 23:46, Tom Lane wrote:
>> Right, so what can we do about that?  Does Needs Review state need to
>> be subdivided, and if so how?

> Maybe a new state "Unclear". ...

> I think, if we consider the core mission of the commitfest app, we need 
> to be more protective of the Needs Review state.

Yeah, makes sense.

> So a third status that encompasses the various other situations like 
> maybe forgotten by author, disagreements between author and reviewer, 
> process difficulties, needs some senior developer intervention, etc. 
> could be helpful.

Hmm, "forgotten by author" seems to generally turn into "this has been
in WOA state a long time".  Not sure we have a problem representing
that, only with a process for eventually retiring such entries.
Your other three examples all sound like "needs senior developer
attention", which could be a helpful state that's distinct from "ready
for committer".  It's definitely not the same as "Unclear".

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:46, Tom Lane wrote:
> Peter Eisentraut  writes:
>> The problem is if we have 180 patches in Needs Review, and only 20 are
>> really actually ready to be reviewed.  And a second-order problem is
>> that if you already know that this will be the case, you give up before
>> even looking.
>
> Right, so what can we do about that?  Does Needs Review state need to
> be subdivided, and if so how?


Maybe a new state "Unclear".  Then a commitfest manager, or someone like 
Robert just now, can more easily power through the list and set 
everything that's doubtful to Unclear, with the understanding that the 
author can set it back to Needs Review to signal that they actually 
think it's ready.  Or, as a variant of what someone else was saying, 
just set all patches that carry over from March to July as Unclear.  Or 
something like that.


I think, if we consider the core mission of the commitfest app, we need 
to be more protective of the Needs Review state.


I have been, from time to time, when I'm in commitfest management mode, 
aggressive in setting entries back to Waiting on Author, but that's not 
always optimal.


So a third status that encompasses the various other situations like 
maybe forgotten by author, disagreements between author and reviewer, 
process difficulties, needs some senior developer intervention, etc. 
could be helpful.


This could also help scale the commitfest manager role, because then the 
head CFM could have like three helpers and just tell them, look at all 
the "Unclear" ones and help resolve them.





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
> Pulling in the information from the CFBot regarding test failures and whether
> the patch applies at all, and automatically altering the state to WOA for at
> least the latter category would be nice.

+1.  There are enough intermittent test failures that I don't think
changing state for that would be good, but patch apply failure is
usually persistent.

I gather that the CFBot is still a kluge that is web-scraping the CF
app rather than being actually integrated with it, and that there are
plans to make that better that haven't been moving fast.  Probably
that would have to happen first, but there would be a lot of benefit
from having the info flow be two-way.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Joe Conway

On 5/16/24 17:36, Jacob Champion wrote:
> On Thu, May 16, 2024 at 2:29 PM Joe Conway  wrote:
>> If no one, including the author (new or otherwise) is interested in
>> shepherding a particular patch, what chance does it have of ever getting
>> committed?
>
> That's a very different thing from what I think will actually happen, which is
>
> - new author posts patch
> - community member says "use commitfest!"

Here is where we should point them at something that explains the care
and feeding requirements to successfully grow a patch into a commit.

> - new author registers patch
> - no one reviews it
> - patch gets automatically booted

Part of the care and feeding instructions should be a warning regarding
what happens if you are unsuccessful in the first CF and still want to
see it through.

> - community member says "register it again!"
> - new author says ಠ_ಠ

As long as this is not a surprise ending, I don't see the issue.

> Like Tom said upthread, the issue isn't really that new authors are
> somehow uninterested in their own patches.

First, some of them objectively are uninterested in doing more than
dropping a patch over the wall and never looking back. But admittedly
that is not too often.

Second, I don't think a once every two months effort in order to
register continuing interest is too much to ask.

And third, if we did something like Magnus' suggestion about a CF
parking lot, the process would be even more simple.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Daniel Gustafsson
> On 16 May 2024, at 23:46, Tom Lane  wrote:

> At this point it seems like there's consensus to have a "parking"
> section of the CF app, separate from the time-boxed CFs, and I hope
> somebody will go make that happen.  But I don't think that's our only
> issue, so we need to keep thinking about what should be improved.

Pulling in the information from the CFBot regarding test failures and whether
the patch applies at all, and automatically altering the state to WOA for at
least the latter category would be nice.  There's likely to be more analysis we
can do on the thread to measure "activity/hotness", but starting with the
simple boolean data we already have could be a good start.

--
Daniel Gustafsson





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Peter Eisentraut  writes:
> The problem is if we have 180 patches in Needs Review, and only 20 are 
> really actually ready to be reviewed.  And a second-order problem is 
> that if you already know that this will be the case, you give up before 
> even looking.

Right, so what can we do about that?  Does Needs Review state need to
be subdivided, and if so how?

If it's just that a patch should be in some other state altogether,
we should simply encourage people to change the state as soon as they
discover that.  I think the problem is not so much "90% are in the
wrong state" as "each potential reviewer has to rediscover that".

At this point it seems like there's consensus to have a "parking"
section of the CF app, separate from the time-boxed CFs, and I hope
somebody will go make that happen.  But I don't think that's our only
issue, so we need to keep thinking about what should be improved.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jacob Champion
On Thu, May 16, 2024 at 2:04 PM Tom Lane  wrote:
> Oh, that's an independent pet peeve of mine.  Usually, if I'm
> looking over the CF list for a patch to review, I skip over ones
> that already show an assigned reviewer, because I don't want to
> step on that person's toes.  But it seems very common to put
> one's name down for review without any immediate intention of
> doing work.  Or to do a review and wander off, leaving the patch
> apparently being tended to but not really.  (And I confess I'm
> occasionally guilty of both things myself.)

Yep, I do the same thing (and have the same pet peeve).

> I think it'd be great if we could separate "I'm actively reviewing
> this" from "I'm interested in this".  As a bonus, adding yourself
> to the "interested" list would be a fine proxy for the thumbs-up
> or star markers mentioned upthread.
>
> If those were separate columns, we could implement some sort of
> aging scheme whereby somebody who'd not commented for (say)
> a week or two would get quasi-automatically moved from the "active
> reviewer" column to the "interested" column, whereupon it wouldn't
> be impolite for someone else to sign up for active review.

+1!

--Jacob




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:06, Joe Conway wrote:

On 5/16/24 16:57, Jacob Champion wrote:

On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:

Maybe we should just make it a policy that *nothing* gets moved forward
from commitfest-to-commitfest and therefore the author needs to care
enough to register for the next one?


I think that's going to severely disadvantage anyone who doesn't do
this as their day job. Maybe I'm bristling a bit too much at the
wording, but not having time to shepherd a patch is not the same as
not caring.


Maybe the word "care" was a poor choice, but forcing authors to think 
about and decide if they have the "time to shepherd a patch" for the 
*next CF* is exactly the point. If they don't, why clutter the CF with it.


Objectively, I think this could be quite effective.  You need to prove 
your continued interest in your project by pressing a button every two 
months.


But there is a high risk that this will double the annoyance for 
contributors whose patches aren't getting reviews.  Now, not only are 
you being ignored, but you need to prove that you're still there every 
two months.






Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 23:04, Tom Lane wrote:

I think it'd be great if we could separate "I'm actively reviewing
this" from "I'm interested in this".  As a bonus, adding yourself
to the "interested" list would be a fine proxy for the thumbs-up
or star markers mentioned upthread.

If those were separate columns, we could implement some sort of
aging scheme whereby somebody who'd not commented for (say)
a week or two would get quasi-automatically moved from the "active
reviewer" column to the "interested" column, whereupon it wouldn't
be impolite for someone else to sign up for active review.


Yes, I think some variant of this could be quite useful.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jacob Champion
On Thu, May 16, 2024 at 2:29 PM Joe Conway  wrote:
> If no one, including the author (new or otherwise) is interested in
> shepherding a particular patch, what chance does it have of ever getting
> committed?

That's a very different thing from what I think will actually happen, which is

- new author posts patch
- community member says "use commitfest!"
- new author registers patch
- no one reviews it
- patch gets automatically booted
- community member says "register it again!"
- new author says ಠ_ಠ

Like Tom said upthread, the issue isn't really that new authors are
somehow uninterested in their own patches.

--Jacob




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Peter Eisentraut

On 16.05.24 22:46, Melanie Plageman wrote:

I was reflecting on why a general purpose patch tracker sounded
appealing to me, and I realized that, at least at this time of year, I
have a few patches that really count as "waiting on author" that I
know I need to do additional work on before they need more review but
which aren't currently my top priority. I should probably simply
withdraw and re-register them. My justification was that I'll lose
them if I don't keep them in the commitfest app. But, I could just,
you know, save them somewhere myself instead of polluting the
commitfest app with them. I don't know if others are in this
situation. Anyway, I'm definitely currently guilty of parking.


I don't have a problem with that at all.  It's pretty understandable 
that many patches are parked between say April and July.


The key is the keep the status up to date.

If we have 890 patches in Waiting for Author and 100 patches in Ready 
for Committer and 10 patches in Needs Review, and those 10 patches are 
actually reviewable, then that's good.  There might need to be a 
"background process" to make sure those 890 waiting patches are not 
parked forever and those 100 patches actually get committed sometime, 
but in principle this is the system working.


The problem is if we have 180 patches in Needs Review, and only 20 are 
really actually ready to be reviewed.  And a second-order problem is 
that if you already know that this will be the case, you give up before 
even looking.






Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-05-16 Th 17:15, Tom Lane wrote:
>> I'd rather have some visible status on the BF dashboard.  Invariably,
>> with a problem like this, the animal's owner is unaware there's a
>> problem.  If it's just silently not reporting, then no one else will
>> notice either, and we effectively lose an animal (despite it still
>> burning electricity to perform those rejected runs).

> Fair enough. That will mean some database changes and other stuff, so it 
> will take a bit longer.

Sure, I don't think it's urgent.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Joe Conway

On 5/16/24 17:24, Jacob Champion wrote:

On Thu, May 16, 2024 at 2:06 PM Joe Conway  wrote:

Maybe the word "care" was a poor choice, but forcing authors to think
about and decide if they have the "time to shepherd a patch" for the
*next CF* is exactly the point. If they don't, why clutter the CF with it.


Because the community regularly encourages new patch contributors to
park their stuff in it, without first asking them to sign on the
dotted line and commit to the next X months of their free time. If
that's not appropriate, then I think we should decide what those
contributors need to do instead, rather than making a new bar for them
to clear.


If no one, including the author (new or otherwise) is interested in 
shepherding a particular patch, what chance does it have of ever getting 
committed?


IMHO the probability is indistinguishable from zero anyway.

Perhaps we should be more explicit to new contributors that they need to 
either own their patch through the process, or convince someone to do it 
for them.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Minor cleanups in the SSL tests

2024-05-16 Thread Daniel Gustafsson
> On 16 May 2024, at 11:43, Peter Eisentraut  wrote:

> You might want to run your patch through pgperltidy.  The result doesn't look 
> bad, but a bit different than what you had crafted.

Ugh, I thought I had but clearly had forgotten. Fixed now.

> append_conf() opens and closes the file for each call.  It might be nice if 
> it could accept a list.  Or you can just pass the whole block as one string, 
> like it was done for pg_ident.conf before.

The attached v2 passes the whole block as a here-doc which seemed like the best
option to retain readability of the config.

--
Daniel Gustafsson



v2-0002-Use-library-functions-to-edit-config-in-SSL-tests.patch
Description: Binary data


v2-0001-Test-for-PG_TEST_EXTRA-separately.patch
Description: Binary data


Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Andrew Dunstan



On 2024-05-16 Th 17:15, Tom Lane wrote:

Andrew Dunstan  writes:

On 2024-05-16 Th 16:18, Tom Lane wrote:

Andrew: maybe the buildfarm server could be made to flag
animals building exceedingly old commits?  This is the second
problem of this sort that I've noticed this month, and you
really have to look closely to realize it's happening.

Yeah, that should be doable. Since we have the git ref these days we
should be able to mark it as old, or maybe just reject builds for very
old commits (the latter would be easier).

I'd rather have some visible status on the BF dashboard.  Invariably,
with a problem like this, the animal's owner is unaware there's a
problem.  If it's just silently not reporting, then no one else will
notice either, and we effectively lose an animal (despite it still
burning electricity to perform those rejected runs).





Fair enough. That will mean some database changes and other stuff, so it 
will take a bit longer.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jacob Champion
On Thu, May 16, 2024 at 2:06 PM Joe Conway  wrote:
> Maybe the word "care" was a poor choice, but forcing authors to think
> about and decide if they have the "time to shepherd a patch" for the
> *next CF* is exactly the point. If they don't, why clutter the CF with it.

Because the community regularly encourages new patch contributors to
park their stuff in it, without first asking them to sign on the
dotted line and commit to the next X months of their free time. If
that's not appropriate, then I think we should decide what those
contributors need to do instead, rather than making a new bar for them
to clear.

--Jacob




Re: TerminateOtherDBBackends code comments inconsistency.

2024-05-16 Thread Noah Misch
On Tue, Apr 30, 2024 at 09:10:52AM +0530, Amit Kapila wrote:
> On Tue, Apr 30, 2024 at 2:58 AM Noah Misch  wrote:
> > On Mon, Apr 29, 2024 at 10:18:35AM +0530, Amit Kapila wrote:
> > > 3a9b18b309 didn't change the docs of pg_terminate_backend and whatever
> > > is mentioned w.r.t permissions in the doc of that function sounds
> > > valid for drop database force to me. Do you have any specific proposal
> > > in your mind?
> >
> > Something like the attached.
> 
> LGTM.

Pushed as commit 372700c.  Thanks for the report and the review.




optimizing pg_upgrade's once-in-each-database steps

2024-05-16 Thread Nathan Bossart
A number of pg_upgrade steps require connecting to each database and
running a query.  When there are many databases, these steps are
particularly time-consuming, especially since this is done sequentially in
a single process.  At a quick glance, I see the following such steps:

* create_logical_replication_slots
* check_for_data_types_usage
* check_for_isn_and_int8_passing_mismatch
* check_for_user_defined_postfix_ops
* check_for_incompatible_polymorphics
* check_for_tables_with_oids
* check_for_user_defined_encoding_conversions
* check_old_cluster_subscription_state
* get_loadable_libraries
* get_db_rel_and_slot_infos
* old_9_6_invalidate_hash_indexes
* report_extension_updates

I set out to parallelize these kinds of steps via multiple threads or
processes, but I ended up realizing that we could likely achieve much of
the same gain with libpq's asynchronous APIs.  Specifically, both
establishing the connections and running the queries can be done without
blocking, so we can just loop over a handful of slots and advance a simple
state machine for each.  The attached is a proof-of-concept grade patch for
doing this for get_db_rel_and_slot_infos(), which yielded the following
results on my laptop for "pg_upgrade --link --sync-method=syncfs --jobs 8"
for a cluster with 10K empty databases.

total pg_upgrade_time:
* HEAD:  14m 8s
* patch: 10m 58s

get_db_rel_and_slot_infos() on old cluster:
* HEAD:  2m 45s
* patch: 36s

get_db_rel_and_slot_infos() on new cluster:
* HEAD:  1m 46s
* patch: 29s

I am posting this early to get thoughts on the general approach.  If we
proceeded with this strategy, I'd probably create some generic tooling that
each relevant step would provide a set of callback functions.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 05a9903295cb3b57ca9144217e89f0aac27277b5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 15 May 2024 12:07:10 -0500
Subject: [PATCH v1 1/1] parallel get relinfos

---
 src/bin/pg_upgrade/info.c| 266 +++
 src/tools/pgindent/typedefs.list |   1 +
 2 files changed, 202 insertions(+), 65 deletions(-)

diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..bb28e262c7 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -11,6 +11,7 @@
 
 #include "access/transam.h"
 #include "catalog/pg_class_d.h"
+#include "fe_utils/string_utils.h"
 #include "pg_upgrade.h"
 
 static void create_rel_filename_map(const char *old_data, const char *new_data,
@@ -22,13 +23,16 @@ static void report_unmatched_relation(const RelInfo *rel, const DbInfo *db,
 static void free_db_and_rel_infos(DbInfoArr *db_arr);
 static void get_template0_info(ClusterInfo *cluster);
 static void get_db_infos(ClusterInfo *cluster);
-static void get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo);
+static void start_rel_infos_query(PGconn *conn);
+static void get_rel_infos_result(PGconn *conn, DbInfo *dbinfo);
 static void free_rel_infos(RelInfoArr *rel_arr);
 static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
-static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool live_check);
-static void get_db_subscription_count(DbInfo *dbinfo);
+static void start_old_cluster_logical_slot_infos_query(PGconn *conn, bool live_check);
+static void get_old_cluster_logical_slot_infos_result(PGconn *conn, DbInfo *dbinfo);
+static void start_db_sub_count_query(PGconn *conn, DbInfo *dbinfo);
+static void get_db_sub_count_result(PGconn *conn, DbInfo *dbinfo);
 
 
 /*
@@ -268,6 +272,16 @@ report_unmatched_relation(const RelInfo *rel, const DbInfo *db, bool is_new_db)
 			   reloid, db->db_name, reldesc);
 }
 
+typedef enum
+{
+	UNUSED,
+	CONN_STARTED,
+	CONNECTING,
+	STARTED_RELINFO_QUERY,
+	STARTED_LOGICAL_QUERY,
+	STARTED_SUBSCRIPTION_QUERY,
+} InfoState;
+
 /*
  * get_db_rel_and_slot_infos()
  *
@@ -279,7 +293,12 @@ report_unmatched_relation(const RelInfo *rel, const DbInfo *db, bool is_new_db)
 void
 get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 {
-	int			dbnum;
+	int			dbnum = 0;
+	int			dbnum_proc = 0;
+	InfoState  *states;
+	int		   *dbs;
+	PGconn	  **conns;
+	int			jobs = (user_opts.jobs < 1) ? 1 : user_opts.jobs;
 
 	if (cluster->dbarr.dbs != NULL)
 		free_db_and_rel_infos(&cluster->dbarr);
@@ -287,20 +306,103 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool live_check)
 	get_template0_info(cluster);
 	get_db_infos(cluster);
 
-	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
-	{
-		DbInfo	   *pDbInfo = >dbarr.dbs[dbnum];
+	states = (InfoState *) pg_malloc(sizeof(InfoState *) * jobs);
+	dbs = (int *) pg_malloc(sizeof(int) * jobs);
+	conns = (PGconn **) pg_malloc(sizeof(PGconn *) * jobs);
 
-		

Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread David G. Johnston
On Thu, May 16, 2024 at 1:46 PM Melanie Plageman 
wrote:

>
> I should probably simply
> withdraw and re-register them. My justification was that I'll lose
> them if I don't keep them in the commitfest app. But, I could just,
> you know, save them somewhere myself instead of polluting the
> commitfest app with them. I don't know if others are in this
> situation. Anyway, I'm definitely currently guilty of parking.
>
>
I use a personal JIRA to track the stuff that I hope makes it into the
codebase, as well as just starring the corresponding emails in the
short-term.  Every patch ever submitted sits in the mailing list archive so
I have no real need to preserve git branches with my submitted work on
them.  At lot of my work comes down to lucky timing so I'm mostly content
with just pinging my draft patches on the email thread once in a while
hoping there is interest and time from someone.  For stuff that I would be
OK committing as submitted I'll add it to the commitfest and wait for
someone to either agree or point out where I can improve things.

Adding both these kinds to WIP appeals to me, particularly with something
akin to a "Collaboration Wanted" category in addition to "Needs Review" for
when I think it is ready, and "Waiting on Author" for stuff that has
pending feedback to resolve - or the author isn't currently fishing for
reviewer time for whatever reason.  Ideally there would be no rejections,
only constructive feedback that convinces the author that, whether for now
or forever, the proposed patch should be withdrawn pending some change in
circumstances that suggests the world is ready for it.

David J.


Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-05-16 Th 16:18, Tom Lane wrote:
>> Andrew: maybe the buildfarm server could be made to flag
>> animals building exceedingly old commits?  This is the second
>> problem of this sort that I've noticed this month, and you
>> really have to look closely to realize it's happening.

> Yeah, that should be doable. Since we have the git ref these days we 
> should be able to mark it as old, or maybe just reject builds for very 
> old commits (the latter would be easier).

I'd rather have some visible status on the BF dashboard.  Invariably,
with a problem like this, the animal's owner is unaware there's a
problem.  If it's just silently not reporting, then no one else will
notice either, and we effectively lose an animal (despite it still
burning electricity to perform those rejected runs).

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Joe Conway

On 5/16/24 16:48, Magnus Hagander wrote:
On Thu, May 16, 2024 at 10:46 PM Melanie Plageman wrote:
I was reflecting on why a general purpose patch tracker sounded

appealing to me, and I realized that, at least at this time of year, I
have a few patches that really count as "waiting on author" that I
know I need to do additional work on before they need more review but
which aren't currently my top priority. I should probably simply
withdraw and re-register them. My justification was that I'll lose
them if I don't keep them in the commitfest app. But, I could just,
you know, save them somewhere myself instead of polluting the
commitfest app with them. I don't know if others are in this
situation. Anyway, I'm definitely currently guilty of parking.


One thing I think we've talked about before (but not done) is to 
basically have a CF called "parking lot", where you can park patches 
that aren't active in a commitfest  but you also don't want to be dead. 
It would probably also be doable to have the cf bot run patches in that 
commitfest as well as the current one, if that's what people are using 
it for there.


+1

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Jacob Champion  writes:
> On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:
>> Maybe we should just make it a policy that *nothing* gets moved forward
>> from commitfest-to-commitfest and therefore the author needs to care
>> enough to register for the next one?

> I think that's going to severely disadvantage anyone who doesn't do
> this as their day job. Maybe I'm bristling a bit too much at the
> wording, but not having time to shepherd a patch is not the same as
> not caring.

Also, I doubt that there are all that many patches that have simply
been abandoned by their authors.  Our problem is the same as it's
been for many years: not enough review person-power, rather than
not enough patches.  So I think authors would just jump through that
hoop, or enough of them would that it wouldn't improve matters.

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Andrew Dunstan



On 2024-05-16 Th 16:18, Tom Lane wrote:

Thomas Munro  writes:

For citext_utf8, I pushed cff4e5a3.  Hamerkop runs infrequently, so
here's hoping for 100% green on master by Tuesday or so.

Meanwhile, back at the ranch, it doesn't seem that changed anything:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&dt=2024-05-16%2011%3A00%3A32

... and now that I look more closely, the reason why it didn't
change anything is that hamerkop is still building 0294df2
on HEAD.  All its other branches are equally stuck at the
end of March.  So this is a flat-out-broken animal, and I
plan to just ignore it until its owner un-sticks it.
(In particular, I think we shouldn't be in a hurry to push
the patch discussed downthread.)

Andrew: maybe the buildfarm server could be made to flag
animals building exceedingly old commits?  This is the second
problem of this sort that I've noticed this month, and you
really have to look closely to realize it's happening.





Yeah, that should be doable. Since we have the git ref these days we 
should be able to mark it as old, or maybe just reject builds for very 
old commits (the latter would be easier).



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Joe Conway

On 5/16/24 16:57, Jacob Champion wrote:

On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:

Maybe we should just make it a policy that *nothing* gets moved forward
from commitfest-to-commitfest and therefore the author needs to care
enough to register for the next one?


I think that's going to severely disadvantage anyone who doesn't do
this as their day job. Maybe I'm bristling a bit too much at the
wording, but not having time to shepherd a patch is not the same as
not caring.


Maybe the word "care" was a poor choice, but forcing authors to think 
about and decide if they have the "time to shepherd a patch" for the 
*next CF* is exactly the point. If they don't, why clutter the CF with it.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Jacob Champion  writes:
> ... Similar to how people currently use the
> Reviewer field as a personal TODO list... it might be nice to
> officially separate the ideas a bit.

Oh, that's an independent pet peeve of mine.  Usually, if I'm
looking over the CF list for a patch to review, I skip over ones
that already show an assigned reviewer, because I don't want to
step on that person's toes.  But it seems very common to put
one's name down for review without any immediate intention of
doing work.  Or to do a review and wander off, leaving the patch
apparently being tended to but not really.  (And I confess I'm
occasionally guilty of both things myself.)

I think it'd be great if we could separate "I'm actively reviewing
this" from "I'm interested in this".  As a bonus, adding yourself
to the "interested" list would be a fine proxy for the thumbs-up
or star markers mentioned upthread.

If those were separate columns, we could implement some sort of
aging scheme whereby somebody who'd not commented for (say)
a week or two would get quasi-automatically moved from the "active
reviewer" column to the "interested" column, whereupon it wouldn't
be impolite for someone else to sign up for active review.

regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Jelte Fennema-Nio
On Thu, 16 May 2024 at 18:57, Robert Haas  wrote:
> Ugh, it's so hard to communicate clearly about this stuff. I didn't
> really have any thought that we'd ever try to handle something as
> complicated as compression using ParameterSet.

Okay, then it's definitely very hard to communicate clearly about
this. Because being able to re-configure compression using
ParameterSet is exactly the type of thing I want to be able to do in
PgBouncer. Otherwise a connection from PgBouncer to Postgres cannot be
handed off to any client, because its compression state cannot be
changed on the fly to the state that a client expects, if the first
one wants lz4 compression, and a second one wants zstd compression.
There's no way the same connection can be reused for both unless you
decompress and recompress in pgbouncer, which is expensive to do.
Being able to reconfigure the stream to compress the messages in the
expected form is much cheaper.

> Does it send a NegotiateCompressionType message after
> the NegotiateProtocolVersion, for example? That seems like it could
> lead to the client having to be prepared for a lot of NegotiateX
> messages somewhere down the road.

Like Jacob explains, you'd want to allow the client to provide a list
of options in order of preference. And then have the server respond
with a ParameterStatus message saying what it ended up being. So no
new NegotiateXXX messages are needed, as long as we make sure any
_pq_.xxx falls back to reporting its default value on failure. This is
exactly why I said I prefer option 1 of the options I listed, because
we need _pq_.xxx messages to report their current value to the client.

To be clear, this is not special to compression. This applies to ALL
proposed protocol parameters. The server should fall back to some
"least common denominator" if it doesn't understand (part of) the
value for the protocol parameter that's provided by the client,
possibly falling back to disabling the protocol extension completely.

> Changing compression algorithms in mid-stream is tricky, too. If I
> tell the server "hey, turn on server-to-client compression!"

Yes it is tricky, but it's something that it would need to support
imho. And Jacob actually implemented it this way, so I feel like we're
discussing a non-problem here.

> I don't know if we want to support changing compression algorithms in
> mid-stream. I don't think there's any reason we can't, but it might be
> a bunch of work for something that nobody really cares about.

Again, I guess I wasn't clear at all in my previous emails and/or
commit messages. Connection poolers care **very much** about this.
Poolers need to be able to re-configure any protocol parameter to be
able to pool the same server connection across clients with
differently configured protocol parameters. Again: This is the primary
reason for me wanting to introduce the ParameterSet message.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jesper Pedersen

Hi,

On 5/16/24 4:31 PM, Joe Conway wrote:

Yeah.  I think that Robert put his finger on a big part of the
problem, which is that punting a patch to the next CF is a lot
easier than rejecting it, particularly for less-senior CFMs
who may not feel they have the authority to say no (or at
least doubt that the patch author would accept it).


Maybe we should just make it a policy that *nothing* gets moved forward 
from commitfest-to-commitfest and therefore the author needs to care 
enough to register for the next one?




Or at least nothing gets moved forward from March.

Spending time on a patch during a major version doesn't mean that you 
have time to do the same for the next major version.


That way July could start "clean" with patches people are interested in 
and willing to maintain during the next year.


Also, it is a bit confusing that f.ex.

 https://commitfest.postgresql.org/48/

already shows 40 patches as Committed...

So, having some sort of "End of Development" state in general would be good.

Best regards,
 Jesper





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jacob Champion
On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:
> Maybe we should just make it a policy that *nothing* gets moved forward
> from commitfest-to-commitfest and therefore the author needs to care
> enough to register for the next one?

I think that's going to severely disadvantage anyone who doesn't do
this as their day job. Maybe I'm bristling a bit too much at the
wording, but not having time to shepherd a patch is not the same as
not caring.

--Jacob




Lowering the minimum value for maintenance_work_mem

2024-05-16 Thread Andres Freund
Hi,

In the subthread at [1] I needed to trigger multiple rounds of index vacuuming
within one vacuum.

It turns out that with the new dead tuple implementation, that got actually
somewhat expensive. Particularly if all tuples on all pages get deleted, the
representation is just "too dense". Normally that's obviously very good, but
for testing, not so much:

With the minimum setting of maintenance_work_mem=1024kB, a simple table with
narrow rows, where all rows are deleted, the first cleanup happens after
3697812 dead tids. The table for that has to be > ~128MB.

Needing a ~128MB table to be able to test multiple cleanup passes makes it
much more expensive to test and consequently will lead to worse test coverage.

I think we should consider lowering the minimum setting of
maintenance_work_mem to the minimum of work_mem. For real-life workloads
maintenance_work_mem=1024kB is going to already be quite bad, so we don't
protect users much by forbidding a setting lower than 1MB.


Just for comparison, with a limit of 1MB, PostgreSQL < 17 needed to do the first
cleanup pass after 174472 dead tuples. That's a 20x improvement. Really nice.

Greetings,

Andres Freund

[1] https://postgr.es/m/20240516193953.zdj545efq6vabymd%40awork3.anarazel.de
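The ~20x figure can be sanity-checked with quick arithmetic. As a rough sketch (assuming the pre-17 representation of a flat array of 6-byte ItemPointerData entries, and glossing over the small bookkeeping overhead):

```python
# Back-of-the-envelope check of the numbers in the message above.
# Pre-17: dead TIDs were stored as a flat array of 6-byte ItemPointerData.
MAINTENANCE_WORK_MEM = 1024 * 1024  # the 1MB minimum under discussion

ITEMPOINTER_BYTES = 6
pre17_capacity = MAINTENANCE_WORK_MEM // ITEMPOINTER_BYTES
print(pre17_capacity)  # ~174762, close to the observed 174472

# PG 17's TidStore packs per-block offsets much more densely, so a
# fully-deleted table fits far more TIDs in the same memory budget:
observed_pre17 = 174_472
observed_pg17 = 3_697_812
print(observed_pg17 / observed_pre17)  # ~21x, matching the "20x" above
```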




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jacob Champion
On Thu, May 16, 2024 at 1:46 PM Melanie Plageman
 wrote:
> I should probably simply
> withdraw and re-register them. My justification was that I'll lose
> them if I don't keep them in the commitfest app. But, I could just,
> you know, save them somewhere myself instead of polluting the
> commitfest app with them. I don't know if others are in this
> situation. Anyway, I'm definitely currently guilty of parking.

Me too -- it's really, really useful. I think there's probably a
better way the app could help us mark it as parked, as opposed to
getting rid of parking. Similar to how people currently use the
Reviewer field as a personal TODO list... it might be nice to
officially separate the ideas a bit.

--Jacob




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Melanie Plageman  writes:
> I was reflecting on why a general purpose patch tracker sounded
> appealing to me, and I realized that, at least at this time of year, I
> have a few patches that really count as "waiting on author" that I
> know I need to do additional work on before they need more review but
> which aren't currently my top priority. I should probably simply
> withdraw and re-register them. My justification was that I'll lose
> them if I don't keep them in the commitfest app. But, I could just,
> you know, save them somewhere myself instead of polluting the
> commitfest app with them. I don't know if others are in this
> situation. Anyway, I'm definitely currently guilty of parking.

It's also nice that the CF app will run CI for you, so at least
you can keep the patch building if you're so inclined.

David J. had a suggestion for this too upthread, which was to create a
separate slot for WIP patches that isn't one of the dated CF slots.

It's hard to argue that such patches need to be in "the CF app" at
all, if you're not actively looking for review.  But the CI runs
and the handy per-author patch status list make the app very tempting
infrastructure for parked patches.  Maybe we could have a not-the-CF
app that offers those amenities?

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Jacob Champion
On Thu, May 16, 2024 at 12:14 PM David G. Johnston
 wrote:
> Those likely never get out of the new WIP slot discussed above.  Your patch 
> tracker basically.  And there should be less angst in moving something in the 
> bimonthly into WIP rather than dropping it outright.  There is discussion to 
> be had regarding WIP/patch tracking should we go this direction but even if 
> it is just moving clutter from one room to another there seems like a clear 
> benefit

Yeah, IMO we're unlikely to get around the fact that it's a patch
tracker -- even if patch trackers inherently tend to suck, as Robert
put it, they tend to have lots of value too. May as well embrace the
need for a tracker and make it more helpful.

> We'll still have a problem with too many WIP patches and not enough ability 
> or desire to resolve them.  But setting a higher bar to get onto the 
> bimonthly slot while still providing a place for collaboration is a step 
> forward that configuring technology can help with.

+1. I think _any_ way to better communicate "what the author needs
right now" would help a lot.

> As for WIP, maybe adding thumbs-up and thumbs-down support tracking widgets 
> will help draw attention to more desired things.

Personally I'd like a way to gauge general interest without
introducing a voting system. "Stars", if you will, rather than
"thumbs". In the context of the CF, it's valuable to me as an author
that you care about what I'm trying to do; if you don't like my
implementation, that's what reviews on the list are for. And I see no
way that the meaning of a thumbs-down button wouldn't degrade
immediately.

I have noticed that past proposals for incremental changes to the CF
app (mine and others') are met with a sort of resigned inertia --
sometimes disagreement, but more often "meh, sounds all right, maybe".
Maybe my suggestions are just meh, and that's fine. But if we can't
tweak small things as we go -- and be generally okay with trying and
reverting failed experiments sometimes -- frustrations are likely to
pile up until someone writes another biyearly manifesto thread.

--Jacob




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Magnus Hagander
On Thu, May 16, 2024 at 10:46 PM Melanie Plageman 
wrote:

> On Thu, May 16, 2024 at 2:30 PM Robert Haas  wrote:
> >
> > Hi,
> >
> > The original intent of CommitFests, and of commitfest.postgresql.org
> > by extension, was to provide a place where patches could be registered
> > to indicate that they needed to be reviewed, thus enabling patch
> > authors and patch reviewers to find each other in a reasonably
> > efficient way. I don't think it's working any more. I spent a good
> > deal of time going through the CommitFest this week, and I didn't get
> > through a very large percentage of it, and what I found is that the
> > status of the patches registered there is often much messier than can
> > be captured by a simple "Needs Review" or "Waiting on Author," and the
> > number of patches that are actually in need of review is not all that
> > large. For example, there are:
> >
> > - patches parked there by a committer who will almost certainly do
> > something about them after we branch
> > - patches parked there by a committer who probably won't do something
> > about them after we branch, but maybe they will, or maybe somebody
> > else will, and anyway this way we at least run CI
>
> -- snip --
>
> > So, our CommitFest application has turned into a patch tracker. IMHO,
> > patch trackers intrinsically tend to suck, because they fill up with
> > garbage that nobody cares about, and nobody wants to do the colossal
> > amount of work that it takes to maintain them. But our patch tracker
> > sucks MORE, because it's not even intended to BE a general-purpose
> > patch tracker.
>
> I was reflecting on why a general purpose patch tracker sounded
> appealing to me, and I realized that, at least at this time of year, I
> have a few patches that really count as "waiting on author" that I
> know I need to do additional work on before they need more review but
> which aren't currently my top priority. I should probably simply
> withdraw and re-register them. My justification was that I'll lose
> them if I don't keep them in the commitfest app. But, I could just,
> you know, save them somewhere myself instead of polluting the
> commitfest app with them. I don't know if others are in this
> situation. Anyway, I'm definitely currently guilty of parking.
>

One thing I think we've talked about before (but not done) is to basically
have a CF called "parking lot", where you can park patches that aren't
active in a commitfest but which you don't want to declare dead. It would
probably also be doable to have the cf bot run patches in that commitfest
as well as the current one, if that's what people are using it for there.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Melanie Plageman
On Thu, May 16, 2024 at 2:30 PM Robert Haas  wrote:
>
> Hi,
>
> The original intent of CommitFests, and of commitfest.postgresql.org
> by extension, was to provide a place where patches could be registered
> to indicate that they needed to be reviewed, thus enabling patch
> authors and patch reviewers to find each other in a reasonably
> efficient way. I don't think it's working any more. I spent a good
> deal of time going through the CommitFest this week, and I didn't get
> through a very large percentage of it, and what I found is that the
> status of the patches registered there is often much messier than can
> be captured by a simple "Needs Review" or "Waiting on Author," and the
> number of patches that are actually in need of review is not all that
> large. For example, there are:
>
> - patches parked there by a committer who will almost certainly do
> something about them after we branch
> - patches parked there by a committer who probably won't do something
> about them after we branch, but maybe they will, or maybe somebody
> else will, and anyway this way we at least run CI

-- snip --

> So, our CommitFest application has turned into a patch tracker. IMHO,
> patch trackers intrinsically tend to suck, because they fill up with
> garbage that nobody cares about, and nobody wants to do the colossal
> amount of work that it takes to maintain them. But our patch tracker
> sucks MORE, because it's not even intended to BE a general-purpose
> patch tracker.

I was reflecting on why a general purpose patch tracker sounded
appealing to me, and I realized that, at least at this time of year, I
have a few patches that really count as "waiting on author" that I
know I need to do additional work on before they need more review but
which aren't currently my top priority. I should probably simply
withdraw and re-register them. My justification was that I'll lose
them if I don't keep them in the commitfest app. But, I could just,
you know, save them somewhere myself instead of polluting the
commitfest app with them. I don't know if others are in this
situation. Anyway, I'm definitely currently guilty of parking.

- Melanie




Re: query_id, pg_stat_activity, extended query protocol

2024-05-16 Thread Imseih (AWS), Sami
> I'm unsure if it needs to call ExecutorStart in the bind code. But if we
> don't change the current logic, would it make more sense to move
> pgstat_report_query_id to the ExecutorRun routine?

I initially thought about that, but for utility statements (CTAS, etc.) being 
executed with the extended query protocol, we will still not advertise the queryId 
as we should. This is why I chose to set the queryId in PortalRunSelect and
PortalRunMulti in v2 of the patch [1].

We can advertise the queryId inside ExecutorRun instead of
PortalRunSelect as the patch does, but we will still need to advertise 
the queryId inside PortalRunMulti.

[1] 
https://www.postgresql.org/message-id/FAB6AEA1-AB5E-4DFF-9A2E-BB320E6C3DF1%40amazon.com


Regards,

Sami







Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Joe Conway

On 5/16/24 15:47, Tom Lane wrote:

Daniel Gustafsson  writes:

On 16 May 2024, at 20:30, Robert Haas  wrote:
The original intent of CommitFests, and of commitfest.postgresql.org
by extension, was to provide a place where patches could be registered
to indicate that they needed to be reviewed, thus enabling patch
authors and patch reviewers to find each other in a reasonably
efficient way. I don't think it's working any more.



But which part is broken though, the app, our commitfest process and workflow
and its intent, or our assumption that we follow said process and workflow
which may or may not be backed by evidence?  IMHO, from being CFM many times,
there is a fair bit of the latter, which exacerbates the problem.  This is
harder to fix with more or better software though. 


Yeah.  I think that Robert put his finger on a big part of the
problem, which is that punting a patch to the next CF is a lot
easier than rejecting it, particularly for less-senior CFMs
who may not feel they have the authority to say no (or at
least doubt that the patch author would accept it).


Maybe we should just make it a policy that *nothing* gets moved forward 
from commitfest-to-commitfest and therefore the author needs to care 
enough to register for the next one?



I spent a good deal of time going through the CommitFest this week



And you deserve a big Thank You for that.


+ many


+1 agreed

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: GUC names in messages

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
> On 16 May 2024, at 13:35, Peter Eisentraut  wrote:
>> - errmsg("WAL generated with full_page_writes=off was replayed "
>> + errmsg("WAL generated with \"full_page_writes=off\" was replayed "
 
> I'm not a fan of this syntax, but I at the same time can't offer a better idea
> so this isn't an objection but a hope that it can be made even better during
> the v18 cycle.

Yeah ... formally correct would be something like

errmsg("WAL generated with \"full_page_writes\"=\"off\" was replayed "

but that's a bit much for my taste.

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Tom Lane
Thomas Munro  writes:
> For citext_utf8, I pushed cff4e5a3.  Hamerkop runs infrequently, so
> here's hoping for 100% green on master by Tuesday or so.

Meanwhile, back at the ranch, it doesn't seem that changed anything:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&dt=2024-05-16%2011%3A00%3A32

... and now that I look more closely, the reason why it didn't
change anything is that hamerkop is still building 0294df2
on HEAD.  All its other branches are equally stuck at the
end of March.  So this is a flat-out-broken animal, and I
plan to just ignore it until its owner un-sticks it.
(In particular, I think we shouldn't be in a hurry to push
the patch discussed downthread.)

Andrew: maybe the buildfarm server could be made to flag
animals building exceedingly old commits?  This is the second
problem of this sort that I've noticed this month, and you
really have to look closely to realize it's happening.

regards, tom lane




gist index builds try to open FSM over and over

2024-05-16 Thread Andres Freund
Hi,

I built a gist index to try to test a theory in some other thread. For that
the indexes need to cover a lot of entries. With gist creating the index took
a long time, which made me strace the index build process.

Which lead me to notice this:

...
openat(AT_FDCWD, "base/16462/17747_fsm", O_RDWR|O_CLOEXEC) = -1 ENOENT (No such 
file or directory)
lseek(56, 0, SEEK_END)  = 40173568
pwrite64(56, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 
40173568) = 8192
lseek(56, 0, SEEK_END)  = 40181760
openat(AT_FDCWD, "base/16462/17747_fsm", O_RDWR|O_CLOEXEC) = -1 ENOENT (No such 
file or directory)
lseek(56, 0, SEEK_END)  = 40181760
pwrite64(56, 
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 
40181760) = 8192
lseek(56, 0, SEEK_END)  = 40189952
lseek(56, 0, SEEK_END)  = 40189952
openat(AT_FDCWD, "base/16462/17747_fsm", O_RDWR|O_CLOEXEC) = -1 ENOENT (No such 
file or directory)
lseek(56, 0, SEEK_END)  = 40189952
...

I.e. for every write we try and fail to open the FSM.

#0  __libc_open64 (file=0x30469c8 "base/16462/17747_fsm", oflag=524290) at 
../sysdeps/unix/sysv/linux/open64.c:30
#1  0x00cbe582 in BasicOpenFilePerm (fileName=0x30469c8 
"base/16462/17747_fsm", fileFlags=524290, fileMode=384)
at 
../../../../../home/andres/src/postgresql/src/backend/storage/file/fd.c:1134
#2  0x00cbf057 in PathNameOpenFilePerm (fileName=0x30469c8 
"base/16462/17747_fsm", fileFlags=524290, fileMode=384)
at 
../../../../../home/andres/src/postgresql/src/backend/storage/file/fd.c:1620
#3  0x00cbef88 in PathNameOpenFile (fileName=0x30469c8 
"base/16462/17747_fsm", fileFlags=2)
at 
../../../../../home/andres/src/postgresql/src/backend/storage/file/fd.c:1577
#4  0x00cfeed2 in mdopenfork (reln=0x2fd5af8, forknum=FSM_FORKNUM, 
behavior=2)
at 
../../../../../home/andres/src/postgresql/src/backend/storage/smgr/md.c:649
#5  0x00cfe20b in mdexists (reln=0x2fd5af8, forknum=FSM_FORKNUM) at 
../../../../../home/andres/src/postgresql/src/backend/storage/smgr/md.c:181
#6  0x00d015b3 in smgrexists (reln=0x2fd5af8, forknum=FSM_FORKNUM) at 
../../../../../home/andres/src/postgresql/src/backend/storage/smgr/smgr.c:400
#7  0x00cc5078 in fsm_readbuf (rel=0x7f5b87977f38, addr=..., 
extend=false)
at 
../../../../../home/andres/src/postgresql/src/backend/storage/freespace/freespace.c:571
#8  0x00cc52d3 in fsm_search (rel=0x7f5b87977f38, min_cat=128 '\200')
at 
../../../../../home/andres/src/postgresql/src/backend/storage/freespace/freespace.c:690
#9  0x00cc47a5 in GetPageWithFreeSpace (rel=0x7f5b87977f38, 
spaceNeeded=4096)
at 
../../../../../home/andres/src/postgresql/src/backend/storage/freespace/freespace.c:141
#10 0x00cc5e52 in GetFreeIndexPage (rel=0x7f5b87977f38) at 
../../../../../home/andres/src/postgresql/src/backend/storage/freespace/indexfsm.c:40
#11 0x00855d4a in gistNewBuffer (r=0x7f5b87977f38, 
heaprel=0x7f5b87979688)
at 
../../../../../home/andres/src/postgresql/src/backend/access/gist/gistutil.c:831
#12 0x00844541 in gistplacetopage (rel=0x7f5b87977f38, freespace=819, 
giststate=0x2feae68, buffer=67261, itup=0x7ffd3ce86c30, ntup=1, oldoffnum=0,
newblkno=0x0, leftchildbuf=0, splitinfo=0x7ffd3ce86be0, 
markfollowright=true, heapRel=0x7f5b87979688, is_build=true)
at 
../../../../../home/andres/src/postgresql/src/backend/access/gist/gist.c:353
#13 0x00846263 in gistinserttuples (state=0x7ffd3ce86c90, 
stack=0x2fde7e8, giststate=0x2feae68, tuples=0x7ffd3ce86c30, ntup=1, 
oldoffnum=0,
leftchild=0, rightchild=0, unlockbuf=false, unlockleftchild=false) at 
../../../../../home/andres/src/postgresql/src/backend/access/gist/gist.c:1298
#14 0x008461a7 in gistinserttuple (state=0x7ffd3ce86c90, 
stack=0x2fde7e8, giststate=0x2feae68, tuple=0x2fde708, oldoffnum=0)
at 
../../../../../home/andres/src/postgresql/src/backend/access/gist/gist.c:1251
#15 0x00845681 in gistdoinsert (r=0x7f5b87977f38, itup=0x2fde708, 
freespace=819, giststate=0x2feae68, heapRel=0x7f5b87979688, is_build=true)
at 
../../../../../home/andres/src/postgresql/src/backend/access/gist/gist.c:887
#16 0x00848c79 in gistBuildCallback (index=0x7f5b87977f38, 
tid=0x2f31f74, values=0x7ffd3ce87060, isnull=0x7ffd3ce87040, tupleIsAlive=true,
state=0x7ffd3ce87340) at 
../../../../../home/andres/src/postgresql/src/backend/access/gist/gistbuild.c:863
#17 0x0087d605 in heapam_index_build_range_scan 
(heapRelation=0x7f5b87979688, indexRelation=0x7f5b87977f38, 
indexInfo=0x2fd9f50, allow_sync=true,
anyvisible=false, progress=true, start_blockno=0, numblocks=4294967295, 
callback=0x848b6b , callback_state=0x7ffd3ce87340,
scan=0x2f31f18) at 
../../../../../home/andres/src/postgresql/src/backend/access/heap/heapam_handler.c:1706
#18 

Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Greg Stark
When I was CFM for a couple cycles I started with the idea that I was going
to try being a hardass and rejecting or RWF all the patches that had gotten
negative reviews and been bounced forward.

Except when I actually went through them I didn't find many. Mostly like
Robert I found perfectly reasonable patches that had received generally
positive reviews and had really complex situations that really needed more
analysis.

I also found a lot of patches that were just not getting any reviews at all
:( and rejecting those didn't feel great

On Thu, May 16, 2024, 21:48 Tom Lane  wrote:

> Daniel Gustafsson  writes:
> >> On 16 May 2024, at 20:30, Robert Haas  wrote:
> >> The original intent of CommitFests, and of commitfest.postgresql.org
> >> by extension, was to provide a place where patches could be registered
> >> to indicate that they needed to be reviewed, thus enabling patch
> >> authors and patch reviewers to find each other in a reasonably
> >> efficient way. I don't think it's working any more.
>
> > But which part is broken though, the app, our commitfest process and
> workflow
> > and its intent, or our assumption that we follow said process and
> workflow
> > which may or may not be backed by evidence?  IMHO, from being CFM many
> times,
> > there is a fair bit of the latter, which exacerbates the problem.  This
> is
> > harder to fix with more or better software though.
>
> Yeah.  I think that Robert put his finger on a big part of the
> problem, which is that punting a patch to the next CF is a lot
> easier than rejecting it, particularly for less-senior CFMs
> who may not feel they have the authority to say no (or at
> least doubt that the patch author would accept it).  It's hard
> even for senior people to get patch authors to take no for an
> answer --- I know I've had little luck at it --- so maybe that
> problem is inherent.  But a CF app full of patches that are
> unlikely ever to go anywhere isn't helpful.
>
> It's also true that some of us are abusing the process a bit.
> I know I frequently stick things into the CF app even if I intend
> to commit them pretty darn soon, because it's a near-zero-friction
> way to run CI on them, and I'm too lazy to learn how to do that
> otherwise.  I like David's suggestion of a "Pending Commit"
> status, or maybe I should just put such patches into RfC state
> immediately?  However, short-lived entries like that don't seem
> like they're a big problem beyond possibly skewing the CF statistics
> a bit.  It's the stuff that keeps hanging around that seems like
> the core of the issue.
>
> >> I spent a good deal of time going through the CommitFest this week
>
> > And you deserve a big Thank You for that.
>
> + many
>
> regards, tom lane
>
>
>


Re: GUC names in messages

2024-05-16 Thread Daniel Gustafsson
> On 16 May 2024, at 13:56, Alvaro Herrera  wrote:

> I think we should also take patch 0005 in pg17, which reduces the number
> of strings to translate.

Agreed, lessening the burden on translators is always a good idea.

--
Daniel Gustafsson





Re: GUC names in messages

2024-05-16 Thread Daniel Gustafsson
> On 16 May 2024, at 13:35, Peter Eisentraut  wrote:

> I think we should accept your two patches

I agree with this.

> v6-0001-GUC-names-docs.patch

+1

> v6-0002-GUC-names-add-quotes.patch

- errmsg("WAL generated with full_page_writes=off was replayed "
+ errmsg("WAL generated with \"full_page_writes=off\" was replayed "
 
I'm not a fan of this syntax, but I at the same time can't offer a better idea
so this isn't an objection but a hope that it can be made even better during
the v18 cycle.

--
Daniel Gustafsson





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 16 May 2024, at 20:30, Robert Haas  wrote:
>> The original intent of CommitFests, and of commitfest.postgresql.org
>> by extension, was to provide a place where patches could be registered
>> to indicate that they needed to be reviewed, thus enabling patch
>> authors and patch reviewers to find each other in a reasonably
>> efficient way. I don't think it's working any more.

> But which part is broken though, the app, our commitfest process and workflow
> and its intent, or our assumption that we follow said process and workflow
> which may or may not be backed by evidence?  IMHO, from being CFM many times,
> there is a fair bit of the latter, which exacerbates the problem.  This is
> harder to fix with more or better software though. 

Yeah.  I think that Robert put his finger on a big part of the
problem, which is that punting a patch to the next CF is a lot
easier than rejecting it, particularly for less-senior CFMs
who may not feel they have the authority to say no (or at
least doubt that the patch author would accept it).  It's hard
even for senior people to get patch authors to take no for an
answer --- I know I've had little luck at it --- so maybe that
problem is inherent.  But a CF app full of patches that are
unlikely ever to go anywhere isn't helpful.

It's also true that some of us are abusing the process a bit.
I know I frequently stick things into the CF app even if I intend
to commit them pretty darn soon, because it's a near-zero-friction
way to run CI on them, and I'm too lazy to learn how to do that
otherwise.  I like David's suggestion of a "Pending Commit"
status, or maybe I should just put such patches into RfC state
immediately?  However, short-lived entries like that don't seem
like they're a big problem beyond possibly skewing the CF statistics
a bit.  It's the stuff that keeps hanging around that seems like
the core of the issue.

>> I spent a good deal of time going through the CommitFest this week

> And you deserve a big Thank You for that.

+ many

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Daniel Gustafsson
> On 16 May 2024, at 20:30, Robert Haas  wrote:

> The original intent of CommitFests, and of commitfest.postgresql.org
> by extension, was to provide a place where patches could be registered
> to indicate that they needed to be reviewed, thus enabling patch
> authors and patch reviewers to find each other in a reasonably
> efficient way. I don't think it's working any more.

But which part is broken though, the app, our commitfest process and workflow
and its intent, or our assumption that we follow said process and workflow
which may or may not be backed by evidence?  IMHO, from being CFM many times,
there is a fair bit of the latter, which exacerbates the problem.  This is
harder to fix with more or better software though. 

> I spent a good deal of time going through the CommitFest this week

And you deserve a big Thank You for that.

--
Daniel Gustafsson





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Maciek Sakrejda
Thanks for raising this. As someone who is only modestly familiar with
Postgres internals or even C, but would still like to contribute through
review, I find the current process of finding a suitable patch both tedious
and daunting. The Reviewing a Patch article on the wiki [0] says reviews
like mine are still welcome, but it's hard to get started. I'd love to see
this be more approachable.

Thanks,
Maciek

[0]: https://wiki.postgresql.org/wiki/Reviewing_a_Patch


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread David G. Johnston
On Thu, May 16, 2024 at 11:30 AM Robert Haas  wrote:

> Hi,
>
> The original intent of CommitFests, and of commitfest.postgresql.org
> by extension, was to provide a place where patches could be registered
> to indicate that they needed to be reviewed, thus enabling patch
> authors and patch reviewers to find each other in a reasonably
> efficient way. I don't think it's working any more. I spent a good
> deal of time going through the CommitFest this week, and I didn't get
> through a very large percentage of it, and what I found is that the
> status of the patches registered there is often much messier than can
> be captured by a simple "Needs Review" or "Waiting on Author," and the
> number of patches that are actually in need of review is not all that
> large. For example, there are:
>
> - patches parked there by a committer who will almost certainly do
> something about them after we branch
> - patches parked there by a committer who probably won't do something
> about them after we branch, but maybe they will, or maybe somebody
> else will, and anyway this way we at least run CI
> - patches parked there by a committer who may well do something about
> them before we even branch, because they're not actually subject to
> the feature freeze
>

If a committer has a patch in the CF that is going to be committed in the
future unless there is outside action those should be put under a "Pending
Commit" status.

- patches that we've said we don't want but the author thinks we do
> (sometimes i agree with the author, sometimes not)
> - patches that have long-unresolved difficulties which the author
> either doesn't know how to solve or is in no hurry to solve
> - patches that have already been reviewed by multiple people, often
> including several committers, and which have been updated multiple
> times, but for one reason or another, not committed
>

Use the same software but a different endpoint - Collaboration.  Or even
the same endpoint just add an always open slot named "Work In Process"
(WIP).  If items can be moved from there to another open or future
commitfest slot so much the better.

- patches that actually do need to be reviewed
>

If we can distinguish between needs to be reviewed by a committer
(commitfest dated slots - bimonthlies) and reviewed by someone other than
the author (work in process slot) that should help everyone.  Of course,
anyone is welcome to review a patch that has been marked ready to commit.
I suppose "ready to commit" already sorta does this without the need for
WIP but a quick sanity check would be that ready to commit shouldn't (not
mustn't) be seen in WIP and needs review shouldn't be seen in the
bimonthlies.  A needs review in WIP means that the patch has been seen by a
committer and sent back for more work but that the scope and intent are
such that it will make it into the upcoming major release.  Or is something
like a bug fix that just goes right into the bimonthly instead of starting
out as a WIP item.


> I think there are a couple of things that have led to this state of
> affairs. First, we got tired of making people mad by booting their
> stuff out of the CommitFest, so we mostly just stopped doing it, even
> if it had 0% chance of being committed this CommitFest, and really
> even if it had a 0% chance of being committed ever.


Those likely never get out of the new WIP slot discussed above.  Your patch
tracker basically.  And there should be less angst in moving something in
the bimonthly into WIP rather than dropping it outright.  There is
discussion to be had regarding WIP/patch tracking should we go this
direction but even if it is just moving clutter from one room to another
there seems like a clear benefit and need to tighten up what it means to be
in the bimonthly slot - to have qualifications laid out for a patch to earn
its way there, either by effort (authored and reviewed) or need (i.e., bug
fixes), or, realistically, being authored by a committer and being mostly
trivial in nature.


> Second, we added
> CI, which means that there is positive value to registering the patch
> in the CommitFest even if committing it is not in the cards.


The new slot retains this benefit.

And those
> things together have created a third problem, which is that the list
> is now so long and so messy that even the CommitFest managers probably
> don't manage to go through the whole thing thoroughly in a month.
>

The new slot wouldn't be subject to this.

We'll still have a problem with too many WIP patches and not enough ability
or desire to resolve them.  But setting a higher bar to get onto the
bimonthly slot while still providing a place for collaboration is a step
forward that configuring technology can help with.  As for WIP, maybe
adding thumbs-up and thumbs-down support tracking widgets will help draw
attention to more desired things.

David J.


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Erik Rijkers

Op 5/16/24 om 20:30 schreef Robert Haas:

Hi,

The original intent of CommitFests, and of commitfest.postgresql.org
by extension, was to provide a place where patches could be registered
to indicate that they needed to be reviewed, thus enabling patch
authors and patch reviewers to find each other in a reasonably
efficient way. I don't think it's working any more. I spent a good


Hi,

Perhaps it would be an idea to let patches 'expire' automatically unless 
they are 'rescued' (= given another year) by a committer or commitfest 
manager (or perhaps a somewhat wider group - but not too many). 
Expiration after, say, one year should force patch authors to mount a 
credible defense for their patch to either get their work rescued or 
reinstated/resubmitted.


Just a thought that has crossed my mind already a few times. It's not 
very sympathetic, but it might help keep the list smaller.


Erik Rijkers





Re: race condition when writing pg_control

2024-05-16 Thread Andres Freund
Hi,

On 2024-05-16 15:01:31 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2024-05-16 14:50:50 -0400, Tom Lane wrote:
> >> The intention was certainly always that it be atomic.  If it isn't
> >> we have got *big* trouble.
> 
> > We unfortunately do *know* that on several systems e.g. basebackup can read 
> > a
> > partially written control file, while the control file is being
> > updated.
> 
> Yeah, but can't we just retry that if we get a bad checksum?

Retry what/where precisely? We can avoid the issue in basebackup.c by taking
ControlFileLock in the right moment - but that doesn't address
pg_start/stop_backup based backups. Hence the patch in the referenced thread
moving to replacing the control file by atomic-rename if there are base
backups ongoing.


> What had better be atomic is the write to disk.

That is still true to my knowledge.


> Systems that can't manage POSIX semantics for concurrent reads and writes
> are annoying, but not fatal ...

I think part of the issue is that people don't agree on what POSIX says about
a read that's concurrent with a write... See e.g.
https://utcc.utoronto.ca/~cks/space/blog/unix/WriteNotVeryAtomic

Greetings,

Andres Freund




Re: race condition when writing pg_control

2024-05-16 Thread Tom Lane
Andres Freund  writes:
> On 2024-05-16 14:50:50 -0400, Tom Lane wrote:
>> The intention was certainly always that it be atomic.  If it isn't
>> we have got *big* trouble.

> We unfortunately do *know* that on several systems e.g. basebackup can read a
> partially written control file, while the control file is being
> updated.

Yeah, but can't we just retry that if we get a bad checksum?

What had better be atomic is the write to disk.  Systems that can't
manage POSIX semantics for concurrent reads and writes are annoying,
but not fatal ...

regards, tom lane
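The "just retry" idea suggested above can be sketched in toy form — a simple
CRC32-checked file standing in for pg_control (this is not pg_control's real
layout; it only illustrates the retry-on-bad-checksum loop):

```python
import struct, zlib

def write_control(path, payload):
    # Prepend a CRC32 of the payload, mimicking a checksummed control file.
    with open(path, "wb") as f:
        f.write(struct.pack("<I", zlib.crc32(payload)) + payload)

def read_control(path, retries=3):
    # Re-read on checksum mismatch, on the theory that a concurrent
    # writer was caught mid-update and a later read will see a
    # consistent copy.
    for _ in range(retries):
        with open(path, "rb") as f:
            data = f.read()
        (crc,) = struct.unpack("<I", data[:4])
        payload = data[4:]
        if zlib.crc32(payload) == crc:
            return payload
    raise IOError("incorrect checksum in control file after retries")
```

Of course, as noted above, this only helps if the write itself is atomic at
some granularity; it cannot rescue a reader if torn writes persist on disk.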




Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-05-16 Thread Jeff Davis
On Wed, 2024-05-15 at 16:31 -0700, Jeff Davis wrote:
> Even better would be if we could take into account partitioning. That
> might be out of scope for your current work, but it would be very
> useful. We could have a couple new GUCs like modify_table_buffer and
> modify_table_buffer_per_partition or something like that.

To expand on this point:

For heap, the insert buffer is only 1000 tuples, which doesn't take
much memory. But for an AM that does any significant reorganization of
the input data, the buffer may be much larger. For insert into a
partitioned table, that buffer could be multiplied across many
partitions, and start to be a real concern.

We might not need table AM API changes at all here beyond what v21
offers. The ModifyTableState includes the memory context, so that gives
the caller a way to know the memory consumption of a single partition's
buffer. And if it needs to free the resources, it can just call
modify_table_end(), and then _begin() again if more tuples hit that
partition.

So I believe what I'm asking for here is entirely orthogonal to the
current proposal.
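A toy sketch of that flush-and-restart pattern — PartitionBuffer, the
callback, and the memory cap here are all hypothetical stand-ins for the
proposed modify_table_begin()/modify_table_end() API:

```python
# Hypothetical simulation: flush a partition's insert buffer when its
# memory use crosses a cap, mirroring the end()/begin() cycle described
# above for freeing a single partition's buffer resources.
class PartitionBuffer:
    def __init__(self, flush_cb, mem_limit=1024):
        self.rows, self.mem, self.mem_limit = [], 0, mem_limit
        self.flush_cb = flush_cb

    def insert(self, row):
        self.rows.append(row)
        self.mem += len(row)
        if self.mem >= self.mem_limit:
            self.end()    # stand-in for modify_table_end(): flush + free
            self.begin()  # stand-in for modify_table_begin(): fresh buffer

    def begin(self):
        self.rows, self.mem = [], 0

    def end(self):
        if self.rows:
            self.flush_cb(self.rows)
        self.rows, self.mem = [], 0
```

With many partitions, the caller would track the sum of the per-partition
buffers and cycle the largest ones, which is the concern raised above.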

However, it got me thinking that we might not want to use work_mem for
controlling the heap's buffer size. Each AM is going to have radically
different memory needs, and may have its own (extension) GUCs to
control that memory usage, so they won't honor work_mem. We could
either have a separate GUC for the heap if it makes sense, or we could
just hard-code a reasonable value.

Regards,
Jeff Davis





Re: race condition when writing pg_control

2024-05-16 Thread Andres Freund
Hi,

On 2024-05-16 14:50:50 -0400, Tom Lane wrote:
> Nathan Bossart  writes:
> > I suspect it will be difficult to investigate this one too much further
> > unless we can track down a copy of the control file with the bad checksum.
> > Other than searching for any new code that isn't doing the appropriate
> > locking, maybe we could search the buildfarm for any other occurrences.  I
> also see some threads concerning whether the way we are reading/writing
> > the control file is atomic.
> 
> The intention was certainly always that it be atomic.  If it isn't
> we have got *big* trouble.

We unfortunately do *know* that on several systems e.g. basebackup can read a
partially written control file, while the control file is being
updated. Thomas addressed this partially for frontend code, but not yet for
backend code. See
https://postgr.es/m/CA%2BhUKGLhLGCV67NuTiE%3Detdcw5ChMkYgpgFsa9PtrXm-984FYA%40mail.gmail.com

Greetings,

Andres Freund




Re: race condition when writing pg_control

2024-05-16 Thread Tom Lane
Nathan Bossart  writes:
> I suspect it will be difficult to investigate this one too much further
> unless we can track down a copy of the control file with the bad checksum.
> Other than searching for any new code that isn't doing the appropriate
> locking, maybe we could search the buildfarm for any other occurrences.  I
> also see some threads concerning whether the way we are reading/writing
> the control file is atomic.

The intention was certainly always that it be atomic.  If it isn't
we have got *big* trouble.

regards, tom lane




Re: race condition when writing pg_control

2024-05-16 Thread Nathan Bossart
On Thu, May 16, 2024 at 12:19:22PM -0400, Melanie Plageman wrote:
> Today, after committing a3e6c6f, I saw recovery/018_wal_optimize.pl
> fail and see this message in the replica log [2].
> 
> 2024-05-16 15:12:22.821 GMT [5440][not initialized] FATAL:  incorrect
> checksum in control file
> 
> I'm pretty sure it's not related to my commit. So, I was looking for
> existing reports of this error message.

Yeah, I don't see how it could be related.

> It's a long shot, since 0001 and 0002 were already pushed, but this is
> the only recent report I could find of "FATAL:  incorrect checksum in
> control file" in pgsql-hackers or bugs archives.
> 
> I do see this thread from 2016 [3] which might be relevant because the
> reported bug was also on Windows.

I suspect it will be difficult to investigate this one too much further
unless we can track down a copy of the control file with the bad checksum.
Other than searching for any new code that isn't doing the appropriate
locking, maybe we could search the buildfarm for any other occurrences.  I
also see some threads concerning whether the way we are reading/writing
the control file is atomic.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Robert Haas
Hi,

The original intent of CommitFests, and of commitfest.postgresql.org
by extension, was to provide a place where patches could be registered
to indicate that they needed to be reviewed, thus enabling patch
authors and patch reviewers to find each other in a reasonably
efficient way. I don't think it's working any more. I spent a good
deal of time going through the CommitFest this week, and I didn't get
through a very large percentage of it, and what I found is that the
status of the patches registered there is often much messier than can
be captured by a simple "Needs Review" or "Waiting on Author," and the
number of patches that are actually in need of review is not all that
large. For example, there are:

- patches parked there by a committer who will almost certainly do
something about them after we branch
- patches parked there by a committer who probably won't do something
about them after we branch, but maybe they will, or maybe somebody
else will, and anyway this way we at least run CI
- patches parked there by a committer who may well do something about
them before we even branch, because they're not actually subject to
the feature freeze
- patches that we've said we don't want but the author thinks we do
(sometimes I agree with the author, sometimes not)
- patches that have long-unresolved difficulties which the author
either doesn't know how to solve or is in no hurry to solve
- patches that have already been reviewed by multiple people, often
including several committers, and which have been updated multiple
times, but for one reason or another, not committed
- patches that actually do need to be reviewed

What's a bit depressing is that this last category is a relatively
small percentage of the total. If you'd like to sit down and review a
bunch of patches, you'll probably spend AT LEAST as much time trying
to identify which CommitFest entries are worth your time as you will
actually reviewing. I suspect you could easily spend 2 or 3 times as
much time finding things to review as actually reviewing them,
honestly. And the chances that you're going to find the things to
review that most need your attention are pretty much nil. You could
happen just by chance to discover a patch that was worth weeks of your
time to review, but you could also miss that patch forever amidst all
the clutter.

I think there are a couple of things that have led to this state of
affairs. First, we got tired of making people mad by booting their
stuff out of the CommitFest, so we mostly just stopped doing it, even
if it had 0% chance of being committed this CommitFest, and really
even if it had a 0% chance of being committed ever. Second, we added
CI, which means that there is positive value to registering the patch
in the CommitFest even if committing it is not in the cards. And those
things together have created a third problem, which is that the list
is now so long and so messy that even the CommitFest managers probably
don't manage to go through the whole thing thoroughly in a month.

So, our CommitFest application has turned into a patch tracker. IMHO,
patch trackers intrinsically tend to suck, because they fill up with
garbage that nobody cares about, and nobody wants to do the colossal
amount of work that it takes to maintain them. But our patch tracker
sucks MORE, because it's not even intended to BE a general-purpose
patch tracker. I'm not saying that replacing it with (let me show how
old I am) bugzilla or whatever the hip modern equivalent of that may
be these days is the right thing to do, but it's probably worth
considering. If we decide to roll our own, that might be OK too, but
we have to come up with some way of organizing this stuff that's
better than what we have today, some way that actually lets you find
the stuff that you care about.

To give just one example that I think highlights the issues pretty
well, consider the "Refactoring" section of the current CommitFest.
There are 24 patches in there, and 13 of them are by committers. Now,
maybe some of those patches are things that the committer really wants
someone else to review, e.g.
https://commitfest.postgresql.org/48/3998/ seems like it might be
that. On the other hand, that one could also just be an idea Thomas
had that he doesn't really intend to pursue even if the reviews are
absolutely glowing, so maybe it's not worth spending time on after
all. Then there are things that are probably 100% likely to get
committed unless somebody objects, so I shouldn't bother looking at
them unless I want to object, e.g.
https://commitfest.postgresql.org/48/4939/ seems like it's probably
that. And, also, regardless of authorship, some of these patches have
already had a great deal of discussion, and some have had none, and
you can sort of tell that from looking at the time the patch was
created vs. the last activity, but it's really not that obvious. So
overall it's just really unclear where to spend time.

I wonder what ideas people have for improving 

Re: Statistics Import and Export

2024-05-16 Thread Jeff Davis
On Thu, 2024-05-16 at 05:25 -0400, Corey Huinker wrote:
> 
> Per previous comments, it was suggested by others that:
> 
> - having them in SECTION_NONE was a grave mistake
> - Everything that could belong in SECTION_DATA should, and the rest
> should be in SECTION_POST_DATA

I don't understand the gravity of the choice here: what am I missing?

To be clear: I'm not arguing against it, but I'd like to understand it
better. Perhaps it has to do with the relationship between the sections
and the dependencies?

Regards,
Jeff Davis





Re: [UNVERIFIED SENDER] pg_upgrade can result in early wraparound on databases with high transaction load

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 5 Jul 2022, at 18:59, Tom Lane  wrote:
>> Given the lack of field complaints, it's probably not worth trying
>> to do anything to restore that capability.  But we really ought to
>> update pg_upgrade's code and docs in pre-v15 branches to say that
>> the minimum supported source version is 9.0.

> (reviving an old thread from the TODO)

> Since we never got around to doing this we still refer to 8.4 as a possible
> upgrade path in v14 and older.

Oh, yeah, that seems to have fallen through a crack.

> The attached takes the conservative approach of raising the minimum supported
> version to 9.0 while leaving the code to handle 8.4 in place.  While it can be
> removed, the risk/reward tradeoff of gutting code in backbranches doesn't seem
> appealing since the code will be unreachable with this check anyways.

Yeah, it's not worth working harder than this.  I do see one typo
in your comment: s/supported then/supported when/.  LGTM otherwise.

regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Jacob Burroughs
On Thu, May 16, 2024 at 6:57 AM Robert Haas  wrote:
>
> Ugh, it's so hard to communicate clearly about this stuff. I didn't
> really have any thought that we'd ever try to handle something as
> complicated as compression using ParameterSet. I tend to agree that
> for compression I'd like to see the startup packet contain more than
> _pq_.compression=1, but I'm not sure what would happen after that
> exactly. If the client asks for _pq_.compression=lz4 and the server
> tells the client that it doesn't understand _pq_.compression at all,
> then everybody's on the same page: no compression. But, if the server
> understands the option but isn't OK with the proposed value, what
> happens then? Does it send a NegotiateCompressionType message after
> the NegotiateProtocolVersion, for example? That seems like it could
> lead to the client having to be prepared for a lot of NegotiateX
> messages somewhere down the road.
>
> I think at some point in the past we had discussed having the client
> list all the algorithms it supported in the argument to
> _pq_.compression, and then the server would respond with the algorithm
> it wanted to use, or maybe a list of algorithms that it could allow, and
> then we'd go from there. But I'm not entirely sure if that's the right
> idea, either.

As currently implemented [1], the client sends the server the list of
all compression algorithms it is willing to accept, and the server can
use one of them.  If the server knows what `_pq_.compression` means
but doesn't actually support any compression, it will both send the
client its empty list of supported algorithms and just never send any
compressed messages, and everyone involved will be (relatively) happy.
There is a libpq function that a client can use to check what
compression is in use if a client *really* doesn't want to continue
with the conversation without compression, but 99% of the time I can't
see why a client wouldn't prefer to continue using a connection with
whatever compression the server supports (or even none) without more
explicit negotiation.  (Unlike TLS, where automagically picking
between using and not using TLS has strange security implications and
effects, compression is a convenience feature for everyone involved.)

> Changing compression algorithms in mid-stream is tricky, too. If I
> tell the server "hey, turn on server-to-client compression!" then I
> need to be able to identify where exactly that happens. Any messages
> already sent by the server and not yet processed by me, or any
> messages sent after that but before the server handles my request, are
> going to be uncompressed. Then, at some point, I'll start getting
> compressed data. If the compressed data is framed inside some message
> type created for that purpose, like I get a CompressedMessage message
> and then I decompress to get the actual message, this is simpler to
> manage. But even then, it's tricky if the protocol shifts. If I tell
> the server, you know what, gzip was a bad choice, I want lz4, I'll
> need to know where the switch happens to be able to decompress
> properly.
>
> I don't know if we want to support changing compression algorithms in
> mid-stream. I don't think there's any reason we can't, but it might be
> a bunch of work for something that nobody really cares about. Not
> sure.

As the protocol layer is currently designed [1], it explicitly makes
it very easy to change/restart compression streams, specifically for
this use case (and in particular for the general connection pooler use
case).  Compressed data is already framed in a `CompressedData`
message, and that message has a header byte that corresponds to an
enum value for which algorithm is currently in use.  Any time the
compression stream was restarted by the sender, the first
`CompressedData` message will set that byte, and then the client will
restart its decompression stream with the chosen algorithm from that
point.  For `CompressedData` messages that continue using the
already-established stream, the byte is simply set to 0.  (This is
also how the "each side sends a list" form of negotiation is able to
work without additional round trips, as the `CompressedData` framing
itself communicates which compression algorithm has been selected.)

[1] 
https://www.postgresql.org/message-id/CACzsqT5-7xfbz%2BSi35TBYHzerNX3XJVzAUH9AewQ%2BPp13fYBoQ%40mail.gmail.com
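As a rough illustration of the framing described above — the enum values,
the zlib usage, and the sync-flush framing here are assumptions for the
sketch, not the patch's actual wire format:

```python
import zlib

ALG_CONTINUE, ALG_ZLIB = 0, 1  # hypothetical algorithm enum values

class Sender:
    def __init__(self):
        self._z = None

    def send(self, payload, restart=False):
        # Starting (or restarting) the stream sets the algorithm byte;
        # continuation frames use 0, as described above.
        if self._z is None or restart:
            self._z = zlib.compressobj()
            alg = ALG_ZLIB
        else:
            alg = ALG_CONTINUE
        body = self._z.compress(payload) + self._z.flush(zlib.Z_SYNC_FLUSH)
        return bytes([alg]) + body

class Receiver:
    def __init__(self):
        self._z = None

    def recv(self, frame):
        alg, body = frame[0], frame[1:]
        if alg != ALG_CONTINUE:
            # Sender restarted its stream: restart decompression too.
            self._z = zlib.decompressobj()
        return self._z.decompress(body)
```

Because the algorithm marker travels inside the frame itself, the receiver
needs no separate signal to know where a new stream begins.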

-- 
Jacob Burroughs | Staff Software Engineer
E: jburrou...@instructure.com




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Robert Haas
On Thu, May 16, 2024 at 12:09 PM Jelte Fennema-Nio  wrote:
> I don't really understand the benefit of your proposal over option 2
> that I proposed. Afaict you're proposing that for e.g. compression we
> first set _pq_.supports_compression=1 in the StartupMessage and use
> that  to do feature detection, and then after we get the response we
> send ParameterSet("compression", "gzip"). To me this is pretty much
> identical to option 2, except that it introduces an extra round trip
> for no benefit (as far as I can see). Why not go for option 2 and send
> _pq_.compression=gzip in the StartupMessage directly.

Ugh, it's so hard to communicate clearly about this stuff. I didn't
really have any thought that we'd ever try to handle something as
complicated as compression using ParameterSet. I tend to agree that
for compression I'd like to see the startup packet contain more than
_pq_.compression=1, but I'm not sure what would happen after that
exactly. If the client asks for _pq_.compression=lz4 and the server
tells the client that it doesn't understand _pq_.compression at all,
then everybody's on the same page: no compression. But, if the server
understands the option but isn't OK with the proposed value, what
happens then? Does it send a NegotiateCompressionType message after
the NegotiateProtocolVersion, for example? That seems like it could
lead to the client having to be prepared for a lot of NegotiateX
messages somewhere down the road.

I think at some point in the past we had discussed having the client
list all the algorithms it supported in the argument to
_pq_.compression, and then the server would respond with the algorithm
it wanted to use, or maybe a list of algorithms that it could allow, and
then we'd go from there. But I'm not entirely sure if that's the right
idea, either.

Changing compression algorithms in mid-stream is tricky, too. If I
tell the server "hey, turn on server-to-client compression!" then I
need to be able to identify where exactly that happens. Any messages
already sent by the server and not yet processed by me, or any
messages sent after that but before the server handles my request, are
going to be uncompressed. Then, at some point, I'll start getting
compressed data. If the compressed data is framed inside some message
type created for that purpose, like I get a CompressedMessage message
and then I decompress to get the actual message, this is simpler to
manage. But even then, it's tricky if the protocol shifts. If I tell
the server, you know what, gzip was a bad choice, I want lz4, I'll
need to know where the switch happens to be able to decompress
properly.

I don't know if we want to support changing compression algorithms in
mid-stream. I don't think there's any reason we can't, but it might be
a bunch of work for something that nobody really cares about. Not
sure.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: Docs: Always use true|false instead of sometimes on|off for the subscription options

2024-05-16 Thread Tom Lane
Peter Smith  writes:
> Yeah, I had a vested interest in this one place because I've been
> reviewing the other thread [1] that was mentioned above. If that other
> thread chooses "true|false" then putting "true|false" adjacent to
> another "on|off" will look silly. But if that other thread adopts the
> existing 'failover=on|off' values then it will diverge even further
> from being consistent with the CREATE SUBSCRIPTION page.

It's intentional that we allow more than one spelling of boolean
values.  I can see some value in being consistent within nearby
examples, but I don't agree at all that it should be uniform
across all our docs.

regards, tom lane




Re: allow changing autovacuum_max_workers without restarting

2024-05-16 Thread Imseih (AWS), Sami
>>> That's true, but using a hard-coded limit means we no longer need to add a
>>> new GUC. Always allocating, say, 256 slots might require a few additional
>>> kilobytes of shared memory, most of which will go unused, but that seems
>>> unlikely to be a problem for the systems that will run Postgres v18.
>>
>> I agree with this.


> Here's what this might look like. I chose an upper limit of 1024, which
> seems like it "ought to be enough for anybody," at least for now.

I thought 256 was a good enough limit. In practice, I doubt anyone will 
benefit from more than a few dozen autovacuum workers. 
I think 1024 is way too high to even allow.

Besides that the overall patch looks good to me, but I have
some comments on the documentation.

I don't think combining 1024 + 5 = 1029 in the docs is a good idea.
Breaking down the allotment and using the name of the constant 
is much clearer.

I suggest 
" max_connections + max_wal_senders + max_worker_processes + 
AUTOVAC_MAX_WORKER_SLOTS + 5"

and in other places in the docs, we should mention the actual 
value of AUTOVAC_MAX_WORKER_SLOTS. Maybe in the 
below section?

Instead of:
-() and allowed background
+(1024) and allowed background

do something like:
-() and allowed background
+   AUTOVAC_MAX_WORKER_SLOTS  (1024) and allowed background

Also,  replace the 1024 here with AUTOVAC_MAX_WORKER_SLOTS.

+max_wal_senders,
+plus max_worker_processes, plus 1024 for autovacuum
+worker processes, plus one extra for each 16


Also, not sure if I am mistaken here, but the "+ 5" in the existing docs
seems wrong.
 
If it refers to NUM_AUXILIARY_PROCS defined in 
include/storage/proc.h, it should be a "6":

#define NUM_AUXILIARY_PROCS 6

This is not a consequence of this patch, and can be dealt with
in a separate thread if my understanding is correct.
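A quick arithmetic sketch of the total being discussed, using the quoted
NUM_AUXILIARY_PROCS of 6 and the suggested 256-slot cap (both values are
assumptions for illustration, not what any patch has settled on):

```python
# Hypothetical illustration of the connection-slot/semaphore formula
# discussed above; the constants are assumptions for the example.
NUM_AUXILIARY_PROCS = 6
AUTOVAC_MAX_WORKER_SLOTS = 256  # suggested cap, not the patch's 1024

def semaphore_count(max_connections, max_wal_senders, max_worker_processes):
    return (max_connections + max_wal_senders + max_worker_processes
            + AUTOVAC_MAX_WORKER_SLOTS + NUM_AUXILIARY_PROCS)
```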


Regards,

Sami 




Re: broken JIT support on Fedora 40

2024-05-16 Thread Dmitry Dolgov
> On Tue, May 14, 2024 at 11:09:39AM -0400, Robert Haas wrote:
> On Tue, Apr 9, 2024 at 8:44 PM Thomas Munro  wrote:
> > I pushed the illegal attribute fix though.  Thanks for the detective work!
>
> This was commit 53c8d6c9f157f2bc8211b8de02417e55fefddbc7 and as I
> understand it that fixed the issue originally reported on this thread.
>
> Therefore, I have marked https://commitfest.postgresql.org/48/4917/ as
> Committed.
>
> If that's not correct, please feel free to fix. If there are other
> issues that need to be patched separately, please consider opening a
> new CF entry for those issues once a patch is available.

Thanks, that's correct. I think the only thing left is to add a verifier
pass, which everybody seems to be agreed is nice to have. The plan is to
add it only to master without back-patching. I assume Thomas did not
have time for that yet, so I've added the latest suggestions into his
patch, and going to open a CF item to not forget about it.
From 67d955d9b69ea5204f02ace1a27ee4938b0dfd3a Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 9 Apr 2024 18:57:48 +1200
Subject: [PATCH v1] Run LLVM verify pass on all generated IR.

We fixed a recent problem that crashed LLVM 18, but Dmitry pointed out
that we'd have known about that all along if we'd automatically run the
verify pass on the IR we generate.

Turn that on in assertion builds.  That reveals one other complaint
about a name, which is also fixed here.

Suggested-by: Dmitry Dolgov <9erthali...@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRACpVFr7LMdVYENUkScG5FCYMZDDdSGNU-tch%2Bw98OxYg%40mail.gmail.com
---
 src/backend/jit/llvm/llvmjit.c  | 5 +
 src/backend/jit/llvm/llvmjit_expr.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 1d439f24554..0bafe309bb6 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -714,6 +714,11 @@ llvm_optimize_module(LLVMJitContext *context, LLVMModuleRef module)
 	LLVMPassBuilderOptionsSetDebugLogging(options, 1);
 #endif
 
+	/* In assertion builds, run the LLVM verify pass. */
+#ifdef USE_ASSERT_CHECKING
+	LLVMPassBuilderOptionsSetVerifyEach(options, true);
+#endif
+
 	LLVMPassBuilderOptionsSetInlinerThreshold(options, 512);
 
 	err = LLVMRunPasses(module, passes, NULL, options);
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 9e0efd26687..cd956fa7e38 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -2765,7 +2765,7 @@ create_LifetimeEnd(LLVMModuleRef mod)
 	LLVMContextRef lc;
 
 	/* variadic pointer argument */
-	const char *nm = "llvm.lifetime.end.p0i8";
+	const char *nm = "llvm.lifetime.end.p0";
 
 	fn = LLVMGetNamedFunction(mod, nm);
 	if (fn)

base-commit: a3e6c6f929912f928fa405909d17bcbf0c1b03ee
-- 
2.41.0



Re: race condition when writing pg_control

2024-05-16 Thread Melanie Plageman
On Sun, Jun 7, 2020 at 10:49 PM Thomas Munro  wrote:
>
> On Wed, Jun 3, 2020 at 2:03 PM Michael Paquier  wrote:
> > On Wed, Jun 03, 2020 at 10:56:13AM +1200, Thomas Munro wrote:
> > > Sorry for my radio silence, I got tangled up with a couple of
> > > conferences.  I'm planning to look at 0001 and 0002 shortly.
> >
> > Thanks!
>
> I pushed 0001 and 0002, squashed into one commit.  I'm not sure about
> 0003.  If we're going to do that, wouldn't it be better to just
> acquire the lock in that one extra place in StartupXLOG(), rather than
> introducing the extra parameter?

Today, after committing a3e6c6f, I saw recovery/018_wal_optimize.pl
fail and see this message in the replica log [2].

2024-05-16 15:12:22.821 GMT [5440][not initialized] FATAL:  incorrect
checksum in control file

I'm pretty sure it's not related to my commit. So, I was looking for
existing reports of this error message.

It's a long shot, since 0001 and 0002 were already pushed, but this is
the only recent report I could find of "FATAL:  incorrect checksum in
control file" in pgsql-hackers or bugs archives.

I do see this thread from 2016 [3] which might be relevant because the
reported bug was also on Windows.

- Melanie

[1] https://cirrus-ci.com/task/4626725689098240
[2] 
https://api.cirrus-ci.com/v1/artifact/task/4626725689098240/testrun/build/testrun/recovery/018_wal_optimize/log/018_wal_optimize_node_replica.log
[3] 
https://www.postgresql.org/message-id/flat/CAEepm%3D0hh_Dvd2Q%2BfcjYpkVzSoNX2%2Bf167cYu5nwu%3Dqh5HZhJw%40mail.gmail.com#042e9ec55c782370ab49c3a4ef254f4a




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Jelte Fennema-Nio
On Thu, 16 May 2024 at 17:29, Robert Haas  wrote:
> You're probably not going to like this answer, but I feel like this is
> another sign that you're trying to use the protocol extensibility
> facilities in the wrong way. In my first reply to the thread, I
> proposed having the client send _pq_.protocol_set=1 in that startup
> message. If the server accepts that message, then you can send
> whatever set of message types are associated with that option, which
> could include messages to list known settings, as well as messages to
> set them. Alternatively, if we used a wire protocol bump for this, you
> could request version 3.1 and everything that I just said still
> applies. In other words, I feel that if you had adopted the design
> that I proposed back in March, or some variant of it, the problem
> you're having now wouldn't exist.

I don't really understand the benefit of your proposal over option 2
that I proposed. Afaict you're proposing that for e.g. compression we
first set _pq_.supports_compression=1 in the StartupMessage and use
that  to do feature detection, and then after we get the response we
send ParameterSet("compression", "gzip"). To me this is pretty much
identical to option 2, except that it introduces an extra round trip
for no benefit (as far as I can see). Why not go for option 2 and send
_pq_.compression=gzip in the StartupMessage directly.

> IMHO, we need to negotiate the language that we're going to use to
> communicate before we start communicating. We should find out which
> protocol version we're using, and what protocol options are accepted,
> based on sending a StartupMessage and receiving a reply. Then, after
> that, having established a common language, we can do whatever. I
> think you're trying to meld those two steps into one, which is
> understandable from the point of view of saving a round trip, but I
> just don't see it working out well.

I think not increasing the number of needed round trips in the startup
of a connection is extremely important. I think it's so important that
I honestly don't think we should merge a protocol change that
introduces an extra round trip without a VERY good reason, and this
round trip should only be needed when actually using the feature.

> What we can do in the startup
> message is extremely limited, because any startup messages we generate
> can't break existing servers, and also because of the concerns I
> raised earlier about leaving room for more extension in the future.
> Once we get past the startup message negotiation, the sky's the limit!

Sure, what we can do in the StartupMessage is extremely limited, but
what it does allow is passing arbitrary key value pairs to the server.
But by only using _pq_.feature_name=1, we're effectively only using
the key part of the key value pair. Limiting ourselves even more, by
throwing half of our communication channel away, seems like a bad idea
to me. But maybe I'm just not understanding the problem you're seeing
with using the value too.
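For reference, parameters like _pq_.compression would travel in the ordinary
v3 StartupMessage key/value list; a minimal encoder sketch of that packet
format (_pq_.compression itself is the proposal under discussion, not an
existing parameter):

```python
import struct

def build_startup_message(params):
    # v3 StartupMessage: int32 length (self-inclusive), int32 protocol
    # version (3.0 = 196608), then NUL-terminated key/value pairs and a
    # final NUL terminator.
    body = b""
    for k, v in params.items():
        body += k.encode() + b"\x00" + v.encode() + b"\x00"
    body += b"\x00"
    payload = struct.pack("!I", 196608) + body
    return struct.pack("!I", len(payload) + 4) + payload
```

The point being made above is that the value half of each pair is already
available at startup, so a server that understands the key can act on the
value without an extra round trip.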




Re: First draft of PG 17 release notes

2024-05-16 Thread Andrew Dunstan



On 2024-05-14 Tu 20:39, Bruce Momjian wrote:

On Sat, May 11, 2024 at 03:32:55PM -0400, Andrew Dunstan wrote:

On 2024-05-09 Th 00:03, Bruce Momjian wrote:

I have committed the first draft of the PG 17 release notes;  you can
see the results here:

https://momjian.us/pgsql_docs/release-17.html

It will be improved until the final release.  The item count is 188,
which is similar to recent releases:

release-10:  189
release-11:  170
release-12:  180
release-13:  178
release-14:  220
release-15:  184
release-16:  206
release-17:  188

I welcome feedback.  For some reason it was an easier job than usual.


I don't like blowing my own horn but I feel commit 3311ea86ed "Introduce a
non-recursive JSON parser" should be in the release notes. This isn't
something that's purely internal, but it could be used by an extension or a
client program to parse JSON documents that are too large to handle with the
existing API.

Maybe "Introduce an incremental JSON parser" would have been a better
headline.

Well, this gets into a level of detail that is beyond the average
reader.  I think at that level people will need to read the git logs or
review the code.  Do we use it for anything yet?



Yes, certainly, it's used in handling backup manifests. Without it we 
can't handle huge manifests. See commits ea7b4e9a2a and 222e11a10a.


Other uses are in the works.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-05-16 Thread David G. Johnston
On Wed, May 15, 2024 at 8:46 AM Robert Haas  wrote:

> On Thu, Apr 4, 2024 at 12:41 AM Erik Wienhold  wrote:
> > Thanks, fixed in v4.  Looks like American English prefers that comma and
> > it's also more common in our docs.
>
> Reviewing this patch:
>
> -  Creates a typed table, which takes its
> -  structure from the specified composite type (name optionally
> -  schema-qualified).  A typed table is tied to its type; for
> -  example the table will be dropped if the type is dropped
> -  (with DROP TYPE ... CASCADE).
> +  Creates a typed table, which takes its
> structure
> +  from an existing (name optionally schema-qualified) stand-alone
> composite
> +  type (i.e., created using ) though
> it
> +  still produces a new composite type as well.  The table will have
> +  a dependency on the referenced type such that cascaded alter and
> drop
> +  actions on the type will propagate to the table.
>
> It would be better if this diff didn't reflow the unchanged portions
> of the paragraph.
>
> I agree that it's a good idea to mention that the table must have been
> created using CREATE TYPE .. AS here, but I disagree with the rest of
> the rewording in this hunk. I think we could just add "creating using
> CREATE TYPE" to the end of the first sentence, with an xref, and leave
> the rest as it is.



> I don't see a reason to mention that the typed
> table also spawns a rowtype; that's just standard CREATE TABLE
> behavior and not really relevant here.


I figured it wouldn't be immediately obvious that the system would create a
second type with identical structure.  Of course, in order for SELECT tbl
FROM tbl; to work it must indeed do so.  I'm not married to pointing out
this dynamic explicitly though.


> And I don't understand what the
> rest of the rewording does for us.
>

It calls out the explicit behavior that the table's columns can change due
to actions on the underlying type.  Mentioning this unique behavior seems
worth a sentence.
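For illustration, the cascading behavior being described can be shown with a short example (object names invented):

```sql
-- Create a stand-alone composite type, then a typed table bound to it.
CREATE TYPE person_t AS (name text, age integer);
CREATE TABLE people OF person_t;

-- The table's columns track the type: altering the type cascades.
ALTER TYPE person_t ADD ATTRIBUTE email text CASCADE;
-- people now has an email column as well.

-- Dropping the type drops the dependent table.
DROP TYPE person_t CASCADE;  -- also drops table people
```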


>   
> -  When a typed table is created, then the data types of the
> -  columns are determined by the underlying composite type and are
> -  not specified by the CREATE TABLE command.
> +  A typed table always has the same column names and data types as the
> +  type it is derived from, and you cannot specify additional columns.
>But the CREATE TABLE command can add defaults
> -  and constraints to the table and can specify storage parameters.
> +  and constraints to the table, as well as specify storage parameters.
>   
>
> I don't see how this is better.
>

I'll agree this is more of a stylistic change, but mainly because the talk
about data types reasonably implies the other items the patch explicitly
mentions - names and additional columns.


> - errmsg("type %s is not a composite type",
> + errmsg("type %s is not a stand-alone composite type",
>
> I agree with Peter's complaint that people aren't going to understand
> what a stand-alone composite type means when they see the revised
> error message; to really help people, we're going to need to do better
> than this, I think.
>
>
We have a glossary.

That said, leave the wording as-is and add a conditional hint: The
composite type must not also be a table.

David J.


Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Robert Haas
On Thu, May 16, 2024 at 5:22 AM Jelte Fennema-Nio  wrote:
> If we're not setting the protocol parameter in the StartupMessage,
> there's currently no way for us to know if the protocol parameter is
> supported by the server. If protocol parameters were unchangeable then
> that would be fine, but the whole point of introducing ParameterSet is
> to make it possible to change protocol parameters on an existing
> connection. Having the function SupportsProtocolCompression return
> false, even though you can enable compression just fine, only because
> we didn't ask for compression when connecting seems quite silly and
> confusing.

You're probably not going to like this answer, but I feel like this is
another sign that you're trying to use the protocol extensibility
facilities in the wrong way. In my first reply to the thread, I
proposed having the client send _pq_.protocol_set=1 in that startup
message. If the server accepts that message, then you can send
whatever set of message types are associated with that option, which
could include messages to list known settings, as well as messages to
set them. Alternatively, if we used a wire protocol bump for this, you
could request version 3.1 and everything that I just said still
applies. In other words, I feel that if you had adopted the design
that I proposed back in March, or some variant of it, the problem
you're having now wouldn't exist.

IMHO, we need to negotiate the language that we're going to use to
communicate before we start communicating. We should find out which
protocol version we're using, and what protocol options are accepted,
based on sending a StartupMessage and receiving a reply. Then, after
that, having established a common language, we can do whatever. I
think you're trying to meld those two steps into one, which is
understandable from the point of view of saving a round trip, but I
just don't see it working out well. What we can do in the startup
message is extremely limited, because any startup messages we generate
can't break existing servers, and also because of the concerns I
raised earlier about leaving room for more extension in the future.
Once we get past the startup message negotiation, the sky's the limit!

-- 
Robert Haas
EDB: http://www.enterprisedb.com
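As a sketch of the negotiation step described above: when the client requests a protocol version or `_pq_.` options the server does not fully support, the server answers the StartupMessage with a NegotiateProtocolVersion message. A minimal Python parser for that reply, following the documented layout ('v', Int32 length, Int32 newest supported minor version, Int32 count of unrecognized options, then that many NUL-terminated option names), might look like this:

```python
import struct

def parse_negotiate_protocol_version(msg):
    """Parse a NegotiateProtocolVersion ('v') message and return the
    newest minor protocol version the server supports plus the list of
    _pq_ options it did not recognize."""
    assert msg[0:1] == b"v", "not a NegotiateProtocolVersion message"
    newest_minor, n_unsupported = struct.unpack("!ii", msg[5:13])
    options, pos = [], 13
    for _ in range(n_unsupported):
        end = msg.index(b"\x00", pos)
        options.append(msg[pos:end].decode())
        pos = end + 1
    return newest_minor, options
```

The client learns in a single round trip both which protocol version it actually got and which requested options the server ignored, which is the "establish a common language first" step before any further messages are exchanged.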



