Re: race condition when writing pg_control

2024-05-17 Thread Thomas Munro
On Fri, May 17, 2024 at 4:46 PM Thomas Munro  wrote:
> The specific problem here is that LocalProcessControlFile() runs in
> every launched child for EXEC_BACKEND builds.  Windows uses
> EXEC_BACKEND, and Windows' NTFS file system is one of the two file
> systems known to this list to have the concurrent read/write data
> mashing problem (the other being ext4).

Phngh... this is surprisingly difficult to fix.

Things that don't work: we would "just" need to acquire ControlFileLock
while reading the file or examining the object in shared memory, or
receive a copy of it through the EXEC_BACKEND BackendParameters that
was taken while holding the lock.  But the current location of this
code in child startup is too early to use LWLocks, and the postmaster
can't acquire locks either, so it can't even safely take a copy to
pass on.  You could reorder startup so that children are allowed to
acquire LWLocks at that point, but then you'd need to convince
yourself that there is no danger of breaking some ordering requirement
in external preload libraries, and figure out what to do about
children that don't even attach to shared memory.  Maybe that's
possible, but it doesn't sound like a good idea to back-patch.

First idea I've come up with to avoid all of that: pass a copy of
the "proto-controlfile", to coin a term for the one read early in
postmaster startup by LocalProcessControlFile().  As far as I know,
the only reason we need it is to suck some settings out of it that
don't change while a cluster is running (mostly can't change after
initdb, and checksums can only be {en,dis}abled while down).  Right?
Children can just "import" that sucker instead of calling
LocalProcessControlFile() to figure out the size of WAL segments yada
yada, I think?  Later they will attach to the real one in shared
memory for all future purposes, once normal interlocking is allowed.

I dunno.  Draft patch attached.  Better plans welcome.  This passes CI
on Linux systems afflicted by EXEC_BACKEND, and Windows.  Thoughts?
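
In rough outline, the export/import pair would be something like the
following (a sketch only: the function names come from the commit message,
but the bodies here are guesses, and the attached draft patch below is the
authoritative version):

#ifdef EXEC_BACKEND
/* Postmaster side: copy the proto-controlfile into BackendParameters. */
void
ExportProtoControlFile(ControlFileData *copy)
{
	Assert(ProtoControlFile != NULL);
	memcpy(copy, ProtoControlFile, sizeof(ControlFileData));
}

/* Child side: adopt the copy instead of re-reading pg_control from disk. */
void
ImportProtoControlFile(const ControlFileData *copy)
{
	Assert(ControlFile == NULL);
	ControlFile = palloc(sizeof(ControlFileData));
	memcpy(ControlFile, copy, sizeof(ControlFileData));
	ScanControlFile();			/* derive wal_segment_size etc. */
}
#endif
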
From 48c2de14bd9368b4708a99ecbb75452dc327e608 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sat, 18 May 2024 13:41:09 +1200
Subject: [PATCH v1 1/3] Fix pg_control corruption in EXEC_BACKEND startup.

When backend processes were launched in EXEC_BACKEND builds, they would
run LocalProcessControlFile() to read in pg_control and extract several
important settings.

This happens too early to acquire ControlFileLock, and the postmaster is
also not allowed to acquire ControlFileLock, so it can't safely take a
copy to give to the child.

Instead, pass down the "proto-controlfile" that was read by the
postmaster in LocalProcessControlFile().  Introduce functions
ExportProtoControlFile() and ImportProtoControlFile() to allow that.
Subprocesses will extract information from that, and then later attach
to the current control file in shared memory.

Reported-by: Melanie Plageman  per Windows CI failure
Discussion: https://postgr.es/m/CAAKRu_YNGwEYrorQYza_W8tU%2B%3DtoXRHG8HpyHC-KDbZqA_ZVSA%40mail.gmail.com
---
 src/backend/access/transam/xlog.c   | 46 +++--
 src/backend/postmaster/launch_backend.c | 19 ++
 src/include/access/xlog.h   |  5 +++
 3 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 330e058c5f2..b69a0d95af9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -568,6 +568,10 @@ static WALInsertLockPadded *WALInsertLocks = NULL;
  */
 static ControlFileData *ControlFile = NULL;
 
+#ifdef EXEC_BACKEND
+static ControlFileData *ProtoControlFile = NULL;
+#endif
+
 /*
  * Calculate the amount of space left on the page after 'endptr'. Beware
  * multiple evaluation!
@@ -686,6 +690,7 @@ static bool PerformRecoveryXLogAction(void);
 static void InitControlFile(uint64 sysidentifier);
 static void WriteControlFile(void);
 static void ReadControlFile(void);
+static void ScanControlFile(void);
 static void UpdateControlFile(void);
 static char *str_time(pg_time_t tnow);
 
@@ -4309,9 +4314,7 @@ WriteControlFile(void)
 static void
 ReadControlFile(void)
 {
-	pg_crc32c	crc;
 	int			fd;
-	static char wal_segsz_str[20];
 	int			r;
 
 	/*
@@ -4344,6 +4347,15 @@ ReadControlFile(void)
 
 	close(fd);
 
+	ScanControlFile();
+}
+
+static void
+ScanControlFile(void)
+{
+	static char wal_segsz_str[20];
+	pg_crc32c	crc;
+
 	/*
 	 * Check for expected pg_control format version.  If this is wrong, the
 	 * CRC check will likely fail because we'll be checking the wrong number
@@ -4815,8 +4827,33 @@ LocalProcessControlFile(bool reset)
 	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
+
+#ifdef EXEC_BACKEND
+	/* We need to be able to give this to subprocesses. */
+	ProtoControlFile = ControlFile;
+#endif
+}
+
+#ifdef EXEC_BACKEND
+void
+ExportProtoControlFile(ControlFileData *copy)
+{
+	*copy = 

Re: Refactoring backend fork+exec code

2024-05-17 Thread Thomas Munro
On Mon, Mar 18, 2024 at 10:41 PM Heikki Linnakangas  wrote:
> Committed, with some final cosmetic cleanups. Thanks everyone!

Nitpicking from UBSan with EXEC_BACKEND on Linux (line numbers may be
a bit off, from a branch of mine):

../src/backend/postmaster/launch_backend.c:772:2: runtime error: null
pointer passed as argument 2, which is declared to never be null
==13303==Using libbacktrace symbolizer.
#0 0x564b0202 in save_backend_variables
../src/backend/postmaster/launch_backend.c:772
#1 0x564b0242 in internal_forkexec
../src/backend/postmaster/launch_backend.c:311
#2 0x564b0bdd in postmaster_child_launch
../src/backend/postmaster/launch_backend.c:244
#3 0x564b3121 in StartChildProcess
../src/backend/postmaster/postmaster.c:3928
#4 0x564b933a in PostmasterMain
../src/backend/postmaster/postmaster.c:1357
#5 0x562de4ad in main ../src/backend/main/main.c:197
#6 0x7667ad09 in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
#7 0x55e34279 in _start
(/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8e0279)

This silences it:

-   memcpy(param->startup_data, startup_data, startup_data_len);
+   if (startup_data_len > 0)
+   memcpy(param->startup_data, startup_data, startup_data_len);

(I found that out by testing EXEC_BACKEND on CI.  I also learned that
the Mac and FreeBSD tasks fail with EXEC_BACKEND because of SysV shmem
bleating.  We probably should go and crank up the relevant sysctls in
the .cirrus.tasks.yml...)




Re: libpq compression (part 3)

2024-05-17 Thread Jacob Burroughs
On Fri, May 17, 2024 at 11:40 AM Robert Haas  wrote:
>
> To be clear, I am not arguing that it should be the receiver's choice.
> I'm arguing it should be the client's choice, which means the client
> decides what it sends and also tells the server what to send to it.
> I'm open to counter-arguments, but as I've thought about this more,
> I've come to the conclusion that letting the client control the
> behavior is the most likely to be useful and the most consistent with
> existing facilities. I think we're on the same page based on the rest
> of your email: I'm just clarifying.

This is what I am imagining too.

> I have some quibbles with the syntax but I agree with the concept.
> What I'd probably do is separate the server side thing into two GUCs,
> each with a list of algorithms, comma-separated, like we do for other
> lists in postgresql.conf. Maybe make the default 'all' meaning
> "everything this build of the server supports". On the client side,
> I'd allow both things to be specified using a single option, because
> wanting to do the same thing in both directions will be common, and
> you actually have to type in connection strings sometimes, so
> verbosity matters more.
>
> As far as the format of the value for that keyword, what do you think
> about either compression=DO_THIS_BOTH_WAYS or
> compression=DO_THIS_WHEN_SENDING/DO_THIS_WHEN_RECEIVING, with each "do
> this" being a specification of the same form already accepted for
> server-side compression e.g. gzip or gzip:level=9? If you don't like
> that, why do you think the proposal you made above is better, and why
> is that one now punctuated with slashes instead of semicolons?

I like this more than what I proposed, and will update the patches to
reflect this proposal. (I've gotten them locally back into a state of
applying cleanly and dealing with the changes needed to support direct
SSL connections, so refactoring the protocol layer shouldn't be too
hard now.)

On Fri, May 17, 2024 at 11:10 AM Jacob Champion
 wrote:
> We're talking about a transport-level option, though -- I thought the
> proposal enabled compression before authentication completed? Or has
> that changed?
On Fri, May 17, 2024 at 1:03 PM Jelte Fennema-Nio  wrote:
> I think it would make sense to only compress messages after
> authentication has completed. The gain of compressing authentication
> related packets seems pretty limited.

At the protocol level, compressed data is a message type that can be
used to wrap arbitrary data as soon as the startup packet is
processed.  However, as an implementation detail that clients should
not rely on, but that we can rely on when thinking about the
implications, the only message types that are actually compressed
(except in the 0005 CI patch, which exists only for test runs) are
PqMsg_CopyData, PqMsg_DataRow, and PqMsg_Query, none of which are sent
before authentication.
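
Purely for illustration (this is not code from the patch set), the
per-message-type gating could be sketched like this on the sending side,
with the size threshold being an assumption of mine:

#include "postgres.h"
#include "libpq/protocol.h"

/*
 * Hypothetical helper: decide whether a message is worth wrapping in a
 * compressed-data message.  Only bulk-ish traffic is considered, and
 * nothing sent before authentication qualifies.
 */
static bool
should_compress_message(char msgtype, size_t len, size_t threshold)
{
	switch (msgtype)
	{
		case PqMsg_CopyData:
		case PqMsg_DataRow:
		case PqMsg_Query:
			return len >= threshold;
		default:
			return false;
	}
}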

-- 
Jacob Burroughs | Staff Software Engineer
E: jburrou...@instructure.com




Re: Streaming read-ready sequential scan code

2024-05-17 Thread Thomas Munro
On Sat, May 18, 2024 at 11:30 AM Thomas Munro  wrote:
> Andres happened to have TPC-DS handy, and reproduced that regression
> in q15.  We tried some stuff and figured out that it requires
> parallel_leader_participation=on, ie that this looks like some kind of
> parallel fairness and/or timing problem.  It seems to be a question of
> which worker finishes up processing matching rows, and the leader gets
> a ~10ms head start but may be a little more greedy with the new
> streaming code.  He tried reordering the table contents and then saw
> 17 beat 16.  So for q15, initial indications are that this isn't a
> fundamental regression, it's just a test that is sensitive to some
> arbitrary conditions.
>
> I'll try to figure out some more details about that, ie is it being
> too greedy on small-ish tables,

After more debugging, we learned a lot more things...

1.  That query produces spectacularly bad estimates, so we finish up
having to increase the number of buckets in a parallel hash join many
times.  That is quite interesting, but unrelated to new code.
2.  Parallel hash join is quite slow at negotiating an increase in the
number of hash buckets if all of the input tuples are being filtered
out by quals, because of the choice of where workers check for
PHJ_GROWTH_NEED_MORE_BUCKETS.  That could be improved quite easily, I
think.  I have put that on my todo list 'cause that's also my code,
but it's not a new issue, just one that is now highlighted...
3.  This bit of read_stream.c is exacerbating unfairness in the
underlying scan, so that 1 and 2 come together and produce a nasty
slowdown, which goes away if you change it like so:

-   BlockNumber blocknums[16];
+   BlockNumber blocknums[1];

I will follow up after some more study.




Re: Propagate sanity checks of ProcessUtility() to standard_ProcessUtility()?

2024-05-17 Thread Michael Paquier
On Fri, May 17, 2024 at 03:53:58PM -0400, Robert Haas wrote:
> The usual pattern for using hooks like this is to call the next
> implementation, or the standard implementation, and pass down the
> arguments that you got. If you try to do that and mess it up, you're
> going to get a crash really, really quickly and it shouldn't be very
> difficult at all to figure out what you did wrong. Honestly, that
> doesn't seem like it would be hard even without the assertions: for
> the most part, a quick glance at the stack backtrace ought to be
> enough. If you're trying to do something more sophisticated, like
> mutating the node tree before passing it down, the chances of your
> mistakes getting caught by these assertions are pretty darn low, since
> they're very superficial checks.

Perhaps, but still, that would be something.

Hmm.  We've had bugs in the past where DDL paths were playing with the
manipulation of Querys in core, corrupting their contents.  It seems
like what you mean is to have a check at the *end* of ProcessUtility()
that runs equalFuncs() on the Query, comparing it with a copy taken at
the beginning, all of that hidden within two USE_ASSERT_CHECKING
blocks, for the cases where we expect the tree not to change.  With
readOnlyTree, that would just be changing from one policy to another,
which is not really interesting at this stage given how
ProcessUtility() is shaped.
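
To make that concrete, a minimal sketch of the kind of check being
described could look like the following (the placement, and whether it
should only apply when readOnlyTree is true, being exactly the open
questions):

/* Sketch of a possible shape for this in src/backend/tcop/utility.c. */
void
ProcessUtility(PlannedStmt *pstmt, const char *queryString,
			   bool readOnlyTree, ProcessUtilityContext context,
			   ParamListInfo params, QueryEnvironment *queryEnv,
			   DestReceiver *dest, QueryCompletion *qc)
{
#ifdef USE_ASSERT_CHECKING
	/* Remember what the utility statement looked like on entry. */
	Node	   *entry_copy = copyObject(pstmt->utilityStmt);
#endif

	if (ProcessUtility_hook)
		(*ProcessUtility_hook) (pstmt, queryString, readOnlyTree,
								context, params, queryEnv, dest, qc);
	else
		standard_ProcessUtility(pstmt, queryString, readOnlyTree,
								context, params, queryEnv, dest, qc);

#ifdef USE_ASSERT_CHECKING
	/* Hook or not, nobody should have scribbled on the tree we were given. */
	Assert(equal(entry_copy, pstmt->utilityStmt));
#endif
}
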
--
Michael




Re: Internal error codes triggered by tests

2024-05-17 Thread Michael Paquier
On Fri, May 17, 2024 at 01:41:29PM -0400, Robert Haas wrote:
> On Tue, Apr 23, 2024 at 12:55 AM Michael Paquier  wrote:
>> Thoughts?
> 
> The patch as proposed seems fine. Marking RfC.

Thanks.  I'll look again at that once v18 opens up for business.
--
Michael




Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-05-17 Thread Erik Wienhold
On 2024-05-18 03:27 +0200, David G. Johnston wrote:
> > On 2024-05-16 17:47 +0200, David G. Johnston wrote:
> > > We have a glossary.
> 
> If sticking with stand-alone composite type as a formal term we should
> document it in the glossary.  It's unclear whether this will survive review
> though.  With the wording provided in this patch it doesn't really add
> enough to continue a strong defense of it.

Oh, I thought you meant we already have that term in the glossary (I
haven't checked until now).  Let's see if we can convince Robert of the
rewording.

> > It's now a separate error message (like I already had in v1) which
> > states that the specified type must not be a row type of another table
> > (based on Peter's feedback).  And the hint directs the user to CREATE
> > TYPE.
> >
> > In passing, I also quoted the type name in the existing error message
> > for consistency.  I saw that table names etc. are already quoted in
> > other error messages.
> >
> >
> The hint and the quoting both violate the documented rules for these things:
> 
> https://www.postgresql.org/docs/current/error-style-guide.html#ERROR-STYLE-GUIDE-QUOTES
> 
> There are functions in the backend that will double-quote their own output
> as needed (for example, format_type_be()). Do not put additional quotes
> around the output of such functions.
> 
> https://www.postgresql.org/docs/current/error-style-guide.html#ERROR-STYLE-GUIDE-GRAMMAR-PUNCTUATION
> 
> Detail and hint messages: Use complete sentences, and end each with a
> period. Capitalize the first word of sentences.

Thanks, I didn't know that guideline.  Both fixed in v6.

-- 
Erik
>From 39d2dc9b58dfa3b68245070648ecdf9eed892c7a Mon Sep 17 00:00:00 2001
From: Erik Wienhold 
Date: Fri, 8 Mar 2024 04:21:56 +0100
Subject: [PATCH v6] Document that typed tables rely on CREATE TYPE

CREATE TABLE OF requires a stand-alone composite type that is not the
row type of another table.  Clarify that with a reference to CREATE TYPE
in the docs.  Also report a separate error message in this case.

Reworded docs provided by David G. Johnston.
---
 doc/src/sgml/ref/create_table.sgml| 16 
 src/backend/commands/tablecmds.c  |  9 -
 src/test/regress/expected/typed_table.out |  7 ++-
 src/test/regress/sql/typed_table.sql  |  4 
 4 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index f19306e776..11458be9cf 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -249,18 +249,18 @@ WITH ( MODULUS numeric_literal, REM
 
  
   Creates a typed table, which takes its
-  structure from the specified composite type (name optionally
-  schema-qualified).  A typed table is tied to its type; for
-  example the table will be dropped if the type is dropped
-  (with DROP TYPE ... CASCADE).
+  structure from an existing (name optionally schema-qualified) stand-alone
+  composite type (i.e., created using )
+  though it still produces a new composite type as well.  The table will
+  have a dependency on the referenced type such that cascaded alter and
+  drop actions on the type will propagate to the table.
  
 
  
-  When a typed table is created, then the data types of the
-  columns are determined by the underlying composite type and are
-  not specified by the CREATE TABLE command.
+  A typed table always has the same column names and data types as the
+  type it is derived from, and you cannot specify additional columns.
   But the CREATE TABLE command can add defaults
-  and constraints to the table and can specify storage parameters.
+  and constraints to the table, as well as specify storage parameters.
  
 

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 313c782cae..2331a9185a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -6935,8 +6935,15 @@ check_of_type(HeapTuple typetuple)
 		 * the type before the typed table creation/conversion commits.
 		 */
 		relation_close(typeRelation, NoLock);
+
+		if (!typeOk)
+			ereport(ERROR,
+	(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+	 errmsg("type %s must not be a row type of another table",
+			format_type_be(typ->oid)),
+	 errhint("Must be a stand-alone composite type created with CREATE TYPE.")));
 	}
-	if (!typeOk)
+	else
 		ereport(ERROR,
 (errcode(ERRCODE_WRONG_OBJECT_TYPE),
  errmsg("type %s is not a composite type",
diff --git a/src/test/regress/expected/typed_table.out b/src/test/regress/expected/typed_table.out
index 2e47ecbcf5..5743e74978 100644
--- a/src/test/regress/expected/typed_table.out
+++ b/src/test/regress/expected/typed_table.out
@@ -89,8 +89,13 @@ drop cascades to function get_all_persons()
 drop cascades to table persons2
 drop cascades to table persons3
 CREATE TABLE persons5 

Re: Underscore in positional parameters?

2024-05-17 Thread Erik Wienhold
On 2024-05-17 02:06 +0200, Michael Paquier wrote:
> On Thu, May 16, 2024 at 08:41:11AM +0200, Peter Eisentraut wrote:
> > On this specific patch, maybe reword "parameter too large" to "parameter
> > number too large".
> 
> WFM here.

Done in v3.

I noticed this compiler warning with my previous patch:

scan.l:997:41: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
  997 | ErrorSaveContext escontext = {T_ErrorSaveContext};
      | ^~~~

I thought that I had to factor this out into a function similar to
process_integer_literal (which also uses ErrorSaveContext).  But moving
that declaration to the start of the {param} action was enough in the
end.

While trying out the refactoring, I noticed two small things that can be
fixed as well in scan.l:

* Prototype and definition of addunicode do not match.  The prototype
  uses yyscan_t while the definition uses core_yyscan_t.

* Parameter base of process_integer_literal is unused.

But those should be on a separate thread, right, even for minor fixes?

-- 
Erik
>From d3ad2971dcb8ab15fafe1a5c69a15e5994eac76d Mon Sep 17 00:00:00 2001
From: Erik Wienhold 
Date: Tue, 14 May 2024 14:12:11 +0200
Subject: [PATCH v3 1/3] Fix overflow in parsing of positional parameter

Replace atol with pg_strtoint32_safe in the backend parser and with
strtoint in ECPG to reject overflows when parsing the number of a
positional parameter.  With atol from glibc, parameters $2147483648 and
$4294967297 turn into $-2147483648 and $1, respectively.
---
 src/backend/parser/scan.l| 8 +++-
 src/interfaces/ecpg/preproc/pgc.l| 5 -
 src/test/regress/expected/numerology.out | 4 
 src/test/regress/sql/numerology.sql  | 1 +
 4 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index 9b33fb8d72..436cc64f07 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -992,8 +992,14 @@ other			.
 }
 
 {param}			{
+	ErrorSaveContext escontext = {T_ErrorSaveContext};
+	int32		val;
+
 	SET_YYLLOC();
-	yylval->ival = atol(yytext + 1);
+	val = pg_strtoint32_safe(yytext + 1, (Node *) &escontext);
+	if (escontext.error_occurred)
+		yyerror("parameter number too large");
+	yylval->ival = val;
 	return PARAM;
 }
 {param_junk}	{
diff --git a/src/interfaces/ecpg/preproc/pgc.l b/src/interfaces/ecpg/preproc/pgc.l
index d117cafce6..7122375d72 100644
--- a/src/interfaces/ecpg/preproc/pgc.l
+++ b/src/interfaces/ecpg/preproc/pgc.l
@@ -938,7 +938,10 @@ cppline			{space}*#([^i][A-Za-z]*|{if}|{ifdef}|{ifndef}|{import})((\/\*[^*/]*\*+
 }
 
 {param}			{
-	base_yylval.ival = atol(yytext+1);
+	errno = 0;
+	base_yylval.ival = strtoint(yytext+1, NULL, 10);
+	if (errno == ERANGE)
+		mmfatal(PARSE_ERROR, "parameter number too large");
 	return PARAM;
 }
 {param_junk}	{
diff --git a/src/test/regress/expected/numerology.out b/src/test/regress/expected/numerology.out
index c8196d2c85..5bac05c3b3 100644
--- a/src/test/regress/expected/numerology.out
+++ b/src/test/regress/expected/numerology.out
@@ -206,6 +206,10 @@ PREPARE p1 AS SELECT $1a;
 ERROR:  trailing junk after parameter at or near "$1a"
 LINE 1: PREPARE p1 AS SELECT $1a;
  ^
+PREPARE p1 AS SELECT $2147483648;
+ERROR:  parameter number too large at or near "$2147483648"
+LINE 1: PREPARE p1 AS SELECT $2147483648;
+ ^
 SELECT 0b;
 ERROR:  invalid binary integer at or near "0b"
 LINE 1: SELECT 0b;
diff --git a/src/test/regress/sql/numerology.sql b/src/test/regress/sql/numerology.sql
index 3f0ec34ecf..6722a09c5f 100644
--- a/src/test/regress/sql/numerology.sql
+++ b/src/test/regress/sql/numerology.sql
@@ -52,6 +52,7 @@ SELECT 0.0e1a;
 SELECT 0.0e;
 SELECT 0.0e+a;
 PREPARE p1 AS SELECT $1a;
+PREPARE p1 AS SELECT $2147483648;
 
 SELECT 0b;
 SELECT 1b;
-- 
2.45.1



Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-05-17 Thread David G. Johnston
On Fri, May 17, 2024 at 4:57 PM Erik Wienhold  wrote:

> On 2024-05-16 17:47 +0200, David G. Johnston wrote:
> > On Wed, May 15, 2024 at 8:46 AM Robert Haas 
> wrote:
> >
> > > On Thu, Apr 4, 2024 at 12:41 AM Erik Wienhold  wrote:
> > > > Thanks, fixed in v4.  Looks like American English prefers that comma
> and
> > > > it's also more common in our docs.
> > >
> > > Reviewing this patch:
> > >
> > > -  Creates a typed table, which takes its
> > > -  structure from the specified composite type (name optionally
> > > -  schema-qualified).  A typed table is tied to its type; for
> > > -  example the table will be dropped if the type is dropped
> > > -  (with DROP TYPE ... CASCADE).
> > > +  Creates a typed table, which takes its
> > > structure
> > > +  from an existing (name optionally schema-qualified) stand-alone
> > > composite
> > > +  type (i.e., created using )
> though
> > > it
> > > +  still produces a new composite type as well.  The table will
> have
> > > +  a dependency on the referenced type such that cascaded alter and
> > > drop
> > > +  actions on the type will propagate to the table.
> > >
> > > It would be better if this diff didn't reflow the unchanged portions
> > > of the paragraph.
>
> Right.  I now reformatted it so that first line remains unchanged.  But
> the rest of the para is still a complete rewrite.
>
> > > I agree that it's a good idea to mention that the table must have been
> > > created using CREATE TYPE .. AS here, but I disagree with the rest of
> > > the rewording in this hunk. I think we could just add "creating using
> > > CREATE TYPE" to the end of the first sentence, with an xref, and leave
> > > the rest as it is.
> >
> >
> >
> > > I don't see a reason to mention that the typed
> > > table also spawns a rowtype; that's just standard CREATE TABLE
> > > behavior and not really relevant here.
> >
> >
> > I figured it wouldn't be immediately obvious that the system would
> create a
> > second type with identical structure.  Of course, in order for SELECT tbl
> > FROM tbl; to work it must indeed do so.  I'm not married to pointing out
> > this dynamic explicitly though.
> >
> >
> > > And I don't understand what the
> > > rest of the rewording does for us.
> > >
> >
> > It calls out the explicit behavior that the table's columns can change
> due
> > to actions on the underlying type.  Mentioning this unique behavior seems
> > worth a sentence.
> >
> >
> > >   
> > > -  When a typed table is created, then the data types of the
> > > -  columns are determined by the underlying composite type and are
> > > -  not specified by the CREATE TABLE command.
> > > +  A typed table always has the same column names and data types
> as the
> > > +  type it is derived from, and you cannot specify additional
> columns.
> > >But the CREATE TABLE command can add defaults
> > > -  and constraints to the table and can specify storage parameters.
> > > +  and constraints to the table, as well as specify storage
> parameters.
> > >   
> > >
> > > I don't see how this is better.
> > >
> >
> > I'll agree this is more of a stylistic change, but mainly because the
> talk
> > about data types reasonably implies the other items the patch explicitly
> > mentions - names and additional columns.
>
> I prefer David's changes to both paras because right now the details of
> typed tables are spread over the respective CREATE and ALTER commands
> for types and tables.  Or maybe we should add those details to the
> existing "Typed Tables" section at the very end of CREATE TABLE?
>
> > > - errmsg("type %s is not a composite type",
> > > + errmsg("type %s is not a stand-alone composite type",
> > >
> > > I agree with Peter's complaint that people aren't going to understand
> > > what a stand-alone composite type means when they see the revised
> > > error message; to really help people, we're going to need to do better
> > > than this, I think.
> > >
> > >
> > We have a glossary.
>

If sticking with stand-alone composite type as a formal term we should
document it in the glossary.  It's unclear whether this will survive review
though.  With the wording provided in this patch it doesn't really add
enough to continue a strong defense of it.



> It's now a separate error message (like I already had in v1) which
> states that the specified type must not be a row type of another table
> (based on Peter's feedback).  And the hint directs the user to CREATE
> TYPE.
>
> In passing, I also quoted the type name in the existing error message
> for consistency.  I saw that table names etc. are already quoted in
> other error messages.
>
>
The hint and the quoting both violate the documented rules for these things:

https://www.postgresql.org/docs/current/error-style-guide.html#ERROR-STYLE-GUIDE-QUOTES

There are functions in the backend that will double-quote their own output
as needed (for example, format_type_be()). Do not put additional quotes
around the output of such functions.

Re: allow sorted builds for btree_gist

2024-05-17 Thread Paul A Jungwirth
On Fri, May 17, 2024 at 12:41 PM Tomas Vondra
 wrote:
> I've been looking at GiST to see if there could be a good way to do
> parallel builds - and there might be, if the opclass supports sorted
> builds, because then we could parallelize the sort.
>
> But then I noticed we support this mode only for point_ops, because
> that's the only opclass with sortsupport procedure. Which mostly makes
> sense, because types like geometries, polygons, ... do not have a well
> defined order.

Oh, I'm excited about this for temporal tables. It seems to me that
ranges and multiranges should have a well-defined order (assuming
their base types do), since you can do dictionary-style ordering
(compare the first value, then the next, then the next, etc.). Is
there any reason not to support those? No reason not to commit these
btree_gist functions first though. If you aren't interested in adding
GiST sortsupport for ranges & multiranges I'm willing to do it myself;
just let me know.
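
As a toy sketch of that dictionary-style ordering (assuming int bounds and
ignoring bound inclusivity and infinities, which PostgreSQL's real range
comparison has to deal with):

#include <stdbool.h>

typedef struct
{
	bool		empty;
	int			lower;
	int			upper;
} IntRange;

static int
int_range_cmp(const IntRange *a, const IntRange *b)
{
	/* Empty ranges sort before all non-empty ones. */
	if (a->empty || b->empty)
		return (int) b->empty - (int) a->empty;

	/* Compare the first (lower) bound, then the next (upper) bound. */
	if (a->lower != b->lower)
		return (a->lower < b->lower) ? -1 : 1;
	if (a->upper != b->upper)
		return (a->upper < b->upper) ? -1 : 1;
	return 0;
}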

Do note that the 1.7 -> 1.8 changes have been reverted though (as part
of my temporal work), and it looks like your patch is a bit messed up
from that. You'll want to take 1.8 for yourself, and also your 1.9
upgrade script is trying to add the reverted stratnum functions.

Yours,
Paul




Re: CREATE TABLE creates a composite type corresponding to the table row, which is and is not there

2024-05-17 Thread Erik Wienhold
On 2024-05-16 17:47 +0200, David G. Johnston wrote:
> On Wed, May 15, 2024 at 8:46 AM Robert Haas  wrote:
> 
> > On Thu, Apr 4, 2024 at 12:41 AM Erik Wienhold  wrote:
> > > Thanks, fixed in v4.  Looks like American English prefers that comma and
> > > it's also more common in our docs.
> >
> > Reviewing this patch:
> >
> > -  Creates a typed table, which takes its
> > -  structure from the specified composite type (name optionally
> > -  schema-qualified).  A typed table is tied to its type; for
> > -  example the table will be dropped if the type is dropped
> > -  (with DROP TYPE ... CASCADE).
> > +  Creates a typed table, which takes its
> > structure
> > +  from an existing (name optionally schema-qualified) stand-alone
> > composite
> > +  type (i.e., created using ) though
> > it
> > +  still produces a new composite type as well.  The table will have
> > +  a dependency on the referenced type such that cascaded alter and
> > drop
> > +  actions on the type will propagate to the table.
> >
> > It would be better if this diff didn't reflow the unchanged portions
> > of the paragraph.

Right.  I now reformatted it so that first line remains unchanged.  But
the rest of the para is still a complete rewrite.

> > I agree that it's a good idea to mention that the table must have been
> > created using CREATE TYPE .. AS here, but I disagree with the rest of
> > the rewording in this hunk. I think we could just add "creating using
> > CREATE TYPE" to the end of the first sentence, with an xref, and leave
> > the rest as it is.
> 
> 
> 
> > I don't see a reason to mention that the typed
> > table also spawns a rowtype; that's just standard CREATE TABLE
> > behavior and not really relevant here.
> 
> 
> I figured it wouldn't be immediately obvious that the system would create a
> second type with identical structure.  Of course, in order for SELECT tbl
> FROM tbl; to work it must indeed do so.  I'm not married to pointing out
> this dynamic explicitly though.
> 
> 
> > And I don't understand what the
> > rest of the rewording does for us.
> >
> 
> It calls out the explicit behavior that the table's columns can change due
> to actions on the underlying type.  Mentioning this unique behavior seems
> worth a sentence.
> 
> 
> >   
> > -  When a typed table is created, then the data types of the
> > -  columns are determined by the underlying composite type and are
> > -  not specified by the CREATE TABLE command.
> > +  A typed table always has the same column names and data types as the
> > +  type it is derived from, and you cannot specify additional columns.
> >But the CREATE TABLE command can add defaults
> > -  and constraints to the table and can specify storage parameters.
> > +  and constraints to the table, as well as specify storage parameters.
> >   
> >
> > I don't see how this is better.
> >
> 
> I'll agree this is more of a stylistic change, but mainly because the talk
> about data types reasonably implies the other items the patch explicitly
> mentions - names and additional columns.

I prefer David's changes to both paras because right now the details of
typed tables are spread over the respective CREATE and ALTER commands
for types and tables.  Or maybe we should add those details to the
existing "Typed Tables" section at the very end of CREATE TABLE?

> > - errmsg("type %s is not a composite type",
> > + errmsg("type %s is not a stand-alone composite type",
> >
> > I agree with Peter's complaint that people aren't going to understand
> > what a stand-alone composite type means when they see the revised
> > error message; to really help people, we're going to need to do better
> > than this, I think.
> >
> >
> We have a glossary.
> 
> That said, leave the wording as-is and add a conditional hint: The
> composite type must not also be a table.

It's now a separate error message (like I already had in v1) which
states that the specified type must not be a row type of another table
(based on Peter's feedback).  And the hint directs the user to CREATE
TYPE.

In passing, I also quoted the type name in the existing error message
for consistency.  I saw that table names etc. are already quoted in
other error messages.

-- 
Erik
>From 575bfec362bde5bf77ccb793bd8e2d78679c062f Mon Sep 17 00:00:00 2001
From: Erik Wienhold 
Date: Fri, 8 Mar 2024 04:21:56 +0100
Subject: [PATCH v5] Document that typed tables rely on CREATE TYPE

CREATE TABLE OF requires a stand-alone composite type that is not the
row type of another table.  Clarify that with a reference to CREATE TYPE
in the docs.  Also report a separate error message in this case.

Reworded docs provided by David G. Johnston.
---
 doc/src/sgml/ref/create_table.sgml| 16 
 src/backend/commands/tablecmds.c  | 11 +--
 src/test/regress/expected/typed_table.out |  7 ++-
 src/test/regress/sql/typed_table.sql  |  4 
 4 files 

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 21:24, Robert Haas  wrote:
> I think the
> fear that we're going to run into cases where such complex handshaking
> is necessary is a major reason why I'm afraid of Jelte's ideas about
> ParameterSet: it seems much more opinionated to me than he seems to
> think it is.

I think that fear is valid, and I agree that we might want to add a
bespoke message for cases where ParameterSet is not enough. But as far
as I can tell ParameterSet would at least cover all the protocol
changes that have been suggested so far. Using an opinionated but
limited message type for 90% of the cases and a bespoke message for
the last 10% seems much better to me than having a bespoke one for
each (especially if currently none of the protocol proposals fall into
the last 10%).

> To the extent that I can wrap my head around what Jelte is proposing,
> and all signs point to that extent being quite limited, I suppose I
> was thinking of something like his option (2). That is, I assumed that
> a client would request all the optional features that said client
> might wish to use at connection startup time. But I can see how that
> assumption might be somewhat problematic in a connection-pooling
> environment.

To be clear, I'd also be totally fine with my option (2). I'm
personally slightly leaning towards my option (1), due to the reasons
listed before. But connection poolers could request all the protocol
extensions at the start just fine (using the default "disabled" value)
to check for support. So I think option (2) would probably be the most
conservative, i.e. we could always decide that option (1) is fine in
some future release.

> Still, at least to me, it seems better than trying to
> rely on GUC_REPORT. My opinion is (1) GUC_REPORT isn't a particularly
> well-designed mechanism so I dislike trying to double down on it

I agree that GUC_REPORT is not particularly well designed,
currently... But even in its current form it's already a very
effective mechanism for connection poolers to find out which value a
specific GUC is set to, and if something similar to patch 0014 were
merged, my main gripe with GUC_REPORT would be gone.  Tracking GUC
settings by using ParameterSet would actually be harder, because if
ParameterSet errors out on the Postgres side then the connection
pooler would have to roll back its cache of that setting.  With the
GUC_REPORT response from Postgres, the pooler can simply rely on
Postgres telling it when a GUC has changed; even rollbacks are handled
correctly that way.

> and
> (2) trying to mix these protocol-level parameters and the
> transactional GUCs we have together in a single mechanism seems
> potentially problematic

I don't understand what potential problems you're worried about here.
Could you clarify?

> and (3) I'm still not particularly happy about
> the idea of making protocol parameters into GUCs in the first place.

Similar to the above: Could you clarify why you're not happy about that?

> Perhaps these are all minority positions, but I can't tell you what
> everyone thinks, only what I think.

I'd love to hear some opinions from others on these design choices. So
far it seems like we're the only two that have opinions on this (which
seems hard to believe) and our opinions are clearly conflicting. And
above all I'd like to move forward with this, be it my way or yours
(although I'd prefer my way of course ;) )




Re: Streaming read-ready sequential scan code

2024-05-17 Thread Thomas Munro
On Sat, May 18, 2024 at 8:09 AM Thomas Munro  wrote:
> On Sat, May 18, 2024 at 1:00 AM Alexander Lakhin  wrote:
> > I decided to compare v17 vs v16 performance (as I did the last year [1])
> > and discovered that v17 loses to v16 in the pg_tpcds (s64da_tpcds)
> > benchmark, query15 (and several others, but I focused on this one):
> > Best pg-src-master--.* worse than pg-src-16--.* by 52.2 percents (229.84 > 
> > 151.03): pg_tpcds.query15
> > Average pg-src-master--.* worse than pg-src-16--.* by 53.4 percents (234.20 
> > > 152.64): pg_tpcds.query15
> > Please look at the full html report attached in case you're interested.
> >
> > (I used my pg-mark tool to measure/analyze performance, but I believe the
> > same results can be seen without it.)
>
> Will investigate, but if it's easy for you to rerun, does it help if
> you increase Linux readahead, eg blockdev --setra setting?

Andres happened to have TPC-DS handy, and reproduced that regression
in q15.  We tried some stuff and figured out that it requires
parallel_leader_participation=on, ie that this looks like some kind of
parallel fairness and/or timing problem.  It seems to be a question of
which worker finishes up processing matching rows, and the leader gets
a ~10ms head start but may be a little more greedy with the new
streaming code.  He tried reordering the table contents and then saw
17 beat 16.  So for q15, initial indications are that this isn't a
fundamental regression, it's just a test that is sensitive to some
arbitrary conditions.

I'll try to figure out some more details about that, ie is it being
too greedy on small-ish tables, and generally I do wonder about the
interactions between the heuristics and batching working at different
levels (OS, seq scan, read stream, hence my earlier ra question which
is likely a red herring) and how there might be unintended
consequences/interference patterns, but this particular case seems
more data dependent.




Re: libpq compression (part 3)

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 23:10, Jacob Champion
 wrote:
> We're talking about a transport-level option, though -- I thought the
> proposal enabled compression before authentication completed? Or has
> that changed?

I think it would make sense to only compress messages after
authentication has completed. The gain of compressing authentication
related packets seems pretty limited.

> (I'm suspicious of arguments that begin "well you can already do bad
> things"

Once logged in, it's really easy to max out a core with the backend
you're connected to.  There are many trivial queries you can use to do
that.  An example would be:
SELECT sum(i) from generate_series(1, 10) i;

So I don't think it makes sense to worry about an attacker using a
high compression level as a means to DoS the server. Sending a few of
the above queries seems much easier.




Re: libpq compression (part 3)

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 23:40, Robert Haas  wrote:
> To be clear, I am not arguing that it should be the receiver's choice.
> I'm arguing it should be the client's choice, which means the client
> decides what it sends and also tells the server what to send to it.
> I'm open to counter-arguments, but as I've thought about this more,
> I've come to the conclusion that letting the client control the
> behavior is the most likely to be useful and the most consistent with
> existing facilities. I think we're on the same page based on the rest
> of your email: I'm just clarifying.

+1

> I have some quibbles with the syntax but I agree with the concept.
> What I'd probably do is separate the server side thing into two GUCs,
> each with a list of algorithms, comma-separated, like we do for other
> lists in postgresql.conf. Maybe make the default 'all' meaning
> "everything this build of the server supports". On the client side,
> I'd allow both things to be specified using a single option, because
> wanting to do the same thing in both directions will be common, and
> you actually have to type in connection strings sometimes, so
> verbosity matters more.
>
> As far as the format of the value for that keyword, what do you think
> about either compression=DO_THIS_BOTH_WAYS or
> compression=DO_THIS_WHEN_SENDING/DO_THIS_WHEN_RECEIVING, with each "do
> this" being a specification of the same form already accepted for
> server-side compression e.g. gzip or gzip:level=9? If you don't like
> that, why do you think the proposal you made above is better, and why
> is that one now punctuated with slashes instead of semicolons?

+1




Re: Speed up clean meson builds by ~25%

2024-05-17 Thread Peter Eisentraut

On 17.05.24 23:01, Robert Haas wrote:
> On Fri, May 17, 2024 at 4:59 PM Tom Lane  wrote:
>> As I mentioned upthread, I'm more worried about confusing error
>> reports than the machine time.
>
> Well, Jelte fixed that.
>
>> I grant that there are people who are more affected, but still, I'd
>> just as soon not contort the build rules for a temporary problem.
>
> Arguably by doing this, but I don't think it's enough of a contortion
> to get excited about.
>
> Anyone else want to vote?


I retested the patch from 2024-04-07 (I think that's the one that "fixed 
that"?  There are multiple "v1" patches in this thread.) using gcc-14 
and clang-18, with ccache disabled of course.  The measured effects of 
the patch are:


gcc-14:   1% slower
clang-18: 3% faster

So with that, it doesn't seem very interesting.





Re: pg_combinebackup does not detect missing files

2024-05-17 Thread David Steele

On 5/17/24 22:20, Robert Haas wrote:
> On Fri, May 17, 2024 at 1:18 AM David Steele  wrote:
>> However, I think allowing the user to optionally validate the input
>> would be a good feature. Running pg_verifybackup as a separate step is
>> going to be more expensive than verifying/copying at the same time.
>> Even with storage tricks to copy ranges of data, pg_combinebackup is
>> going to be aware of files that do not need to be verified for the
>> current operation, e.g. old copies of free space maps.
>
> In cases where pg_combinebackup reuses a checksum from the input
> manifest rather than recomputing it, this could accomplish something.
> However, for any file that's actually reconstructed, pg_combinebackup
> computes the checksum as it's writing the output file. I don't see how
> it's sensible to then turn around and verify that the checksum that we
> just computed is the same one that we now get.


Here's an example. First make a few backups:

$ pg_basebackup -c fast -X none -D test/backup/full -F plain
$ pg_basebackup -c fast -D test/backup/incr1 -F plain -i /home/dev/test/backup/full/backup_manifest


Then intentionally corrupt a file in the incr backup:

$ truncate -s 0 test/backup/incr1/base/5/3764_fsm

In this case pg_verifybackup will error:

$ pg_verifybackup test/backup/incr1
pg_verifybackup: error: "base/5/3764_fsm" has size 0 on disk but size 24576 in the manifest


But pg_combinebackup does not complain:

$ pg_combinebackup test/backup/full test/backup/incr1 -o test/backup/combine
$ ls -lah test/backup/combine/base/5/3764_fsm
-rw--- 1 dev dialout 0 May 17 22:08 test/backup/combine/base/5/3764_fsm

It would be nice if pg_combinebackup would (at least optionally, but 
preferably by default) complain in this case rather than the user 
needing to separately run pg_verifybackup.


Regards,
-David




Re: libpq compression (part 3)

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 4:53 PM Jacob Burroughs
 wrote:
> I think I was more thinking that trying to let both parties control
> the parameter seemed like a recipe for confusion and sadness, and so
> the choice that felt most natural to me was to let the sender control
> it, but I'm definitely open to changing that the other way around.

To be clear, I am not arguing that it should be the receiver's choice.
I'm arguing it should be the client's choice, which means the client
decides what it sends and also tells the server what to send to it.
I'm open to counter-arguments, but as I've thought about this more,
I've come to the conclusion that letting the client control the
behavior is the most likely to be useful and the most consistent with
existing facilities. I think we're on the same page based on the rest
of your email: I'm just clarifying.

> On the server side, we use slash separated sets of options
> connection_compression=DEFAULT_VALUE_FOR_BOTH_DIRECTIONS/client_to_server=OVERRIDE_FOR_THIS_DIRECTION/server_to_client=OVERRIDE_FOR_THIS_DIRECTION
> with the values being semicolon separated compression algorithms.
> On the client side, you can specify
> compression=,
> but on the client side you can actually specify compression options,
> which the server will use if provided, and otherwise it will fall back
> to defaults.

I have some quibbles with the syntax but I agree with the concept.
What I'd probably do is separate the server side thing into two GUCs,
each with a list of algorithms, comma-separated, like we do for other
lists in postgresql.conf. Maybe make the default 'all' meaning
"everything this build of the server supports". On the client side,
I'd allow both things to be specified using a single option, because
wanting to do the same thing in both directions will be common, and
you actually have to type in connection strings sometimes, so
verbosity matters more.

As far as the format of the value for that keyword, what do you think
about either compression=DO_THIS_BOTH_WAYS or
compression=DO_THIS_WHEN_SENDING/DO_THIS_WHEN_RECEIVING, with each "do
this" being a specification of the same form already accepted for
server-side compression e.g. gzip or gzip:level=9? If you don't like
that, why do you think the proposal you made above is better, and why
is that one now punctuated with slashes instead of semicolons?
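
Purely as an illustration of those two spellings (hypothetical: no
"compression" connection keyword exists today, and the separator is
exactly what's in question), from libpq that might look like:

#include <libpq-fe.h>

int
main(void)
{
	/* Hypothetical: one setting used for both directions. */
	PGconn	   *both = PQconnectdb("dbname=test compression=gzip:level=9");

	/* Hypothetical: send with lz4, ask the server to send back zstd. */
	PGconn	   *split = PQconnectdb("dbname=test compression=lz4/zstd:level=5");

	PQfinish(both);
	PQfinish(split);
	return 0;
}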

> If we think we need to, we could let the server specify defaults for
> server-side compression.  My overall thought though is that having an
> excessive number of knobs increases the surface area for testing and
> bugs while also increasing potential user confusion and that allowing
> configuration on *both* sides doesn't seem sufficiently useful to be
> worth adding that complexity.

I agree.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix log_line_prefix to display the transaction id (%x) for statements not in a transaction block

2024-05-17 Thread Bruce Momjian
On Thu, May 16, 2024 at 08:37:43AM -0400, Robert Haas wrote:
> On Thu, May 16, 2024 at 6:01 AM Quan Zongliang  wrote:
> > I thought about writing the statement log entry when the xid is assigned,
> > but it seemed too complicated.
> > I'm inclined to keep it for a while, until we find a good way or give
> > up.  It's a reasonable request, after all.
> 
> I don't think it's reasonable at all. We can't log the XID before it's
> assigned, and we can't log the statement after the XID is assigned
> without completely changing how the parameter works.

I have removed the TODO item.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: libpq compression (part 3)

2024-05-17 Thread Jacob Champion
On Fri, May 17, 2024 at 1:53 PM Jacob Burroughs
 wrote:
> New proposal, predicated on the assumption that if you enable
> compression you are ok with the client picking whatever level they
> want.  At least with the currently enabled algorithms I don't think
> any of them are so insane that they would knock over a server or
> anything, and in general postgres servers are usually connected to by
> clients that the server admin has some channel to talk to (after all
> they somehow had to get access to log in to the server in the first
> place) if they are doing something wasteful, given that a client can
> do a lot worse things than enable aggressive compression by writing
> bad queries.

We're talking about a transport-level option, though -- I thought the
proposal enabled compression before authentication completed? Or has
that changed?

(I'm suspicious of arguments that begin "well you can already do bad
things", anyway... It seems like there's a meaningful difference
between consuming resources running a parsed query and consuming
resources trying to figure out what the parsed query is. I don't know
if the solution is locking in a compression level, or something else;
maybe they're both reasonably mitigated in the same way. I haven't
really looked into zip bombs much.)

--Jacob




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Pavel Borisov
Hi, Mark!

> > At the first glance it's not clear to me:
> > - why your test creates cross-page unique constraint violations?
>
> To see if they are detected.
>
> > - how do you know they are not detected?
>
> It appears that they are detected.  At least, rerunning the test after
> adjusting the expected output, I no longer see problems.
>

I understand your point.  It was unclear how the test modified the index so
that only the cross-page unique constraint check should have failed while the
other checks passed.

Anyway, thanks for your testing and efforts! I'm happy that the test now
passes and confirms that the amcheck feature works as intended.

Kind regards,
Pavel Borisov


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 17, 2024 at 3:51 PM Laurenz Albe  wrote:
>> I think it is a good rule.  I don't think that it shouldn't lead to putting
>> people on the pillory or kicking their patches, but I imagine that a 
>> committer
>> looking for somebody else's patch to work on could prefer patches by people
>> who are doing their share of reviews.

> If you give me an automated way to find that out, I'll consider paying
> some attention to it.

Yeah, I can't imagine that any committer (or reviewer, really) is
doing any such thing, because it would take far too much effort to
figure out how much work anyone else is doing.  I see CFMs reminding
everybody that this rule exists, but I don't think they ever try to
check it either.  It's pretty much the honor system, and I'm sure
some folk ignore it.

regards, tom lane




Re: Speed up clean meson builds by ~25%

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 4:59 PM Tom Lane  wrote:
> As I mentioned upthread, I'm more worried about confusing error
> reports than the machine time.

Well, Jelte fixed that.

> I grant that there are people who are more affected, but still, I'd
> just as soon not contort the build rules for a temporary problem.

Arguably by doing this, but I don't think it's enough of a contortion
to get excited about.

Anyone else want to vote?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Speed up clean meson builds by ~25%

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> If this is the consensus opinion, then
> https://commitfest.postgresql.org/48/4914/ should be marked Rejected.
> However, while I think the improvements that Tom was able to make here
> sound fantastic, I don't understand why that's an argument against
> Jelte's patches. After all, Tom's work will only go into v18, but this
> patch could be adopted in v17 and back-patched to all releases that
> support meson builds, saving oodles of compile time for as long as
> those releases are supported. The most obvious beneficiary of that
> course of action would seem to be Tom himself, since he back-patches
> more fixes than anybody, last I checked, but it'd be also be useful to
> get slightly quicker results from the buildfarm and slightly quicker
> results for anyone using CI on back-branches and for other hackers who
> are looking to back-patch bug fixes. I don't quite understand why we
> want to throw those potential benefits out the window just because we
> have a better fix for the future.

As I mentioned upthread, I'm more worried about confusing error
reports than the machine time.  It would save me personally exactly
nada, since (a) I usually develop with gcc not clang, (b) when
I do use clang it's not a heavily-affected version, and (c) since
we *very* seldom change the grammar in stable branches, ccache will
hide the problem pretty effectively anyway in the back branches.

(If you're not using ccache, please don't complain about build time.)

I grant that there are people who are more affected, but still, I'd
just as soon not contort the build rules for a temporary problem.

regards, tom lane




Re: libpq compression (part 3)

2024-05-17 Thread Jacob Burroughs
On Thu, May 16, 2024 at 3:28 AM Robert Haas  wrote:
>
> Well, I mean, I don't really know what the right answer is here, but
> right now I can say pg_dump --compress=gzip to compress the dump with
> gzip, or pg_dump --compress=gzip:9 to compress with gzip level 9. Now,
> say that instead of compressing the output, I want to compress the
> data sent to me over the connection. So I figure I should be able to
> say pg_dump 'compress=gzip' or pg_dump 'compress=gzip:9'. I think you
> want to let me do the first of those but not the second. But, to turn
> your question on its head, what would be the reasoning behind such a
> restriction?

I think I was more thinking that trying to let both parties control
the parameter seemed like a recipe for confusion and sadness, and so
the choice that felt most natural to me was to let the sender control
it, but I'm definitely open to changing that the other way around.

> Note also the precedent of pg_basebackup. I can say pg_basebackup
> --compress=server-gzip:9 to ask the server to compress the backup with
> gzip at level 9. In that case, what I request from the server changes
> the actual output that I get, which is not the case here. Even so, I
> don't really understand what the justification would be for refusing
> to let the client ask for a specific compression level.
>
> And on the flip side, I also don't understand why the server would
> want to mandate a certain compression level. If compression is very
> expensive for a certain algorithm when the level is above some
> threshold X, we could have a GUC to limit the maximum level that the
> client can request. But, given that the gzip compression level
> defaults to 6 in every other context, why would the administrator of a
> particular server want to say, well, the default for my server is 3 or
> 9 or whatever?
>
> (This is of course all presuming you want to use gzip at all, which
> you probably don't, because gzip is crazy slow. Use lz4 or zstd! But
> it makes the point.)

New proposal, predicated on the assumption that if you enable
compression you are OK with the client picking whatever level it
wants.  At least with the currently enabled algorithms, I don't think
any of them are so insane that they would knock over a server or
anything.  And in general, Postgres servers are connected to by
clients that the server admin has some channel to talk to if they are
doing something wasteful (after all, they somehow had to get access to
log in to the server in the first place), and a client can do a lot
worse things than enable aggressive compression just by writing bad
queries.

On the server side, we use slash separated sets of options
connection_compression=DEFAULT_VALUE_FOR_BOTH_DIRECTIONS/client_to_server=OVERRIDE_FOR_THIS_DIRECTION/server_to_client=OVERRIDE_FOR_THIS_DIRECTION
with the values being semicolon separated compression algorithms.
On the client side, you can specify
compression=,
but on the client side you can actually specify compression options,
which the server will use if provided, and otherwise it will fall back
to defaults.

If we think we need to, we could let the server specify defaults for
server-side compression.  My overall thought though is that having an
excessive number of knobs increases the surface area for testing and
bugs while also increasing potential user confusion and that allowing
configuration on *both* sides doesn't seem sufficiently useful to be
worth adding that complexity.

-- 
Jacob Burroughs | Staff Software Engineer
E: jburrou...@instructure.com




Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Pavel Stehule
On Fri, May 17, 2024 at 22:05, Pavel Stehule wrote:

>
>
> On Fri, May 17, 2024 at 21:50, Andres Freund wrote:
>
>> Hi,
>>
>> On 2024-05-17 15:12:31 +0200, Pavel Stehule wrote:
>> > after migration to PostgreSQL 16 I have seen broken tables on replica
>> > nodes 3 times (about every week). The query fails with an error
>>
>> Migrating from what version?
>>
>
> I think 14, but it should be verified tomorrow
>

upgrade was from 15.2



>
>>
>> You're saying that the data is correctly accessible on primaries, but
>> broken
>> on standbys? Is there any difference in how the page looks like on the
>> primary
>> vs standby?
>>
>
> I saved one page from master and standby. Can I send it privately? There
> is some private data (although not too sensitive)
>
>
>>
>>
>> > ERROR:  could not access status of transaction 1442871302
>> > DETAIL:  Could not open file "pg_xact/0560": No such file or directory
>> >
>> > verify_heapam reports
>> >
>> > ^[[Aprd=# select * from verify_heapam('account_login_history') where
>> blkno
>> > = 179036;
>> >  blkno  | offnum | attnum |msg
>> >
>> >
>> +++---
>> >  179036 | 30 || xmin 1393743382 precedes oldest valid
>> > transaction ID 3:1687012112
>>
>> So that's not just a narrow race...
>>
>>
>> > master
>> >
>> > (2024-05-17 14:36:57) prd=# SELECT * FROM
>> > page_header(get_raw_page('account_login_history', 179036));
>> >   lsn  │ checksum │ flags │ lower │ upper │ special │ pagesize │
>> > version │ prune_xid
>> >
>> ───┼──┼───┼───┼───┼─┼──┼─┼───
>> >  A576/810F4CE0 │0 │ 4 │   296 │   296 │8192 │ 8192 │
>> > 4 │ 0
>> > (1 row)
>> >
>> >
>> > replica
>> > prd_aukro=# SELECT * FROM
>> page_header(get_raw_page('account_login_history',
>> > 179036));
>> >   lsn  | checksum | flags | lower | upper | special | pagesize |
>> > version | prune_xid
>> >
>> ---+--+---+---+---+-+--+-+---
>> >  A56C/63979DA0 |0 | 0 |   296 |   296 |8192 | 8192 |
>> > 4 | 0
>> > (1 row)
>>
>> Is the replica behind the primary? Or did we somehow end up with diverging
>> data? The page LSNs differ by about 40GB...
>>
>> Is there evidence of failed truncations of the relation in the log? From
>> autovacuum?
>>
>
> no, I did not see any such errors,
>
>>
>> Does the data in the readable versions of the tuples on that page actually
>> look valid? Is it possibly duplicated data?
>>
>
> it looks fine, I didn't see any strange content
>
>>
>>
>> I'm basically wondering whether it's possible that we errored out during
>> truncation (e.g. due to a file permission issue or such). Due to some
>> brokenness in RelationTruncate() that can lead to data divergence between
>> primary and standby and to old tuples re-appearing on either.
>>
>>
>> Another question: Do you use pg_repack or such?
>>
>
> pg_repack was used 2 months before migration
>
>
>
>>
>> Greetings,
>>
>> Andres Freund
>>
>


Re: First draft of PG 17 release notes

2024-05-17 Thread Jeff Davis
On Thu, 2024-05-09 at 00:03 -0400, Bruce Momjian wrote:
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
> 
> https://momjian.us/pgsql_docs/release-17.html

For this item:

Create a "builtin" collation provider similar to libc's C
locale (Jeff Davis)

It uses a "C" locale which is identical but independent of
libc, but it allows the use of non-"C" collations like "en_US"
and "C.UTF-8" with the "C" locale, which libc does not. MORE? 

I suggest something more like:

New, platform-independent "builtin" collation
provider. (Jeff Davis)

Currently, it offers the "C" and "C.UTF-8" locales. The
"C.UTF-8" locale combines stable and fast code point order
collation with Unicode character semantics.
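
As a concrete illustration, a minimal sketch (assuming the CREATE
COLLATION syntax for the new provider and the locale spelling used
above; the collation and table names are made up):

-- sketch: define and use a builtin-provider collation
CREATE COLLATION builtin_c_utf8 (provider = builtin, locale = 'C.UTF-8');
CREATE TABLE t (s text COLLATE builtin_c_utf8);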

Regards,
Jeff Davis





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tomas Vondra


On 5/17/24 21:59, Robert Haas wrote:
> On Fri, May 17, 2024 at 3:51 PM Laurenz Albe  wrote:
>> On Fri, 2024-05-17 at 13:12 -0400, Greg Sabino Mullane wrote:
>>>> A long time ago there was a "rule" that people submitting patches are
>>>> expected to do reviews. Perhaps we should be more strict about this.
>>>
>>> Big -1. How would we even be more strict about this? Public shaming? 
>>> Withholding a commit?
>>
>> I think it is a good rule.  I don't think that it should lead to putting
>> people on the pillory or kicking their patches, but I imagine that a 
>> committer
>> looking for somebody else's patch to work on could prefer patches by people
>> who are doing their share of reviews.
> 

Yeah, I don't have any particular idea how the rule should be "enforced",
and I certainly did not imagine public shaming or anything like that. My
thoughts were more about reminding people that reviews are part of the
deal, that's it ... maybe "more strict" was not quite what I meant.

> If you give me an automated way to find that out, I'll consider paying
> some attention to it. However, in order to sort the list of patches
> needing review by the amount of review done by the patch author, we'd
> first need to have a list of patches needing review.
> 
> And right now we don't, or at least not in any usable way.
> commitfest.postgresql.org is supposed to give us that, but it doesn't.
> 

It'd certainly help to know which patches to consider for review, but I
guess I'd still look at patches from people doing more reviews first,
even if I had to find out what shape the patch is in.

I'm far more skeptical about an "automated way" to track this, though. I'm
not sure it's quite possible - reviews can take a lot of very different
forms, and deciding what is or is not a review is pretty subjective. So
it's not clear how we would quantify that. Not to mention I'm sure we'd
promptly find ways to game it.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Comments on Custom RMGRs

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 4:20 PM Jeff Davis  wrote:
> Regarding this particular change: the checkpointing hook seems more
> like a table AM feature, so I agree with you that we should have a good
> idea how a real table AM might use this, rather than only
> pg_stat_statements.

I would even be OK with a pg_stat_statements example that is fully
working and fully explained. I just don't want to have no example at
all. The original proposal has been changed twice because of
complaints that the hook wasn't quite useful enough, but I think that
only proves that v3 is closer to being useful than v1. If v1 is 40% of
the way to useful and v3 is 120% of the way to useful, wonderful! But
if v1 is 20% of the way to being useful and v3 is 60% of the way to
being useful, it's not time to commit anything yet. I don't know which
is the case, and I think if someone wants this to be committed, they
need to explain clearly why it's the first and not the second.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Comments on Custom RMGRs

2024-05-17 Thread Jeff Davis
On Fri, 2024-05-17 at 14:56 -0400, Robert Haas wrote:
> (2) a detailed
> description of how a non-core table AM or index AM is expected to be
> able to make use of this. Bonus points if the person providing that
> rationale can say credibly that they've actually implemented this and
> it works great with 100TB of production data.

That's a chicken-and-egg problem and we should be careful about setting
the bar too high for table AM improvements. Ultimately, AM authors will
benefit more from a steady stream of improvements that sometimes miss
the mark than from complete stagnation, as long as we use reasonable
judgement.

There aren't a lot of table AMs, and to create a good one you need a
lot of internals knowledge. If it's an important AM, the developers are
surely going to try it out on mainline occasionally, and expect API
breaks. If the API breaks for them in some fundamental way, they can
complain and we still have time to fix it.

> The problem here is not only that we don't want to commit a hook that
> does nothing useful. We also don't want to commit a hook that works
> wonderfully for someone but we have no idea why. If we do that, then
> we don't know whether it's OK to modify the hook in the future as the
> code evolves, or more to the point, which kinds of modifications will
> be acceptable.

We have to have some kind of understanding between us and AM authors
that they need to participate in discussions when using these APIs, try
changes during development, be adaptable when they change from release
to release, and come back and tell us when something is wrong.

> And also, the next person who wants to use it is likely
> to have to figure out all on their own how to do so, instead of being
> able to refer to comments or documentation or the commit message or
> at
> least a mailing list post.

Obviously it would be better to have a nice example table AM in
/contrib, different enough from heap, but nobody has done that yet. You
could argue that we never should have exposed the API without something
like this in the first place, but that's also a big ask and we'd
probably still not have it.


Regarding this particular change: the checkpointing hook seems more
like a table AM feature, so I agree with you that we should have a good
idea how a real table AM might use this, rather than only
pg_stat_statements.

Regards,
Jeff Davis





Re: Speed up clean meson builds by ~25%

2024-05-17 Thread Robert Haas
On Wed, Apr 17, 2024 at 11:11 PM Tom Lane  wrote:
> I think we should hold off on this.  I found a simpler way to address
> ecpg's problem than what I sketched upthread.  I have a not-ready-to-
> show-yet patch that allows the vast majority of ecpg's grammar
> productions to use the default semantic action.  Testing on my M1
> Macbook with clang 16.0.6 from MacPorts, I see the compile time for
> preproc.o in HEAD as about 1m44 sec; but with this patch, 0.6 sec.

If this is the consensus opinion, then
https://commitfest.postgresql.org/48/4914/ should be marked Rejected.
However, while I think the improvements that Tom was able to make here
sound fantastic, I don't understand why that's an argument against
Jelte's patches. After all, Tom's work will only go into v18, but this
patch could be adopted in v17 and back-patched to all releases that
support meson builds, saving oodles of compile time for as long as
those releases are supported. The most obvious beneficiary of that
course of action would seem to be Tom himself, since he back-patches
more fixes than anybody, last I checked, but it'd also be useful to
get slightly quicker results from the buildfarm and slightly quicker
results for anyone using CI on back-branches and for other hackers who
are looking to back-patch bug fixes. I don't quite understand why we
want to throw those potential benefits out the window just because we
have a better fix for the future.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Peter Geoghegan
On Fri, May 17, 2024 at 4:10 PM Mark Dilger
 wrote:
> The quick-and-dirty TAP test I wrote this morning is intended to introduce 
> duplicates across page boundaries, not to test for ones that got there by 
> normal database activity.  In other words, the TAP test forcibly corrupts the 
> index by changing a value on one side of a boundary to be equal to the value 
> on the other side of the boundary.  Prior to the corrupting action the values 
> were all unique.

I understood that. I was just pointing out that an index that looks
even somewhat like that is already quite unnatural.

-- 
Peter Geoghegan




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Mark Dilger



> On May 17, 2024, at 1:00 PM, Peter Geoghegan  wrote:
> 
> Many different parts of the B-Tree code will fight against allowing
> duplicates of the same value to span multiple leaf pages -- this is
> especially true for unique indexes. 

The quick-and-dirty TAP test I wrote this morning is intended to introduce 
duplicates across page boundaries, not to test for ones that got there by 
normal database activity.  In other words, the TAP test forcibly corrupts the 
index by changing a value on one side of a boundary to be equal to the value on 
the other side of the boundary.  Prior to the corrupting action the values were 
all unique.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Streaming read-ready sequential scan code

2024-05-17 Thread Thomas Munro
On Sat, May 18, 2024 at 1:00 AM Alexander Lakhin  wrote:
> I decided to compare v17 vs v16 performance (as I did the last year [1])
> and discovered that v17 loses to v16 in the pg_tpcds (s64da_tpcds)
> benchmark, query15 (and several others, but I focused on this one):
> Best pg-src-master--.* worse than pg-src-16--.* by 52.2 percents (229.84 > 
> 151.03): pg_tpcds.query15
> Average pg-src-master--.* worse than pg-src-16--.* by 53.4 percents (234.20 > 
> 152.64): pg_tpcds.query15
> Please look at the full html report attached in case you're interested.
>
> (I used my pg-mark tool to measure/analyze performance, but I believe the
> same results can be seen without it.)

Will investigate, but if it's easy for you to rerun, does it help if
you increase Linux readahead, eg blockdev --setra setting?




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Mark Dilger



> On May 17, 2024, at 12:42 PM, Pavel Borisov  wrote:
> 
> At the first glance it's not clear to me: 
> - why your test creates cross-page unique constraint violations?

To see if they are detected.

> - how do you know they are not detected?

It appears that they are detected.  At least, rerunning the test after 
adjusting the expected output, I no longer see problems.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Pavel Stehule
On Fri, May 17, 2024 at 21:50, Andres Freund wrote:

> Hi,
>
> On 2024-05-17 15:12:31 +0200, Pavel Stehule wrote:
> > after migration to PostgreSQL 16 I have seen broken tables on replica
> > nodes 3 times (about every week). The query fails with an error
>
> Migrating from what version?
>

I think 14, but it should be verified tomorrow

>
>
> You're saying that the data is correctly accessible on primaries, but
> broken
> on standbys? Is there any difference in how the page looks like on the
> primary
> vs standby?
>

I saved one page from master and standby. Can I send it privately? There
is some private data (although not too sensitive)


>
>
> > ERROR:  could not access status of transaction 1442871302
> > DETAIL:  Could not open file "pg_xact/0560": No such file or directory
> >
> > verify_heapam reports
> >
> > ^[[Aprd=# select * from verify_heapam('account_login_history') where
> blkno
> > = 179036;
> >  blkno  | offnum | attnum |msg
> >
> >
> +++---
> >  179036 | 30 || xmin 1393743382 precedes oldest valid
> > transaction ID 3:1687012112
>
> So that's not just a narrow race...
>
>
> > master
> >
> > (2024-05-17 14:36:57) prd=# SELECT * FROM
> > page_header(get_raw_page('account_login_history', 179036));
> >   lsn  │ checksum │ flags │ lower │ upper │ special │ pagesize │
> > version │ prune_xid
> >
> ───┼──┼───┼───┼───┼─┼──┼─┼───
> >  A576/810F4CE0 │0 │ 4 │   296 │   296 │8192 │ 8192 │
> > 4 │ 0
> > (1 row)
> >
> >
> > replica
> > prd_aukro=# SELECT * FROM
> page_header(get_raw_page('account_login_history',
> > 179036));
> >   lsn  | checksum | flags | lower | upper | special | pagesize |
> > version | prune_xid
> >
> ---+--+---+---+---+-+--+-+---
> >  A56C/63979DA0 |0 | 0 |   296 |   296 |8192 | 8192 |
> > 4 | 0
> > (1 row)
>
> Is the replica behind the primary? Or did we somehow end up with diverging
> data? The page LSNs differ by about 40GB...
>
> Is there evidence of failed truncations of the relation in the log? From
> autovacuum?
>

no, I did not see any such errors,

>
> Does the data in the readable versions of the tuples on that page actually
> look valid? Is it possibly duplicated data?
>

it looks fine, I didn't see any strange content

>
>
> I'm basically wondering whether it's possible that we errored out during
> truncation (e.g. due to a file permission issue or such). Due to some
> brokenness in RelationTruncate() that can lead to data divergence between
> primary and standby and to old tuples re-appearing on either.
>
>
> Another question: Do you use pg_repack or such?
>

pg_repack was used 2 months before migration



>
> Greetings,
>
> Andres Freund
>


Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Peter Geoghegan
On Fri, May 17, 2024 at 3:50 PM Andres Freund  wrote:
> You're saying that the data is correctly accessible on primaries, but broken
> on standbys? Is there any difference in how the page looks like on the primary
> vs standby?

There clearly is. The relevant infomask bits are different. I didn't
examine it much closer than that, though.

-- 
Peter Geoghegan




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Peter Geoghegan
On Fri, May 17, 2024 at 3:42 PM Mark Dilger
 wrote:
> On further review, the test was not anticipating the error message "high key 
> invariant violated for index".  That wasn't seen in calls to 
> bt_index_parent_check(), but appears as one of the errors from 
> bt_index_check().  I am rerunning the test now

Many different parts of the B-Tree code will fight against allowing
duplicates of the same value to span multiple leaf pages -- this is
especially true for unique indexes. For example, nbtsplitloc.c has a
variety of strategies that will prevent choosing a split point that
necessitates including a distinguishing heap TID in the new high key.
In other words, nbtsplitloc.c is very aggressive about picking a split
point between (rather than within) groups of duplicates.

Of course it's still *possible* for a unique index to have multiple
leaf pages containing the same individual value. The regression tests
do have coverage for certain relevant code paths (e.g., there is
coverage for code paths only hit when _bt_check_unique has to go to
the page to the right). This is only the case because I went out of my
way to make sure of it, by adding tests that allow a huge number of
version duplicates to accumulate within a unique index. (The "move
right" _bt_check_unique branches had zero test coverage for a year or
two.)
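
For anyone who wants to reproduce that kind of index state, a rough
sketch (illustrative only, not the actual regression test) of how
version duplicates can be made to accumulate in a unique index:

-- sketch: churn one key so that many dead versions of the same value
-- pile up in the unique index; hold back cleanup (e.g. keep a snapshot
-- open in another session) so the duplicates can eventually span pages
CREATE TABLE u (k int PRIMARY KEY, v int);
INSERT INTO u SELECT g, 0 FROM generate_series(1, 1000) g;
UPDATE u SET v = v + 1 WHERE k = 1;  -- repeat many times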

Just how important it is that amcheck covers cases where the version
duplicates span multiple leaf pages is of course debatable -- it's
always better to be more thorough, when practical. But it's certainly
something that needs to be assessed based on the merits.

--
Peter Geoghegan




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 3:51 PM Laurenz Albe  wrote:
> On Fri, 2024-05-17 at 13:12 -0400, Greg Sabino Mullane wrote:
> > > A long time ago there was a "rule" that people submitting patches are
> > > expected to do reviews. Perhaps we should be more strict about this.
> >
> > Big -1. How would we even be more strict about this? Public shaming? 
> > Withholding a commit?
>
> I think it is a good rule.  I don't think that it should lead to putting
> people on the pillory or kicking their patches, but I imagine that a committer
> looking for somebody else's patch to work on could prefer patches by people
> who are doing their share of reviews.

If you give me an automated way to find that out, I'll consider paying
some attention to it. However, in order to sort the list of patches
needing review by the amount of review done by the patch author, we'd
first need to have a list of patches needing review.

And right now we don't, or at least not in any usable way.
commitfest.postgresql.org is supposed to give us that, but it doesn't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Propagate sanity checks of ProcessUtility() to standard_ProcessUtility()?

2024-05-17 Thread Robert Haas
On Thu, Feb 29, 2024 at 2:21 AM Michael Paquier  wrote:
> It's been brought to my attention that an extension may end up breaking the
> assumptions ProcessUtility() relies on when calling
> standard_ProcessUtility(), causing breakages when passing down data to
> cascading utility hooks.
>
> Isn't the state of the arguments given something we should check not
> only in the main entry point ProcessUtility() but also in
> standard_ProcessUtility(), to prevent issues if an extension
> incorrectly manipulates the arguments it needs to pass down to other
> modules that use the utility hook, like using a NULL query string?

I can't imagine a scenario where this change saves more than 5 minutes
of debugging, so I'd rather leave things the way they are. If you do
this, then people will see the macro and have to go look at what it
does, whereas right now, they can see the assertions themselves, which
is better.

The usual pattern for using hooks like this is to call the next
implementation, or the standard implementation, and pass down the
arguments that you got. If you try to do that and mess it up, you're
going to get a crash really, really quickly and it shouldn't be very
difficult at all to figure out what you did wrong. Honestly, that
doesn't seem like it would be hard even without the assertions: for
the most part, a quick glance at the stack backtrace ought to be
enough. If you're trying to do something more sophisticated, like
mutating the node tree before passing it down, the chances of your
mistakes getting caught by these assertions are pretty darn low, since
they're very superficial checks.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Laurenz Albe
On Fri, 2024-05-17 at 13:12 -0400, Greg Sabino Mullane wrote:
> > A long time ago there was a "rule" that people submitting patches are expected
> > to do reviews. Perhaps we should be more strict about this.
> 
> Big -1. How would we even be more strict about this? Public shaming? 
> Withholding a commit?

I think it is a good rule.  I don't think that it should lead to putting
people on the pillory or kicking their patches, but I imagine that a committer
looking for somebody else's patch to work on could prefer patches by people
who are doing their share of reviews.

Yours,
Laurenz Albe




Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Andres Freund
Hi,

On 2024-05-17 15:12:31 +0200, Pavel Stehule wrote:
> after migration to PostgreSQL 16 I have seen broken tables on replica nodes
> 3 times (about every week). The query fails with an error

Migrating from what version?


You're saying that the data is correctly accessible on primaries, but broken
on standbys? Is there any difference in how the page looks like on the primary
vs standby?


> ERROR:  could not access status of transaction 1442871302
> DETAIL:  Could not open file "pg_xact/0560": No such file or directory
>
> verify_heapam reports
>
> ^[[Aprd=# select * from verify_heapam('account_login_history') where blkno
> = 179036;
>  blkno  | offnum | attnum |msg
>
> +++---
>  179036 | 30 || xmin 1393743382 precedes oldest valid
> transaction ID 3:1687012112

So that's not just a narrow race...


> master
>
> (2024-05-17 14:36:57) prd=# SELECT * FROM
> page_header(get_raw_page('account_login_history', 179036));
>   lsn  │ checksum │ flags │ lower │ upper │ special │ pagesize │
> version │ prune_xid
> ───┼──┼───┼───┼───┼─┼──┼─┼───
>  A576/810F4CE0 │0 │ 4 │   296 │   296 │8192 │ 8192 │
> 4 │ 0
> (1 row)
>
>
> replica
> prd_aukro=# SELECT * FROM page_header(get_raw_page('account_login_history',
> 179036));
>   lsn  | checksum | flags | lower | upper | special | pagesize |
> version | prune_xid
> ---+--+---+---+---+-+--+-+---
>  A56C/63979DA0 |0 | 0 |   296 |   296 |8192 | 8192 |
> 4 | 0
> (1 row)

Is the replica behind the primary? Or did we somehow end up with diverging
data? The page LSNs differ by about 40GB...
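
For reference, that gap can be computed directly from the two page LSNs
shown above; a quick sketch:

-- pg_wal_lsn_diff() returns the difference in bytes (roughly 40 GB here)
SELECT pg_size_pretty(pg_wal_lsn_diff('A576/810F4CE0', 'A56C/63979DA0'));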

Is there evidence of failed truncations of the relation in the log? From
autovacuum?

Does the data in the readable versions of the tuples on that page actually
look valid? Is it possibly duplicated data?


I'm basically wondering whether it's possible that we errored out during
truncation (e.g. due to a file permission issue or such). Due to some
brokenness in RelationTruncate() that can lead to data divergence between
primary and standby and to old tuples re-appearing on either.


Another question: Do you use pg_repack or such?

Greetings,

Andres Freund




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Pavel Borisov
Hi, Mark!

On Fri, 17 May 2024 at 23:10, Mark Dilger 
wrote:

>
>
> > On May 17, 2024, at 11:51 AM, Pavel Borisov 
> wrote:
> >
> > Amcheck with the checkunique option does check uniqueness violations between
> pages. But it doesn't guarantee detection of cross-page uniqueness
> violations in extremely rare cases when the first equal index entry on the
> next page corresponds to a tuple that is not visible (e.g. dead). In this, I
> followed Peter's notion [1] that checking across a number of dead equal
> entries that could theoretically span even across many pages is an unneeded
> code complication, and amcheck is not a tool that provides any guarantee when
> checking an index.
>
> This confuses me a bit.  The regression test creates a table and index but
> never performs any DELETE nor any UPDATE operations, so none of the index
> entries should be dead.  If I am understanding you correct, I'd be forced
> to conclude that the uniqueness checking code is broken.  Can you take a
> look?
>
At the first glance it's not clear to me:
- why your test creates cross-page unique constraint violations?
- how do you know they are not detected?


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Mark Dilger



> On May 17, 2024, at 12:10 PM, Mark Dilger  
> wrote:
> 
>> Amcheck with the checkunique option does check uniqueness violations between
>> pages. But it doesn't guarantee detection of cross-page uniqueness violations
>> in extremely rare cases when the first equal index entry on the next page
>> corresponds to a tuple that is not visible (e.g. dead). In this, I followed
>> Peter's notion [1] that checking across a number of dead equal entries
>> that could theoretically span even across many pages is an unneeded code
>> complication, and amcheck is not a tool that provides any guarantee when
>> checking an index.
> 
> This confuses me a bit.  The regression test creates a table and index but 
> never performs any DELETE nor any UPDATE operations, so none of the index 
> entries should be dead.  If I am understanding you correct, I'd be forced to 
> conclude that the uniqueness checking code is broken.  Can you take a look?

On further review, the test was not anticipating the error message "high key 
invariant violated for index".  That wasn't seen in calls to 
bt_index_parent_check(), but appears as one of the errors from 
bt_index_check().  I am rerunning the test now

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







allow sorted builds for btree_gist

2024-05-17 Thread Tomas Vondra
Hi,

I've been looking at GiST to see if there could be a good way to do
parallel builds - and there might be, if the opclass supports sorted
builds, because then we could parallelize the sort.

But then I noticed we support this mode only for point_ops, because
that's the only opclass with sortsupport procedure. Which mostly makes
sense, because types like geometries, polygons, ... do not have a well
defined order.

Still, we have btree_gist, and I don't think there's much reason to not
support sorted builds for those opclasses, where the gist opclass is
defined on top of btree ordering semantics.

So this patch does that, and the difference (compared to builds with
buffering=on) can easily be an order of magnitude - at least that's
what my tests show:


test=# create index on test_int8 using gist (a) with (buffering = on);
CREATE INDEX
Time: 578799.450 ms (09:38.799)

test=# create index on test_int8 using gist (a) with (buffering = auto);
CREATE INDEX
Time: 53022.593 ms (00:53.023)


test=# create index on test_uuid using gist (a) with (buffering = on);
CREATE INDEX
Time: 39322.799 ms (00:39.323)

test=# create index on test_uuid using gist (a) with (buffering = auto);
CREATE INDEX
Time: 6466.341 ms (00:06.466)


The WIP patch enables this for data types with a usable sortsupport
procedure, which excludes time, timetz, cash, interval, bit, vbit, bool,
enum and macaddr8. I assume time, timetz and macaddr8 could be added,
it's just that I didn't find any existing sortsupport procedure. Same
for cash, but IIRC it's mostly deprecated.

Of course, people probably don't use btree_gist with a single column,
because they could just use btree. It's useful for multi-column GiST
indexes, with data types requiring GiST. And if the other opclasses also
allow sorted builds, this patch makes that possible. Of course, most
"proper GiST opclasses" don't support that, but why not - it's easy.

FWIW this is also why sorted builds likely are not a very practical way
to do parallel builds for GiST - it would help only with a small part of
cases, I think.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From 011e6eee923a6f51668f3277f470d32922923d17 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Fri, 17 May 2024 20:36:08 +0200
Subject: [PATCH v20240517] allow sorted builds for btree_gist

---
 contrib/btree_gist/Makefile |  2 +-
 contrib/btree_gist/btree_gist--1.8--1.9.sql | 82 +
 contrib/btree_gist/btree_gist.control   |  2 +-
 3 files changed, 84 insertions(+), 2 deletions(-)
 create mode 100644 contrib/btree_gist/btree_gist--1.8--1.9.sql

diff --git a/contrib/btree_gist/Makefile b/contrib/btree_gist/Makefile
index 7ac2df26c10..68190ac5e46 100644
--- a/contrib/btree_gist/Makefile
+++ b/contrib/btree_gist/Makefile
@@ -34,7 +34,7 @@ DATA = btree_gist--1.0--1.1.sql \
btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \
btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \
btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \
-   btree_gist--1.7--1.8.sql
+   btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql
 PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"
 
 REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \
diff --git a/contrib/btree_gist/btree_gist--1.8--1.9.sql b/contrib/btree_gist/btree_gist--1.8--1.9.sql
new file mode 100644
index 000..3fd6f33f41b
--- /dev/null
+++ b/contrib/btree_gist/btree_gist--1.8--1.9.sql
@@ -0,0 +1,82 @@
+/* contrib/btree_gist/btree_gist--1.8--1.9.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.9'" to load this file. \quit
+
+ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD
+	FUNCTION 11 (oid, oid) btoidsortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD
+	FUNCTION 11 (int2, int2) btint2sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD
+	FUNCTION 11 (int4, int4) btint4sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD
+	FUNCTION 11 (int8, int8) btint8sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD
+	FUNCTION 11 (float4, float4) btfloat4sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD
+	FUNCTION 11 (float8, float8) btfloat8sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD
+	FUNCTION 11 (timestamp, timestamp) timestamp_sortsupport (internal) ;
+
+ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD
+	FUNCTION 11 (timestamptz, timestamptz) timestamp_sortsupport (internal) ;
+
+-- ALTER OPERATOR FAMILY gist_time_ops USING gist ADD
+-- 	FUNCTION 12 (time, time) gist_stratnum_btree (int2) ;
+
+ALTER OPERATOR FAMILY gist_date_ops USING gist ADD
+	FUNCTION 11 (date, date) date_sortsupport (internal) ;
+
+-- ALTER OPERATOR FAMILY 

Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Imseih (AWS), Sami
> postgres -D pgdev-dev -c shared_buffers=16MB -C 
> shared_memory_size_in_huge_pages
> 13
> postgres -D pgdev-dev -c shared_buffers=16MB -c huge_page_size=1GB -C 
> shared_memory_size_in_huge_pages
> 1


> Which is very useful to be able to actually configure that number of huge
> pages. I don't think a system view or such would help here.

Oops. Totally missed the -C flag. Thanks for clarifying!

Regards,

Sami 



Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Andres Freund
Hi,

On 2024-05-17 18:30:08 +, Imseih (AWS), Sami wrote:
> > The advantage of the GUC is that its value could be seen before trying to
> > actually start the server.
>
> Only if they have a sample in postgresql.conf file, right?
> A GUC like shared_memory_size_in_huge_pages will not be.

You can query gucs with -C. E.g.

postgres -D pgdev-dev -c shared_buffers=16MB -C shared_memory_size_in_huge_pages
13
postgres -D pgdev-dev -c shared_buffers=16MB -c huge_page_size=1GB -C 
shared_memory_size_in_huge_pages
1

Which is very useful to be able to actually configure that number of huge
pages. I don't think a system view or such would help here.

Greetings,

Andres Freund




Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Nathan Bossart
On Fri, May 17, 2024 at 06:30:08PM +, Imseih (AWS), Sami wrote:
>> The advantage of the GUC is that its value could be seen before trying to
>> actually start the server. 
> 
> Only if they have a sample in postgresql.conf file, right? 
> A GUC like shared_memory_size_in_huge_pages will not be.

shared_memory_size_in_huge_pages is computed at runtime and can be viewed
with "postgres -C" before actually trying to start the server [0].

[0] https://www.postgresql.org/docs/devel/kernel-resources.html#LINUX-HUGE-PAGES

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 1:26 PM Jacob Burroughs
 wrote:
> I was leaving the details around triggering that for this conversation
> and in that patch just designing the messages in a way that would
> allow adding the reconfiguration/restarting to be easily added in a
> backwards-compatible way in a future patch.  I would imagine that an
> explicit `ParameterSet` call that sets `_pq_.connection_compression`
> (or whatever the implementation details turn out to be) would also
> trigger a compressor restart, and when restarted it would pick an
> algorithm/configuration based on the new value of the parameter rather
> than the one used at connection establishment.

Hmm, OK, interesting. I suppose that I thought we were going to handle
problems like this by just adding bespoke messages for each case (e.g.
CompressionSet). I wasn't thinking it would be practical to make a
message whose remit was as general as what it seems like Jelte wants
to do with ParameterSet, because I'm not sure everything can be
handled that simply. It's not an exact analogy, but when you want to
stop the server, it's not enough to say that you want to change from
the Running state to the Stopped state. You have to specify which type
of shutdown should be used to make the transition. You could even have
more complicated cases, where one side says "prepare to do X" and the
other side eventually says "OK, I'm prepared" and the first side says
"great, now activate X" and the other side eventually says "OK, it's
activated, please confirm that you've also activated it from your side"
and the first side eventually says "OK, I confirm that". I think the
fear that we're going to run into cases where such complex handshaking
is necessary is a major reason why I'm afraid of Jelte's ideas about
ParameterSet: it seems much more opinionated to me than he seems to
think it is.

To the extent that I can wrap my head around what Jelte is proposing,
and all signs point to that extent being quite limited, I suppose I
was thinking of something like his option (2). That is, I assumed that
a client would request all the optional features that said client
might wish to use at connection startup time. But I can see how that
assumption might be somewhat problematic in a connection-pooling
environment. Still, at least to me, it seems better than trying to
rely on GUC_REPORT. My opinion is (1) GUC_REPORT isn't a particularly
well-designed mechanism so I dislike trying to double down on it and
(2) trying to mix these protocol-level parameters and the
transactional GUCs we have together in a single mechanism seems
potentially problematic and (3) I'm still not particularly happy about
the idea of making protocol parameters into GUCs in the first place.
Perhaps these are all minority positions, but I can't tell you what
everyone thinks, only what I think.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Nathan Bossart
On Fri, May 17, 2024 at 12:48:37PM -0500, Nathan Bossart wrote:
> On Fri, May 17, 2024 at 01:09:55PM -0400, Tom Lane wrote:
>> Nathan Bossart  writes:
>>> At a bare minimum, we should probably fix the obvious problems, but I
>>> wonder if we could simplify this section a bit, too.
>> 
>> Yup.  "The definition of insanity is doing the same thing over and
>> over and expecting different results."  Time to give up on documenting
>> these things in such detail.  Anybody who really wants to know can
>> look at the source code.
> 
> Cool.  I'll at least fix the back-branches as-is, but I'll see about
> revamping this stuff for v18.

Attached is probably the absolute least we should do for the back-branches.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 853104fa59bb1c219f02f71ece0d5106cb6c0588 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 17 May 2024 14:17:59 -0500
Subject: [PATCH v1 1/1] fix kernel resources docs on back-branches

---
 doc/src/sgml/runtime.sgml | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 6047b8171d..883a849e6f 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such

 SEMMNI
 Maximum number of semaphore identifiers (i.e., sets)
-at least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) plus room for other applications
+at least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) plus room for other applications

 

 SEMMNS
 Maximum number of semaphores system-wide
-ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17 plus room for other applications
+ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16) * 17 plus room for other applications

 

@@ -838,7 +838,8 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
 When using System V semaphores,
 PostgreSQL uses one semaphore per allowed connection
 (), allowed autovacuum worker process
-() and allowed background
+(), allowed WAL sender process
+(), and allowed background
 process (), in sets of 16.
 Each such set will
 also contain a 17th semaphore which contains a magic
@@ -852,7 +853,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such
 linkend="sysvipc-parameters"/>).  The parameter SEMMNI
 determines the limit on the number of semaphore sets that can
 exist on the system at one time.  Hence this parameter must be at
-least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16).
+least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 7) / 16).
 Lowering the number
 of allowed connections is a temporary workaround for failures,
 which are usually confusingly worded No space
-- 
2.25.1
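
For anyone who wants these numbers for a running server without breaking
out a calculator, the documented lower bound can also be computed with a
query along these lines (a sketch that simply mirrors the formula in the
patch above):

SELECT ceil((current_setting('max_connections')::int
           + current_setting('autovacuum_max_workers')::int
           + current_setting('max_wal_senders')::int
           + current_setting('max_worker_processes')::int
           + 7) / 16.0) AS semmni_lower_bound;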



Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Mark Dilger



> On May 17, 2024, at 11:51 AM, Pavel Borisov  wrote:
> 
> Amcheck with the checkunique option does check uniqueness violations between
> pages. But it doesn't guarantee detection of cross-page uniqueness violations
> in extremely rare cases when the first equal index entry on the next page
> corresponds to a tuple that is not visible (e.g. dead). In this, I followed
> Peter's notion [1] that checking across a number of dead equal entries that
> could theoretically span even across many pages is an unneeded code
> complication, and amcheck is not a tool that provides any guarantee when
> checking an index.

This confuses me a bit.  The regression test creates a table and index but 
never performs any DELETE nor any UPDATE operations, so none of the index 
entries should be dead.  If I am understanding you correct, I'd be forced to 
conclude that the uniqueness checking code is broken.  Can you take a look?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Comments on Custom RMGRs

2024-05-17 Thread Robert Haas
On Fri, Mar 29, 2024 at 1:09 PM Jeff Davis  wrote:
> I am fine with this.
>
> You've moved the discussion forward in two ways:
>
>   1. Changes to pg_stat_statements to actually use the API; and
>   2. The hook is called at multiple points.
>
> Those at least partially address the concerns raised by Andres and
> Robert. But given that there was pushback from multiple people on the
> feature, I'd like to hear from at least one of them. It's very late in
> the cycle so I'm not sure we'll get more feedback in time, though.

In my seemingly-neverending pass through the July CommitFest, I
reached this patch. My comment is: it's possible that
rmgr_003.v3.patch is enough to be useful, but does anyone in the world
think they know that for a fact?

I mean, pgss_001.v1.patch purports to demonstrate that it can be used,
but that's based on rmgr_003.v2.patch, not the v3 patch, and the
emails seem to indicate that it may not actually work. I also think,
looking at it, that it looks much more like a POC than something we'd
consider ready for commit. It also seems very unclear that we'd want
pg_stat_statements to behave this way, and indeed "this way" isn't
really spelled out anywhere.

I think it would be nice if we had an example that uses the proposed
hook that we could actually commit. Maybe that's asking too much,
though. I think the minimum thing we need is a compelling rationale
for why this particular hook design is going to be good enough. That
could be demonstrated by means of (1) a well-commented example that
accomplishes some understandable goal and/or (2) a detailed
description of how a non-core table AM or index AM is expected to be
able to make use of this. Bonus points if the person providing that
rationale can say credibly that they've actually implemented this and
it works great with 100TB of production data.

The problem here is not only that we don't want to commit a hook that
does nothing useful. We also don't want to commit a hook that works
wonderfully for someone but we have no idea why. If we do that, then
we don't know whether it's OK to modify the hook in the future as the
code evolves, or more to the point, which kinds of modifications will
be acceptable. And also, the next person who wants to use it is likely
to have to figure out all on their own how to do so, instead of being
able to refer to comments or documentation or the commit message or at
least a mailing list post.

My basic position is not that this patch is a bad idea, but that it
isn't really finished. The idea is probably a pretty good one, but
whether this is a reasonable implementation of the idea doesn't seem
clear, at least not to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Pavel Borisov
Hi, Mark!

> The documentation in
> https://www.postgresql.org/docs/devel/amcheck.html#AMCHECK-FUNCTIONS is
> ambiguous:
>
> "bt_index_check does not verify invariants that span child/parent
> relationships, but will verify the presence of all heap tuples as index
> tuples within the index when heapallindexed is true. When checkunique is
> true bt_index_check will check that no more than one among duplicate
> entries in unique index is visible. When a routine, lightweight test for
> corruption is required in a live production environment, using
> bt_index_check often provides the best trade-off between thoroughness of
> verification and limiting the impact on application performance and
> availability."
>
> The second sentence, "When checkunique is true bt_index_check will check
> that no more than one among duplicate entries in unique index is visible."
> is not strictly true, as it won't check if the violation spans a page
> boundary.
>
Amcheck with the checkunique option does check uniqueness violations between
pages. But it doesn't guarantee detection of cross-page uniqueness
violations in extremely rare cases when the first equal index entry on the
next page corresponds to a tuple that is not visible (e.g. dead). In this, I
followed Peter's notion [1] that checking across a number of dead equal
entries that could theoretically span even across many pages is an
unneeded code complication, and that amcheck is not a tool that provides any
guarantee when checking an index.

I'm not against docs modification in any way that clarifies its exact usage
and limitations.
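
For context, the check under discussion is invoked roughly like this (a
sketch; the index name is made up, and the parameter names assume the
current amcheck signature):

-- illustrative invocation of the uniqueness check
CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT bt_index_check('my_unique_idx'::regclass,
                      heapallindexed => true,
                      checkunique => true);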

Kind regards,
Pavel Borisov

[1]
https://www.postgresql.org/message-id/CAH2-Wz%3DttG__BTZ-r5ccopBRb5evjg%3DzsF_o_3C5h4zRBA_LjQ%40mail.gmail.com


Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Imseih (AWS), Sami
>>> If the exact values
>>> are important, maybe we could introduce more GUCs like
>>> shared_memory_size_in_huge_pages that can be consulted (instead of
>>> requiring users to break out their calculators).
>>
>> I don't especially like shared_memory_size_in_huge_pages, and I don't
>> want to introduce more of those. GUCs are not the right way to expose
>> values that you can't actually set. (Yeah, I'm guilty of some of the
>> existing ones like that, but it's still not a good thing.) Maybe it's
>> time to introduce a system view for such things? It could be really
>> simple, with name and value, or we could try to steal some additional
>> ideas such as units from pg_settings.

I always found some of the preset GUCs [1] to be useful for writing SQL used by
DBAs, particularly block_size, wal_block_size, server_version and 
server_version_num.

> The advantage of the GUC is that its value could be seen before trying to
> actually start the server. 

Only if they have a sample in postgresql.conf file, right? 
A GUC like shared_memory_size_in_huge_pages will not be.


[1] https://www.postgresql.org/docs/current/runtime-config-preset.html


Regards,

Sami 



Re: Postgres and --config-file option

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> If the reason that somebody is upset is because it's not technically
> true to say that you *must* do one of those things, we could fix that
> with "You must" -> "You can" or with "You must specify" -> "Specify".
> The patch you propose is also not terrible or anything, but it goes in
> the direction of listing every alternative, which will become
> unpalatable as soon as somebody adds one more way to do it, or maybe
> it's unpalatable already.

I agree that it's not necessary or particularly useful for this hint
to be exhaustive.  I could get behind your suggestion of
s/You must specify/Specify/, but I also think it's fine to do nothing.

regards, tom lane




Re: Postgres and --config-file option

2024-05-17 Thread Robert Haas
On Thu, May 16, 2024 at 4:57 AM Aleksander Alekseev
 wrote:
> I propose my original v1 patch for correcting the --help output of
> 'postgres' too. I agree with the above comments that corresponding
> changes in v4 became somewhat unwieldy.

So, who is it exactly that will be confused by the status quo? I mean,
let's say you get this error:

postgres does not know where to find the server configuration file.
You must specify the --config-file or -D invocation option or set the
PGDATA environment variable.

As I see it, either you know how it works and just made a mistake this
time, or you are a beginner. If it's the former, the fact that the
error message doesn't mention every possible way of solving the
problem does not matter, because you already know how to fix your
mistake. If it's the latter, you don't need to know *every* way to fix
the problem. You just need to know *one* way to fix the problem. I
don't really understand why somebody would look at the existing
message and say "gosh, it didn't tell me that I could also use -c!".
If you already know that, don't you just ignore the hint and get busy
with fixing the problem?

If the reason that somebody is upset is because it's not technically
true to say that you *must* do one of those things, we could fix that
with "You must" -> "You can" or with "You must specify" -> "Specify".
The patch you propose is also not terrible or anything, but it goes in
the direction of listing every alternative, which will become
unpalatable as soon as somebody adds one more way to do it, or maybe
it's unpalatable already.

Even if we don't do that, I don't see that there's a huge problem here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Comments about TLS (no SSLRequest) and ALPN

2024-05-17 Thread Greg Stark
On Sat, 11 May 2024 at 15:36, AJ ONeal  wrote:

> Having postgres TLS/SNI/ALPN routable by default will just be more intuitive 
> (it's what I assumed would have been the default anyway), and help increase 
> adoption in cloud, enterprise, and other settings.

Routing is primarily a feature for "cloud-first" deployments. I.e.
things like Kubernetes or equivalent. I don't think people deploying
to their own metal or on their own network often need this kind of
feature today. Of course we don't know what the future holds and it
could well become more common.

In that context I think it's clear that user-oriented tools like psql
shouldn't change their default behaviour. They need the maximum
flexibility of being able to negotiate plain text and GSSAUTH
connections if possible. It's only applications deployed by the same
cloud tooling that deploys the database and SSL proxies that will know
where direct SSL connections are necessary.


> We live in the world of ACME / Let's Encrypt / ZeroSSL. Many proxies have 
> that built in. As such optimizing for unverified TLS takes the user down a 
> path that's just more difficult to begin with (it's easier to get a valid TLS 
> cert than it is to get a self-signed cert these days), and more nuanced 
> (upcoming implementors are accustomed to TLS being verified). It's easy to 
> document how to use the letsencrypt client with postgres. It will also be 
> increasingly easy to configure an ACME-enable proxy for postgres and not 
> worry about it in the server at all.

I tend to agree that it would be good for our documentation and
install scripts to assume letsencrypt certs can be requested. That
said there are still a lot of database environments that are not on
networks that can reach internet services directly without special
firewall or routing rules set up.
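
As a concrete (hypothetical) sketch of what that looks like on the
server side once certificates are being maintained by an ACME client
(the paths here are made up):

-- sketch: point the server at certificates renewed by an ACME client;
-- the key file must only be readable by the server's OS user
ALTER SYSTEM SET ssl = on;
ALTER SYSTEM SET ssl_cert_file = '/etc/letsencrypt/live/db.example.com/fullchain.pem';
ALTER SYSTEM SET ssl_key_file = '/etc/letsencrypt/live/db.example.com/privkey.pem';
SELECT pg_reload_conf();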



> Allow the user to specify ALPN
>
> I don't think this is particularly controversial or nuanced, so I don't have 
> much to say here - most TLS tools allow the user to specify ALPN for the same 
> reason they allow specifying the port number - either for privacy, 
> security-by-obscurity, or navigating some form of application or user routing.

I think I need a citation before I believe this. I can't imagine it
makes sense for anything other than general purpose TLS testing tools
to allow arbitrary protocol names. It seems like something that would
be mostly useful for pentesting or regression tests. But for actual
deployed applications it doesn't make any sense to me.


--
greg




Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Nathan Bossart
On Fri, May 17, 2024 at 01:09:55PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> [ many, many problems in documented formulas ]
> 
>> At a bare minimum, we should probably fix the obvious problems, but I
>> wonder if we could simplify this section a bit, too.
> 
> Yup.  "The definition of insanity is doing the same thing over and
> over and expecting different results."  Time to give up on documenting
> these things in such detail.  Anybody who really wants to know can
> look at the source code.

Cool.  I'll at least fix the back-branches as-is, but I'll see about
revamping this stuff for v18.

>> If the exact values
>> are important, maybe we could introduce more GUCs like
>> shared_memory_size_in_huge_pages that can be consulted (instead of
>> requiring users to break out their calculators).
> 
> I don't especially like shared_memory_size_in_huge_pages, and I don't
> want to introduce more of those.  GUCs are not the right way to expose
> values that you can't actually set.  (Yeah, I'm guilty of some of the
> existing ones like that, but it's still not a good thing.)  Maybe it's
> time to introduce a system view for such things?  It could be really
> simple, with name and value, or we could try to steal some additional
> ideas such as units from pg_settings.

The advantage of the GUC is that its value could be seen before trying to
actually start the server.  I don't dispute that it's not the right way to
surface this information, though.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Internal error codes triggered by tests

2024-05-17 Thread Robert Haas
On Tue, Apr 23, 2024 at 12:55 AM Michael Paquier  wrote:
> Thoughts?

The patch as proposed seems fine. Marking RfC.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-17 Thread Mark Dilger


> On May 17, 2024, at 3:11 AM, Alexander Korotkov  wrote:
> 
> On Mon, May 13, 2024 at 4:42 AM Alexander Korotkov  
> wrote:
>> On Mon, May 13, 2024 at 12:23 AM Alexander Korotkov
>>  wrote:
>>> On Sat, May 11, 2024 at 4:13 AM Mark Dilger
>>>  wrote:
> On May 10, 2024, at 12:05 PM, Alexander Korotkov  
> wrote:
> The only bt_target_page_check() caller is
> bt_check_level_from_leftmost(), which overrides state->target in the
> next iteration anyway.  I think the patch is just refactoring to
> eliminate the confusion pointed out by Peter Geoghegan upthread.
 
 I find your argument unconvincing.
 
 After bt_target_page_check() returns at line 919, and before 
 bt_check_level_from_leftmost() overrides state->target in the next 
 iteration, bt_check_level_from_leftmost() conditionally fetches an item 
 from the page referenced by state->target.  See line 963.
 
 I'm left with four possibilities:
 
 
 1)  bt_target_page_check() never gets to the code that uses "rightpage" 
 rather than "state->target" in the same iteration where 
 bt_check_level_from_leftmost() conditionally fetches an item from 
 state->target, so the change you're making doesn't matter.
 
 2)  The code prior to v2-0003 was wrong, having changed state->target in 
 an inappropriate way, causing the wrong thing to happen at what is now 
 line 963.  The patch fixes the bug, because state->target no longer gets 
 overwritten where you are now using "rightpage" for the value.
 
 3)  The code used to work, having set up state->target correctly in the 
 place where you are now using "rightpage", but v2-0003 has broken that.
 
 4)  It's been broken all along and your patch just changes from wrong to 
 wrong.
 
 
 If you believe (1) is true, then I'm complaining that you are relying far 
 too much on action at a distance, and that you are not documenting it.
 Even with documentation of this interrelationship, I'd be unhappy with how 
 brittle the code is.  I cannot easily discern that the two don't ever 
 happen in the same iteration, and I'm not at all convinced one way or the 
 other.  I tried to set up some Asserts about that, but none of the test 
 cases actually reach the new code, so adding Asserts doesn't help to 
 investigate the question.
 
 If (2) is true, then I'm complaining that the commit message doesn't 
 mention the fact that this is a bug fix.  Bug fixes should be clearly 
 documented as such, otherwise future work might assume the commit can be 
 reverted with only stylistic consequences.
 
 If (3) is true, then I'm complaining that the patch is flat busted.
 
 If (4) is true, then maybe we should revert the entire feature, or have a 
 discussion of mitigation efforts that are needed.
 
 Regardless of which of 1..4 you pick, I think it could all do with more 
 regression test coverage.
 
 
 For reference, I said something similar earlier today in another email to 
 this thread:
 
 This patch introduces a change that stores a new page into variable 
 "rightpage" rather than overwriting "state->target", which the old 
 implementation most certainly did.  That means that after returning from 
 bt_target_page_check() into the calling function 
 bt_check_level_from_leftmost() the value in state->target is not what it 
 would have been prior to this patch.  Now, that'd be irrelevant if nobody 
 goes on to consult that value, but just 44 lines further down in 
 bt_check_level_from_leftmost() state->target is clearly used.  So the 
 behavior at that point is changing between the old and new versions of the 
 code, and I think I'm within reason to ask if it was wrong before the 
 patch, wrong after the patch, or something else?  Is this a bug being 
 introduced, being fixed, or ... ?
>>> 
>>> Thank you for your analysis.  I'm inclined to believe in 2, but not
>>> yet completely sure.  It's a real pity that our tests don't cover
>>> this.  I'm investigating this area.
>> 
>> It seems that I got to the bottom of this.  Changing
>> BtreeCheckState.target for a cross-page unique constraint check is
>> wrong, but that happens only for leaf pages.  After that
>> BtreeCheckState.target is only used for setting the low key.  The low
>> key is only used for non-leaf pages.  So, that didn't lead to any
>> visible bug.  I've revised the commit message to reflect this.
>> 
>> So, the picture for the patches is the following now.
>> 0001 – optimization, but rather simple and giving a huge effect
>> 0002 – refactoring
>> 0003 – fix for the bug
>> 0004 – better error reporting
> 
> I think the thread contains enough motivation on why 0002, 0003 and
> 0004 are material for post-FF.  They are fixes and refactoring for a
> new-in-v17 feature.  I'm going to push them if no objections.
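
(For reference, the unique-constraint checking under discussion is driven
from SQL roughly like this -- the exact name of the argument added in v17
is from memory and may differ:

    SELECT bt_index_check('my_unique_idx'::regclass,
                          heapallindexed => true,
                          checkunique => true);

which is also where any additional regression tests for these code paths
would plug in.)

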

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-17 Thread Jacob Burroughs
On Fri, May 17, 2024 at 3:15 AM Robert Haas  wrote:
>
> OK, so you made it so that compressed data is fully self-identifying.
> Hence, there's no need to worry if something gets changed: the
> receiver, if properly implemented, can't help but notice. The only
> downside that I can see to this design is that you only have one byte
> to identify the compression algorithm, but that doesn't actually seem
> like a real problem at all, because I expect the number of supported
> compression algorithms to grow very slowly. I think it would take
> centuries, possibly millennia, before we started to get short of
> identifiers. So, cool.
>
> But, in your system, how does the client ask the server to switch to a
> different compression algorithm, or to restart the compression stream?

I was leaving the details of triggering that for this conversation; in
that patch I was just designing the messages in a way that lets the
reconfiguration/restarting be added easily, and backwards-compatibly, in
a future patch.  I would imagine that an explicit `ParameterSet` call
that sets `_pq_.connection_compression` (or whatever the implementation
details turn out to be) would also trigger a compressor restart, and
when restarted it would pick an algorithm/configuration based on the new
value of the parameter rather than the one used at connection
establishment.



-- 
Jacob Burroughs | Staff Software Engineer
E: jburrou...@instructure.com




Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Peter Geoghegan
On Fri, May 17, 2024 at 1:18 PM Pavel Stehule  wrote:
> On Fri, May 17, 2024 at 18:02, Peter Geoghegan wrote:
>> You've shown an inconsistency between the primary and standby with
>> respect to the heap tuple infomask bits related to freezing. It looks
>> like a FREEZE WAL record from the primary was never replayed on the
>> standby.
>
>
> I think it is possible that the broken tuples were created before the upgrade
> from Postgres 15 to Postgres 16 - not too long before, so this bug could be a
> side effect of the upgrade

I half suspected something like that myself. Maybe the problem
happened *during* the upgrade, even.

There have been historical bugs affecting pg_upgrade and relfrozenxid.
Commit 74cf7d46 is one good example from only a few years ago.

--
Peter Geoghegan




Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Pavel Stehule
On Fri, May 17, 2024 at 18:02, Peter Geoghegan wrote:

> On Fri, May 17, 2024 at 9:13 AM Pavel Stehule 
> wrote:
> > after migration to PostgreSQL 16 I have seen broken tables on replica nodes
> > three times (about once every week). The query fails with error
> >
> > ERROR:  could not access status of transaction 1442871302
> > DETAIL:  Could not open file "pg_xact/0560": No such file or directory
>
> You've shown an inconsistency between the primary and standby with
> respect to the heap tuple infomask bits related to freezing. It looks
> like a FREEZE WAL record from the primary was never replayed on the
> standby.
>

I think it is possible that the broken tuples were created before the upgrade
from Postgres 15 to Postgres 16 - not too long before, so this bug could be a
side effect of the upgrade



>
> It's natural for me to wonder if my Postgres 16 work on page-level
> freezing might be a factor here. If that really was true, then it
> would be necessary to explain why the primary and standby are
> inconsistent (no reason to suspect a problem on the primary here).
> It'd have to be the kind of issue that could be detected mechanically
> using wal_consistency_checking, but wasn't detected that way before
> now -- that seems unlikely.
>
> It's worth considering if the more aggressive behavior around
> relfrozenxid advancement (in 15) and freezing (in 16) has increased
> the likelihood of problems like these in setups that were already
> faulty, in whatever way. The standby database is indeed corrupt, but
> even on 16 it's fairly isolated corruption in practical terms. The
> full extent of the problem is clear once amcheck is run, but only one
> tuple can actually cause the system to error due to the influence of
> hint bits (for better or worse, hint bits mask the problem quite well,
> even on 16).
>
> --
> Peter Geoghegan
>


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Greg Sabino Mullane
On Fri, May 17, 2024 at 7:11 AM Tomas Vondra 
wrote:

> It does seem to me a part of the solution needs to be helping to get
> those patches reviewed. I don't know how to do that, but perhaps there's
> a way to encourage people to review more stuff, or review stuff from a
> wider range of contributors. Say by treating reviews more like proper
> contributions.


This is a huge problem. I've been in the situation before where I had some
cycles to do a review, but actually finding one to review is
super-difficult. You simply cannot tell without clicking on the link and
wading through the email thread. Granted, it's easy as an
occasional reviewer to simply disregard potential patches if the email
thread is over a certain size, but it's still a lot of work. Having some
sort of summary/status field would be great, even if not everything was
labelled. It would also be nice if simpler patches were NOT picked up by
experienced hackers, as we want to encourage new/inexperienced people, and
having some "easy to review" patches available will help people gain
confidence and grow.


> A long time ago there was a "rule" that people submitting patches are
> expected to do reviews. Perhaps we should be more strict about this.
>

Big -1. How would we even be more strict about this? Public shaming?
Withholding a commit?

Cheers,
Greg


Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Tom Lane
Nathan Bossart  writes:
> [ many, many problems in documented formulas ]

> At a bare minimum, we should probably fix the obvious problems, but I
> wonder if we could simplify this section a bit, too.

Yup.  "The definition of insanity is doing the same thing over and
over and expecting different results."  Time to give up on documenting
these things in such detail.  Anybody who really wants to know can
look at the source code.

> If the exact values
> are important, maybe we could introduce more GUCs like
> shared_memory_size_in_huge_pages that can be consulted (instead of
> requiring users to break out their calculators).

I don't especially like shared_memory_size_in_huge_pages, and I don't
want to introduce more of those.  GUCs are not the right way to expose
values that you can't actually set.  (Yeah, I'm guilty of some of the
existing ones like that, but it's still not a good thing.)  Maybe it's
time to introduce a system view for such things?  It could be really
simple, with name and value, or we could try to steal some additional
ideas such as units from pg_settings.

regards, tom lane




Re: remove Todo item: Allow infinite intervals just like infinite timestamps

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 9:12 AM jian he  wrote:
> https://wiki.postgresql.org/wiki/Todo
> Dates and Times[edit]
> Allow infinite intervals just like infinite timestamps
> https://www.postgresql.org/message-id/4eb095c8.1050...@agliodbs.com
>
> this done at
> https://git.postgresql.org/cgit/postgresql.git/commit/?id=519fc1bd9e9d7b408903e44f55f83f6db30742b7
>
> Should we remove this item?

If the item is done, you should edit the wiki page and mark it that
way. Note this note at the top of the page:

[D] Completed item - marks changes that are done, and will appear in
the PostgreSQL 17 release.

Note that the Todo list is not a very good Todo list and most people
don't use it to find projects (and haven't for a long time). So it may
not get updated, or consulted, very often.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Nathan Bossart
(moving to a new thread)

On Thu, May 16, 2024 at 09:16:46PM -0500, Nathan Bossart wrote:
> On Thu, May 16, 2024 at 04:37:10PM +, Imseih (AWS), Sami wrote:
>> Also, not sure if I am mistaken here, but the "+ 5" in the existing docs
>> seems wrong.
>>  
>> If it refers to NUM_AUXILIARY_PROCS defined in 
>> include/storage/proc.h, it should be a "6"
>> 
>> #define NUM_AUXILIARY_PROCS 6
>> 
>> This is not a consequence of this patch, and can be dealt with
>> in a separate thread if my understanding is correct.
> 
> Ha, I think it should actually be "+ 7"!  The value is calculated as
> 
>   MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + 
> max_wal_senders + 6
> 
> Looking at the history, this documentation tends to be wrong quite often.
> In v9.2, the checkpointer was introduced, and these formulas were not
> updated.  In v9.3, background worker processes were introduced, and the
> formulas were still not updated.  Finally, in v9.6, it was fixed in commit
> 597f7e3.  Then, in v14, the archiver process was made an auxiliary process
> (commit d75288f), making the formulas out-of-date again.  And in v17, the
> WAL summarizer was added.
> 
> On top of this, IIUC you actually need even more semaphores if your system
> doesn't support atomics, and from a quick skim this doesn't seem to be
> covered in this documentation.
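
To make that concrete: with the default settings (max_connections = 100,
autovacuum_max_workers = 3, max_worker_processes = 8, max_wal_senders = 10),
the formula above works out to

    100 + 3 + 1 + 8 + 10 + 6 = 128 semaphores

not counting the extra semaphores needed on systems without atomics.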

A couple of other problems I noticed:

* max_wal_senders is missing from this sentence:

When using System V semaphores,
PostgreSQL uses one semaphore per allowed 
connection
(), allowed autovacuum worker process
() and allowed background
process (), in sets of 16.

* AFAICT the discussion about the formulas in the paragraphs following the
  table doesn't explain the reason for the constant.

* IMHO the following sentence is difficult to decipher, and I can't tell if
  it actually matches the formula in the table:

The maximum number of semaphores in the system
is set by SEMMNS, which consequently must be at least
as high as max_connections plus
autovacuum_max_workers plus 
max_wal_senders,
plus max_worker_processes, plus one extra for each 16
allowed connections plus workers (see the formula in ).

At a bare minimum, we should probably fix the obvious problems, but I
wonder if we could simplify this section a bit, too.  If the exact values
are important, maybe we could introduce more GUCs like
shared_memory_size_in_huge_pages that can be consulted (instead of
requiring users to break out their calculators).

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Why is citext/regress failing on hamerkop?

2024-05-17 Thread Tom Lane
TAKATSUKA Haruka  writes:
> I'm a hamerkop maintainer.
> Sorry I have missed the scm error for so long.

> Today I switched scmrepo from git.postgresql.org/git/postgresql.git 
> to github.com/postgres/postgres.git and successfully modernized
> the build target code.

Thanks very much!  I see hamerkop has gone green in HEAD.

It looks like it succeeded in v13 too but failed in v12,
which suggests that the isolationcheck problem is intermittent,
which is not too surprising given our current theory about
what's causing that.

At this point I think we are too close to the 17beta1 release
freeze to mess with it, but I'd support pushing Thomas'
proposed patch after the freeze is over.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
"David G. Johnston"  writes:
> On Friday, May 17, 2024, Joe Conway  wrote:
>> A solution to both of these issues (yours and mine) would be to allow
>> things to be added *during* the CF month. What is the point of having a
>> "freeze" before every CF anyway? Especially if they start out clean. If
>> something is ready for review on day 8 of the CF, why not let it be added
>> for review?

> In conjunction with WIP removing this limitation on the bimonthlies makes
> sense to me.

I think that the original motivation for this was two-fold:

1. A notion of fairness, that you shouldn't get your patch reviewed
ahead of others that have been waiting (much?) longer.  I'm not sure
how much this is really worth.  In particular, even if we do delay
an incoming patch by one CF, it's still going to compete with the
older stuff in future CFs.  There's already a sort feature in the CF
dashboard whereby patches that have lingered for more CFs appear ahead
of patches that are newer, so maybe just ensuring that late-arriving
patches sort as "been around for 0 CFs" is sufficient.

2. As I mentioned a bit ago, the original idea was that we didn't exit
a CF until it was empty of un-handled patches, so obviously allowing
new patches to come in would mean we'd never get to empty.  That idea
didn't work and we don't think that way anymore.

So yeah, I'm okay with abandoning the must-submit-to-a-future-CF
restriction.

regards, tom lane




RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-05-17 Thread Amonson, Paul D
Hi, forgive the top-post, but I have not seen any response to this post.

Thanks,
Paul

> -Original Message-
> From: Amonson, Paul D
> Sent: Wednesday, May 1, 2024 8:56 AM
> To: pgsql-hackers@lists.postgresql.org
> Cc: Nathan Bossart ; Shankaran, Akash
> 
> Subject: Proposal for Updating CRC32C with AVX-512 Algorithm.
> 
> Hi,
> 
> Comparing the current SSE4.2 implementation of the CRC32C algorithm in
> Postgres, to an optimized AVX-512 algorithm [0] we observed significant
> gains. The result was a ~6.6X average multiplier of increased performance
> measured on 3 different Intel products. Details below. The AVX-512 algorithm
> in C is a port of the ISA-L library [1] assembler code.
> 
> Workload call size distribution details (write heavy):
>* Average was approximately 1,010 bytes per call
>* ~80% of the calls were under 256 bytes
>* ~20% of the calls were greater than or equal to 256 bytes, up to the max
> buffer size of 8192
> 
> The 256 bytes is important because if the buffer is smaller, it makes sense to
> fall back to the existing implementation. This is because the AVX-512 algorithm
> needs a minimum of 256 bytes to operate.
> 
> Using the above workload data distribution:
> at 0%    calls < 256 bytes, an 841% improvement on average for crc32c
> functionality was observed.
> at 50%   calls < 256 bytes, a 758% improvement on average for crc32c
> functionality was observed.
> at 90%   calls < 256 bytes, a 44% improvement on average for crc32c
> functionality was observed.
> at 97.6% calls < 256 bytes, the workload's crc32c performance breaks even.
> at 100%  calls < 256 bytes, a 14% regression is seen when using the AVX-512
> implementation.
> 
> The results above are averages over 3 machines, and were measured on: Intel
> Sapphire Rapids bare metal, and using EC2 on AWS cloud: Intel Sapphire Rapids
> (m7i.2xlarge) and Intel Ice Lake (m6i.2xlarge).
> 
> Summary Data (Sapphire Rapids bare metal, AWS m7i-2xl, and AWS m6i-2xl):
> +---------------------+---------+---------+---------+---------+---------+---------+------------+
> | Rates in Bytes/us   |     Bare Metal    |    AWS m6i-2xl    |    AWS m7i-2xl    |  Overall   |
> | (Larger is Better)  | SSE 4.2 | AVX-512 | SSE 4.2 | AVX-512 | SSE 4.2 | AVX-512 | Multiplier |
> +---------------------+---------+---------+---------+---------+---------+---------+------------+
> | Numbers 256-8192    |  12,046 |  83,196 |   7,471 |  39,965 |  11,867 |  84,589 |    6.62    |
> | Numbers 64 - 255    |  16,865 |  15,909 |   9,209 |   7,363 |  12,496 |  10,046 |    0.86    |
> +---------------------+---------+---------+---------+---------+---------+---------+------------+
> | Weighted Multiplier [*]                                                         |    1.44    |
> +---------------------+---------+---------+---------+---------+---------+---------+------------+
> There was no evidence of AVX-512 frequency throttling from perf data, which
> stayed steady during the test.
> 
> Feedback on this proposed improvement is appreciated. Some questions:
> 1) This AVX-512 ISA-L derived code uses BSD-3 license [2]. Is this compatible
> with the PostgreSQL License [3]? They both appear to be very permissive
> licenses, but I am not an expert on licenses.
> 2) Is there a preferred benchmark I should run to test this change?
> 
> If licensing is a non-issue, I can post the initial patch along with my
> Postgres benchmark function patch for further review.
> 
> Thanks,
> Paul
> 
> [0]
> https://www.researchgate.net/publication/263424619_Fast_CRC_computati
> on#full-text
> [1] https://github.com/intel/isa-l
> [2] https://opensource.org/license/bsd-3-clause
> [3] https://opensource.org/license/postgresql
> 
> [*] Weights used were 90% of requests less than 256 bytes, 10% greater than
> or equal to 256 bytes.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 17:59, Robert Haas  wrote:
> If they
> get attention, it's much more likely to be because somebody saw their
> email and wrote back than it is to be because somebody went through
> the CommitFest and found their entry and was like "oh, I should review
> this".

I think this is an important insight. I used the commitfest app to
find patches to review when I was just starting out in postgres
development, since I didn't subscribe to all of pgsql-hackers yet. I
do subscribe nowadays, but now I rarely look at the commitfest app;
instead I skim the titles of emails that go into my "Postgres" folder
in my mailbox. Going from such an email to a commitfest entry is near
impossible (at least I don't know how to do this efficiently). I guess
I'm not the only one doing this.

> Honestly, if we get to a situation where a patch author is sad
> because they have to click a link every 2 months to say "yeah, I'm
> still here, please review my patch," we've already lost the game. That
> person isn't sad because we asked them to click a link. They're sad
> it's already been N * 2 months and nothing has happened.

Maybe it wouldn't be so bad for an author to click the 2-months
button if doing so would actually give their patch a higher chance of
being reviewed. And given the previous insight, that
people don't look at the commitfest app that often, it might be good
if pressing this button would also bump the item in people's
mailboxes.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Joe Conway

On 5/17/24 11:58, Robert Haas wrote:

> I realize that many people here are (rightly!) concerned with
> burdening patch authors with more steps that they have to follow. But
> the current system is serving new patch authors very poorly. If they
> get attention, it's much more likely to be because somebody saw their
> email and wrote back than it is to be because somebody went through
> the CommitFest and found their entry and was like "oh, I should review
> this". Honestly, if we get to a situation where a patch author is sad
> because they have to click a link every 2 months to say "yeah, I'm
> still here, please review my patch," we've already lost the game. That
> person isn't sad because we asked them to click a link. They're sad
> it's already been N * 2 months and nothing has happened.


+many

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: broken tables on hot standby after migration on PostgreSQL 16 (3x times last month)

2024-05-17 Thread Peter Geoghegan
On Fri, May 17, 2024 at 9:13 AM Pavel Stehule  wrote:
> after migration to PostgreSQL 16 I have seen broken tables on replica nodes
> three times (about once every week). The query fails with error
>
> ERROR:  could not access status of transaction 1442871302
> DETAIL:  Could not open file "pg_xact/0560": No such file or directory

You've shown an inconsistency between the primary and standby with
respect to the heap tuple infomask bits related to freezing. It looks
like a FREEZE WAL record from the primary was never replayed on the
standby.

It's natural for me to wonder if my Postgres 16 work on page-level
freezing might be a factor here. If that really was true, then it
would be necessary to explain why the primary and standby are
inconsistent (no reason to suspect a problem on the primary here).
It'd have to be the kind of issue that could be detected mechanically
using wal_consistency_checking, but wasn't detected that way before
now -- that seems unlikely.

It's worth considering if the more aggressive behavior around
relfrozenxid advancement (in 15) and freezing (in 16) has increased
the likelihood of problems like these in setups that were already
faulty, in whatever way. The standby database is indeed corrupt, but
even on 16 it's fairly isolated corruption in practical terms. The
full extent of the problem is clear once amcheck is run, but only one
tuple can actually cause the system to error due to the influence of
hint bits (for better or worse, hint bits mask the problem quite well,
even on 16).
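
(For anyone wanting to check their own standby for this class of problem,
the heap checks can be run with something along these lines -- table name
hypothetical:

    CREATE EXTENSION IF NOT EXISTS amcheck;
    SELECT * FROM verify_heapam('some_table');

or across a whole cluster with the pg_amcheck frontend.)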

-- 
Peter Geoghegan




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 11:57 AM Tom Lane  wrote:
> Melanie Plageman  writes:
> > So, anyway, I'd argue that we need a parking lot for patches which we
> > all agree are important and have a path forward but need someone to do
> > the last 20-80% of the work. To avoid this being a dumping ground,
> > patches should _only_ be allowed in the parking lot if they have a
> > clear path forward. Patches which haven't gotten any interest don't go
> > there. Patches in which the author has clearly not addressed feedback
> > that is reasonable for them to address don't go there. These are
> > effectively community TODOs which we agree need to be done -- if only
> > someone had the time.
>
> Hmm.  I was envisioning "parking lot" as meaning "this is on my
> personal TODO list, and I'd like CI support for it, but I'm not
> expecting anyone else to pay attention to it yet".

+1.

> I think what
> you are describing is valuable but different.  Maybe call it
> "pending" or such?  Or invent a different name for the other thing.

Yeah, there should be someplace that we keep a list of things that are
thought to be important but we haven't gotten around to doing anything
about yet, but I think that's different from the parking lot CF.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 11:05 AM Tom Lane  wrote:
> > We already have gone back to that model. We just haven't admitted it
> > yet. And we're never going to get out of it until we find a way to get
> > the contents of the CommitFest application down to a more reasonable
> > size and level of complexity. There's just no way everyone's up for
> > that level of pain. I'm not sure I'm up for that level of pain.
>
> Yeah, we clearly need to get the patch list to a point of
> manageability, but I don't agree that abandoning time-boxed CFs
> will improve anything.

I'm not sure. Suppose we plotted commits generally, or commits of
non-committer patches, or reviews on-list, vs. time. Would we see any
uptick in activity during CommitFests? Would it vary by committer? I
don't know. I bet the difference wouldn't be as much as Tom Lane would
like to see. Realistically, we can't manage how contributors spend
their time all that much, and trying to do so is largely tilting at
windmills.

To me, the value of time-based CommitFests is as a vehicle to ensure
freshness, or cleanup, or whatever word you want to use. If you just
make a list of things that need attention and keep incrementally
updating it, eventually for various reasons that list no longer
reflects your current list of priorities. At some point, you have to
throw it out and make a new list, or at least that's what always
happens to me. We've fallen into a system where the default treatment
of a patch is to be carried over to the next CommitFest and CfMs are
expected to exert effort to keep patches from getting that default
treatment when they shouldn't. But that does not scale. We need a
system where things drop off the list unless somebody makes an effort
to keep them on the list, and the easiest way to do that is to
periodically make a *fresh* list that *doesn't* just clone some
previous list.

I realize that many people here are (rightly!) concerned with
burdening patch authors with more steps that they have to follow. But
the current system is serving new patch authors very poorly. If they
get attention, it's much more likely to be because somebody saw their
email and wrote back than it is to be because somebody went through
the CommitFest and found their entry and was like "oh, I should review
this". Honestly, if we get to a situation where a patch author is sad
because they have to click a link every 2 months to say "yeah, I'm
still here, please review my patch," we've already lost the game. That
person isn't sad because we asked them to click a link. They're sad
it's already been N * 2 months and nothing has happened.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Melanie Plageman  writes:
> So, anyway, I'd argue that we need a parking lot for patches which we
> all agree are important and have a path forward but need someone to do
> the last 20-80% of the work. To avoid this being a dumping ground,
> patches should _only_ be allowed in the parking lot if they have a
> clear path forward. Patches which haven't gotten any interest don't go
> there. Patches in which the author has clearly not addressed feedback
> that is reasonable for them to address don't go there. These are
> effectively community TODOs which we agree need to be done -- if only
> someone had the time.

Hmm.  I was envisioning "parking lot" as meaning "this is on my
personal TODO list, and I'd like CI support for it, but I'm not
expecting anyone else to pay attention to it yet".  I think what
you are describing is valuable but different.  Maybe call it
"pending" or such?  Or invent a different name for the other thing.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jacob Champion
On Fri, May 17, 2024 at 7:40 AM Robert Haas  wrote:
>
> On Fri, May 17, 2024 at 10:31 AM Tom Lane  wrote:
> > To my mind, the point of the time-boxed commitfests is to provide
> > a structure wherein people will (hopefully) pay some actual attention
> > to other peoples' patches.  Conversely, the fact that we don't have
> > one running all the time gives committers some defined intervals
> > where they can work on their own stuff without feeling guilty that
> > they're not handling other people's patches.
> >
> > If we go back to the old its-development-mode-all-the-time approach,
> > what is likely to happen is that the commit rate for not-your-own-
> > patches goes to zero, because it's always possible to rationalize
> > your own stuff as being more important.
>
> We already have gone back to that model. We just haven't admitted it
> yet.

I've worked on teams that used the short-timebox CF calendar to
organize community work, like Tom describes. That was a really
positive thing for us.

Maybe it feels different from the committer point of view, but I don't
think all of the community is operating on the long-timebox model, and
I really wouldn't want to see us lengthen the cycles to try to get
around the lack of review/organization that's being complained about.
(But maybe you're not arguing for that in the first place.)

--Jacob




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Melanie Plageman
On Fri, May 17, 2024 at 7:11 AM Tomas Vondra
 wrote:
>
> On 5/16/24 23:43, Peter Eisentraut wrote:
> > On 16.05.24 23:06, Joe Conway wrote:
> >> On 5/16/24 16:57, Jacob Champion wrote:
> >>> On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:
>  Maybe we should just make it a policy that *nothing* gets moved forward
>  from commitfest-to-commitfest and therefore the author needs to care
>  enough to register for the next one?
> >>>
> >>> I think that's going to severely disadvantage anyone who doesn't do
> >>> this as their day job. Maybe I'm bristling a bit too much at the
> >>> wording, but not having time to shepherd a patch is not the same as
> >>> not caring.
> >>
> >> Maybe the word "care" was a poor choice, but forcing authors to think
> >> about and decide if they have the "time to shepherd a patch" for the
> >> *next CF* is exactly the point. If they don't, why clutter the CF with
> >> it.
> >
> > Objectively, I think this could be quite effective.  You need to prove
> > your continued interest in your project by pressing a button every two
> > months.
> >
> > But there is a high risk that this will double the annoyance for
> > contributors whose patches aren't getting reviews.  Now, not only are
> > you being ignored, but you need to prove that you're still there every
> > two months.
> >
>
> Yeah, I 100% agree with this. If a patch bitrots and no one cares enough
> to rebase it once in a while, then sure - it's probably fine to mark it
> RwF. But forcing all contributors to do a dance every 2 months just to
> have a chance someone might take a look, seems ... not great.
>
> I try to see this from the contributors' PoV, and with this process I'm
> sure I'd start questioning if I even want to submit patches.

Agreed.

> That is not to say we don't have a problem with patches that just move
> to the next CF, and that we don't need to do something about that ...
>
> Incidentally, I've been preparing some git+CF stats because of a talk
> I'm expected to do, and it's very obvious the number of committed (and
> rejected) CF entries is more or very stable over time, while the number
> of patches that move to the next CF just snowballs.
>
> My impression is a lot of these contributions/patches just never get the
> review & attention that would allow them to move forward. Sure, some do
> bitrot and/or get abandoned, and let's RwF those. But forcing everyone
> to re-register the patches over and over seems like "reject by default".
> I'd expect a lot of people to stop bothering and give up, and in a way
> that would "solve" the bottleneck. But I'm not sure it's the solution
> we'd like ...

I don't think we should reject by default. It is discouraging and it
is already hard enough as it is to contribute.

> It does seem to me a part of the solution needs to be helping to get
> those patches reviewed. I don't know how to do that, but perhaps there's
> a way to encourage people to review more stuff, or review stuff from a
> wider range of contributors. Say by treating reviews more like proper
> contributions.

One reason I support the parking lot idea is for patches like the one
in [1]. EXPLAIN for parallel bitmap heap scan is just plain broken.
The patch in this commitfest entry is functionally 80% of the way
there. It just needs someone to do the rest of the work to make it
committable. I actually think it is unreasonable of us to expect the
original author to do this. I have had it on my list for weeks to get
back around to helping with this patch. However, I spent the better
part of my coding time in the last two weeks trying to reproduce and
fix a bug on stable branches that causes vacuum processes to
infinitely loop. Arguably that is a bigger problem. Because I knew
this EXPLAIN patch was slipping down my TODO list, I changed the patch
to "waiting on author", but I honestly don't think the original author
should have to do the rest of the work.

Should I spend more time on this patch reviewing it and moving it
forward? Yes. Maybe I'm just too slow at writing postgres code or I
have bad time management or I should spend  less time doing things
like figuring out how many lavalier mics we need in each room for
PGConf.dev. I don't know. But it is hard for me to figure out how to
do more review work and guarantee that this kind of thing won't
happen.

So, anyway, I'd argue that we need a parking lot for patches which we
all agree are important and have a path forward but need someone to do
the last 20-80% of the work. To avoid this being a dumping ground,
patches should _only_ be allowed in the parking lot if they have a
clear path forward. Patches which haven't gotten any interest don't go
there. Patches in which the author has clearly not addressed feedback
that is reasonable for them to address don't go there. These are
effectively community TODOs which we agree need to be done -- if only
someone had the time.

- Melanie

[1] https://commitfest.postgresql.org/48/4248/




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-17 Thread Tom Lane
Peter Eisentraut  writes:
> On 16.05.24 16:45, Tom Lane wrote:
>> Yeah, that was bothering me too, but I went for the minimum delta.
>> I did think that a couple of integer macros would be a better idea,
>> so +1 for what you did here.

> I committed this, and Michael took care of WaitEventExtension, so we 
> should be all clear here.

Thanks.  I just made the committed typedefs.list exactly match the
current buildfarm output, so we're clean for now.

regards, tom lane




Re: WAL record CRC calculated incorrectly because of underlying buffer modification

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 10:56 AM Jeff Davis  wrote:
> I'm still not entirely clear on why hash indexes can't just follow the
> rules and exclusive lock the buffer and dirty it. Presumably
> performance would suffer, but I asked that question previously and
> didn't get an answer:
>
> https://www.postgresql.org/message-id/CA%2BTgmoY%2BdagCyrMKau7UQeQU6w4LuVEu%2ByjsmJBoXKAo6XbUUA%40mail.gmail.com

In my defense, the last time I worked on hash indexes was 7 years ago.
If this question had come up within a year or two of that work, I
probably would have both (a) had a much clearer idea of what the
answer was and (b) felt obliged to drop everything and go research it
if I didn't. But at this point, I feel like it's fair for me to tell
you what I know and leave it to you to do further research if you feel
like that's warranted. I know that we're each responsible for what we
commit, but I don't really think that should extend to having to
prioritize answering a hypothetical question ("what would happen if X
thing worked like Y instead of the way it does?") about an area I
haven't touched in long enough for every release that doesn't contain
those commits to be out of support. If you feel otherwise, let's have
that argument, but I have a feeling that it may be more that you're
hoping I have some kind of oracular powers which, in reality, I lack.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: improve performance of pg_dump --binary-upgrade

2024-05-17 Thread Nathan Bossart
rebased

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From ded5e61ff631c2d02835fdba941068dcd86741ce Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 18 Apr 2024 15:15:19 -0500
Subject: [PATCH v5 1/2] Remove is_index parameter from
 binary_upgrade_set_pg_class_oids().

---
 src/bin/pg_dump/pg_dump.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index e324070828..0fbb8e8831 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -324,7 +324,7 @@ static void binary_upgrade_set_type_oids_by_rel(Archive *fout,
 const TableInfo *tbinfo);
 static void binary_upgrade_set_pg_class_oids(Archive *fout,
 			 PQExpBuffer upgrade_buffer,
-			 Oid pg_class_oid, bool is_index);
+			 Oid pg_class_oid);
 static void binary_upgrade_extension_member(PQExpBuffer upgrade_buffer,
 			const DumpableObject *dobj,
 			const char *objtype,
@@ -5391,8 +5391,7 @@ binary_upgrade_set_type_oids_by_rel(Archive *fout,
 
 static void
 binary_upgrade_set_pg_class_oids(Archive *fout,
- PQExpBuffer upgrade_buffer, Oid pg_class_oid,
- bool is_index)
+ PQExpBuffer upgrade_buffer, Oid pg_class_oid)
 {
 	PQExpBuffer upgrade_query = createPQExpBuffer();
 	PGresult   *upgrade_res;
@@ -5441,7 +5440,8 @@ binary_upgrade_set_pg_class_oids(Archive *fout,
 	appendPQExpBufferStr(upgrade_buffer,
 		 "\n-- For binary upgrade, must preserve pg_class oids and relfilenodes\n");
 
-	if (!is_index)
+	if (relkind != RELKIND_INDEX &&
+		relkind != RELKIND_PARTITIONED_INDEX)
 	{
 		appendPQExpBuffer(upgrade_buffer,
 		  "SELECT pg_catalog.binary_upgrade_set_next_heap_pg_class_oid('%u'::pg_catalog.oid);\n",
@@ -11668,7 +11668,7 @@ dumpCompositeType(Archive *fout, const TypeInfo *tyinfo)
 		binary_upgrade_set_type_oids_by_type_oid(fout, q,
  tyinfo->dobj.catId.oid,
  false, false);
-		binary_upgrade_set_pg_class_oids(fout, q, tyinfo->typrelid, false);
+		binary_upgrade_set_pg_class_oids(fout, q, tyinfo->typrelid);
 	}
 
 	qtypname = pg_strdup(fmtId(tyinfo->dobj.name));
@@ -15802,7 +15802,7 @@ dumpTableSchema(Archive *fout, const TableInfo *tbinfo)
 
 		if (dopt->binary_upgrade)
 			binary_upgrade_set_pg_class_oids(fout, q,
-			 tbinfo->dobj.catId.oid, false);
+			 tbinfo->dobj.catId.oid);
 
 		appendPQExpBuffer(q, "CREATE VIEW %s", qualrelname);
 
@@ -15904,7 +15904,7 @@ dumpTableSchema(Archive *fout, const TableInfo *tbinfo)
 
 		if (dopt->binary_upgrade)
 			binary_upgrade_set_pg_class_oids(fout, q,
-			 tbinfo->dobj.catId.oid, false);
+			 tbinfo->dobj.catId.oid);
 
 		appendPQExpBuffer(q, "CREATE %s%s %s",
 		  tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED ?
@@ -16755,7 +16755,7 @@ dumpIndex(Archive *fout, const IndxInfo *indxinfo)
 
 		if (dopt->binary_upgrade)
 			binary_upgrade_set_pg_class_oids(fout, q,
-			 indxinfo->dobj.catId.oid, true);
+			 indxinfo->dobj.catId.oid);
 
 		/* Plain secondary index */
 		appendPQExpBuffer(q, "%s;\n", indxinfo->indexdef);
@@ -17009,7 +17009,7 @@ dumpConstraint(Archive *fout, const ConstraintInfo *coninfo)
 
 		if (dopt->binary_upgrade)
 			binary_upgrade_set_pg_class_oids(fout, q,
-			 indxinfo->dobj.catId.oid, true);
+			 indxinfo->dobj.catId.oid);
 
 		appendPQExpBuffer(q, "ALTER %sTABLE ONLY %s\n", foreign,
 		  fmtQualifiedDumpable(tbinfo));
@@ -17403,7 +17403,7 @@ dumpSequence(Archive *fout, const TableInfo *tbinfo)
 	if (dopt->binary_upgrade)
 	{
 		binary_upgrade_set_pg_class_oids(fout, query,
-		 tbinfo->dobj.catId.oid, false);
+		 tbinfo->dobj.catId.oid);
 
 		/*
 		 * In older PG versions a sequence will have a pg_type entry, but v14
-- 
2.25.1

>From 7ff5168f5984865bd405e5d53dc6a190f989e7cd Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 22 Apr 2024 13:21:18 -0500
Subject: [PATCH v5 2/2] Improve performance of pg_dump --binary-upgrade.

---
 src/bin/pg_dump/pg_dump.c| 141 +--
 src/tools/pgindent/typedefs.list |   1 +
 2 files changed, 96 insertions(+), 46 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0fbb8e8831..7b8ddc6443 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -55,6 +55,7 @@
 #include "catalog/pg_trigger_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/connect.h"
+#include "common/int.h"
 #include "common/relpath.h"
 #include "compress_io.h"
 #include "dumputils.h"
@@ -92,6 +93,17 @@ typedef struct
 	int			objsubid;		/* subobject (table column #) */
 } SecLabelItem;
 
+typedef struct
+{
+	Oid			oid;			/* object OID */
+	char		relkind;		/* object kind */
+	RelFileNumber relfilenode;	/* object filenode */
+	Oid			reltoastrelid;	/* toast table OID */
+	RelFileNumber toast_relfilenode;	/* toast table filenode */
+	Oid			indexrelid;		/* toast table 

Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 17, 2024 at 10:31 AM Tom Lane  wrote:
>> If we go back to the old its-development-mode-all-the-time approach,
>> what is likely to happen is that the commit rate for not-your-own-
>> patches goes to zero, because it's always possible to rationalize
>> your own stuff as being more important.

> We already have gone back to that model. We just haven't admitted it
> yet. And we're never going to get out of it until we find a way to get
> the contents of the CommitFest application down to a more reasonable
> size and level of complexity. There's just no way everyone's up for
> that level of pain. I'm not sure I'm up for that level of pain.

Yeah, we clearly need to get the patch list to a point of
manageability, but I don't agree that abandoning time-boxed CFs
will improve anything.

regards, tom lane




Re: WAL record CRC calculated incorrectly because of underlying buffer modification

2024-05-17 Thread Jeff Davis
On Fri, 2024-05-17 at 10:12 +0900, Michael Paquier wrote:
> That's something I've done four weeks ago in the hash replay code
> path, and having the image with XLR_CHECK_CONSISTENCY even if
> REGBUF_NO_CHANGE was necessary because replay was setting up a LSN on
> a REGBUF_NO_CHANGE page it should not have touched.

Then the candidate fix to selectively break XLR_CHECK_CONSISTENCY is
not acceptable.

> 
> Yeah, agreed that getting rid of REGBUF_NO_CHANGE would be nice in the
> final picture.  It still strikes me as a weird concept that WAL replay
> for hash indexes logs full pages just to be able to lock them at
> replay based on what's in the records.  :/

I'm still not entirely clear on why hash indexes can't just follow the
rules and exclusive lock the buffer and dirty it. Presumably
performance would suffer, but I asked that question previously and
didn't get an answer:

https://www.postgresql.org/message-id/CA%2BTgmoY%2BdagCyrMKau7UQeQU6w4LuVEu%2ByjsmJBoXKAo6XbUUA%40mail.gmail.com

And if that does affect performance, what about following the same
protocol as XLogSaveBufferForHint()?

Regards,
Jeff Davis





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 10:31 AM Tom Lane  wrote:
> To my mind, the point of the time-boxed commitfests is to provide
> a structure wherein people will (hopefully) pay some actual attention
> to other peoples' patches.  Conversely, the fact that we don't have
> one running all the time gives committers some defined intervals
> where they can work on their own stuff without feeling guilty that
> they're not handling other people's patches.
>
> If we go back to the old its-development-mode-all-the-time approach,
> what is likely to happen is that the commit rate for not-your-own-
> patches goes to zero, because it's always possible to rationalize
> your own stuff as being more important.

We already have gone back to that model. We just haven't admitted it
yet. And we're never going to get out of it until we find a way to get
the contents of the CommitFest application down to a more reasonable
size and level of complexity. There's just no way everyone's up for
that level of pain. I'm not sure I'm up for that level of pain.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 17, 2024 at 9:54 AM Joe Conway  wrote:
>> I agree with you. Before commitfests were a thing, we had no structure
>> at all as I recall. The dates for the dev cycle were more fluid as I
>> recall, and we had no CF app to track things. Maybe retaining the
>> structure but going back to the continuous development would be just the
>> thing, with the CF app tracking just the currently (as of this
>> week/month/???) active stuff.

> The main thing that we'd gain from that is to avoid all the work of
> pushing lots of things forward to the next CommitFest at the end of
> every CommitFest.

To my mind, the point of the time-boxed commitfests is to provide
a structure wherein people will (hopefully) pay some actual attention
to other peoples' patches.  Conversely, the fact that we don't have
one running all the time gives committers some defined intervals
where they can work on their own stuff without feeling guilty that
they're not handling other people's patches.

If we go back to the old its-development-mode-all-the-time approach,
what is likely to happen is that the commit rate for not-your-own-
patches goes to zero, because it's always possible to rationalize
your own stuff as being more important.

> Like, we could also just have a button that says "move everything
> that's left to the next CommitFest".

Clearly, CFMs would appreciate some more tooling to make that sort
of thing easier.  Perhaps we omitted it in the original CF app
coding because we expected the end-of-CF backlog to be minimal,
but it's now obvious that it generally isn't.

BTW, I was reminded while trawling old email yesterday that
we used to freeze the content of a CF at its start and then
hold the CF open until the backlog actually went to zero,
which resulted in multi-month death-march CFs and no clarity
at all as to release timing.  Let's not go back to that.
But the CF app was probably built with that model in mind.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Peter Geoghegan
On Fri, May 17, 2024 at 9:09 AM Peter Eisentraut  wrote:
> What would the development process look like if we
> just had one commitfest per dev cycle that runs from July 1st to March 31st?

Exactly the same?

-- 
Peter Geoghegan




Re: psql JSON output format

2024-05-17 Thread Pavel Stehule
On Fri, May 17, 2024 at 16:04, Robert Haas wrote:

> On Fri, May 17, 2024 at 9:42 AM Christoph Berg  wrote:
> > Thanks for summarizing the thread.
> >
> > Things mentioned in the thread:
> >
> > 1) rendering of SQL NULLs - include or omit the column
> >
> > 2) rendering of JSON values - both "quoted string" and "inline as
> >JSON" make sense
> >
> > 3) not quoting numeric values and booleans
> >
> > 4) no special treatment of other datatypes like arrays or compound
> >values, just quote them
> >
> > 5) row format: JSON object or array (array would be close to CSV
> >format)
> >
> > 6) overall format: array of rows, or simply print each row separately
> >("JSON Lines" format, https://jsonlines.org/)
> >
> > I think 1, 2 and perhaps 6 make sense to have configurable. Two or
> > three \pset options (or one option with a list of flags) don't sound
> > too bad complexity-wise.
> >
> > Or maybe just default to "omit NULL columns" and "inline JSON" (and
> > render null as NULL).
>
> If we're going to just have one option, I agree with making that the
> default, and I'd default to an array of row objects. If we're going to
> have something configurable, I'd at least consider making (4)
> configurable.
>
> It's tempting to just have one option, like \pset jsonformat
>
> nullcolumns=omit;inlinevalues=json,array;rowformat=object;resultcontainer=array
> simply because adding a ton of new options just for this isn't very
> appealing. But looking at how long that is, it's probably not a great
> idea. So I guess separate options is probably better?
>

+1 for separate options

A lot of these proposed options can be used for XML too.

Regards

Pavel


> --
> Robert Haas
> EDB: http://www.enterprisedb.com
>
>
>


Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 9:54 AM Joe Conway  wrote:
> I agree with you. Before commitfests were a thing, we had no structure
> at all as I recall. The dates for the dev cycle were more fluid as I
> recall, and we had no CF app to track things. Maybe retaining the
> structure but going back to the continuous development would be just the
> thing, with the CF app tracking just the currently (as of this
> week/month/???) active stuff.

The main thing that we'd gain from that is to avoid all the work of
pushing lots of things forward to the next CommitFest at the end of
every CommitFest. While I agree that we need to find a better way to
handle that, I don't think it's really the biggest problem. The core
problems here are (1) keeping stuff out of CommitFests that don't
belong there and (2) labelling stuff that does belong in the
CommitFest in useful ways. We should shape the solution around those
problems. Maybe that will solve this problem along the way, but if it
doesn't, that's easy enough to fix afterward.

Like, we could also just have a button that says "move everything
that's left to the next CommitFest". That, too, would avoid the manual
work that this would avoid. But it wouldn't solve any other problems,
so it's not really worth much consideration.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: psql JSON output format

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 9:42 AM Christoph Berg  wrote:
> Thanks for summarizing the thread.
>
> Things mentioned in the thread:
>
> 1) rendering of SQL NULLs - include or omit the column
>
> 2) rendering of JSON values - both "quoted string" and "inline as
>JSON" make sense
>
> 3) not quoting numeric values and booleans
>
> 4) no special treatment of other datatypes like arrays or compound
>values, just quote them
>
> 5) row format: JSON object or array (array would be close to CSV
>format)
>
> 6) overall format: array of rows, or simply print each row separately
>("JSON Lines" format, https://jsonlines.org/)
>
> I think 1, 2 and perhaps 6 make sense to have configurable. Two or
> three \pset options (or one option with a list of flags) don't sound
> too bad complexity-wise.
>
> Or maybe just default to "omit NULL columns" and "inline JSON" (and
> render null as NULL).

If we're going to just have one option, I agree with making that the
default, and I'd default to an array of row objects. If we're going to
have something configurable, I'd at least consider making (4)
configurable.

It's tempting to just have one option, like \pset jsonformat
nullcolumns=omit;inlinevalues=json,array;rowformat=object;resultcontainer=array
simply because adding a ton of new options just for this isn't very
appealing. But looking at how long that is, it's probably not a great
idea. So I guess separate options is probably better?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Joe Conway

On 5/17/24 09:08, Peter Eisentraut wrote:

> On 17.05.24 14:42, Joe Conway wrote:
>> Namely, the week before commitfest I don't actually know if I will
>> have the time during that month, but I will make sure my patch is in
>> the commitfest just in case I get a few clear days to work on it.
>> Because if it isn't there, I can't take advantage of those "found" hours.
>>
>> A solution to both of these issues (yours and mine) would be to allow
>> things to be added *during* the CF month. What is the point of having a
>> "freeze" before every CF anyway? Especially if they start out clean. If
>> something is ready for review on day 8 of the CF, why not let it be
>> added for review?
>
> Maybe this all indicates that the idea of bimonthly commitfests has run
> its course.  The original idea might have been, if we have like 50
> patches, we can process all of them within a month.  We're clearly not
> doing that anymore.  What would the development process look like if we
> just had one commitfest per dev cycle that runs from July 1st to March 31st?


What's old is new again? ;-)

I agree with you. Before commitfests were a thing, we had no structure 
at all as I recall. The dates for the dev cycle were more fluid as I 
recall, and we had no CF app to track things. Maybe retaining the 
structure but going back to the continuous development would be just the 
thing, with the CF app tracking just the currently (as of this 
week/month/???) active stuff.



--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: psql: Allow editing query results with \gedit

2024-05-17 Thread Robert Haas
On Fri, May 17, 2024 at 9:24 AM Christoph Berg  wrote:
> Tom> The stated complaint was "it's too hard to build UPDATE commands",
> Tom> which I can sympathize with.
>
> ... which this would perfectly address - it's building the commands.
>
> The editor will have a bit more clutter (the UPDATE SET WHERE
> boilerplate), and the fields are somewhat out of order (the key at the
> end), but SQL commands are what people understand, and there is
> absolutely no ambiguity on what is going to be executed since the
> commands are exactly what is leaving the editor.

A point to consider is that the user can also do this in the query
itself, if desired. It'd just be a matter of assembling the query
string with appropriate calls to quote_literal() and quote_ident(),
which is not actually all that hard and I suspect many of us have done
that at one point or another. And then you know that you'll get the
right set of key columns and update the right table (as opposed to,
say, a view over the right table, or the wrong one out of several
tables that you joined).
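
A rough sketch of what I mean, with a hypothetical table and key column:

    SELECT format('UPDATE %I SET %I = %L WHERE %I = %L;',
                  'mytable', 'name', name, 'id', id)
    FROM mytable;

(format()'s %I and %L do the quote_ident()/quote_literal() work for you,
and the generated statements can then be fed to \gexec or pasted into an
editor.)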

Now you might say, well, that's not terribly convenient, which is
probably true, but this might be a case of -- convenient, reliable,
works with arbitrary queries -- pick two. I don't know. I wouldn't be
all that surprised if there's something clever and useful we could do
in this area, but I sure don't know what it is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: psql JSON output format

2024-05-17 Thread Christoph Berg
Re: Robert Haas
> IMHO, the big problem here is that different people want different
> corner-case behaviors and it's not clear what to do about that. I
> don't think there's a single vote for "don't do this at all". So if
> there is a desire to take this work forward, the goal probably ought
> to be to try to either (a) figure out one behavior that everyone can
> live with or (b) figure out a short list of options that can be used
> to customize the behavior to a degree that lets everyone get something
> reasonably close to what they want. For instance, "what to do if you
> find a SQL null" and "whether to include json values as strings or
> json objects" seem like they could potentially be customizable. That's
> probably not a silver bullet because (1) that's more work and (2)
> there might be more behaviors than we want to code, or maintain the
> code for, and (3) if it gets too complicated that can itself become a
> source of objections. But it's an idea.

Thanks for summarizing the thread.

Things mentioned in the thread:

1) rendering of SQL NULLs - include or omit the column

2) rendering of JSON values - both "quoted string" and "inline as
   JSON" make sense

3) not quoting numeric values and booleans

4) no special treatment of other datatypes like arrays or compound
   values, just quote them

5) row format: JSON object or array (array would be close to CSV
   format)

6) overall format: array of rows, or simply print each row separately
   ("JSON Lines" format, https://jsonlines.org/)

I think 1, 2 and perhaps 6 make sense to have configurable. Two or
three \pset options (or one option with a list of flags) don't sound
too bad complexity-wise.

Or maybe just default to "omit NULL columns" and "inline JSON" (and
render null as NULL).
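
To make 1, 2 and 6 concrete, here is a purely illustrative sketch of
how a single row could be rendered under a few of those choices (no
existing \pset option produces exactly this output):

  SELECT 1 AS a, NULL::text AS b, '{"x":1}'::json AS c;

  -- omit NULL columns, inline JSON, one object per line ("JSON Lines"):
  {"a": 1, "c": {"x": 1}}

  -- include NULL columns, quote JSON values, array of rows:
  [{"a": 1, "b": null, "c": "{\"x\":1}"}]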

Thoughts?

Christoph




Re: First draft of PG 17 release notes

2024-05-17 Thread Daniel Verite
Bruce Momjian wrote:

> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
> 
>   https://momjian.us/pgsql_docs/release-17.html

About the changes in collations:


  Create a "builtin" collation provider similar to libc's C locale
  (Jeff Davis)

  It uses a "C" locale which is identical but independent of
  libc, but it allows the use of non-"C" collations like "en_US"
  and "C.UTF-8" with the "C" locale, which libc does not. MORE?


The new builtin provider has two collations:
* ucs_basic which is 100% identical to "C". It was introduced
several versions ago and the v17 novelty is simply to change
its pg_collation.collprovider from 'c' to 'b'.

* pg_c_utf8, which sorts like "C" but is Unicode-aware for
everything else, which makes it quite different from "C".
It also differs from the other UTF-8 collations that could
be used up to v17 in that it does not depend on an external
library, making it free from collation OS-upgrade risks.

The part that is concretely of interest to users is the introduction
of pg_c_utf8. As described in [1]:


pg_c_utf8

 This collation sorts by Unicode code point values rather than
 natural language order. For the functions lower, initcap, and
 upper, it uses Unicode simple case mapping. For pattern
 matching (including regular expressions), it uses the POSIX
 Compatible variant of Unicode Compatibility Properties. Behavior
 is efficient and stable within a Postgres major version. This
 collation is only available for encoding UTF8.


I'd suggest that the relnote entry should be more like a condensed
version of that description, without mentioning en_US or C.UTF-8,
whose existence and semantics are OS-dependent, unlike pg_c_utf8.
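
For illustration, here is roughly how pg_c_utf8 behaves in practice
(assuming a UTF8-encoded database on a v17 build; whether "en_US"
exists, and how exactly it sorts, depends on the OS):

  -- code point order: uppercase Latin letters sort before lowercase
  SELECT 'B' < 'a' COLLATE pg_c_utf8;       -- true
  SELECT 'B' < 'a' COLLATE "en_US";         -- typically false

  -- Unicode simple case mapping, unlike the plain "C" collation
  SELECT lower('ÉCOLE' COLLATE pg_c_utf8);  -- école
  SELECT lower('ÉCOLE' COLLATE "C");        -- École (non-ASCII unchanged)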


[1] https://www.postgresql.org/docs/devel/collation.html

Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: psql: Allow editing query results with \gedit

2024-05-17 Thread Christoph Berg
Re: Robert Haas
> Based on these comments and the one from David Johnston, I think there
> is a consensus that we do not want this patch, so I'm marking it as
> Rejected in the CommitFest application. If I've misunderstood the
> situation, then please feel free to change the status accordingly.

Hi Robert,

thanks for looking at the patch.

> I feel slightly bad about rejecting this not only because rejecting
> patches that people have put work into sucks but also because (1) I do
> understand why it could be useful to have something like this and (2)
> I think in many ways the patch is quite well-considered, e.g. it has
> options like table and key to work around cases where the naive logic
> doesn't get the right answer. But I also do understand why the
> reactions thus far have been skeptical: there's a lot of pretty
> magical stuff in this patch. That's a reliability concern: when you
> type \gedit and it works, that's cool, but sometimes it isn't going to
> work, and you're not always going to understand why, and you can
> probably fix a lot of those cases by using the "table" or "key"
> options, but you have to know they exist, and you have to realize that
> they're needed, and the whole thing is suddenly a lot less convenient.
> I think if we add this feature, a bunch of people won't notice, but
> among those who do, I bet there will be a pretty good chance of people
> complaining about cases that don't work, and perhaps not understanding
> why they don't work, and I suspect fixing some of those complaints may
> require something pretty close to solving the halting problem. :-(

That's a good summary of the situation, thanks.

I still think the feature would be cool to have, but admittedly, in
the meantime I've had cases myself where the automatism went in the
wrong direction (updating key columns results in DELETE-INSERT cycles
that don't do the right thing if you didn't select all columns
originally), so I now understand the objections and agree people were
right about that. This could be fixed by feeding the resulting
commands through another editor round, but that just adds complexity
instead of removing confusion.

I think there is a pretty straightforward way to address the problems,
though: instead of letting the user edit JSON, format the query result
in the form of UPDATE commands and let the user edit them. As Tom said
upthread:

Tom> The stated complaint was "it's too hard to build UPDATE commands",
Tom> which I can sympathize with.

... which this would perfectly address - it's building the commands.

The editor will have a bit more clutter (the UPDATE SET WHERE
boilerplate), and the fields are somewhat out of order (the key at the
end), but SQL commands are what people understand, and there is
absolutely no ambiguity on what is going to be executed since the
commands are exactly what is leaving the editor.
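
Purely as a sketch of the idea (the table, columns, and exact layout
are invented, and the patch may end up formatting things differently),
the editor buffer could then contain something like:

  UPDATE customers SET name = 'Alice', city = 'Berlin' WHERE id = 1;
  UPDATE customers SET name = 'Bob',   city = NULL     WHERE id = 2;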

> Now maybe that's all wrong and we should adopt this patch with
> enthusiasm, but if so, we need the patch to have significantly more +1
> votes than -1 votes, and right now the reverse seems to be the case.

I'll update the patch and ask here. Thanks!

Christoph




Re: First draft of PG 17 release notes

2024-05-17 Thread jian he
On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
>
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html
>
> It will be improved until the final release.  The item count is 188,
> which is similar to recent releases:
>

This thread mentioned performance.
Commit [1] refactored some interval-aggregation-related functions,
which makes the aggregate functions avg(interval) and sum(interval)
run faster, especially avg(interval); see [2].
Granted, this is a rather niche performance improvement to be
mentioned separately.


These three items need to be removed, because of
https://git.postgresql.org/cgit/postgresql.git/commit/?id=8aee330af55d8a759b2b73f5a771d9d34a7b887f

>> Add stratnum GiST support function (Paul A. Jungwirth)

>> Allow foreign keys to reference WITHOUT OVERLAPS primary keys (Paul A. 
>> Jungwirth)
>> The keyword PERIOD is used for this purpose.

>> Allow PRIMARY KEY and UNIQUE constraints to use WITHOUT OVERLAPS for 
>> non-overlapping exclusion constraints (Paul A. Jungwirth)


[1] 
https://git.postgresql.org/cgit/postgresql.git/commit/?id=519fc1bd9e9d7b408903e44f55f83f6db30742b7
[2] 
https://www.postgresql.org/message-id/CAEZATCUJ0xjyQUL7SHKxJ5a%2BDm5pjoq-WO3NtkDLi6c76rh58w%40mail.gmail.com




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-17 Thread Robert Haas
On Thu, May 16, 2024 at 1:39 PM Jacob Burroughs
 wrote:
> As currently implemented [1], the client sends the server the list of
> all compression algorithms it is willing to accept, and the server can
> use one of them.  If the server knows what `_pq_.compression` means
> but doesn't actually support any compression, it will both send the
> client its empty list of supported algorithms and just never send any
> compressed messages, and everyone involved will be (relatively) happy.
> There is a libpq function that a client can use to check what
> compression is in use if a client *really* doesn't want to continue
> with the conversation without compression, but 99% of the time I can't
> see why a client wouldn't prefer to continue using a connection with
> whatever compression the server supports (or even none) without more
> explicit negotiation.  (Unlike TLS, where automagically picking
> between using and not using TLS has strange security implications and
> effects, compression is a convenience feature for everyone involved.)

This all seems sensible to me.

> As the protocol layer is currently designed [1], it explicitly makes
> it very easy to change/restart compression streams, specifically for
> this use case (and in particular for the general connection pooler use
> case).  Compressed data is already framed in a `CompressedData`
> message, and that message has a header byte that corresponds to an
> enum value for which algorithm is currently in use.  Any time the
> compression stream was restarted by the sender, the first
> `CompressedData` message will set that byte, and then the client will
> restart its decompression stream with the chosen algorithm from that
> point.  For `CompressedData` messages that continue using the
> already-established stream, the byte is simply set to 0.  (This is
> also how the "each side sends a list" form of negotiation is able to
> work without additional round trips, as the `CompressedData` framing
> itself communicates which compression algorithm has been selected.)

OK, so you made it so that compressed data is fully self-identifying.
Hence, there's no need to worry if something gets changed: the
receiver, if properly implemented, can't help but notice. The only
downside that I can see to this design is that you only have one byte
to identify the compression algorithm, but that doesn't actually seem
like a real problem at all, because I expect the number of supported
compression algorithms to grow very slowly. I think it would take
centuries, possibly millennia, before we started to get short of
identifiers. So, cool.

But, in your system, how does the client ask the server to switch to a
different compression algorithm, or to restart the compression stream?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




remove Todo item: Allow infinite intervals just like infinite timestamps

2024-05-17 Thread jian he
hi.

https://wiki.postgresql.org/wiki/Todo
Dates and Times
Allow infinite intervals just like infinite timestamps
https://www.postgresql.org/message-id/4eb095c8.1050...@agliodbs.com

This was done in commit
https://git.postgresql.org/cgit/postgresql.git/commit/?id=519fc1bd9e9d7b408903e44f55f83f6db30742b7

Should we remove this item?




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Daniel Gustafsson
> On 17 May 2024, at 15:05, Robert Haas  wrote:
> 
> On Fri, May 17, 2024 at 9:02 AM Peter Eisentraut  wrote:
>> We already have the annotations feature, which is kind of this.
> 
> I didn't realize we had that feature. When was it added, and how is it
> supposed to be used?

A while back.

commit 27cba025a501c9dbcfb08da0c4db95dc6111d647
Author: Magnus Hagander 
Date:   Sat Feb 14 13:07:48 2015 +0100

Implement simple message annotations

This feature makes it possible to "pull in" a message in a thread and
highlight it with an annotation (free text format). This will list the
message in a table along with the annotation and who made it.

Annotations have to be attached to a specific message - for a "generic"
one it makes sense to attach it to the latest message available, as
that will put it at the correct place in time.

Magnus' commit message explains it well.  The way I've used it (albeit
infrequently) is to point to a specific mail in the thread where a significant
change was proposed, like the patch changing direction or something similar.

--
Daniel Gustafsson





Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tomas Vondra



On 5/17/24 14:51, Andrey M. Borodin wrote:
> 
> 
>> On 17 May 2024, at 16:39, Robert Haas  wrote:
>>
>> I think Andrey Borodin's nearby suggestion of having a separate CfM
>> for each section of the CommitFest does a good job revealing just how
>> bad the current situation is. I agree with him: that would actually
>> work. Asking somebody, for a one-month period, to be responsible for
>> shepherding one-tenth or one-twentieth of the entries in the
>> CommitFest would be a reasonable amount of work for somebody. But we
>> will not find 10 or 20 highly motivated, well-qualified volunteers
>> every other month to do that work;
> 
> Why do you think so? Let’s just try to find more CFMs for July.
> When I felt that I was overwhelmed, I asked for help and Alexander Alekseev 
> promptly agreed. That helped a lot.
> If I were in that position again, I would just ask 10 times on the first day :)
> 
>> it's a struggle to find one or two
>> highly motivated, well-qualified CommitFest managers, let alone ten or
>> twenty.
> 
> Because we are looking for one person to do a job for 10.
> 

Yes. It's probably easier to find more CF managers, each doing much less work.

>> So I think the right interpretation of his comment is that
>> managing the CommitFest has become about an order of magnitude more
>> difficult than what it needs to be for the task to be done well.
> 
> Let’s scale the process. Reduce the responsibility area of a CFM, and define
> it more clearly.
> And maybe even explicitly ask the CFM to summarize the patch status of each
> entry at least once per CF.
> 

Should it even be up to the CFM to write the summary, or should he/she
be able to request an update from the patch author? Or at least have the
choice to do so.

I think we'll always struggle with the massive threads, because it's
really difficult to find the right balance between brevity and including
all the relevant details. Or rather impossible. I did try writing such
summaries for a couple of my long-running patches, and while it might
have helped, the challenge was to also explain why stuff was *not* done in
some alternative way, which is one of the things usually discussed. But
the summary gets very long, because there are many alternatives.

> 
> Can I do a small poll among those who are on this thread? Would you
> volunteer to summarize the status of 20 patches in July’s CF? 5 each week
> or so. One per day.
> 

Not sure. For many patches it'll be trivial. And for a bunch it'll be
very very time-consuming.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Peter Eisentraut

On 17.05.24 14:42, Joe Conway wrote:
Namely, the week before commitfest I don't actually know if I will 
have the time during that month, but I will make sure my patch is in 
the commitfest just in case I get a few clear days to work on it. 
Because if it isn't there, I can't take advantage of those "found" hours.


A solution to both of these issues (yours and mine) would be to allow 
things to be added *during* the CF month. What is the point of having a 
"freeze" before every CF anyway? Especially if they start out clean. If 
something is ready for review on day 8 of the CF, why not let it be 
added for review?


Maybe this all indicates that the idea of bimonthly commitfests has run 
its course.  The original idea might have been, if we have like 50 
patches, we can process all of them within a month.  We're clearly not 
doing that anymore.  What would the development process look like if we
just had one commitfest per dev cycle that runs from July 1st to March 31st?





