Re: State of pg_createsubscriber

2024-06-18 Thread Euler Taveira
On Tue, Jun 18, 2024, at 12:59 AM, Amit Kapila wrote:
> On Mon, May 20, 2024 at 12:12 PM Amit Kapila  wrote:
> >
> > On Sun, May 19, 2024 at 11:20 PM Euler Taveira  wrote:
> > >
> > > On Sun, May 19, 2024, at 2:30 PM, Tom Lane wrote:
> > >
> > > I'm fairly disturbed about the readiness of pg_createsubscriber.
> > > The 040_pg_createsubscriber.pl TAP test is moderately unstable
> > > in the buildfarm [1], and there are various unaddressed complaints
> > > at the end of the patch thread (pretty much everything below [2]).
> > > I guess this is good enough to start beta with, but it's far from
> > > being good enough to ship, IMO.  If there were active work going
> > > on to fix these things, I'd feel better, but neither the C code
> > > nor the test script have been touched since 1 April.
> > >
> > >
> > > My bad. :( I'll post patches soon to address all of the points.
> > >
> >
> > Just to summarize, apart from BF failures for which we had some
> > discussion, I could recall the following open points:
> >
> > 1. After promotion, the pre-existing replication objects should be
> > removed (either optionally or always), otherwise, it can lead to a new
> > subscriber not being able to restart or getting some unwarranted data.
> > [1][2].
> >
> > 2. Retaining synced slots on new subscribers can lead to unnecessary
> > WAL retention and dead rows [3].
> >
> > 3. We need to consider whether some of the messages displayed in
> > --dry-run mode are useful or not [4].
> >
> 
> The recent commits should silence BF failures and resolve point number
> 2. But we haven't done anything yet for 1 and 3. For 3, we have a
> patch in email [1] (v3-0005-Avoid*) which can be reviewed and
> committed but point number 1 needs discussion. Apart from that
> somewhat related to point 1, Kuroda-San has raised a point in an email
> [2] for replication slots. Shlok has presented a case in this thread
> [3] where the problem due to point 1 can cause ERRORs or can cause
> data inconsistency.

I read v3-0005 and it seems to silence (almost) all "write" messages. Is the
intention to avoid the misinterpretation that dry-run mode is writing
something? It is dry-run mode! If I run a tool in dry-run mode, I expect it to
execute some verifications and print useful messages so I can evaluate whether
it is ok to run it for real. Maybe that is not your expectation for dry-run mode.

If that is not clear, let's state explicitly that nothing is changed in dry-run mode:

pg_createsubscriber: no modifications are done

as the first message in dry-run mode. I agree with you that some of the
messages are misleading, for example:

pg_createsubscriber: hint: If pg_createsubscriber fails after this
point, you must recreate the physical replica before continuing.

Maybe for this one, we should omit the fake information, as in:

pg_createsubscriber: setting the replication progress on database "postgres"

I will post a patch to address the messages once we agree what needs to be
changed.

Regarding point 1, publications and subscriptions are ok to remove. You are not
allowed to create them on a standby, hence all publications and subscriptions
were streamed from the primary. However, I'm wondering if you really want to
remove the publications. Replication slots on a standby server are
"invalidated" even though wal_status says "reserved" (I think it is an
oversight in the design that implements slot invalidation); the required WAL
files have already been removed because of pg_resetwal (see
modify_subscriber_sysid()). The scenario is: promote a standby server, run
pg_resetwal on it, and check pg_replication_slots.

Do you have any other scenarios in mind?

> Now, the easiest way out here is that we accept the issues with the
> pre-existing subscriptions and replication slots cases and just
> document them for now with the intention to work on those in the next
> version. OTOH, if there are no major challenges, we can try to
> implement a patch for them as well as see how it goes.

Agree.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


replace strtok()

2024-06-18 Thread Peter Eisentraut
Under the topic of getting rid of thread-unsafe functions in the backend 
[0], here is a patch series to deal with strtok().


Of course, strtok() is famously not thread-safe and can be replaced by 
strtok_r().  But it also has the wrong semantics in some cases, because 
it considers adjacent delimiters to be one delimiter.  So if you parse


SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>

with strtok(), then

SCRAM-SHA-256$$<iterations>::<salt>$$<storedkey>::<serverkey>

parses just the same.  In many cases, this is arguably wrong and could 
hide mistakes.
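To make the difference concrete, here is a minimal, self-contained C sketch (illustrative only, not from the patch set); the local my_strsep() mimics BSD strsep() semantics so the comparison works even where strsep() is unavailable:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Count fields with strtok(): adjacent delimiters collapse into one,
 * so doubled delimiters go undetected. */
static int count_strtok(const char *s, const char *delim)
{
    char *copy = strdup(s);
    int n = 0;
    for (char *tok = strtok(copy, delim); tok; tok = strtok(NULL, delim))
        n++;
    free(copy);
    return n;
}

/* Local stand-in for strsep(): adjacent delimiters yield empty tokens. */
static char *my_strsep(char **stringp, const char *delim)
{
    char *start = *stringp;
    if (start == NULL)
        return NULL;
    char *end = start + strcspn(start, delim);
    if (*end != '\0')
    {
        *end = '\0';
        *stringp = end + 1;
    }
    else
        *stringp = NULL;
    return start;
}

/* Count fields with strsep() semantics: malformed input is detectable. */
static int count_strsep(const char *s, const char *delim)
{
    char *copy = strdup(s);
    char *rest = copy;
    int n = 0;
    while (my_strsep(&rest, delim) != NULL)
        n++;
    free(copy);
    return n;
}
```

With strtok(), "a:b:c" and "a::b::c" both parse as three tokens; with strsep() semantics, the doubled delimiters surface as empty tokens, so the caller can reject the malformed input.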


So I'm suggesting to use strsep() in those places.  strsep() is 
nonstandard but widely available.


There are a few places where strtok() has the right semantics, such as 
parsing tokens separated by whitespace.  For those, I'm using strtok_r().
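As a small illustration of the whitespace case, where collapsing adjacent delimiters is exactly what we want and strtok_r() merely adds thread safety via the caller-supplied save pointer (the helper name here is mine, not from the patches):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Count whitespace-separated tokens.  Runs of spaces/tabs, including
 * leading and trailing ones, should count as a single separator, which
 * is precisely strtok()/strtok_r() behavior. */
static int count_ws_tokens(const char *s)
{
    char *copy = strdup(s);
    char *saveptr;
    int n = 0;
    for (char *tok = strtok_r(copy, " \t", &saveptr);
         tok;
         tok = strtok_r(NULL, " \t", &saveptr))
        n++;
    free(copy);
    return n;
}
```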


A reviewer job here would be to check whether I made that distinction 
correctly in each case.


On the portability side, I'm including a port/ replacement for strsep() 
and some workaround to get strtok_r() for Windows.  I have included 
these here as separate patches for clarity.
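As a rough sketch of what such a port/ replacement could look like — the name pg_strsep_fallback and the HAVE_STRSEP guard are my illustration, and the actual patch may differ — following the usual BSD strsep() semantics:

```c
#include <string.h>

#ifndef HAVE_STRSEP
/* Fallback strsep(): return the next token from *stringp, splitting on
 * any character in delim; adjacent delimiters produce empty tokens.
 * After the last token, *stringp is set to NULL. */
static char *
pg_strsep_fallback(char **stringp, const char *delim)
{
    char *s = *stringp;

    if (s == NULL)
        return NULL;

    char *end = s + strcspn(s, delim);

    if (*end != '\0')
    {
        *end = '\0';
        *stringp = end + 1;
    }
    else
        *stringp = NULL;
    return s;
}
#endif
```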



[0]: https://www.postgresql.org/message-id/856e5ec3-879f-42ee-8258-8bcc6ec9b...@eisentraut.org

From 8ab537885f007d8bed58f839ca91108ef20422a6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 10 Jun 2024 16:08:08 +0200
Subject: [PATCH v1 1/4] Replace some strtok() with strsep()

strtok() considers adjacent delimiters to be one delimiter, which is
arguably the wrong behavior in some cases.  Replace with strsep(),
which has the right behavior: Adjacent delimiters create an empty
token.
---
 src/backend/libpq/auth-scram.c| 11 +--
 src/backend/utils/adt/pg_locale.c |  3 ++-
 src/common/logging.c  |  4 +++-
 src/test/regress/pg_regress.c |  5 ++---
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 41619599148..03c3c27 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -608,16 +608,15 @@ parse_scram_secret(const char *secret, int *iterations,
 * SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>
 */
v = pstrdup(secret);
-   if ((scheme_str = strtok(v, "$")) == NULL)
+   if ((scheme_str = strsep(&v, "$")) == NULL)
goto invalid_secret;
-   if ((iterations_str = strtok(NULL, ":")) == NULL)
+   if ((iterations_str = strsep(&v, ":")) == NULL)
goto invalid_secret;
-   if ((salt_str = strtok(NULL, "$")) == NULL)
+   if ((salt_str = strsep(&v, "$")) == NULL)
goto invalid_secret;
-   if ((storedkey_str = strtok(NULL, ":")) == NULL)
-   goto invalid_secret;
-   if ((serverkey_str = strtok(NULL, "")) == NULL)
+   if ((storedkey_str = strsep(&v, ":")) == NULL)
goto invalid_secret;
+   serverkey_str = v;
 
/* Parse the fields */
if (strcmp(scheme_str, "SCRAM-SHA-256") != 0)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 7e5bb2b703a..21924d89877 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -2809,6 +2809,7 @@ icu_set_collation_attributes(UCollator *collator, const char *loc,
char   *icu_locale_id;
char   *lower_str;
char   *str;
+   char   *token;
 
/*
 * The input locale may be a BCP 47 language tag, e.g.
@@ -2834,7 +2835,7 @@ icu_set_collation_attributes(UCollator *collator, const char *loc,
return;
str++;
 
-   for (char *token = strtok(str, ";"); token; token = strtok(NULL, ";"))
+   while ((token = strsep(&str, ";")))
{
char   *e = strchr(token, '=');
 
diff --git a/src/common/logging.c b/src/common/logging.c
index e9a02e3e46a..aedd1ae2d8c 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -119,7 +119,9 @@ pg_logging_init(const char *argv0)
 
if (colors)
{
-   for (char *token = strtok(colors, ":"); token; token = strtok(NULL, ":"))
+   char   *token;
+
+   while ((token = strsep(&colors, ":")))
{
char   *e = strchr(token, '=');
 
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 06f6775fc65..1abfe6c4a5c 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -234,12 +234,11 @@ static void
 split_to_stringlist(const char *s, const char *delim, _stringlist **listhead)
 {
char   *sc = pg_strdup(s);
-   char   *token = strtok(sc, delim);
+   char   *token;
 
-   while (token)
+   while ((token = strsep(&sc, delim)))
{
add_stringlist_item(listhead, token);
-   token = strtok(NULL, delim);
}
free(sc);
 }

base-commit: ae482a7ec521e09bb0dbfc261da6e6170a5ac29f

Re: speed up a logical replica setup

2024-06-18 Thread Euler Taveira
On Mon, Jun 17, 2024, at 8:04 AM, Peter Eisentraut wrote:
> On 07.06.24 05:49, Euler Taveira wrote:
> > Here it is a patch series to fix the issues reported in recent 
> > discussions. The
> > patches 0001 and 0003 aim to fix the buildfarm issues. The patch 0002 
> > removes
> > synchronized failover slots on subscriber since it has no use. I also 
> > included
> > an optional patch 0004 that improves the usability by checking both 
> > servers if
> > it already failed in any subscriber check.
> 
> I have committed 0001, 0002, and 0003.  Let's keep an eye on the 
> buildfarm to see if that stabilizes things.  So far it looks good.

Thanks.

> For 0004, I suggest inverting the result values from check_publisher() 
> and check_subscriber() so that they return true if the check is ok.
> 

Yeah, the order doesn't matter after the commit b963913826. I thought about
changing the order but I didn't; I provided a minimal patch. Since you think
it is an improvement, I'm attaching another patch with this change.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From ec96586fa35be3f855fb8844a197f374ac82a622 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Fri, 24 May 2024 14:29:06 -0300
Subject: [PATCH] Check both servers before exiting

The current code reports multiple errors for each server, but if any
subscriber condition is not met, it terminates without checking the
publisher conditions at all.
This change checks both servers before terminating if any condition is
not met. It saves at least one dry-run execution.

The order in which the check functions are called doesn't matter after
commit b96391382626306c301b67cbd2d28313d96741f3 (because there is no
primary_slot_name check anymore), so check the publisher first.
---
 src/bin/pg_basebackup/pg_createsubscriber.c | 28 -
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index 5499e6d96a..26b4087ff7 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -76,9 +76,9 @@ static uint64 get_standby_sysid(const char *datadir);
 static void modify_subscriber_sysid(const struct CreateSubscriberOptions *opt);
 static bool server_is_in_recovery(PGconn *conn);
 static char *generate_object_name(PGconn *conn);
-static void check_publisher(const struct LogicalRepInfo *dbinfo);
+static bool check_publisher(const struct LogicalRepInfo *dbinfo);
 static char *setup_publisher(struct LogicalRepInfo *dbinfo);
-static void check_subscriber(const struct LogicalRepInfo *dbinfo);
+static bool check_subscriber(const struct LogicalRepInfo *dbinfo);
 static void setup_subscriber(struct LogicalRepInfo *dbinfo,
 			 const char *consistent_lsn);
 static void setup_recovery(const struct LogicalRepInfo *dbinfo, const char *datadir,
@@ -803,7 +803,7 @@ server_is_in_recovery(PGconn *conn)
  *
  * XXX Does it not allow a synchronous replica?
  */
-static void
+static bool
 check_publisher(const struct LogicalRepInfo *dbinfo)
 {
 	PGconn	   *conn;
@@ -908,8 +908,7 @@ check_publisher(const struct LogicalRepInfo *dbinfo)
 
 	pg_free(wal_level);
 
-	if (failed)
-		exit(1);
+	return failed;
 }
 
 /*
@@ -923,7 +922,7 @@ check_publisher(const struct LogicalRepInfo *dbinfo)
  * there is not a reliable way to provide a suitable error saying the server C
  * will be broken at the end of this process (due to pg_resetwal).
  */
-static void
+static bool
 check_subscriber(const struct LogicalRepInfo *dbinfo)
 {
 	PGconn	   *conn;
@@ -1014,8 +1013,7 @@ check_subscriber(const struct LogicalRepInfo *dbinfo)
 		failed = true;
 	}
 
-	if (failed)
-		exit(1);
+	return failed;
 }
 
 /*
@@ -1771,6 +1769,9 @@ main(int argc, char **argv)
 
 	char		pidfile[MAXPGPATH];
 
+	bool		pub_failed = false;
+	bool		sub_failed = false;
+
 	pg_logging_init(argv[0]);
 	pg_logging_set_level(PG_LOG_WARNING);
 	progname = get_progname(argv[0]);
@@ -2061,11 +2062,14 @@ main(int argc, char **argv)
 	pg_log_info("starting the standby with command-line options");
 	start_standby_server(&opt, true);
 
-	/* Check if the standby server is ready for logical replication */
-	check_subscriber(dbinfo);
-
 	/* Check if the primary server is ready for logical replication */
-	check_publisher(dbinfo);
+	pub_failed = check_publisher(dbinfo);
+
+	/* Check if the standby server is ready for logical replication */
+	sub_failed = check_subscriber(dbinfo);
+
+	if (pub_failed || sub_failed)
+		exit(1);
 
 	/*
 	 * Stop the target server. The recovery process requires that the server
-- 
2.30.2



CompilerWarnings task does not catch C++ warnings

2024-06-18 Thread Peter Eisentraut
The CompilerWarnings task on Cirrus CI does not catch warnings in C++ 
code.  It tries to make warnings fatal by passing COPT='-Werror', but 
that does not apply to C++ compilations.


I suggest that we just add COPT to CXXFLAGS as well.  I think passing 
-Werror is just about the only reasonable use of COPT nowadays, so 
making that more robust seems useful.  I don't think there is a need for 
a separate make variable for C++ here.

From dbd1f09836e59fe163f72c4170d628f302bf5587 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 17 Jun 2024 10:33:19 +0200
Subject: [PATCH] Apply COPT to CXXFLAGS as well

The main use of that make variable is to pass in -Werror.  It makes
sense to apply this to C++ as well.
---
 src/Makefile.global.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 0a0ba0719f5..42f50b49761 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -742,6 +742,7 @@ endif
 #
 ifdef COPT
CFLAGS += $(COPT)
+   CXXFLAGS += $(COPT)
LDFLAGS += $(COPT)
 endif
 
-- 
2.45.2



Re: [PATCH] Improve error message when trying to lock virtual tuple.

2024-06-18 Thread Sven Klemm
On Mon, Jun 17, 2024 at 10:25 PM Matthias van de Meent wrote:

> I think you're solving the wrong problem here, as I can't think of a
> place where both virtual tuple slots and tuple locking are allowed at
> the same time in core code.
>
> I mean, in which kind of situation could we get a Relation's table
> slot which is not lockable by said relation's AM? Assuming the "could
> not read block 0" error comes from the heap code, why does the
> assertion in heapam_tuple_lock that checks for a
> BufferHeapTupleTableSlot not fire before this `block 0` error? If the
> error is not in the heapam code, could you show an example of the code
> that breaks with that error code?

In assertion-enabled builds this will be stopped much earlier and will not
return the misleading error message. But most packaged postgres versions don't
have assertions enabled and will produce the misleading `could not read block 0`
error.
I am aware that this is not a postgres bug, but I think this error message
is an improvement over the current situation.


-- 
Regards, Sven Klemm




Re: consider -Wmissing-variable-declarations

2024-06-18 Thread Peter Eisentraut
Here is an updated patch set.  I have implemented proper solutions for 
the various hacks in the previous patch set.  So this patch set should 
now be ready for proper consideration.


The way I have organized it here is that patches 0002 through 0008 
should be improvements in their own right.


The remaining two patches 0009 and 0010 are workarounds that are just 
necessary to satisfy -Wmissing-variable-declarations.  I haven't made up 
my mind if I'd want to take the bison patch 0010 like this and undo it 
later if we move to pure parsers everywhere, or instead wait for the 
pure parsers to arrive before we install -Wmissing-variable-declarations 
for everyone.


Obviously, people might also have opinions on some details of where 
exactly to put the declarations etc.  I have tried to follow existing 
patterns as much as possible, but those are also not necessarily great 
in all cases.
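For readers unfamiliar with the warning, a small illustrative sketch (the names are mine) of what -Wmissing-variable-declarations complains about and the usual fixes:

```c
/* A global defined in a .c file with no prior declaration triggers
 * -Wmissing-variable-declarations (gcc 14+, clang).  Declaring it first,
 * normally via a header, silences the warning: */
extern int example_counter;     /* what would live in the header */
int example_counter = 0;        /* definition; now matched by a declaration */

/* File-local state needs no header declaration; static also satisfies
 * the warning, analogous to -Wmissing-prototypes for functions. */
static int private_counter = 0;

static int bump(void)
{
    private_counter++;
    return ++example_counter;
}
```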
From 6f3d2c1ee18cc7e7cbc5656d21012f504ccc0c7c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 8 May 2024 13:49:37 +0200
Subject: [PATCH v2 01/10] Add -Wmissing-variable-declarations to the standard
 compilation flags

This warning flag detects global variables not declared in header
files.  This is similar to what -Wmissing-prototypes does for
functions.  (More correctly, it is similar to what
-Wmissing-declarations does for functions, but -Wmissing-prototypes is
a superset of that in C.)

This flag is new in GCC 14.  Clang has supported it for a while.

Discussion: 
https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c9...@eisentraut.org

XXX many warnings to fix first
---
 configure | 49 +++
 configure.ac  |  9 +
 meson.build   | 10 +
 src/Makefile.global.in|  1 +
 src/interfaces/ecpg/test/Makefile.regress |  2 +-
 src/interfaces/ecpg/test/meson.build  |  1 +
 src/makefiles/meson.build |  2 +
 src/tools/pg_bsd_indent/Makefile  |  2 +
 src/tools/pg_bsd_indent/meson.build   |  1 +
 9 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 7b03db56a67..f382a08edc8 100755
--- a/configure
+++ b/configure
@@ -741,6 +741,7 @@ CXXFLAGS_SL_MODULE
 CFLAGS_SL_MODULE
 CFLAGS_VECTORIZE
 CFLAGS_UNROLL_LOOPS
+PERMIT_MISSING_VARIABLE_DECLARATIONS
 PERMIT_DECLARATION_AFTER_STATEMENT
 LLVM_BINPATH
 LLVM_CXXFLAGS
@@ -6080,6 +6081,54 @@ if test x"$pgac_cv_prog_CXX_cxxflags__Wformat_security" = x"yes"; then
 fi
 
 
+  # gcc 14+, clang for a while
+  # (Supported in C++ by clang but not gcc.  For consistency, omit in C++.)
+  save_CFLAGS=$CFLAGS
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -Wmissing-variable-declarations, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports -Wmissing-variable-declarations, for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Wmissing_variable_declarations+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Wmissing-variable-declarations"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=yes
+else
+  pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" = x"yes"; then
+  CFLAGS="${CFLAGS} -Wmissing-variable-declarations"
+fi
+
+
+  PERMIT_MISSING_VARIABLE_DECLARATIONS=
+  if test x"$save_CFLAGS" != x"$CFLAGS"; then
+PERMIT_MISSING_VARIABLE_DECLARATIONS=-Wno-missing-variable-declarations
+  fi
+
   # Disable strict-aliasing rules; needed for gcc 3.3+
 
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -fno-strict-aliasing, for CFLAGS" >&5
diff --git a/configure.ac b/configure.ac
index 63e7be38472..30352b8258b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -530,6 +530,15 @@ if test "$GCC" = yes -a "$ICC" = no; then
   # This was included in -Wall/-Wformat in older GCC versions
   PGAC_PROG_CC_CFLAGS_OPT([-Wformat-security])
   PGAC_PROG_CXX_CFLAGS_OPT([-Wformat-security])
+  # gcc 14+, clang for a while
+  # (Supported in C++ by clang but not gcc.  For consistency, omit in C++.)
+  save_CFLAGS=$CFLAGS
+  PGAC_PROG_CC_CFLAGS_OPT([-Wmissing-variable-declarations])
+  PERMIT_MISSING_VARIABLE_DECLARATIONS=
+  if test x"$save_CFLAGS" != x"$CFLAGS"; then
+PERMIT_MISSING_VARIABLE_DECLARATIONS=-Wno-mi

Re: Conflict Detection and Resolution

2024-06-18 Thread Dilip Kumar
On Tue, Jun 18, 2024 at 12:11 PM Amit Kapila  wrote:
>
> On Tue, Jun 18, 2024 at 11:54 AM Dilip Kumar  wrote:
> >
> > On Mon, Jun 17, 2024 at 8:51 PM Robert Haas  wrote:
> > >
> > > On Mon, Jun 17, 2024 at 1:42 AM Amit Kapila  
> > > wrote:
> > > > The difference w.r.t the existing mechanisms for holding deleted data
> > > > is that we don't know whether we need to hold off the vacuum from
> > > > cleaning up the rows because we can't say with any certainty whether
> > > > other nodes will perform any conflicting operations in the future.
> > > > Using the example we discussed,
> > > > Node A:
> > > >   T1: INSERT INTO t (id, value) VALUES (1,1);
> > > >   T2: DELETE FROM t WHERE id = 1;
> > > >
> > > > Node B:
> > > >   T3: UPDATE t SET value = 2 WHERE id = 1;
> > > >
> > > > Say the order of receiving the commands is T1-T2-T3. We can't predict
> > > > whether we will ever get T-3, so on what basis shall we try to prevent
> > > > vacuum from removing the deleted row?
> > >
> > > The problem arises because T2 and T3 might be applied out of order on
> > > some nodes. Once either one of them has been applied on every node, no
> > > further conflicts are possible.
> >
> > If we decide to skip the update whether the row is missing or deleted,
> > we indeed reach the same end result regardless of the order of T2, T3,
> > and Vacuum. Here's how it looks in each case:
> >
> > Case 1: T1, T2, Vacuum, T3 -> Skip the update for a non-existing row
> > -> end result we do not have a row.
> > Case 2: T1, T2, T3 -> Skip the update for a deleted row -> end result
> > we do not have a row.
> > Case 3: T1, T3, T2 -> deleted the row -> end result we do not have a row.
> >
>
> In case 3, how can deletion be successful? The row required to be
> deleted has already been updated.

Hmm, I was considering this case in the example you gave above[1]: we have
updated some fields of the row with id = 1, but isn't this row still
detectable by the delete, since the delete will find it by id = 1 and we
haven't updated the id?  I was making the point w.r.t. the example used above.

[1]
> > > > Node A:
> > > >   T1: INSERT INTO t (id, value) VALUES (1,1);
> > > >   T2: DELETE FROM t WHERE id = 1;
> > > >
> > > > Node B:
> > > >   T3: UPDATE t SET value = 2 WHERE id = 1;




-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-18 Thread Alvaro Herrera
On 2024-Jun-12, Amonson, Paul D wrote:

> +/*-
> + *
> + * pg_crc32c_avx512.c
> + * Compute CRC-32C checksum using Intel AVX-512 instructions.
> + *
> + * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
> + * Portions Copyright (c) 1994, Regents of the University of California
> + * Portions Copyright (c) 2024, Intel(r) Corporation
> + *
> + * IDENTIFICATION
> + * src/port/pg_crc32c_avx512.c
> + *
> + *-
> + */

Hmm, I wonder if the "(c) 2024 Intel" line is going to bring us trouble.
(I bet it's not really necessary anyway.)

> +/***
> + * pg_crc32c_avx512(): compute the crc32c of the buffer, where the
> + * buffer length must be at least 256, and a multiple of 64. Based
> + * on:
> + *
> + * "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ
> + * Instruction"
> + *  V. Gopal, E. Ozturk, et al., 2009,
> + *  
> https://www.researchgate.net/publication/263424619_Fast_CRC_computation#full-text
> + *
> + * This Function:
> + * Copyright 2017 The Chromium Authors
> + * Copyright (c) 2024, Intel(r) Corporation
> + *
> + * Use of this source code is governed by a BSD-style license that can be
> + * found in the Chromium source repository LICENSE file.
> + * https://chromium.googlesource.com/chromium/src/+/refs/heads/main/LICENSE
> + */

And this bit doesn't look good.  The LICENSE file says:

> // Redistribution and use in source and binary forms, with or without
> // modification, are permitted provided that the following conditions are
> // met:
> //
> //* Redistributions of source code must retain the above copyright
> // notice, this list of conditions and the following disclaimer.
> //* Redistributions in binary form must reproduce the above
> // copyright notice, this list of conditions and the following disclaimer
> // in the documentation and/or other materials provided with the
> // distribution.
> //* Neither the name of Google LLC nor the names of its
> // contributors may be used to endorse or promote products derived from
> // this software without specific prior written permission.

The second clause essentially says we would have to add a page to our
"documentation and/or other materials" with the contents of the license
file.

There's good reasons for UCB to have stopped using the old BSD license,
but apparently Google (or more precisely the Chromium authors) didn't
get the memo.


Our fork distributors spent a lot of time scouring our source cleaning
up copyrights, a decade or two ago.  I bet they won't be happy to see
this sort of thing crop up now.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No nos atrevemos a muchas cosas porque son difíciles,
pero son difíciles porque no nos atrevemos a hacerlas" (Séneca)




Re: 001_rep_changes.pl fails due to publisher stuck on shutdown

2024-06-18 Thread Kyotaro Horiguchi
At Thu, 13 Jun 2024 09:29:03 +0530, Amit Kapila  wrote 
in 
> Yeah, but the commit you quoted later reverted by commit 703f148e98
> and committed again as c6c3334364.

Yeah, right..

> > aiming to prevent walsenders from
> > generating competing WAL (by, for example, CREATE_REPLICATION_SLOT)
> > records with the shutdown checkpoint. Thus, it seems that the
> > walsender cannot see the shutdown record,
> >
> 
> This is true of logical walsender. The physical walsender do send
> shutdown checkpoint record before getting terminated.

Yes, I know. They differ in their blocking mechanisms.

> > and a certain amount of
> > bytes before it, as the walsender appears to have relied on the
> > checkpoint flushing its record, rather than on XLogBackgroundFlush().
> >
> > If we approve of the walsender being terminated before the shutdown
> > checkpoint, we need to "fix" the comment, then provide a function to
> > ensure the synchronization of WAL records.
> >
> 
> Which comment do you want to fix?

Yeah. The part you seem to think I was trying to fix is actually
fine. Instead, I have revised the comment on the modified section to
make its intention clearer.

> > I'll consider this direction for a while.
> >
> 
> Okay, thanks.

The attached patch is it. It's only for the master.

I decided not to create a new function because the simple code has
only one caller. I haven't seen the test script fail with this fix.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 663bdeaf8d4d2f5a192dd3ecb6d2817d9b1311f1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 18 Jun 2024 16:36:43 +0900
Subject: [PATCH] Ensure WAL is written out when terminating a logical
 walsender

In commit c6c3334364, XLogBackgroundFlush() was assumed to flush all
written WAL to the end, but this function may not flush the last
incomplete page in a single call. In such cases, if the last written
page ends with a continuation record, the latter part will not be
flushed, causing shutdown to wait indefinitely for that part. This
commit ensures that the written records are fully flushed to the end,
guaranteeing expected behavior.
---
 src/backend/replication/walsender.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index c623b07cf0..5aa0f58238 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1858,12 +1858,13 @@ WalSndWaitForWal(XLogRecPtr loc)
 		ProcessRepliesIfAny();
 
 		/*
-		 * If we're shutting down, trigger pending WAL to be written out,
-		 * otherwise we'd possibly end up waiting for WAL that never gets
-		 * written, because walwriter has shut down already.
+		 * If we're shutting down, all WAL-writing processes are gone, leaving
+		 * only checkpointer to perform the shutdown checkpoint. Ensure that
+		 * any pending WAL is written out here; otherwise, we'd possibly end up
+		 * waiting for WAL that never gets written.
 		 */
-		if (got_STOPPING)
-			XLogBackgroundFlush();
+		if (got_STOPPING && !RecoveryInProgress())
+			XLogFlush(GetInsertRecPtr());
 
 		/*
 		 * To avoid the scenario where standbys need to catch up to a newer
-- 
2.43.0



Re: Conflict Detection and Resolution

2024-06-18 Thread Amit Kapila
On Tue, Jun 18, 2024 at 1:18 PM Dilip Kumar  wrote:
>
> On Tue, Jun 18, 2024 at 12:11 PM Amit Kapila  wrote:
> >
> > On Tue, Jun 18, 2024 at 11:54 AM Dilip Kumar  wrote:
> > >
> > > On Mon, Jun 17, 2024 at 8:51 PM Robert Haas  wrote:
> > > >
> > > > On Mon, Jun 17, 2024 at 1:42 AM Amit Kapila  
> > > > wrote:
> > > > > The difference w.r.t the existing mechanisms for holding deleted data
> > > > > is that we don't know whether we need to hold off the vacuum from
> > > > > cleaning up the rows because we can't say with any certainty whether
> > > > > other nodes will perform any conflicting operations in the future.
> > > > > Using the example we discussed,
> > > > > Node A:
> > > > >   T1: INSERT INTO t (id, value) VALUES (1,1);
> > > > >   T2: DELETE FROM t WHERE id = 1;
> > > > >
> > > > > Node B:
> > > > >   T3: UPDATE t SET value = 2 WHERE id = 1;
> > > > >
> > > > > Say the order of receiving the commands is T1-T2-T3. We can't predict
> > > > > whether we will ever get T-3, so on what basis shall we try to prevent
> > > > > vacuum from removing the deleted row?
> > > >
> > > > The problem arises because T2 and T3 might be applied out of order on
> > > > some nodes. Once either one of them has been applied on every node, no
> > > > further conflicts are possible.
> > >
> > > If we decide to skip the update whether the row is missing or deleted,
> > > we indeed reach the same end result regardless of the order of T2, T3,
> > > and Vacuum. Here's how it looks in each case:
> > >
> > > Case 1: T1, T2, Vacuum, T3 -> Skip the update for a non-existing row
> > > -> end result we do not have a row.
> > > Case 2: T1, T2, T3 -> Skip the update for a deleted row -> end result
> > > we do not have a row.
> > > Case 3: T1, T3, T2 -> deleted the row -> end result we do not have a row.
> > >
> >
> > In case 3, how can deletion be successful? The row required to be
> > deleted has already been updated.
>
> Hmm, I was considering this case in the example given by you above[1],
> so we have updated some fields of the row with id=1, isn't this row
> still detectable by the delete because delete will find this by id=1
> as we haven't updated the id?  I was making the point w.r.t. the
> example used above.
>

Your point is correct w.r.t. the example, but I responded considering a
general update-delete ordering. BTW, it is not clear to me how the
update_delete conflict will be handled with what Robert and you are
saying. Let me state what I understood. If we assume that there are
two nodes A & B as mentioned in the above example and the DELETE has
been applied on both nodes, and say the UPDATE is then performed on node B,
then irrespective of whether we consider the conflict as update_delete
or update_missing, the data will remain the same on both nodes. So, in
such a case, we don't need to bother differentiating between those two
types of conflicts. Is that what we can interpret from the above?

-- 
With Regards,
Amit Kapila.




Re: altering a column's collation leaves an invalid foreign key

2024-06-18 Thread Peter Eisentraut

On 08.06.24 06:14, jian he wrote:

if the FK side's collation is nondeterministic, then the PK behaves more like an FK.
In the following example, one FK row is referenced by two PK rows.

DROP TABLE IF EXISTS fktable, pktable;
-- assumes an ICU build; defines the nondeterministic collation used below
CREATE COLLATION IF NOT EXISTS ignore_accent_case
    (provider = icu, locale = 'und-u-ks-level1', deterministic = false);
CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE ignore_accent_case REFERENCES
pktable ON UPDATE CASCADE ON DELETE CASCADE);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');


Yes, this is a problem.  The RI checks are done with the collation of 
the primary key.


The comment at ri_GenerateQualCollation() says "the SQL standard 
specifies that RI comparisons should use the referenced column's 
collation".  But this is not what it says in my current copy.


... [ digs around ISO archives ] ...

Yes, this was changed in SQL:2016 to require the collation on the PK 
side and the FK side to match at constraint creation time.  The argument 
made is exactly the same we have here.  This was actually already the 
rule in SQL:1999 but was then relaxed in SQL:2003 and then changed back 
because it was a mistake.


We probably don't need to enforce this for deterministic collations, 
which would preserve some backward compatibility.


I'll think some more about what steps to take to solve this and what to 
do about back branches etc.






Re: Backup and Restore of Partitioned Table in PG-15

2024-06-18 Thread Ashutosh Bapat
Hi Gayatri,


On Sun, Jun 16, 2024 at 4:39 AM Gayatri Singh 
wrote:

> Hi Team,
>
> Greetings of the day!!
>
> We are planning to partition tables using pg_partman, and we are also
> planning their backup and restoration process.
>
> We found a few posts describing cases where pg_dump had issues while
> restoring and some data was lost.
>

This mailing list is for discussing development topics - bugs and features.
Please provide more details about the issues - URL where the issue is
reported, a reproducer etc. If the issues are already being discussed,
please participate in the relevant threads.


>
> kindly guide me the process or steps I need to follow for backing up
> partitioned tables correctly so that while restoration I don't face any
> issue.
>
> Another question: currently we are using pg_dump for database backup, which
> locks tables and completely puts db transactions on hold. I want the
> tables not to get locked, and the backup process should also complete in
> less time.
>

These questions are appropriate for pgsql-general mailing list.

-- 
Best Wishes,
Ashutosh Bapat


Re: pgsql: Add more SQL/JSON constructor functions

2024-06-18 Thread Amit Langote
On Tue, Jun 4, 2024 at 7:03 PM Amit Langote  wrote:
> On Tue, Jun 4, 2024 at 2:20 AM Tom Lane  wrote:
> > Peter Eisentraut  writes:
> > > On 02.06.24 21:46, Tom Lane wrote:
> > >> If you don't
> > >> like our current behavior, then either you have to say that RETURNING
> > >> with a length-limited target type is illegal (which is problematic
> > >> for the spec, since they have no such type) or that the cast behaves
> > >> like an implicit cast, with errors for overlength input (which I find
> > >> to be an unintuitive definition for a construct that names the target
> > >> type explicitly).
> >
> > > It asks for the latter behavior, essentially (but it's not defined in
> > > terms of casts).  It says:
> >
> > Meh.  Who needs consistency?  But I guess the answer is to do what was
> > suggested earlier and change the code to use COERCE_IMPLICIT_CAST.
>
> OK, will post a patch to do so in a new thread on -hackers.

Oops, didn't realize that this is already on -hackers.

Attached is a patch to use COERCE_IMPLICIT_CAST when the RETURNING
type specifies a length limit.

Given that this also affects JSON_OBJECT() et al that got added in
v16, maybe back-patching is in order but I'd like to hear opinions on
that.


--
Thanks, Amit Langote


v1-0001-SQL-JSON-Use-implicit-casts-for-RETURNING-type-wi.patch
Description: Binary data


Re: [PATCH] Improve error message when trying to lock virtual tuple.

2024-06-18 Thread Matthias van de Meent
On Tue, 18 Jun 2024 at 09:32, Sven Klemm  wrote:
>
> On Mon, Jun 17, 2024 at 10:25 PM Matthias van de Meent
>  wrote:
>
> > I think you're solving the wrong problem here, as I can't think of a
> > place where both virtual tuple slots and tuple locking are allowed at
> > the same time in core code.
> >
> > I mean, in which kind of situation could we get a Relation's table
> > slot which is not lockable by said relation's AM? Assuming the "could
> > not read block 0" error comes from the heap code, why does the
> > assertion in heapam_tuple_lock that checks for a
> > BufferHeapTupleTableSlot not fire before this `block 0` error? If the
> > error is not in the heapam code, could you show an example of the code
> > that breaks with that error code?
>
> In assertion-enabled builds this will be stopped much earlier and will not
> return the misleading error message. But most packaged postgres versions
> don't have assertions enabled and will produce the misleading `could not
> read block 0` error.
> I am aware that this is not a postgres bug, but I think this error message
> is an improvement over the current situation.

Extensions shouldn't cause assertions to trigger, IMO, and I don't
think that this check in ExecLockRows is a good way to solve that
issue. In my opinion, authors should test their extensions on
assert-enabled PostgreSQL, so that they're certain they're not doing
anything the assertions would have caught.

If you're dead-set on having users see less confusing error messages
when assertions should have triggered (but are not enabled, and thus
don't), I think the better place to add additional checks & error
messages is in the actual heapam_tuple_lock method, just after the
assertion, rather than in the AM-agnostic ExecLockRows: If or when a
tableAM decides that VirtualTableTupleSlot is the slot type they want
to use for passing tuples around, then that shouldn't be broken by
code in ExecLockRows that was put there to mimic an assert in the
heap AM.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-06-18 Thread Richard Guo
On Tue, Jun 4, 2024 at 6:51 PM Tender Wang  wrote:
> Yeah, Richard commented the v1 patch about JOIN_UNIQUE_INNER in [1]
>
> * I think we should not consider materializing the cheapest inner path
> if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> unique-ify the inner path.
>
> We don't consider materializing the inner path if the jointype is
> JOIN_UNIQUE_INNER in match_unsorted_outer().
> So the logic here is the same as in match_unsorted_outer(). I added comments to
> explain why.

I looked through the v4 patch and found an issue.  For the plan diff:

+ ->  Nested Loop
+   ->  Parallel Seq Scan on prt1_p1 t1_1
+   ->  Materialize
+ ->  Sample Scan on prt1_p1 t2_1
+   Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+   Filter: (t1_1.a = a)

This does not seem correct to me.  The inner path is parameterized by
the outer rel, in which case it does not make sense to add a Materialize
node on top of it.

I updated the patch to include a check in consider_parallel_nestloop
ensuring that inner_cheapest_total is not parameterized by outerrel
before materializing it.  I also tweaked the comments, test cases and
commit message.

Thanks
Richard


v5-0001-Consider-materializing-the-cheapest-inner-path-in-parallel-nestloop.patch
Description: Binary data


Re: Is creating logical replication slots in template databases useful at all?

2024-06-18 Thread Ashutosh Bapat
On Mon, Jun 17, 2024 at 5:50 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> Hi,
>
> While looking at the commit
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=29d0a77fa6606f9c01ba17311fc452dabd3f793d
> ,
> I noticed that get_old_cluster_logical_slot_infos gets called even for
> the template1 and template0 databases. This means pg_upgrade executes
> queries against the template databases to get replication slot
> information. I then realized that postgres allows one to connect to
> template1 database (or any other user-defined template databases for
> that matter), and create logical replication slots in it. If created,
> all the subsequent database creations will end up adding inactive
> logical replication slots in the postgres server. This might not be a
> problem in production servers as I assume the connections to template
> databases are typically restricted. Despite the connection
> restrictions, if at all one gets to connect to template databases in
> any way, it's pretty much possible to load the postgres server with
> inactive replication slots.
>

The replication slot names are unique across databases [1]. Hence,
replication slots created by connecting to the template1 database should not
get copied over when creating a new database. Is that broken? A logical
replication slot is associated with a database but a physical replication
slot is not. The danger you mention above applies only to logical
replication slots, I assume.


>
> This leads me to think why one would need logical replication slots in
> template databases at all. Can postgres restrict logical replication
> slots creation in template databases? If restricted, it may deviate
> from the fundamental principle of template databases in the sense that
> everything in the template database must be copied over to the new
> database created using it. Is it okay to do this? Am I missing
> something here?
>

If applications are using template1, they would want to keep template1
on the primary and replica in sync. A replication slot associated with
template1 would be useful there.

[1]
https://www.postgresql.org/docs/current/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS
-- 
Best Wishes,
Ashutosh Bapat


Re: Conflict Detection and Resolution

2024-06-18 Thread shveta malik
On Tue, Jun 18, 2024 at 11:34 AM Dilip Kumar  wrote:
>
> On Tue, Jun 18, 2024 at 10:17 AM Dilip Kumar  wrote:
> >
> > On Mon, Jun 17, 2024 at 3:23 PM Amit Kapila  wrote:
> > >
> > > On Wed, Jun 12, 2024 at 10:03 AM Dilip Kumar  
> > > wrote:
> > > >
> > > > On Tue, Jun 11, 2024 at 7:44 PM Tomas Vondra
> > > >  wrote:
> > > >
> > > > > > Yes, that's correct. However, many cases could benefit from the
> > > > > > update_deleted conflict type if it can be implemented reliably. 
> > > > > > That's
> > > > > > why we wanted to give it a try. But if we can't achieve predictable
> > > > > > results with it, I'm fine to drop this approach and conflict_type. 
> > > > > > We
> > > > > > can consider a better design in the future that doesn't depend on
> > > > > > non-vacuumed entries and provides a more robust method for 
> > > > > > identifying
> > > > > > deleted rows.
> > > > > >
> > > > >
> > > > > I agree having a separate update_deleted conflict would be beneficial,
> > > > > I'm not arguing against that - my point is actually that I think this
> > > > > conflict type is required, and that it needs to be detected reliably.
> > > > >
> > > >
> > > > When working with a distributed system, we must accept some form of
> > > > eventual consistency model. However, it's essential to design a
> > > > predictable and acceptable behavior. For example, if a change is a
> > > > result of a previous operation (such as an update on node B triggered
> > > > after observing an operation on node A), we can say that the operation
> > > > on node A happened before the operation on node B. Conversely, if
> > > > operations on nodes A and B are independent, we consider them
> > > > concurrent.
> > > >
> > > > In distributed systems, clock skew is a known issue. To establish a
> > > > consistency model, we need to ensure it guarantees the
> > > > "happens-before" relationship. Consider a scenario with three nodes:
> > > > NodeA, NodeB, and NodeC. If NodeA sends changes to NodeB, and
> > > > subsequently NodeB makes changes, and then both NodeA's and NodeB's
> > > > changes are sent to NodeC, the clock skew might make NodeB's changes
> > > > appear to have occurred before NodeA's changes. However, we should
> > > > maintain data that indicates NodeB's changes were triggered after
> > > > NodeA's changes arrived at NodeB. This implies that logically, NodeB's
> > > > changes happened after NodeA's changes, despite what the timestamps
> > > > suggest.
> > > >
> > > > A common method to handle such cases is using vector clocks for
> > > > conflict resolution.
> > > >
> > >
> > > I think the unbounded size of the vector could be a problem to store
> > > for each event. However, while researching previous discussions, it
> > > came to our notice that we have discussed this topic in the past as
> > > well in the context of standbys. For recovery_min_apply_delay, we
> > > decided the clock skew is not a problem as the settings of this
> > > parameter are much larger than typical time deviations between servers
> > > as mentioned in docs. Similarly for casual reads [1], there was a
> > > proposal to introduce max_clock_skew parameter and suggesting the user
> > > to make sure to have NTP set up correctly. We have tried to check
> > > other databases (like Ora and BDR) where CDR is implemented but didn't
> > > find anything specific to clock skew. So, I propose to go with a GUC
> > > like max_clock_skew such that if the difference of time between the
> > > incoming transaction's commit time and the local time is more than
> > > max_clock_skew then we raise an ERROR. It is not clear to me that
> > > putting bigger effort into clock skew is worth especially when other
> > > systems providing CDR feature (like Ora or BDR) for decades have not
> > > done anything like vector clocks. It is possible that this is less of
> > > a problem w.r.t CDR and just detecting the anomaly in clock skew is
> > > good enough.
> >
> > I believe that if we've accepted this solution elsewhere, then we can
> > also consider the same. Basically, we're allowing the application to
> > set its tolerance for clock skew. And, if the skew exceeds that
> > tolerance, it's the application's responsibility to synchronize;
> > otherwise, an error will occur. This approach seems reasonable.
>
> This model can be further extended by making the apply worker wait if
> the remote transaction's commit_ts is greater than the local
> timestamp. This ensures that no local transactions occurring after the
> remote transaction appear to have happened earlier due to clock skew
> instead we make them happen before the remote transaction by delaying
> the remote transaction apply.  Essentially, by having the remote
> application wait until the local timestamp matches the remote
> transaction's timestamp, we ensure that the remote transaction, which
> seems to occur after concurrent local transactions due to clock skew,
> is actually applied after those transactions.
>
> With this model, the

Re: post-freeze damage control

2024-06-18 Thread Alena Rybakina
Hi! Unfortunately, I was not able to fully understand your message. Could you
explain it to me, please?


On 09.04.2024 16:20, Andrei Lepikhov wrote:


Moreover, it helps even SeqScan: attempting to find a value in the 
hashed array is much faster than cycling a long-expression on each 
incoming tuple.


As I understand it, you talked about speeding up SeqScan by faster
re-searching through the use of a hash table. At the same time, we have
to build it beforehand, during the initial lookup of tuples, right?


I found this information in the ExecEvalHashedScalarArrayOp
function, and I assume you meant this function in your message.


But I couldn't find where the cycling of a long expression on each
incoming tuple happens. Could you tell me what function or functionality
you were talking about? I looked at the ExecSeqScan function, but I
didn't see it there.


--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


Re: New GUC autovacuum_max_threshold ?

2024-06-18 Thread Frédéric Yhuel




Le 18/06/2024 à 05:06, Nathan Bossart a écrit :

I didn't see a commitfest entry for this, so I created one to make sure we
don't lose track of this:

https://commitfest.postgresql.org/48/5046/



OK thanks!

By the way, I wonder if there were any off-list discussions after 
Robert's conference at PGConf.dev (and I'm waiting for the video of the 
conf).





Re: Logical Replication of sequences

2024-06-18 Thread Amit Kapila
On Tue, Jun 18, 2024 at 7:30 AM Masahiko Sawada  wrote:
>
> On Fri, Jun 14, 2024 at 4:04 PM Amit Kapila  wrote:
> >
> > On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada  
> > wrote:
> > >
> >
> > > >
> > > > Now, say we don't want to maintain the state of sequences for initial
> > > > sync at all then after the error how will we detect if there are any
> > > > pending sequences to be synced? One possibility is that we maintain a
> > > > subscription level flag 'subsequencesync' in 'pg_subscription' to
> > > > indicate whether sequences need sync. This flag would indicate whether
> > > > to sync all the sequences in pg_susbcription_rel. This would mean that
> > > > if there is an error while syncing the sequences we will resync all
> > > > the sequences again. This could be acceptable considering the chances
> > > > of error during sequence sync are low. The benefit is that both the
> > > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> > > > idea and sync all sequences without needing a new relfilenode. Users
> > > > can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> > > > all the sequences are synced after executing the command.
> > >
> > > I think that REFRESH PUBLICATION {SEQUENCES} can be executed even
> > > while the sequence-sync worker is synchronizing sequences. In this
> > > case, the worker might not see new sequences added by the concurrent
> > > REFRESH PUBLICATION {SEQUENCES} command since it's already running.
> > > The worker could end up marking the subsequencesync as completed while
> > > not synchronizing these new sequences.
> > >
> >
> > This is possible but we could avoid REFRESH PUBLICATION {SEQUENCES} by
> > not allowing to change the subsequencestate during the time
> > sequence-worker is syncing the sequences. This could be restrictive
> > but there doesn't seem to be cases where user would like to
> > immediately refresh sequences after creating the subscription.
>
> I'm concerned that users would not be able to add sequences during the
> time the sequence-worker is syncing the sequences. For example,
> suppose we have 10000 sequences and execute REFRESH PUBLICATION
> {SEQUENCES} to synchronize 10000 sequences. Now if we add one sequence
> to the publication and want to synchronize it to the subscriber, we
> have to wait for the current REFRESH PUBLICATION {SEQUENCES} to
> complete, and then execute it again, synchronizing 10001 sequences,
> instead of synchronizing only the new one.
>

I see your point and it could hurt such scenarios even though they
won't be frequent. So, let's focus on our other approach of
maintaining the flag at a per-sequence level in pg_subscription_rel.

> >
> > > >
> > > > > > > Or yet another idea I came up with is that a tablesync worker will
> > > > > > > synchronize both the table and sequences owned by the table. That 
> > > > > > > is,
> > > > > > > after the tablesync worker caught up with the apply worker, the
> > > > > > > tablesync worker synchronizes sequences associated with the target
> > > > > > > table as well. One benefit would be that at the time of initial 
> > > > > > > table
> > > > > > > sync being completed, the table and its sequence data are 
> > > > > > > consistent.
> > > > >
> > > > > Correction; it's not guaranteed that the sequence data and table data
> > > > > are consistent even in this case since the tablesync worker could get
> > > > > on-disk sequence data that might have already been updated.
> > > > >
> > > >
> > > > The benefit of this approach is not clear to me. Our aim is to sync
> > > > all sequences before the upgrade, so not sure if this helps because
> > > > anyway both table values and corresponding sequences can again be
> > > > out-of-sync very quickly.
> > >
> > > Right.
> > >
> > > Given that our aim is to sync all sequences before the upgrade, do we
> > > need to synchronize sequences even at CREATE SUBSCRIPTION time? In
> > > cases where there are a large number of sequences, synchronizing
> > > sequences in addition to tables could be overhead and make less sense,
> > > because sequences can again be out-of-sync quickly and typically
> > > CREATE SUBSCRIPTION is not created just before the upgrade.
> > >
> >
> > I think for the upgrade one should be creating a subscription just
> > before the upgrade. Isn't something similar is done even in the
> > upgrade steps you shared once [1]?
>
> I might be missing something but in the blog post they created
> subscriptions in various ways, waited for the initial table data sync
> to complete, and then set the sequence values with a buffer based on
> the old cluster. What I imagined with this sequence synchronization
> feature is that after the initial table sync completes, we stop to
> execute further transactions on the publisher, synchronize sequences
> using REFRESH PUBLICATION {SEQUENCES}, and resume the application to
> execute transactions on the subscriber. So a subscription would be
> created just before the upgrade.

Re: DOCS: Generated table columns are skipped by logical replication

2024-06-18 Thread Amit Kapila
On Tue, Jun 18, 2024 at 12:11 PM Peter Smith  wrote:
>
> While reviewing another thread that proposes to include "generated
> columns" support for logical replication [1] I was looking for any
> existing PostgreSQL documentation on this topic.
>
> But, I found almost nothing about it at all -- I only saw one aside
> mention saying that logical replication low-level message information
> is not sent for generated columns [2].
>
> ~~
>
> IMO there should be some high-level place in the docs where the
> behaviour for logical replication w.r.t. generated columns is
> described.
>

+1.

> There are lots of candidate places which could talk about this topic.
> * e.g.1 in "Generated Columns" (section 5.4)
> * e.g.2 in LR "Column-Lists" docs (section 29.5)
> * e.g.3 in LR "Restrictions" docs (section 29.7)
> * e.g.4 in the "CREATE PUBLICATION" reference page
>
> For now, I have provided just a simple patch for the "Generated
> Columns" section [3]. Perhaps it is enough.
>

Can we try to clarify whether their corresponding values are replicated?


-- 
With Regards,
Amit Kapila.




Re: replace strtok()

2024-06-18 Thread Ranier Vilela
Em ter., 18 de jun. de 2024 às 04:18, Peter Eisentraut 
escreveu:

> Under the topic of getting rid of thread-unsafe functions in the backend
> [0], here is a patch series to deal with strtok().
>
> Of course, strtok() is famously not thread-safe and can be replaced by
> strtok_r().  But it also has the wrong semantics in some cases, because
> it considers adjacent delimiters to be one delimiter.  So if you parse
>
>  SCRAM-SHA-256$:$:
>
> with strtok(), then
>
>  SCRAM-SHA-256$$::$$::
>
> parses just the same.  In many cases, this is arguably wrong and could
> hide mistakes.
>
> So I'm suggesting to use strsep() in those places.  strsep() is
> nonstandard but widely available.
>
> There are a few places where strtok() has the right semantics, such as
> parsing tokens separated by whitespace.  For those, I'm using strtok_r().
>
> A reviewer job here would be to check whether I made that distinction
> correctly in each case.
>
> On the portability side, I'm including a port/ replacement for strsep()
> and some workaround to get strtok_r() for Windows.  I have included
> these here as separate patches for clarity.
>
+1 for making the code thread-safe.
But I would like to see more const char * where this is possible.

For example, in pg_locale.c
IMO, the token variable can be const char *.

At least strchr expects a const char * as the first parameter.

I found another implementation of strsep; it seems lighter to me.
I will attach it for consideration; however, I have not done any testing.

best regards,
Ranier Vilela
/* strsep.h
 *
 *  Provides the 4.4BSD strsep(3) function for those that don't have it.
 *
 *  Copyright 2011 Michael Thomas Greer
 *  Distributed under the Boost Software License, Version 1.0.
 *  ( See accompanying file LICENSE_1_0.txt or copy at
 *   http://www.boost.org/LICENSE_1_0.txt )
 *
 *  Including this file modifies the std namespace in C++.
 *
 *  Don't include this file if your compiler provides the strsep function in
 *  <string.h>.
 *  Make sure your build process tests for this and behaves accordingly!
 *
 */

#include <string.h>

char *
strsep(char **stringp, const char *delim)
{
char *result;

if ((stringp == NULL) || (*stringp == NULL))
return NULL;

result = *stringp;

while(**stringp && !(strchr(delim, **stringp)))
++*stringp;

if (**stringp)
*(*stringp)++ = '\0';
else
*stringp = NULL;

return result;
}


jsonapi type fixups

2024-06-18 Thread Peter Eisentraut
I have this patch series that fixes up the types of the new incremental 
JSON API a bit.  Specifically, it uses "const" throughout so that the 
top-level entry points such as pg_parse_json_incremental() can declare 
their arguments as const char * instead of just char *.  This just 
works, it doesn't require any new casting tricks.  In fact, it removes a 
few unconstify() calls.


Also, a few arguments and variables that relate to object sizes should 
be size_t rather than int.  At this point, this mainly makes the API 
better self-documenting.  I don't think it actually works to parse 
larger than 2 GB chunks (not tested).

From 24546d9043f0bbde185dafca74f9eb7225267a68 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 18 Jun 2024 13:42:23 +0200
Subject: [PATCH 1/3] jsonapi: Use size_t

Use size_t instead of int for object sizes in the jsonapi.  This makes
the API better self-documenting.
---
 src/common/jsonapi.c| 24 
 src/common/parse_manifest.c |  2 +-
 src/include/common/jsonapi.h|  8 
 src/include/common/parse_manifest.h |  2 +-
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/common/jsonapi.c b/src/common/jsonapi.c
index 172310f6f19..f71c1c54b2a 100644
--- a/src/common/jsonapi.c
+++ b/src/common/jsonapi.c
@@ -85,7 +85,7 @@ struct JsonParserStack
 {
int stack_size;
char   *prediction;
-   int pred_index;
+   size_t  pred_index;
/* these two are indexed by lex_level */
char  **fnames;
bool   *fnull;
@@ -212,7 +212,7 @@ static char JSON_PROD_GOAL[] = {JSON_TOKEN_END, JSON_NT_JSON, 0};
 
 static inline JsonParseErrorType json_lex_string(JsonLexContext *lex);
 static inline JsonParseErrorType json_lex_number(JsonLexContext *lex, char *s,
-                                                 bool *num_err, int *total_len);
+                                                 bool *num_err, size_t *total_len);
 static inline JsonParseErrorType parse_scalar(JsonLexContext *lex, JsonSemAction *sem);
 static JsonParseErrorType parse_object_field(JsonLexContext *lex, JsonSemAction *sem);
 static JsonParseErrorType parse_object(JsonLexContext *lex, JsonSemAction *sem);
@@ -269,10 +269,10 @@ lex_expect(JsonParseContext ctx, JsonLexContext *lex, JsonTokenType token)
  * str is of length len, and need not be null-terminated.
  */
 bool
-IsValidJsonNumber(const char *str, int len)
+IsValidJsonNumber(const char *str, size_t len)
 {
boolnumeric_error;
-   int total_len;
+   size_t  total_len;
JsonLexContext dummy_lex;
 
if (len <= 0)
@@ -324,7 +324,7 @@ IsValidJsonNumber(const char *str, int len)
  */
 JsonLexContext *
 makeJsonLexContextCstringLen(JsonLexContext *lex, char *json,
-                             int len, int encoding, bool need_escapes)
+                             size_t len, int encoding, bool need_escapes)
 {
if (lex == NULL)
{
@@ -650,7 +650,7 @@ JsonParseErrorType
 pg_parse_json_incremental(JsonLexContext *lex,
  JsonSemAction *sem,
  char *json,
- int len,
+ size_t len,
  bool is_last)
 {
JsonTokenType tok;
@@ -888,7 +888,7 @@ pg_parse_json_incremental(JsonLexContext *lex,
}
else
{
-            int         tlen = (lex->token_terminator - lex->token_start);
+            ptrdiff_t   tlen = (lex->token_terminator - lex->token_start);
 

pstack->scalar_val = palloc(tlen + 1);

memcpy(pstack->scalar_val, lex->token_start, tlen);
@@ -1332,7 +1332,7 @@ json_lex(JsonLexContext *lex)
 * recursive call
 */
StringInfo  ptok = &(lex->inc_state->partial_token);
-   int added = 0;
+   size_t  added = 0;
booltok_done = false;
JsonLexContext dummy_lex;
JsonParseErrorType partial_result;
@@ -1354,7 +1354,7 @@ json_lex(JsonLexContext *lex)
break;
}
 
-   for (int i = 0; i < lex->input_leng

Re: post-freeze damage control

2024-06-18 Thread Alena Rybakina
Sorry, I've just noticed that my previous message was rendered incorrectly.
I have rewritten it below.


As I understand it, you talked about speeding up SeqScan by faster
re-searching through the use of a hash table. At the same time, we have
to build it beforehand, during the initial lookup of tuples, right?


I found this information in the ExecEvalHashedScalarArrayOp function, 
and I assume you meant this function in your message, right?


But I couldn't find where the cycling of a long expression on each
incoming tuple happens. Could you tell me what function or functionality
you were talking about? I looked at the ExecSeqScan function, but I
didn't see it there.


On 18.06.2024 13:05, Alena Rybakina wrote:


Hi! Unfortunately, I was not able to fully understand your message. Could you
explain it to me, please?


On 09.04.2024 16:20, Andrei Lepikhov wrote:


Moreover, it helps even SeqScan: attempting to find a value in the 
hashed array is much faster than cycling a long-expression on each 
incoming tuple.


As I understand it, you talked about speeding up SeqScan by faster
re-searching through the use of a hash table. At the same time, we have
to build it beforehand, during the initial lookup of tuples, right?


I found this information in the ExecEvalHashedScalarArrayOp
function, and I assume you meant this function in your message.


But I couldn't find where the cycling of a long expression on each
incoming tuple happens. Could you tell me what function or functionality
you were talking about? I looked at the ExecSeqScan function, but I
didn't see it there.


--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Ashutosh Sharma
Hi everyone,

I've noticed that truncating mapped catalogs causes the server to
crash due to an assertion failure. Here are the details:

Executing below commands:

-- set allow_system_table_mods TO on;
-- truncate table pg_type;

Results into a server crash with below backtrace:

...
#2  0x55736767537d in ExceptionalCondition
(conditionName=0x5573678c5760 "relation->rd_rel->relkind ==
RELKIND_INDEX", fileName=0x5573678c4b28 "relcache.c",
lineNumber=3896) at assert.c:66
#3  0x557367664e31 in RelationSetNewRelfilenumber
(relation=0x7f68240f1d58, persistence=112 'p') at relcache.c:3896
#4  0x55736715b952 in ExecuteTruncateGuts
(explicit_rels=0x55736989e5b0, relids=0x55736989e600,
relids_logged=0x0, behavior=DROP_RESTRICT, restart_seqs=false,
run_as_table_owner=false) at tablecmds.c:2146
#5  0x55736715affa in ExecuteTruncate (stmt=0x55736989f950) at
tablecmds.c:1877
#6  0x557367493693 in standard_ProcessUtility
(pstmt=0x55736989fa00, queryString=0x55736989eed0 "truncate table
pg_type;", readOnlyTree=false,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x5573698a0330, qc=0x7ffe19367ac0) at utility.c:728

As seen from the backtrace above, the assertion failure occurs in
'RelationSetNewRelfilenumber()' at:

if (RelationIsMapped(relation))
{
/* This case is only supported for indexes */
Assert(relation->rd_rel->relkind == RELKIND_INDEX);
}

I would like to know why we only expect indexes here and not regular
catalog tables. For instance, pg_type is a mapped relation but not an
index, leading to the failure in this case. Should we consider changing
this Assert condition from RELKIND_INDEX to (RELKIND_INDEX ||
RELKIND_RELATION)?

Additionally, is it advisable to restrict truncation of the pg_class
table? pg_class has a kind of circular dependency on itself that does
not apply to the other catalog tables.

--
With Regards,
Ashutosh Sharma.




Re: Missing docs for new enable_group_by_reordering GUC

2024-06-18 Thread Alexander Korotkov
On Tue, Jun 18, 2024 at 9:14 AM Andrei Lepikhov  wrote:
> On 6/18/24 09:32, Bruce Momjian wrote:
> > This commit added enable_group_by_reordering:
> >
> >   commit 0452b461bc4
> >   Author: Alexander Korotkov 
> >   Date:   Sun Jan 21 22:21:36 2024 +0200
> > It mentions it was added as a GUC to postgresql.conf, but I see no SGML
> > docs for this new GUC value.  Would someone please add docs for this?
> > Thanks.
> It is my mistake, sorry for that. See the patch in attachment.

Bruce, thanks for noticing.  Andrei, thank you for providing a fix.
Please check the revised patch.

--
Regards,
Alexander Korotkov
Supabase


v2-0001-Add-doc-entry-for-the-new-GUC-paramenter-enable_g.patch
Description: Binary data


Re: jsonpath: Missing regex_like && starts with Errors?

2024-06-18 Thread Peter Eisentraut

On 18.06.24 04:17, Chapman Flack wrote:

On 06/17/24 19:17, David E. Wheeler wrote:

[1]: 
https://github.com/postgres/postgres/blob/82ed67a/src/backend/utils/adt/jsonpath_exec.c#L2058-L2059


Huh, I just saw something peculiar, skimming through the code:

https://github.com/postgres/postgres/blob/82ed67a/src/backend/utils/adt/jsonpath_exec.c#L1385

We allow .boolean() applied to a jbvBool, a jbvString (those are the only
two possibilities allowed by the standard), or to a jbvNumeric (!), but
only if it can be serialized and then parsed as an int4, otherwise we say
ERRCODE_NON_NUMERIC_SQL_JSON_ITEM, or if it survived all that we call it
true if it isn't zero.

I wonder what that alternative is doing there.


Are you saying we shouldn't allow .boolean() to be called on a JSON number?

I would concur that that's what the spec says.





Re: Separate HEAP WAL replay logic into its own file

2024-06-18 Thread Melanie Plageman
On Mon, Jun 17, 2024 at 9:12 PM Li, Yong  wrote:
>
> As a newcomer, when I was walking through the code looking for WAL replay 
> related code, it was relatively easy for me to find them for the B-Tree 
> access method because of the “xlog” hint in the file names. It took me a 
> while to find the same for the heap access method. When I finally found them 
> (via text search), it was a small surprise. Having different file 
> organizations for different access methods gives me this urge to make 
> everything consistent. I think it will make it easier for newcomers, and it 
> will reduce the mental load for everyone to remember that heap replay is 
> inside the heapam.c not some “???xlog.c”.

That makes sense. The branch for PG18 has not been cut yet, so I
recommend registering this patch for the July commitfest [1] so it
doesn't get lost.

- Melanie

[1] https://commitfest.postgresql.org/48/




Re: Missing docs for new enable_group_by_reordering GUC

2024-06-18 Thread Pavel Borisov
Hi, Alexander!

On Tue, 18 Jun 2024 at 16:13, Alexander Korotkov 
wrote:

> On Tue, Jun 18, 2024 at 9:14 AM Andrei Lepikhov  wrote:
> > On 6/18/24 09:32, Bruce Momjian wrote:
> > > This commit added enable_group_by_reordering:
> > >
> > >   commit 0452b461bc4
> > >   Author: Alexander Korotkov 
> > >   Date:   Sun Jan 21 22:21:36 2024 +0200
> > > It mentions it was added as a GUC to postgresql.conf, but I see no SGML
> > > docs for this new GUC value.  Would someone please add docs for this?
> > > Thanks.
> > It is my mistake, sorry for that. See the patch in attachment.
>
> Bruce, thanks for noticing.  Andrei, thank you for providing a fix.
> Please check the revised patch.
>
I briefly looked into this docs patch. The planner GUCs are arranged
alphabetically, so enable_group_by_reordering should come after
enable_gathermerge, not before it.

+  Enables or disables the reordering of keys in a
+GROUP BY clause to match the ordering keys of a
+child node of the plan, such as an index scan. When turned off,
keys
+in a GROUP BY clause are only reordered to match
+the ORDER BY clause, if any. The default is
+on.
I'd also suggest the same style as already exists
for the enable_presorted_aggregate GUC, i.e.:

Controls if the query planner will produce a plan which will provide
GROUP BY keys presorted in the order of keys of a child
node of the plan, such as an index scan. When disabled, the query planner
will produce a plan with GROUP BY keys only reordered to
match
the ORDER BY clause, if any. When enabled, the planner
will try to produce a more efficient plan. The default value is on.

Regards,
Pavel Borisov
Supabase


Re: tls 1.3: sending multiple tickets

2024-06-18 Thread Daniel Gustafsson
> On 17 Jun 2024, at 19:38, Andres Freund  wrote:

> Note the second to last paragraph: Because we use SSL_OP_NO_TICKET we trigger
> use of stateful tickets. Which afaict are never going to be useful, because we
> don't share the necessary state.

Nice catch, I learned something new today.  I was under the impression that the
flag turned off all tickets, but clearly not.

> I guess openssl really could have inferred this from the fact that we *do*
> call SSL_CTX_set_session_cache_mode(SSL_SESS_CACHE_OFF), b

Every day with the OpenSSL API is an adventure.

> Seems we ought to use SSL_CTX_set_num_tickets() to prevent issuing the useless
> tickets?

Agreed, in 1.1.1 and above as the API was only introduced then.  LibreSSL added
the API in 3.5.4 but only for compatibility since it doesn't support TLS
tickets at all.

> It seems like a buglet in openssl that it forces each session tickets to be
> sent in its own packet (it does an explicit BIO_flush(), so even if we
> buffered between openssl and OS, as I think we should, we'd still send it
> separately), but I don't really understand most of this stuff.

I don't see anything in the RFCs so not sure.

The attached applies this, and I think this is backpatching material since we
arguably fail to do what we say in the code.  AFAIK we don't have a hard rule
against backpatching changes to autoconf/meson?

--
Daniel Gustafsson



v1-0001-Disable-all-TLS-session-tickets.patch
Description: Binary data


Re: Missing docs for new enable_group_by_reordering GUC

2024-06-18 Thread Pavel Borisov
>
> Controls if the query planner will produce a plan which will provide
> GROUP BY keys presorted in the order of keys of a child
> node of the plan, such as an index scan. When disabled, the query planner
> will produce a plan with GROUP BY keys only reordered to
> match
> the ORDER BY clause, if any. When enabled, the planner
> will try to produce a more efficient plan. The default value is on.
>
A correction to my own wording: presorted -> sorted, reordered -> sorted

Regards,
Pavel


Re: CI and test improvements

2024-06-18 Thread Justin Pryzby
On Fri, Jun 14, 2024 at 08:34:37AM -0700, Andres Freund wrote:
> Hm. There actually already is the mingw ccache installed. The images for 
> mingw and msvc used to be separate (for startup performance when using 
> containers), but we just merged them.  So it might be easiest to just 
> explicitly use the ccache from there

> (also an explicit path might be faster).

I don't think the path search is significant; it's fast so long as
there's no choco stub involved.

> Could you check if that's fast?

Yes, it is.

> If not, we can just install the binaries distributed by the project, it's 
> just more work to keep up2date that way. 

I guess you mean a separate line to copy choco's ccache.exe somewhere.

-- 
Justin
>From ad03da2cfa3ecf23334f8b31f2e4529ba3094512 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 21:53:22 -0500
Subject: [PATCH 1/7] cirrus/windows: add compiler_warnings_script

The goal is to fail due to warnings only after running tests.
(At least historically, it's too slow to run a separate windows VM to
compile with -Werror.)

https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de

I'm not sure how to write this test in windows shell; it's also easy to
write something that doesn't work in posix sh, since windows shell is
interpretting && and ||...

See also:
8a1ce5e54f6d144e4f8e19af7c767b026ee0c956
https://cirrus-ci.com/task/6241060062494720
https://cirrus-ci.com/task/6496366607204352

20221104161934.gb16...@telsasoft.com
20221031223744.gt16...@telsasoft.com
20221104161934.gb16...@telsasoft.com
20221123054329.gg11...@telsasoft.com
20230224002029.gq1...@telsasoft.com

ci-os-only: windows
---
 .cirrus.tasks.yml | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 33646faeadf..5a2b64f64c2 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -566,12 +566,21 @@ task:
 
   build_script: |
 vcvarsall x64
-ninja -C build
+ninja -C build |tee build.txt
+REM Since pipes lose the exit status of the preceding command, rerun the compilation
+REM without the pipe, exiting now if it fails, to avoid trying to run checks
+ninja -C build > nul
 
   check_world_script: |
 vcvarsall x64
 meson test %MTEST_ARGS% --num-processes %TEST_JOBS%
 
+  # This should be last, so check_world is run even if there are warnings
+  always:
+compiler_warnings_script:
+  # this avoids using metachars which would be interpretted by the windows shell
+  - sh -c 'if grep ": warning " build.txt; then exit 1; fi; exit 0'
+
   on_failure:
 <<: *on_failure_meson
 crashlog_artifacts:
-- 
2.42.0

>From 78031c45afad402397e4efa1389e6396557cd405 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 22:05:13 -0500
Subject: [PATCH 2/7] cirrus/windows: ccache

ccache 4.7 added support for depend mode with MSVC, and ccache v4.10
supports PCH, so this is now viable.

https://www.postgresql.org/message-id/flat/20220522232606.GZ19626%40telsasoft.com

https://cirrus-ci.com/task/5213936495099904 build 29sec
---
 .cirrus.tasks.yml | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 5a2b64f64c2..6ebbc1ccd81 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -536,6 +536,12 @@ task:
   env:
 TEST_JOBS: 8 # wild guess, data based value welcome
 
+CC: c:\msys64\ucrt64\bin\ccache.exe cl.exe
+CCACHE_DIR: $CIRRUS_WORKING_DIR/.ccache
+CCACHE_MAXSIZE: "500M"
+CCACHE_SLOPPINESS: pch_defines,time_macros
+CCACHE_DEPEND: 1
+
 # Cirrus defaults to SetErrorMode(SEM_NOGPFAULTERRORBOX | ...). That
 # prevents crash reporting from working unless binaries do SetErrorMode()
 # themselves. Furthermore, it appears that either python or, more likely,
@@ -553,6 +559,9 @@ task:
   setup_additional_packages_script: |
 REM choco install -y --no-progress ...
 
+  ccache_cache:
+folder: $CCACHE_DIR
+
   setup_hosts_file_script: |
 echo 127.0.0.1 pg-loadbalancetest >> c:\Windows\System32\Drivers\etc\hosts
 echo 127.0.0.2 pg-loadbalancetest >> c:\Windows\System32\Drivers\etc\hosts
@@ -562,7 +571,7 @@ task:
   # Use /DEBUG:FASTLINK to avoid high memory usage during linking
   configure_script: |
 vcvarsall x64
-meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" build
+meson setup build --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" -Dc_args="/Z7"
 
   build_script: |
 vcvarsall x64
@@ -571,6 +580,9 @@ task:
 REM without the pipe, exiting now if it

Re: jsonpath: Missing regex_like && starts with Errors?

2024-06-18 Thread Chapman Flack
On 06/18/24 08:30, Peter Eisentraut wrote:
> Are you saying we shouldn't allow .boolean() to be called on a JSON number?
> 
> I would concur that that's what the spec says.

Or, if we want to extend the spec and allow .boolean() on a JSON number,
should it just check that the number is nonzero or zero, rather than
checking that it can be serialized then deserialized as an int4 and
otherwise complaining that it isn't a number?

Which error code to use seems to be a separate issue. Is it possible that
more codes like 2202V non-boolean SQL/JSON item were added in a later spec
than we developed the code from?

I have not read through all of the code to see in how many other places
the error code doesn't match the spec.

Regards,
-Chap




Meson far from ready on Windows

2024-06-18 Thread Dave Page
Hi

Further to my previous report [1] about zlib detection not working with
Meson on Windows, I found it's similarly or entirely broken for the
majority of other dependencies, none of which are tested on the buildfarm
as far as I can see.

For convenience, I've put together a number of Github actions [2] that show
how to build the various dependencies on Windows, in the most
standard/recommended way I can find for each. Another action combines these
into a single downloadable archive that people can test with, and another
one uses that archive to build PostgreSQL 12 through 16, all successfully.

You can see build logs, and download the various builds/artefacts from the
Github Workflow pages.

My next task was to extend that to support PostgreSQL 17 and beyond, which
is where I started to run into problems. I've attempted builds using Meson
with each of the dependencies defined in the old-style config.pl, both with
and without modifying the INCLUDE/LIBS envvars to include the directories
for the dependencies (as was found to work in the previous discussion re
zlib):

Will not successfully configure at all:

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dgssapi=enabled
build-gssapi

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dicu=enabled
build-icu

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dlibxml=enabled
build-libxml

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dlz4=enabled
build-lz4

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dnls=enabled
build-nls

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Duuid=ossp
build-uuid

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dzstd=enabled
build-zstd

Configured with modified LIBS/INCLUDE:

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dlibxslt=enabled
build-libxslt

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dssl=openssl
build-openssl

meson setup --auto-features=disabled
-Dextra_include_dirs=C:\build64\include
-Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dzlib=enabled
build-zlib

I think it's important to note that Meson largely seems to want to use
pkgconfig and cmake to find dependencies. pkgconfig isn't really a thing on
Windows (it is available, but isn't commonly used), and even cmake would
typically rely on finding things in either known installation directories
or through lib/include vars. There really aren't standard directories like
/usr/lib or /usr/include as we find on unixes, or pkgconfig files for
everything.

For the EDB installers, the team has hand-crafted pkgconfig files for
everything, which is clearly not a proper solution.

I can provide logs and run tests if anyone wants me to do so. Testing so
far has been with the Ninja backend, in a VS2022 x86_64 native environment.

[1]
https://www.postgresql.org/message-id/CA+OCxozrPZx57ue8rmhq6CD1Jic5uqKh80=vtpzurskesn-...@mail.gmail.com
[2] https://github.com/dpage/winpgbuild/actions

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com


Re: CompilerWarnings task does not catch C++ warnings

2024-06-18 Thread Tom Lane
Peter Eisentraut  writes:
> The CompilerWarnings task on Cirrus CI does not catch warnings in C++ 
> code.  It tries to make warnings fatal by passing COPT='-Werror', but 
> that does not apply to C++ compilations.
> I suggest that we just add COPT to CXXFLAGS as well.  I think passing 
> -Werror is just about the only reasonable use of COPT nowadays, so 
> making that more robust seems useful.  I don't think there is a need for 
> a separate make variable for C++ here.

+1, but what about the meson side of things?

regards, tom lane




Re: may be a buffer overflow problem

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-17 22:42:41 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2024-06-17 23:52:54 +0200, Daniel Gustafsson wrote:
> >> Since sqlca is, according to our docs, present in other database systems we
> >> should probably keep it a 5-char array for portability reasons.  Adding a
> >> padding character should be fine though.
> 
> > How about, additionally, adding __attribute__((nonstring))? Wrapped in an
> > attribute, of course.  That'll trigger warning for many unsafe uses, like
> > strlen().
> 
> What I was advocating for is that we make it *safe* for strlen, not
> that we double down on awkward, non-idiomatic, unsafe coding
> practices.

Given that other platforms apparently have it as a no-trailing-zero-byte
"string", I'm not sure that is clearly a win. Also, if they just
copy the field onto the stack or such, they'll have the same issue again.

And then there is this:

> Admittedly, I'm not sure how we could persuade compilers that
> a char[5] followed by a char field is a normal C string ...

I think the explicit backstop of a zero byte is a good idea, but I don't think
we'd just want to rely on it.

Greetings,

Andres Freund




Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Robert Haas
On Tue, Jun 18, 2024 at 8:10 AM Ashutosh Sharma  wrote:
> I've noticed that truncating mapped catalogs causes the server to
> crash due to an assertion failure. Here are the details:
>
> Executing below commands:
>
> -- set allow_system_table_mods TO on;
> -- truncate table pg_type;

If the operation isn't allowed without turning on
allow_system_table_mods, that means that doing it is probably a bad
idea and will probably break stuff, as happened here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jun 18, 2024 at 8:10 AM Ashutosh Sharma  wrote:
>> Executing below commands:
>> -- set allow_system_table_mods TO on;
>> -- truncate table pg_type;

> If the operation isn't allowed without turning on
> allow_system_table_mods, that means that doing it is probably a bad
> idea and will probably break stuff, as happened here.

Nothing good can come of truncating *any* core system catalog --- what
do you think you'll still be able to do afterwards?

I think the assertion you noticed is there because the code path gets
traversed during REINDEX, which is an operation we do support on
system catalogs.  I have zero interest in making truncate work
on them.

regards, tom lane




Re: may be a buffer overflow problem

2024-06-18 Thread Peter Eisentraut

On 18.06.24 04:35, Andres Freund wrote:

On 2024-06-17 23:52:54 +0200, Daniel Gustafsson wrote:

Since sqlca is, according to our docs, present in other database systems we
should probably keep it a 5-char array for portability reasons.  Adding a
padding character should be fine though.


How about, additionally, adding __attribute__((nonstring))? Wrapped in an
attribute, of course.  That'll trigger warning for many unsafe uses, like
strlen().

It doesn't quite detect the problematic case in ecpg_log() though, seems it
doesn't understand fprintf() well enough (it does trigger in simple printf()
cases, because they get reduced to puts(), which it understands).


See also .


Adding nonstring possibly allow us to re-enable -Wstringop-truncation,


Note that that would only work because we now always use our own 
snprintf(), which is not covered by that option.  I mean, we could still 
do it, but it's not like the reasons we originally disabled that option 
have actually gone away.






Re: CompilerWarnings task does not catch C++ warnings

2024-06-18 Thread Peter Eisentraut

On 18.06.24 16:08, Tom Lane wrote:

Peter Eisentraut  writes:

The CompilerWarnings task on Cirrus CI does not catch warnings in C++
code.  It tries to make warnings fatal by passing COPT='-Werror', but
that does not apply to C++ compilations.
I suggest that we just add COPT to CXXFLAGS as well.  I think passing
-Werror is just about the only reasonable use of COPT nowadays, so
making that more robust seems useful.  I don't think there is a need for
a separate make variable for C++ here.


+1, but what about the meson side of things?


If you use meson {setup|configure} --werror, that would affect both C 
and C++ compilers.






Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Ashutosh Sharma
Hi,

On Tue, Jun 18, 2024 at 7:50 PM Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Tue, Jun 18, 2024 at 8:10 AM Ashutosh Sharma  
> > wrote:
> >> Executing below commands:
> >> -- set allow_system_table_mods TO on;
> >> -- truncate table pg_type;
>
> > If the operation isn't allowed without turning on
> > allow_system_table_mods, that means that doing it is probably a bad
> > idea and will probably break stuff, as happened here.
>
> Nothing good can come of truncating *any* core system catalog --- what
> do you think you'll still be able to do afterwards?
>
> I think the assertion you noticed is there because the code path gets
> traversed during REINDEX, which is an operation we do support on
> system catalogs.  I have zero interest in making truncate work
> on them.
>

I agree with you on that point. How about considering a complete
restriction instead?

--
With Regards,
Ashutosh Sharma.




Re: Maybe don't process multi xmax in FreezeMultiXactId() if it is already marked as invalid?

2024-06-18 Thread Maxim Orlov
Hi!

Maybe I'm too bold, but this looks like a bug to me.  At least, I don't
understand why we do not check the HEAP_XMAX_INVALID flag.
My guess is that nobody noticed that the MultiXactIdIsValid call does not
check the mentioned flag in the "first" condition, but that's just my
speculation.
Does anyone know if there are reasons to deliberately ignore the
HEAP_XMAX_INVALID flag? Or is this just an unfortunate oversight?

PFA, my approach on this issue.

-- 
Best regards,
Maxim Orlov.


v2-0001-Invalidate-xmax-if-HEAP_XMAX_INVALID-is-set.patch
Description: Binary data


Re: CompilerWarnings task does not catch C++ warnings

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 09:27:02 +0200, Peter Eisentraut wrote:
> The CompilerWarnings task on Cirrus CI does not catch warnings in C++ code.
> It tries to make warnings fatal by passing COPT='-Werror', but that does not
> apply to C++ compilations.
> 
> I suggest that we just add COPT to CXXFLAGS as well.  I think passing
> -Werror is just about the only reasonable use of COPT nowadays, so making
> that more robust seems useful.  I don't think there is a need for a separate
> make variable for C++ here.

+1

Greetings,

Andres Freund




Re: Suggest two small improvements for PITR.

2024-06-18 Thread Yura Sokolov

11.01.2024 19:58, Yura Sokolov wrote:

Good day, hackers.

Here I am to suggest two small improvements to Point In Time Recovery.

First is the ability to recover to a recovery-target-time using the timestamp 
stored in XLOG_RESTORE_POINT. It looks like this ability historically existed 
and was removed unintentionally during refactoring in commit [1]

c945af80 "Refactor checking whether we've reached the recovery target."

Second is extending the XLOG_BACKUP_END record with a timestamp, so the 
backup will have its own timestamp as well. It is a backward-compatible 
change since there was no record length check before.


Both changes help slightly on mostly idle systems, where no commits at all 
may happen between several backups, so there's no timestamp to 
recover to.


Attached sample patches are made in reverse order:
- XLOG_BACKUP_END then XLOG_RESTORE_POINT.
Second patch made by colleague by my idea.
Publishing for both is permitted.

If idea is accepted, patches for tests will be applied as well.

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=c945af80


Good day.

Here are reordered and rebased patches with tests.
Now the first patch is for XLOG_RESTORE_POINT, and the second one adds a 
timestamp to XLOG_BACKUP_END.


Btw, there's another thread by Simon Riggs with additions to 
getRecordTimestamp:


https://www.postgresql.org/message-id/flat/CANbhV-F%2B8%3DY%3DcfurfD2hjoWVUvTk-Ot9BJdw2Myc%3Dst3TsZy9g%40mail.gmail.com

I didn't rush to absorb it, because I'm not sure about recoveryTargetUseOriginTime.
Though the reaction to XLOG_END_OF_RECOVERY's timestamp could easily be 
copied from it.


I believe that to react to XLOG_CHECKPOINT_ONLINE/XLOG_CHECKPOINT_SHUTDOWN 
the CheckPoint.time field should be changed from `pg_time_t` to 
`TimestampTz`, since otherwise it interferes badly with the 
"inclusive"-ness of recovery_target_time.


-

regards,
Yura

From 3f5f03a942abf08763e9692b00c82f3bfd2c8ed4 Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Tue, 18 Jun 2024 12:07:41 +0300
Subject: [PATCH v2 1/2] Restore ability to react on timestamp in
 XLOG_RESTORE_POINT

Looks like, reaction on XLOG_RESTORE_POINT's timestamp as a
recovery_target_time were lost during refactoring in
c945af80 "Refactor checking whether we've reached the recovery target."

Co-authored-by: Sergey Fukanchik 
---
 src/backend/access/transam/xlogrecovery.c   | 43 +
 src/test/recovery/t/003_recovery_targets.pl | 19 -
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index b45b8331720..e1ba4452113 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2783,6 +2783,49 @@ recoveryStopsAfter(XLogReaderState *record)
 		return true;
 	}
 
+	if (recoveryTarget == RECOVERY_TARGET_TIME &&
+		rmid == RM_XLOG_ID && getRecordTimestamp(record, &recordXtime))
+	{
+		bool		stops_here = false;
+
+		/*
+		 * Use same conditions as in recoveryStopsBefore for transaction
+		 * records to not override transactions time handling.
+		 */
+		if (recoveryTargetInclusive)
+			stops_here = recordXtime > recoveryTargetTime;
+		else
+			stops_here = recordXtime >= recoveryTargetTime;
+
+		if (stops_here)
+		{
+			recoveryStopAfter = true;
+			recoveryStopXid = InvalidTransactionId;
+			recoveryStopTime = recordXtime;
+			recoveryStopLSN = InvalidXLogRecPtr;
+			recoveryStopName[0] = '\0';
+
+			if (info == XLOG_RESTORE_POINT)
+			{
+xl_restore_point *recordRestorePointData;
+
+recordRestorePointData = (xl_restore_point *) XLogRecGetData(record);
+
+ereport(LOG,
+		(errmsg("recovery stopping at restore point \"%.*s\", time %s",
+MAXFNAMELEN, recordRestorePointData->rp_name,
+timestamptz_to_str(recoveryStopTime;
+			}
+			else
+			{
+ereport(LOG,
+		(errmsg("recovery stopping at %s, time %s",
+xlog_identify(info),
+timestamptz_to_str(recoveryStopTime;
+			}
+		}
+	}
+
 	if (rmid != RM_XACT_ID)
 		return false;
 
diff --git a/src/test/recovery/t/003_recovery_targets.pl b/src/test/recovery/t/003_recovery_targets.pl
index 58241a68f0a..d5345c1bf1e 100644
--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -22,6 +22,7 @@ sub test_recovery_standby
 	my $recovery_params = shift;
 	my $num_rows = shift;
 	my $until_lsn = shift;
+	my $expect_in_log = shift;
 
 	my $node_standby = PostgreSQL::Test::Cluster->new($node_name);
 	$node_standby->init_from_backup($node_primary, 'my_backup',
@@ -45,6 +46,16 @@ sub test_recovery_standby
 	  $node_standby->safe_psql('postgres', "SELECT count(*) FROM tab_int");
 	is($result, qq($num_rows), "check standby content for $test_name");
 
+	if (defined $expect_in_log)
+	{
+		local $/;
+		open(my $FH, '<', $node_standby->logfile())
+			or die "Can't open standby nodes's log";
+		my $lines = <$FH>;
+		close($FH);
+		like($lines, qr/${expect_in_log}/i, "expecting \'$expect_in_lo

Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 19:58:26 +0530, Ashutosh Sharma wrote:
> On Tue, Jun 18, 2024 at 7:50 PM Tom Lane  wrote:
> >
> > Robert Haas  writes:
> > > On Tue, Jun 18, 2024 at 8:10 AM Ashutosh Sharma  
> > > wrote:
> > >> Executing below commands:
> > >> -- set allow_system_table_mods TO on;
> > >> -- truncate table pg_type;
> >
> > > If the operation isn't allowed without turning on
> > > allow_system_table_mods, that means that doing it is probably a bad
> > > idea and will probably break stuff, as happened here.
> >
> > Nothing good can come of truncating *any* core system catalog --- what
> > do you think you'll still be able to do afterwards?
> >
> > I think the assertion you noticed is there because the code path gets
> > traversed during REINDEX, which is an operation we do support on
> > system catalogs.  I have zero interest in making truncate work
> > on them.
> >
> 
> I agree with you on that point. How about considering a complete
> restriction instead?

What's the point?  There are occasional cases where doing something dangerous
is useful, for debugging or corruption recovery. If we flat out prohibit this
we'll just need an allow_system_table_mods_but_for_real option.

Greetings,

Andres Freund




Re: RFC: adding pytest as a supported test framework

2024-06-18 Thread Jacob Champion
(slowly catching up from the weekend email backlog)

On Fri, Jun 14, 2024 at 5:10 AM Robert Haas  wrote:
> I mean, both Perl and Python are Turing-complete.

Tom responded to this better than I could have, but I don't think this
is a helpful statement. In fact I opened the unconference session with
it and immediately waved it away as not-the-point.

> I just don't believe in the idea that we're going to write one
> category of tests in one language and another category in another
> language.

You and I will probably not agree, then, because IMO we already do
that. SQL behavior is tested in SQL via pg_regress characterization
tests. End-to-end tests are written in Perl. Lower-level tests are
often written in C (and, unfortunately, then driven in Perl instead of
C; see above mail to Noah).

I'm fundamentally a polyglot tester by philosophy, so I don't see
careful use of multiple languages as an inherent problem to be solved.
They increase complexity (considerably so!) but generally provide
sufficient value to offset the cost.

> As soon as we open the door to Python tests, people are
> going to start writing the TAP tests that they would have written in
> Perl in Python instead.

There's a wide spectrum of opinions between yours (which I will
cheekily paraphrase as "people will love testing in Python so much
they'll be willing to reinvent all of the wheels" -- which in the
short term would increase maintenance cost but in the long term sounds
like a very good problem to have), and people who seem to think that
new test suite infrastructure would sit unused because no one wants to
write tests anyway (to pull a strawman out of some hallway
conversations at PGConf.dev). I think the truth is probably somewhere
in the middle?

My prior mail was an attempt to bridge the gap between today and the
medium term, by introducing a series of compromises and incremental
steps in response to specific fears. We can jump forward to the end
state and try to talk about it, but I don't control the end state and
I don't have a crystal ball.

> So as I see
> it, the only reasonable plan here if we want to introduce testing in
> Python (or C#, or Ruby, or Go, or JavaScript, or Lua, or LOLCODE) is
> to try to achieve a reasonable degree of parity between that language
> and Perl. Because then we can at least review the new infrastructure
> all at once, instead of incrementally spread across many patches
> written, reviewed, and committed by many different people.

I don't at all believe that a test API which is ported en masse as a
horizontal layer, without motivating vertical slices of test
functionality, is going to be fit for purpose.

And "written, reviewed, and committed by many different people" is a
feature for me, not a bug. One of the goals of the thread is to
encourage more community test writing than we currently have.
Otherwise, I could have kept silent (I am very happy with my personal
suite and have been comfortably maintaining it for a while). I am
trying to build community momentum around a pain point that is
currently rusted in place.

> Consider the meson build system project. To get that committed, Andres
> had to make it do pretty much everything MSVC could do and mostly
> everything that configure could do

I think some lessons can be pulled from that, but fundamentally that's
a port of the build infrastructure done by a person with a commit bit.
There are some pretty considerable differences. (And even then, it
wasn't "done" with the first volley of patches, right?)

Thanks,
--Jacob




Re: RFC: adding pytest as a supported test framework

2024-06-18 Thread Jacob Champion
On Fri, Jun 14, 2024 at 8:49 AM Tom Lane  wrote:
> I think that's an oversimplified analysis.  Sure, the languages are
> both Turing-complete, but for our purposes here they are both simply
> glue languages around some set of testing facilities.  Some of those
> facilities will be provided by the base languages (plus whatever
> extension modules we choose to require) and some by code we write.
> The overall experience of writing tests will be determined by the
> testing facilities far more than by the language used for glue.

+1. As an example, the more extensive (and high-quality) a language's
standard library, the more tests you'll be able to write. Convincing
committers to adopt a new third-party dependency is hard (for good
reason); the strength of the standard library should be considered as
a point of technical comparison.

> That being the case, I do agree with your point that Python
> equivalents to most of PostgreSQL::Test will need to be built up PDQ.
> Maybe they can be better than the originals, in features or ease of
> use, but "not there at all" is not better.

There's a wide gulf between "not there at all" and "reimplement it all
as a prerequisite for v1" as Robert proposed. I'm arguing for a middle
ground.

> But what I'd really like to see is some comparison of the
> language-provided testing facilities that we're proposing
> to depend on.  Why is pytest (or whatever) better than Test::More?

People are focusing a lot on failure reporting, and I agree with them,
but I did try to include more than just that in my OP.

I'll requote what I personally think is the #1 killer feature of
pytest, which prove and Test::More appear to be unable to accomplish
on their own: configurable isolation of tests from each other via
fixtures [1].

Problem 1 (rerun failing tests): One architectural roadblock to this
in our Test::More suite is that tests depend on setup that's done by
previous tests. pytest allows you to declare each test's setup
requirements via pytest fixtures, letting the test runner build up the
world exactly as it needs to be for a single isolated test. These
fixtures may be given a "scope" so that multiple tests may share the
same setup for performance or other reasons.
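To make the fixture mechanism concrete, here is a minimal illustrative sketch; the node/table dicts stand in for real cluster setup, and all names here are invented for illustration rather than taken from any actual suite:

```python
# Sketch of pytest fixture-based isolation. The dicts are stand-ins for
# real initdb/start-server setup; names are hypothetical.
import pytest

@pytest.fixture(scope="module")
def primary():
    # Module scope: one shared "server" for every test in this file,
    # torn down once after the last test finishes.
    node = {"name": "primary", "running": True}
    yield node
    node["running"] = False  # teardown

@pytest.fixture
def fresh_table(primary):
    # Function scope (the default): each test gets its own empty "table"
    # built on the shared node, so any single test can be rerun alone.
    return {"node": primary, "rows": []}

def test_insert(fresh_table):
    fresh_table["rows"].append(1)
    assert fresh_table["rows"] == [1]

def test_starts_empty(fresh_table):
    # Isolation: test_insert's row is not visible here.
    assert fresh_table["rows"] == []
```

Running `pytest -k test_starts_empty` builds only the setup that one test declares, which is what makes single-test reruns cheap.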

When I'm doing red-to-green feature development (e.g. OAuth) or
green-to-green refactoring (e.g. replacement of libiddawc with libcurl
in OAuth), quick cycle time and reduction of noise is extremely
important. I want to be able to rerun just the single red test I care
about before moving on.

(Tests may additionally be organized with custom attributes. My OAuth
suite contains tests that must run slowly due to mandatory timeouts;
I've marked them as slow, and they can be easily skipped from the test
runner.)
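A hedged sketch of what marking slow tests looks like; the test body and its timeout are invented for illustration:

```python
# Marking a slow test so it can be deselected at the command line.
# The sleep stands in for a mandatory protocol timeout; all names here
# are hypothetical.
import time
import pytest

@pytest.mark.slow
def test_token_expiry():
    time.sleep(0.01)  # stand-in for waiting out a required timeout
    assert True

def is_marked_slow(fn):
    # Marks applied via decorator are stored on the function's pytestmark.
    return any(m.name == "slow" for m in getattr(fn, "pytestmark", []))
```

With that in place, `pytest -m "not slow"` deselects the slow tests during quick red-green cycles.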

2. The ability to break into a test case with the built-in debugger
[2] is also fantastic for quick red-green work. Much better than
print() statements.

(Along similar lines, even the ability to use Python's built-in REPL
increases velocity. Python users understand that they can execute
`python3` and be dropped into a sandbox to try out syntax or some
unfamiliar library. Searching for how to do this in Perl results in a
handful of custom-built scripts; people here may know which to use as
a Perl monk, sure, but the point is to make it easy for newcomers to
write tests.)

> I also wonder about integration of python-based testing with what
> we already have.  A significant part of what you called the meson
> work had to do with persuading pg_regress, isolationtester, etc
> to output test results in the common format established by TAP.
> Am I right in guessing that pytest will have nothing to do with that?

Andres covered this pretty well. I will note that I had problems with
pytest-tap itself [3], and I'm unclear whether that represents a bug
in pytest-tap or a bug in pytest.

Thanks,
--Jacob

[1] https://docs.pytest.org/en/stable/how-to/fixtures.html
[2] https://docs.pytest.org/en/stable/how-to/failures.html
[3] https://github.com/python-tap/pytest-tap/issues/30




Re: Meson far from ready on Windows

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 14:53:53 +0100, Dave Page wrote:
> My next task was to extend that to support PostgreSQL 17 and beyond, which
> is where I started to run into problems. I've attempted builds using Meson
> with each of the dependencies defined in the old-style config.pl, both with
> and without modifying the INCLUDE/LIBS envvars to include the directories
> for the dependencies (as was found to work in the previous discussion re
> zlib):
> 
> Will not successfully configure at all:

Do you have logs for those failures?


> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dgssapi=enabled
> build-gssapi

> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dicu=enabled
> build-icu
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dlibxml=enabled
> build-libxml
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dlz4=enabled
> build-lz4
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dnls=enabled
> build-nls
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Duuid=ossp
> build-uuid
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dzstd=enabled
> build-zstd
> 
> Configured with modified LIBS/INCLUDE:
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dlibxslt=enabled
> build-libxslt
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dssl=openssl
> build-openssl
> 
> meson setup --auto-features=disabled
> -Dextra_include_dirs=C:\build64\include
> -Dextra_lib_dirs=C:\build64\lib;C:\build64\lib64 --wipe -Dzlib=enabled
> build-zlib
> 
> I think it's important to note that Meson largely seems to want to use
> pkgconfig and cmake to find dependencies. pkgconfig isn't really a thing on
> Windows (it is available, but isn't commonly used), and even cmake would
> typically rely on finding things in either known installation directories
> or through lib/include vars.

I am not really following what you mean with the cmake bit here?

You can configure additional places to search for cmake files with
meson setup --cmake-prefix-path=...


> There really aren't standard directories like
> /usr/lib or /usr/include as we find on unixes, or pkgconfig files for
> everything.

Yes, and?

Greetings,

Andres Freund




Re: Meson far from ready on Windows

2024-06-18 Thread Dave Page
Hi

On Tue, 18 Jun 2024 at 15:38, Andres Freund  wrote:

> Hi,
>
> On 2024-06-18 14:53:53 +0100, Dave Page wrote:
> > My next task was to extend that to support PostgreSQL 17 and beyond,
> which
> > is where I started to run into problems. I've attempted builds using
> Meson
> > with each of the dependencies defined in the old-style config.pl, both
> with
> > and without modifying the INCLUDE/LIBS envvars to include the directories
> > for the dependencies (as was found to work in the previous discussion re
> > zlib):
> >
> > Will not successfully configure at all:
>
> Do you have logs for those failures?
>

Sure - https://developer.pgadmin.org/~dpage/build-logs.zip. Those are all
without any modifications to %LIB% or %INCLUDE%.


> I think it's important to note that Meson largely seems to want to use
> > pkgconfig and cmake to find dependencies. pkgconfig isn't really a thing
> on
> > Windows (it is available, but isn't commonly used), and even cmake would
> > typically rely on finding things in either known installation directories
> > or through lib/include vars.
>
> I am not really following what you mean with the cmake bit here?
>
> You can configure additional places to search for cmake files with
> meson setup --cmake-prefix-path=...
>

None of the dependencies include cmake files for distribution on Windows,
so there are no additional files to tell meson to search for. The same
applies to pkgconfig files, which is why the EDB team had to manually craft
them.


>
>
> > There really aren't standard directories like
> > /usr/lib or /usr/include as we find on unixes, or pkgconfig files for
> > everything.
>
> Yes, and?
>

And that's why we really need to be able to locate headers and libraries
easily by passing paths to meson, as we can't rely on pkgconfig, cmake, or
things being in some standard directory on Windows.

Thanks.

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com


Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Tom Lane
Ashutosh Sharma  writes:
> On Tue, Jun 18, 2024 at 7:50 PM Tom Lane  wrote:
>> I think the assertion you noticed is there because the code path gets
>> traversed during REINDEX, which is an operation we do support on
>> system catalogs.  I have zero interest in making truncate work
>> on them.

> I agree with you on that point. How about considering a complete
> restriction instead?

You already broke the safety seal by enabling allow_system_table_mods.
Perhaps the documentation of that is not scary enough?

Allows modification of the structure of system tables as well as
certain other risky actions on system tables.  This is otherwise not
allowed even for superusers.  Ill-advised use of this setting can
cause irretrievable data loss or seriously corrupt the database
system.

regards, tom lane




Re: consider -Wmissing-variable-declarations

2024-06-18 Thread Andres Freund
Hi,

+many for doing this in principle


> -const char *EAN13_range[][2] = {
> +static const char *EAN13_range[][2] = {
>   {"000", "019"}, /* GS1 US */
>   {"020", "029"}, /* Restricted distribution (MO 
> defined) */
>   {"030", "039"}, /* GS1 US */

> -const char *ISBN_range[][2] = {
> +static const char *ISBN_range[][2] = {
>   {"0-00", "0-19"},
>   {"0-200", "0-699"},
>   {"0-7000", "0-8499"},
> @@ -967,7 +967,7 @@ const char *ISBN_range[][2] = {
>   */

I think these actually ought to be "static const char *const" - right now the
table is mutable, unless this day ends in *day and I've confused myself with C
syntax again.




>  /* Hook to check passwords in CreateRole() and AlterRole() */
>  check_password_hook_type check_password_hook = NULL;
> diff --git a/src/backend/postmaster/launch_backend.c 
> b/src/backend/postmaster/launch_backend.c
> index bdfa238e4fe..bb1b0ac2b9c 100644
> --- a/src/backend/postmaster/launch_backend.c
> +++ b/src/backend/postmaster/launch_backend.c
> @@ -176,7 +176,7 @@ typedef struct
>   boolshmem_attach;
>  } child_process_kind;
>  
> -child_process_kind child_process_kinds[] = {
> +static child_process_kind child_process_kinds[] = {
>   [B_INVALID] = {"invalid", NULL, false},
>  
>   [B_BACKEND] = {"backend", BackendMain, true},

This really ought to be const as well and is new.  Unless somebody protests
I'm going to make it so soon.


Structs like these, containing pointers, make for nice helpers in
exploitation. We shouldn't make it easier by unnecessarily making them
mutable.


> diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c 
> b/src/bin/pg_archivecleanup/pg_archivecleanup.c
> index 07bf356b70c..5a124385b7c 100644
> --- a/src/bin/pg_archivecleanup/pg_archivecleanup.c
> +++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c
> @@ -19,17 +19,18 @@
>  #include "common/logging.h"
>  #include "getopt_long.h"
>  
> -const char *progname;
> +static const char *progname;

Hm, this one I'm not so sure about. The backend version is explicitly globally
visible, and I don't see why we shouldn't do the same for other binaries.



> From d89312042eb76c879d699380a5e2ed0bc7956605 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut 
> Date: Sun, 16 Jun 2024 23:52:06 +0200
> Subject: [PATCH v2 05/10] Fix warnings from -Wmissing-variable-declarations
>  under EXEC_BACKEND
> 
> The NON_EXEC_STATIC variables need a suitable declaration in a header
> file under EXEC_BACKEND.
> 
> Also fix the inconsistent application of the volatile qualifier for
> PMSignalState, which was revealed by this change.

I'm very very unenthused about adding volatile to more places. It's rarely
correct and often slow. But I guess this doesn't really make it any worse.


> +#ifdef TRACE_SYNCSCAN
> +#include "access/syncscan.h"
> +#endif

I'd just include it unconditionally.




> From f99c8712ff3dc2156c3e437cfa14f1f1a7f09079 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut 
> Date: Wed, 8 May 2024 13:49:37 +0200
> Subject: [PATCH v2 09/10] Fix -Wmissing-variable-declarations warnings for
>  float.c special case
> 
> Discussion: 
> https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c9...@eisentraut.org
> ---
>  src/backend/utils/adt/float.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
> index cbbb8aecafc..bf047ee1b4c 100644
> --- a/src/backend/utils/adt/float.c
> +++ b/src/backend/utils/adt/float.c
> @@ -56,6 +56,11 @@ static float8 cot_45 = 0;
>   * compiler to know that, else it might try to precompute expressions
>   * involving them.  See comments for init_degree_constants().
>   */
> +extern float8 degree_c_thirty;
> +extern float8 degree_c_forty_five;
> +extern float8 degree_c_sixty;
> +extern float8 degree_c_one_half;
> +extern float8 degree_c_one;
>  float8   degree_c_thirty = 30.0;
>  float8   degree_c_forty_five = 45.0;
>  float8   degree_c_sixty = 60.0;

Yikes, this is bad code. Relying on extern to have effects like this will just
break with lto. But not the responsibility of this patch.


> From 649e8086df1f175e843b26cad41a698c8c074c09 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut 
> Date: Wed, 8 May 2024 13:49:37 +0200
> Subject: [PATCH v2 10/10] Fix -Wmissing-variable-declarations warnings in
>  bison code
> 
> Add extern declarations for global variables produced by bison.

:(

Greetings,

Andres Freund




Re: What is a typical precision of gettimeofday()?

2024-06-18 Thread Hannu Krosing
I plan to send patch to pg_test_timing in a day or two

the underlying time precision on modern linux seems to be

2 ns for some Intel CPUs
10 ns for Zen4
40 ns for ARM (Ampere)
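As a rough cross-check of timer granularity, the smallest nonzero delta between consecutive clock reads can be estimated even from Python; note perf_counter_ns is not necessarily the same clock pg_test_timing uses, so treat the numbers as indicative only:

```python
# Rough estimate of effective timer resolution: the smallest nonzero
# delta observed between consecutive clock reads. Indicative only; the
# real measurement belongs in pg_test_timing itself.
import time

def min_tick_ns(samples=100_000):
    """Smallest nonzero delta seen between consecutive clock reads, in ns."""
    best = None
    prev = time.perf_counter_ns()
    for _ in range(samples):
        cur = time.perf_counter_ns()
        d = cur - prev
        if d > 0 and (best is None or d < best):
            best = d
        prev = cur
    return best
```

On a coarse clock many reads return the same value, so only the nonzero deltas reveal the underlying tick.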

---
Hannu







On Tue, Jun 18, 2024 at 7:48 AM Andrey M. Borodin 
wrote:

>
>
> > On 19 Mar 2024, at 13:28, Peter Eisentraut  wrote:
> >
> > I feel that we don't actually have any information about this
> portability concern.  Does anyone know what precision we can expect from
> gettimeofday()?  Can we expect the full microsecond precision usually?
>
> At PGConf.dev Hannu Krosing drew attention to the pg_test_timing module. I’ve
> tried this module (slightly modified to measure nanoseconds) on some
> systems, and everywhere I found ~100ns resolution (95% of ticks fall into
> the 64ns and 128ns buckets).
>
> I’ll add cc Hannu, and also pg_test_timing module authors Ants and Greg.
> Maybe they can add some context.
>
>
> Best regards, Andrey Borodin.


Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Ashutosh Sharma
Hi Robert, Andres, Tom,

Thank you for sharing your thoughts.

On Tue, Jun 18, 2024 at 8:02 PM Andres Freund  wrote:
>
> Hi,
>
> On 2024-06-18 19:58:26 +0530, Ashutosh Sharma wrote:
> > On Tue, Jun 18, 2024 at 7:50 PM Tom Lane  wrote:
> > >
> > > Robert Haas  writes:
> > > > On Tue, Jun 18, 2024 at 8:10 AM Ashutosh Sharma  
> > > > wrote:
> > > >> Executing below commands:
> > > >> -- set allow_system_table_mods TO on;
> > > >> -- truncate table pg_type;
> > >
> > > > If the operation isn't allowed without turning on
> > > > allow_system_table_mods, that means that doing it is probably a bad
> > > > idea and will probably break stuff, as happened here.
> > >
> > > Nothing good can come of truncating *any* core system catalog --- what
> > > do you think you'll still be able to do afterwards?
> > >
> > > I think the assertion you noticed is there because the code path gets
> > > traversed during REINDEX, which is an operation we do support on
> > > system catalogs.  I have zero interest in making truncate work
> > > on them.
> > >
> >
> > I agree with you on that point. How about considering a complete
> > restriction instead?
>
> What's the point?  There are occasional cases where doing something dangerous
> is useful, for debugging or corruption recovery. If we flat out prohibit this
> we'll just need an allow_system_table_mods_but_for_real option.
>

This is specifically about truncation of system catalogs, and does not
refer to any other DML operations on system catalogs, which I see are
necessary for many extensions that directly update catalogs like
pg_proc and others. Additionally, according to the comments in
truncate_check_rel(), we permit truncation of the pg_largeobject
catalog specifically during pg_upgrade. So, AFAIU, truncation of any
catalog other than this one can be restricted.

--
With Regards,
Ashutosh Sharma.




Re: Inval reliability, especially for inplace updates

2024-06-18 Thread Noah Misch
On Mon, Jun 17, 2024 at 06:57:30PM -0700, Andres Freund wrote:
> On 2024-06-17 16:58:54 -0700, Noah Misch wrote:
> > On Sat, Jun 15, 2024 at 03:37:18PM -0700, Noah Misch wrote:
> > > On Wed, May 22, 2024 at 05:05:48PM -0700, Noah Misch wrote:
> > > > https://postgr.es/m/20240512232923.aa.nmi...@google.com wrote:
> > > > > Separable, nontrivial things not fixed in the attached patch stack:
> > > > >
> > > > > - Inplace update uses transactional CacheInvalidateHeapTuple().  
> > > > > ROLLBACK of
> > > > >   CREATE INDEX wrongly discards the inval, leading to the 
> > > > > relhasindex=t loss
> > > > >   still seen in inplace-inval.spec.  CacheInvalidateRelmap() does 
> > > > > this right.
> > > >
> > > > I plan to fix that like CacheInvalidateRelmap(): send the inval 
> > > > immediately,
> > > > inside the critical section.  Send it in heap_xlog_inplace(), too.
> 
> I'm worried this might cause its own set of bugs, e.g. if there are any places
> that, possibly accidentally, rely on the invalidation from the inplace update
> to also cover separate changes.

Good point.  I do have index_update_stats() still doing an ideally-superfluous
relcache update for that reason.  Taking that further, it would be cheap
insurance to have the inplace update do a transactional inval in addition to
its immediate inval.  Future master-only work could remove the transactional
one.  How about that?

> Have you considered instead submitting these invalidations during abort as
> well?

I had not.  Hmmm.  If the lock protocol in README.tuplock (after patch
inplace120) told SearchSysCacheLocked1() to do systable scans instead of
syscache reads, that could work.  Would need to ensure a PANIC if transaction
abort doesn't reach the inval submission.  Overall, it would be harder to
reason about the state of caches, but I suspect the patch would be smaller.
How should we choose between those strategies?

> > > > a. Within logical decoding, cease processing invalidations for inplace
> > >
> > > I'm attaching the implementation.  This applies atop the v3 patch stack 
> > > from
> > > https://postgr.es/m/20240614003549.c2.nmi...@google.com, but the threads 
> > > are
> > > mostly orthogonal and intended for independent review.  Translating a 
> > > tuple
> > > into inval messages uses more infrastructure than relmapper, which needs 
> > > just
> > > a database ID.  Hence, this ended up more like a miniature of inval.c's
> > > participation in the transaction commit sequence.
> > >
> > > I waffled on whether to back-patch inplace150-inval-durability-atcommit
> >
> > That inplace150 patch turned out to be unnecessary.  Contrary to the
> > "noncritical resource releasing" comment some lines above
> > AtEOXact_Inval(true), the actual behavior is already to promote ERROR to
> > PANIC.  An ERROR just before or after sending invals becomes PANIC, "cannot
> > abort transaction %u, it was already committed".
> 
> Relying on that, instead of explicit critical sections, seems fragile to me.
> IIRC some of the behaviour around errors around transaction commit/abort has
> changed a bunch of times. Tying correctness into something that could be
> changed for unrelated reasons doesn't seem great.

Fair enough.  It could still be a good idea for master, but given I missed a
bug in inplace150-inval-durability-atcommit-v1.patch far worse than the ones
$SUBJECT fixes, let's not risk it in back branches.

> I'm not sure it holds true even today - what if the transaction didn't have an
> xid? Then RecordTransactionAbort() wouldn't trigger
>  "cannot abort transaction %u, it was already committed"
> I think?

I think that's right.  As the inplace160-inval-durability-inplace-v2.patch
edits to xact.c say, the concept of invals in XID-less transactions is buggy
at its core.  Fortunately, after that patch, we use them only for two things
that could themselves stop with something roughly as simple as the attached.

> > > - Same change, no WAL version bump.  Standby must update before primary.  
> > > This
> > >   is best long-term, but the transition is more disruptive.  I'm leaning
> > >   toward this one, but the second option isn't bad:
> 
> Hm. The inplace record doesn't use the length of the "main data" record
> segment for anything, from what I can tell. If records by an updated primary
> were replayed by an old standby, it'd just ignore the additional data, afaict?

Agreed, but ...

> I think with the code as-is, the situation with an updated standby replaying
> an old primary's record would actually be worse - it'd afaict just assume the
> now-longer record contained valid fields, despite those just pointing into
> uninitialized memory.  I think the replay routine would have to check the
> length of the main data and execute the invalidation conditionally.

I anticipated back branches supporting a new XLOG_HEAP_INPLACE_WITH_INVAL
alongside the old XLOG_HEAP_INPLACE.  Updated standbys would run both fine,
and old binaries consuming new WAL would PANIC, "

Re: Truncation of mapped catalogs (whether local or shared) leads to server crash

2024-06-18 Thread Ashutosh Sharma
Hi,

On Tue, Jun 18, 2024 at 8:25 PM Tom Lane  wrote:
>
> Ashutosh Sharma  writes:
> > On Tue, Jun 18, 2024 at 7:50 PM Tom Lane  wrote:
> >> I think the assertion you noticed is there because the code path gets
> >> traversed during REINDEX, which is an operation we do support on
> >> system catalogs.  I have zero interest in making truncate work
> >> on them.
>
> > I agree with you on that point. How about considering a complete
> > restriction instead?
>
> You already broke the safety seal by enabling allow_system_table_mods.
> Perhaps the documentation of that is not scary enough?
>
> Allows modification of the structure of system tables as well as
> certain other risky actions on system tables.  This is otherwise not
> allowed even for superusers.  Ill-advised use of this setting can
> cause irretrievable data loss or seriously corrupt the database
> system.
>

I was actually referring to just the truncation part here, not any DML
operations, as I've observed their usage in certain extensions.
However, truncation is only used for pg_largeobject, and that only
during pg_upgrade, so truncation of other catalogs can be avoided.
But that is just my perspective; if it's not essential, we can
stop this discussion here.

Thank you to everyone for sharing your valuable insights.

--
With Regards,
Ashutosh Sharma.




Re: Maybe don't process multi xmax in FreezeMultiXactId() if it is already marked as invalid?

2024-06-18 Thread Peter Geoghegan
On Tue, Jun 18, 2024 at 10:29 AM Maxim Orlov  wrote:
> Maybe, I'm too bold, but looks like a kinda bug to me.  At least, I don't 
> understand why we do not check the HEAP_XMAX_INVALID flag.
> My guess is nobody noticed, that MultiXactIdIsValid call does not check the 
> mentioned flag in the "first" condition, but it's all my speculation.

A related code path was changed in commit 02d647bbf0. That change made
the similar xmax handling that covers XIDs (not MXIDs) *stop* doing
what you're now proposing to do in the Multi path.

Why do you think this is a bug?

> Does anyone know if there are reasons to deliberately ignore the HEAP_XMAX 
> INVALID flag? Or this is just an unfortunate oversight.

HEAP_XMAX_INVALID is just a hint.

-- 
Peter Geoghegan




Re: Meson far from ready on Windows

2024-06-18 Thread Andres Freund
Hi,


On 2024-06-18 15:54:27 +0100, Dave Page wrote:
> On Tue, 18 Jun 2024 at 15:38, Andres Freund  wrote:
> > Do you have logs for those failures?
> >
>
> Sure - https://developer.pgadmin.org/~dpage/build-logs.zip. Those are all
> without any modifications to %LIB% or %INCLUDE%.

Thanks.


> > I think it's important to note that Meson largely seems to want to use
> > > pkgconfig and cmake to find dependencies. pkgconfig isn't really a thing
> > on
> > > Windows (it is available, but isn't commonly used), and even cmake would
> > > typically rely on finding things in either known installation directories
> > > or through lib/include vars.
> >
> > I am not really following what you mean with the cmake bit here?
> >
> > You can configure additional places to search for cmake files with
> > meson setup --cmake-prefix-path=...
> >
>
> None of the dependencies include cmake files for distribution on Windows,
> so there are no additional files to tell meson to search for. The same
> applies to pkgconfig files, which is why the EDB team had to manually craft
> them.

Many of them do include at least cmake files on windows if you build them
though?


Btw, I've been working with Bilal to add a few of the dependencies to the CI
images so we can test those automatically. Using vcpkg. We got that nearly
working, but he's on vacation this week...  That does ensure both cmake and
.pc files are generated, fwiw.

Currently builds gettext, icu, libxml2, libxslt, lz4, openssl, pkgconf,
python3, tcl, zlib, zstd.



I'm *NOT* sure that vcpkg is the way to go, fwiw. It does seem advantageous to
use one of the toolkits thats commonly built for building dependencies on
windows, which seems to mean vcpkg or conan.


> And that's why we really need to be able to locate headers and libraries
> easily by passing paths to meson, as we can't rely on pkgconfig, cmake, or
> things being in some standard directory on Windows.

Except that that often causes hard-to-diagnose breakages, because it doesn't
allow including the necessary compiler/linker flags [2].  It's a bad model; we
shouldn't perpetuate it.  If we want to forever make Windows a complicated,
annoying stepchild, that's the way to go.

FWIW, at least libzstd, libxml [3], lz4, zlib can generate cmake dependency
files on windows in their upstream code.

I'm *not* against adding "hardcoded" dependency lookup stuff for libraries
where other approaches aren't feasible, I just don't think it's a good idea to
add fragile stuff that will barely be tested, when not necessary.

Greetings,

Andres Freund


[1] Here's a build of PG with the dependencies installed, builds
https://cirrus-ci.com/task/4953968097361920

[2] E.g.

https://github.com/postgres/postgres/blob/REL_16_STABLE/src/tools/msvc/Mkvcbuild.pm#L600

https://github.com/postgres/postgres/blob/REL_16_STABLE/src/tools/msvc/Solution.pm#L1039

[3] Actually, at least your libxml build actually *did* include both .pc and
cmake files. So just pointing to the relevant path would do the trick.




PostgreSQL 17 Beta 2 release date & commit freeze

2024-06-18 Thread Jonathan S. Katz


Hi,

PostgreSQL 17 Beta 2 is planned to be released on June 27, 2024. Please
continue your hard work on closing out open items[1] ahead of the
release, and have the fixes targeted for the release committed by June
22, 2024.


Thanks!

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items




Re: PostgreSQL 17 Beta 2 release date & commit freeze

2024-06-18 Thread Bruce Momjian
On Tue, Jun 18, 2024 at 12:10:50PM -0400, Jonathan Katz wrote:
> 
> Hi,
> 
> PostgreSQL 17 Beta 2 is planned to be released on June 27, 2024. Please
> continue your hard work on closing out open items[1] ahead of the release
> and have the fixes targeted for the release committed by June 22, 2024.

I am adding markup to the PG 17 release notes and will be finished by
this Friday.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




MergeAppend could consider sorting cheapest child path

2024-06-18 Thread Alexander Pyhalov

Hi.

Now, when the planner finds suitable pathkeys in
generate_orderedappend_paths(), it uses them, even if an explicit sort of
the cheapest child path could be more efficient.


We encountered this issue on a partitioned table with two indexes, where
one is suitable for sorting and the other is good for selecting data.
MergeAppend was generated with subpaths doing index scans on the suitably
ordered index and filtering a lot of data.
The suggested fix allows MergeAppend to consider sorting the cheapest
childrel total path as an alternative.


--
Best regards,
Alexander Pyhalov,
Postgres Professional

From d5eb3d326d83e2ca47c17552fcc6fd27b6b98d4e Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Tue, 18 Jun 2024 15:56:18 +0300
Subject: [PATCH] MergeAppend could consider using sorted best path.

This helps when an index with suitable pathkeys is not
good at filtering data.
---
 .../postgres_fdw/expected/postgres_fdw.out|  6 +-
 src/backend/optimizer/path/allpaths.c | 21 +
 src/test/regress/expected/inherit.out | 45 +-
 src/test/regress/expected/partition_join.out  | 87 +++
 src/test/regress/expected/union.out   |  6 +-
 src/test/regress/sql/inherit.sql  | 17 
 6 files changed, 141 insertions(+), 41 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index ea566d50341..687591e4a97 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -10074,13 +10074,15 @@ SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a
->  Nested Loop
  Join Filter: (t1.a = t2.b)
  ->  Append
-   ->  Foreign Scan on ftprt1_p1 t1_1
+   ->  Sort
+ Sort Key: t1_1.a
+ ->  Foreign Scan on ftprt1_p1 t1_1
->  Foreign Scan on ftprt1_p2 t1_2
  ->  Materialize
->  Append
  ->  Foreign Scan on ftprt2_p1 t2_1
  ->  Foreign Scan on ftprt2_p2 t2_2
-(10 rows)
+(12 rows)
 
 SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
   a  |  b  
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 4895cee9944..827bc469269 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1845,6 +1845,27 @@ generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 /* Assert we do have an unparameterized path for this child */
 Assert(cheapest_total->param_info == NULL);
 			}
+			else
+			{
+/*
+ * Even if we found a path with the necessary pathkeys, explicitly
+ * sorting the cheapest unordered path can be more efficient.
+ */
+Path		sort_path;
+
+cost_sort(&sort_path,
+		  root,
+		  pathkeys,
+		  childrel->cheapest_total_path->total_cost,
+		  childrel->cheapest_total_path->rows,
+		  childrel->cheapest_total_path->pathtarget->width,
+		  0.0,
+		  work_mem,
+		  -1.0 /* need all tuples to sort them */ );
+
+if (compare_path_costs(&sort_path, cheapest_total, TOTAL_COST) < 0)
+	cheapest_total = childrel->cheapest_total_path;
+			}
 
 			/*
 			 * When building a fractional path, determine a cheapest
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index ad732134148..16e78c8d2ff 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1555,6 +1555,7 @@ insert into matest2 (name) values ('Test 4');
 insert into matest3 (name) values ('Test 5');
 insert into matest3 (name) values ('Test 6');
 set enable_indexscan = off;  -- force use of seqscan/sort, so no merge
+set enable_sort = off; -- avoid sorting below MergeAppend
 explain (verbose, costs off) select * from matest0 order by 1-id;
  QUERY PLAN 
 
@@ -1608,6 +1609,7 @@ select min(1-id) from matest0;
 (1 row)
 
 reset enable_indexscan;
+reset enable_sort;
 set enable_seqscan = off;  -- plan with fewest seqscans should be merge
 set enable_parallel_append = off; -- Don't let parallel-append interfere
 explain (verbose, costs off) select * from matest0 order by 1-id;
@@ -1702,7 +1704,9 @@ order by t1.b limit 10;
  Merge Cond: (t1.b = t2.b)
  ->  Merge Append
Sort Key: t1.b
-   ->  Index Scan using matest0i on matest0 t1_1
+   ->  Sort
+ Sort Key: t1_1.b
+ ->  Seq Scan on matest0 t1_1
->  Index Scan using matest1i on matest1 t1_2
  ->  Materialize
->  Merge Append
@@ -1711,7 +1715,7 @@ order by t1.b limit 10;
Filter: (c = d)
  ->  Index Scan using matest1i on matest1 t2_2
   

Re: Reducing the log spam

2024-06-18 Thread Laurenz Albe
On Mon, 2024-06-17 at 16:40 -0500, Justin Pryzby wrote:
> > The feature will become much less useful if unique violations keep getting 
> > logged.
> 
> Uh, to be clear, your patch is changing the *defaults*, which I found
> surprising, even after reading the thread.  Evidently, the current
> behavior is not what you want, and you want to change it, but I'm *sure*
> that whatever default you want to use at your site/with your application
> is going to make someone else unhappy.  I surely want unique violations
> to be logged, for example.

I was afraid that setting the default non-empty would cause objections.

> > +   
> > +This setting is useful to exclude error messages from the log that 
> > are
> > +frequent but irrelevant.
> 
> I think you should phrase the feature as ".. *allow* skipping error
> logging for messages ... that are frequent but irrelevant for a given
> site/role/DB/etc."

I have reworded that part.

> I suggest that this patch should not change the default behavior at all:
> its default should be empty.  That you, personally, would plan to
> exclude this or that error code is pretty uninteresting.  I think the
> idea of changing the default behavior will kill the patch, and even if
> you want to propose to do that, it should be a separate discussion.
> Maybe you should make it an 002 patch.

I have attached a new version that leaves the parameter empty by default.

The patch is not motivated by my personal dislike of certain error messages.
A well-written application would not need that parameter at all.
The motivation for me is based on my dealing with customers' log files,
which are often full of messages that are only distracting from serious
problems and fill up the disk.

But you are probably right that it would be hard to find a default setting
that nobody has quibbles with, and the default can always be changed with
a future patch.

Yours,
Laurenz Albe
From 8e1eeb9cbcb94d9b15eb9ee97956cc4044ff7964 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 18 Jun 2024 18:39:05 +0200
Subject: [PATCH v4] Add parameter log_suppress_errcodes

The parameter contains a list of SQLSTATEs for which
FATAL and ERROR messages are not logged.  This is to
suppress messages that are of little interest to the
database administrator, but tend to clutter the log.

Author: Laurenz Albe
Discussion: https://postgr.es/m/408f399e7de1416c47bab7e260327ed5ad92838c.camel%40cybertec.at
---
 doc/src/sgml/config.sgml  |  35 ++
 src/backend/utils/error/elog.c| 119 ++
 src/backend/utils/misc/guc_tables.c   |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/include/pg_config_manual.h|  10 ++
 src/include/utils/guc.h   |   1 +
 src/include/utils/guc_hooks.h |   2 +
 7 files changed, 181 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb..a58b43bceb 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6883,6 +6883,41 @@ local0.*/var/log/postgresql
   
  
 
+ 
+  log_suppress_errcodes (string)
+  
+   log_suppress_errcodes configuration parameter
+  
+  
+  
+   
+Causes ERROR and FATAL messages
+from client backend processes with certain error codes to be excluded
+from the log.
+The value is a comma-separated list of five-character error codes as
+listed in .  An error code that
+represents a class of errors (ends with three zeros) suppresses logging
+of all error codes within that class.  For example, the entry
+08000 (connection_exception)
+would suppress an error with code 08P01
+(protocol_violation).  The default setting is
+empty, that is, all error codes get logged.
+Only superusers and users with the appropriate SET
+privilege can change this setting.
+   
+
+   
+This setting allows you to skip error logging for messages that are
+frequent but irrelevant.  Suppressing such messages makes it easier to
+spot relevant messages in the log and keeps log files from growing too
+big.  For example, if you have a monitoring system that regularly
+establishes a TCP connection to the server without sending a correct
+startup packet, you could suppress the protocol violation errors by
+adding the error code 08P01 to the list.
+   
+  
+ 
+
  
   log_min_duration_statement (integer)
   
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index d91a85cb2d..0b2506e07a 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -111,12 +111,16 @@ int			Log_error_verbosity = PGERROR_DEFAULT;
 char	   *Log_line_prefix = NULL; /* format for extra log line info */
 int			Log_destination = LOG_DESTINATION_STDERR;
 char	   *Log_desti
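The class-based suppression rule in the documentation hunk above (a five-character SQLSTATE ending in 000 suppresses its entire class) can be sketched as follows. This is an illustrative Python model, not the patch's actual C implementation, and the function name is invented:

```python
def is_suppressed(sqlstate, suppress_list):
    """Model of log_suppress_errcodes matching: return True if an ERROR or
    FATAL message with this SQLSTATE should be kept out of the log."""
    for entry in (s.strip() for s in suppress_list.split(",") if s.strip()):
        if entry == sqlstate:
            return True
        # A class entry ends in three zeros; its first two characters name
        # the class, so it covers every code sharing that prefix.
        if entry.endswith("000") and sqlstate[:2] == entry[:2]:
            return True
    return False

# 08000 (connection_exception) suppresses 08P01 (protocol_violation) ...
print(is_suppressed("08P01", "08000"))
# ... but an unlisted code such as 23505 (unique_violation) is still logged.
print(is_suppressed("23505", "08000,22012"))
```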

Re: improve predefined roles documentation

2024-06-18 Thread Nathan Bossart
On Mon, Jun 17, 2024 at 02:10:22PM -0400, Robert Haas wrote:
> On Thu, Jun 13, 2024 at 3:48 PM Nathan Bossart  
> wrote:
>> I think we could improve matters by abandoning the table and instead
>> documenting these roles more like we document GUCs, i.e., each one has a
>> section below it where we can document it in as much detail as we want.
>> Some of these roles should probably be documented together (e.g.,
>> pg_read_all_data and pg_write_all_data), so the ordering is unlikely to be
>> perfect, but I'm hoping it would still be a net improvement.
> 
> +1. I'm not sure about all of the details, but I like the general idea.

Here is a first try.  I did pretty much exactly what I proposed in the
quoted text, so I don't have much else to say about it.  I didn't see an
easy way to specify multiple ids and xreflabels for a given entry, so the
entries that describe multiple roles just use the name of the first role
listed.  In practice, I think this just means you need to do a little extra
work when linking to one of the other roles from elsewhere in the docs,
which doesn't seem too terrible.

-- 
nathan
>From 89db4a562ddb07aa1215608fb116b511143b0e66 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 18 Jun 2024 11:38:40 -0500
Subject: [PATCH v1 1/1] revamp predefined roles documentation

---
 doc/src/sgml/config.sgml |   2 +-
 doc/src/sgml/monitoring.sgml |   4 +-
 doc/src/sgml/ref/checkpoint.sgml |   2 +-
 doc/src/sgml/ref/reindex.sgml|   2 +-
 doc/src/sgml/user-manag.sgml | 339 ---
 5 files changed, 185 insertions(+), 164 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb..b9fd3f3bd6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -731,7 +731,7 @@ include_dir 'conf.d'

 Determines the number of connection slots that are
 reserved for connections by roles with privileges of the
-pg_use_reserved_connections
+
 role.  Whenever the number of free connection slots is greater than
  but less than or
 equal to the sum of superuser_reserved_connections
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b2ad9b446f..133ad462cb 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -286,8 +286,8 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34 
  0:00 postgres: ser
other sessions, many columns will be null.  Note, however, that the
existence of a session and its general properties such as its sessions user
and database are visible to all users.  Superusers and roles with 
privileges of
-   built-in role pg_read_all_stats (see also ) can see all the information about all 
sessions.
+   built-in role pg_read_all_stats
+   can see all the information about all sessions.
   
 
   
diff --git a/doc/src/sgml/ref/checkpoint.sgml b/doc/src/sgml/ref/checkpoint.sgml
index 28a1d717b8..db011a47d0 100644
--- a/doc/src/sgml/ref/checkpoint.sgml
+++ b/doc/src/sgml/ref/checkpoint.sgml
@@ -53,7 +53,7 @@ CHECKPOINT
 
   
Only superusers or users with the privileges of
-   the pg_checkpoint
+   the 
role can call CHECKPOINT.
   
  
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 2942dccf1e..dcf70d14bc 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -305,7 +305,7 @@ REINDEX [ ( option [, ...] ) ] { DA
partitioned table, such commands skip the privilege checks when processing
the individual partitions.  Reindexing a schema or database requires being 
the
owner of that schema or database or having privileges of the
-   pg_maintain
+   
role.  Note specifically that it's thus
possible for non-superusers to rebuild indexes of tables owned by
other users.  However, as a special exception,
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index 07a16247d7..1d3805d303 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -590,101 +590,71 @@ DROP ROLE doomed_role;
and information.  Administrators (including roles that have the
CREATEROLE privilege) can GRANT these
roles to users and/or other roles in their environment, providing those
-   users with access to the specified capabilities and information.
+   users with access to the specified capabilities and information.  For
+   example:
+
+
+GRANT pg_signal_backend TO admin_user;
+
   
 
+  
+   
+Care should be taken when granting these roles to ensure they are only used
+where needed and with the understanding that these roles grant access to
+privileged information.
+   
+  
+
   
-   The predefined roles are described in .
+   The predefined roles are described below.
Note that the specific permissions for each of the roles may change in
the future as additional capabilities are added.  Administrators
should monitor the release notes for changes.
-

Re: IPC::Run accepts bug reports

2024-06-18 Thread Noah Misch
On Mon, Jun 17, 2024 at 11:11:17AM -0700, Andres Freund wrote:
> On 2024-06-15 16:48:24 -0700, Noah Misch wrote:
> > On Sat, Jun 15, 2024 at 01:26:57PM -0400, Robert Haas wrote:
> > > The one
> > > thing I know about that *I* think is a pretty big problem about Perl
> > > is that IPC::Run is not really maintained.
> >
> > I don't see in https://github.com/cpan-authors/IPC-Run/issues anything
> > affecting PostgreSQL.  If you know of IPC::Run defects, please report them.
> > If I knew of an IPC::Run defect affecting PostgreSQL, I likely would work on
> > it before absurdity like https://github.com/cpan-authors/IPC-Run/issues/175
> > NetBSD-10-specific behavior coping.
> 
> 1) Sometimes hangs hard on windows if started processes have not been shut
>down before script exits.  I've mostly encountered this via the buildfarm /
>CI, so I never had a good way of narrowing this down. It's very painful
>because things seem to often just get stuck once that happens.

That's bad.  Do you have a link to a log, a thread discussing it, or even just
one of the test names experiencing it?

> 2) If a subprocess dies in an inopportune moment, IPC::Run dies with "ack
>Broken pipe:" (in _do_filters()). There's plenty reports of this on the
>list, and I've hit this several times personally. It seems to be timing
>dependent, I've encountered it after seemingly irrelevant ordering changes.
> 
>I suspect I could create a reproducer with a bit of time.

I've seen that one.  If the harness has data to write to a child, the child
exiting before the write is one way to reach that.  Perhaps before exec(),
IPC::Run should do a non-blocking write from each pending IO.  That way, small
writes never experience the timing-dependent behavior.
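The failure mode can be illustrated outside IPC::Run (in Python rather than Perl): the harness still has data queued for a child, but the child exits and closes its end of the pipe before the write happens. On POSIX systems the write then fails with a broken pipe, which is the condition IPC::Run turns into its "ack Broken pipe" death:

```python
import os
import signal

# Deliver EPIPE as an exception rather than a fatal SIGPIPE (POSIX only).
if hasattr(signal, "SIGPIPE"):
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)

read_fd, write_fd = os.pipe()
os.close(read_fd)  # the "child" goes away before the harness writes

try:
    os.write(write_fd, b"data the harness still had pending\n")
    outcome = "write succeeded"
except BrokenPipeError:
    outcome = "broken pipe"  # a robust harness should handle this case
finally:
    os.close(write_fd)

print(outcome)
```

A harness that treated this as a per-stream condition (the child simply did not consume its input) instead of dying would avoid the timing-dependent failures.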

> 3) It's very slow on windows (in addition to the windows process
>slowness). That got to be fixable to some degree.

Agreed.  For the next release, today's git has some optimizations.  There are
other known-possible Windows optimizations not implemented.




RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-18 Thread Amonson, Paul D
> Hmm, I wonder if the "(c) 2024 Intel" line is going to bring us trouble.
> (I bet it's not really necessary anyway.)

Our lawyer agrees: copyright is covered by the "PostgreSQL Global Development 
Group" copyright line, with Intel as a contributor.

> And this bit doesn't look good.  The LICENSE file says:
...
> > //* Redistributions in binary form must reproduce the above
> > // copyright notice, this list of conditions and the following
> > disclaimer // in the documentation and/or other materials provided
> > with the // distribution.
...
> The second clause essentially says we would have to add a page to our
> "documentation and/or other materials" with the contents of the license file.

According to one of Intel’s lawyers, 55 instances of this clause were found when 
they searched the PostgreSQL repository. Therefore, I assume that this 
obligation has either been satisfied or determined not to apply, given that the 
second BSD clause already appears in the PostgreSQL source tree. I might have 
misunderstood the concern, but the lawyer believes this is a non-issue. Could 
you please provide more clarifying details about the concern?

Thanks,
Paul



0002-v4-Fix-Copyright-and-Licensing-issues.patch
Description: 0002-v4-Fix-Copyright-and-Licensing-issues.patch


0001-v4-Feat-Add-AVX512-crc32c-algorithm-to-postgres.patch
Description: 0001-v4-Feat-Add-AVX512-crc32c-algorithm-to-postgres.patch


Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-18 Thread Bruce Momjian
On Tue, Jun 18, 2024 at 05:14:08PM +, Amonson, Paul D wrote:
> > And this bit doesn't look good.  The LICENSE file says:
> ...
> > > //* Redistributions in binary form must reproduce the above
> > > // copyright notice, this list of conditions and the following
> > > disclaimer // in the documentation and/or other materials provided
> > > with the // distribution.
> ...
> > The second clause essentially says we would have to add a page to our
> > "documentation and/or other materials" with the contents of the license 
> > file.
> 
> According to one of Intel’s lawyers, 55 instances of this clause was found 
> when they searched in the PostgreSQL repository. Therefore, I assume that 
> this obligation has either been satisfied or determined not to apply, given 
> that the second BSD clause already appears in the PostgreSQL source tree. I 
> might have misunderstood the concern, but the lawyer believes this is a 
> non-issue. Could you please provide more clarifying details about the concern?

Yes, I can confirm that:

grep -Rl 'Redistributions in binary form must reproduce' . | wc -l

reports 54;  file list attached.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.
./src/include/lib/sort_template.h
./src/tools/pg_bsd_indent/err.c
./src/tools/pg_bsd_indent/lexi.c
./src/tools/pg_bsd_indent/indent_globs.h
./src/tools/pg_bsd_indent/pr_comment.c
./src/tools/pg_bsd_indent/indent.c
./src/tools/pg_bsd_indent/parse.c
./src/tools/pg_bsd_indent/io.c
./src/tools/pg_bsd_indent/indent_codes.h
./src/tools/pg_bsd_indent/err.h
./src/tools/pg_bsd_indent/indent.h
./src/tools/pg_bsd_indent/args.c
./src/backend/utils/mb/wstrcmp.c
./src/backend/utils/mb/wstrncmp.c
./src/common/md5.c
./src/common/sha2_int.h
./src/common/sha1_int.h
./src/common/sha2.c
./src/common/sha1.c
./src/common/md5_int.h
./src/timezone/strftime.c
./src/port/pgmkdirp.c
./src/port/getopt_long.c
./src/port/mkdtemp.c
./src/port/getopt.c
./src/port/snprintf.c
./src/port/bsearch_arg.c
./src/port/inet_aton.c
./contrib/pgcrypto/openssl.c
./contrib/pgcrypto/pgp-pubdec.c
./contrib/pgcrypto/pgp-cfb.c
./contrib/pgcrypto/pgcrypto.h
./contrib/pgcrypto/mbuf.h
./contrib/pgcrypto/pgcrypto.c
./contrib/pgcrypto/crypt-des.c
./contrib/pgcrypto/px-hmac.c
./contrib/pgcrypto/pgp-armor.c
./contrib/pgcrypto/px.c
./contrib/pgcrypto/px-crypt.c
./contrib/pgcrypto/mbuf.c
./contrib/pgcrypto/px-crypt.h
./contrib/pgcrypto/pgp-pgsql.c
./contrib/pgcrypto/pgp-mpi.c
./contrib/pgcrypto/pgp-mpi-openssl.c
./contrib/pgcrypto/pgp-decrypt.c
./contrib/pgcrypto/pgp.h
./contrib/pgcrypto/px.h
./contrib/pgcrypto/pgp-info.c
./contrib/pgcrypto/pgp-pubenc.c
./contrib/pgcrypto/pgp-encrypt.c
./contrib/pgcrypto/pgp-compress.c
./contrib/pgcrypto/pgp-s2k.c
./contrib/pgcrypto/pgp.c
./contrib/pgcrypto/pgp-pubkey.c


Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-18 Thread Bruce Momjian
On Tue, Jun 18, 2024 at 01:20:50PM -0400, Bruce Momjian wrote:
> On Tue, Jun 18, 2024 at 05:14:08PM +, Amonson, Paul D wrote:
> > > And this bit doesn't look good.  The LICENSE file says:
> > ...
> > > > //* Redistributions in binary form must reproduce the above
> > > > // copyright notice, this list of conditions and the following
> > > > disclaimer // in the documentation and/or other materials provided
> > > > with the // distribution.
> > ...
> > > The second clause essentially says we would have to add a page to our
> > > "documentation and/or other materials" with the contents of the license 
> > > file.
> > 
> > According to one of Intel’s lawyers, 55 instances of this clause was found 
> > when they searched in the PostgreSQL repository. Therefore, I assume that 
> > this obligation has either been satisfied or determined not to apply, given 
> > that the second BSD clause already appears in the PostgreSQL source tree. I 
> > might have misunderstood the concern, but the lawyer believes this is a 
> > non-issue. Could you please provide more clarifying details about the 
> > concern?
> 
> Yes, I can confirm that:
> 
>   grep -Rl 'Redistributions in binary form must reproduce' . | wc -l
> 
> reports 54;  file list attached.

I am somewhat embarrassed by this since we made the Intel lawyers find
something that was in our own source code.

First, the "advertising clause" in the 4-clause license:

 3. All advertising materials mentioning features or use of this
software must display the following acknowledgement: This product
includes software developed by the University of California,
Berkeley and its contributors.

was disavowed by Berkeley on July 22, 1999:

https://elrc-share.eu/static/metashare/licences/BSD-3-Clause.pdf

While the license we are concerned about does not have this clause, it
does have:

 2. Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

I assume that must also include the name of the copyright holder.

I think that means we need to mention The Regents of the University of
California in our copyright notice, which we do.  However several
non-Regents of the University of California copyright holder licenses
exist in our source tree, and accepting this AVX-512 patch would add
another one.  Specifically, I see existing entries for:

Aaron D. Gifford
Board of Trustees of the University of Illinois
David Burren
Eric P. Allman
Jens Schweikhardt
Marko Kreen
Sun Microsystems, Inc.
WIDE Project

Now, some of these licenses list these names plus Berkeley, and some list
just the names above.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Inval reliability, especially for inplace updates

2024-06-18 Thread Noah Misch
On Tue, Jun 18, 2024 at 08:23:49AM -0700, Noah Misch wrote:
> On Mon, Jun 17, 2024 at 06:57:30PM -0700, Andres Freund wrote:
> > On 2024-06-17 16:58:54 -0700, Noah Misch wrote:
> > > On Sat, Jun 15, 2024 at 03:37:18PM -0700, Noah Misch wrote:
> > > > On Wed, May 22, 2024 at 05:05:48PM -0700, Noah Misch wrote:
> > > > > https://postgr.es/m/20240512232923.aa.nmi...@google.com wrote:
> > > > > > Separable, nontrivial things not fixed in the attached patch stack:
> > > > > >
> > > > > > - Inplace update uses transactional CacheInvalidateHeapTuple().  
> > > > > > ROLLBACK of
> > > > > >   CREATE INDEX wrongly discards the inval, leading to the 
> > > > > > relhasindex=t loss
> > > > > >   still seen in inplace-inval.spec.  CacheInvalidateRelmap() does 
> > > > > > this right.
> > > > >
> > > > > I plan to fix that like CacheInvalidateRelmap(): send the inval 
> > > > > immediately,
> > > > > inside the critical section.  Send it in heap_xlog_inplace(), too.
> > 
> > I'm worried this might cause its own set of bugs, e.g. if there are any 
> > places
> > that, possibly accidentally, rely on the invalidation from the inplace 
> > update
> > to also cover separate changes.
> 
> Good point.  I do have index_update_stats() still doing an ideally-superfluous
> relcache update for that reason.  Taking that further, it would be cheap
> insurance to have the inplace update do a transactional inval in addition to
> its immediate inval.  Future master-only work could remove the transactional
> one.  How about that?
> 
> > Have you considered instead submitting these invalidations during abort as
> > well?
> 
> I had not.  Hmmm.  If the lock protocol in README.tuplock (after patch
> inplace120) told SearchSysCacheLocked1() to do systable scans instead of
> syscache reads, that could work.  Would need to ensure a PANIC if transaction
> abort doesn't reach the inval submission.  Overall, it would be harder to
> reason about the state of caches, but I suspect the patch would be smaller.
> How should we choose between those strategies?
> 
> > > > > a. Within logical decoding, cease processing invalidations for inplace
> > > >
> > > > I'm attaching the implementation.  This applies atop the v3 patch stack 
> > > > from
> > > > https://postgr.es/m/20240614003549.c2.nmi...@google.com, but the 
> > > > threads are
> > > > mostly orthogonal and intended for independent review.  Translating a 
> > > > tuple
> > > > into inval messages uses more infrastructure than relmapper, which 
> > > > needs just
> > > > a database ID.  Hence, this ended up more like a miniature of inval.c's
> > > > participation in the transaction commit sequence.
> > > >
> > > > I waffled on whether to back-patch inplace150-inval-durability-atcommit
> > >
> > > That inplace150 patch turned out to be unnecessary.  Contrary to the
> > > "noncritical resource releasing" comment some lines above
> > > AtEOXact_Inval(true), the actual behavior is already to promote ERROR to
> > > PANIC.  An ERROR just before or after sending invals becomes PANIC, 
> > > "cannot
> > > abort transaction %u, it was already committed".
> > 
> > Relying on that, instead of explicit critical sections, seems fragile to me.
> > IIRC some of the behaviour around errors around transaction commit/abort has
> > changed a bunch of times. Tying correctness into something that could be
> > changed for unrelated reasons doesn't seem great.
> 
> Fair enough.  It could still be a good idea for master, but given I missed a
> bug in inplace150-inval-durability-atcommit-v1.patch far worse than the ones
> $SUBJECT fixes, let's not risk it in back branches.
> 
> > I'm not sure it holds true even today - what if the transaction didn't have 
> > an
> > xid? Then RecordTransactionAbort() wouldn't trigger
> >  "cannot abort transaction %u, it was already committed"
> > I think?
> 
> I think that's right.  As the inplace160-inval-durability-inplace-v2.patch
> edits to xact.c say, the concept of invals in XID-less transactions is buggy
> at its core.  Fortunately, after that patch, we use them only for two things
> that could themselves stop with something roughly as simple as the attached.

Now actually attached.

> > > > - Same change, no WAL version bump.  Standby must update before 
> > > > primary.  This
> > > >   is best long-term, but the transition is more disruptive.  I'm leaning
> > > >   toward this one, but the second option isn't bad:
> > 
> > Hm. The inplace record doesn't use the length of the "main data" record
> > segment for anything, from what I can tell. If records by an updated primary
> > were replayed by an old standby, it'd just ignore the additional data, 
> > afaict?
> 
> Agreed, but ...
> 
> > I think with the code as-is, the situation with an updated standby replaying
> > an old primary's record would actually be worse - it'd afaict just assume 
> > the
> > now-longer record contained valid fields, despite those just pointing into
> > uninitialized memory.  I think the replay ro

Re: cost delay brainstorming

2024-06-18 Thread Nathan Bossart
On Mon, Jun 17, 2024 at 03:39:27PM -0400, Robert Haas wrote:
> I think we might able to get fairly far by observing that if the
> number of running autovacuum workers is equal to the maximum allowable
> number of running autovacuum workers, that may be a sign of trouble,
> and the longer that situation persists, the more likely it is that
> we're in trouble. So, a very simple algorithm would be: If the maximum
> number of workers have been running continuously for more than, say,
> 10 minutes, assume we're falling behind and exempt all workers from
> the cost limit for as long as the situation persists. One could
> criticize this approach on the grounds that it causes a very sudden
> behavior change instead of, say, allowing the rate of vacuuming to
> gradually increase. I'm curious to know whether other people think
> that would be a problem.
> 
> I think it might be OK, for a couple of reasons:
> 
> 1. I'm unconvinced that the vacuum_cost_delay system actually prevents
> very many problems. I've fixed a lot of problems by telling users to
> raise the cost limit, but virtually never by lowering it. When we
> lowered the delay by an order of magnitude a few releases ago -
> equivalent to increasing the cost limit by an order of magnitude - I
> didn't personally hear any complaints about that causing problems. So
> disabling the delay completely some of the time might just be fine.

Have we ruled out further adjustments to the cost parameters as a first
step?  If you are still recommending that folks raise it and never
recommending that folks lower it, ISTM that our defaults might still not be
in the right ballpark.  The autovacuum_vacuum_cost_delay adjustment you
reference (commit cbccac3) is already 5 years old, so maybe it's worth
another look.

Perhaps only tangentially related, but I feel like the default of 3 for
autovacuum_max_workers is a bit low, especially for systems with many
tables.  Changing the default for that likely requires changing the default
for the delay/limit, too.
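The trigger Robert describes in the quoted text could be modeled roughly as follows; this is a toy sketch of the heuristic, not how autovacuum's shared state is actually organized:

```python
# Toy model of the "saturation" heuristic: if every autovacuum worker slot
# has been continuously busy for longer than a threshold, exempt workers
# from the cost limit until some slot frees up again.
SATURATION_THRESHOLD = 10 * 60  # seconds

class CostLimitGovernor:
    def __init__(self, max_workers, threshold=SATURATION_THRESHOLD):
        self.max_workers = max_workers
        self.threshold = threshold
        self.saturated_since = None  # time at which saturation began

    def cost_delay_enabled(self, running_workers, now):
        if running_workers >= self.max_workers:
            if self.saturated_since is None:
                self.saturated_since = now
            # Exempt workers once saturation has persisted long enough.
            return (now - self.saturated_since) < self.threshold
        self.saturated_since = None  # any free slot resets the clock
        return True

gov = CostLimitGovernor(max_workers=3)
print(gov.cost_delay_enabled(3, now=0))     # just saturated: delay still on
print(gov.cost_delay_enabled(3, now=601))   # saturated >10 min: delay off
print(gov.cost_delay_enabled(2, now=700))   # a slot freed: back to normal
```

The abrupt on/off behavior Robert worries about is visible here; a gradual version could instead scale the cost limit by how long saturation has persisted.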

-- 
nathan




Re: allow changing autovacuum_max_workers without restarting

2024-06-18 Thread Nathan Bossart
On Mon, Jun 03, 2024 at 04:24:27PM -0700, Andres Freund wrote:
> On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
>> On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
>> > Why do we think that increasing the number of PGPROC slots, heavyweight 
>> > locks
>> > etc by 256 isn't going to cause issues?  That's not an insubstantial 
>> > amount of
>> > memory to dedicate to something that will practically never be used.
>> 
>> I personally have not observed problems with these kinds of bumps in
>> resource usage, although I may be biased towards larger systems where it
>> doesn't matter as much.
> 
> IME it matters *more* on larger systems. Or at least used to, I haven't
> experimented with this in quite a while.
> 
> It's possible that we improved a bunch of things sufficiently for this to not
> matter anymore.

I'm curious if there is something specific you would look into to verify
this.  IIUC one concern is the lock table not fitting into L3.  Is there
anything else?  Any particular workloads you have in mind?

-- 
nathan




Re: IPC::Run accepts bug reports

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 10:10:17 -0700, Noah Misch wrote:
> On Mon, Jun 17, 2024 at 11:11:17AM -0700, Andres Freund wrote:
> > On 2024-06-15 16:48:24 -0700, Noah Misch wrote:
> > > On Sat, Jun 15, 2024 at 01:26:57PM -0400, Robert Haas wrote:
> > > > The one
> > > > thing I know about that *I* think is a pretty big problem about Perl
> > > > is that IPC::Run is not really maintained.
> > >
> > > I don't see in https://github.com/cpan-authors/IPC-Run/issues anything
> > > affecting PostgreSQL.  If you know of IPC::Run defects, please report 
> > > them.
> > > If I knew of an IPC::Run defect affecting PostgreSQL, I likely would work 
> > > on
> > > it before absurdity like 
> > > https://github.com/cpan-authors/IPC-Run/issues/175
> > > NetBSD-10-specific behavior coping.
> > 
> > 1) Sometimes hangs hard on windows if started processes have not been shut
> >down before script exits.  I've mostly encountered this via the 
> > buildfarm /
> >CI, so I never had a good way of narrowing this down. It's very painful
> >because things seem to often just get stuck once that happens.
> 
> That's bad.  Do you have a link to a log, a thread discussing it, or even just
> one of the test names experiencing it?

I'm unfortunately blanking on the right keyword right now.

I think it basically required not shutting down a process started in the
background with IPC::Run.

I'll try to repro it by removing some ->finish or ->quit calls.

There's also a bunch of tests that have blocks like

# some Windows Perls at least don't like IPC::Run's start/kill_kill 
regime.
skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';

Some of them may have been related to this.


> > 2) If a subprocess dies in an inopportune moment, IPC::Run dies with "ack
> >Broken pipe:" (in _do_filters()). There's plenty reports of this on the
> >list, and I've hit this several times personally. It seems to be timing
> >dependent, I've encountered it after seemingly irrelevant ordering 
> > changes.
> > 
> >I suspect I could create a reproducer with a bit of time.
> 
> I've seen that one.  If the harness has data to write to a child, the child
> exiting before the write is one way to reach that.  Perhaps before exec(),
> IPC::Run should do a non-blocking write from each pending IO.  That way, small
> writes never experience the timing-dependent behavior.

I think the question is rather: why is IPC::Run choosing to die in this
situation, and can that be fixed?


> > 3) It's very slow on windows (in addition to the windows process
> >slowness). That got to be fixable to some degree.
> 
> Agreed.  For the next release, today's git has some optimizations.  There are
> other known-possible Windows optimizations not implemented.

Yay!

Greetings,

Andres Freund




Re: Xact end leaves CurrentMemoryContext = TopMemoryContext

2024-06-18 Thread Tom Lane
David Rowley  writes:
> On Tue, 18 Jun 2024 at 16:53, Tom Lane  wrote:
>> I'll poke at this tomorrow, unless you're hot to try it right now.

> Please go ahead. I was just in suggestion mode here.

So I tried that, and while it kind of worked, certain parts of the
system (notably logical replication) had acute indigestion.  Turns
out there is a fair amount of code that does

StartTransactionCommand();
... some random thing or other ...
CommitTransactionCommand();

and does not stop to think at all about whether that has any effect on
its memory context.  Andres' idea would break every single such place,
and this idea isn't much better because while it does provide a
current memory context after CommitTransactionCommand, that context is
effectively short-lived: the next Start/CommitTransactionCommand will
trash it.  That broke a lot more places than I'd hoped, mostly in
obscure ways.

After awhile I had an epiphany: what we should do is make
CommitTransactionCommand restore the memory context that was active
before StartTransactionCommand.  That's what we want in every place
that was cognizant of this issue, and it seems to be the case in every
place that wasn't doing anything explicit about it, either.

The 0001 patch attached does that, and seems to work nicely.
I made it implement the idea of recycling TopTransactionContext,
too.  (Interestingly, src/backend/utils/mmgr/README *already*
claims we manage TopTransactionContext this way.  Did we do that
and then change it back in the mists of time?)  The core parts
of the patch are all in xact.c --- the other diffs are just random
cleanup that I found while surveying use of TopMemoryContext and
CommitTransactionCommand.

Also, 0002 is what I propose for the back branches.  It just adds
memory context save/restore in notify interrupt processing to solve
the original bug report, as well as in sinval interrupt processing
which I discovered has the same disease.  We don't need this in
HEAD if we apply 0001.

At this point I'd be inclined to wait for the branch to be made,
and then apply 0001 in HEAD/v18 only and 0002 in v17 and before.
While 0001 seems fairly straightforward, it's still a nontrivial
change and I'm hesitant to shove it in at this late stage of the
v17 cycle.

regards, tom lane

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 9bda1aa6bc..fce12fea6e 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -200,6 +200,7 @@ typedef struct TransactionStateData
 	int			gucNestLevel;	/* GUC context nesting depth */
 	MemoryContext curTransactionContext;	/* my xact-lifetime context */
 	ResourceOwner curTransactionOwner;	/* my query resources */
+	MemoryContext priorContext; /* CurrentMemoryContext before xact started */
 	TransactionId *childXids;	/* subcommitted child XIDs, in XID order */
 	int			nChildXids;		/* # of subcommitted child XIDs */
 	int			maxChildXids;	/* allocated size of childXids[] */
@@ -1174,6 +1175,11 @@ AtStart_Memory(void)
 {
 	TransactionState s = CurrentTransactionState;
 
+	/*
+	 * Remember the memory context that was active prior to transaction start.
+	 */
+	s->priorContext = CurrentMemoryContext;
+
 	/*
 	 * If this is the first time through, create a private context for
 	 * AbortTransaction to work in.  By reserving some space now, we can
@@ -1190,17 +1196,15 @@ AtStart_Memory(void)
   32 * 1024);
 
 	/*
-	 * We shouldn't have a transaction context already.
-	 */
-	Assert(TopTransactionContext == NULL);
-
-	/*
-	 * Create a toplevel context for the transaction.
+	 * Likewise, if this is the first time through, create a top-level context
+	 * for transaction-local data.  This context will be reset at transaction
+	 * end, and then re-used in later transactions.
 	 */
-	TopTransactionContext =
-		AllocSetContextCreate(TopMemoryContext,
-			  "TopTransactionContext",
-			  ALLOCSET_DEFAULT_SIZES);
+	if (TopTransactionContext == NULL)
+		TopTransactionContext =
+			AllocSetContextCreate(TopMemoryContext,
+  "TopTransactionContext",
+  ALLOCSET_DEFAULT_SIZES);
 
 	/*
 	 * In a top-level transaction, CurTransactionContext is the same as
@@ -1251,6 +1255,11 @@ AtSubStart_Memory(void)
 
 	Assert(CurTransactionContext != NULL);
 
+	/*
+	 * Remember the context that was active prior to subtransaction start.
+	 */
+	s->priorContext = CurrentMemoryContext;
+
 	/*
 	 * Create a CurTransactionContext, which will be used to hold data that
 	 * survives subtransaction commit but disappears on subtransaction abort.
@@ -1576,20 +1585,30 @@ AtCCI_LocalCache(void)
 static void
 AtCommit_Memory(void)
 {
+	TransactionState s = CurrentTransactionState;
+
 	/*
-	 * Now that we're "out" of a transaction, have the system allocate things
-	 * in the top memory context instead of per-transaction contexts.
+	 * Return to the memory context that was current before we started the
+	 * transaction.  (In

fix pg_upgrade comment

2024-06-18 Thread Nathan Bossart
I noticed that the "check" variable, which is used for "pg_upgrade
--check", is commented as follows:

	bool		check;			/* true -> ask user for permission to make
								 * changes */

This comment was first added when pg_upgrade was introduced (commit
c2e9b2f288), but I imagine it predates even that.  I've attached a patch to
fix this.  Barring objections, I'll probably commit this soon.

-- 
nathan
>From 379793a07dd98f65466452806756bfa0f6312bdf Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 18 Jun 2024 14:41:35 -0500
Subject: [PATCH v1 1/1] pg_upgrade: fix comment

---
 src/bin/pg_upgrade/pg_upgrade.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 92bcb693fb..8afe240bdf 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -321,8 +321,7 @@ typedef struct
 */
 typedef struct
 {
-	bool		check;			/* true -> ask user for permission to make
-								 * changes */
+	bool		check;			/* check clusters only, don't change any data */
	bool		do_sync;		/* flush changes to disk */
	transferMode transfer_mode; /* copy files or link them? */
	int			jobs;			/* number of processes/threads to use */
-- 
2.39.3 (Apple Git-146)



Re: fix pg_upgrade comment

2024-06-18 Thread Daniel Gustafsson
> On 18 Jun 2024, at 21:50, Nathan Bossart  wrote:
> 
> I noticed that the "check" variable, which is used for "pg_upgrade
> --check", is commented as follows:
> 
> bool check; /* true -> ask user for permission to make
> * changes */
> 
> This comment was first added when pg_upgrade was introduced (commit
> c2e9b2f288), but I imagine it predates even that.  I've attached a patch to
> fix this.  Barring objections, I'll probably commit this soon.

Nice catch, +1 for committing. 

--
Daniel Gustafsson





Re: Xact end leaves CurrentMemoryContext = TopMemoryContext

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 15:28:03 -0400, Tom Lane wrote:
> After awhile I had an epiphany: what we should do is make
> CommitTransactionCommand restore the memory context that was active
> before StartTransactionCommand.  That's what we want in every place
> that was cognizant of this issue, and it seems to be the case in every
> place that wasn't doing anything explicit about it, either.

I like it.

I wonder if there's an argument the "persistent" TopTransactionContext should
live under a different name outside of transactions, to avoid references to it
working in a context where it's not valid?   It's probably not worth it, but
not sure.


> The 0001 patch attached does that, and seems to work nicely.
> I made it implement the idea of recycling TopTransactionContext,
> too

Nice.

I think there might be some benefit to doing that for some more things,
later/separately. E.g. the allocation of TopTransactionResourceOwner shows up
in profiles for workloads with small transactions. Which got a bit worse with
17 (largely bought back in other places by the advantages of the new resowner
system).



> At this point I'd be inclined to wait for the branch to be made,
> and then apply 0001 in HEAD/v18 only and 0002 in v17 and before.
> While 0001 seems fairly straightforward, it's still a nontrivial
> change and I'm hesitant to shove it in at this late stage of the
> v17 cycle.

Seems reasonable.


Greetings,

Andres Freund




Re: cost delay brainstorming

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 13:50:46 -0500, Nathan Bossart wrote:
> Have we ruled out further adjustments to the cost parameters as a first
> step?

I'm not against that, but it doesn't address the issue that with the current
logic one set of values just isn't going to fit both a 60MB database that's
allowed to burst to 100 iops and a 60TB database that has multiple 1M-iops
NVMe drives.


That said, the fact that vacuum_cost_page_hit is 1 and vacuum_cost_page_miss
is 2 just doesn't make much sense aesthetically. There's a far bigger
multiplier in actual costs than that...



> If you are still recommending that folks raise it and never recommending
> that folks lower it, ISTM that our defaults might still not be in the right
> ballpark.  The autovacuum_vacuum_cost_delay adjustment you reference (commit
> cbccac3) is already 5 years old, so maybe it's worth another look.

Adjusting cost delay much lower doesn't make much sense imo. It's already only
2ms on a 1ms granularity variable.  We could increase the resolution, but
sleeping for much shorter often isn't that cheap (you need to set up hardware
timers all the time and due to the short time they can't be combined with
other timers) and/or barely gives time to switch to other tasks.


So we'd have to increase the cost limit.


Greetings,

Andres Freund




Re: IPC::Run accepts bug reports

2024-06-18 Thread Andrew Dunstan


On 2024-06-18 Tu 3:00 PM, Andres Freund wrote:

Hi,

On 2024-06-18 10:10:17 -0700, Noah Misch wrote:

On Mon, Jun 17, 2024 at 11:11:17AM -0700, Andres Freund wrote:

On 2024-06-15 16:48:24 -0700, Noah Misch wrote:

On Sat, Jun 15, 2024 at 01:26:57PM -0400, Robert Haas wrote:

The one
thing I know about that *I* think is a pretty big problem about Perl
is that IPC::Run is not really maintained.

I don't see in https://github.com/cpan-authors/IPC-Run/issues anything
affecting PostgreSQL.  If you know of IPC::Run defects, please report them.
If I knew of an IPC::Run defect affecting PostgreSQL, I likely would work on
it before absurdity like https://github.com/cpan-authors/IPC-Run/issues/175
NetBSD-10-specific behavior coping.

1) Sometimes hangs hard on windows if started processes have not been shut
down before script exits.  I've mostly encountered this via the buildfarm /
CI, so I never had a good way of narrowing this down. It's very painful
because things seem to often just get stuck once that happens.

That's bad.  Do you have a link to a log, a thread discussing it, or even just
one of the test names experiencing it?

I'm unfortunately blanking on the right keyword right now.

I think it basically required not shutting down a process started in the
background with IPC::Run.

I'll try to repro it by removing some ->finish or ->quit calls.

There's also a bunch of tests that have blocks like

# some Windows Perls at least don't like IPC::Run's start/kill_kill 
regime.
skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';

Some of them may have been related to this.



I only found one of those, in 
src/test/recovery/t/006_logical_decoding.pl, which seems to be the only 
place we use kill_kill at all. That comment dates back to 2017, so maybe 
a more modern perl and/or IPC::Run will improve matters.


It's not clear to me why that code isn't calling finish() before trying 
kill_kill(). That's what the IPC::Run docs seem to suggest you should do.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: allow changing autovacuum_max_workers without restarting

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 14:00:00 -0500, Nathan Bossart wrote:
> On Mon, Jun 03, 2024 at 04:24:27PM -0700, Andres Freund wrote:
> > On 2024-06-03 14:28:13 -0500, Nathan Bossart wrote:
> >> On Mon, Jun 03, 2024 at 12:08:52PM -0700, Andres Freund wrote:
> >> > Why do we think that increasing the number of PGPROC slots, heavyweight 
> >> > locks
> >> > etc by 256 isn't going to cause issues?  That's not an insubstantial 
> >> > amount of
> >> > memory to dedicate to something that will practically never be used.
> >> 
> >> I personally have not observed problems with these kinds of bumps in
> >> resource usage, although I may be biased towards larger systems where it
> >> doesn't matter as much.
> > 
> > IME it matters *more* on larger systems. Or at least used to, I haven't
> > experimented with this in quite a while.
> > 
> > It's possible that we improved a bunch of things sufficiently for this to 
> > not
> > matter anymore.
> 
> I'm curious if there is something specific you would look into to verify
> this.  IIUC one concern is the lock table not fitting into L3.  Is there
> anything else?  Any particular workloads you have in mind?

That was the main thing I was thinking of.


But I think I just thought of one more: It's going to *substantially* increase
the resource usage for tap tests.  Right now Cluster.pm has
# conservative settings to ensure we can run multiple 
postmasters:
print $conf "shared_buffers = 1MB\n";
print $conf "max_connections = 10\n";

for nodes that allow streaming.

Adding 256 extra backend slots increases the shared memory usage from ~5MB to
~18MB.


I just don't see much point in reserving 256 worker "possibilities", tbh. I
can't think of any practical system where it makes sense to use this much (nor
do I think it's going to be reasonable in the next 10 years) and it's just
going to waste memory and startup time for everyone.

Nor does it make sense to me to have the max autovac workers be independent of
max_connections.

Greetings,

Andres Freund




Re: Xact end leaves CurrentMemoryContext = TopMemoryContext

2024-06-18 Thread Tom Lane
Andres Freund  writes:
> I wonder if there's an argument the "persistent" TopTransactionContext should
> live under a different name outside of transactions, to avoid references to it
> working in a context where it's not valid?   It's probably not worth it, but
> not sure.

Hm.  We could stash the long-lived pointer in a static variable,
and only install it in TopTransactionContext when you're supposed
to use it.  I tend to agree not worth it, though.

regards, tom lane




Re: allow changing autovacuum_max_workers without restarting

2024-06-18 Thread Nathan Bossart
On Tue, Jun 18, 2024 at 01:43:34PM -0700, Andres Freund wrote:
> I just don't see much point in reserving 256 worker "possibilities", tbh. I
> can't think of any practical system where it makes sense to use this much (nor
> do I think it's going to be reasonable in the next 10 years) and it's just
> going to waste memory and startup time for everyone.

Given this, here are some options I see for moving this forward:

* lower the cap to, say, 64 or 32
* exclude autovacuum worker slots from computing number of locks, etc.
* make the cap configurable and default it to something low (e.g., 8)

My intent with a reserved set of 256 slots was to prevent users from
needing to deal with two GUCs.  For all practical purposes, it would be
possible to change autovacuum_max_workers whenever you want.  But if the
extra resource requirements are too much of a tax, I'm content to change
course.

-- 
nathan




Re: cost delay brainstorming

2024-06-18 Thread Nathan Bossart
On Tue, Jun 18, 2024 at 01:32:38PM -0700, Andres Freund wrote:
> On 2024-06-18 13:50:46 -0500, Nathan Bossart wrote:
>> Have we ruled out further adjustments to the cost parameters as a first
>> step?
> 
> I'm not against that, but it doesn't address the issue that with the current
> logic one set of values just isn't going to fit both a 60MB database that's
> allowed to burst to 100 iops and a 60TB database that has multiple 1M-iops
> NVMe drives.

True.

>> If you are still recommending that folks raise it and never recommending
>> that folks lower it, ISTM that our defaults might still not be in the right
>> ballpark.  The autovacuum_vacuum_cost_delay adjustment you reference (commit
>> cbccac3) is already 5 years old, so maybe it's worth another look.
> 
> Adjusting cost delay much lower doesn't make much sense imo. It's already only
> 2ms on a 1ms granularity variable.  We could increase the resolution, but
> sleeping for much shorter often isn't that cheap (you need to set up hardware
> timers all the time and due to the short time they can't be combined with
> other timers) and/or barely gives time to switch to other tasks.
> 
> 
> So we'd have to increase the cost limit.

Agreed.

-- 
nathan




Re: CI and test improvements

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 08:36:57 -0500, Justin Pryzby wrote:
> On Fri, Jun 14, 2024 at 08:34:37AM -0700, Andres Freund wrote:
> > Hm. There actually already is the mingw ccache installed. The images for 
> > mingw and msvc used to be separate (for startup performance when using 
> > containers), but we just merged them.  So it might be easiest to just 
> > explicitly use the ccache from there
> 
> > (also an explicit path might be faster).
> 
> I don't think the path search is significant; it's fast so long as
> there's no choco stub involved.

Comparatively it's definitely small, but I've seen it make a difference on
windows.


> > Could you check if that's fast?
> 
> Yes, it is.

Cool.


> > If not, we can just install the binaries distributed by the project, it's 
> > just more work to keep up2date that way. 
> 
> I guess you mean a separate line to copy choco's ccache.exe somewhere.

I was thinking of alternatively just using the binaries upstream
distributes. But the mingw way seems easier.

Perhaps we should add an environment variable pointing to ccache to the image,
or such?


>build_script: |
>  vcvarsall x64
> -ninja -C build
> +ninja -C build |tee build.txt
> +REM Since pipes lose the exit status of the preceding command, rerun the compilation
> +REM without the pipe, exiting now if it fails, to avoid trying to run checks
> +ninja -C build > nul

Perhaps we could do something like
(ninja -C build && touch success) | tee build.txt
cat success
?


> +CCACHE_MAXSIZE: "500M"

Does it have to be this big to work?


>configure_script: |
>  vcvarsall x64
> -meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" build
> +meson setup build --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" -Dc_args="/Z7"

A comment explaining why /Z7 is necessary would be good.



> From 3a399c6350ed2f341425431f184e382c3f46d981 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby 
> Date: Sat, 26 Feb 2022 19:39:10 -0600
> Subject: [PATCH 4/7] WIP: cirrus: upload changed html docs as artifacts
> 
> This could be done on the client side (cfbot).  One advantage of doing
> it here is that fewer docs are uploaded - many patches won't upload docs
> at all.
> 
> https://www.postgresql.org/message-id/flat/20220409021853.gp24...@telsasoft.com
> https://www.postgresql.org/message-id/CAB8KJ=i4qmeuopq+pcsmbzgd4o-xv0fcnc+q1x7hn9hsdvk...@mail.gmail.com
> 
> https://cirrus-ci.com/task/5396696388599808

I still think that this runs the risk of increasing space utilization and thus
increase frequency of caches/artifacts being purged.



> +  # The commit that this branch is rebased on.  There's no easy way to find 
> this.

I don't think that's true, IIRC I've complained about it before. We can do
something like:

major_num=$(grep PG_MAJORVERSION_NUM build/src/include/pg_config.h | awk '{print $3}');
echo major: $major_num;
if git rev-parse --quiet --verify origin/REL_${major_num}_STABLE > /dev/null ; then
    basebranch=origin/REL_${major_num}_STABLE;
else
    basebranch=origin/master;
fi;
echo base branch: $basebranch
base_commit=$(git merge-base HEAD $basebranch)



Greetings,

Andres Freund




Re: allow changing autovacuum_max_workers without restarting

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 16:09:09 -0500, Nathan Bossart wrote:
> On Tue, Jun 18, 2024 at 01:43:34PM -0700, Andres Freund wrote:
> > I just don't see much point in reserving 256 worker "possibilities", tbh. I
> > can't think of any practical system where it makes sense to use this much 
> > (nor
> > do I think it's going to be reasonable in the next 10 years) and it's just
> > going to waste memory and startup time for everyone.
> 
> Given this, here are some options I see for moving this forward:
> 
> * lower the cap to, say, 64 or 32
> * exclude autovacuum worker slots from computing number of locks, etc.

That seems good regardless

> * make the cap configurable and default it to something low (e.g., 8)


Another one:

Have a general cap of 64, but additionally limit it to something like
 max(1, min(WORKER_CAP, max_connections / 4))

so that cases like tap tests don't end up allocating vastly more worker slots
than actual connection slots.


> My intent with a reserved set of 256 slots was to prevent users from
> needing to deal with two GUCs.  For all practical purposes, it would be
> possible to change autovacuum_max_workers whenever you want.  But if the
> extra resource requirements are too much of a tax, I'm content to change
> course.

Approximately tripling shared memory usage for tap test instances does seem
too much to me.

Greetings,

Andres Freund




Re: assertion failure at cost_memoize_rescan()

2024-06-18 Thread David Rowley
On Tue, 18 Jun 2024 at 15:14, Richard Guo  wrote:
>
> On Tue, Jun 18, 2024 at 10:53 AM David Rowley  wrote:
> > I think the best solution is to apply the attached.  I didn't test,
> > but it should fix the issue you reported and also ensure that
> > MemoizePath.calls is never zero, which would also cause issues in the
> > hit_ratio calculation in cost_memoize_rescan().
>
> +1.

Thanks for looking. Pushed.

David




Re: Proposal: Document ABI Compatibility

2024-06-18 Thread Andreas 'ads' Scherbaum

On 03/06/2024 21:21, David E. Wheeler wrote:

On Jun 3, 2024, at 14:58, Andres Freund  wrote:


Hi,

Hello Andres.


Are there notes for the session?

Yes, but not posted yet. Here’s what Andreas 'ads' Scherbaum sent me for that 
bit of the conversation:

*   Core is focused on core ABI stability
*   David: No "statement of stability" in Core
 *   David/Jeremy/Tom: coding guidelines, style guidelines
  *   useful to have docs in core about what's stable and what's not, what
      you should compile against or not, and ABI guarantees
*   Abigale: there are hooks, but no overall concept for extensions
*   Tom: Peter Eisentraut is working on tests for extensions stability
*   Jeremy: nothing is preventing people from installing incompatible versions


The full "discussion" is here:

https://wiki.postgresql.org/wiki/PGConf.dev_2024_Developer_Unconference#Improving_extensions_in_core

And the ABI discussion here:
https://wiki.postgresql.org/wiki/PGConf.dev_2024_Extension_Summit#ABI.2FAPI_discussion

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project



Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-18 Thread John H
Hi Ashutosh,

Thanks for clarifying.

> not standalone functions created independently

I'm wondering why we wouldn't want to extend that functionality so
that if users (including superusers) do want to automatically
configure search_path
when they're creating functions it's available.

>  The difference is that installing extensions typically requires superuser 
> privileges,
> which is not the case with standalone functions.

Yes, but inadvertent access to different roles can still occur even if
it's not a superuser.

It's not clear to me who this commit is aimed at, because I see two
somewhat different use cases:

1. Make it easier for extension authors to configure search_path when
creating functions

The proposed mechanism via control file makes sense, although I would
like to understand
why specifying it today in CREATE FUNCTION doesn't work. Is it an
awareness issue?
I suppose it's easier if you only need to set it once in the control file?
Is that ease-of-use what we want to solve?

2. Make it easier to avoid inadvertent escalations when functions are created
(e.g CREATE EXTENSION/CREATE FUNCTION)

Then I think it's better to provide users a way to force the
search_path on functions when
extensions are created so superusers aren't reliant on extension authors.

Thanks,

--
John Hsu - Amazon Web Services




Re: allow changing autovacuum_max_workers without restarting

2024-06-18 Thread Nathan Bossart
On Tue, Jun 18, 2024 at 02:33:31PM -0700, Andres Freund wrote:
> Another one:
> 
> Have a general cap of 64, but additionally limit it to something like
>  max(1, min(WORKER_CAP, max_connections / 4))
> 
> so that cases like tap tests don't end up allocating vastly more worker slots
> than actual connection slots.

That's a clever idea.  My only concern would be that we are tethering two
parameters that aren't super closely related, but I'm unsure whether it
would cause any problems in practice.

-- 
nathan




Re: Revive num_dead_tuples column of pg_stat_progress_vacuum

2024-06-18 Thread Masahiko Sawada
On Sat, Jun 15, 2024 at 8:47 PM Robert Treat  wrote:
>
> On Thu, Jun 13, 2024 at 9:22 PM Masahiko Sawada  wrote:
> > On Fri, Jun 14, 2024 at 9:57 AM Masahiko Sawada  
> > wrote:
> > > On Fri, Jun 14, 2024 at 9:41 AM Michael Paquier  
> > > wrote:
> > > > On Thu, Jun 13, 2024 at 08:38:05PM -0400, Tom Lane wrote:
> > > > > Masahiko Sawada  writes:
> > > > >> I was about to push the patch but let me confirm just in case: is it
> > > > >> okay to bump the catversion even after post-beta1?
> > > > >
> > > > > Yes, that happens somewhat routinely.
> > > >
> > > > Up to RC, even after beta2.  This happens routinely every year because
> > > > tweaks are always required for what got committed.  And that's OK to
> > > > do so now.
> > >
> > > Thank you both for confirmation. I'll push it shortly.
> > >
> >
> > Pushed. Thank you for giving feedback and reviewing the patch!
> >
>
> One minor side effect of this change is the original idea of comparing
> pg_stat_progress.num_dead_tuples to pg_stat_all_tables.n_dead_tup
> column becomes less obvious. I presume the release notes for
> pg_stat_progress_vacuum will be updated to also include this column
> name change as well, so maybe that's enough for folks to figure things
> out?

The release note has been updated, and I think it would help users
understand the change.

> At least I couldn't find anywhere in the docs where we have
> described the relationship between these columns before. Thoughts?

It would be a good idea to improve the documentation, but I think that
we cannot simply compare these two numbers since the numbers that
these fields count are slightly different. For instance,
pg_stat_all_tables.n_dead_tup includes the number of dead tuples that
are going to be HOT-pruned.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: DOCS: Generated table columns are skipped by logical replication

2024-06-18 Thread Peter Smith
On Tue, Jun 18, 2024 at 9:40 PM Amit Kapila  wrote:
>
> On Tue, Jun 18, 2024 at 12:11 PM Peter Smith  wrote:
> >
> > While reviewing another thread that proposes to include "generated
> > columns" support for logical replication [1] I was looking for any
> > existing PostgreSQL documentation on this topic.
> >
> > But, I found almost nothing about it at all -- I only saw one aside
> > mention saying that logical replication low-level message information
> > is not sent for generated columns [2].
> >
> > ~~
> >
> > IMO there should be some high-level place in the docs where the
> > behaviour for logical replication w.r.t. generated columns is
> > described.
> >
>
> +1.
>
> > There are lots of candidate places which could talk about this topic.
> > * e.g.1 in "Generated Columns" (section 5.4)
> > * e.g.2 in LR "Column-Lists" docs (section 29.5)
> > * e.g.3 in LR "Restrictions" docs (section 29.7)
> > * e.g.4 in the "CREATE PUBLICATION" reference page
> >
> > For now, I have provided just a simple patch for the "Generated
> > Columns" section [3]. Perhaps it is enough.
> >
>
> Can we try to clarify if their corresponding values are replicated?
>

Sure. Here are some current PG17 observed behaviours demonstrating
that generated columns are not replicated.

==

Example #1

The generated cols 'b' column is not replicated. Notice the subscriber
side 'b' has its own computed value which uses a different
calculation.

PUB: create table t1 (a int, b int generated always as (a * 2) stored);
SUB: create table t1 (a int, b int generated always as (a * 20) stored);

PUB:
insert into t1 values (1),(2),(3);
create publication pub1 for table t1;
test_pub=# select * from t1;
 a | b
---+---
 1 | 2
 2 | 4
 3 | 6
(3 rows)

SUB:
create subscription sub1 connection 'dbname=test_pub' publication pub1;
test_sub=# select * from t1;
 a | b
---+----
 1 | 20
 2 | 40
 3 | 60
(3 rows)

==

Example 2

You cannot specify a generated column in a CREATE PUBLICATION column-list.

PUB:
create table t2 (a int, b int generated always as (a * 2) stored);
create publication pub2 for table t2(b);
ERROR:  cannot use generated column "b" in publication column list

==

Example 3

Here the subscriber-side table doesn't even have a column 'b'.
Normally, a missing column like this would cause subscription errors,
but since the publisher-side generated column 'b' is not replicated,
this scenario is allowed.

PUB: create table t3 (a int, b int generated always as (a * 2) stored);
SUB: create table t3 (a int);

PUB:
create publication pub3 for table t3;
insert into t3 values (1),(2),(3);
test_pub=# select * from t3;
 a | b
---+---
 1 | 2
 2 | 4
 3 | 6
(3 rows)

SUB:
create subscription sub3 connection 'dbname=test_pub' publication pub3;
test_sub=# select * from t3;
 a
---
 1
 2
 3
(3 rows)

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Revive num_dead_tuples column of pg_stat_progress_vacuum

2024-06-18 Thread Peter Geoghegan
On Tue, Jun 18, 2024 at 8:49 PM Masahiko Sawada  wrote:
> > At least I couldn't find anywhere in the docs where we have
> > described the relationship between these columns before. Thoughts?
>
> It would be a good idea to improve the documentation, but I think that
> we cannot simply compare these two numbers since the numbers that
> these fields count are slightly different. For instance,
> pg_stat_all_tables.n_dead_tup includes the number of dead tuples that
> are going to be HOT-pruned.

This isn't a small difference. It's actually a huge one in many cases:

https://www.postgresql.org/message-id/cah2-wzkkgt2gt4xaus5eqoqi4mvvl5x49hbttwccc8deqen...@mail.gmail.com

Practically speaking, they're just two different things, with hardly
any fixed relationship.
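
For anyone who wants to see the two counters side by side, a rough
sketch (assuming a VACUUM is in progress in another session, and using
the revived num_dead_tuples column name discussed in this thread) might
look like:

```sql
-- Observe both counters while a VACUUM is running (query from another
-- session).  The two numbers need not match: n_dead_tup is a cumulative
-- estimate and also counts tuples that HOT pruning may remove before
-- the vacuum ever visits them.
SELECT v.relid::regclass AS table_name,
       v.num_dead_tuples,   -- dead tuples collected so far by this VACUUM
       s.n_dead_tup         -- estimated dead tuples from the stats system
FROM pg_stat_progress_vacuum v
JOIN pg_stat_all_tables s ON s.relid = v.relid;
```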

-- 
Peter Geoghegan




Re: Missing docs for new enable_group_by_reordering GUC

2024-06-18 Thread Alexander Korotkov
On Tue, Jun 18, 2024 at 4:14 PM Pavel Borisov  wrote:
>> Controls if the query planner will produce a plan which will provide 
>> GROUP BY keys presorted in the order of keys of a child 
>> node of the plan, such as an index scan. When disabled, the query planner 
>> will produce a plan with GROUP BY keys only reordered to 
>> match
>> the ORDER BY clause, if any. When enabled, the planner 
>> will try to produce a more efficient plan. The default value is on.
> A correction to my earlier suggestion: presorted -> sorted, reordered -> sorted

Thank you for your review.  I think all of this makes sense.  Please
check the revised patch attached.
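
As a quick way to see the behaviour the description talks about, one
could compare plans with the GUC on and off (a sketch only; the actual
plan shapes depend on the data, statistics, and available indexes):

```sql
-- Sketch: compare GROUP BY key ordering with the GUC enabled vs disabled.
CREATE TABLE t (a int, b int);
CREATE INDEX ON t (b, a);
INSERT INTO t SELECT i % 10, i % 100 FROM generate_series(1, 10000) i;
ANALYZE t;

SET enable_group_by_reordering = on;
EXPLAIN (COSTS OFF) SELECT a, b, count(*) FROM t GROUP BY a, b;
-- The planner may reorder the GROUP BY keys to (b, a) to match the index.

SET enable_group_by_reordering = off;
EXPLAIN (COSTS OFF) SELECT a, b, count(*) FROM t GROUP BY a, b;
-- Keys are only reordered to match an ORDER BY clause, if any.
```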

--
Regards,
Alexander Korotkov
Supabase


v3-0001-Add-doc-entry-for-the-new-GUC-paramenter-enable_g.patch
Description: Binary data


Re: Better error message when --single is not the first arg to postgres executable

2024-06-18 Thread Greg Sabino Mullane
If I am reading your patch correctly, we have lost the behavior of least
surprise in which the first "meta" argument overrides all others:

$ bin/postgres --version --boot --extrastuff
postgres (PostgreSQL) 16.2

What about just inlining --version and --help e.g.

else if (strcmp(argv[i], "--version") == 0 || strcmp(argv[i], "-V") == 0)
{
 fputs(PG_BACKEND_VERSIONSTR, stdout);
 exit(0);
}

I'm fine with being more persnickety about the other options; they are much
rarer and not unixy.

However, there's a complication:
> ...
> This remaining discrepancy might be okay, but I was really hoping to reduce
> the burden on users to figure out the correct ordering of options.  The
> situations in which I've had to use single-user mode are precisely the
> situations in which I'd rather not have to spend time learning these kinds
> of details.
>

Yes, that's unfortunate. But I'd be okay with the db-last requirement as
long as the error message is sane and points one in the right direction.

Cheers,
Greg


Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()

2024-06-18 Thread Tender Wang
Richard Guo  于2024年6月18日周二 17:24写道:

> On Tue, Jun 4, 2024 at 6:51 PM Tender Wang  wrote:
> > Yeah, Richard commented the v1 patch about JOIN_UNIQUE_INNER in [1]
> >
> > * I think we should not consider materializing the cheapest inner path
> > if we're doing JOIN_UNIQUE_INNER, because in this case we have to
> > unique-ify the inner path.
> >
> > We don't consider materializing the inner path if the jointype is
> > JOIN_UNIQUE_INNER in match_unsorted_outer().
> > So the logic here is the same as in match_unsorted_outer(). I added
> > comments to explain why.
>
> I looked through the v4 patch and found an issue.  For the plan diff:
>
> + ->  Nested Loop
> +   ->  Parallel Seq Scan on prt1_p1 t1_1
> +   ->  Materialize
> + ->  Sample Scan on prt1_p1 t2_1
> +   Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
> +   Filter: (t1_1.a = a)
>
> This does not seem correct to me.  The inner path is parameterized by
> the outer rel, in which case it does not make sense to add a Materialize
> node on top of it.
>

Yeah, you're right.

>
> I updated the patch to include a check in consider_parallel_nestloop
> ensuring that inner_cheapest_total is not parameterized by outerrel
> before materializing it.  I also tweaked the comments, test cases and
> commit message.
>

Thanks for the work. Now it looks better.
I have changed the status from "needs review" to "ready for committer" on
the commitfest.

-- 
Tender Wang


Re: IPC::Run accepts bug reports

2024-06-18 Thread Andres Freund
Hi,

On 2024-06-18 12:00:13 -0700, Andres Freund wrote:
> On 2024-06-18 10:10:17 -0700, Noah Misch wrote:
> > > 1) Sometimes hangs hard on windows if started processes have not been
> > >    shut down before script exits.  I've mostly encountered this via the
> > >    buildfarm / CI, so I never had a good way of narrowing this down.
> > >    It's very painful because things seem to often just get stuck once
> > >    that happens.
> >
> > That's bad.  Do you have a link to a log, a thread discussing it, or even
> > just one of the test names experiencing it?
>
> I'm unfortunately blanking on the right keyword right now.
>
> I think it basically required not shutting down a process started in the
> background with IPC::Run.
>
> I'll try to repro it by removing some ->finish or ->quit calls.

Yep, that did it.  It reliably reproduces if I comment out
the lines below
  # explicitly shut down psql instances gracefully - to avoid hangs
  # or worse on windows
in 021_row_visibility.pl

The logfile ends in
Warning: unable to close filehandle GEN25 properly: Bad file descriptor during 
global destruction.
Warning: unable to close filehandle GEN20 properly: Bad file descriptor during 
global destruction.


Even if I cancel the test, I can't rerun it because of a leftover psql:
a) a new temp install can't be made (could be solved by rm -rf)
b) the test's logfile can't be removed (couldn't even rename the directory)

The psql instance needs to be found and terminated first.


Greetings,

Andres Freund




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-06-18 Thread Ashutosh Sharma
Hi,

On Wed, Jun 19, 2024 at 6:06 AM John H  wrote:
>
> Hi Ashutosh,
>
> Thanks for clarifying.
>
> > not standalone functions created independently
>
> I'm wondering why we wouldn't want to extend that functionality so
> that if users (including superusers) do want to automatically
> configure search_path
> when they're creating functions it's available.
>
> > The difference is that installing extensions typically requires
> > superuser privileges, which is not the case with standalone functions.
>
> Yes, but inadvertent access to different roles can still occur even if
> it's not a superuser.
>
> It's not clear to me who the audience of this commit is aimed at,
> cause I see two sort of
> different use cases?
>
> 1. Make it easier for extension authors to configure search_path when
> creating functions
>
> The proposed mechanism via control file makes sense, although I would
> like to understand
> why specifying it today in CREATE FUNCTION doesn't work. Is it an
> awareness issue?
> I suppose it's easier if you only need to set it once in the control file?
> Is that ease-of-use what we want to solve?
>
> 2. Make it easier to avoid inadvertent escalations when functions are created
> (e.g CREATE EXTENSION/CREATE FUNCTION)
>
> Then I think it's better to provide users a way to force the
> search_path on functions when
> extensions are created so superusers aren't reliant on extension authors.
>

For standalone functions, users can easily adjust the search_path
settings as needed. However, managing this becomes challenging for
functions created via extensions. Extensions are relocatable, making
it difficult to determine and apply search_path settings in advance
within the extension_name--*.sql file when defining functions or
procedures. Introducing a new control flag option allows users to set
implicit search_path settings for functions created by the extension,
if needed. The main aim here is to enhance security for functions
created by extensions by setting search_path at the function level.
This ensures precise control over how objects are accessed within each
function, making behavior more predictable and minimizing security
risks, especially for SECURITY DEFINER functions associated with
extensions created by superusers.
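
For comparison, this is the sort of per-function setting a standalone
function author can already apply by hand, and which the proposed
control-file flag would effectively supply for extension-created
functions (a sketch; the function, schema, and table names here are made
up):

```sql
-- A SECURITY DEFINER function with a pinned search_path, so that object
-- references inside the body cannot be hijacked via the caller's path.
CREATE FUNCTION my_schema.get_secret(id int)
RETURNS text
LANGUAGE sql
SECURITY DEFINER
SET search_path = pg_catalog, pg_temp   -- resolve names only in trusted schemas
AS $$ SELECT secret FROM my_schema.secrets WHERE secret_id = id $$;
```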

--
With Regards,
Ashutosh Sharma.




Re: Document NULL

2024-06-18 Thread Yugo NAGATA
On Sat, 11 May 2024 08:33:27 -0700
"David G. Johnston"  wrote:

> On Fri, May 3, 2024 at 9:00 AM David G. Johnston 
> wrote:
> 
> > On Fri, May 3, 2024 at 8:44 AM Tom Lane  wrote:
> >
> >> Having said that, I reiterate my proposal that we make it a new
> >>
> >  under DDL, before 5.2 Default Values which is the first
> >> place in ddl.sgml that assumes you have heard of nulls.
> >
> >
> > I will go with this and remove the "Data Basics" section I wrote, leaving
> > it to be just a discussion about null values.  The tutorial is the only
> > section that really needs unique wording to fit in.  No matter where we
> > decide to place it otherwise the core content will be the same, with maybe
> > a different section preface to tie it in.
> >
> >
> v3 Attached.
> 
> Probably at the 90% complete mark.  Minimal index entries, not as thorough
> a look-about of the existing documentation as I'd like.  Probably some
> wording and style choices to tweak.  Figured better to get feedback now
> before I go into polish mode.  In particular, tweaking and re-running the
> examples.
> 
> Yes, I am aware of my improper indentation for programlisting and screen. I
> wanted to be able to use the code folding features of my editor.  Those can
> be readily un-indented in the final version.
> 
> The changes to func.sgml is basically one change repeated something like 20
> times with tweaks for true/false.  Plus moving the discussion regarding the
> SQL specification into the new null handling section.
> 
> It took me doing this to really understand the difference between row
> constructors and composite typed values, especially since array
> constructors produce array typed values and the constructor is just an
> unimportant implementation option while row constructors introduce
> meaningfully different behaviors when used.
> 
> My plan is to have a v4 out next week, without or without a review of this
> draft, but then the subsequent few weeks will probably be a bit quiet.

+   A null value literal is written as unquoted, case insensitive, NULL.
...(snip)...
+  
+  SELECT
+NULL,
+pg_typeof(null),
+pg_typeof(NuLl::text),
+cast(null as text);
+  

It may be a trivial thing, but I am not sure we need to mention case
insensitivity here, because all keywords and unquoted identifiers are
case-insensitive in PostgreSQL and it is not specific to NULL.

Also, I found that other parts of the documentation use "case-insensitive",
with the words joined by a hyphen, so I wonder if it is better to use the
same form if we leave the description.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: State of pg_createsubscriber

2024-06-18 Thread Amit Kapila
On Tue, Jun 18, 2024 at 12:41 PM Euler Taveira  wrote:
>
> On Tue, Jun 18, 2024, at 12:59 AM, Amit Kapila wrote:
>
> On Mon, May 20, 2024 at 12:12 PM Amit Kapila  wrote:
> >
> > On Sun, May 19, 2024 at 11:20 PM Euler Taveira  wrote:
> > >
> > > On Sun, May 19, 2024, at 2:30 PM, Tom Lane wrote:
> > >
> > > I'm fairly disturbed about the readiness of pg_createsubscriber.
> > > The 040_pg_createsubscriber.pl TAP test is moderately unstable
> > > in the buildfarm [1], and there are various unaddressed complaints
> > > at the end of the patch thread (pretty much everything below [2]).
> > > I guess this is good enough to start beta with, but it's far from
> > > being good enough to ship, IMO.  If there were active work going
> > > on to fix these things, I'd feel better, but neither the C code
> > > nor the test script have been touched since 1 April.
> > >
> > >
> > > My bad. :( I'll post patches soon to address all of the points.
> > >
> >
> > Just to summarize, apart from BF failures for which we had some
> > discussion, I could recall the following open points:
> >
> > 1. After promotion, the pre-existing replication objects should be
> > removed (either optionally or always), otherwise, it can lead to a new
> > subscriber not being able to restart or getting some unwarranted data.
> > [1][2].
> >
> > 2. Retaining synced slots on new subscribers can lead to unnecessary
> > WAL retention and dead rows [3].
> >
> > 3. We need to consider whether some of the messages displayed in
> > --dry-run mode are useful or not [4].
> >
>
> The recent commits should silence BF failures and resolve point number
> 2. But we haven't done anything yet for 1 and 3. For 3, we have a
> patch in email [1] (v3-0005-Avoid*) which can be reviewed and
> committed but point number 1 needs discussion. Apart from that
> somewhat related to point 1, Kuroda-San has raised a point in an email
> [2] for replication slots. Shlok has presented a case in this thread
> [3] where the problem due to point 1 can cause ERRORs or can cause
> data inconsistency.
>
>
> I read v3-0005 and it seems to silence (almost) all "write" messages. Does it
> intend to avoid the misinterpretation that the dry run mode is writing
> something? It is dry run mode! If I run a tool in dry run mode, I expect it
> to execute some verifications and print useful messages so I can evaluate
> if it is ok to run it. Maybe it is not your expectation for dry run mode.
>

I haven't studied the patch so can't comment but the intention was to
not print some unrelated write messages. I have shared my observation
in the email [1].

> I think if it is not clear, let's state that it changes nothing in dry run mode.
>
> pg_createsubscriber: no modifications are done
>
> as a first message in dry run mode. I agree with you when you pointed out that
> some messages are misleading.
>
> pg_createsubscriber: hint: If pg_createsubscriber fails after this
> point, you must recreate the physical replica before continuing.
>
> Maybe for this one, we omit the fake information, like:
>
> pg_createsubscriber: setting the replication progress on database "postgres"
>

I think we don't need to display this message as we are not going to
do anything for this in the --dry-run mode. We can even move the
related code in (!dry_run) check.

> I will post a patch to address the messages once we agree what needs to be
> changed.
>

I suggest we can start a new thread with the messages shared in the
email [1] and your response for each one of those.

> Regarding 3, publications and subscriptions are ok to remove. You are not
> allowed to create them on standby, hence, all publications and subscriptions
> are streamed from primary. However, I'm wondering if you want to remove the
> publications.
>

I am not so sure about publications, but we should remove subscriptions
as there are clear problems with those, as shown by Shlok in this thread.

> Replication slots on a standby server are "invalidated" despite
> wal_status saying "reserved" (I think it is an oversight in the
> design that implements slot invalidation), however, required WAL files have
> already been removed because of pg_resetwal (see modify_subscriber_sysid()).
> The scenario is to promote a standby server, run pg_resetwal on it and check
> pg_replication_slots.
>

Ideally, invalidated slots shouldn't create any problems but it is
better that we discuss this also as a separate problem in new thread.

> Do you have any other scenarios in mind?
>

No, so we have three issues to discuss (a) some unwarranted messages
in --dry-run mode; (b) whether to remove pre-existing subscriptions
during conversion; (c) whether to remove pre-existing replication
slots.

Would you like to start three new threads for each of these or would
you like Kuroda-San or me to start some or all?

[1] - 
https://www.postgresql.org/message-id/CAA4eK1J2fAvsJ2HihbWJ_GxETd6sdqSMrZdCVJEutRZRpm1MEQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: State of pg_createsubscriber

2024-06-18 Thread Amit Kapila
On Tue, Jun 18, 2024 at 12:13 PM Peter Eisentraut  wrote:
>
> On 18.06.24 05:59, Amit Kapila wrote:
> >> 1. After promotion, the pre-existing replication objects should be
> >> removed (either optionally or always), otherwise, it can lead to a new
> >> subscriber not being able to restart or getting some unwarranted data.
> >> [1][2].
> >>
> >> 2. Retaining synced slots on new subscribers can lead to unnecessary
> >> WAL retention and dead rows [3].
> >>
> >> 3. We need to consider whether some of the messages displayed in
> >> --dry-run mode are useful or not [4].
> >>
> >
> > The recent commits should silence BF failures and resolve point number
> > 2. But we haven't done anything yet for 1 and 3. For 3, we have a
> > patch in email [1] (v3-0005-Avoid*) which can be reviewed and
> > committed but point number 1 needs discussion. Apart from that
> > somewhat related to point 1, Kuroda-San has raised a point in an email
> > [2] for replication slots. Shlok has presented a case in this thread
> > [3] where the problem due to point 1 can cause ERRORs or can cause
> > data inconsistency.
> >
> > Now, the easiest way out here is that we accept the issues with the
> > pre-existing subscriptions and replication slots cases and just
> > document them for now with the intention to work on those in the next
> > version. OTOH, if there are no major challenges, we can try to
> > implement a patch for them as well as see how it goes.
>
> This has gotten much too confusing to keep track of.
>
> I suggest, if anyone has anything they want considered for
> pg_createsubscriber for PG17 at this point, they start a new thread, one
> for each topic, ideally with a subject like "pg_createsubscriber:
> Improve this thing", provide a self-contained description of the issue,
> and include a patch if one is available.
>

Fair enough. In my mind, we have three pending issues to discuss and I
have responded to an email to see if Euler can start individual
threads for those, otherwise, I'll do it.

We can close the open item pointing to this thread.

-- 
With Regards,
Amit Kapila.




Re: Document NULL

2024-06-18 Thread David G. Johnston
On Tue, Jun 18, 2024 at 8:34 PM Yugo NAGATA  wrote:

>
> It may be a trivial thing but I am not sure we need to mention case
> insensitivity
> here, because all keywords and unquoted identifiers are case-insensitive in
> PostgreSQL and it is not specific to NULL.
>

But it is neither a keyword nor an identifier.  It behaves more like the 1
in: SELECT 1 as one;  A constant, which has no implied rules, mainly because
numbers don't have case.  That suggests adding some specific mention there,
and we also probably need to bring up NULL and its "untyped" nature in the
syntax chapter, probably here:

https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS-GENERIC


> Also, I found the other parts of the documentation use "case-insensitive"
> in which
> words are joined with hyphen, so I wonder it is better to use the same
> form if we
> leave the description.
>
>
Typo on my part, fixed.

I'm not totally against just letting this content be assumed to be learned
from elsewhere in the documentation but it also seems reasonable to
include.  I'm going to leave it for now.

David J.


Re: Inconsistency between try_mergejoin_path and create_mergejoin_plan

2024-06-18 Thread Richard Guo
On Mon, Jun 17, 2024 at 10:51 PM Alexander Pyhalov
 wrote:
> There's the following inconsistency between try_mergejoin_path() and
> create_mergejoin_plan().
> When clause operator has no commutator, we can end up with mergejoin
> path.
> Later create_mergejoin_plan() will call get_switched_clauses(). This
> function can error out with
>
> ERROR:  could not find commutator for operator XXX

Interesting.  This error can be reproduced with table 'ec1' from
sql/equivclass.sql.

set enable_indexscan to off;

explain select * from ec1 t1 join ec1 t2 on t2.ff = t1.f1;
ERROR:  could not find commutator for operator 30450

The column ec1.f1 has a type of 'int8alias1', a new data type created in
this test file.  Additionally, there is also a newly created operator
'int8 = int8alias1' which is mergejoinable but lacks a valid commutator.
Therefore, there is no problem generating the mergejoin path, but when
we create the mergejoin plan, get_switched_clauses would notice the
absence of a valid commutator needed to commute the clause.
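
The shape of the test setup is, roughly, an equality operator marked
MERGES but declared without a COMMUTATOR (a simplified sketch of what
sql/equivclass.sql does; the real script also creates the int8alias1
type and its operator families):

```sql
-- Simplified sketch: a cross-type equality operator that is
-- mergejoinable (MERGES) but has no commutator, so the planner can
-- build a mergejoin path yet fails in get_switched_clauses() when it
-- must commute the clause at plan-creation time.
CREATE FUNCTION int8alias1eq(int8, int8alias1) RETURNS bool
  LANGUAGE internal STRICT IMMUTABLE AS 'int8eq';
CREATE OPERATOR = (
  PROCEDURE = int8alias1eq,
  LEFTARG = int8,
  RIGHTARG = int8alias1,
  MERGES
  -- note: no COMMUTATOR specified
);
```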

It seems to me that the new operator is somewhat artificial, since it is
designed to support a mergejoin but lacks a valid commutator.  So before
we proceed to discuss the fix, I'd like to know whether this is a valid
issue that needs fixing.

Any thoughts?

Thanks
Richard




Re: DOCS: Generated table columns are skipped by logical replication

2024-06-18 Thread Amit Kapila
On Wed, Jun 19, 2024 at 6:46 AM Peter Smith  wrote:
>
> On Tue, Jun 18, 2024 at 9:40 PM Amit Kapila  wrote:
> >
> > On Tue, Jun 18, 2024 at 12:11 PM Peter Smith  wrote:
> > >
> > > While reviewing another thread that proposes to include "generated
> > > columns" support for logical replication [1] I was looking for any
> > > existing PostgreSQL documentation on this topic.
> > >
> > > But, I found almost nothing about it at all -- I only saw one aside
> > > mention saying that logical replication low-level message information
> > > is not sent for generated columns [2].
> > >
> > > ~~
> > >
> > > IMO there should be some high-level place in the docs where the
> > > behaviour for logical replication w.r.t. generated columns is
> > > described.
> > >
> >
> > +1.
> >
> > > There are lots of candidate places which could talk about this topic.
> > > * e.g.1 in "Generated Columns" (section 5.4)
> > > * e.g.2 in LR "Column-Lists" docs (section 29.5)
> > > * e.g.3 in LR "Restrictions" docs (section 29.7)
> > > * e.g.4 in the "CREATE PUBLICATION" reference page
> > >
> > > For now, I have provided just a simple patch for the "Generated
> > > Columns" section [3]. Perhaps it is enough.
> > >
> >
> > Can we try to clarify if their corresponding values are replicated?
> >
>
> Sure. Here are some current PG17 observed behaviours demonstrating
> that generated columns are not replicated.
>

Thanks for sharing examples. Your proposed patch as-is looks good to
me. We should back-patch this unless you or someone else thinks
otherwise.

-- 
With Regards,
Amit Kapila.



