Re: Could not run generate_unaccent_rules.py script when update unicode

2023-09-26 Thread Japin Li


On Wed, 27 Sep 2023 at 13:46, Michael Paquier  wrote:
> On Wed, Sep 27, 2023 at 09:15:00AM +0800, Japin Li wrote:
>> On Wed, 27 Sep 2023 at 08:03, Michael Paquier  wrote:
>>> I am not sure that many people run this script frequently so that may
>>> not be worth adding a check for a defined, still empty or incorrect
>> 
>> Yeah, not frequently; however, I already use it. Since we provide this
>> function, why not make it better?
>
> Well, I don't mind doing as you suggest.
>
>>> value, but..  If you were to change the Makefile we use in this path,
>>> how are you suggesting to change it?
>> 
>> I provided a patch at the bottom of [1].  Attached here again.
>
> No file was attached so I somewhat missed it.  And indeed, you're
> right.  The current rule is useless when compiling without
> --with-python, as PYTHON is empty but defined.  What you are proposing
> is similar to what pgxs.mk does for bison and flex, and "python" would
> still be able to map to python3 or python2.7, which should be OK in
> most cases.
>
> I thought that this was quite old, but no, f85a485f8 was at the origin
> of that.  So I've applied the patch down to 13.

Thanks for your review and push!

-- 
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.




Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-26 Thread vignesh C
On Wed, 27 Sept 2023 at 06:58, Peter Smith  wrote:
>
> On Tue, Sep 26, 2023 at 11:57 PM vignesh C  wrote:
> >
> > On Tue, 26 Sept 2023 at 13:03, Peter Smith  wrote:
> > >
> > > Here are some comments for patch v2-0001.
> > >
> > > ==
> > > src/backend/replication/logical/worker.c
> > >
> > > 1. maybe_reread_subscription
> > >
> > > ereport(LOG,
> > > (errmsg("logical replication worker for subscription \"%s\"
> > > will restart because of a parameter change",
> > > MySubscription->name)));
> > >
> > > Is this really a "parameter change" though? It might be a stretch to
> > > call the user role change a subscription parameter change. Perhaps
> > > this case needs its own LOG message?
> >
> > When I was doing this change the same thought had come to my mind too
> > but later I noticed that in case of owner change there was no separate
> > log message. Since superuser check is somewhat similar to owner
> > change, I felt no need to make any change for this.
> >
>
> Yeah, I had seen the same already before my comment. Anyway, YMMV.
>
> > > ==
> > > src/include/catalog/pg_subscription.h
> > >
> > > 2.
> > >   char*origin; /* Only publish data originating from the
> > >   * specified origin */
> > > + bool isownersuperuser; /* Is subscription owner superuser? */
> > >  } Subscription;
> > >
> > >
> > > Is a new Subscription field member really necessary? The Subscription
> > > already has 'owner' -- why doesn't function
> > > maybe_reread_subscription() just check:
> > >
> > > (!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner))
> >
>> We need the new variable so that we can store this value when the worker
>> is started and later compare the current value against the value from
>> worker startup. Since we need the superuser status as of worker start,
>> I feel this variable is required.
> >
>
> OK. In that case, then shouldn't the patch replace the other
> superuser_arg() code already in function run_apply_worker() to make
> use of this variable? Otherwise, there are 2 ways of getting the same
> information.

Modified

> ==
> src/test/subscription/t/027_nosuperuser.pl
>
> I am suspicious that something may be wrong with either the code or
> the test because while experimenting, I accidentally found that even
> if I *completely* remove the important change below, the TAP test will
> still pass anyway.
>
> - !equal(newsub->publications, MySubscription->publications))
> + !equal(newsub->publications, MySubscription->publications) ||
> + (!newsub->isownersuperuser && MySubscription->isownersuperuser))

The test did not wait for the subscription to sync; because the owner had
changed to a non-superuser, the tablesync workers were starting later and
failing with this error on HEAD. Now I have added a wait for the
subscription to sync, so the error will always come from the apply worker
restart, and when the test is run on HEAD we can see that the error
message does not appear in the log.

The v3 patch attached at [1] has the changes for the same.
[1] - 
https://www.postgresql.org/message-id/CALDaNm2JwPmX0rhkTyjGRQmZjwbKax%3D3Ytgw2KY9msDPPNGmBg%40mail.gmail.com

Regards,
Vignesh




Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Michael Paquier
On Wed, Sep 27, 2023 at 04:51:29AM +, Zhijie Hou (Fujitsu) wrote:
> While searching the code, I noticed one Postgres fork where PGOutputData is
> used in other places. Although it's a separate fork, it seems better to
> discuss the removal separately.
> 
> [1] 
> https://github.com/Tencent/TBase/blob/7cf7f8afbcab7290538ad5e65893561710be3dfa/src/backend/replication/walsender.c#L100

Indeed.  Interesting.
--
Michael




Re: Could not run generate_unaccent_rules.py script when update unicode

2023-09-26 Thread Michael Paquier
On Wed, Sep 27, 2023 at 09:15:00AM +0800, Japin Li wrote:
> On Wed, 27 Sep 2023 at 08:03, Michael Paquier  wrote:
>> I am not sure that many people run this script frequently so that may
>> not be worth adding a check for a defined, still empty or incorrect
> 
> Yeah, not frequently; however, I already use it. Since we provide this
> function, why not make it better?

Well, I don't mind doing as you suggest.

>> value, but..  If you were to change the Makefile we use in this path,
>> how are you suggesting to change it?
> 
> I provided a patch at the bottom of [1].  Attached here again.

No file was attached so I somewhat missed it.  And indeed, you're
right.  The current rule is useless when compiling without
--with-python, as PYTHON is empty but defined.  What you are proposing
is similar to what pgxs.mk does for bison and flex, and "python" would
still be able to map to python3 or python2.7, which should be OK in
most cases.
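The fix being described, guarding against an empty PYTHON in the spirit of what pgxs.mk does for bison and flex, can be sketched like this (a hedged illustration only; the rule target and the committed Makefile change may differ in detail):

```makefile
# If configure was run without --with-python, PYTHON is defined but empty.
# Fall back to plain "python" from PATH, as pgxs.mk does for BISON/FLEX.
ifeq ($(PYTHON),)
PYTHON = python
endif

# Hypothetical rule shape: the script is now always invoked via $(PYTHON).
unaccent.rules: generate_unaccent_rules.py
	$(PYTHON) $< > $@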

I thought that this was quite old, but no, f85a485f8 was at the origin
of that.  So I've applied the patch down to 13.
--
Michael




Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-26 Thread vignesh C
On Tue, 26 Sept 2023 at 13:03, Peter Smith  wrote:
>
> Here are some comments for patch v2-0001.
> ==
> src/test/subscription/t/027_nosuperuser.pl
>
> 3.
> # The apply worker should get restarted after the superuser prvileges are
> # revoked for subscription owner alice.
>
> typo
>
> /prvileges/privileges/

Modified

> ~
>
> 4.
> +# After the user becomes non-superuser the apply worker should be restarted 
> and
> +# it should fail with 'password is required' error as password option is not
> +# part of the connection string.
>
> /as password option/because the password option/

Modified

The attached v3 version patch has the changes for the same.

Regards,
Vignesh
From f2828562ad7619a6d1fb1844c6afba546df840b5 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 22 Sep 2023 15:12:23 +0530
Subject: [PATCH v3] Restart the apply worker if the subscription owner has
 changed from superuser to non-superuser.

Restart the apply worker if the subscription owner has changed from
superuser to non-superuser. This is required so that the subscription
connection string gets revalidated to identify cases where the
password option is not specified as part of the connection string for
non-superuser.
---
 src/backend/catalog/pg_subscription.c   |  3 +++
 src/backend/commands/subscriptioncmds.c |  4 ++--
 src/backend/replication/logical/tablesync.c |  3 +--
 src/backend/replication/logical/worker.c| 18 ++-
 src/include/catalog/pg_subscription.h   |  1 +
 src/test/subscription/t/027_nosuperuser.pl  | 25 +
 6 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index d07f88ce28..bc74b695c6 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -108,6 +108,9 @@ GetSubscription(Oid subid, bool missing_ok)
    Anum_pg_subscription_suborigin);
 	sub->origin = TextDatumGetCString(datum);
 
+	/* Record whether the subscription owner is a superuser */
+	sub->isownersuperuser = superuser_arg(sub->owner);
+
 	ReleaseSysCache(tup);
 
 	return sub;
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 6fe111e98d..ac57b150d7 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -869,7 +869,7 @@ AlterSubscription_refresh(Subscription *sub, bool copy_data,
 	load_file("libpqwalreceiver", false);
 
 	/* Try to connect to the publisher. */
-	must_use_password = !superuser_arg(sub->owner) && sub->passwordrequired;
+	must_use_password = !sub->isownersuperuser && sub->passwordrequired;
 	wrconn = walrcv_connect(sub->conninfo, true, must_use_password,
 			sub->name, );
 	if (!wrconn)
@@ -1249,7 +1249,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 			load_file("libpqwalreceiver", false);
 			/* Check the connection info string. */
 			walrcv_check_conninfo(stmt->conninfo,
-  sub->passwordrequired && !superuser_arg(sub->owner));
+  sub->passwordrequired && !sub->isownersuperuser);
 
 			values[Anum_pg_subscription_subconninfo - 1] =
 CStringGetTextDatum(stmt->conninfo);
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e2cee92cf2..03e4d6adbd 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1278,9 +1278,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 
 	/* Is the use of a password mandatory? */
 	must_use_password = MySubscription->passwordrequired &&
-		!superuser_arg(MySubscription->owner);
+		!MySubscription->isownersuperuser;
 
-	/* Note that the superuser_arg call can access the DB */
 	CommitTransactionCommand();
 
 	SpinLockAcquire(>relmutex);
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 597947410f..f0e202928a 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3942,7 +3942,8 @@ maybe_reread_subscription(void)
 	 * The launcher will start a new worker but note that the parallel apply
 	 * worker won't restart if the streaming option's value is changed from
 	 * 'parallel' to any other value or the server decides not to stream the
-	 * in-progress transaction.
+	 * in-progress transaction. Exit if the owner of the subscription has
+	 * changed from superuser to a non-superuser.
 	 */
 	if (strcmp(newsub->conninfo, MySubscription->conninfo) != 0 ||
 		strcmp(newsub->name, MySubscription->name) != 0 ||
@@ -3952,7 +3953,8 @@ maybe_reread_subscription(void)
 		newsub->passwordrequired != MySubscription->passwordrequired ||
 		strcmp(newsub->origin, MySubscription->origin) != 0 ||
 		newsub->owner != MySubscription->owner ||
-		!equal(newsub->publications, MySubscription->publications))
+		!equal(newsub->publications, MySubscription->publications) ||
+		(!newsub->isownersuperuser && 

Re: pg_upgrade --check fails to warn about abstime

2023-09-26 Thread Suraj Kharage
On Fri, Sep 22, 2023 at 4:44 PM Alvaro Herrera 
wrote:

> On 2023-Sep-21, Tom Lane wrote:
>
> > Bruce Momjian  writes:
>
> > > Wow, I never added code to pg_upgrade to check for that, and no one
> > > complained either.
> >
> > Yeah, so most people had indeed listened to warnings and moved away
> > from those datatypes.  I'm inclined to think that adding code for this
> > at this point is a bit of a waste of time.
>
> The migrations from versions prior to 12 have not stopped yet, and I did
> receive a complaint about it.  Because the change is so simple, I'm
> inclined to patch it anyway, late though it is.
>
> I decided to follow Tristan's advice to add the version number as a
> parameter to the new function; this way, the knowledge of where was what
> dropped is all in the callsite and none in the function.  It
> looked a bit schizoid otherwise.
>

yeah, looks good to me.


>
> --
> Álvaro Herrera   48°01'N 7°57'E  —
> https://www.EnterpriseDB.com/
> "Postgres is bloatware by design: it was built to house
>  PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)
>


-- 
Thanks & Regards,
Suraj Kharage,
edbpostgres.com


Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Amit Kapila
On Wed, Sep 27, 2023 at 10:26 AM Michael Paquier  wrote:
>
> On Wed, Sep 27, 2023 at 10:15:24AM +0530, Amit Kapila wrote:
> > It's like that from the beginning. Now, even if we want to move, your
> > suggestion is not directly related to this patch as we are just
> > changing one field, and that too to fix a bug. We should start a
> > separate thread to gather a broader consensus if we want to move the
> > exposed structure to an internal file.
>
> As you wish.
>

Thanks.

>
>  You are planning to take care of the patches 0001 and
> 0002 posted on this thread, I guess?
>

I have tested and reviewed
v2-0001-Maintain-publish_no_origin-in-output-plugin-priv and
v2-PG16-0001-Maintain-publish_no_origin-in-output-plugin-priva patches
posted in the email [1]. I'll push those unless there are more
comments on them. I have briefly looked at
v2-0002-Move-in_streaming-to-output-private-data in the same email [1]
but didn't think about it in detail (like whether there is any live
bug that can be fixed or is just an improvement). If you want to
review and commit v2-0002-Move-in_streaming-to-output-private-data, I am
fine with that.

[1] - 
https://www.postgresql.org/message-id/OS0PR01MB57164B085332DB677DBFA8E994C3A%40OS0PR01MB5716.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.




Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Michael Paquier
On Wed, Sep 27, 2023 at 10:15:24AM +0530, Amit Kapila wrote:
> It's like that from the beginning. Now, even if we want to move, your
> suggestion is not directly related to this patch as we are just
> changing one field, and that too to fix a bug. We should start a
> separate thread to gather a broader consensus if we want to move the
> exposed structure to an internal file.

As you wish.  You are planning to take care of the patches 0001 and
0002 posted on this thread, I guess?
--
Michael




RE: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Zhijie Hou (Fujitsu)
On Wednesday, September 27, 2023 12:45 PM Amit Kapila 
> 
> On Wed, Sep 27, 2023 at 9:46 AM Michael Paquier 
> wrote:
> >
> > On Wed, Sep 27, 2023 at 09:39:19AM +0530, Amit Kapila wrote:
> > > On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier 
> wrote:
> > >> Err, actually, I am going to disagree here for the patch of HEAD.
> > >> It seems to me that there is zero need for pgoutput.h and we don't
> > >> need to show PGOutputData to the world.  The structure is internal
> >> to pgoutput.c and used only by its internal static routines.
> > >
> > > Do you disagree with the approach for the PG16 patch or HEAD? You
> > > mentioned HEAD but your argument sounds like you disagree with a
> > > different approach for PG16.
> >
> > Only HEAD where the structure should be moved from pgoutput.h to
> > pgoutput.c, IMO.
> >
> 
> It's like that from the beginning. Now, even if we want to move, your
> suggestion is not directly related to this patch as we are just changing one 
> field,
> and that too to fix a bug. We should start a separate thread to gather a 
> broader
> consensus if we want to move the exposed structure to an internal file.

While searching the code, I noticed one Postgres fork where PGOutputData is
used in other places. Although it's a separate fork, it seems better to
discuss the removal separately.

[1] 
https://github.com/Tencent/TBase/blob/7cf7f8afbcab7290538ad5e65893561710be3dfa/src/backend/replication/walsender.c#L100

Best Regards,
Hou zj


Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Amit Kapila
On Wed, Sep 27, 2023 at 9:46 AM Michael Paquier  wrote:
>
> On Wed, Sep 27, 2023 at 09:39:19AM +0530, Amit Kapila wrote:
> > On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier  wrote:
> >> Err, actually, I am going to disagree here for the patch of HEAD.  It
> >> seems to me that there is zero need for pgoutput.h and we don't need
> >> to show PGOutputData to the world.  The structure is internal to
> >> Pgoutput.c and used only by its internal static routines.
> >
> > Do you disagree with the approach for the PG16 patch or HEAD? You
> > mentioned HEAD but your argument sounds like you disagree with a
> > different approach for PG16.
>
> Only HEAD where the structure should be moved from pgoutput.h to
> pgoutput.c, IMO.
>

It's like that from the beginning. Now, even if we want to move, your
suggestion is not directly related to this patch as we are just
changing one field, and that too to fix a bug. We should start a
separate thread to gather a broader consensus if we want to move the
exposed structure to an internal file.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2023-09-26 Thread Peter Smith
Here are some more review comments for the patch v19-0002.

This is a WIP; these review comments are all for the file slotsync.c.

==
src/backend/replication/logical/slotsync.c

1. wait_for_primary_slot_catchup

+ WalRcvExecResult *res;
+ TupleTableSlot *slot;
+ Oid slotRow[1] = {LSNOID};
+ StringInfoData cmd;
+ bool isnull;
+ XLogRecPtr restart_lsn;
+
+ for (;;)
+ {
+ int rc;

I could not see a reason why 'rc' is declared within the loop while
none of the other local variables are. Personally, I'd declare all
variables at the deepest scope (e.g. inside the for loop).

~~~

2. get_local_synced_slot_names

+/*
+ * Get list of local logical slot names which are synchronized from
+ * primary and belongs to one of the DBs passed in.
+ */
+static List *
+get_local_synced_slot_names(Oid *dbids)
+{

IIUC, this function gets called only from the drop_obsolete_slots()
function. But I thought this list of local slot names (i.e. for the
dbids that this worker is handling) could perhaps be initialized one
time for the worker, instead of being re-calculated every single time
the slot processing/dropping happens. Isn't the current code expending
too much effort recalculating over and over while giving back the same
list every time?

~~~

3. get_local_synced_slot_names

+ for (int i = 0; i < max_replication_slots; i++)
+ {
+ ReplicationSlot *s = >replication_slots[i];
+
+ /* Check if it is logical synchronized slot */
+ if (s->in_use && SlotIsLogical(s) && s->data.synced)
+ {
+ for (int j = 0; j < MySlotSyncWorker->dbcount; j++)
+ {

Loop variables are not declared in the common PG code way.

~~~

4. slot_exists_locally

+static bool
+slot_exists_locally(List *remote_slots, ReplicationSlot *local_slot,
+ bool *locally_invalidated)
+{
+ ListCell   *cell;
+
+ foreach(cell, remote_slots)
+ {
+ RemoteSlot *remote_slot = (RemoteSlot *) lfirst(cell);
+
+ if (strcmp(remote_slot->name, NameStr(local_slot->data.name)) == 0)
+ {
+ /*
+ * if remote slot is marked as non-conflicting (i.e. not
+ * invalidated) but local slot is marked as invalidated, then set
+ * the bool.
+ */
+ if (!remote_slot->conflicting &&
+ SlotIsLogical(local_slot) &&
+ local_slot->data.invalidated != RS_INVAL_NONE)
+ *locally_invalidated = true;
+
+ return true;
+ }
+ }
+
+ return false;
+}

Why is there a SlotIsLogical(local_slot) check buried in this
function? How is slot_exists_locally() getting called with a
non-logical local_slot? Shouldn't that have been screened out long
before here?

~~~

5. use_slot_in_query

+static bool
+use_slot_in_query(char *slot_name, Oid *dbids)

There are multiple non-standard for-loop variable declarations in this function.

~~~

6. compute_naptime

+ * The first slot managed by each worker is chosen for monitoring purpose.
+ * If the lsn of that slot changes during each sync-check time, then the
+ * nap time is kept at regular value of WORKER_DEFAULT_NAPTIME_MS.
+ * When no lsn change is observed for WORKER_INACTIVITY_THRESHOLD_MS
+ * time, then the nap time is increased to WORKER_INACTIVITY_NAPTIME_MS.
+ * This nap time is brought back to WORKER_DEFAULT_NAPTIME_MS as soon as
+ * lsn change is observed.

6a.
/regular value/the regular value/

/for WORKER_INACTIVITY_THRESHOLD_MS time/within the threshold period
(WORKER_INACTIVITY_THRESHOLD_MS)/

~

6b.
/as soon as lsn change is observed./as soon as another lsn change is observed./

~~~

7.
+ * The caller is supposed to ignore return-value of 0. The 0 value is returned
+ * for the slots other that slot being monitored.
+ */
+static long
+compute_naptime(RemoteSlot *remote_slot)

This rule about returning 0 seems hacky to me. IMO a better API would
be to pass long *naptime, which this function either updates or leaves
alone, depending on whether this is the "monitored" slot. Knowing the
current naptime is also useful to improve the function logic (see the
next review comment below).

Also, since this function is really only toggling naptime between 2
values, it would be helpful to assert that

Assert(*naptime == WORKER_DEFAULT_NAPTIME_MS || *naptime ==
WORKER_INACTIVITY_NAPTIME_MS);

~~~

8.
+ if (NameStr(MySlotSyncWorker->monitoring_info.slot_name)[0] == '\0')
+ {
+ /*
+ * First time, just update the name and lsn and return regular
+ * nap time. Start comparison from next time onward.
+ */
+ strcpy(NameStr(MySlotSyncWorker->monitoring_info.slot_name),
+remote_slot->name);

I wasn't sure why it was necessary to identify the "monitoring" slot
by name. Why doesn't the compute_naptime just get called only for the
1st slot found in the tuple loop instead of all the strcmp business
trying to match monitor names?

And, if the monitored slot gets "dropped", then so what; next time
another slot will be the first tuple so will automatically take its
place, right?

~~~

9.
+ /*
+ * If new received lsn (remote one) is different from what we have in
+ * our local slot, then update last_update_time.
+ */
+ if 

RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-26 Thread Hayato Kuroda (Fujitsu)
Dear Bharath,

Thank you for reviewing!

> Thanks for the new patch. Here's a comment on v46:
> 
> 1.
> +Datum
> +binary_upgrade_validate_wal_logical_end(PG_FUNCTION_ARGS
> +{ oid => '8046', descr => 'for use by pg_upgrade',
> +  proname => 'binary_upgrade_validate_wal_logical_end', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'bool',
> +  proargtypes => 'pg_lsn',
> +  prosrc => 'binary_upgrade_validate_wal_logical_end' },
> 
> I think this patch can avoid catalog changes by turning
> binary_upgrade_validate_wal_logical_end a FRONTEND-only function
> sitting in xlogreader.c after making InitXLogReaderState(),
> ReadNextXLogRecord() FRONTEND-friendly (replace elog/ereport with
> pg_fatal or such). With this change and back-porting of commit
> e0b2eed0 to save logical slots at shutdown, the patch can help support
> upgrading logical replication slots on PG versions < 17.

Hmm, I think your suggestion may be questionable.

If we implement the upgrading function as FRONTEND-only (I have not checked its
feasibility), it means pg_upgrade would use the latest version's WAL reader API
to read WAL from an old-version cluster, which I don't think is advisable.

Each WAL page header has a magic number, XLOG_PAGE_MAGIC, which identifies the
WAL format. The value has been changed over time as the WAL contents changed,
and some functions require that the magic number match the expected value;
e.g., the startup process and the pg_walinspect functions require that.
Typically XLogReaderValidatePageHeader() ensures the equality.

Now, since some functions are ported from pg_walinspect, the upgrading function
carries the same restriction. I think we should not relax that restriction, as
it verifies the integrity of the files. The following are the call stacks of
the ported functions down to XLogReaderValidatePageHeader().

```
InitXLogReaderState()
XLogFindNextRecord()
ReadPageInternal()
XLogReaderValidatePageHeader()
```

```
ReadNextXLogRecord()
XLogReadRecord()
XLogReadAhead()
XLogDecodeNextRecord()
ReadPageInternal()
XLogReaderValidatePageHeader()
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Michael Paquier
On Wed, Sep 27, 2023 at 09:39:19AM +0530, Amit Kapila wrote:
> On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier  wrote:
>> Err, actually, I am going to disagree here for the patch of HEAD.  It
>> seems to me that there is zero need for pgoutput.h and we don't need
>> to show PGOutputData to the world.  The structure is internal to
>> pgoutput.c and used only by its internal static routines.
> 
> Do you disagree with the approach for the PG16 patch or HEAD? You
> mentioned HEAD but your argument sounds like you disagree with a
> different approach for PG16.

Only HEAD where the structure should be moved from pgoutput.h to
pgoutput.c, IMO.  The proposed patch for PG16 is OK as the size of the
structure should not change in a branch already released.
--
Michael




Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Amit Kapila
On Wed, Sep 27, 2023 at 9:10 AM Michael Paquier  wrote:
>
> On Tue, Sep 26, 2023 at 01:55:10PM +, Zhijie Hou (Fujitsu) wrote:
> > On Tuesday, September 26, 2023 4:40 PM Amit Kapila 
> >  wrote:
> >> Do we really need a new parameter in above structure? Can't we just use the
> >> existing origin in the same structure? Please remember if this needs to be
> >> backpatched then it may not be good idea to add new parameter in the
> >> structure but apart from that having two members to represent similar
> >> information doesn't seem advisable to me. I feel for backbranch we can 
> >> just use
> >> PGOutputData->origin for comparison and for HEAD, we can remove origin
> >> and just use a boolean to avoid any extra cost for comparisions for each
> >> change.
> >
> > OK, I agree to remove the origin string on HEAD and we can add that back
> > when we support other origin value. I also modified to use the string for 
> > comparison
> > as suggested for back-branch. I will also test it locally to confirm it 
> > doesn't affect
> > the perf.
>
> Err, actually, I am going to disagree here for the patch of HEAD.  It
> seems to me that there is zero need for pgoutput.h and we don't need
> to show PGOutputData to the world.  The structure is internal to
> pgoutput.c and used only by its internal static routines.
>

Do you disagree with the approach for the PG16 patch or HEAD? You
mentioned HEAD but your argument sounds like you disagree with a
different approach for PG16.

-- 
With Regards,
Amit Kapila.




Re: pg_upgrade and logical replication

2023-09-26 Thread Michael Paquier
On Tue, Sep 26, 2023 at 09:40:48AM +0530, Amit Kapila wrote:
> On Mon, Sep 25, 2023 at 11:43 AM Michael Paquier  wrote:
>> Sure, that's assuming that the publisher side is upgraded.
> 
> At some point, user needs to upgrade publisher and subscriber could
> itself have some publications defined which means the downstream
> subscribers will have the same problem.

Not always.  I take it as a valid case that one may want to create a
logical setup only for the sake of an upgrade, then discard the
publisher after a failover to an upgraded subscriber node, once the
latter has synced up the data added to the relations tracked by the
publications while the subscriber was being pg_upgrade'd.

>>> This is the primary reason why I prioritized to work on the publisher
>>> side before getting this patch done, otherwise, the solution for this
>>> patch was relatively clear. I am not sure but I guess this could be
>>> the reason why originally we left it in the current state, otherwise,
>>> restoring subscription rel's or origin doesn't seem to be too much of
>>> an additional effort than what we are doing now.
>>
>> By "additional effort", you are referring to what the patch is doing,
>> with the binary dump of pg_subscription_rel, right?
>>
> 
> Yes.

Okay.  I'd like to move on with this stuff, then.  At least it helps
in maintaining data integrity when doing an upgrade with a logical
setup.  The patch still needs more polishing, though..
--
Michael




Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Michael Paquier
On Tue, Sep 26, 2023 at 01:55:10PM +, Zhijie Hou (Fujitsu) wrote:
> On Tuesday, September 26, 2023 4:40 PM Amit Kapila  
> wrote:
>> Do we really need a new parameter in above structure? Can't we just use the
>> existing origin in the same structure? Please remember if this needs to be
>> backpatched then it may not be good idea to add new parameter in the
>> structure but apart from that having two members to represent similar
>> information doesn't seem advisable to me. I feel for backbranch we can just 
>> use
>> PGOutputData->origin for comparison and for HEAD, we can remove origin
>> and just use a boolean to avoid any extra cost for comparisions for each
>> change.
> 
> OK, I agree to remove the origin string on HEAD and we can add that back
> when we support other origin value. I also modified to use the string for 
> comparison
> as suggested for back-branch. I will also test it locally to confirm it 
> doesn't affect
> the perf.

Err, actually, I am going to disagree here for the patch of HEAD.  It
seems to me that there is zero need for pgoutput.h and we don't need
to show PGOutputData to the world.  The structure is internal to
pgoutput.c and used only by its internal static routines.

Doing a codesearch in the Debian repos or just github shows that it is
used nowhere else, as well, something not really surprising as the
structure is filled and maintained in the file.
--
Michael




Re: Index AmInsert Parameter Confused?

2023-09-26 Thread jacktby jacktby



> 2023年9月27日 00:45,Matthias van de Meent  写道:
> 
> On Tue, 26 Sept 2023 at 18:38, jacktby jacktby  wrote:
>> 
>> typedef bool (*aminsert_function) (Relation indexRelation,
>>  Datum *values,
>>  bool *isnull,
>>  ItemPointer heap_tid,
>>  Relation heapRelation,
>>  IndexUniqueCheck checkUnique,
>>  bool indexUnchanged,
>>  struct IndexInfo *indexInfo);
>> 
>> Why is there a heap_tid, We haven’t inserted the value, so where does it 
>> from ?
> 
> Index insertion only happens after the TableAM tuple has been
> inserted. As indexes refer to locations in the heap, this TID contains
> the TID of the table tuple that contains the indexed values, so that
> the index knows which tuple to refer to.
> 
> Note that access/amapi.h describes only index AM APIs; it does not
> cover the table AM APIs described in access/tableam.h
> 
> Kind regards,
> 
> Matthias van de Meent
1. Thanks. So if we insert a tuple into a table which has an index on it, pg
will insert the tuple into the heap first, and then give the heap TID from the
heap to the index AM API, right?
2. I'm trying to implement a new index, but I only need the data held in the
index table; I hope it's not inserted into the heap, because all the data I
want can be in the index table.



Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-26 Thread Peter Smith
On Tue, Sep 26, 2023 at 11:57 PM vignesh C  wrote:
>
> On Tue, 26 Sept 2023 at 13:03, Peter Smith  wrote:
> >
> > Here are some comments for patch v2-0001.
> >
> > ==
> > src/backend/replication/logical/worker.c
> >
> > 1. maybe_reread_subscription
> >
> > ereport(LOG,
> > (errmsg("logical replication worker for subscription \"%s\"
> > will restart because of a parameter change",
> > MySubscription->name)));
> >
> > Is this really a "parameter change" though? It might be a stretch to
> > call the user role change a subscription parameter change. Perhaps
> > this case needs its own LOG message?
>
> When I was doing this change the same thought had come to my mind too
> but later I noticed that in case of owner change there was no separate
> log message. Since superuser check is somewhat similar to owner
> change, I felt no need to make any change for this.
>

Yeah, I had seen the same already before my comment. Anyway, YMMV.

> > ==
> > src/include/catalog/pg_subscription.h
> >
> > 2.
> >   char*origin; /* Only publish data originating from the
> >   * specified origin */
> > + bool isownersuperuser; /* Is subscription owner superuser? */
> >  } Subscription;
> >
> >
> > Is a new Subscription field member really necessary? The Subscription
> > already has 'owner' -- why doesn't function
> > maybe_reread_subscription() just check:
> >
> > (!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner))
>
> We need the new variable so that we store this value when the worker
> is started and check the current value with the value that was when
> the worker was started. Since we need the value of the superuser
> status when the worker is started, I feel this variable is required.
>

OK. In that case, then shouldn't the patch replace the other
superuser_arg() code already in function run_apply_worker() to make
use of this variable? Otherwise, there are 2 ways of getting the same
information.

==
src/test/subscription/t/027_nosuperuser.pl

I am suspicious that something may be wrong with either the code or
the test because while experimenting, I accidentally found that even
if I *completely* remove the important change below, the TAP test will
still pass anyway.

- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ (!newsub->isownersuperuser && MySubscription->isownersuperuser))

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Fail hard if xlogreader.c fails on out-of-memory

2023-09-26 Thread Noah Misch
On Wed, Sep 27, 2023 at 11:06:37AM +1300, Thomas Munro wrote:
> On Tue, Sep 26, 2023 at 8:38 PM Michael Paquier  wrote:
> > Thoughts and/or comments are welcome.
> 
> I don't have an opinion yet on your other thread about making this
> stuff configurable for replicas, but for the simple crash recovery
> case shown here, hard failure makes sense to me.

> Recycled pages can't fool us into making a huge allocation any more.
> If xl_tot_len implies more than one page but the next page's
> xlp_pageaddr is too low, then either the xl_tot_len you read was
> recycled garbage bits, or it was legitimate but the overwrite of the
> following page didn't make it to disk; either way, we don't have a
> record, so we have an end-of-wal condition.  The xlp_rem_len check
> defends against the second page making it to disk while the first one
> still contains recycled garbage where the xl_tot_len should be*.
> 
> What Michael wants to do now is remove the 2004-era assumption that
> malloc failure implies bogus data.  It must be pretty unlikely in a 64
> bit world with overcommitted virtual memory, but a legitimate
> xl_tot_len can falsely end recovery and lose data, as reported from a
> production case analysed by his colleagues.  In other words, we can
> actually distinguish between lack of resources and recycled bogus
> data, so why treat them the same?

Indeed.  Hard failure is fine, and ENOMEM=end-of-WAL definitely isn't.

> *A more detailed analysis would talk about sectors (page header is
> atomic)

I think the page header is atomic on POSIX-compliant filesystems but not
atomic on ext4.  That doesn't change the conclusion on $SUBJECT.




Re: Could not run generate_unaccent_rules.py script when update unicode

2023-09-26 Thread Japin Li


On Wed, 27 Sep 2023 at 08:03, Michael Paquier  wrote:
> On Tue, Sep 26, 2023 at 10:43:40AM +0800, Japin Li wrote:
>> # Allow running this even without --with-python
>> PYTHON ?= python
>> 
>> $(srcdir)/unaccent.rules: generate_unaccent_rules.py ../../src/common/unicode/UnicodeData.txt Latin-ASCII.xml
>> 	$(PYTHON) $< --unicode-data-file $(word 2,$^) --latin-ascii-file $(word 3,$^) >$@
>> 
>> It uses python to run generate_unaccent_rules.py.  However, the ?=
>> operator in Makefile only checks whether the variable is defined, not
>> whether it is empty.  Since PYTHON is defined in src/Makefile.global,
>> it ends up empty here when building without --with-python.
>
> I am not sure that many people run this script frequently so that may
> not be worth adding a check for a defined, still empty or incorrect

Yeah, not frequently; however, it is already used (by me, at least).  Since we
provide this functionality, why not make it better?

> value, but..  If you were to change the Makefile we use in this path,
> how are you suggesting to change it?

I provided a patch at the bottom of [1].  Attached here again.

diff --git a/contrib/unaccent/Makefile b/contrib/unaccent/Makefile
index 652a3e774c..3ff49ba1e9 100644
--- a/contrib/unaccent/Makefile
+++ b/contrib/unaccent/Makefile
@@ -26,7 +26,9 @@ endif
 update-unicode: $(srcdir)/unaccent.rules
 
 # Allow running this even without --with-python
-PYTHON ?= python
+ifeq ($(PYTHON),)
+PYTHON = python
+endif
 
$(srcdir)/unaccent.rules: generate_unaccent_rules.py ../../src/common/unicode/UnicodeData.txt Latin-ASCII.xml
	$(PYTHON) $< --unicode-data-file $(word 2,$^) --latin-ascii-file $(word 3,$^) >$@

[1] 
https://www.postgresql.org/message-id/meyp282mb1669f86c0dc7b4dc48489cb0b6...@meyp282mb1669.ausp282.prod.outlook.com

-- 
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.




Re: Annoying build warnings from latest Apple toolchain

2023-09-26 Thread Tom Lane
I wrote:
> Since updating to Xcode 15.0, my macOS machines have been
> spitting a bunch of linker-generated warnings.  There
> seems to be one instance of
> ld: warning: -multiply_defined is obsolete
> for each loadable module we link ...

I poked into this a little more.  We started using "-multiply_defined
suppress" in commit 9df308697 of 2004-07-13, which was in the OS X 10.3
era.  I failed to find any specific discussion of that switch in our
archives, but the commit message suggests that I probably stole it
from a patch the Fink project was carrying.

Googling finds some non-authoritative claims that "-multiply_defined"
has been a no-op since OS X 10.9 (Mavericks).  I don't have anything
older than 10.15 to check, but removing it on 10.15 does not seem
to cause any problems.

So I think we can safely just remove this switch from Makefile.shlib.
The meson build process isn't invoking it either I think.

The other thing will take a bit more work ...

regards, tom lane




Re: Could not run generate_unaccent_rules.py script when update unicode

2023-09-26 Thread Michael Paquier
On Tue, Sep 26, 2023 at 10:43:40AM +0800, Japin Li wrote:
> # Allow running this even without --with-python
> PYTHON ?= python
> 
> $(srcdir)/unaccent.rules: generate_unaccent_rules.py ../../src/common/unicode/UnicodeData.txt Latin-ASCII.xml
> 	$(PYTHON) $< --unicode-data-file $(word 2,$^) --latin-ascii-file $(word 3,$^) >$@
> 
> It uses python to run generate_unaccent_rules.py.  However, the ?=
> operator in Makefile only checks whether the variable is defined, not
> whether it is empty.  Since PYTHON is defined in src/Makefile.global,
> it ends up empty here when building without --with-python.

I am not sure that many people run this script frequently so that may
not be worth adding a check for a defined, still empty or incorrect
value, but..  If you were to change the Makefile we use in this path,
how are you suggesting to change it?
--
Michael




Re: XLog size reductions: Reduced XLog record header size for PG17

2023-09-26 Thread Michael Paquier
On Mon, Sep 25, 2023 at 07:40:00PM +0200, Matthias van de Meent wrote:
> On Wed, 20 Sept 2023 at 07:06, Michael Paquier  wrote:
>>  #define COPY_HEADER_FIELD(_dst, _size)\
>>  do {\
>> -if (remaining < _size)\
>> +if (remaining < (_size))\
>>  goto shortdata_err;\
>>
>> There are a couple of stylistic changes like this one, that I guess
>> could just use their own patch to make these macros easier to use.
> 
> They actually fix complaints of my IDE, but are otherwise indeed stylistic.

Oh, OK.  I just use an old-school terminal, but no objections in
changing these if they make life easier for some hackers.  Still, that
feels independent of what you are proposing here.
--
Michael




Re: Correct the documentation for work_mem

2023-09-26 Thread Bruce Momjian
On Wed, Sep 27, 2023 at 02:05:44AM +1300, David Rowley wrote:
> On Tue, 12 Sept 2023 at 03:03, Bruce Momjian  wrote:
> >
> > On Mon, Sep 11, 2023 at 10:02:55PM +1200, David Rowley wrote:
> > > It's certainly not a show-stopper. I do believe the patch makes some
> > > improvements.  The reason I'd prefer to see either "and" or "and/or"
> > > in place of "or" is because the text is trying to imply that many of
> > > these operations can run at the same time. I'm struggling to
> > > understand why, given that there could be many sorts and many hashes
> > > going on at once that we'd claim it could only be one *or* the other.
> > > If we have 12 sorts and 4 hashes then that's not "several sort or hash
> > > operations", it's "several sort and hash operations".  Of course, it
> > > could just be sorts or just hashes, so "and/or" works fine for that.
> >
> > Yes, I see your point and went with "and",   updated patch attached.
> 
> Looks good to me.

Patch applied back to Postgres 11.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: [PoC/RFC] Multiple passwords, interval expirations

2023-09-26 Thread Jeff Davis
On Mon, 2023-09-25 at 00:31 -0700, Gurjeet Singh wrote:

> Please see attached v4 of the patch. The patch takes care of rebase
> to
> the master/17-devel branch, and includes some changes, too.

FWIW I got some failures applying. I didn't investigate much, and
instead I looked at your git branch (7a35619e).

> Moreover, before the patch, in case of CheckPasswordAuth(), the error
> (if any) would have been thrown _after_ network communication done by
> sendAuthRequest() call. But with this patch, the error is thrown
> before the network interaction, hence this changes the order of
> network interaction and the error message. This may have security
> implications, too, but I'm unable to articulate one right now.

You mean before v3 or before v4? Is this currently a problem in v4?

> Open question: If a client is capable of providing just md5 passwords
> handshake, and because of pg_hba.conf setting, or because the role
> has
> at least one SCRAM password (essentially the 3rd case you mention
> above: pg_hba md5 + md5 and scram pws -> scram), the server will
> respond with a SASL/SCRAM authentication response, and that would
> break the backwards compatibility and will deny access to the client.
> Does this make it necessary to use a newer libpq/client library?

Perhaps you can try the MD5 passwords first, and only if they fail,
move on to try scram passwords?

> Comments?

IIUC, for the case of multiple scram passwords, we use the salt to
select the right scram password, and then proceed from there?

I'm not very excited about the idea of naming passwords, or having
passwords with default names. I can't think of anything better right
now, so it might be OK.

> - Add tests
> - Add/update documentation

These are needed to provide better review.


Regards,
Jeff Davis





Re: pg_rewind with cascade standby doesn't work well

2023-09-26 Thread Michael Paquier
On Tue, Sep 26, 2023 at 06:44:50PM +0300, Aleksander Alekseev wrote:
>> Also, I'm not sure what kind of tests I should write to verify this
>> fix.  Would you mind giving me some advice?
> 
> Personally I would prefer not to increase the scope of work. Your TAP
> test added in 0001 seems to be adequate.

Yeah, agreed.  I'm OK with what you are proposing, basically (the
test could be made a bit cheaper actually).

 /*
- * For Hot Standby, we could treat this like a Shutdown Checkpoint,
- * but this case is rarer and harder to test, so the benefit doesn't
- * outweigh the potential extra cost of maintenance.
+ * For Hot Standby, we could treat this like an end-of-recovery checkpoint
  */
+RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT);

I don't understand what you want to change here.  Archive recovery and
crash recovery are two different things, still this code would be
triggered even if there is no standby.signal, aka the node is not a
standby.  Why isn't this stuff conditional?

>> BTW, I was able to
>> reproduce the assertion failure Kuwamura-san reported, even after applying
>> your latest patch from the thread.
> 
> Do you mean that the test fails or it doesn't but there are other
> steps to reproduce the issue?

I get it as Fujii-san testing the patch from [1], still failing the
test from [2]:
[1]: https://www.postgresql.org/message-id/zarvomifjze7f...@paquier.xyz
[2]: 
https://www.postgresql.org/message-id/camyc8qrye7mkyvpvghct5gpanamp8ss_trbraqxcpbx14vi...@mail.gmail.com

I would be surprised, actually, because the patch from [1] would cause
step 7 of the test to fail: the patch causes standby.signal or
recovery.signal to be required.  Anyway, this specific issue, if any,
had better be discussed on the other thread.  I need to address a few
comments there as well and was planning to get back to it.  It is
possible that I've missed something on the other thread with the
restrictions I was proposing in the latest version of the patch.

For this thread, let's focus on the pg_rewind case and how we want to
treat these records to improve the cascading case.
--
Michael




Re: Fail hard if xlogreader.c fails on out-of-memory

2023-09-26 Thread Michael Paquier
On Wed, Sep 27, 2023 at 11:06:37AM +1300, Thomas Munro wrote:
> I don't have an opinion yet on your other thread about making this
> stuff configurable for replicas, but for the simple crash recovery
> case shown here, hard failure makes sense to me.

Also, if we conclude that we're OK with just failing hard all the time
for crash recovery and archive recovery on OOM, the other patch is not
really required.  That would be disruptive for standbys in some cases,
still perhaps OK in the long-term.  I am wondering if people have lost
data because of this problem on production systems, actually..  It
would not be possible to know that it happened until you see a page on
disk that has a somewhat valid LSN, still an LSN older than the
position currently being inserted, and that could show up in various
forms.  Even that could get hidden quickly if WAL is written at a fast
pace after a crash recovery.  A standby promotion at an older LSN
would be unlikely, as monitoring solutions discard standbys that lag
behind by N bytes.

> *A more detailed analysis would talk about sectors (page header is
> atomic), and consider whether we're only trying to defend ourselves
> against recycled pages written by PostgreSQL (yes), arbitrary random
> data (no, but it's probably still pretty good) or someone trying to
> trick us (no, and we don't stand a chance).

WAL would not be the only part of the system that would get borked if
arbitrary bytes can be inserted into what's read from disk, random or
not.
--
Michael




Re: Is this a problem in GenericXLogFinish()?

2023-09-26 Thread Jeff Davis
On Wed, 2023-09-27 at 00:14 +0300, Heikki Linnakangas wrote:
> Looks correct. You now loop through all the block IDs three times, 
> however. I wonder if that is measurably slower, but even if it's not,
> was there a reason you wanted to move the XLogRegisterBuffer() calls
> to 
> a separate loop?

I did so to correspond more closely to what's outlined in the README
and in other places in the code, where marking the buffers dirty
happens before XLogBeginInsert(). It didn't occur to me that one extra
loop would matter, but I can combine them again.

It would be a bit more concise to do the XLogBeginInsert() first (like
before) and then register the buffers in the same loop that does the
writes and marks the buffers dirty. Updated patch attached.

> 
> Hmm, I'm sure there are exceptions but log_newpage_range() actually 
> seems to be doing the right thing; it calls MarkBufferDirty() before 
> XLogInsert(). It only calls it after XLogRegisterBuffer() though, and
> I 
> concur that XLogRegisterBuffer() would be the logical place for the 
> assertion. We could tighten this up, require that you call 
> MarkBufferDirty() before XLogRegisterBuffer(), and fix all the
> callers.

That site is pretty trivial to fix, but there are also a couple places
in hash.c and hashovfl.c that are registering a clean page and it's not
clear to me exactly what's going on.

Regards,
Jeff Davis

From 9548bb49865db3a04bbdda4c63df611142e22964 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Tue, 26 Sep 2023 12:10:11 -0700
Subject: [PATCH v2] Fix GenericXLogFinish().

Mark the buffers dirty before writing WAL.

Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c0...@iki.fi
---
 src/backend/access/transam/generic_xlog.c | 53 ++-
 1 file changed, 23 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c
index 6c68191ca6..2d5ff4aa7c 100644
--- a/src/backend/access/transam/generic_xlog.c
+++ b/src/backend/access/transam/generic_xlog.c
@@ -347,6 +347,10 @@ GenericXLogFinish(GenericXLogState *state)
 
 		START_CRIT_SECTION();
 
+		/*
+		 * Compute deltas if necessary, write changes to buffers, mark
+		 * buffers dirty, and register changes.
+		 */
 		for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++)
 		{
			PageData   *pageData = &state->pages[i];
@@ -359,41 +363,31 @@ GenericXLogFinish(GenericXLogState *state)
 			page = BufferGetPage(pageData->buffer);
 			pageHeader = (PageHeader) pageData->image;
 
+			/* delta not needed for a full page image */
+			if (!(pageData->flags & GENERIC_XLOG_FULL_IMAGE))
+computeDelta(pageData, page, (Page) pageData->image);
+
+			/*
+			 * Apply the image, being careful to zero the "hole" between
+			 * pd_lower and pd_upper in order to avoid divergence between
+			 * actual page state and what replay would produce.
+			 */
+			memcpy(page, pageData->image, pageHeader->pd_lower);
+			memset(page + pageHeader->pd_lower, 0,
+   pageHeader->pd_upper - pageHeader->pd_lower);
+			memcpy(page + pageHeader->pd_upper,
+   pageData->image + pageHeader->pd_upper,
+   BLCKSZ - pageHeader->pd_upper);
+
+			MarkBufferDirty(pageData->buffer);
+
 			if (pageData->flags & GENERIC_XLOG_FULL_IMAGE)
 			{
-/*
- * A full-page image does not require us to supply any xlog
- * data.  Just apply the image, being careful to zero the
- * "hole" between pd_lower and pd_upper in order to avoid
- * divergence between actual page state and what replay would
- * produce.
- */
-memcpy(page, pageData->image, pageHeader->pd_lower);
-memset(page + pageHeader->pd_lower, 0,
-	   pageHeader->pd_upper - pageHeader->pd_lower);
-memcpy(page + pageHeader->pd_upper,
-	   pageData->image + pageHeader->pd_upper,
-	   BLCKSZ - pageHeader->pd_upper);
-
 XLogRegisterBuffer(i, pageData->buffer,
    REGBUF_FORCE_IMAGE | REGBUF_STANDARD);
 			}
 			else
 			{
-/*
- * In normal mode, calculate delta and write it as xlog data
- * associated with this page.
- */
-computeDelta(pageData, page, (Page) pageData->image);
-
-/* Apply the image, with zeroed "hole" as above */
-memcpy(page, pageData->image, pageHeader->pd_lower);
-memset(page + pageHeader->pd_lower, 0,
-	   pageHeader->pd_upper - pageHeader->pd_lower);
-memcpy(page + pageHeader->pd_upper,
-	   pageData->image + pageHeader->pd_upper,
-	   BLCKSZ - pageHeader->pd_upper);
-
 XLogRegisterBuffer(i, pageData->buffer, REGBUF_STANDARD);
 XLogRegisterBufData(i, pageData->delta, pageData->deltaLen);
 			}
@@ -402,7 +396,7 @@ GenericXLogFinish(GenericXLogState *state)
 		/* Insert xlog record */
 		lsn = XLogInsert(RM_GENERIC_ID, 0);
 
-		/* Set LSN and mark buffers dirty */
+		/* Set LSN */
 		for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++)
 		{
			PageData   *pageData = &state->pages[i];
@@ -410,7 +404,6 @@ GenericXLogFinish(GenericXLogState *state)
 			if 

Re: Document that server will start even if it's unable to open some TCP/IP ports

2023-09-26 Thread Bruce Momjian
On Tue, Sep 12, 2023 at 05:25:44PM -0700, Gurjeet Singh wrote:
> On Fri, Sep 8, 2023 at 7:52 AM Bruce Momjian  wrote:
> >
> > On Thu, Sep  7, 2023 at 09:21:07PM -0700, Nathan Bossart wrote:
> > > On Thu, Sep 07, 2023 at 07:13:44PM -0400, Bruce Momjian wrote:
> > > > On Thu, Sep  7, 2023 at 02:54:13PM -0700, Nathan Bossart wrote:
> > > >> IMO the phrase "open a port" is kind of nonstandard.  I think we 
> > > >> should say
> > > >> something along the lines of
> > > >>
> > > >>If listen_addresses is not empty, the server will start only if it 
> > > >> can
> > > >>listen on at least one of the specified addresses.  A warning will 
> > > >> be
> > > >>emitted for any addresses that the server cannot listen on.
> > > >
> > > > Good idea, updated patch attached.
> > >
> > > I still think we should say "listen on an address" instead of "open a
> > > port," but otherwise it LGTM.
> >
> > Agreed, I never liked the "port" mention.  I couldn't figure how to get
> > "open" out of the warning sentence though.  Updated patch attached.
> 
> LGTM.

Patch applied back to PG 11.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: pg_upgrade instructions involving "rsync --size-only" might lead to standby corruption?

2023-09-26 Thread Bruce Momjian
On Thu, Sep  7, 2023 at 01:52:45PM -0400, Bruce Momjian wrote:
> On Mon, Jul 10, 2023 at 02:37:24PM -0700, Nikolay Samokhvalov wrote:
> > Maybe. It will require changes in other parts of this doc.
> > Thinking (here: 
> > https://gitlab.com/postgres/postgres/-/merge_requests/18/diffs)
> > 
> > Meanwhile, attached is v2
> > 
> > thanks for the comments
> 
> I looked over this issue thoroughly and I think I see the cause of the
> confusion.  In step 8 we say:
> 
>   8. Stop both servers
>   Streaming replication and log-shipping standby servers can remain
>  ---
>   running until a later step.
> 
> Of course this has to be "must" and it would be good to explain why,
> which I have done in the attached patch.
> 
> Secondly, in step 9 we say "verify the LSNs", but have a parenthetical
> sentence that explains why they might not match:
> 
>   (There will be a mismatch if old standby servers were shut down before
>   the old primary or if the old standby servers are still running.)
> 
> People might take that to mean that it is okay if this is the reason
> they don't match, which is incorrect.  Better to tell them to keep the
> streaming replication and log-shipping servers running so we don't need
> that sentence.
> 
> The instructions are already long so I am hesitant to add more text
> without a clear purpose.

Patch applied back to PG 11.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Annoying build warnings from latest Apple toolchain

2023-09-26 Thread Tom Lane
Since updating to Xcode 15.0, my macOS machines have been
spitting a bunch of linker-generated warnings.  There
seems to be one instance of

ld: warning: -multiply_defined is obsolete

for each loadable module we link, and some program links complain

ld: warning: ignoring duplicate libraries: '-lpgcommon', '-lpgport'

You can see these in the build logs for both sifaka [1] and
indri [2], so MacPorts isn't affecting it.

I'd held out some hope that this was a transient problem due to
the underlying OS still being Ventura, but I just updated another
machine to shiny new Sonoma (14.0), and it's still doing it.
Guess we gotta do something about it.

We used to need "-multiply_defined suppress" to suppress other
linker warnings.  I tried removing that from the Makefile.shlib
recipes for Darwin, and those complaints go away while no new
ones appear, so that's good --- but I wonder whether slightly
older toolchain versions will still want it.

As for the duplicate libraries, yup guilty as charged, but
I think we were doing that intentionally to satisfy some other
old toolchains.  I wonder whether removing the duplication
will create new problems.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sifaka=2023-09-26%2021%3A09%3A01=build
[2] 
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=indri=2023-09-26%2021%3A42%3A17=build




Re: Fail hard if xlogreader.c fails on out-of-memory

2023-09-26 Thread Thomas Munro
On Tue, Sep 26, 2023 at 8:38 PM Michael Paquier  wrote:
> Thoughts and/or comments are welcome.

I don't have an opinion yet on your other thread about making this
stuff configurable for replicas, but for the simple crash recovery
case shown here, hard failure makes sense to me.

Here are some interesting points in the history of this topic:

1999 30659d43: xl_len is 16 bit, fixed size buffer in later commits
2001 7d4d5c00: WAL files recycled, xlp_pageaddr added
2004 0ffe11ab: xl_len is 32 bit, dynamic buffer, malloc() failure ends redo
2005 21fda22e: xl_tot_len and xl_len co-exist
2014 2c03216d: xl_tot_len fully replaces xl_len
2018 70b4f82a: xl_tot_len > 1GB ends redo
2023 8fcb32db: don't let xl_tot_len > 1GB be logged!
2023 bae868ca: check next xlp_pageaddr, xlp_rem_len before allocating

Recycled pages can't fool us into making a huge allocation any more.
If xl_tot_len implies more than one page but the next page's
xlp_pageaddr is too low, then either the xl_tot_len you read was
recycled garbage bits, or it was legitimate but the overwrite of the
following page didn't make it to disk; either way, we don't have a
record, so we have an end-of-wal condition.  The xlp_rem_len check
defends against the second page making it to disk while the first one
still contains recycled garbage where the xl_tot_len should be*.

What Michael wants to do now is remove the 2004-era assumption that
malloc failure implies bogus data.  It must be pretty unlikely in a 64
bit world with overcommitted virtual memory, but a legitimate
xl_tot_len can falsely end recovery and lose data, as reported from a
production case analysed by his colleagues.  In other words, we can
actually distinguish between lack of resources and recycled bogus
data, so why treat them the same?

For comparison, if you run out of disk space during recovery we don't
say "oh well, that's enough redoing for today, the computer is full,
let's forget about the rest of the WAL and start accepting new
transactions!".  The machine running recovery has certain resource
requirements relative to the machine that generated the WAL, and if
they're not met it just can't do it.  It's the same if it various
other allocations fail.  The new situation is that we are now
verifying that xl_tot_len was actually logged by PostgreSQL, so if we
can't allocate space for it, we can't replay the WAL.

*A more detailed analysis would talk about sectors (page header is
atomic), and consider whether we're only trying to defend ourselves
against recycled pages written by PostgreSQL (yes), arbitrary random
data (no, but it's probably still pretty good) or someone trying to
trick us (no, and we don't stand a chance).




Re: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)

2023-09-26 Thread Tom Lane
Tomas Vondra  writes:
> Hmmm, I got to install BETA2 yesterday, but I still se the tcl failure:

Huh.  I'm baffled as to what's up there.  Is it possible that this is
actually a hardware-based difference?  I didn't think there was much
difference between Pi 3B and Pi 4, but we're running out of other
explanations.

> I wonder what's the difference between the systems ... All I did was
> writing the BETA2 image to SD card, and install a couple packages:

I reinstalled BETA3, since that's out now, but see no change in
behavior.

I did discover that plperl works for me after adding --with-openssl
to the configure options.  Not sure if it's worth digging any further
than that.

regards, tom lane




Re: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-26 Thread Karl O. Pinc
Sep 26, 2023 1:10:55 PM Tom Lane :

> "Karl O. Pinc"  writes:
>> For the last hunk you'd change around "anything".  Write:
>> "... it will be truncated to less than NAMEDATALEN characters and
>> the bytes of the string which are not printable ASCII characters ...".
>
>> Notice that I have also changed "that" to "which" just above. 
>> I _think_ this is better English.
>
> No, I'm pretty sure you're mistaken.  It's been a long time since
> high school English, but the way I think this works is that "that"
> introduces a restrictive clause, which narrows the scope of what
> you are saying.  That is, you say "that" when you want to talk
> about only the bytes of the string that aren't ASCII.  But "which"
> introduces a non-restrictive clause that adds information or
> commentary.  If you say "bytes of the string which are not ASCII",
> you are effectively making a side assertion that no byte of the
> string is ASCII.  Which is not the meaning you want here.

Makes sense to me.  "That" it is.

Thanks for the help. I never would have figured that out.





Re: [PATCH] pgrowlocks: Make mode names consistent with docs

2023-09-26 Thread Bruce Momjian
On Thu, Sep  7, 2023 at 12:58:29PM -0400, Bruce Momjian wrote:
> You are right something is wrong.  However, I looked at your patch and I
> am thinking we need to go the other way and add "For" in the upper
> block, rather than removing it in the lower one.  I have two reasons. 
> Looking at the code block:
> 
> case MultiXactStatusUpdate:
>   snprintf(buf, NCHARS, "Update");
>   break;
> case MultiXactStatusNoKeyUpdate:
>   snprintf(buf, NCHARS, "No Key Update");
>   break;
> case MultiXactStatusForUpdate:
>   snprintf(buf, NCHARS, "For Update");
>   break;
> case MultiXactStatusForNoKeyUpdate:
>   snprintf(buf, NCHARS, "For No Key Update");
>   break;
> case MultiXactStatusForShare:
>   snprintf(buf, NCHARS, "Share");
>   break;
> case MultiXactStatusForKeyShare:
>   snprintf(buf, NCHARS, "Key Share");
>   break;
> 
> You will notice there are "For" and non-"For" versions of "Update" and
> "No Key Update".  Notice that "For" appears in the macro names for the
> "For" macro versions of update, but "For" does not appear in the "Share"
> and "Key Share" versions, though the macro has "For".
> 
> Second, notice that the "For" and non-"For" match in the block below
> that, which suggests it is correct, and the non-"For" block is later:
> 
> values[Atnum_modes] = palloc(NCHARS);
> if (infomask & HEAP_XMAX_LOCK_ONLY)
> {
>   if (HEAP_XMAX_IS_SHR_LOCKED(infomask))
>   snprintf(values[Atnum_modes], NCHARS, "{For Share}");
>   else if (HEAP_XMAX_IS_KEYSHR_LOCKED(infomask))
>   snprintf(values[Atnum_modes], NCHARS, "{For Key Share}");
>   else if (HEAP_XMAX_IS_EXCL_LOCKED(infomask))
>   {
>   if (tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED)
>   snprintf(values[Atnum_modes], NCHARS, "{For Update}");
>   else
>   snprintf(values[Atnum_modes], NCHARS, "{For No Key Update}");
>   }
>   else
>   /* neither keyshare nor exclusive bit it set */
>   snprintf(values[Atnum_modes], NCHARS,
>"{transient upgrade status}");
> }
> else
> {
>   if (tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED)
>   snprintf(values[Atnum_modes], NCHARS, "{Update}");
>   else
>   snprintf(values[Atnum_modes], NCHARS, "{No Key Update}");
> }
> 
> I therefore suggest this attached patch, which should be marked as an
> incompatibility in PG 17.

Patch applied to master.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Better help output for pgbench -I init_steps

2023-09-26 Thread Peter Eisentraut

On 22.09.23 22:01, Tristen Raab wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hello,

I've reviewed all 4 of your patches; each one applies and builds correctly.


I think I prefer variant 2.  Currently, we only have 8 steps, so it might
be overkill to separate them out into a different option.


+1 to this from Peter. Variant 2 is nicely formatted with lots of information, 
which I feel better solves the problem this patch is trying to address.


Committed variant 2.  I just changed the initial capitalization of the 
sentences to be more consistent with the surrounding text.






Re: Is this a problem in GenericXLogFinish()?

2023-09-26 Thread Heikki Linnakangas

On 26/09/2023 22:32, Jeff Davis wrote:

On Mon, 2023-09-25 at 13:04 +0300, Heikki Linnakangas wrote:

Yes, that's a problem.


Patch attached. I rearranged the code a bit to follow the expected
pattern of: write, mark dirty, WAL, set LSN. I think computing the
deltas could also be moved earlier, outside of the critical section,
but I'm not sure that would be useful.


Looks correct. You now loop through all the block IDs three times, 
however. I wonder if that is measurably slower, but even if it's not, 
was there a reason you wanted to move the XLogRegisterBuffer() calls to 
a separate loop?



Do you have a suggestion for any kind of test addition, or should we
just review carefully?


I wish we had an assertion for that. XLogInsert() could assert that
the page is already marked dirty, for example.


Unfortunately that's not always the case, e.g. log_newpage_range().


Hmm, I'm sure there are exceptions but log_newpage_range() actually 
seems to be doing the right thing; it calls MarkBufferDirty() before 
XLogInsert(). It only calls it after XLogRegisterBuffer() though, and I 
concur that XLogRegisterBuffer() would be the logical place for the 
assertion. We could tighten this up, require that you call 
MarkBufferDirty() before XLogRegisterBuffer(), and fix all the callers.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Obsolete reference to pg_relation in comment

2023-09-26 Thread Bruce Momjian
On Thu, Sep  7, 2023 at 10:44:25AM +0200, Daniel Gustafsson wrote:
> > On 6 Sep 2023, at 21:13, Bruce Momjian  wrote:
> > On Wed, Jul 26, 2023 at 05:14:08PM -0400, Tom Lane wrote:
> 
> >> I think we should reword this to just generically claim that holding
> >> the Relation reference open for the whole transaction reduces overhead.
> > 
> > How is this attached patch?
> 
> Reads good to me, +1.

Patch applied to master.  I didn't think backpatching it made much sense
since it is so localized.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Add const qualifiers

2023-09-26 Thread David Steele

On 9/26/23 06:34, Peter Eisentraut wrote:

On 09.09.23 21:03, David Steele wrote:

On 9/1/23 11:39, David Steele wrote:

Hackers,

I noticed that there was a mismatch between the const qualifiers for 
excludeDirContents in src/backend/backup/basebackup.c and 
src/bin/pg_rewind/file_map.c and that led me to use ^static 
const.*\*.*= to do a quick search for similar cases.


I think at the least we should make excludeDirContents match, but the 
rest of the changes seem like a good idea as well.


Added to 2023-11 CF.


committed


Thank you, Peter!




Re: Is this a problem in GenericXLogFinish()?

2023-09-26 Thread Jeff Davis
On Mon, 2023-09-25 at 13:04 +0300, Heikki Linnakangas wrote:
> Yes, that's a problem.

Patch attached. I rearranged the code a bit to follow the expected
pattern of: write, mark dirty, WAL, set LSN. I think computing the
deltas could also be moved earlier, outside of the critical section,
but I'm not sure that would be useful.

Do you have a suggestion for any kind of test addition, or should we
just review carefully?

> I wish we had an assertion for that. XLogInsert() could assert that
> the 
> page is already marked dirty, for example.

Unfortunately that's not always the case, e.g. log_newpage_range().

Regards,
Jeff Davis

From c54749a7939721f0a838115abce09967216cbc0f Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Tue, 26 Sep 2023 12:10:11 -0700
Subject: [PATCH v1] Fix GenericXLogFinish().

Mark the buffers dirty before writing WAL.

Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c0...@iki.fi
---
 src/backend/access/transam/generic_xlog.c | 66 ++++++++++++++++++++++++++++++++++--------------------------------
 1 file changed, 34 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c
index 6c68191ca6..2b7e054ebe 100644
--- a/src/backend/access/transam/generic_xlog.c
+++ b/src/backend/access/transam/generic_xlog.c
@@ -343,10 +343,12 @@ GenericXLogFinish(GenericXLogState *state)
 	if (state->isLogged)
 	{
 		/* Logged relation: make xlog record in critical section. */
-		XLogBeginInsert();
-
 		START_CRIT_SECTION();
 
+		/*
+		 * Compute deltas if necessary, write changes to buffers, and mark
+		 * them dirty.
+		 */
 		for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++)
 		{
 			PageData   *pageData = &state->pages[i];
@@ -359,41 +361,42 @@ GenericXLogFinish(GenericXLogState *state)
 			page = BufferGetPage(pageData->buffer);
 			pageHeader = (PageHeader) pageData->image;
 
+			/* delta not needed for a full page image */
+			if (!(pageData->flags & GENERIC_XLOG_FULL_IMAGE))
+				computeDelta(pageData, page, (Page) pageData->image);
+
+			/*
+			 * Apply the image, being careful to zero the "hole" between
+			 * pd_lower and pd_upper in order to avoid divergence between
+			 * actual page state and what replay would produce.
+			 */
+			memcpy(page, pageData->image, pageHeader->pd_lower);
+			memset(page + pageHeader->pd_lower, 0,
+				   pageHeader->pd_upper - pageHeader->pd_lower);
+			memcpy(page + pageHeader->pd_upper,
+				   pageData->image + pageHeader->pd_upper,
+				   BLCKSZ - pageHeader->pd_upper);
+
+			MarkBufferDirty(pageData->buffer);
+		}
+
+		XLogBeginInsert();
+
+		/* Register buffers */
+		for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++)
+		{
+			PageData   *pageData = &state->pages[i];
+
+			if (BufferIsInvalid(pageData->buffer))
+				continue;
+
 			if (pageData->flags & GENERIC_XLOG_FULL_IMAGE)
 			{
-				/*
-				 * A full-page image does not require us to supply any xlog
-				 * data.  Just apply the image, being careful to zero the
-				 * "hole" between pd_lower and pd_upper in order to avoid
-				 * divergence between actual page state and what replay would
-				 * produce.
-				 */
-				memcpy(page, pageData->image, pageHeader->pd_lower);
-				memset(page + pageHeader->pd_lower, 0,
-					   pageHeader->pd_upper - pageHeader->pd_lower);
-				memcpy(page + pageHeader->pd_upper,
-					   pageData->image + pageHeader->pd_upper,
-					   BLCKSZ - pageHeader->pd_upper);
-
 				XLogRegisterBuffer(i, pageData->buffer,
 								   REGBUF_FORCE_IMAGE | REGBUF_STANDARD);
 			}
 			else
 			{
-				/*
-				 * In normal mode, calculate delta and write it as xlog data
-				 * associated with this page.
-				 */
-				computeDelta(pageData, page, (Page) pageData->image);
-
-				/* Apply the image, with zeroed "hole" as above */
-				memcpy(page, pageData->image, pageHeader->pd_lower);
-				memset(page + pageHeader->pd_lower, 0,
-					   pageHeader->pd_upper - pageHeader->pd_lower);
-				memcpy(page + pageHeader->pd_upper,
-					   pageData->image + pageHeader->pd_upper,
-					   BLCKSZ - pageHeader->pd_upper);
-
 				XLogRegisterBuffer(i, pageData->buffer, REGBUF_STANDARD);
 				XLogRegisterBufData(i, pageData->delta, pageData->deltaLen);
 			}
@@ -402,7 +405,7 @@ GenericXLogFinish(GenericXLogState *state)
 		/* Insert xlog record */
 		lsn = XLogInsert(RM_GENERIC_ID, 0);
 
-		/* Set LSN and mark buffers dirty */
+		/* Set LSN */
 		for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++)
 		{
 			PageData   *pageData = &state->pages[i];
@@ -410,7 +413,6 @@ GenericXLogFinish(GenericXLogState *state)
 			if (BufferIsInvalid(pageData->buffer))
 				continue;
 			PageSetLSN(BufferGetPage(pageData->buffer), lsn);
-			MarkBufferDirty(pageData->buffer);
 		}
 		END_CRIT_SECTION();
 	}
-- 
2.34.1



Re: [PATCH] Add inline comments to the pg_hba_file_rules view

2023-09-26 Thread Jim Jones

Hi!

On 26.09.23 15:19, Peter Eisentraut wrote:

On 04.09.23 11:54, Jim Jones wrote:
This patch proposes adding a "comment" column to the pg_hba_file_rules 
view. It basically parses the inline comment (if any) of a valid 
pg_hba.conf entry and displays it in the new column.


For such pg_hba entries ...

host db jim 127.0.0.1/32 md5 # foo
host db jim 127.0.0.1/32 md5 #bar
host db jim 127.0.0.1/32 md5 # #foo#


I'm skeptical about this.

First, there are multiple commenting styles.  The end-of-line style is 
less common in my experience, because pg_hba.conf lines tend to 
be long. Another style is


# foo
host db jim 127.0.0.1/32 md5
# bar
host db jim 127.0.0.1/32 md5

or even as a block

# foo and bar
host db jim 127.0.0.1/32 md5
host db jim 127.0.0.1/32 md5

Another potential problem is that maybe people don't want their 
comments leaked out of the file.  Who knows what they have written in 
there.


I also considered this for a while. That's why I suggested only inline 
comments. On a second thought, I agree that making only certain types of 
comments "accessible" by the pg_hba_file_rules view can be misleading 
and can possibly leak sensitive info if misused.




I think we should leave file comments be file comments.  If we want 
some annotations to be exported to higher-level views, we should make 
that an intentional and explicit separate feature.


My first suggestion [1] was to use a different character (other than 
'#'), but a good point was made, that it would add more complexity to 
the hba.c, which is already complex enough.
My main motivation with this feature is to be able to annotate pg_hba 
entries in a way that they can be read using the pg_hba_file_rules view 
via SQL - these annotations might contain information like tags, client 
(application) names or any relevant info regarding the granted access. 
This info would help me to generate some reports that contain client 
access information. I can sort of achieve something similar using 
pg_read_file(),[2] but I thought it would be nice to have it directly 
from the view.


Do you think that this feature is in general not a good idea? Or perhaps 
a different annotation method would address your concerns?


Thank you very much for looking into it!
Jim



1- 
https://www.postgresql.org/message-id/flat/3fec6550-93b0-b542-b203-b0054aaee...@uni-muenster.de
2- 
https://www.postgresql.org/message-id/b63625ca-580f-14dc-7e7c-f90cd4d95cf7%40uni-muenster.de





Re: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-26 Thread Tom Lane
"Karl O. Pinc"  writes:
> For the last hunk you'd change around "anything".  Write:
> "... it will be truncated to less than NAMEDATALEN characters and
> the bytes of the string which are not printable ASCII characters ...".

> Notice that I have also changed "that" to "which" just above.  
> I _think_ this is better English.

No, I'm pretty sure you're mistaken.  It's been a long time since
high school English, but the way I think this works is that "that"
introduces a restrictive clause, which narrows the scope of what
you are saying.  That is, you say "that" when you want to talk
about only the bytes of the string that aren't ASCII.  But "which"
introduces a non-restrictive clause that adds information or
commentary.  If you say "bytes of the string which are not ASCII",
you are effectively making a side assertion that no byte of the
string is ASCII.  Which is not the meaning you want here.

A smell test that works for native speakers (not sure how helpful
it is for others) is: if the sentence would read well with commas
or parens added before and after the clause, then it's probably
non-restrictive and should use "which".  If it looks wrong that way
then it's a restrictive clause and should use "that".

regards, tom lane




Re: Questions about the new subscription parameter: password_required

2023-09-26 Thread Robert Haas
On Tue, Sep 26, 2023 at 1:00 PM Jeff Davis  wrote:
> As I said earlier, I think the best thing to do is to just have a
> section that describes when to use password_required, what specific
> things you should do to satisfy that case, and what caveats you should
> avoid. Something like:
>
>   "If you want to have a subscription using a connection string without
> a password managed by a non-superuser, then: [ insert SQL steps here ].
> Warning: if the connection string doesn't contain a password, make sure
> to set password_required=false before transferring ownership, otherwise
> it will start failing."
>
> Documenting the behavior is good, too, but I find the behavior
> difficult to document, so examples will help.

Yeah, I think something like that could make sense, with an
appropriate amount of word-smithing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Eager page freeze criteria clarification

2023-09-26 Thread Robert Haas
On Tue, Sep 26, 2023 at 11:11 AM Andres Freund  wrote:
> That I'd like you to expand on "using the RedoRecPtr of the latest checkpoint
> rather than the LSN of the previous vacuum." - I can think of ways of doing so
> that could end up with quite different behaviour...

Yeah, me too. I'm not sure what is best.

> As long as the most extreme cases are prevented, freezing unnecessarily is imo
> far less harmful than freezing too little.
>
> I'm worried that using something as long as 100-200%
> time-between-recent-checkpoints won't handle insert-mostly workloads well,
> which IME are also workloads suffering quite badly under our current scheme -
> and which are quite common.

I wrote about this problem in my other reply and I'm curious as to
your thoughts about it. Basically, suppose we forget about all of
Melanie's tests except for three cases: (1) an insert-only table, (2)
an update-heavy workload with uniform distribution, and (3) an
update-heavy workload with skew. In case (1), freezing is good. In
case (2), freezing is bad. In case (3), freezing is good for cooler
pages and bad for hotter ones. I postulate that any
recency-of-modification threshold that handles (1) well will handle
(2) poorly, and that the only way to get both right is to take some
other factor into account. You seem to be arguing that we can just
freeze aggressively in case (2) and it won't cost much, but it doesn't
sound to me like Melanie believes that and I don't think I do either.

> > This doesn't seem completely stupid, but I fear it would behave
> > dramatically differently on a workload a little smaller than s_b vs.
> > one a little larger than s_b, and that doesn't seem good.
>
> Hm. I'm not sure that that's a real problem. In the case of a workload bigger
> than s_b, having to actually read the page again increases the cost of
> freezing later, even if the workload is just a bit bigger than s_b.

That is true, but I don't think it means that there is no problem. It
could lead to a situation where, for a while, a table never needs any
significant freezing, because we always freeze aggressively. When it
grows large enough, we suddenly stop freezing it aggressively, and now
it starts experiencing vacuums that do a whole bunch of work all at
once. A user who notices that is likely to be pretty confused about
what happened, and maybe not too happy when they find out.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-26 Thread Karl O. Pinc
On Tue, 26 Sep 2023 13:40:26 +
"Hayato Kuroda (Fujitsu)"  wrote:

> Your effort is quite helpful for me.

You're welcome.

> Before replying to your comments, I thought I should show the difference
> between versions. Regarding old versions (here PG15 was used),
> non-ASCII characters (like Japanese) are replaced with '?'.
> 
> ```
> psql (15.4)
> Type "help" for help.
> 
> postgres=# SET application_name TO 'あああ';
> SET
> postgres=# SHOW application_name ;
>  application_name 
> --
>  ?
> (1 row)
> ```
> 
> As for HEAD, as my patch said, non-ASCII characters are replaced
> with hexadecimal representations. (Is my terminology correct?)
> 
> ```
> psql (17devel)
> Type "help" for help.
> 
> postgres=# SET application_name TO 'あああ';
> SET
> postgres=# SHOW application_name ;
>application_name   
> --
>  \xe3\x81\x82\xe3\x81\x82\xe3\x81\x82
> (1 row)
> ```

I think your terminology is correct, but see my
suggestion at bottom.

Never hurts to have output to look at.  I noticed here
and when reading the patch that changed the output --
the patch is operating on bytes, not characters per se.

> > First, it is now best practice in the PG docs to
> > put a line break at the end of each sentence.
> > At least for the sentences on the lines you change.
> > (No need to update the whole document!)  Please
> > do this in the next version of your patch.  I don't
> > know if this is a requirement for acceptance by
> > a committer, but it won't hurt.  
> 
> I didn't know that. Could you please tell me if you have a source?

I thought I could find an email but the search is taking
forever.  If I find something I'll let you know.
I could even be mis-remembering, but it's a nice
practice regardless.

There are a number of undocumented conventions.
Another is the use of gender neutral language.

The place to discuss doc conventions/styles would
be the pgsql-docs list.  (If you felt like
asking there.)

You could try submitting another patch to add various
doc conventions to the official docs at
https://www.postgresql.org/docs/current/docguide-style.html
:-)

> Anyway, I put a line break after each sentence for now.

Thanks.

A related thing that's nice to have is to limit the line
length of the documentation source to 80 characters or less.
79 is probably best.  Since the source text around your patch
conforms to this convention you should also.

> IIUC the word "Postgres" cannot be used in the doc.

I think you're right.

> Based on all your comments, I changed it as below. What do you think?
> 
> ```
> Characters that are not printable ASCII, like
> \x03, are replaced with the
> PostgreSQL <link
> linkend="sql-syntax-strings-escape">C-style escaped hexadecimal byte
> value</link>.
> ```

> 
> > Since the both of you have looked and confirmed that the
> > actual behavior matches the new documentation I have not
> > done this.  
> 
> I showed the result again, please see.

Should the committer be interested, your patch applies cleanly
and the docs build as expected.

Also, based on the comments in the
patch which changed the system's behavior, I believe that
your patch updates all the relevant places in the documentation.

> > But, have either of you checked that we're really talking about
> > replacing everything outside the 7-bit ASCII encodings?
> > My reading of the commit referenced in the first email of this
> > thread says that it's everything outside of the _printable_
> > ASCII encodings, ASCII values outside of the range 32 to 127,
> > inclusive.
> > 
> > Please check.  The docs should probably say "printable ASCII",
> > or "non-printable ASCII", depending.  I think the meaning
> > of "printable ASCII" is widely enough known to be 32-127.
> > So "printable" is good enough, right?  
> 
> For me, "non-printable ASCII" sounds like control characters. So
> I used the sentence "Characters that are not printable ASCII ... are
> replaced with...". Please tell me if you have a better explanation.

Your explanation sounds good to me.

I now think that you should consider another change to your wording.
Instead of starting with "Characters that are not printable ASCII ..."
consider writing "The bytes of the string which are not printable ASCII
...".  Notice above that characters (e.g. あ) generate output for
each non-ASCII byte (e.g. \xe3\x81\x82).  So my thought is that
the docs should be talking about bytes.

For the last hunk you'd change around "anything".  Write:
"... it will be truncated to less than NAMEDATALEN characters and
the bytes of the string which are not printable ASCII characters ...".

Notice that I have also changed "that" to "which" just above.  
I _think_ this is better English.  See sense 3 of:
https://en.wiktionary.org/wiki/which
See also the first paragraph of:
https://en.wikipedia.org/wiki/Relative_pronoun

If the comments above move you, send another patch.  Seems to me
we're close to sending this on to a committer.

Regards,

Karl 
Free 

Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-26 Thread Ranier Vilela
On Tue, Sep 26, 2023 at 09:30, Ranier Vilela wrote:

> On Tue, Sep 26, 2023 at 07:34, Ashutosh Bapat
> <ashutosh.bapat@gmail.com> wrote:
>
>> On Tue, Sep 26, 2023 at 3:32 PM David Rowley 
>> wrote:
>> >
>> > find_base_rel() could be made more robust for free by just casting the
>> > relid and simple_rel_array_size to uint32 while checking that relid <
>> > root->simple_rel_array_size.  The 0th element should be NULL anyway,
>> > so "if (rel)" should let relid==0 calls through and allow that to
>> > ERROR still. I see that just changes a "jle" to "jnb" vs adding an
>> > additional jump for Ranier's version. [1]
>>
>> That's a good suggestion.
>>
>> I am fine with find_base_rel() as it is today as well. But
>> future-proofing it seems to be fine too.
>>
>> >
>> > It seems worth not making find_base_rel() more expensive than it is
>> > today as commonly we just reference root->simple_rel_array[n] directly
>> > anyway because it's cheaper. It would be nice if we didn't add further
>> > overhead to find_base_rel() as this would make the case for using
>> > PlannerInfo.simple_rel_array directly even stronger.
>>
>> I am curious, is the overhead in find_base_rel() impacting overall
>> performance?
>>
> It seems to me that it adds a LEA instruction.
> https://godbolt.org/z/b4jK3PErE
>
> Although it doesn't seem like much,
> I believe the solution (casting to unsigned) is better.
> So +1.
>
As suggested, casting is the best option: it does not add any overhead and
improves the robustness of the find_base_rel function.
I propose patch v1, which additionally fixes the
find_base_rel_ignore_join function,
which, despite not appearing in Coverity reports, suffers from the same
problem.

While at it, I also propose a scope reduction,
as well as making the root parameter const, which seems appropriate.

best regards,
Ranier Vilela


v1-0001-Avoid-possible-out-of-bounds-access.patch
Description: Binary data


Re: Questions about the new subscription parameter: password_required

2023-09-26 Thread Jeff Davis
On Tue, 2023-09-26 at 18:21 +0200, Benoit Lobréau wrote:
> On 9/26/23 16:27, Benoit Lobréau wrote:
> > I will try to come up with a documentation patch.
> 
> This is my attempt at a documentation patch.
> 


  +   If the ownership of a subscription with
password_required=true
  +   is transferred to a non-superuser, they will gain full control
over the subscription
  +   but will not be able to modify its connection string.

I think you mean false, right?

  +   If the ownership of a subscription with
password_required=true
  +   has been transferred to a non-superuser, it must be reverted to a
superuser for
  +   the DROP operation to succeed.

That's only needed if the superuser transfers a subscription with
password_required=true to a non-superuser and the connection string
does not contain a password. In that case, the subscription is already
in a failing state, not just for DROP. Ideally we'd have some other
warning in the docs not to do that -- maybe in CREATE and ALTER.

Also, if the subscription is in that kind of failing state, there are
other ways to get out of it as well, like disabling it and setting
connection=none, then dropping it.

The whole thing is fairly delicate. As soon as you work slightly
outside of the intended use, password_required starts causing
unexpected things to happen.

As I said earlier, I think the best thing to do is to just have a
section that describes when to use password_required, what specific
things you should do to satisfy that case, and what caveats you should
avoid. Something like:

  "If you want to have a subscription using a connection string without
a password managed by a non-superuser, then: [ insert SQL steps here ].
Warning: if the connection string doesn't contain a password, make sure
to set password_required=false before transferring ownership, otherwise
it will start failing."

Documenting the behavior is good, too, but I find the behavior
difficult to document, so examples will help.
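
Purely as an illustration of the shape such steps might take — all object
names below are invented and this is not proposed documentation wording:

```sql
-- Run as superuser. The connection string carries no password, so
-- password_required must be false before ownership is transferred.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=app'      -- no password here
    PUBLICATION mypub
    WITH (password_required = false);

ALTER SUBSCRIPTION mysub OWNER TO repl_admin;   -- safe: no password needed
```

Transferring first and setting password_required=false afterwards would hit
the failing state described above.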

Regards,
Jeff Davis





Re: Index AmInsert Parameter Confused?

2023-09-26 Thread Matthias van de Meent
On Tue, 26 Sept 2023 at 18:38, jacktby jacktby  wrote:
>
> typedef bool (*aminsert_function) (Relation indexRelation,
>   Datum *values,
>   bool *isnull,
>   ItemPointer heap_tid,
>   Relation heapRelation,
>   IndexUniqueCheck checkUnique,
>   bool indexUnchanged,
>   struct IndexInfo *indexInfo);
>
> Why is there a heap_tid? We haven't inserted the value yet, so where does it
> come from?

Index insertion only happens after the TableAM tuple has been
inserted. As indexes refer to locations in the heap, this TID contains
the TID of the table tuple that contains the indexed values, so that
the index knows which tuple to refer to.

Note that access/amapi.h describes only index AM APIs; it does not
cover the table AM APIs described in access/tableam.h.

Kind regards,

Matthias van de Meent




Re: False "pg_serial": apparent wraparound” in logs

2023-09-26 Thread Heikki Linnakangas

On 25/08/2023 07:29, Imseih (AWS), Sami wrote:

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 1af41213b4..7e7be3b885 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -992,6 +992,13 @@ SerialSetActiveSerXmin(TransactionId xid)
 
 	serialControl->tailXid = xid;
 
+	/*
+	 * If the SLRU is being used, set the latest page number to
+	 * the current tail xid.
+	 */
+	if (serialControl->headPage > 0)
+		SerialSlruCtl->shared->latest_page_number = SerialPage(serialControl->tailXid);
+
LWLockRelease(SerialSLRULock);
 }


I don't really understand what exactly the problem is, or how this fixes 
it. But this doesn't feel right:


Firstly, isn't headPage == 0 also a valid value? We initialize headPage 
to -1 when it's not in use.


Secondly, shouldn't we set it to the page corresponding to headXid 
rather than tailXid?


Thirdly, I don't think this code should have any business setting 
latest_page_number directly. latest_page_number is set in 
SimpleLruZeroPage(). Are we missing a call to SimpleLruZeroPage() somewhere?


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Questions about the new subscription parameter: password_required

2023-09-26 Thread Benoit Lobréau

On 9/26/23 16:27, Benoit Lobréau wrote:

I will try to come up with a documentation patch.


This is my attempt at a documentation patch.

--
Benoit Lobréau
Consultant
http://dalibo.com

From a73baa91032fff37ef039168c276508553830f86 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 26 Sep 2023 18:07:47 +0200
Subject: [PATCH] Doc patch for require_password

Add documentation to ALTER / DROP SUBSCRIPTION regarding non-superuser
subscriptions with require_password=true.
---
 doc/src/sgml/ref/alter_subscription.sgml | 3 +++
 doc/src/sgml/ref/drop_subscription.sgml  | 6 ++
 2 files changed, 9 insertions(+)

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..0bbe7e2335 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -51,6 +51,9 @@ ALTER SUBSCRIPTION name RENAME TO <
to alter the owner, you must be able to SET ROLE to the
new owning role. If the subscription has
password_required=false, only superusers can modify it.
+   If the ownership of a subscription with password_required=true
+   is transferred to a non-superuser, they will gain full control over the subscription
+   but will not be able to modify its connection string.
   
 
   
diff --git a/doc/src/sgml/ref/drop_subscription.sgml b/doc/src/sgml/ref/drop_subscription.sgml
index 2a67bdea91..8ec743abd0 100644
--- a/doc/src/sgml/ref/drop_subscription.sgml
+++ b/doc/src/sgml/ref/drop_subscription.sgml
@@ -102,6 +102,12 @@ DROP SUBSCRIPTION [ IF EXISTS ] nameDROP
SUBSCRIPTION cannot be executed inside a transaction block.
   
+
+  
+   If the ownership of a subscription with password_required=true
+   has been transferred to a non-superuser, it must be reverted to a superuser for
+   the DROP operation to succeed.
+  
  
 
  
-- 
2.41.0



Re: Remove MSVC scripts from the tree

2023-09-26 Thread Andrew Dunstan


On 2023-09-26 Tu 01:25, NINGWEI CHEN wrote:

On Fri, 22 Sep 2023 10:12:29 +0900
Michael Paquier  wrote:


As of today, I can see that the only buildfarm members relying on
these scripts are bowerbird and hamerkop, so these two would fail if
the patch attached were to be applied today.  I am adding Andrew
D. and the maintainers of hamerkop (SRA-OSS) in CC for comments.

hamerkop is not yet prepared for Meson builds, but we plan to work on this 
support soon.
If we go with Meson builds exclusively right now, we will have to remove 
master/HEAD for a while.

Best Regards.




I don't think we should switch to that until you're ready.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Eager page freeze criteria clarification

2023-09-26 Thread Peter Geoghegan
On Tue, Sep 26, 2023 at 8:19 AM Andres Freund  wrote:
> However, I'm not at all convinced doing this on a system wide level is a good
> idea. Databases do often contain multiple types of workloads at the same
> time. E.g., we want to freeze aggressively in a database that has the bulk of
> its size in archival partitions but has lots of unfrozen data in an active
> partition. And databases have often loads of data that's going to change
> frequently / isn't long lived, and we don't want to super aggressively freeze
> that, just because it's a large portion of the data.

I didn't say that we should always have most of the data in the
database frozen, though. Just that we can reasonably be more lazy
about freezing the remainder of pages if we observe that most pages
are already frozen. How they got that way is another discussion.

I also think that the absolute amount of debt (measured in physical
units such as unfrozen pages) should be kept under control. But that
isn't something that can ever be expected to work on the basis of a
simple threshold -- if only because autovacuum scheduling just doesn't
work that way, and can't really be adapted to work that way.

-- 
Peter Geoghegan




Re: [PATCH] Add inline comments to the pg_hba_file_rules view

2023-09-26 Thread Greg Sabino Mullane
Also a reluctant -1, as the comment-at-EOL style is very rare in my
experience over the years of seeing many a pg_hba file.


Re: pg_rewind with cascade standby doesn't work well

2023-09-26 Thread Aleksander Alekseev
Hi,

> >> IMO a test is needed that makes sure no one is going to break this in
> >> the future.
> >
> > You definitely need more complex test scenarios for that.  If you can
> > come up with new ways to make the TAP tests of pg_rewind mode modular
> > in handling more complicated node setups, that would be a nice
> > addition, for example.
>
> I'm sorry for lacking tests. For now, I started off with a simple test
> that causes the problem I mentioned. The updated WIP patch 0001 includes
> the new test for pg_rewind.

Many thanks for a quick update.

> And also, I'm afraid that I'm not sure what kind of tests I have to make
> to fix this behavior. Would you mind giving me some advice?

Personally I would prefer not to increase the scope of work. Your TAP
test added in 0001 seems to be adequate.

> BTW, I was able to
> reproduce the assertion failure Kuwamura-san reported, even after applying
> your latest patch from the thread.

Do you mean that the test fails or it doesn't but there are other
steps to reproduce the issue?

-- 
Best regards,
Aleksander Alekseev




Re: XLog size reductions: smaller XLRec block header for PG17

2023-09-26 Thread Aleksander Alekseev
Hi,

> This scheme is reused later for the XLogRecord xl_tot_len field over
> at [0], and FWIW is thus being reused. Sure, it's tailored to this WAL
> use case, but IMO we're getting good value from it. We don't use
> protobuf or JSON for WAL, we use our own serialization format. Having
> some specialized encoding/decoding in that format for certain fields
> is IMO quite acceptable.
>
> > Yes - I proposed that and wrote an implementation of reasonably efficient
> > varint encoding. Here's my prototype:
> > https://postgr.es/m/20221004234952.anrguppx5owewb6n%40awork3.anarazel.de
>
> As I mentioned on that thread, that prototype has a significant
> probability of doing nothing to improve WAL size, or even increasing
> the WAL size for installations which consume a lot of OIDs.
>
> > I think it's a bad tradeoff to write lots of custom varint encodings, just 
> > to
> > eek out a bit more space savings.
>
> This is only a single "custom" varint encoding though, if you can even
> call it that. It makes a field's size depend on flags set in another
> byte, which is not that much different from the existing use of
> XLR_BLOCK_ID_DATA_[LONG, SHORT].
>
> > The increase in code complexity IMO makes it a bad tradeoff.
>
> Pardon me for asking, but what would you consider to be a good
> tradeoff then? I think the code relating to the WAL storage format is
> about as simple as you can get it within the feature set it provides
> and the size of the resulting records. While I think there is still
> much to gain w.r.t. WAL record size, I don't think we can get much of
> those improvements without adding at least some amount of complexity,
> something I think to be true for most components in PostgreSQL.
>
> So, except for redesigning significant parts of the public WAL APIs,
> are we just going to ignore any potential improvements because they
> "increase code complexity"?

Here are my two cents.

I definitely see the value in having reusable varint encoding. This
would probably be a good option for extendable TOAST pointers,
compression dictionaries, and perhaps other patches.

In this particular case however Matthias has good arguments that this
is not the right tool for this particular task, IMO. We don't use
rbtrees for everything that needs a map from x to y. Hash tables have
other compromises. Sometimes one container is a better fit, sometimes
the other, sometimes none and we implement a fine tuned container for
the given case. Same here.

-- 
Best regards,
Aleksander Alekseev
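For readers following along, the general technique under discussion is variable-length integer encoding. A plain unsigned LEB128 codec (deliberately generic; this is neither Andres's prototype nor the flag-dependent encoding Matthias describes) looks roughly like:

```python
def encode_varint(n: int) -> bytes:
    """Unsigned LEB128: 7 payload bits per byte, high bit marks continuation."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(data: bytes) -> tuple[int, int]:
    """Return (value, number of bytes consumed)."""
    result = shift = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, i + 1
        shift += 7
    raise ValueError("truncated varint")
```

This also makes the size objection concrete: values below 128 shrink to one byte, but a 32-bit OID near 2^32 grows to five bytes, which is why a naive varint can increase WAL size for OID-heavy installations.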




Re: SET ROLE documentation improvement

2023-09-26 Thread Yurii Rashkovskii
On Mon, Sep 25, 2023 at 3:09 PM Nathan Bossart 
wrote:

> On Fri, Sep 15, 2023 at 02:36:16PM -0700, Yurii Rashkovskii wrote:
> > On Fri, Sep 15, 2023 at 1:47 PM Nathan Bossart  >
> > wrote:
> >> I think another issue is that the aforementioned note doesn't mention
> the
> >> new SET option added in 3d14e17.
> >
> > How do you think we should word it in that note to make it useful?
>
> Maybe something like this:
>
> The current session user must have the SET option for the specified
> role_name, either directly or indirectly via a chain of memberships
> with the SET option.
>

This is a good start, indeed. I've amended my patch to include it.

-- 
Y.


V2-0001-Minor-improvement-to-SET-ROLE-documentation.patch
Description: Binary data


Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-09-26 Thread Robert Haas
On Mon, Sep 25, 2023 at 1:56 PM Jeff Davis  wrote:
> Do users like Bob do that today? If not, what causes you to expect them
> to do so in the future?

What I would say is that if there's a reasonable way of securing your
stuff and you don't make use of it, that's your problem. If securing
your stuff is unreasonably difficult, that's a product problem. I
think that setting the search_path on your own functions is a basic
precaution that you should take if you are worried about multi-user
security. I do not believe it is realistic to eliminate that
requirement, and if people like Bob don't do that today and can't be
made to do that in the future, then I think it's just hopeless. In
contrast, I think that the precautions that you need to take when
doing anything to a table owned by another user are unreasonably
complex and not very realistic for anyone to take on a routine basis.
Even if you validate that there's nothing malicious before you access
the table, the table owner can change that at any time, so it's very
hard to reliably protect yourself.

In terms of whether people like Bob actually DO do that today, I'd say
probably some do and others don't. I think that the overwhelming
majority of PostgreSQL users simply aren't concerned about multi-user
security. They either have a single user account that is used for
everything, or say one account for the application and another for
interactive access, or they have a bunch of users but basically all of
those people are freely accessing each other's stuff and they're not
really concerned with firewalling them from each other. Among the
small percentage of users who are really concerned with making sure
that users can't get into each others accounts, I would expect that
knowing that you need to control search_path is fairly common, but
it's hard to say. I haven't actually met many such users.

> > I think it's self-evident that a SQL function's behavior is
> > not guaranteed to be invariant under all possible values of
> > search_path.
>
> It's certainly not self-evident in a literal sense. I think you mean
> that it's "obvious" or something, and perhaps that narrow question is,
> but it's also not terribly helpful.
>
> If the important behaviors here were so obvious, how did we end up in
> this mess in the first place?

I feel like this isn't really responsive to the argument that I was
and am making, and I'm worried that we're going down a rat-hole here.

I wondered after reading this whether I had misused the term
self-evident, but when I did a Google search for "self-evident" the
definition that comes up is "not needing to be demonstrated or
explained; obvious."

I am not saying that everyone is going to realize that you probably
ought to be setting search_path on all of your functions in any kind
of multi-user environment, and maybe even in a single-user environment
just to avoid weird failures if you ever change your default
search_path. What I am saying is that if you stop to think about what
search_path does while looking at any SQL function you've ever
written, you should probably realize pretty quickly that the behavior
of your function in search_path-dependent, and indeed that the
behavior of every other SQL function you've ever written is probably
search_path-dependent, too. I think the problem here isn't really that
this is hard to understand, but that many people have not stopped to
think about it.

Maybe it is obvious to you what we ought to do about that, but it is
not obvious to me. As I have said, I think that changing the behavior
of CREATE FUNCTION or CREATE PROCEDURE so that some search_path
control is the default is worth considering. However, I think that
such a change inevitably breaks backward compatibility, and I don't
think we have enough people weighing in on this thread to think that
we can just go do that even if everyone agreed on precisely what was
to be done, and I think it is pretty clear that we do not have
unanimous agreement.

> > Respectfully, I find this position unreasonable, to the point of
> > finding it difficult to take seriously.
>
> Which part exactly is unreasonable?

I interpreted you to be saying that we can't expect people to set
search_path on their functions. And I just don't buy that. We have
made mistakes in that area in PostgreSQL itself and had to fix them
later, and we may make more mistakes again in the future, so if you
think we need better documentation or better defaults, I think you
might be right. But if you think it's a crazy idea for people running
PostgreSQL in multi-user environments to understand that setting
search_path on all of their functions and procedures is essential, I
disagree. They've got to understand that, because it's not that
complicated, and there's no real alternative.

> I think analogies to unix are what caused a lot of the problems we have
> today, because postgres is a very different world. In unix-like
> environments, a file is just a file; in postgres, we have all kinds 

Re: Eager page freeze criteria clarification

2023-09-26 Thread Andres Freund
Hi,

On 2023-09-25 14:16:46 -0700, Peter Geoghegan wrote:
> On Mon, Sep 25, 2023 at 11:45 AM Robert Haas  wrote:
> I'm surprised that there hasn't been any discussion of the absolute
> amount of system-wide freeze debt on this thread. If 90% of the pages
> in the entire database are frozen, it'll generally be okay if we make
> the wrong call by freezing lazily when we shouldn't. This is doubly
> true within small to medium sized tables, where the cost of catching
> up on freezing cannot ever be too bad (concentrations of unfrozen
> pages in one big table are what really hurt users).

I generally agree with the idea of using "freeze debt" as an input - IIRC I
proposed using the number of frozen pages vs number of pages as the input to
the heuristic in one of the other threads about the topic a while back. I also
think we should read a portion of all-visible-but-not-frozen pages during
non-aggressive vacuums to manage the freeze debt.

However, I'm not at all convinced doing this on a system wide level is a good
idea. Databases do often contain multiple types of workloads at the same
time. E.g., we want to freeze aggressively in a database that has the bulk of
its size in archival partitions but has lots of unfrozen data in an active
partition. And databases have often loads of data that's going to change
frequently / isn't long lived, and we don't want to super aggressively freeze
that, just because it's a large portion of the data.

Greetings,

Andres Freund




Re: pg_resetwal tests, logging, and docs update

2023-09-26 Thread Aleksander Alekseev
Hi,

> Hmm, I think I like "where" better.

OK.

> Attached is an updated patch set where I have split the changes into
> smaller pieces.  The last two patches still have some open questions
> about what certain constants mean etc.  The other patches should be settled.

The patches 0001..0005 seem to be ready and rather independent. I
suggest merging them and continue discussing the rest of the patches.

-- 
Best regards,
Aleksander Alekseev




Re: Eager page freeze criteria clarification

2023-09-26 Thread Andres Freund
Hi,

On 2023-09-25 14:45:07 -0400, Robert Haas wrote:
> On Fri, Sep 8, 2023 at 12:07 AM Andres Freund  wrote:
> > > Downthread, I proposed using the RedoRecPtr of the latest checkpoint
>>> rather than the LSN of the previous vacuum. I still like that idea.
> >
> > Assuming that "downthread" references
> > https://postgr.es/m/CA%2BTgmoYb670VcDFbekjn2YQOKF9a7e-kBFoj2WJF1HtH7YPaWQ%40mail.gmail.com
> > could you sketch out the logic you're imagining a bit more?
> 
> I'm not exactly sure what the question is here.

That I'd like you to expand on "using the RedoRecPtr of the latest checkpoint
rather than the LSN of the previous vacuum." - I can think of ways of doing so
that could end up with quite different behaviour...


> > Perhaps we can mix both approaches. We can use the LSN and time of the last
> > vacuum to establish an LSN->time mapping that's reasonably accurate for a
> > relation. For infrequently vacuumed tables we can use the time between
> > checkpoints to establish a *more aggressive* cutoff for freezing then what a
> > percent-of-time-since-last-vacuum appach would provide. If e.g. a table gets
> > vacuumed every 100 hours and checkpoint timeout is 1 hour, no realistic
> > percent-of-time-since-last-vacuum setting will allow freezing, as all dirty
> > pages will be too new. To allow freezing a decent proportion of those, we
> > could allow freezing pages that lived longer than ~20%
> > time-between-recent-checkpoints.
> 
> Yeah, I don't know if that's exactly the right idea, but I think it's
> in the direction that I was thinking about. I'd even be happy with
> 100% of the time-between-recent checkpoints, maybe even 200% of
> time-between-recent checkpoints. But I think there probably should be
> some threshold beyond which we say "look, this doesn't look like it
> gets touched that much, let's just freeze it so we don't have to come
> back to it again later."

As long as the most extreme cases are prevented, unnecessary freezing is imo
far less harmful than freezing too little.

I'm worried that using something as long as 100-200%
time-between-recent-checkpoints won't handle insert-mostly workload well,
which IME are also workloads suffering quite badly under our current scheme -
and which are quite common.


> I think part of the calculus here should probably be that when the
> freeze threshold is long, the potential gains from making it even
> longer are not that much. If I change the freeze threshold on a table
> from 1 minute to 1 hour, I can potentially save uselessly freezing
> that page 59 times per hour, every hour, forever, if the page always
> gets modified right after I touch it. If I change the freeze threshold
> on a table from 1 hour to 1 day, I can only save 23 unnecessary
> freezes per day. Percentage-wise, the overhead of being wrong is the
> same in both cases: I can have as many extra freeze operations as I
> have page modifications, if I pick the worst possible times to freeze
> in every case. But in absolute terms, the savings in the second
> scenario are a lot less. I think if a user is accessing a table
> frequently, the overhead of jamming a useless freeze in between every
> table access is going to be a lot more noticeable then when the table
> is only accessed every once in a while. And I also think it's a lot
> less likely that we'll reliably get it wrong. Workloads that touch a
> page and then touch it again ~N seconds later can exist for all values
> of N, but I bet they're way more common for small values of N than
> large ones.

True. And with larger Ns it also becomes more likely that we'd need to freeze
the rows anyway. I've seen tables being hit with several anti-wraparound vacuums
a day, but not several anti-wraparound vacuums a minute...


> Is there also a need for a similar guard in the other direction? Let's
> say that autovacuum_naptime=15s and on some particular table it
> triggers every time. I've actually seen this on small queue tables. Do
> you think that, in such tables, we should freeze pages that haven't
> been modified in 15s?

I don't think it matters much: proportional to the workload of rewriting
nearly all of the table every few seconds, the overhead of freezing a bunch of
already dirty pages is negligible.


> > Hm, possibly stupid idea: What about using shared_buffers residency as a
> > factor? If vacuum had to read in a page to vacuum it, a) we would need read
> > IO to freeze it later, as we'll soon evict the page via the ringbuffer b)
> > non-residency indicates the page isn't constantly being modified?
> 
> This doesn't seem completely stupid, but I fear it would behave
> dramatically differently on a workload a little smaller than s_b vs.
> one a little larger than s_b, and that doesn't seem good.

Hm. I'm not sure that that's a real problem. In the case of a workload bigger
than s_b, having to actually read the page again increases the cost of
freezing later, even if the workload is just a bit bigger than s_b.
Greetings,
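The back-of-the-envelope numbers quoted above (59 saved freezes per hour versus 23 per day) can be checked mechanically. A worked example only; the thresholds are hypothetical:

```python
# Worst case: a page is modified immediately after every freeze, so at
# most one useless freeze can happen per threshold interval.

SECONDS_PER_DAY = 86400

def max_wasted_freezes_per_day(threshold_seconds: float) -> float:
    return SECONDS_PER_DAY / threshold_seconds

# 1 minute -> 1 hour: save up to 1440 - 24 = 1416 useless freezes/day
# (i.e. 59 per hour), while 1 hour -> 1 day saves only 24 - 1 = 23/day.
saved_minute_to_hour = (max_wasted_freezes_per_day(60)
                        - max_wasted_freezes_per_day(3600))
saved_hour_to_day = (max_wasted_freezes_per_day(3600)
                     - max_wasted_freezes_per_day(SECONDS_PER_DAY))
```

The ratio of waste to useful work is the same in both moves, but the absolute savings shrink by roughly a factor of sixty, which is the point being made.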


Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs

2023-09-26 Thread Nazir Bilal Yavuz
Hi,

On Tue, 26 Sept 2023 at 13:48, Peter Eisentraut  wrote:
>
> On 25.09.23 12:56, Nazir Bilal Yavuz wrote:
> > +  # Only run if a specific OS is not requested and if there are changes in 
> > docs
> > +  # or in the CI files.
> > +  skip: >
> > +$CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:.*' ||
> > +!changesInclude('doc/**',
> > +'.cirrus.yml',
> > +'.cirrus.tasks.yml',
> > +'src/backend/catalog/sql_feature_packages.txt',
> > +'src/backend/catalog/sql_features.txt',
> > +'src/backend/utils/errcodes.txt',
> > +'src/backend/utils/activity/wait_event_names.txt',
> > +
> > 'src/backend/utils/activity/generate-wait_event_types.pl',
> > +'src/include/parser/kwlist.h')
>
> This is kind of annoying.  Now we need to maintain yet another list of
> these dependencies and keep it in sync with the build systems.

I agree.

>
> I think meson can produce a dependency tree from a build.  Maybe we
> could use that somehow and have Cirrus cache it between runs?

I will check that.

>
> Also note that there are also dependencies in the other direction.  For
> example, the psql help is compiled from XML DocBook sources.  So your
> other patch would also need to include similar changesInclude() clauses.
>

If there are more cases like this, it may not be worth it. Instead, we can just:

- Build the docs when the doc related files are changed (This still
creates a dependency like you said).

- Skip CI completely if the README files are changed.

What are your opinions on these?

Regards,
Nazir Bilal Yavuz
Microsoft
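The skip condition being debated boils down to "run the docs task only when some changed file matches a trigger path". A rough Python model of that decision (simplified: a trailing slash stands in for `doc/**`; this is not Cirrus's actual changesInclude semantics, and the pattern list is abbreviated):

```python
# Abbreviated, hypothetical trigger list; a trailing "/" means "anything
# under this directory".
DOC_TRIGGERS = [
    "doc/",
    ".cirrus.yml",
    ".cirrus.tasks.yml",
    "src/backend/catalog/sql_features.txt",
]

def should_run_docs_task(changed_files: list[str]) -> bool:
    for path in changed_files:
        for pat in DOC_TRIGGERS:
            if path == pat or (pat.endswith("/") and path.startswith(pat)):
                return True
    return False
```

The maintenance objection is visible here: every file that feeds the docs build has to be enumerated by hand and kept in sync with the build system.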




Index AmInsert Parameter Confused?

2023-09-26 Thread jacktby jacktby
typedef bool (*aminsert_function) (Relation indexRelation,
                                   Datum *values,
                                   bool *isnull,
                                   ItemPointer heap_tid,
                                   Relation heapRelation,
                                   IndexUniqueCheck checkUnique,
                                   bool indexUnchanged,
                                   struct IndexInfo *indexInfo);

Why is there a heap_tid? We haven't inserted the value yet, so where does it come from?
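To the question above: the tuple has already been inserted into the heap by the time aminsert runs; the heap insertion is what produced heap_tid, and the index entry simply points back at it. A language-neutral toy model of that ordering (invented names, not PostgreSQL code):

```python
class ToyHeap:
    """Stand-in for a heap relation; insert returns the tuple's
    position, the analogue of a TID."""
    def __init__(self):
        self.tuples = []

    def insert(self, row) -> int:
        self.tuples.append(row)
        return len(self.tuples) - 1

class ToyIndex:
    """Stand-in for an index AM; aminsert only records a pointer."""
    def __init__(self):
        self.entries = {}

    def aminsert(self, key, tid: int) -> None:
        self.entries[key] = tid

heap, index = ToyHeap(), ToyIndex()
tid = heap.insert({"id": 1})  # heap insertion happens first, yielding the TID
index.aminsert(1, tid)        # the index entry then references that TID
```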

Re: Questions about the new subscription parameter: password_required

2023-09-26 Thread Benoit Lobréau

On 9/22/23 21:58, Robert Haas wrote:

I think that there normally shouldn't be any problem here, because if
form->subpasswordrequired is true, we expect that the connection
string should contain a password which the remote side actually uses,
or we expect the subscription to be owned by the superuser. If neither
of those things is the case, then either the superuser made a
subscription that doesn't use a password owned by a non-superuser
without fixing subpasswordrequired, or else the configuration on the
remote side has changed so that it now doesn't use the password when
formerly it did. In the first case, perhaps it would be fine to go
ahead and drop the slot, but in the second case I don't think it's OK
from a security point view, because the command is going to behave the
same way no matter who executes the drop command, and a non-superuser
who drops the slot shouldn't be permitted to rely on the postgres
processes's identity to do anything on a remote node -- including
dropping a relation slot. So I tentatively think that this behavior is
correct.


I must admit I hadn't considered the implications when the configuration
on the remote side has changed and we use a non-superuser. I see how it
could be problematic.


I will try to come up with a documentation patch.


Maybe you're altering it in a way that doesn't involve any connections
or any changes to the connection string? There's no security issue if,
say, you rename it.


I looked at the code again. Indeed, of the ALTER SUBSCRIPTION commands, 
only ALTER SUBSCRIPTION .. CONNECTION uses walrcv_check_conninfo().


I checked the other thread (Re: [16+] subscription can end up in 
inconsistent state [1]) and will try the patch. Is it the thread you 
were referring to earlier?


[1] 
https://www.postgresql.org/message-id/flat/5dff4caf26f45ce224a33a5e18e110b93a351b2f.camel%40j-davis.com#ff4a06505de317b1ad436b8102a69446


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Partial aggregates pushdown

2023-09-26 Thread Bruce Momjian
On Tue, Sep 26, 2023 at 06:26:25AM +, 
fujii.y...@df.mitsubishielectric.co.jp wrote:
> Hi Mr.Bruce.
> > I think this needs to be explained in the docs.  I am ready to adjust the 
> > patch to improve the wording whenever you are
> > ready.  Should I do it now and post an updated version for you to use?
> The following explanation was omitted from the documentation, so I added it.
> > *  false - no check
> > *  true, remove version < sender - check
> I have responded to your comment, but if there is a problem with the wording, 
> could you please suggest a correction?

I like your new wording, thanks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




What's the ItemPointer's meaning here?

2023-09-26 Thread jacktby jacktby
/* Typedef for callback function for table_index_build_scan */
typedef void (*IndexBuildCallback) (Relation index,
                                    ItemPointer tid,
                                    Datum *values,
                                    bool *isnull,
                                    bool tupleIsAlive,
                                    void *state);
When we build an index on an existing heap table, pg will read the tuples one
by one and pass each tuple's heap TID as this ItemPointer. Is that right?



Re: dikkop seems unhappy because of openssl stuff (FreeBSD 14-BETA1)

2023-09-26 Thread Tomas Vondra



On 9/20/23 20:09, Tomas Vondra wrote:
> On 9/20/23 19:59, Tomas Vondra wrote:
>>
>>
>> On 9/20/23 01:24, Tom Lane wrote:
>>> Tomas Vondra  writes:
 bsd@freebsd:~ $ tclsh8.6
 % clock scan "1/26/2010"
 time value too large/small to represent
>>>
>>> In hopes of replicating this, I tried installing FreeBSD 14-BETA2
>>> aarch64 on my Pi 3B.  This test case works fine:
>>>
>>> $ tclsh8.6
>>> % clock scan "1/26/2010"
>>> 1264482000
>>>
>>> $ tclsh8.7
>>> % clock scan "1/26/2010"
>>> 1264482000
>>>
>>> and unsurprisingly, pltcl's regression tests pass.  I surmise
>>> that something is broken in BETA1 that they fixed in BETA2.
>>>
>>> plpython works too, with the python 3.9 package (and no older
>>> python).
>>>
>>> However, all is not peachy, because plperl doesn't work.
>>> Trying to CREATE EXTENSION either plperl or plperlu leads
>>> to a libperl panic:
>>>
>>> pl_regression=# create extension plperl;
>>> server closed the connection unexpectedly
>>> This probably means the server terminated abnormally
>>> before or while processing the request.
>>> The connection to the server was lost. Attempting reset: Succeeded.
>>>
>>> with this in the postmaster log:
>>>
>>> panic: pthread_key_create failed
>>>
>>> That message is certainly not ours, so it must be coming out of libperl.
>>>
>>> Another thing that seemed strange is that ecpg's preproc.o takes
>>> O(forever) to compile.  I killed the build after observing that the
>>> compiler had gotten to 40 minutes of CPU time, and redid that step
>>> with PROFILE=-O0, which allowed it to compile in 20 seconds or so.
>>> (I also tried -O1, but gave up after a few minutes.)  This machine
>>> can compile the main backend grammar in a minute or two, so there is
>>> something very odd there.
>>>
>>> I'm coming to the conclusion that 14-BETA is, well, beta grade.
>>> I'll be interested to see if you get the same results when you
>>> update to BETA2.
>>
>> Thanks, I'll try that when I'll be at the office next week.
>>
> 
> FWIW when I disabled tcl, the tests pass (it's running with --nostatus
> --nosend, so it's not visible on the buildfarm site). Including the
> plperl stuff:
> 
> == running regression test queries==
> test plperl   ... ok  397 ms
> test plperl_lc... ok  152 ms
> test plperl_trigger   ... ok  374 ms
> test plperl_shared... ok  163 ms
> test plperl_elog  ... ok  184 ms
> test plperl_util  ... ok  210 ms
> test plperl_init  ... ok  150 ms
> test plperlu  ... ok  117 ms
> test plperl_array ... ok  228 ms
> test plperl_call  ... ok  189 ms
> test plperl_transaction   ... ok  412 ms
> test plperl_plperlu   ... ok  238 ms
> 
> ==
>  All 12 tests passed.
> ==
> 
> I wonder if this got broken between BETA1 and BETA2.
> 

Hmmm, I got to install BETA2 yesterday, but I still see the tcl failure:

 select tcl_date_week(2010,1,26);
- tcl_date_week
----------------
- 04
-(1 row)
-
+ERROR:  time value too large/small to represent
+CONTEXT:  time value too large/small to represent
+while executing
+"ConvertLocalToUTC $date[set date {}] $TZData($timezone) 2361222"
+(procedure "FreeScan" line 86)
+invoked from within
+"FreeScan $string $base $timezone $locale"
+(procedure "::tcl::clock::scan" line 68)
+invoked from within
+"::tcl::clock::scan 1/26/2010"
+("uplevel" body line 1)
+invoked from within
+"uplevel 1 [info level 0]"
+(procedure "::tcl::clock::scan" line 4)
+invoked from within
+"clock scan "$2/$3/$1""
+(procedure "__PLTcl_proc_55335" line 3)
+invoked from within
+"__PLTcl_proc_55335 2010 1 26"
+in PL/Tcl function "tcl_date_week"
 select tcl_date_week(2001,10,24);

I wonder what's the difference between the systems ... All I did was
write the BETA2 image to an SD card and install a couple of packages:

  pkg install xml2c libxslt gettext-tools ccache tcl tcl87 \
  p5-Test-Harness p5-IPC-Run gmake htop bash screen \
  python tcl86 nano p5-Test-LWP-UserAgent \
  p5-LWP-Protocol-https

And then

  perl ./run_branches.pl --run-all --nosend --nostatus --verbose

with the buildfarm config used by dikkop.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-26 Thread vignesh C
On Tue, 26 Sept 2023 at 13:03, Peter Smith  wrote:
>
> Here are some comments for patch v2-0001.
>
> ==
> src/backend/replication/logical/worker.c
>
> 1. maybe_reread_subscription
>
> ereport(LOG,
> (errmsg("logical replication worker for subscription \"%s\"
> will restart because of a parameter change",
> MySubscription->name)));
>
> Is this really a "parameter change" though? It might be a stretch to
> call the user role change a subscription parameter change. Perhaps
> this case needs its own LOG message?

When I was making this change the same thought came to my mind too,
but later I noticed that in the case of an owner change there is no separate
log message. Since the superuser check is somewhat similar to an owner
change, I felt no need to make any change for this.

> ==
> src/include/catalog/pg_subscription.h
>
> 2.
>   char*origin; /* Only publish data originating from the
>   * specified origin */
> + bool isownersuperuser; /* Is subscription owner superuser? */
>  } Subscription;
>
>
> Is a new Subscription field member really necessary? The Subscription
> already has 'owner' -- why doesn't function
> maybe_reread_subscription() just check:
>
> (!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner))

We need the new variable so that we can store this value when the worker
is started and later compare the current value with the value from when
the worker was started. Since we need the superuser status at the time the
worker starts, I feel this variable is required.

> ==
> src/test/subscription/t/027_nosuperuser.pl
>
> 3.
> # The apply worker should get restarted after the superuser prvileges are
> # revoked for subscription owner alice.
>
> typo
>
> /prvileges/privileges/
>.
> ~

I will change this in the next version

> 4.
> +# After the user becomes non-superuser the apply worker should be restarted 
> and
> +# it should fail with 'password is required' error as password option is not
> +# part of the connection string.
>
> /as password option/because the password option/

I will change this in the next version

Regards,
Vignesh




Re: [PATCH] Add inline comments to the pg_hba_file_rules view

2023-09-26 Thread Daniel Gustafsson
> On 26 Sep 2023, at 15:19, Peter Eisentraut  wrote:
> 
> On 04.09.23 11:54, Jim Jones wrote:
>> This patch proposes the column "comment" to the pg_hba_file_rules view. It 
>> basically parses the inline comment (if any) of a valid pg_hba.conf entry 
>> and displays it in the new column.
>> For such pg_hba entries ...
>> host db jim 127.0.0.1/32 md5 # foo
>> host db jim 127.0.0.1/32 md5 #bar
>> host db jim 127.0.0.1/32 md5 # #foo#
> 
> I'm skeptical about this.
> 
> First, there are multiple commenting styles.  The end-of-line style is less 
> common in my experience, because pg_hba.conf lines tend to be long. Another 
> style is
> 
> # foo
> host db jim 127.0.0.1/32 md5
> # bar
> host db jim 127.0.0.1/32 md5
> 
> or even as a block
> 
> # foo and bar
> host db jim 127.0.0.1/32 md5
> host db jim 127.0.0.1/32 md5

Or even a more complicated one (which I've seen variants of in production)
where only horizontal whitespace separates two subsequent lines of comments:

# Block comment
host db jim 127.0.0.1/32 md5 #end of line multi-
 #line comment
# A new block comment directly following
host db jim 127.0.0.1/32 md5

> I think we should leave file comments be file comments.  If we want some 
> annotations to be exported to higher-level views, we should make that an 
> intentional and explicit separate feature.

+1

--
Daniel Gustafsson





RE: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Zhijie Hou (Fujitsu)
On Tuesday, September 26, 2023 4:40 PM Amit Kapila  
wrote:
> 
> On Tue, Sep 19, 2023 at 12:48 PM Zhijie Hou (Fujitsu) 
> wrote:
> >
> > - static bool publish_no_origin;
> >
> > This flag is also local to pgoutput instance, and we didn't reset the
> > flag in output shutdown callback, so if we consume changes from
> > different slots, then the second call would reuse the flag value that
> > is set in the first call which is unexpected. To completely avoid this
> > issue, we think we'd better move this flag to output plugin private data
> structure.
> >
> > Example:
> >  SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_1',
> NULL, NULL, 'proto_version', '1', 'publication_names', 'pub', 'origin', 
> 'none'); ---
> Set origin in this call.
> >  SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_2',
> NULL, NULL, 'proto_version', '1', 'publication_names', 'pub');  -- Didn't set
> origin, but will reuse the origin flag in the first call.
> >
> 
>   char*origin;
> + bool publish_no_origin;
>  } PGOutputData;
> 
> Do we really need a new parameter in above structure? Can't we just use the
> existing origin in the same structure? Please remember if this needs to be
> backpatched then it may not be good idea to add new parameter in the
> structure but apart from that having two members to represent similar
> information doesn't seem advisable to me. I feel for backbranch we can just 
> use
> PGOutputData->origin for comparison and for HEAD, we can remove origin
> and just use a boolean to avoid any extra cost for comparisions for each
> change.

OK, I agree to remove the origin string on HEAD and we can add that back
when we support other origin values. I also modified it to use the string for
comparison as suggested for the back-branches. I will also test it locally to
confirm it doesn't affect the perf.

> 
> Can we add a test case to cover this case?

Added one in replorigin.sql.

Attached is the patch set for publish_no_origin and in_streaming, including the
patch (v2-PG16-0001) for back-branches. Since the patch for the hash table needs
more thought, I didn't post it this time.


Best Regards,
Hou zj



v2-0001-Maintain-publish_no_origin-in-output-plugin-priv.patch
Description:  v2-0001-Maintain-publish_no_origin-in-output-plugin-priv.patch


v2-PG16-0001-Maintain-publish_no_origin-in-output-plugin-priva.patch
Description:  v2-PG16-0001-Maintain-publish_no_origin-in-output-plugin-priva.patch


v2-0002-Move-in_streaming-to-output-private-data.patch
Description: v2-0002-Move-in_streaming-to-output-private-data.patch


RE: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Zhijie Hou (Fujitsu)
On Tuesday, September 19, 2023 1:44 PM Michael Paquier  
wrote:
> 
> On Tue, Sep 19, 2023 at 04:10:39AM +, Zhijie Hou (Fujitsu) wrote:
> > Currently we have serval global variables in pgoutput, but each of
> > them is inherently local to an individual pgoutput instance. This
> > could cause issues if we switch to different output plugin instance in
> > one session and could miss to reset their value in case of errors. The
> > analysis for each variable is as
> > follows:
> 
> (Moved the last block of the message as per the relationship between
> RelationSyncCache and publications_valid).
> 
> > - static HTAB *RelationSyncCache = NULL;
> >
> > pgoutput creates this hash table under cacheMemoryContext to remember
> > the relation schemas that have been sent, but it's local to an
> > individual pgoutput instance, and because it's under global memory
> > context, the hashtable is not properly cleared in error paths which
> > means it has a risk of being accessed in a different output plugin instance.
> This was also mentioned in another thread[2].
> >
> > So I think we'd better allocate this under output plugin private context.
> >
> > But note that, instead of completely moving the hash table into the
> > output plugin private data, we need to to keep the static pointer
> > variable for the map to be accessed by the syscache callbacks. This is
> > because syscache callbacks won't be un-registered even after shutting
> > down the output plugin, so we need a static pointer to cache the map pointer
> so that callbacks can check it.
> >
> > - static bool publications_valid;
> >
> > I thought we need to move this to private data as well, but we need to
> > access this in a syscache callback, which means we need to keep the static
> variable.
> 
> FWIW, I think that keeping publications_valid makes the code kind of confusing
> once 0001 is applied, because this makes the handling of the cached data for
> relations and publications even more inconsistent than it is now, with a mixed
> bag of two different logics caused by the relationship between the synced
> relation cache and the publication
> cache: RelationSyncCache tracks if relations should be rebuilt, while
> publications_valid does it for the publication data, but both are still 
> static and
> could be shared by multiple pgoutput contexts.  On top of that,
> publications_valid is hidden at the top of pgoutput.c within a bunch of
> declarations and no comments to explain why it's here (spoiler: to handle the
> cache rebuilds with its reset in the cache callback).
> 
> I agree that CacheMemoryContext is not really a good idea to cache the data
> only proper to a pgoutput session and that tracking a context in the output
> data makes the whole cleanup attractive, but it also seems to me that we
> should avoid entirely the use of relcache callbacks if the intention is to 
> have one
> RelationSyncEntry per pgoutput.  The patch does something different than
> HEAD and than having one RelationSyncEntry per pgoutout: RelationSyncEntry
> can reference *everything*, with its data stored in multiple memory contexts 
> as
> of one per pgoutput.  It looks like RelationSyncEntry should be a list or a 
> hash
> table, at least, so as it can refer to multiple pgoutput states.  Knowing 
> that a
> session can only use one replication slot with MyReplicationSlot, not sure 
> that's
> worth bothering with.  As a whole,
> 0001 with its changes for RelationSyncCache don't seem like an improvement
> to me.
> 

Thanks for your comments. Currently, I am not sure how to avoid the use of the
syscache callback functions, so I think the change for RelationSyncCache needs
more thought and I will retry later if I find another way.

Best Regards,
Hou zj




RE: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-26 Thread Hayato Kuroda (Fujitsu)
Dear Karl,

Thank you for reviewing! PSA new version.

> I see a few problems with the English and style of the patches
> and am commenting below and have signed up as a reviewer.

Your effort is quite helpful for me.

> At
> commitfest.postgresql.org I have marked the thread
> as needing author attention.  Hayato, you will need
> to mark the thread as needing review when you have
> replied to this message.

Sure. I will change the status after posting the patch.

Before replying to your comments, I thought I should show the difference between
versions. In old versions (PG15 was used here), non-ASCII characters (like
Japanese) are replaced with '?'.

```
psql (15.4)
Type "help" for help.

postgres=# SET application_name TO 'あああ';
SET
postgres=# SHOW application_name ;
 application_name 
------------------
 ?
(1 row)
```

As for HEAD, as my patch says, non-ASCII characters are replaced
with hexadecimal representations. (Is my terminology correct?)

```
psql (17devel)
Type "help" for help.

postgres=# SET application_name TO 'あああ';
SET
postgres=# SHOW application_name ;
   application_name   
--------------------------------------
 \xe3\x81\x82\xe3\x81\x82\xe3\x81\x82
(1 row)
```

Note that non-printable ASCII characters are also replaced by the same rule.

```
psql (15.4)
Type "help" for help.

postgres=# SET application_name TO E'\x03';
SET
postgres=# SHOW application_name ;
 application_name 
------------------
 ?
(1 row)

psql (17devel)
Type "help" for help.

postgres=# SET application_name TO E'\x03';
SET
postgres=# SHOW application_name ;
 application_name 
------------------
 \x03
(1 row)
```

> Now, to reviewing the patch:
> 
> First, it is now best practice in the PG docs to
> put a line break at the end of each sentence.
> At least for the sentences on the lines you change.
> (No need to update the whole document!)  Please
> do this in the next version of your patch.  I don't
> know if this is a requirement for acceptance by
> a committer, but it won't hurt.

I didn't know that. Could you please tell me if you have a source? Anyway,
I put a line break after each sentence for now.

> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index e700782d3c..a4ce99ba4d 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -7040,9 +7040,8 @@ local0.*/var/log/postgresql
>  The name will be displayed in the
> pg_stat_activity view and included in CSV log
> entries.  It can also be included in regular log entries via the  linkend="guc-log-line-prefix"/> parameter.
> -Only printable ASCII characters may be used in the
> -application_name value. Other characters
> will be
> -replaced with question marks (?).
> +Non-ASCII characters used in the
> application_name
> +will be replaced with hexadecimal strings.
> 
>
>   
> 
> Don't use the future tense to describe the system's behavior.  Instead
> of "will be" write "are".  (Yes, this change would be an improvement
> on the original text.  We should fix it while we're working on it
> and our attention is focused.)
> 
> It is more accurate, if I understand the issue, to say that characters
> are replaced with hexadecimal _representations_ of the input bytes.
> Finally, it would be good to know what representation we're talking
> about.  Perhaps just give the \xhh example and say: the Postgres
> C-style escaped hexadecimal byte value.  And hyperlink to
> https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE
> 
> (The docbook would be, depending on text you want to link:
> 
> C-style escaped hexadecimal
> byte value.
> 
> I think.  You link to id="someidvalue" attribute values.)

IIUC the word "Postgres" cannot be used in the doc.
Based on your all comments, I changed as below. How do you think?

```
Characters that are not printable ASCII, like \x03,
are replaced with the PostgreSQL
C-style escaped hexadecimal 
byte value.
```


> @@ -8037,10 +8036,9 @@ COPY postgres_log FROM
> '/full/path/to/logfile.csv' WITH csv; 
>  The name can be any string of less
>  than NAMEDATALEN characters (64 characters
> in
> a standard
> -build). Only printable ASCII characters may be used in the
> -cluster_name value. Other characters will be
> -replaced with question marks (?).  No name
> is shown
> -if this parameter is set to the empty string
> '' (which is
> +build). Non-ASCII characters used in the
> cluster_name
> +will be replaced with hexadecimal strings. No name is shown if
> this
> +parameter is set to the empty string ''
> (which is the default). This parameter can only be set at server start.
> 
>
> 
> Same review as for the first patch hunk.

Fixed like above. You can refer the patch.

> diff --git a/doc/src/sgml/postgres-fdw.sgml
> b/doc/src/sgml/postgres-fdw.sgml index 5062d712e7..98785e87ea 100644
> --- 

Re: [PATCH] Add inline comments to the pg_hba_file_rules view

2023-09-26 Thread Peter Eisentraut

On 04.09.23 11:54, Jim Jones wrote:
This patch proposes the column "comment" to the pg_hba_file_rules view. 
It basically parses the inline comment (if any) of a valid pg_hba.conf 
entry and displays it in the new column.


For such pg_hba entries ...

host db jim 127.0.0.1/32 md5 # foo
host db jim 127.0.0.1/32 md5 #bar
host db jim 127.0.0.1/32 md5 # #foo#


I'm skeptical about this.

First, there are multiple commenting styles.  The end-of-line style is 
less common in my experience, because pg_hba.conf lines tend to be long.
Another style is


# foo
host db jim 127.0.0.1/32 md5
# bar
host db jim 127.0.0.1/32 md5

or even as a block

# foo and bar
host db jim 127.0.0.1/32 md5
host db jim 127.0.0.1/32 md5

Another potential problem is that maybe people don't want their comments 
leaked out of the file.  Who knows what they have written in there.


I think we should leave file comments be file comments.  If we want some 
annotations to be exported to higher-level views, we should make that an 
intentional and explicit separate feature.






Re: Partial aggregates pushdown

2023-09-26 Thread Alexander Pyhalov

fujii.y...@df.mitsubishielectric.co.jp wrote on 2023-09-25 06:18:

Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers.

Thank you for your valuable comments. I sincerely apologize for the
very late reply.
Here is a response to your comments or a fix to the patch.

Tuesday, August 8, 2023 at 3:31 Bruce Momjian

> I have modified the program except for the point "if the version of the remote
> server is less than PG17".
> Instead, we have addressed the following.
> "If check_partial_aggregate_support is true and the remote server version is
> older than the local server version, postgres_fdw does not assume that the
> partial aggregate function is on the remote server unless the partial
> aggregate function and the aggregate function match."
> The reason for this is to maintain compatibility with any aggregate function
> that does not support partial aggregate in one version of V1 (V1 is PG17 or
> higher), even if the next version supports partial aggregate.
> For example, string_agg does not support partial aggregation in PG15, but it
> will support partial aggregation in PG16.

Just to clarify, I think you are saying:

If check_partial_aggregate_support is true and the remote server
version is older than the local server version, postgres_fdw
checks if the partial aggregate function exists on the remote
server during planning and only uses it if it does.

I tried to phrase it in a positive way, and mentioned the plan time
distinction.  Also, I am sorry I was away for most of July and am just
getting to this.

Thanks for your comment. In the documentation, the description of
check_partial_aggregate_support is as follows
(please see postgres-fdw.sgml).
--
check_partial_aggregate_support (boolean)
Only if this option is true, during query planning, postgres_fdw
connects to the remote server and check if the remote server version
is older than the local server version. If so, postgres_fdw assumes
that for each built-in aggregate function, the partial aggregate
function is not defined on the remote server unless the partial
aggregate function and the aggregate function match. The default is
false.
--

Thursday, 20 July 2023 19:23 Alexander Pyhalov 
:

fujii.y...@df.mitsubishielectric.co.jp wrote on 2023-07-19 03:43:
> Hi Mr.Pyhalov, hackers.

> 3)
> I modified the patch to safely do a partial aggregate pushdown for
> queries which contain having clauses.
>

Hi.
Sorry, but I don't see how it could work.

We apologize for any inconvenience caused.
Thanks to Pyhalov's and Jim's comments, I have realized that I have
made a fundamental mistake regarding the pushdown of the HAVING clause
and the difficulty of achieving it performing Partial aggregate
pushdown.
So, I removed the codes about pushdown of the HAVING clause performing
Partial aggregate pushdown.

Thursday, 20 July 2023 19:23 Alexander Pyhalov 
:

As for changes in planner.c (setGroupClausePartial()) I have several
questions.

1) Why don't we add non_group_exprs to pathtarget->exprs when
partial_target->exprs is not set?

2) We replace extra->partial_target->exprs with partial_target->exprs
after processing. Why are we sure that after this tleSortGroupRef is
correct?

Response to 1)
The code you pointed out was unnecessary. I have removed this code.
Also, the process of adding PlaceHolderVar's expr to partial_target was
missing, so I fixed this.

Response to 2)
The making procedure of extra->groupClausePartial and extra->partial_target
in make_partial_grouping_target for this patch is as follows.
STEP1. From grouping_target->exprs, extract Aggref, Var and
Placeholdervar that are not included in Aggref.
STEP2. setGroupClausePartial sets the copy of original groupClause to
extra->groupClausePartial
and sets the copy of original partial_target to extra->partial_target.
STEP3. setGroupClausePartial adds Var and Placeholdervar in STEP1 to
partial_target.
The sortgroupref value to be added to partial_target->sortgrouprefs is set to
(the maximum value of the existing sortgroupref) + 1.
setGroupClausePartial adds data sgc of sortgroupclause type where
sgc->tlesortgroupref
matches the sortgroupref to GroupClause.
STEP4. add_new_columns_to_pathtarget adds STEP1's Aggref to 
partial_target.


Due to STEP2, the list of tlesortgrouprefs set in
extra->groupClausePartial is not duplicated.


Do you mean that extra->partial_target->sortgrouprefs is not replaced, 
and so we preserve tlesortgroupref numbers?
I'm suspicious about rewriting extra->partial_target->exprs with
partial_target->exprs - I'm still not sure why we don't lose the information
added by add_column_to_pathtarget() to extra->partial_target->exprs.


Also look at the following example.

EXPLAIN VERBOSE SELECT  count(*) , (b/2)::numeric FROM pagg_tab GROUP BY 
b/2 ORDER BY 1;

QUERY PLAN
---
 Sort  (cost=511.35..511.47 rows=50 

Re: Correct the documentation for work_mem

2023-09-26 Thread David Rowley
On Tue, 12 Sept 2023 at 03:03, Bruce Momjian  wrote:
>
> On Mon, Sep 11, 2023 at 10:02:55PM +1200, David Rowley wrote:
> > It's certainly not a show-stopper. I do believe the patch makes some
> > improvements.  The reason I'd prefer to see either "and" or "and/or"
> > in place of "or" is because the text is trying to imply that many of
> > these operations can run at the same time. I'm struggling to
> > understand why, given that there could be many sorts and many hashes
> > going on at once that we'd claim it could only be one *or* the other.
> > If we have 12 sorts and 4 hashes then that's not "several sort or hash
> > operations", it's "several sort and hash operations".  Of course, it
> > could just be sorts or just hashes, so "and/or" works fine for that.
>
> Yes, I see your point and went with "and",   updated patch attached.

Looks good to me.

David




[Code Cleanup] : Small code cleanup in twophase.sql

2023-09-26 Thread Nishant Sharma
Hi,


PFA a small code cleanup in twophase.sql, which contains a drop table
statement for 'test_prepared_savepoint' that, to me, appears to be
missing from the cleanup section of that file.

To support it I have below points:-

1) Grepping this table 'test_prepared_savepoint' shows occurrences
only in twophase.out & twophase.sql files. This means that table is
local to that sql test file and not used in any other test file.

2) I don't see any comment on why this was not added in the cleanup
section of twophase.sql, while drops for the other two test tables are done.

3) I ran "make check-world" with the patch and I don't see any failures.

Kindly correct, if I missed anything.


Regards,
Nishant (EDB).


v1-0001-Small-code-cleanup-in-twophase.sql.patch
Description: Binary data


Re: Add test module for Table Access Method

2023-09-26 Thread Fabrízio de Royes Mello
On Mon, Jun 5, 2023 at 1:24 PM Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:
>
> On Sat, Jun 3, 2023 at 7:42 PM Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:
> >
> >
> > Hi all,
> >
> > During the PGCon Unconference session about Table Access Method one
missing item pointed out is that currently we lack documentation and
examples of TAM.
> >
> > So in order to improve things a bit in this area I'm proposing to add a
test module for Table Access Method similar what we already have for Index
Access Method.
> >
> > This code is based on the "blackhole_am" implemented by Michael
Paquier: https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am
> >
>
> Just added some more tests, ran pgindent and also organized a bit some
comments and README.txt.
>

Rebased version.

-- 
Fabrízio de Royes Mello
From 5b6642b520874f4ca7023fc33d2e8e875fb64693 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fabr=C3=ADzio=20de=20Royes=20Mello?=
 
Date: Sat, 3 Jun 2023 17:23:05 -0400
Subject: [PATCH v3] Add test module for Table Access Method

---
 src/test/modules/Makefile |   1 +
 src/test/modules/dummy_table_am/.gitignore|   3 +
 src/test/modules/dummy_table_am/Makefile  |  20 +
 src/test/modules/dummy_table_am/README|   5 +
 .../dummy_table_am/dummy_table_am--1.0.sql|  14 +
 .../modules/dummy_table_am/dummy_table_am.c   | 500 ++
 .../dummy_table_am/dummy_table_am.control |   5 +
 .../expected/dummy_table_am.out   | 207 
 src/test/modules/dummy_table_am/meson.build   |  33 ++
 .../dummy_table_am/sql/dummy_table_am.sql |  55 ++
 src/test/modules/meson.build  |   1 +
 11 files changed, 844 insertions(+)
 create mode 100644 src/test/modules/dummy_table_am/.gitignore
 create mode 100644 src/test/modules/dummy_table_am/Makefile
 create mode 100644 src/test/modules/dummy_table_am/README
 create mode 100644 src/test/modules/dummy_table_am/dummy_table_am--1.0.sql
 create mode 100644 src/test/modules/dummy_table_am/dummy_table_am.c
 create mode 100644 src/test/modules/dummy_table_am/dummy_table_am.control
 create mode 100644 src/test/modules/dummy_table_am/expected/dummy_table_am.out
 create mode 100644 src/test/modules/dummy_table_am/meson.build
 create mode 100644 src/test/modules/dummy_table_am/sql/dummy_table_am.sql

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index e81873cb5a..cb7a5b970a 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -10,6 +10,7 @@ SUBDIRS = \
 		  delay_execution \
 		  dummy_index_am \
 		  dummy_seclabel \
+		  dummy_table_am \
 		  libpq_pipeline \
 		  plsample \
 		  spgist_name_ops \
diff --git a/src/test/modules/dummy_table_am/.gitignore b/src/test/modules/dummy_table_am/.gitignore
new file mode 100644
index 00..44d119cfcc
--- /dev/null
+++ b/src/test/modules/dummy_table_am/.gitignore
@@ -0,0 +1,3 @@
+# Generated subdirectories
+/log/
+/results/
diff --git a/src/test/modules/dummy_table_am/Makefile b/src/test/modules/dummy_table_am/Makefile
new file mode 100644
index 00..9ea4a590c6
--- /dev/null
+++ b/src/test/modules/dummy_table_am/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/dummy_table_am/Makefile
+
+MODULES = dummy_table_am
+
+EXTENSION = dummy_table_am
+DATA = dummy_table_am--1.0.sql
+PGFILEDESC = "dummy_table_am - table access method template"
+
+REGRESS = dummy_table_am
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/dummy_table_am
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/dummy_table_am/README b/src/test/modules/dummy_table_am/README
new file mode 100644
index 00..35211554b0
--- /dev/null
+++ b/src/test/modules/dummy_table_am/README
@@ -0,0 +1,5 @@
+Dummy Table AM
+==
+
+Dummy table AM is a module for testing any facility usable by a table
+access method, whose code is kept as simple as possible.
diff --git a/src/test/modules/dummy_table_am/dummy_table_am--1.0.sql b/src/test/modules/dummy_table_am/dummy_table_am--1.0.sql
new file mode 100644
index 00..aa0fd82e61
--- /dev/null
+++ b/src/test/modules/dummy_table_am/dummy_table_am--1.0.sql
@@ -0,0 +1,14 @@
+/* src/test/modules/dummy_table_am/dummy_table_am--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION dummy_table_am" to load this file. \quit
+
+CREATE FUNCTION dummy_table_am_handler(internal)
+RETURNS table_am_handler
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+-- Access method
+CREATE ACCESS METHOD dummy_table_am TYPE TABLE HANDLER dummy_table_am_handler;
+COMMENT ON ACCESS METHOD dummy_table_am IS 'dummy table access method';
+
diff --git a/src/test/modules/dummy_table_am/dummy_table_am.c b/src/test/modules/dummy_table_am/dummy_table_am.c
new file mode 100644
index 

Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-26 Thread David Rowley
On Wed, 27 Sept 2023 at 01:31, Ranier Vilela  wrote:
> It seems to me that it adds a LEA instruction.
> https://godbolt.org/z/b4jK3PErE

There's a fairly significant difference in the optimisability of a
comparison with a compile-time constant vs a variable. For example,
would you expect the compiler to emit assembly for each side of the
boolean AND in: if (a > 12 && a > 20),  or how about if (a > 12 && a >
y)?  No need to answer. Just consider it.

I suggest keeping your experiments as close to the target code as
practical. You might be surprised by what the compiler can optimise.

David




Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-26 Thread Ranier Vilela
On Tue, Sep 26, 2023 at 07:34, Ashutosh Bapat <
ashutosh.bapat@gmail.com> wrote:

> On Tue, Sep 26, 2023 at 3:32 PM David Rowley  wrote:
> >
> > find_base_rel() could be made more robust for free by just casting the
> > relid and simple_rel_array_size to uint32 while checking that relid <
> > root->simple_rel_array_size.  The 0th element should be NULL anyway,
> > so "if (rel)" should let relid==0 calls through and allow that to
> > ERROR still. I see that just changes a "jle" to "jnb" vs adding an
> > additional jump for Ranier's version. [1]
>
> That's a good suggestion.
>
> I am fine with find_base_rel() as it is today as well. But
> future-proofing it seems to be fine too.
>
> >
> > It seems worth not making find_base_rel() more expensive than it is
> > today as commonly we just reference root->simple_rel_array[n] directly
> > anyway because it's cheaper. It would be nice if we didn't add further
> > overhead to find_base_rel() as this would make the case for using
> > PlannerInfo.simple_rel_array directly even stronger.
>
> I am curious, is the overhead in find_base_rel() impacting overall
> performance?
>
It seems to me that it adds a LEA instruction.
https://godbolt.org/z/b4jK3PErE

Although it doesn't seem like much,
I believe the solution (casting to unsigned) is better.
So +1.

best regards,
Ranier Vilela


Re: Opportunistically pruning page before update

2023-09-26 Thread James Coleman
On Tue, Sep 5, 2023 at 1:40 PM Melanie Plageman
 wrote:
>
> On Wed, Jun 21, 2023 at 8:51 AM James Coleman  wrote:
> > While at PGCon I was chatting with Andres (and I think Peter G. and a
> > few others who I can't remember at the moment, apologies) and Andres
> > noted that while we opportunistically prune a page when inserting a
> > tuple (before deciding we need a new page) we don't do the same for
> > updates.
> >
> > Attached is a patch series to do the following:
> >
> > 0001: Make it possible to call heap_page_prune_opt already holding an
> > exclusive lock on the buffer.
> > 0002: Opportunistically prune pages on update when the current tuple's
> > page has no free space. If this frees up enough space, then we
> > continue to put the new tuple on that page; if not, then we take the
> > existing code path and get a new page.
>
> I've reviewed these patches and have questions.
>
> Under what conditions would this be exercised for UPDATE? Could you
> provide an example?
>
> With your patch applied, when I create a table, the first time I update
> it heap_page_prune_opt() will return before actually doing any pruning
> because the page prune_xid hadn't been set (it is set after pruning as
> well as later in heap_update() after RelationGetBufferForTuple() is
> called).
>
> I actually added an additional parameter to heap_page_prune() and
> heap_page_prune_opt() to identify if heap_page_prune() was called from
> RelationGetBufferForTuple() and logged a message when this was true.
> Running the test suite, I didn't see any UPDATEs executing
> heap_page_prune() from RelationGetBufferForTuple(). I did, however, see
> other statement types doing so (see RelationGetBufferForTuple()'s other
> callers). Was that intended?
>
> > I started to work on benchmarking this, but haven't had time to devote
> > properly to that, so I'm wondering if there's anyone who might be
> > interested in collaborating on that part.
>
> I'm interested in this feature and in helping with it/helping with
> benchmarking it, but I don't yet understand the design in its current
> form.

Hi Melanie,

Thanks for taking a look at this! Apologies for the long delay in
replying: I started to take a look at your questions earlier, and it
turned into more of a rabbit hole than I'd anticipated. I've since
been distracted by other things. So -- I don't have any conclusions
here yet, but I'm hoping at or after PGConf NYC that I'll be able to
dedicate the time this deserves.

Thanks,
James Coleman




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-26 Thread Bharath Rupireddy
On Tue, Sep 26, 2023 at 10:51 AM Hayato Kuroda (Fujitsu)
 wrote:
>
> Again, thank you for reviewing! PSA a new version.

Thanks for the new patch. Here's a comment on v46:

1.
+Datum
+binary_upgrade_validate_wal_logical_end(PG_FUNCTION_ARGS)
+{ oid => '8046', descr => 'for use by pg_upgrade',
+  proname => 'binary_upgrade_validate_wal_logical_end', proisstrict => 'f',
+  provolatile => 'v', proparallel => 'u', prorettype => 'bool',
+  proargtypes => 'pg_lsn',
+  prosrc => 'binary_upgrade_validate_wal_logical_end' },

I think this patch can avoid catalog changes by turning
binary_upgrade_validate_wal_logical_end a FRONTEND-only function
sitting in xlogreader.c after making InitXLogReaderState(),
ReadNextXLogRecord() FRONTEND-friendly (replace elog/ereport with
pg_fatal or such). With this change and back-porting of commit
e0b2eed0 to save logical slots at shutdown, the patch can help support
upgrading logical replication slots on PG versions < 17.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Build the docs if there are changes in docs and don't run other tasks if the changes are only in docs

2023-09-26 Thread Peter Eisentraut

On 25.09.23 12:56, Nazir Bilal Yavuz wrote:

+  # Only run if a specific OS is not requested and if there are changes in docs
+  # or in the CI files.
+  skip: >
+$CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:.*' ||
+!changesInclude('doc/**',
+'.cirrus.yml',
+'.cirrus.tasks.yml',
+'src/backend/catalog/sql_feature_packages.txt',
+'src/backend/catalog/sql_features.txt',
+'src/backend/utils/errcodes.txt',
+'src/backend/utils/activity/wait_event_names.txt',
+'src/backend/utils/activity/generate-wait_event_types.pl',
+'src/include/parser/kwlist.h')


This is kind of annoying.  Now we need to maintain yet another list of 
these dependencies and keep it in sync with the build systems.


I think meson can produce a dependency tree from a build.  Maybe we 
could use that somehow and have Cirrus cache it between runs?


Also note that there are also dependencies in the other direction.  For 
example, the psql help is compiled from XML DocBook sources.  So your 
other patch would also need to include similar changesInclude() clauses.







Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-26 Thread Ashutosh Bapat
On Tue, Sep 26, 2023 at 3:32 PM David Rowley  wrote:
>
> find_base_rel() could be made more robust for free by just casting the
> relid and simple_rel_array_size to uint32 while checking that relid <
> root->simple_rel_array_size.  The 0th element should be NULL anyway,
> so "if (rel)" should let relid==0 calls through and allow that to
> ERROR still. I see that just changes a "jle" to "jnb" vs adding an
> additional jump for Ranier's version. [1]

That's a good suggestion.

I am fine with find_base_rel() as it is today as well. But
future-proofing it seems to be fine too.

>
> It seems worth not making find_base_rel() more expensive than it is
> today as commonly we just reference root->simple_rel_array[n] directly
> anyway because it's cheaper. It would be nice if we didn't add further
> overhead to find_base_rel() as this would make the case for using
> PlannerInfo.simple_rel_array directly even stronger.

I am curious, is the overhead in find_base_rel() impacting overall performance?

-- 
Best Wishes,
Ashutosh Bapat




Re: Add const qualifiers

2023-09-26 Thread Peter Eisentraut

On 09.09.23 21:03, David Steele wrote:

On 9/1/23 11:39, David Steele wrote:

Hackers,

I noticed that there was a mismatch between the const qualifiers for 
excludeDirContents in src/backend/backup/basebackup.c and 
src/bin/pg_rewind/file_map.c and that led me to use ^static 
const.*\*.*= to do a quick search for similar cases.


I think at the least we should make excludeDirContents match, but the 
rest of the changes seem like a good idea as well.


Added to 2023-11 CF.


committed





Re: Avoid a possible out-of-bounds access (src/backend/optimizer/util/relnode.c)

2023-09-26 Thread David Rowley
On Tue, 26 Sept 2023 at 21:45, Ashutosh Bapat
 wrote:
> However, I agree that changing find_base_rel() the way you have done
> in your patch is fine and mildly future-proof. +1 to that idea
> irrespective of what bitmapset functions do.

I'm not a fan of adding additional run-time overhead for this
theoretical problem.

find_base_rel() could be made more robust for free by just casting the
relid and simple_rel_array_size to uint32 while checking that relid <
root->simple_rel_array_size.  The 0th element should be NULL anyway,
so "if (rel)" should let relid==0 calls through and allow that to
ERROR still. I see that just changes a "jle" to "jnb" vs adding an
additional jump for Ranier's version. [1]

It seems worth not making find_base_rel() more expensive than it is
today as commonly we just reference root->simple_rel_array[n] directly
anyway because it's cheaper. It would be nice if we didn't add further
overhead to find_base_rel() as this would make the case for using
PlannerInfo.simple_rel_array directly even stronger.

David

[1] https://godbolt.org/z/qrxKTbvva




Re: POC, WIP: OR-clause support for indexes

2023-09-26 Thread a.rybakina
Sorry for the duplicates; I received a notice that my letter did not
reach the addressee, so I thought the formatting was incorrect.


On 26.09.2023 12:21, a.rybakina wrote:


I'm sorry I haven't written for a long time; I had a very difficult
month, but now I'm fully back to work.


I was able to implement the patches fully and moved the
transformation of "OR" expressions to ANY. I haven't seen a big
difference between the two approaches yet: one performs the
transformation before calculating selectivity
(v7.1-Replace-OR-clause-to-ANY.patch), the other after
(v7.2-Replace-OR-clause-to-ANY.patch). Regression tests pass, I don't
see any problems with selectivity, and nothing has crashed into a
coredump, but I found some incorrect transformations. I have not found
the reason for these inaccuracies but, to be honest, they look
unusual. The errors are given below.


In the patch, I don't like that I had to drag in three libraries from
the parser until I found a way around it. The advantage of this
approach compared to the other one ([1]) is that all possible OR
transformations are performed at this stage, whereas in that patch the
transformation was done at the parsing stage. For example, there are
optimizations such as the following in the transformation:



The common element is factored out of the brackets and the rest is
converted to ANY; as Peter Geoghegan noted, this gives a single scan
through the array instead of several bitmap scans.


postgres=# explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 AND prolang=1 OR prolang = 13 AND prolang = 2 OR 
prolang = 13 AND prolang = 3;

  QUERY PLAN
---
 Seq Scan on pg_proc p1  (cost=0.00..151.66 rows=1 width=68) (actual 
time=1.167..1.168 rows=0 loops=1)
   Filter: ((prolang = '13'::oid) AND (prolang = ANY (ARRAY['1'::oid, 
'2'::oid, '3'::oid])))

   Rows Removed by Filter: 3302
 Planning Time: 0.146 ms
 Execution Time: 1.191 ms
(5 rows)

While I was testing, I found some transformations that don't work,
although in my opinion they should:

1. First case:
explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 AND prolang=1 OR prolang = 2 AND prolang = 2 OR 
prolang = 13 AND prolang = 13;

QUERY PLAN
--
 Seq Scan on pg_proc p1  (cost=0.00..180.55 rows=2 width=68) (actual 
time=2.959..3.335 rows=89 loops=1)
   Filter: (((prolang = '13'::oid) AND (prolang = '1'::oid)) OR 
((prolang = '2'::oid) AND (prolang = '2'::oid)) OR ((prolang = 
'13'::oid) AND (prolang = '13'::oid)))

   Rows Removed by Filter: 3213
 Planning Time: 1.278 ms
 Execution Time: 3.486 ms
(5 rows)

Should have left only prolang = '13'::oid:

  QUERY PLAN
---
 Seq Scan on pg_proc p1  (cost=0.00..139.28 rows=1 width=68) (actual 
time=2.034..2.034 rows=0 loops=1)

   Filter: ((prolang = '13'::oid ))
   Rows Removed by Filter: 3302
 Planning Time: 0.181 ms
 Execution Time: 2.079 ms
(5 rows)

2. Also does not work:
postgres=# explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;
  QUERY PLAN
---
 Seq Scan on pg_proc p1  (cost=0.00..164.04 rows=176 width=68) (actual 
time=2.422..2.686 rows=89 loops=1)
   Filter: ((prolang = '13'::oid) OR ((prolang = '2'::oid) AND 
(prolang = '2'::oid)) OR (prolang = '13'::oid))

   Rows Removed by Filter: 3213
 Planning Time: 1.370 ms
 Execution Time: 2.799 ms
(5 rows)

Should have left:
Filter: ((prolang = '13'::oid) OR (prolang = '2'::oid))

3. Or another:

explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang=13 OR prolang = 2 AND prolang = 2;
  QUERY PLAN
---
 Seq Scan on pg_proc p1  (cost=0.00..164.04 rows=176 width=68) (actual 
time=2.350..2.566 rows=89 loops=1)
   Filter: ((prolang = '13'::oid) OR (prolang = '13'::oid) OR 
((prolang = '2'::oid) AND (prolang = '2'::oid)))

   Rows Removed by Filter: 3213
 Planning Time: 0.215 ms
 Execution Time: 2.624 ms
(5 rows)

Should have left:
Filter: ((prolang = '13'::oid) OR (prolang = '2'::oid))


This one falls into a coredump for me:
explain analyze SELECT p1.oid, p1.proname
FROM pg_proc as p1
WHERE prolang = 13 OR prolang = 2 AND prolang = 2 OR prolang = 13;

explain analyze SELECT p1.oid, 

Re: OpenSSL v3 performance regressions

2023-09-26 Thread Daniel Gustafsson
> On 26 Sep 2023, at 10:36, Adrien Nayrat  wrote:

> Should Postgres support alternative SSL libraries, as HAProxy does?

PostgreSQL can be built with LibreSSL instead of OpenSSL, which may or may not
be a better option performance-wise for a particular application.  Benchmarking
your workload is key to understanding performance; a lot of the regressions
pointed to in that blog post won't affect Postgres.

--
Daniel Gustafsson





Re: Move global variables of pgoutput to plugin private scope.

2023-09-26 Thread Amit Kapila
On Tue, Sep 19, 2023 at 12:48 PM Zhijie Hou (Fujitsu)
 wrote:
>
> - static bool publish_no_origin;
>
> This flag is also local to the pgoutput instance, and we don't reset the flag
> in the output shutdown callback, so if we consume changes from different
> slots, the second call would reuse the flag value that was set in the first
> call, which is unexpected. To completely avoid this issue, we think we'd
> better move this flag to the output plugin's private data structure.
>
> Example:
>  SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_1', 
> NULL, NULL, 'proto_version', '1', 'publication_names', 'pub', 'origin', 
> 'none'); --- Set origin in this call.
>  SELECT data FROM pg_logical_slot_peek_binary_changes('isolation_slot_2', 
> NULL, NULL, 'proto_version', '1', 'publication_names', 'pub');  -- Didn't set 
> origin, but will reuse the origin flag in the first call.
>

  char*origin;
+ bool publish_no_origin;
 } PGOutputData;

Do we really need a new member in the above structure? Can't we just
use the existing origin in the same structure? Remember that if this
needs to be backpatched, it may not be a good idea to add a new member
to the structure; apart from that, having two members representing
similar information doesn't seem advisable to me. I feel that for the
back branches we can just use PGOutputData->origin for comparison, and
for HEAD we can remove origin and use a boolean instead, to avoid any
extra comparison cost for each change.

Can we add a test case to cover this case?

-- 
With Regards,
Amit Kapila.




OpenSSL v3 performance regressions

2023-09-26 Thread Adrien Nayrat

Hello,


I read this article from HAProxy; they noticed that OpenSSL v3 has huge
performance regressions:

https://github.com/haproxy/wiki/wiki/SSL-Libraries-Support-Status#openssl

This is a known issue : 
https://github.com/openssl/openssl/issues/17627#issuecomment-1060123659


Unfortunately, v3 is shipped with many distributions (Debian, Redhat, 
Rockylinux...) : https://pkgs.org/search/?q=openssl


I am afraid users will face performance issues once they update their
distro.


Has anyone already encountered problems? Should Postgres support
alternative SSL libraries, as HAProxy does?


Regards,

--
Adrien NAYRAT





Re: Regression in COPY FROM caused by 9f8377f7a2

2023-09-26 Thread Laurenz Albe
Here is an improved version of the patch with regression tests.

Yours,
Laurenz Albe
From 71744ada1e2c8cfdbb57e03018572a1af623b09e Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 26 Sep 2023 10:09:49 +0200
Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary

Since commit 9f8377f7a2, we evaluate the column default values in
COPY FROM for all columns except generated ones, because they could
be needed if the input value matches the DEFAULT option.

This can cause a surprising regression:

  CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
  COPY boom FROM STDIN;
  ERROR:  value too long for type character varying(5)

This worked before 9f8377f7a2, since default values were only
evaluated for columns that were not specified in the column list.

To fix, fetch the default values only if the DEFAULT option was
specified or for columns not specified in the column list.
---
 src/backend/commands/copyfrom.c|  9 -
 src/test/regress/expected/copy.out | 17 +
 src/test/regress/sql/copy.sql  | 15 +++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 70871ed819..3f3e631dee 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate,
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
 
-		if (!att->attgenerated)
+		/*
+		 * We need the default values only for columns that do not appear in the
+		 * column list.  But if the DEFAULT option was given, we may need all
+		 * column default values.  We never need defaults for generated columns.
+		 */
+		if ((cstate->opts.default_print != NULL ||
+			 !list_member_int(cstate->attnumlist, attnum)) &&
+			!att->attgenerated)
 		{
 			Expr	   *defexpr = (Expr *) build_column_default(cstate->rel,
 attnum);
diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index 8a8bf43fde..a5912c13a8 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -240,3 +240,20 @@ SELECT * FROM header_copytest ORDER BY a;
 (5 rows)
 
 drop table header_copytest;
+-- test COPY with overlong column defaults
+create temp table oversized_column_default (
+col1 varchar(5) DEFAULT 'more than 5 chars',
+col2 varchar(5));
+-- normal COPY should work
+copy oversized_column_default from stdin;
+-- error if the column is excluded
+copy oversized_column_default (col2) from stdin;
+ERROR:  value too long for type character varying(5)
+\.
+invalid command \.
+-- error if the DEFAULT option is given
+copy oversized_column_default from stdin (default '');
+ERROR:  value too long for type character varying(5)
+\.
+invalid command \.
+drop table oversized_column_default;
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index f9da7b1508..7fdb26d14f 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -268,3 +268,18 @@ a	c	b
 
 SELECT * FROM header_copytest ORDER BY a;
 drop table header_copytest;
+
+-- test COPY with overlong column defaults
+create temp table oversized_column_default (
+col1 varchar(5) DEFAULT 'more than 5 chars',
+col2 varchar(5));
+-- normal COPY should work
+copy oversized_column_default from stdin;
+\.
+-- error if the column is excluded
+copy oversized_column_default (col2) from stdin;
+\.
+-- error if the DEFAULT option is given
+copy oversized_column_default from stdin (default '');
+\.
+drop table oversized_column_default;
-- 
2.41.0



Re: logfmt and application_context

2023-09-26 Thread Étienne BERSAC
Hi Daniel,

Thanks for the feedback.

Le mardi 05 septembre 2023 à 11:35 +0200, Daniel Gustafsson a écrit :
> > On 30 Aug 2023, at 14:36, Étienne BERSAC  wrote:
> 
> > ..what do you think of having logfmt output along json and CSV ?
> 
> Less ideal is
> that there is no official formal definition of what logfmt is [...]  If we add
> support for it, can we reasonably expect that what we emit is what consumers 
> of
> it assume it will look like?

I didn't know logfmt had variations. Do you have an example of an
incompatibility?

Anyway, I think that logfmt would be better handled inside Postgres
rather than in an extension, due to limitations in syslogger
extensibility. I could send a patch if more people are interested in
this.


What do you think about application_context as a way to attach e.g. a
web request UUID to all backend log messages?


Regards,
Étienne





Fail hard if xlogreader.c fails on out-of-memory

2023-09-26 Thread Michael Paquier
Hi all,
(Thomas in CC.)

Now that becfbdd6c1c9 has improved the situation to detect the
difference between out-of-memory and invalid WAL data in WAL, I guess
that it is time to tackle the problem of what we should do when
reading WAL records bit fail on out-of-memory.

To summarize, currently the WAL reader APIs fail the same way if we
detect some incorrect WAL record or if a memory allocation fails: an
error is generated and returned back to the caller to consume.  For
WAL replay, not being able to make the difference between an OOM and
the end-of-wal is a problem in some cases.  For example, in crash
recovery, failing an internal allocation will be detected as the
end-of-wal, causing recovery to stop prematurely.  In the worst cases,
this silently corrupts clusters because not all the records generated
in the local pg_wal/ have been replayed.  Oops.

When in standby mode, things are a bit better, because we'd just loop
and wait for the next record.  But, even in this case, if the startup
process does a crash recovery while standby is set up, we may finish
by attempting recovery from a different source than the local pg_wal/.
Not strictly critical, but less optimal in some cases as we could
switch to archive recovery earlier than necessary.

In a different thread, I have proposed to extend the WAL reader
facility so that an error code is returned, making it possible to
distinguish an OOM from the end of WAL with an incorrect record:
https://www.postgresql.org/message-id/ZRJ-p1dLUY0uoChc%40paquier.xyz

However this requires some ABI changes, so that's not backpatchable.

This leaves out what we can do for the existing back-branches, and
one option is to do the simplest thing I can think of: if an
allocation fails, just fail *hard*.  The allocations of the WAL reader
rely on palloc_extended(), so I'd like to suggest that we switch to
palloc() instead.  If we do so, an ERROR is promoted to a FATAL during
WAL replay, which makes sure that we will never stop recovery earlier
than we should, FATAL-ing before things go wrong.

Note that the WAL prefetching already relies on a palloc() on HEAD in
XLogReadRecordAlloc(), which would fail hard the same way on OOM.

So, attached is a proposal of patch to do something down to 12.

Thoughts and/or comments are welcome.
--
Michael
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index a17263df20..a1363e3b8f 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -43,7 +43,7 @@
 
 static void report_invalid_record(XLogReaderState *state, const char *fmt,...)
 			pg_attribute_printf(2, 3);
-static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
+static void allocate_recordbuf(XLogReaderState *state, uint32 reclength);
 static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
@@ -155,14 +155,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 	 * Allocate an initial readRecordBuf of minimal size, which can later be
 	 * enlarged if necessary.
 	 */
-	if (!allocate_recordbuf(state, 0))
-	{
-		pfree(state->errormsg_buf);
-		pfree(state->readBuf);
-		pfree(state);
-		return NULL;
-	}
-
+	allocate_recordbuf(state, 0);
 	return state;
 }
 
@@ -184,7 +177,6 @@ XLogReaderFree(XLogReaderState *state)
 
 /*
  * Allocate readRecordBuf to fit a record of at least the given length.
- * Returns true if successful, false if out of memory.
  *
  * readRecordBufSize is set to the new buffer size.
  *
@@ -196,7 +188,7 @@ XLogReaderFree(XLogReaderState *state)
  * Note: This routine should *never* be called for xl_tot_len until the header
  * of the record has been fully validated.
  */
-static bool
+static void
 allocate_recordbuf(XLogReaderState *state, uint32 reclength)
 {
 	uint32		newSize = reclength;
@@ -206,15 +198,8 @@ allocate_recordbuf(XLogReaderState *state, uint32 reclength)
 
 	if (state->readRecordBuf)
 		pfree(state->readRecordBuf);
-	state->readRecordBuf =
-		(char *) palloc_extended(newSize, MCXT_ALLOC_NO_OOM);
-	if (state->readRecordBuf == NULL)
-	{
-		state->readRecordBufSize = 0;
-		return false;
-	}
+	state->readRecordBuf = (char *) palloc(newSize);
 	state->readRecordBufSize = newSize;
-	return true;
 }
 
 /*
@@ -505,9 +490,7 @@ XLogReadRecordAlloc(XLogReaderState *state, size_t xl_tot_len, bool allow_oversi
 	/* Not enough space in the decode buffer.  Are we allowed to allocate? */
 	if (allow_oversized)
 	{
-		decoded = palloc_extended(required_space, MCXT_ALLOC_NO_OOM);
-		if (decoded == NULL)
-			return NULL;
+		decoded = palloc(required_space);
 		decoded->oversized = true;
 		return decoded;
 	}
@@ -815,13 +798,7 @@ restart:
 Assert(gotlen <= lengthof(save_copy));
 Assert(gotlen <= state->readRecordBufSize);
 memcpy(save_copy, state->readRecordBuf, gotlen);
-if (!allocate_recordbuf(state, total_len))
-{
-	/* We treat this as 

Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-26 Thread Peter Smith
Here are some comments for patch v2-0001.

==
src/backend/replication/logical/worker.c

1. maybe_reread_subscription

ereport(LOG,
(errmsg("logical replication worker for subscription \"%s\"
will restart because of a parameter change",
MySubscription->name)));

Is this really a "parameter change" though? It might be a stretch to
call the user role change a subscription parameter change. Perhaps
this case needs its own LOG message?

==
src/include/catalog/pg_subscription.h

2.
  char*origin; /* Only publish data originating from the
  * specified origin */
+ bool isownersuperuser; /* Is subscription owner superuser? */
 } Subscription;


Is a new Subscription field member really necessary? The Subscription
already has 'owner' -- why doesn't function
maybe_reread_subscription() just check:

(!superuser_arg(sub->owner) && superuser_arg(MySubscription->owner))

==
src/test/subscription/t/027_nosuperuser.pl

3.
# The apply worker should get restarted after the superuser prvileges are
# revoked for subscription owner alice.

typo

/prvileges/privileges/

~

4.
+# After the user becomes non-superuser the apply worker should be restarted and
+# it should fail with 'password is required' error as password option is not
+# part of the connection string.

/as password option/because the password option/

==
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Regression in COPY FROM caused by 9f8377f7a2

2023-09-26 Thread Laurenz Albe
On Mon, 2023-09-25 at 17:49 -0400, Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 2023-09-25 Mo 11:06, Andrew Dunstan wrote:
> > > On 2023-09-25 Mo 04:59, Laurenz Albe wrote:
> > > > CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
> 
> > Thinking about this a little more, wouldn't it be better if we checked 
> > at the time we set the default that the value is actually valid for the 
> > given column? This is only one manifestation of a problem you could run 
> > into given this table definition.
> 
> I dunno, it seems at least possible that someone would do this
> deliberately as a means of preventing the column from being defaulted.
> In any case, the current behavior has stood for a very long time and
> no one has complained that an error should be thrown sooner.

Moreover, this makes restoring a pg_dump from v15 to v16 fail, which
should never happen.  This is how I got that bug report.

Yours,
Laurenz Albe




Re: Incorrect handling of OOM in WAL replay leading to data loss

2023-09-26 Thread Michael Paquier
On Thu, Aug 10, 2023 at 02:59:07PM +0900, Michael Paquier wrote:
> My apologies if I sounded unclear here.  It seems to me that we should
> wrap the patch on [1] first, and get it backpatched.  At least that
> makes for less conflicts when 0001 gets merged for HEAD when we are
> able to set a proper error code.  (Was looking at it, actually.)

Now that Thomas Munro has addressed the original problem to be able to
trust correctly xl_tot_len with bae868caf22, I am coming back to this
thread.

First, attached is a rebased set:
- 0001 to introduce the new error infra for xlogreader.c with an error
code, so as callers can make the difference between an OOM and an
invalid record.
- 0002 to tweak the startup process.  Once again, I've taken the
approach to make the startup process behave like a standby on crash
recovery: each time that an OOM is found, we loop and retry.
- 0003 to emulate an OOM failure, that can be used with the script
attached to see that we don't stop recovery too early.

>>> I guess that it depends on how much responsiveness one may want.
>>> Forcing a failure on OOM is at least something that users would be
>>> immediately able to act on when we don't run a standby but just
>>> recover from a crash, while a standby would do what it is designed to
>>> do, aka continue to replay what it can see.  One issue with the
>>> wait-and-continue is that a standby may loop continuously on OOM,
>>> which could be also bad if there's a replication slot retaining WAL on
>>> the primary.  Perhaps that's just OK to keep doing that for a
>>> standby.  At least this makes the discussion easier for the sake of
>>> this thread: just consider the case of crash recovery when we don't
>>> have a standby.
>> 
>> Yeah, I'm with you on focusing on crash recovery cases; that's what I
>> intended. Sorry for any confusion.
> 
> Okay, so we're on the same page here, keeping standbys as they are and
> do something for the crash recovery case.

For the crash recovery case, one argument that stood out in my mind is
that causing a hard failure has the disadvantage of forcing users to
redo WAL replay from the last redo position, which may be far away
even if the checkpointer now runs during crash recovery.  What I am
proposing on this thread has the merit to avoid that.  Anyway, let's
discuss more before settling this point for the crash recovery case.

By the way, anything that I am proposing here cannot be backpatched
because of the infrastructure changes required in xlogreader.c, so I am
going to create a second thread with something that could be
backpatched (yeah, likely FATALs on OOM to stop recovery from doing
something bad).
--
Michael
From 3d7bb24fc2f9f070273b63208819c4e54e428d18 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 26 Sep 2023 15:40:05 +0900
Subject: [PATCH v4 1/3] Add infrastructure to report error codes in WAL reader

This commit moves the error state coming from WAL readers into a new
structure that includes the existing pointer to the error message
buffer and also gains an error code that is fed back to the callers of
the following routines:
XLogPrefetcherReadRecord()
XLogReadRecord()
XLogNextRecord()
DecodeXLogRecord()

This will help improve the decisions taken during recovery depending
on the failure mode reported.
---
 src/include/access/xlogprefetcher.h   |   2 +-
 src/include/access/xlogreader.h   |  33 +++-
 src/backend/access/transam/twophase.c |   8 +-
 src/backend/access/transam/xlog.c |   6 +-
 src/backend/access/transam/xlogprefetcher.c   |   4 +-
 src/backend/access/transam/xlogreader.c   | 170 --
 src/backend/access/transam/xlogrecovery.c |  14 +-
 src/backend/access/transam/xlogutils.c|   2 +-
 src/backend/replication/logical/logical.c |   9 +-
 .../replication/logical/logicalfuncs.c|   9 +-
 src/backend/replication/slotfuncs.c   |   8 +-
 src/backend/replication/walsender.c   |   8 +-
 src/bin/pg_rewind/parsexlog.c |  24 +--
 src/bin/pg_waldump/pg_waldump.c   |  10 +-
 contrib/pg_walinspect/pg_walinspect.c |  11 +-
 src/tools/pgindent/typedefs.list  |   2 +
 16 files changed, 201 insertions(+), 119 deletions(-)

diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7dd7f20ad0..5563ad1a67 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -48,7 +48,7 @@ extern void XLogPrefetcherBeginRead(XLogPrefetcher *prefetcher,
 	XLogRecPtr recPtr);
 
 extern XLogRecord *XLogPrefetcherReadRecord(XLogPrefetcher *prefetcher,
-			char **errmsg);
+			XLogReaderError *errordata);
 
 extern void XLogPrefetcherComputeStats(XLogPrefetcher *prefetcher);
 
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index da32c7db77..06664dc6fb 100644
--- a/src/include/access/xlogreader.h
+++ 

Re: Improve the log message output of basic_archive when basic_archive.archive_directory parameter is not set

2023-09-26 Thread Daniel Gustafsson
> On 26 Sep 2023, at 00:20, Nathan Bossart  wrote:
> 
> On Thu, Sep 21, 2023 at 11:18:00AM +0900, bt23nguyent wrote:
>> -basic_archive_configured(ArchiveModuleState *state)
>> +basic_archive_configured(ArchiveModuleState *state, char **logdetail)
> 
> Could we do something more like GUC_check_errdetail() instead to maintain
> backward compatibility with v16?

We'd still need something exported to call into which isn't in 16, so it
wouldn't be more than optically backwards compatible, since a module written
for 17 won't compile for 16.  Or am I missing something?

--
Daniel Gustafsson





Re: [PGdocs] fix description for handling pf non-ASCII characters

2023-09-26 Thread Karl O. Pinc
Hello Hayato and Jian,

On Tue, 4 Jul 2023 01:30:56 +
"Hayato Kuroda (Fujitsu)"  wrote:

> Dear Jian,

> > looks fine. Do you need to add to commitfest?  
> 
> Thank you for your confirmation. ! I registered to following:
> 
> https://commitfest.postgresql.org/44/4437/

The way the Postgres commitfest process works is that
someone has to update the page to mark "reviewed" and the
reviewer has to use the commitfest website to pass
the patches to a committer.

I see a few problems with the English and style of the patches
and am commenting below and have signed up as a reviewer.  At
commitfest.postgresql.org I have marked the thread
as needing author attention.  Hayato, you will need
to mark the thread as needing review when you have
replied to this message.

Jian, you might want to sign on as a reviewer as well.
It can be nice to have that record of your participation.

Now, to reviewing the patch:

First, it is now best practice in the PG docs to
put a line break at the end of each sentence.
At least for the sentences on the lines you change.
(No need to update the whole document!)  Please
do this in the next version of your patch.  I don't
know if this is a requirement for acceptance by
a committer, but it won't hurt.

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e700782d3c..a4ce99ba4d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7040,9 +7040,8 @@ local0.*/var/log/postgresql
 The name will be displayed in the
pg_stat_activity view and included in CSV log
entries.  It can also be included in regular log entries via the  parameter.
-Only printable ASCII characters may be used in the
-application_name value. Other characters
will be
-replaced with question marks (?).
+Non-ASCII characters used in the
application_name
+will be replaced with hexadecimal strings.

   
  

Don't use the future tense to describe the system's behavior.  Instead
of "will be" write "are".  (Yes, this change would be an improvement
on the original text.  We should fix it while we're working on it
and our attention is focused.)

It is more accurate, if I understand the issue, to say that characters
are replaced with hexadecimal _representations_ of the input bytes.
Finally, it would be good to know what representation we're talking
about.  Perhaps just give the \xhh example and say: the Postgres
C-style escaped hexadecimal byte value.  And hyperlink to
https://www.postgresql.org/docs/16/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE

(The docbook would be, depending on the text you want to link:

<link linkend="sql-syntax-strings-escape">C-style escaped hexadecimal
byte value</link>

I think.  You link to id="someidvalue" attribute values.)


@@ -8037,10 +8036,9 @@ COPY postgres_log FROM
'/full/path/to/logfile.csv' WITH csv; 
 The name can be any string of less
 than NAMEDATALEN characters (64 characters in
a standard
-build). Only printable ASCII characters may be used in the
-cluster_name value. Other characters will be
-replaced with question marks (?).  No name
is shown
-if this parameter is set to the empty string
'' (which is
+build). Non-ASCII characters used in the
cluster_name
+will be replaced with hexadecimal strings. No name is shown if
this
+parameter is set to the empty string ''
(which is the default). This parameter can only be set at server start.

   

Same review as for the first patch hunk.

diff --git a/doc/src/sgml/postgres-fdw.sgml
b/doc/src/sgml/postgres-fdw.sgml index 5062d712e7..98785e87ea 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -1067,9 +1067,8 @@ postgres=# SELECT postgres_fdw_disconnect_all();
   of any length and contain even non-ASCII characters.  However
when it's passed to and used as application_name
   in a foreign server, note that it will be truncated to less than
-  NAMEDATALEN characters and anything other than
-  printable ASCII characters will be replaced with question
-  marks (?).
+  NAMEDATALEN characters and non-ASCII characters
will be
+  replaced with hexadecimal strings.
   See  for details.
  
 
Same review as for the first patch hunk.

Since both of you have looked and confirmed that the actual
behavior matches the new documentation, I have not done this
myself.

But, have either of you checked that we're really talking about
replacing everything outside the 7-bit ASCII encodings?  
My reading of the commit referenced in the first email of this
thread says that it's everything outside of the _printable_
ASCII encodings, ASCII values outside of the range 32 to 127,
inclusive.  

Please check.  The docs should probably say "printable ASCII",
or "non-printable ASCII", depending.  I think the meaning
of "printable ASCII" is widely enough known to be 32-127.
So "printable" is good enough, right?

Regards,

Karl 
Free Software:  "You don't pay back, you pay