Re: Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-22 Thread Amit Kapila
On Sat, Sep 23, 2023 at 1:27 AM vignesh C  wrote:
>
>
> Fixed this issue by checking if the subscription owner has changed
> from superuser to non-superuser in case the pg_authid rows changes.
> The attached patch has the changes for the same.
>

@@ -3952,7 +3953,9 @@ maybe_reread_subscription(void)
  newsub->passwordrequired != MySubscription->passwordrequired ||
  strcmp(newsub->origin, MySubscription->origin) != 0 ||
  newsub->owner != MySubscription->owner ||
- !equal(newsub->publications, MySubscription->publications))
+ !equal(newsub->publications, MySubscription->publications) ||
+ (!superuser_arg(MySubscription->owner) &&
+ MySubscription->isownersuperuser))
  {
  if (am_parallel_apply_worker())
  ereport(LOG,
@@ -4605,6 +4608,13 @@ InitializeLogRepWorker(void)
  proc_exit(0);
  }

+ /*
+ * Fetch subscription owner is a superuser. This value will be later
+ * checked to see when there is any change with this role and the worker
+ * will be restarted if required.
+ */
+ MySubscription->isownersuperuser = superuser_arg(MySubscription->owner);

Why didn't you filled this parameter in GetSubscription() like other
parameters? If we do that then the comparison of first change in your
patch will look similar to all other comparisons.

-- 
With Regards,
Amit Kapila.




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-22 Thread Hayato Kuroda (Fujitsu)
Dear Bharath,

> > You mentioned at line 118, but at that time logical replication system is 
> > not
> created.
> > The subscriber is created at line 163.
> > Therefore WALs would not be consumed automatically.
> 
> So, not calling pg_logical_slot_get_changes() on test_slot1 won't
> consume the WAL?

Yes. This slot was created manually and no one activated it automatically.
pg_logical_slot_get_changes() can consume WALs but never called.

> 
> 2.
> +++ b/src/bin/pg_upgrade/t/003_logical_replication_slots.pl
> 
> How about a more descriptive and pointed name for the TAP test file,
> something like 003_upgrade_logical_replication_slots.pl?

Good suggestion. Renamed.

> 3. Does this patch support upgrading of logical replication slots on a
> streaming standby? If yes, isn't it a good idea to add one test for
> upgrading standby with logical replication slots?

IIUC pg_upgrade would not be used for physical standby. The standby would be 
upgrade by:

* Recreating the database cluster, or
* Executing rsync command.

For more detail, please see the documentation.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-22 Thread Hayato Kuroda (Fujitsu)
Dear Bharath,

Again, thank you for reviewing! Here is a new version patch.

> 1.
> +{ oid => '8046', descr => 'for use by pg_upgrade',
> +  proname => 'binary_upgrade_validate_wal_records', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'bool',
> 
> +if (PG_ARGISNULL(0))
> +elog(ERROR, "null argument to
> binary_upgrade_validate_wal_records is not allowed");
> 
> Can proisstrict => 'f' be removed so that there's no need for explicit
> PG_ARGISNULL check? Any specific reason to keep it?

Theoretically it could be, but I was not sure. I think you wanted us to follow
specs of pg_walinspect functions, but it is just a upgrade function. Normally
users cannot call it. Also, as Amit said [1], the caller must consider the
special case. Currently the function returns false at that time, we can change
more appropriate style later.

> And, the before the ISNULL check the arg is read, which isn't good.

Right, fixed.

> 2.
> +Datum
> +binary_upgrade_validate_wal_records(PG_FUNCTION_ARGS)
> 
> The function name looks too generic in the sense that it validates WAL
> records for correctness/corruption, but it is not. Can it be something
> like binary_upgrade_{check_for_wal_logical_end,
> check_for_logical_end_of_wal} or such?

Per discussion [2], changed to binary_upgrade_validate_wal_logical_end.

> 3.
> +/* Quick exit if the given lsn is larger than current one */
> +if (start_lsn >= GetFlushRecPtr(NULL))
> +PG_RETURN_BOOL(false);
> +
> 
> An LSN that doesn't exists yet is an error IMO, may be an error better here?

We think that the invalid slots should be listed at the end, so basically we do
not want to error out. This would be also changed if there are better opinions.

> 4.
> + * This function is used to verify that there are no WAL records (except some
> + * types) after confirmed_flush_lsn of logical slots, which means all the
> + * changes were replicated to the subscriber. There is a possibility that 
> some
> + * WALs are inserted during upgrade, so such types would be ignored.
> + *
> 
> This comment before the function better be at the callsite of the
> function, because as far as this function is concerned, it checks if
> there are any WAL records that are not "certain" types after the given
> LSN, it doesn't know logical slots or confirmed_flush_lsn or such.

Hmm, I think it is better to do the reverse, because otherwise we need to 
mention
the same explanation at other caller of the function if any. So, I have
adjusted the comments atop and at caller. Thought?

> 8.
> +if (nslots_on_new)
> +pg_fatal("expected 0 logical replication slots but found %d",
> + nslots_on_new);
> 
> How about "New cluster database is containing logical replication
> slots", note that the some of the fatal messages are starting with an
> upper-case letter.

I did not use your suggestion, but changed to upper-case.
Actually, the uppper-case rule is broken even in the file. Here I regarded
this sentence as hint message.

> 9.
> +res = executeQueryOrDie(conn, "SHOW wal_level;");
> +res = executeQueryOrDie(conn, "SHOW max_replication_slots;");
> 
> Instead of 2 queries to determine required parameters, isn't it better
> with a single query like the following?
> 
> select setting from pg_settings where name in ('wal_level',
> 'max_replication_slots') order by name;

Modified, but use ORDER BY ... DESC. This come from a previous comment [3].

> 
> 12.
> +extern XLogReaderState *InitXLogReaderState(XLogRecPtr lsn);
> +extern XLogRecord *ReadNextXLogRecord(XLogReaderState *xlogreader);
> +
> 
> Why not these functions be defined in xlogreader.h with elog/ereport
> in #ifndef FRONTEND #endif blocks? IMO, xlogreader.h seems right
> location for these functions.

I checked comments atop both files, and xlogreader.h seems better. Fixed.

> 13.
> +LogicalReplicationSlotInfo
> 
> Where is this structure defined?

Opps, removed.

[1]: 
https://www.postgresql.org/message-id/CAA4eK1LxPDeSkTttEAG2MPEWO%3D83vQe_Bja9F4QcCjVn%3DWt9rA%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/CAA4eK1L9oJmdxprFR3oob5KLpHUnkJAt5Le4woxO3wHz-SZ%2BTA%40mail.gmail.com
[3]: 
https://www.postgresql.org/message-id/CAA4eK1LHH_%3DwbxsEn20%3DW%2Bqz1193OqFj-vvJ-u0uHLMmwLHbRw%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v44-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description:  v44-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch


nbtree's ScalarArrayOp array mark/restore code appears to be buggy

2023-09-22 Thread Peter Geoghegan
Attached test case demonstrates an issue with nbtree's mark/restore
code. All supported versions are affected.

My suspicion is that bugfix commit 70bc5833 missed some subtlety
around what we need to do to make sure that the array keys stay "in
sync" with the scan. I'll have time to debug the problem some more
tomorrow.

My ScalarArrayOp project [1] seems unaffected by the bug, so I don't
expect it'll take long to get to the bottom of this. This is probably
due to its general design, and not any specific detail. The patch
makes the relationship between the current scan position and the
current array keys a great deal looser.

[1] https://commitfest.postgresql.org/44/4455/
-- 
Peter Geoghegan


nbtree_array_mark_restore_bug.sql
Description: Binary data


Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Jeff Davis
On Fri, 2023-09-22 at 08:36 -0400, Robert Haas wrote:
> On Fri, Sep 22, 2023 at 4:25 AM Benoit Lobréau
>  wrote:
> > Can we consider adding something like this to clarify?
> > 
> > """
> > This parameter is enforced when the CREATE SUBSCRIPTION or ALTER
> > SUBSCRIPTION .. CONNECTION commands are executed. Therefore, it's
> > possible to alter the ownership of a subscription with
> > password_required=true to a non-superuser.
> > """
> 
> I'm not sure of the exact wording, but there was another recent
> thread
> complaining about this being unclear, so it seems like some
> clarification is needed.

IIUC there is really one use case here, which is for superuser to
define a subscription including the connection, and then change the
owner to a non-superuser to actually run it (without being able to
touch the connection string itself). I'd just document that in its own
section, and mention a few caveats / mistakes to avoid. For instance,
when the superuser is defining the connection, don't forget to set
password_required=false, so that when you reassign to a non-superuser
then the connection doesn't break.

Regards,
Jeff Davis





Failures on gombessa -- EIO?

2023-09-22 Thread Thomas Munro
I am not sure why REL_16_STABLE fails consistently as of ~4 days ago.
It seems like bad storage or something?  Just now it happened also on
HEAD.  I wonder why it would be sensitive to the branch.




Re: Synchronizing slots from primary to standby

2023-09-22 Thread Amit Kapila
On Fri, Sep 22, 2023 at 6:01 PM Drouvot, Bertrand
 wrote:
>
> Thanks for all the work that has been done on this feature, and sorry
> to have been quiet on it for so long.
>
> On 9/18/23 12:22 PM, shveta malik wrote:
> > On Wed, Sep 13, 2023 at 4:48 PM Hayato Kuroda (Fujitsu)
> >  wrote:
> >> Right, but I wanted to know why it is needed. One motivation seemed to 
> >> know the
> >> WAL location of physical standby, but I thought that struct WalSnd.apply 
> >> could
> >> be also used. Is it bad to assume that the physical walsender always 
> >> exists?
> >>
> >
> > We do not plan to target this case where physical slot is not created
> > between primary and physical-standby in the first draft.  In such a
> > case, slot-synchronization will be skipped for the time being. We can
> > extend this functionality (if needed) later.
> >
>
> I do think it's needed to extend this functionality. Having physical slot
> created sounds like a (too?) strong requirement as:
>
> - It has not been a requirement for Logical decoding on standby so that could 
> sounds weird
> to require it for sync slot (while it's not allowed to logical decode from 
> sync slots)
>

There is a difference here that we also need to prevent removal of
rows required by sync_slots. That could be achieved by physical slot
(and hot_standby_feedback). So, having a requirement to have physical
slot doesn't sound too unreasonable to me. Otherwise, we need to
invent some new mechanism of having some sort of placeholder slot to
avoid removal of required rows. I guess we can always extend the
functionality in later version as Shveta mentioned. Now, if we have
somewhat simpler way to achieve prevention of removal of rows then it
is fine otherwise let's focus on getting other parts correct
considering this is already a reasonably big and complex patch.

Thanks for looking into this work and your feedback will definetely
help in moving this work forward.

-- 
With Regards,
Amit Kapila.




Re: Add support for AT LOCAL

2023-09-22 Thread Vik Fearing

On 9/22/23 23:46, cary huang wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Hello

I think this feature can be a useful addition in dealing with time zones. I 
have applied and tried out the patch, The feature works as described and seems 
promising. The problem with compilation failure was probably reported on 
CirrusCI when compiled on different platforms. I have run the latest patch on 
my own Cirrus CI environment and everything checked out fine.

Thank you


Thank you for reviewing!
--
Vik Fearing





Re: Add support for AT LOCAL

2023-09-22 Thread cary huang
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Hello

I think this feature can be a useful addition in dealing with time zones. I 
have applied and tried out the patch, The feature works as described and seems 
promising. The problem with compilation failure was probably reported on 
CirrusCI when compiled on different platforms. I have run the latest patch on 
my own Cirrus CI environment and everything checked out fine. 

Thank you

Cary Huang
--
Highgo Software Canada
www.highgo.ca

Re: Better help output for pgbench -I init_steps

2023-09-22 Thread Tristen Raab
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hello,

I've reviewed all 4 of your patches, each one applies and builds correctly.

> I think I prefer variant 2.  Currently, we only have 8 steps, so it might 
> be overkill to separate them out into a different option.

+1 to this from Peter. Variant 2 is nicely formatted with lots of information 
which I feel better solves the problem this patch is trying to address. 
Both versions 3 and 4 are a bit too jumbled for my liking without adding 
anything significant, even the shortened version 4.

If we were to go with variant 1 however, I think it would add more to have a 
link to the pgbench documentation that refers to the different init steps. 
Perhaps on a new line just under where it says "see pgbench documentation for a 
description of these steps".

Overall good patch, I'm a firm believer that more information is always better 
than less.

Tristen
---
Software Engineer
HighGo Software Inc. (Canada)
tristen.r...@highgo.ca
www.highgo.ca

Is this a problem in GenericXLogFinish()?

2023-09-22 Thread Jeff Davis


src/backend/transam/README says:

  ...
  4. Mark the shared buffer(s) as dirty with MarkBufferDirty().  (This 
  must happen before the WAL record is inserted; see notes in 
  SyncOneBuffer().)
  ...

But GenericXLogFinish() does this:

  ...
  /* Insert xlog record */
  lsn = XLogInsert(RM_GENERIC_ID, 0);

  /* Set LSN and mark buffers dirty */
  for (i = 0; i < MAX_GENERIC_XLOG_PAGES; i++)
  {
  PageData   *pageData = >pages[i];

  if (BufferIsInvalid(pageData->buffer))
  continue;
  PageSetLSN(BufferGetPage(pageData->buffer), lsn);
  MarkBufferDirty(pageData->buffer);
  }
  END_CRIT_SECTION();

Am I missing something or is that a problem?

Regards,
Jeff Davis





Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-09-22 Thread Jeff Davis
On Thu, 2023-09-21 at 14:06 -0400, Robert Haas wrote:

> Also, in a case like this, I don't think it's unreasonable to ask
> whether, perhaps, Bob just needs to be a little more careful about
> setting search_path.

That's what this whole thread is about: I wish it was reasonable, but I
don't think the tools we provide today make it reasonable. You expect
Bob to do something like:

  CREATE FUNCTION ... SET search_path = pg_catalog, pg_temp ...

for all functions, not just SECURITY DEFINER functions, is that right?

Up until now, we've mostly treated search_path as a problem for
SECURITY DEFINER, and specifying something like that might be
reasonable for a small number of SECURITY DEFINER functions.

But as my example showed, search_path is actually a problem for
SECURITY INVOKER too: an index expression relies on the function
producing the correct results, and it's hard to control that without
controlling the search_path.

>  I think that there is a big difference between
> (a) defining a SQL-language function that is accessible to multiple
> users and (b) inserting a row into a table you don't own. When you
> define a function, you know people are potentially going to call it.

It's a bit problematic that (a) is the default:

   CREATE FUNCTION f(INT) RETURNS INT IMMUTABLE
 LANGUAGE plpgsql
 AS $$ BEGIN RETURN 42+$1; END; $$;
   CREATE TABLE x(i INT);
   CREATE INDEX x_idx ON x(f(i));
   GRANT INSERT ON TABLE x TO u2;

It's not obvious that f() is directly callable by u2 (though it is
documented).

I'm not disagreeing with the principle behind what you say above. My
point is that "accessible to multiple users" is the ordinary default
case, so there's no cue for the user that they need to do something
special to secure function f().

> Asking you, as the function author, to take some care to secure your
> function against a malicious search_path doesn't seem like an
> unsupportable burden.

What you are suggesting has been possible for quite some time. Do you
think users are taking care to do this today? If not, how can we
encourage them to do so?

> You can, I think, be expected to
> check that functions you define have SET search_path attached.

We've already established that even postgres hackers are having
difficulty keeping up with these nuances. Even though the SET clause
has been there for a long time, our documentation on the subject is
insufficient and misleading. And on top of that, it's extra typing and
noise for every schema file. Until we make some changes I don't think
we can expect users to do as you suggest.

Regards,
Jeff Davis





Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Robert Haas
On Fri, Sep 22, 2023 at 10:59 AM Benoit Lobréau
 wrote:
> You're right, it comes from the connection to drop the slot.
>
> But the code in for DropSubscription in
> src/backend/commands/subscriptioncmds.c tries to connect before testing
> if the slot is NONE / NULL. So it doesn't work to DISABLE the
> subscription and set the slot to NONE.

So I'm seeing this:

if (!slotname && rstates == NIL)
{
table_close(rel, NoLock);
return;
}

load_file("libpqwalreceiver", false);

wrconn = walrcv_connect(conninfo, true, must_use_password,
subname, );

That looks like it's intended to return if there's nothing to do, and
the comments say as much. But that (!slotname && rstates == NIL) test
looks sketchy. It seems like we should bail out early if *either*
!slotname *or* rstates == NIL, or for that matter if all of the
rstates have rstate->relid == 0 or rstate->state ==
SUBREL_STATE_SYNCDONE. Maybe we need to push setting up the connection
inside the foreach(lc, rstates) loop and do it only once we're sure we
want to do something. Or at least, I don't understand why we don't
bail out immediately in all cases where slotname is NULL, regardless
of rstates. Am I missing something here?

> Reading the code, I think I understand why the postgres user cannot drop
> the slot:
>
> * the owner is sub_owner (not a superuser)
> * and form->subpasswordrequired is true
>
> Should there be a test to check if the user executing the query is
> superuser? maybe it's handled differently? (I am not very familiar with
> the code).

I think that there normally shouldn't be any problem here, because if
form->subpasswordrequired is true, we expect that the connection
string should contain a password which the remote side actually uses,
or we expect the subscription to be owned by the superuser. If neither
of those things is the case, then either the superuser made a
subscription that doesn't use a password owned by a non-superuser
without fixing subpasswordrequired, or else the configuration on the
remote side has changed so that it now doesn't use the password when
formerly it did. In the first case, perhaps it would be fine to go
ahead and drop the slot, but in the second case I don't think it's OK
from a security point view, because the command is going to behave the
same way no matter who executes the drop command, and a non-superuser
who drops the slot shouldn't be permitted to rely on the postgres
processes's identity to do anything on a remote node -- including
dropping a relation slot. So I tentatively think that this behavior is
correct.

> I dont understand (yet?) why I can do ALTER SUBSCRIPTIONs after changing
> the ownership to an unpriviledged user (must_use_password should be true
> also in that case).

Maybe you're altering it in a way that doesn't involve any connections
or any changes to the connection string? There's no security issue if,
say, you rename it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Invalidate the subscription worker in cases where a user loses their superuser status

2023-09-22 Thread vignesh C
Hi,

The subscription worker was not getting invalidated when the
subscription owner changed from superuser to non-superuser. Here is a
test case for the same:
Publisher:
   CREATE USER repl REPLICATION PASSWORD 'secret';
   CREATE TABLE t(i INT);
   INSERT INTO t VALUES(1);
   GRANT SELECT ON t TO repl;
   CREATE PUBLICATION p1 FOR TABLE t;

Subscriber (has a PGPASSFILE for user "repl"):
   CREATE USER u1 SUPERUSER;
   CREATE TABLE t(i INT);
   ALTER TABLE t OWNER TO u1;
   -- no password specified
   CREATE SUBSCRIPTION s1
 CONNECTION 'dbname=postgres host=127.0.0.1 port=5432 user=repl'
 PUBLICATION p1;

   ALTER USER u1 NOSUPERUSER: -- Change u1 user to non-superuser

Publisher:
INSERT INTO t VALUES(1);

Subscriber:
SELECT COUNT(*) FROM t; -- should have been 1 but is 2, the apply
worker has not exited after changing from superuser to non-superuser.

Fixed this issue by checking if the subscription owner has changed
from superuser to non-superuser in case the pg_authid rows changes.
The attached patch has the changes for the same.
Thanks to Jeff Davis for identifying this issue and reporting it at [1].

[1] - 
https://www.postgresql.org/message-id/5dff4caf26f45ce224a33a5e18e110b93a351b2f.camel%40j-davis.com

Regards,
Vignesh
From 9347e863cc78d328f7556db0e357787d14feae75 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 22 Sep 2023 15:12:23 +0530
Subject: [PATCH v1] Restart the apply worker if the subscription owner has
 changed from superuser to non-superuser.

Restart the apply worker if the subscription owner has changed from
superuser to non-superuser. This is required so that the subscription
connection string gets revalidated to identify cases where the
password option is not specified as part of the connection string for
non-superuser.
---
 src/backend/replication/logical/worker.c   | 23 +++---
 src/include/catalog/pg_subscription.h  |  1 +
 src/test/subscription/t/027_nosuperuser.pl | 21 
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 597947410f..29197d86cb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3942,7 +3942,8 @@ maybe_reread_subscription(void)
 	 * The launcher will start a new worker but note that the parallel apply
 	 * worker won't restart if the streaming option's value is changed from
 	 * 'parallel' to any other value or the server decides not to stream the
-	 * in-progress transaction.
+	 * in-progress transaction. Exit if the owner of the subscription has
+	 * changed from superuser to a non-superuser.
 	 */
 	if (strcmp(newsub->conninfo, MySubscription->conninfo) != 0 ||
 		strcmp(newsub->name, MySubscription->name) != 0 ||
@@ -3952,7 +3953,9 @@ maybe_reread_subscription(void)
 		newsub->passwordrequired != MySubscription->passwordrequired ||
 		strcmp(newsub->origin, MySubscription->origin) != 0 ||
 		newsub->owner != MySubscription->owner ||
-		!equal(newsub->publications, MySubscription->publications))
+		!equal(newsub->publications, MySubscription->publications) ||
+		(!superuser_arg(MySubscription->owner) &&
+		 MySubscription->isownersuperuser))
 	{
 		if (am_parallel_apply_worker())
 			ereport(LOG,
@@ -4605,6 +4608,13 @@ InitializeLogRepWorker(void)
 		proc_exit(0);
 	}
 
+	/*
+	 * Fetch subscription owner is a superuser. This value will be later
+	 * checked to see when there is any change with this role and the worker
+	 * will be restarted if required.
+	 */
+	MySubscription->isownersuperuser = superuser_arg(MySubscription->owner);
+
 	MySubscriptionValid = true;
 	MemoryContextSwitchTo(oldctx);
 
@@ -4621,11 +4631,18 @@ InitializeLogRepWorker(void)
 	SetConfigOption("synchronous_commit", MySubscription->synccommit,
 	PGC_BACKEND, PGC_S_OVERRIDE);
 
-	/* Keep us informed about subscription changes. */
+	/*
+	 * Keep us informed about subscription changes or pg_authid rows.
+	 * (superuser can become non-superuser.)
+	 */
 	CacheRegisterSyscacheCallback(SUBSCRIPTIONOID,
   subscription_change_cb,
   (Datum) 0);
 
+	CacheRegisterSyscacheCallback(AUTHOID,
+  subscription_change_cb,
+  (Datum) 0);
+
 	if (am_tablesync_worker())
 		ereport(LOG,
 (errmsg("logical replication table synchronization worker for subscription \"%s\", table \"%s\" has started",
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index be36c4a820..87f6f644a9 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -144,6 +144,7 @@ typedef struct Subscription
 	List	   *publications;	/* List of publication names to subscribe to */
 	char	   *origin;			/* Only publish data originating from the
  * specified origin */
+	bool		isownersuperuser;	/* Is subscription owner superuser? */
 } Subscription;
 
 /* Disallow streaming in-progress transactions. */
diff --git 

Re: GenBKI emits useless open;close for catalogs without rows

2023-09-22 Thread Matthias van de Meent
On Fri, 22 Sept 2023 at 00:25, Andres Freund  wrote:
>
> Hi,
>
> On 2023-09-19 21:05:41 +0300, Heikki Linnakangas wrote:
> > On 18/09/2023 17:50, Matthias van de Meent wrote:
> > > (initdb takes about 73ms locally with syncing disabled)
> >
> > That's impressive. It takes about 600 ms on my laptop. Of which about 140 ms
> > goes into processing the BKI file. And that's with "initdb -no-sync" option.
>
> I think there must be a digit missing in Matthias' numbers.

Yes, kind of. The run was on 50 iterations, not the assumed 250.
Also note that the improved measurements were recorded inside the
boostrap-mode PostgreSQL instance, not inside the initdb that was
processing the postgres.bki file. So it might well be that I didn't
improve the total timing by much.

> > > Various methods of reducing the size of postgres.bki were applied, as
> > > detailed in the patch's commit message. I believe the current output
> > > is still quite human readable.
> >
> > Overall this does not seem very worthwhile to me.
>
> Because the wins are too small?
>
> FWIW, Making postgres.bki smaller and improving bootstrapping time does seem
> worthwhile to me. But it doesn't seem quite right to handle the batching in
> the file format, it should be on the server side, no?

The main reason I did batching in the file format is to reduce the
storage overhead of the current one "INSERT" per row. Batching
improved that by replacing the token with a different construct, but
it's not neccessarily the only solution. The actual parser still
inserts the tuples one by one in the relation, as I didn't spend time
on making a simple_heap_insert analog for bulk insertions.

> We really should stop emitting WAL during initdb...

I think it's quite elegant that we're able to bootstrap the relation
data of a new PostgreSQL cluster from the WAL generated in another
cluster, even if it is indeed a bit wasteful. I do see your point
though - the WAL shouldn't be needed if we're already fsyncing the
files to disk.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: GUC for temporarily disabling event triggers

2023-09-22 Thread Daniel Gustafsson
> On 7 Sep 2023, at 21:02, Robert Haas  wrote:
> 
> On Thu, Sep 7, 2023 at 1:57 AM Michael Paquier  wrote:
>> +   GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
>> 
>> I am a bit surprised by these two additions.  Setting this GUC at
>> file-level can be useful, as is documenting it in the control file if
>> it provides some control of how a statement behaves, no?
> 
> Yeah, I don't think these options should be used.

Removed.

>> +   Allow temporarily disabling execution of event triggers in order to
>> +   troubleshoot and repair faulty event triggers. All event triggers 
>> will
>> +   be disabled by setting it to true. Setting the 
>> value
>> +   to false will not disable any event triggers, this
>> +   is the default value. Only superusers and users with the appropriate
>> +   SET privilege can change this setting.
>> 
>> Event triggers are disabled if setting this GUC to false, while true,
>> the default, allows event triggers.  The values are reversed in this
>> description.
> 
> Woops.

Fixed.

Since the main driver for this is to reduce the usage/need for single-user mode
I also reworded the patch slightly.  Instead of phrasing this as an alternative
to single-user mode, I reversed it such that single-user mode is an alternative
to this GUC.

--
Daniel Gustafsson



v9-0001-Add-GUC-for-temporarily-disabling-event-triggers.patch
Description: Binary data


Re: GenBKI emits useless open;close for catalogs without rows

2023-09-22 Thread Matthias van de Meent
On Tue, 19 Sept 2023 at 20:05, Heikki Linnakangas  wrote:
>
> On 18/09/2023 17:50, Matthias van de Meent wrote:
> > (initdb takes about 73ms locally with syncing disabled)
>
> That's impressive. It takes about 600 ms on my laptop. Of which about
> 140 ms goes into processing the BKI file. And that's with "initdb
> -no-sync" option.

Hmm, yes, I misinterpreted my own benchmark setup, the actual value
would be somewhere around 365ms: I thought I was doing 50*50 runs in
one timed run, but really I was doing only 50 runs. TO add insult to
injury, I divided the total time taken by 250 instead of either 50 or
2500... Thanks for correcting me on that.

> > Various methods of reducing the size of postgres.bki were applied, as
> > detailed in the patch's commit message. I believe the current output
> > is still quite human readable.
>
> Overall this does not seem very worthwhile to me.

Reducing the size of redistributables sounds worthwhile to me, but if
none of these changes are worth the effort, then alright, nothing
gained, only time lost.

> Looking at "perf" profile of initdb, I also noticed that a small but
> measurable amount of time is spent in the "isatty(0)" call in do_end().
> Does anyone care about doing bootstrap mode interactively? We could
> probably remove that.

Yeah, that sounds like a good idea.

Kind regards,

Matthias van de Meent




Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Benoit Lobréau

On 9/22/23 14:36, Robert Haas wrote:

I haven't checked this, but I think what's happening here is that DROP
SUBSCRIPTION tries to drop the remote slot, which requires making a
connection, which can trigger the error. You might get different
results if you did ALTER SUBSCRIPTION ... SET (slot_name = none)
first.


You're right, it comes from the connection to drop the slot.

But the code in for DropSubscription in 
src/backend/commands/subscriptioncmds.c tries to connect before testing 
if the slot is NONE / NULL. So it doesn't work to DISABLE the 
subscription and set the slot to NONE.



  1522 >~~~must_use_password = !superuser_arg(subowner) && 
form->subpasswordrequired;

  ...
  1685 >~~~wrconn = walrcv_connect(conninfo, true, must_use_password,
 1 >~~~>~~~>~~~>~~~>~~~>~~~>~~~subname, );
 2 >~~~if (wrconn == NULL)
 3 >~~~{
 4 >~~~>~~~if (!slotname)
 5 >~~~>~~~{
 6 >~~~>~~~>~~~/* be tidy */
 7 >~~~>~~~>~~~list_free(rstates);
 8 >~~~>~~~>~~~table_close(rel, NoLock);
 9 >~~~>~~~>~~~return;
10 >~~~>~~~}
11 >~~~>~~~else
12 >~~~>~~~{
13 >~~~>~~~>~~~ReportSlotConnectionError(rstates, subid, slotname, 
err);

14 >~~~>~~~}
15 >~~~}


Reading the code, I think I understand why the postgres user cannot drop 
the slot:


* the owner is sub_owner (not a superuser)
* and form->subpasswordrequired is true

Should there be a test to check if the user executing the query is 
superuser? maybe it's handled differently? (I am not very familiar with 
the code).


I dont understand (yet?) why I can do ALTER SUBSCRIPTIONs after changing 
the ownership to an unpriviledged user (must_use_password should be true 
also in that case).


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Row pattern recognition

2023-09-22 Thread Jacob Champion
On Fri, Sep 22, 2023, 3:13 AM Tatsuo Ishii  wrote:

> > Op 9/22/23 om 07:16 schreef Tatsuo Ishii:
> >>> Attached is the fix against v6 patch. I will include this in upcoming
> >>> v7 patch.
> >> Attached is the v7 patch. It includes the fix mentioned above.  Also
> > (Champion's address bounced; removed)
>
> On my side his adress bounced too:-<
>

Yep. I'm still here, just lurking for now. It'll take a little time for me
to get back to this thread, as my schedule has changed significantly. :D

Thanks,
--Jacob

>


Re: Index range search optimization

2023-09-22 Thread Alexander Korotkov
Hi Peter,
Hi Pavel,

The v4 of the patch is attached.

On Thu, Sep 21, 2023 at 11:48 PM Peter Geoghegan  wrote:
>
> On Thu, Sep 21, 2023 at 5:11 AM Pavel Borisov  wrote:
> > I looked at the patch code and I agree with this optimization.
> > Implementation also looks good to me except change :
> > + if (key->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD) &&
> > + !(key->sk_flags & SK_ROW_HEADER))
> > + requiredDir = true;
> > ...
> > - if ((key->sk_flags & SK_BT_REQFWD) &&
> > - ScanDirectionIsForward(dir))
> > - *continuescan = false;
> > - else if ((key->sk_flags & SK_BT_REQBKWD) &&
> > - ScanDirectionIsBackward(dir))
> > + if (requiredDir)
> >   *continuescan = false;
> >
> > looks like changing behavior in the case when key->sk_flags &
> > SK_BT_REQFWD && (! ScanDirectionIsForward(dir)) &&
> > (!requiredDirMatched)
> > Originally it doesn't set *continuescan = false; and with the patch it will 
> > set.
>
> I agree that this is a problem. Inequality strategy scan keys are used
> when the initial positioning strategy used by _bt_first (for its
> _bt_search call) is based on an operator other than the "=" operator
> for the opclass. These scan keys are required in one direction only
> (Konstantin's original patch just focussed on these cases, actually).
> Obviously, that difference matters. I don't think that this patch
> should do anything that even looks like it might be revising the
> formal definition of "required in the current scan direction".

Sorry, that was messed up from various attempts to write the patch.
Actually, I end up with two boolean variables indicating whether the
current key is required for the same direction or opposite direction
scan.  I believe that the key required for the opposite direction scan
should be already satisfied by _bt_first() except for NULLs case.
I've implemented a skip of calling the key function for this case
(with assert that result is the same).

> Why is SK_ROW_HEADER treated as a special case by the patch? Could it
> be related to the issues with required-ness and scan direction? Note
> that we never use BTEqualStrategyNumber for SK_ROW_HEADER scan key row
> comparisons, so they're only ever required for one scan direction.
> (Equality-type row constructor syntax can of course be used without
> preventing the system from using an index scan, but the nbtree code
> will not see that case as a row comparison in the first place. This is
> due to preprocessing by the planner -- nbtree just sees conventional
> scan keys with multiple simple equality scan keys with = row
> comparisons.)

The thing is that NULLs could appear in the middle of matching values.

# WITH t (a, b) AS (VALUES ('a', 'b'), ('a', NULL), ('b', 'a'))
SELECT a, b, (a, b) > ('a', 'a') FROM t ORDER BY (a, b);
 a |  b   | ?column?
---+--+--
 a | b| t
 a | NULL | NULL
 b | a| t
(3 rows)

So we can't just skip the row comparison operator, because we can meet
NULL at any place.

> > This may be relevant for the first page when requiredDirMatched is
> > intentionally skipped to be set and for call
> > _bt_checkkeys(scan, itup, truncatt, dir, , false);
>
> Also, requiredDirMatched isn't initialized by _bt_readpage() when
> "so->firstPage". Shouldn't it be initialized to false?
>
> Also, don't we need to take more care with a fully empty page? The "if
> (!so->firstPage) ... " block should be gated using a condition such as
> "if (!so->firstPage && minoff < maxoff)". (Adding a "minoff <= maxoff"
> test would also work, but then the optimization will get applied on
> pages with only one non-pivot tuple. That would be harmless, but a
> waste of cycles.)

This makes sense.  I've added (minoff < maxoff) to the condition.

> > Also naming of requiredDirMatched and requiredDir seems semantically
> > hard to understand the meaning without looking at the patch commit
> > message. But I don't have better proposals yet, so maybe it's
> > acceptable.
>
> I agree. How about "requiredMatchedByPrecheck" instead of
> "requiredDirMatched", and "required" instead of "requiredDir"?
>
> It would be nice if this patch worked in a way that could be verified
> by an assertion. Under this scheme, the optimization would only really
> be used in release builds (builds without assertions enabled, really).
> We'd only verify that the optimized case agreed with the slow path in
> assert-enabled builds. It might also make sense to always "apply the
> optimization" on assert-enabled builds, even for the first page seen
> by _bt_readpage by any _bt_first-wise scan. Maybe this sort of
> approach is impractical here for some reason, but I don't see why it
> should be.

Yes, this makes sense.  I've added an assert check that results are
the same as with requiredMatchedByPrecheck == false.

> Obviously, the optimization should lower the amount of work in some
> calls to _bt_checkkeys, without ever changing the answer _bt_checkkeys
> gives. Ideally, it should be done in a way that makes that very
> obvious. There are some 

Re: Index range search optimization

2023-09-22 Thread Pavel Borisov
On Fri, 22 Sept 2023 at 00:48, Peter Geoghegan  wrote:
>
> On Thu, Sep 21, 2023 at 5:11 AM Pavel Borisov  wrote:
> > I looked at the patch code and I agree with this optimization.
> > Implementation also looks good to me except change :
> > + if (key->sk_flags & (SK_BT_REQFWD | SK_BT_REQBKWD) &&
> > + !(key->sk_flags & SK_ROW_HEADER))
> > + requiredDir = true;
> > ...
> > - if ((key->sk_flags & SK_BT_REQFWD) &&
> > - ScanDirectionIsForward(dir))
> > - *continuescan = false;
> > - else if ((key->sk_flags & SK_BT_REQBKWD) &&
> > - ScanDirectionIsBackward(dir))
> > + if (requiredDir)
> >   *continuescan = false;
> >
> > looks like changing behavior in the case when key->sk_flags &
> > SK_BT_REQFWD && (! ScanDirectionIsForward(dir)) &&
> > (!requiredDirMatched)
> > Originally it doesn't set *continuescan = false; and with the patch it will 
> > set.
>
> I agree that this is a problem. Inequality strategy scan keys are used
> when the initial positioning strategy used by _bt_first (for its
> _bt_search call) is based on an operator other than the "=" operator
> for the opclass. These scan keys are required in one direction only
> (Konstantin's original patch just focussed on these cases, actually).
> Obviously, that difference matters. I don't think that this patch
> should do anything that even looks like it might be revising the
> formal definition of "required in the current scan direction".
I think it's the simplification that changed code behavior - just an
overlook and this could be fixed easily.

> Also, requiredDirMatched isn't initialized by _bt_readpage() when
> "so->firstPage". Shouldn't it be initialized to false?
True.

> > Also naming of requiredDirMatched and requiredDir seems semantically
> > hard to understand the meaning without looking at the patch commit
> > message. But I don't have better proposals yet, so maybe it's
> > acceptable.
>
> I agree. How about "requiredMatchedByPrecheck" instead of
> "requiredDirMatched", and "required" instead of "requiredDir"?
For me, the main semantic meaning is omitted and even more unclear,
i.e. what exactly required and matched. I'd  suppose scanDirRequired,
scanDirMatched, but feel it's not ideal either. Or maybe trySkipRange,
canSkipRange etc.

Regards,
Pavel Borisov,
Supabase.




Re: Use virtual tuple slot for Unique node

2023-09-22 Thread Heikki Linnakangas
I did a little more perf testing with this. I'm seeing the same benefit 
with the query you posted. But can we find a case where it's not 
beneficial? If I understand correctly, when the input slot is a virtual 
slot, it's cheaper to copy it to another virtual slot than to form a 
minimal tuple. Like in your test case. What if the input is a minimial 
tuple?


On master:

postgres=# set enable_hashagg=off;
SET
postgres=# explain analyze select distinct g::text, 'a', 'b', 'c','d', 
'e','f','g','h' from generate_series(1, 500) g;


QUERY PLAN 


---
 Unique  (cost=2630852.42..2655852.42 rows=200 width=288) (actual 
time=4525.212..6576.992 rows=500 loops=1)
   ->  Sort  (cost=2630852.42..2643352.42 rows=500 width=288) 
(actual time=4525.211..5960.967 rows=500 loops=1)

 Sort Key: ((g)::text)
 Sort Method: external merge  Disk: 165296kB
 ->  Function Scan on generate_series g  (cost=0.00..75000.00 
rows=500 width=288) (actual time=518.914..1194.702 rows=500 loops=1)

 Planning Time: 0.036 ms
 JIT:
   Functions: 5
   Options: Inlining true, Optimization true, Expressions true, 
Deforming true
   Timing: Generation 0.242 ms (Deform 0.035 ms), Inlining 63.457 ms, 
Optimization 29.764 ms, Emission 20.592 ms, Total 114.056 ms

 Execution Time: 6766.399 ms
(11 rows)


With the patch:

postgres=# set enable_hashagg=off;
SET
postgres=# explain analyze select distinct g::text, 'a', 'b', 'c','d', 
'e','f','g','h' from generate_series(1, 500) g;


QUERY PLAN 


---
 Unique  (cost=2630852.42..2655852.42 rows=200 width=288) (actual 
time=4563.639..7362.467 rows=500 loops=1)
   ->  Sort  (cost=2630852.42..2643352.42 rows=500 width=288) 
(actual time=4563.637..6069.000 rows=500 loops=1)

 Sort Key: ((g)::text)
 Sort Method: external merge  Disk: 165296kB
 ->  Function Scan on generate_series g  (cost=0.00..75000.00 
rows=500 width=288) (actual time=528.060..1191.483 rows=500 loops=1)

 Planning Time: 0.720 ms
 JIT:
   Functions: 5
   Options: Inlining true, Optimization true, Expressions true, 
Deforming true
   Timing: Generation 0.406 ms (Deform 0.065 ms), Inlining 68.385 ms, 
Optimization 21.656 ms, Emission 21.033 ms, Total 111.480 ms

 Execution Time: 7585.306 ms
(11 rows)


So not a win in this case. Could you peek at the outer slot type, and 
use the same kind of slot for the Unique's result? Or some more 
complicated logic, like use a virtual slot if all the values are 
pass-by-val? I'd also like to keep this simple, though...


Would this kind of optimization make sense elsewhere?

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: bug fix and documentation improvement about vacuumdb

2023-09-22 Thread Daniel Gustafsson
> On 22 Sep 2023, at 11:08, Kuwamura Masaki  
> wrote:
> 
> No worries at all.  If you look at the page now you will see all green
> checkmarks indicating that the patch was tested in CI.  So now we know that
> your tests fail without the fix and work with the fix applied, so all is well.
> 
> Thank you for your kind words! 
> 
> And it seems to me that all concerns are resolved.
> I'll change the patch status to Ready for Committer.
> If you have or find any flaw, let me know that.

I had a look at this and tweaked the testcase a bit to make the diff smaller,
as well as removed the (in some cases) superfluous space in the generated SQL
query mentioned upthread.  The attached two patches is what I propose we commit
to fix this, with a backpatch to v16 where it was introduced.

--
Daniel Gustafsson



v3-0002-vacuumdb-Reword-help-message-for-clarity.patch
Description: Binary data


v3-0001-vacuumdb-Fix-excluding-multiple-schemas-with-N.patch
Description: Binary data


Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Robert Haas
On Fri, Sep 22, 2023 at 4:25 AM Benoit Lobréau
 wrote:
> Can we consider adding something like this to clarify?
>
> """
> This parameter is enforced when the CREATE SUBSCRIPTION or ALTER
> SUBSCRIPTION .. CONNECTION commands are executed. Therefore, it's
> possible to alter the ownership of a subscription with
> password_required=true to a non-superuser.
> """

I'm not sure of the exact wording, but there was another recent thread
complaining about this being unclear, so it seems like some
clarification is needed.

[ adding Jeff Davis in case he wants to weigh in here ]

> Is the DROP SUBSCRIPTION failure in password_required.log expected for
> both superuser and non-superuser?
>
> Is the DROP SUBSCRIPTION success in password_required2.log expected?
> (i.e., with password_require=false, the only action a non-superuser can
> perform is dropping the subscription. Since they own it, it is
> understandable).

I haven't checked this, but I think what's happening here is that DROP
SUBSCRIPTION tries to drop the remote slot, which requires making a
connection, which can trigger the error. You might get different
results if you did ALTER SUBSCRIPTION ... SET (slot_name = none)
first.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Synchronizing slots from primary to standby

2023-09-22 Thread Drouvot, Bertrand

Hi,

Thanks for all the work that has been done on this feature, and sorry
to have been quiet on it for so long.

On 9/18/23 12:22 PM, shveta malik wrote:

On Wed, Sep 13, 2023 at 4:48 PM Hayato Kuroda (Fujitsu)
 wrote:

Right, but I wanted to know why it is needed. One motivation seemed to know the
WAL location of physical standby, but I thought that struct WalSnd.apply could
be also used. Is it bad to assume that the physical walsender always exists?



We do not plan to target this case where physical slot is not created
between primary and physical-standby in the first draft.  In such a
case, slot-synchronization will be skipped for the time being. We can
extend this functionality (if needed) later.



I do think it's needed to extend this functionality. Having physical slot
created sounds like a (too?) strong requirement as:

- It has not been a requirement for Logical decoding on standby so that could 
sounds weird
to require it for sync slot (while it's not allowed to logical decode from sync 
slots)

- One could want to limit the WAL space used on the primary

It seems that the "skipping sync as primary_slot_name not set." warning message 
is emitted
every 10ms, that seems too verbose to me.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




RE: pg_ctl start may return 0 even if the postmaster has been already started on Windows

2023-09-22 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san,

Thank you for making a patch! They can pass ci.
I'm still not sure what should be, but I can respond a part.

> Another issue is.. that I haven't been able to cause the false
> positive of pg_ctl start..  Do you have a concise reproducer of the
> issue?

I found a short sleep in pg_ctl/t/001_start_stop.pl. This was introduced in
6bcce2580 to ensure waiting more than 2 seconds. I've tested on my CI and
found that removing the sleep can trigger the failure. Also, I confirmed your 
patch
fixes the problem. PSA the small patch for cfbot. 0001 and 0002 were not 
changed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


v2-0001-Disable-autoruns-for-cmd.exe-on-Windows.patch
Description: v2-0001-Disable-autoruns-for-cmd.exe-on-Windows.patch


v2-0002-Improve-pg_ctl-postmaster-process-check-on-Window.patch
Description:  v2-0002-Improve-pg_ctl-postmaster-process-check-on-Window.patch


v2-0003-Remove-short-sleep-from-001_start_stop.pl.patch
Description: v2-0003-Remove-short-sleep-from-001_start_stop.pl.patch


Re: Refactor ssl tests to avoid using internal PostgreSQL::Test::Cluster methods

2023-09-22 Thread Daniel Gustafsson
> On 3 Jul 2023, at 13:37, Heikki Linnakangas  wrote:

> If "pg_ctl restart" fails because of a timeout, for example, the PID file 
> could be present. Previously, the _update_pid(0) would error out on that, but 
> now it's accepted. I think that's fine, but just wanted to point it out.

Revisiting an old patch.  Agreed, I think that's fine, so I went ahead and
pushed this.  Thanks for review!

It's unfortunately not that common for buildfarm animals to run the SSL tests,
so I guess I'll keep an extra eye on the CFBot for this one.

--
Daniel Gustafsson





Re: Row pattern recognition

2023-09-22 Thread Erik Rijkers

Op 9/22/23 om 12:12 schreef Tatsuo Ishii:

Op 9/22/23 om 07:16 schreef Tatsuo Ishii:

Attached is the fix against v6 patch. I will include this in upcoming
v7 patch.

Attached is the v7 patch. It includes the fix mentioned above.  Also

(Champion's address bounced; removed)


On my side his adress bounced too:-<


Hi,

In my hands, make check fails on the rpr test; see attached .diff
file.
In these two statements:
-- using NEXT
-- using AFTER MATCH SKIP TO NEXT ROW
   result of first_value(price) and next_value(price) are empty.


Strange. I have checked out fresh master branch and applied the v7
patches, then ran make check. All tests including the rpr test
passed. This is Ubuntu 20.04.


The curious thing is that the server otherwise builds ok, and if I 
explicitly run on that server 'CREATE TEMP TABLE stock' + the 20 INSERTS 
 (just to make sure to have known data), those two statements now both 
return the correct result.


So maybe the testing/timing is wonky (not necessarily the server).

Erik



Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp





Re: logical decoding and replication of sequences, take 2

2023-09-22 Thread Dilip Kumar
On Wed, Sep 20, 2023 at 3:23 PM Dilip Kumar  wrote:
>
> On Wed, Aug 16, 2023 at 7:57 PM Tomas Vondra
>  wrote:
> >
>
> I was reading through 0001, I noticed this comment in
> ReorderBufferSequenceIsTransactional() function
>
> + * To decide if a sequence change should be handled as transactional or 
> applied
> + * immediately, we track (sequence) relfilenodes created by each transaction.
> + * We don't know if the current sub-transaction was already assigned to the
> + * top-level transaction, so we need to check all transactions.
>
> It says "We don't know if the current sub-transaction was already
> assigned to the top-level transaction, so we need to check all
> transactions". But IIRC as part of the steaming of in-progress
> transactions we have ensured that whenever we are logging the first
> change by any subtransaction we include the top transaction ID in it.
>
> Refer this code
>
> LogicalDecodingProcessRecord(LogicalDecodingContext *ctx,
> XLogReaderState *record)
> {
> ...
> /*
> * If the top-level xid is valid, we need to assign the subxact to the
> * top-level xact. We need to do this for all records, hence we do it
> * before the switch.
> */
> if (TransactionIdIsValid(txid))
> {
> ReorderBufferAssignChild(ctx->reorder,
> txid,
> XLogRecGetXid(record),
> buf.origptr);
> }
> }

Some more comments

1.
ReorderBufferSequenceIsTransactional and ReorderBufferSequenceGetXid
are duplicated except the first one is just confirming whether
relfilelocator was created in the transaction or not and the other is
returning the XID as well so I think these two could be easily merged
so that we can avoid duplicate codes.

2.
/*
+ * ReorderBufferTransferSequencesToParent
+ * Copy the relfilenode entries to the parent after assignment.
+ */
+static void
+ReorderBufferTransferSequencesToParent(ReorderBuffer *rb,
+ReorderBufferTXN *txn,
+ReorderBufferTXN *subtxn)

If we agree with my comment in the previous email (i.e. the first WAL
by a subxid will always include topxid) then we do not need this
function at all and always add relfilelocator directly to the top
transaction and we never need to transfer.

That is all I have for now while first pass of 0001, later I will do a
more detailed review and will look into other patches also.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pg_upgrade --check fails to warn about abstime

2023-09-22 Thread Alvaro Herrera
On 2023-Sep-21, Tom Lane wrote:

> Bruce Momjian  writes:

> > Wow, I never added code to pg_upgrade to check for that, and no one
> > complained either.
> 
> Yeah, so most people had indeed listened to warnings and moved away
> from those datatypes.  I'm inclined to think that adding code for this
> at this point is a bit of a waste of time.

The migrations from versions prior to 12 have not stopped yet, and I did
receive a complaint about it.  Because the change is so simple, I'm
inclined to patch it anyway, late though it is.

I decided to follow Tristan's advice to add the version number as a
parameter to the new function; this way, the knowledge of where was what
dropped is all in the callsite and none in the function.  It
looked a bit schizoid otherwise.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Postgres is bloatware by design: it was built to house
 PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)
>From 5258379693db55618c6451ce8df4971521eb8e66 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 19 Sep 2023 12:30:11 +0200
Subject: [PATCH v2] pg_upgrade: check for types removed in pg12

Commit cda6a8d01d39 removed a few datatypes, but didn't update
pg_upgrade --check to throw error if these types are used.  So the users
find that pg_upgrade --check tells them that everything is fine, only to
fail when the real upgrade is attempted.
---
 src/bin/pg_upgrade/check.c | 51 +-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 56e313f562..21a0ff9e42 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -26,6 +26,9 @@ static void check_for_tables_with_oids(ClusterInfo *cluster);
 static void check_for_composite_data_type_usage(ClusterInfo *cluster);
 static void check_for_reg_data_type_usage(ClusterInfo *cluster);
 static void check_for_aclitem_data_type_usage(ClusterInfo *cluster);
+static void check_for_removed_data_type_usage(ClusterInfo *cluster,
+			  const char *version,
+			  const char *datatype);
 static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
 static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(void);
@@ -111,6 +114,16 @@ check_and_dump_old_cluster(bool live_check)
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)
 		check_for_aclitem_data_type_usage(_cluster);
 
+	/*
+	 * PG 12 removed types abstime, reltime, tinterval.
+	 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1100)
+	{
+		check_for_removed_data_type_usage(_cluster, "12", "abstime");
+		check_for_removed_data_type_usage(_cluster, "12", "reltime");
+		check_for_removed_data_type_usage(_cluster, "12", "tinterval");
+	}
+
 	/*
 	 * PG 14 changed the function signature of encoding conversion functions.
 	 * Conversions from older versions cannot be upgraded automatically
@@ -1215,7 +1228,8 @@ check_for_aclitem_data_type_usage(ClusterInfo *cluster)
 {
 	char		output_path[MAXPGPATH];
 
-	prep_status("Checking for incompatible \"aclitem\" data type in user tables");
+	prep_status("Checking for incompatible \"%s\" data type in user tables",
+"aclitem");
 
 	snprintf(output_path, sizeof(output_path), "tables_using_aclitem.txt");
 
@@ -1233,6 +1247,41 @@ check_for_aclitem_data_type_usage(ClusterInfo *cluster)
 		check_ok();
 }
 
+/*
+ * check_for_removed_data_type_usage
+ *
+ *	Check for in-core data types that have been removed.  Callers know
+ *	the exact list.
+ */
+static void
+check_for_removed_data_type_usage(ClusterInfo *cluster, const char *version,
+  const char *datatype)
+{
+	char		output_path[MAXPGPATH];
+	char		typename[NAMEDATALEN];
+
+	prep_status("Checking for removed \"%s\" data type in user tables",
+datatype);
+
+	snprintf(output_path, sizeof(output_path), "tables_using_%s.txt",
+			 datatype);
+	snprintf(typename, sizeof(typename), "pg_catalog.%s", datatype);
+
+	if (check_for_data_type_usage(cluster, typename, output_path))
+	{
+		pg_log(PG_REPORT, "fatal");
+		pg_fatal("Your installation contains the \"%s\" data type in user tables.\n"
+ "The \"%s\" type has been removed in PostgreSQL version %s,\n"
+ "so this cluster cannot currently be upgraded.  You can drop the\n"
+ "problem columns, or change them to another data type, and restart\n"
+ "the upgrade.  A list of the problem columns is in the file:\n"
+ "%s", datatype, datatype, version, output_path);
+	}
+	else
+		check_ok();
+}
+
+
 /*
  * check_for_jsonb_9_4_usage()
  *
-- 
2.39.2



Re: Dump-restore loosing 'attnotnull' bit for DEFERRABLE PRIMARY KEY column(s).

2023-09-22 Thread Amul Sul
On Wed, Sep 20, 2023 at 8:29 PM Alvaro Herrera 
wrote:

> On 2023-Sep-20, Amul Sul wrote:
>
> > On the latest master head, I can see a $subject bug that seems to be
> related
> > commit #b0e96f311985:
> >
> > Here is the table definition:
> > create table foo(i int, j int, CONSTRAINT pk PRIMARY KEY(i) DEFERRABLE);
>
> Interesting, thanks for the report.  Your attribution to that commit is
> correct.  The table is dumped like this:
>
> CREATE TABLE public.foo (
> i integer CONSTRAINT pgdump_throwaway_notnull_0 NOT NULL NO INHERIT,
> j integer
> );
> ALTER TABLE ONLY public.foo
> ADD CONSTRAINT pk PRIMARY KEY (i) DEFERRABLE;
> ALTER TABLE ONLY public.foo DROP CONSTRAINT pgdump_throwaway_notnull_0;
>
> so the problem here is that the deferrable PK is not considered a reason
> to keep attnotnull set, so we produce a throwaway constraint that we
> then drop.  This is already bogus, but what is more bogus is the fact
> that the backend accepts the DROP CONSTRAINT at all.
>
> The pg_dump failing should be easy to fix, but fixing the backend to
> error out sounds more critical.  So, the reason for this behavior is
> that RelationGetIndexList doesn't want to list an index that isn't
> marked indimmediate as a primary key.  I can easily hack around that by
> doing
>
> diff --git a/src/backend/utils/cache/relcache.c
> b/src/backend/utils/cache/relcache.c
> index 7234cb3da6..971d9c8738 100644
> --- a/src/backend/utils/cache/relcache.c
> +++ b/src/backend/utils/cache/relcache.c
> @@ -4794,7 +4794,6 @@ RelationGetIndexList(Relation relation)
>  * check them.
>  */
> if (!index->indisunique ||
> -   !index->indimmediate ||
> !heap_attisnull(htup,
> Anum_pg_index_indpred, NULL))
> continue;
>
> @@ -4821,6 +4820,9 @@ RelationGetIndexList(Relation relation)
>  relation->rd_rel->relkind ==
> RELKIND_PARTITIONED_TABLE))
> pkeyIndex = index->indexrelid;
>
> +   if (!index->indimmediate)
> +   continue;
> +
> if (!index->indisvalid)
> continue;
>
>
> But of course this is not great, since it impacts unrelated bits of code
> that are relying on relation->pkindex or RelationGetIndexAttrBitmap
> having their current behavior with non-immediate index.
>

True, but still wondering why would relation->rd_pkattr skipped for a
deferrable primary key, which seems to be a bit incorrect to me since it
sensing that relation doesn't have PK at all.  Anyway, that is unrelated.


> I think a real solution is to stop relying on RelationGetIndexAttrBitmap
> in ATExecDropNotNull().  (And, again, pg_dump needs some patch as well
> to avoid printing a throwaway NOT NULL constraint at this point.)
>

I might not have understood this, but I think, if it is ok to skip
throwaway NOT
NULL for deferrable PK then that would be enough for the reported issue
to be fixed.  I quickly tried with the attached patch which looks sufficient
to skip that, but, TBH, I haven't thought carefully about this change.

Regards,
Amul
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f7b61766921..5a4e915dc62 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -8543,7 +8543,7 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables)
 	appendPQExpBufferStr(q,
 		 "LEFT JOIN pg_catalog.pg_constraint copk ON "
 		 "(copk.conrelid = src.tbloid\n"
-		 "   AND copk.contype = 'p' AND "
+		 "   AND copk.contype = 'p' AND copk.condeferrable = false AND "
 		 "copk.conkey @> array[a.attnum])\n"
 		 "WHERE a.attnum > 0::pg_catalog.int2\n"
 		 "ORDER BY a.attrelid, a.attnum");


Re: Synchronizing slots from primary to standby

2023-09-22 Thread Amit Kapila
On Thu, Sep 21, 2023 at 9:16 AM shveta malik  wrote:
>
> On Tue, Sep 19, 2023 at 10:29 AM shveta malik  wrote:
>
> Currently in patch001, synchronize_slot_names is a GUC on both primary
> and physical standby. This GUC tells which all logical slots need to
> be synced on physical standbys from the primary. Ideally it should be
> a GUC on physical standby alone and each physical standby should be
> able to communicate the value to the primary (considering the value
> may vary for different physical replicas of the same primary). The
> primary on the other hand should be able to take UNION of these values
> and let the logical walsenders (belonging to the slots in UNION
> synchronize_slots_names) wait for physical standbys for confirmation
> before sending those changes to logical subscribers. The intent is
> logical subscribers should never be ahead of physical standbys.
>

Before getting into the details of 'synchronize_slot_names', I would
like to know whether we really need the second GUC
'standby_slot_names'. Can't we simply allow all the logical wal
senders corresponding to 'synchronize_slot_names' to wait for just the
physical standby(s) (physical slot corresponding to such physical
standby) that have sent ' synchronize_slot_names'list? We should have
one physical standby slot corresponding to one physical standby.

-- 
With Regards,
Amit Kapila.




Re: Infinite Interval

2023-09-22 Thread Ashutosh Bapat
On Fri, Sep 22, 2023 at 2:35 PM jian he  wrote:
>
> /* TODO: Handle NULL inputs? */
> since interval_avg_serialize is strict, so handle null would be like:
> if (PG_ARGISNULL(0)) then PG_RETURN_NULL();

That's automatically taken care of by the executor. Functions need to
handle NULL inputs if they are *not* strict.

#select proisstrict from pg_proc where proname = 'interval_avg_serialize';
 proisstrict
-
 t
(1 row)

#select proisstrict from pg_proc where proname = 'interval_avg_deserialize';
 proisstrict
-
 t
(1 row)


-- 
Best Wishes,
Ashutosh Bapat




Re: Row pattern recognition

2023-09-22 Thread Tatsuo Ishii
> Op 9/22/23 om 07:16 schreef Tatsuo Ishii:
>>> Attached is the fix against v6 patch. I will include this in upcoming
>>> v7 patch.
>> Attached is the v7 patch. It includes the fix mentioned above.  Also
> (Champion's address bounced; removed)

On my side his adress bounced too:-<

> Hi,
> 
> In my hands, make check fails on the rpr test; see attached .diff
> file.
> In these two statements:
> -- using NEXT
> -- using AFTER MATCH SKIP TO NEXT ROW
>   result of first_value(price) and next_value(price) are empty.

Strange. I have checked out fresh master branch and applied the v7
patches, then ran make check. All tests including the rpr test
passed. This is Ubuntu 20.04.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Report planning memory in EXPLAIN ANALYZE

2023-09-22 Thread Ashutosh Bapat
Hi Andy,
Thanks for your feedback.

On Fri, Sep 22, 2023 at 8:22 AM Andy Fan  wrote:
>
> 1). The commit message of patch 1 just says how it does but doesn't
> say why it does. After reading through the discussion, I suggest making
> it clearer to others.
>
> ...
> After the planning is done, it may still occupy lots of memory which is
> allocated but not pfree-d. All of these memories can't be used in the later
> stage. This patch will report such memory usage in order to making some
> future mistakes can be found in an easier way.
> ...

That's a good summary of how the memory report can be used. Will
include a line about usage in the commit message.

>
> Planning Memory: used=15088 bytes allocated=16384 bytes
>
> Word 'Planning' is kind of a time duration, so the 'used' may be the
> 'used' during the duration or 'used' after the duration. Obviously you
> means the later one but it is not surprising others think it in the other
> way. So I'd like to reword the metrics to:

We report "PLanning Time" hence used "Planning memory". Would
"Planner" be good instead of "Planning"?

>
> Memory Occupied (Now): Parser: 1k Planner: 10k
>
> 'Now' is a cleaner signal that is a time point rather than a time
> duration. 'Parser' and 'Planner' also show a timeline about the
> important time point. At the same time, it cost very little coding
> effort based on patch 01. Different people may have different
> feelings about these words, I just hope I describe the goal clearly.

Parsing happens before planning and that memory is not measured by
this patch. May be separately but it's out of scope of this work.
"used" and "allocated" are MemoryContext terms indicating memory
actually used vs memory allocated.

>
> Personally I am pretty like patch 1, we just need to refactor some words
> to make everything clearer.

Thanks.

-- 
Best Wishes,
Ashutosh Bapat




Re: Row pattern recognition

2023-09-22 Thread Erik Rijkers

Op 9/22/23 om 07:16 schreef Tatsuo Ishii:

Attached is the fix against v6 patch. I will include this in upcoming v7 patch.


Attached is the v7 patch. It includes the fix mentioned above.  Also


Hi,

In my hands, make check fails on the rpr test; see attached .diff file.
In these two statements:
-- using NEXT
-- using AFTER MATCH SKIP TO NEXT ROW
  result of first_value(price) and next_value(price) are empty.


Erik Rijkers



this time the pattern matching engine is enhanced: previously it
recursed to row direction, which means if the number of rows in a
frame is large, it could exceed the limit of stack depth.  The new
version recurses over matched pattern variables in a row, which is at
most 26 which should be small enough.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jpdiff -U3 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/expected/rpr.out 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/results/rpr.out
--- 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/expected/rpr.out  
2023-09-22 09:04:17.770392635 +0200
+++ 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/results/rpr.out   
2023-09-22 09:38:23.826458109 +0200
@@ -253,23 +253,23 @@
 );
  company  |   tdate| price | first_value | last_value 
 --++---+-+
- company1 | 07-01-2023 |   100 | 100 |200
+ company1 | 07-01-2023 |   100 | |   
  company1 | 07-02-2023 |   200 | |   
  company1 | 07-03-2023 |   150 | |   
- company1 | 07-04-2023 |   140 | 140 |150
+ company1 | 07-04-2023 |   140 | |   
  company1 | 07-05-2023 |   150 | |   
  company1 | 07-06-2023 |90 | |   
- company1 | 07-07-2023 |   110 | 110 |130
+ company1 | 07-07-2023 |   110 | |   
  company1 | 07-08-2023 |   130 | |   
  company1 | 07-09-2023 |   120 | |   
  company1 | 07-10-2023 |   130 | |   
- company2 | 07-01-2023 |50 |  50 |   2000
+ company2 | 07-01-2023 |50 | |   
  company2 | 07-02-2023 |  2000 | |   
  company2 | 07-03-2023 |  1500 | |   
- company2 | 07-04-2023 |  1400 |1400 |   1500
+ company2 | 07-04-2023 |  1400 | |   
  company2 | 07-05-2023 |  1500 | |   
  company2 | 07-06-2023 |60 | |   
- company2 | 07-07-2023 |  1100 |1100 |   1300
+ company2 | 07-07-2023 |  1100 | |   
  company2 | 07-08-2023 |  1300 | |   
  company2 | 07-09-2023 |  1200 | |   
  company2 | 07-10-2023 |  1300 | |   
@@ -290,23 +290,23 @@
 );
  company  |   tdate| price | first_value | last_value 
 --++---+-+
- company1 | 07-01-2023 |   100 | 100 |200
+ company1 | 07-01-2023 |   100 | |   
  company1 | 07-02-2023 |   200 | |   
  company1 | 07-03-2023 |   150 | |   
- company1 | 07-04-2023 |   140 | 140 |150
+ company1 | 07-04-2023 |   140 | |   
  company1 | 07-05-2023 |   150 | |   
  company1 | 07-06-2023 |90 | |   
- company1 | 07-07-2023 |   110 | 110 |130
+ company1 | 07-07-2023 |   110 | |   
  company1 | 07-08-2023 |   130 | |   
  company1 | 07-09-2023 |   120 | |   
  company1 | 07-10-2023 |   130 | |   
- company2 | 07-01-2023 |50 |  50 |   2000
+ company2 | 07-01-2023 |50 | |   
  company2 | 07-02-2023 |  2000 | |   
  company2 | 07-03-2023 |  1500 | |   
- company2 | 07-04-2023 |  1400 |1400 |   1500
+ company2 | 07-04-2023 |  1400 | |   
  company2 | 07-05-2023 |  1500 | |   
  company2 | 07-06-2023 |60 | |   
- company2 | 07-07-2023 |  1100 |1100 |   1300
+ company2 | 07-07-2023 |  1100 | |   
  company2 | 07-08-2023 |  1300 | |   
  company2 | 07-09-2023 |  1200 | |   
  company2 | 07-10-2023 |  1300 | |   


Re: Fix error handling in be_tls_open_server()

2023-09-22 Thread Daniel Gustafsson
> On 20 Sep 2023, at 10:58, Daniel Gustafsson  wrote:
>> On 20 Sep 2023, at 10:55, Sergey Shinderuk  
>> wrote:

>> There is a typo: upon en error. Otherwise, looks good to me. Thank you.
> 
> Thanks, will fix before pushing.

I've applied this down to v15 now and the buildfarm have green builds on all
three branches, thanks for the submission!

--
Daniel Gustafsson





Re: bug fix and documentation improvement about vacuumdb

2023-09-22 Thread Kuwamura Masaki
>
> No worries at all.  If you look at the page now you will see all green
> checkmarks indicating that the patch was tested in CI.  So now we know that
> your tests fail without the fix and work with the fix applied, so all is
> well.
>

Thank you for your kind words!

And it seems to me that all concerns are resolved.
I'll change the patch status to Ready for Committer.
If you have or find any flaw, let me know that.

Best Regards,

Masaki Kuwamura


Re: Infinite Interval

2023-09-22 Thread jian he
On Fri, Sep 22, 2023 at 3:49 PM Ashutosh Bapat
 wrote:
>
> On Thu, Sep 21, 2023 at 7:21 PM Ashutosh Bapat
>  wrote:
> >
> Hence I have changed interval_avg_deserialize() in 0007 to use
> CurrentMemoryContext instead of aggcontext. Rest of the patches are
> same as previous set.
>
> --
> Best Wishes,
> Ashutosh Bapat

/* TODO: Handle NULL inputs? */
since interval_avg_serialize is strict, so handle null would be like:
if (PG_ARGISNULL(0)) then PG_RETURN_NULL();




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-22 Thread Amit Kapila
On Fri, Sep 22, 2023 at 10:57 AM Bharath Rupireddy
 wrote:
>
> On Thu, Sep 21, 2023 at 5:45 PM Amit Kapila  wrote:
> >
> > > Thanks for the patch. I have some comments on v42:
> >
> > Probably trying to keep it similar with
> > binary_upgrade_create_empty_extension(). I think it depends what
> > behaviour we expect for NULL input.
>
> confirmed_flush_lsn for a logical slot can be null (for instance,
> before confirmed_flush is updated for a newly created logical slot if
> someone calls pg_stat_replication -> pg_get_replication_slots) and
> when it is so, the binary_upgrade_create_empty_extension errors out.
> Is this behaviour wanted? I think the function returning null on null
> input is a better behaviour here.
>

I think if we do return null on null behavior then the caller needs to
add a special case for null value as this function returns bool. We
can probably return false in that case. Does that help to address your
concern?

-- 
With Regards,
Amit Kapila.




Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Benoit Lobréau

On 9/21/23 20:29, Robert Haas wrote:

Which one? I see 2 ALTER SUBSCRIPTION ... OWNER commands in
password_required.log and 1 more in password_required2.log, but
they're all performed by the superuser, who is entitled to do anything
they want.


Thank you for taking the time to respond!

I expected the ALTER SUBSCRIPTION ... OWNER command in 
password_required.log to fail because the end result of the command is a 
non-superuser owning a subscription with password_required=true, but the 
connection string has no password keyword, and the authentication scheme 
used doesn't require one anyway.


The description of the password_required parameter doesn't clearly state 
what will fail or when the configuration is enforced (during CREATE 
SUBSCRIPTION and ALTER SUBSCRIPTION .. CONNECTION):


""" https://www.postgresql.org/docs/16/sql-createsubscription.html
Specifies whether connections to the publisher made as a result of this 
subscription must use password authentication. This setting is ignored 
when the subscription is owned by a superuser. The default is true. Only 
superusers can set this value to false.

"""

The description of pg_subscription.subpasswordrequired doesn't either:

""" https://www.postgresql.org/docs/16/catalog-pg-subscription.html
If true, the subscription will be required to specify a password for 
authentication

"""

Can we consider adding something like this to clarify?

"""
This parameter is enforced when the CREATE SUBSCRIPTION or ALTER 
SUBSCRIPTION .. CONNECTION commands are executed. Therefore, it's 
possible to alter the ownership of a subscription with 
password_required=true to a non-superuser.

"""

Is the DROP SUBSCRIPTION failure in password_required.log expected for 
both superuser and non-superuser?


Is the DROP SUBSCRIPTION success in password_required2.log expected?
(i.e., with password_require=false, the only action a non-superuser can 
perform is dropping the subscription. Since they own it, it is 
understandable).


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Row pattern recognition

2023-09-22 Thread Erik Rijkers

Op 9/22/23 om 10:23 schreef Erik Rijkers:

Op 9/22/23 om 07:16 schreef Tatsuo Ishii:
Attached is the fix against v6 patch. I will include this in upcoming 
v7 patch.


Attached is the v7 patch. It includes the fix mentioned above.  Also

(Champion's address bounced; removed)



Sorry, I forgot to re-attach the regression.diffs with resend...

Erik


Hi,

In my hands, make check fails on the rpr test; see attached .diff file.
In these two statements:
-- using NEXT
-- using AFTER MATCH SKIP TO NEXT ROW
   result of first_value(price) and next_value(price) are empty.

Erik Rijkers



this time the pattern matching engine is enhanced: previously it
recursed to row direction, which means if the number of rows in a
frame is large, it could exceed the limit of stack depth.  The new
version recurses over matched pattern variables in a row, which is at
most 26 which should be small enough.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


diff -U3 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/expected/rpr.out 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/results/rpr.out
--- 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/expected/rpr.out  
2023-09-22 09:04:17.770392635 +0200
+++ 
/home/aardvark/pg_stuff/pg_sandbox/pgsql.rpr/src/test/regress/results/rpr.out   
2023-09-22 09:38:23.826458109 +0200
@@ -253,23 +253,23 @@
 );
  company  |   tdate| price | first_value | last_value 
 --++---+-+
- company1 | 07-01-2023 |   100 | 100 |200
+ company1 | 07-01-2023 |   100 | |   
  company1 | 07-02-2023 |   200 | |   
  company1 | 07-03-2023 |   150 | |   
- company1 | 07-04-2023 |   140 | 140 |150
+ company1 | 07-04-2023 |   140 | |   
  company1 | 07-05-2023 |   150 | |   
  company1 | 07-06-2023 |90 | |   
- company1 | 07-07-2023 |   110 | 110 |130
+ company1 | 07-07-2023 |   110 | |   
  company1 | 07-08-2023 |   130 | |   
  company1 | 07-09-2023 |   120 | |   
  company1 | 07-10-2023 |   130 | |   
- company2 | 07-01-2023 |50 |  50 |   2000
+ company2 | 07-01-2023 |50 | |   
  company2 | 07-02-2023 |  2000 | |   
  company2 | 07-03-2023 |  1500 | |   
- company2 | 07-04-2023 |  1400 |1400 |   1500
+ company2 | 07-04-2023 |  1400 | |   
  company2 | 07-05-2023 |  1500 | |   
  company2 | 07-06-2023 |60 | |   
- company2 | 07-07-2023 |  1100 |1100 |   1300
+ company2 | 07-07-2023 |  1100 | |   
  company2 | 07-08-2023 |  1300 | |   
  company2 | 07-09-2023 |  1200 | |   
  company2 | 07-10-2023 |  1300 | |   
@@ -290,23 +290,23 @@
 );
  company  |   tdate| price | first_value | last_value 
 --++---+-+
- company1 | 07-01-2023 |   100 | 100 |200
+ company1 | 07-01-2023 |   100 | |   
  company1 | 07-02-2023 |   200 | |   
  company1 | 07-03-2023 |   150 | |   
- company1 | 07-04-2023 |   140 | 140 |150
+ company1 | 07-04-2023 |   140 | |   
  company1 | 07-05-2023 |   150 | |   
  company1 | 07-06-2023 |90 | |   
- company1 | 07-07-2023 |   110 | 110 |130
+ company1 | 07-07-2023 |   110 | |   
  company1 | 07-08-2023 |   130 | |   
  company1 | 07-09-2023 |   120 | |   
  company1 | 07-10-2023 |   130 | |   
- company2 | 07-01-2023 |50 |  50 |   2000
+ company2 | 07-01-2023 |50 | |   
  company2 | 07-02-2023 |  2000 | |   
  company2 | 07-03-2023 |  1500 | |   
- company2 | 07-04-2023 |  1400 |1400 |   1500
+ company2 | 07-04-2023 |  1400 | |   
  company2 | 07-05-2023 |  1500 | |   
  company2 | 07-06-2023 |60 | |   
- company2 | 07-07-2023 |  1100 |1100 |   1300
+ company2 | 07-07-2023 |  1100 | |   
  company2 | 07-08-2023 |  1300 | |   
  company2 | 07-09-2023 |  1200 | |   
  company2 | 07-10-2023 |  1300 | |   


Re: Row pattern recognition

2023-09-22 Thread Erik Rijkers

Op 9/22/23 om 07:16 schreef Tatsuo Ishii:

Attached is the fix against v6 patch. I will include this in upcoming v7 patch.


Attached is the v7 patch. It includes the fix mentioned above.  Also

(Champion's address bounced; removed)

Hi,

In my hands, make check fails on the rpr test; see attached .diff file.
In these two statements:
-- using NEXT
-- using AFTER MATCH SKIP TO NEXT ROW
  result of first_value(price) and next_value(price) are empty.

Erik Rijkers



this time the pattern matching engine is enhanced: previously it
recursed to row direction, which means if the number of rows in a
frame is large, it could exceed the limit of stack depth.  The new
version recurses over matched pattern variables in a row, which is at
most 26 which should be small enough.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp





Re: [PATCH] Add extra statistics to explain for Nested Loop

2023-09-22 Thread Andrey Lepikhov

On 31/7/2022 10:49, Julien Rouhaud wrote:

On Sat, Jul 30, 2022 at 08:54:33PM +0800, Julien Rouhaud wrote:
Anyway, 1% is in my opinion still too much overhead for extensions that won't
get any extra information.
I have read all the thread and still can't understand something. What 
valuable data can I find with these extra statistics if no one 
parameterized node in the plan exists?
Also, thinking about min/max time in the explain, I guess it would be 
necessary in rare cases. Usually, the execution time will correlate to 
the number of tuples scanned, won't it? So, maybe skip the time 
boundaries in the instrument structure?
In my experience, it is enough to know the total number of tuples 
bubbled up from a parameterized node to decide further optimizations. 
Maybe simplify this feature up to the one total_rows field in the case 
of nloops > 1 and in the presence of parameters?
And at the end. If someone wants a lot of additional statistics, why not 
give them that by extension? It is only needed to add a hook into the 
point of the node explanation and some efforts to make instrumentation 
extensible. But here, honestly, I don't have code/ideas so far.


--
regards,
Andrey Lepikhov
Postgres Professional





Re: Infinite Interval

2023-09-22 Thread Dean Rasheed
On Fri, 22 Sept 2023 at 08:49, Ashutosh Bapat
 wrote:
>
> Following code in ExecInterpExpr makes it clear that the
> deserialization function is be executed in per tuple memory context.
> Whereas the aggregate's context is different from this context and may
> lives longer that the context in which deserialization is expected to
> happen.
>

Right. I was about to reply, saying much the same thing, but it's
always better when you see it for yourself.

> Hence I have changed interval_avg_deserialize() in 0007 to use
> CurrentMemoryContext instead of aggcontext.

+1. And consistency with other deserialisation functions is good.

> Rest of the patches are
> same as previous set.
>

OK, I'll take a look.

Regards,
Dean




Re: pg_ctl start may return 0 even if the postmaster has been already started on Windows

2023-09-22 Thread Kyotaro Horiguchi
At Wed, 20 Sep 2023 14:18:41 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> I was able to see the trouble in the CI environment, but not
> locally. I'll delve deeper into this. Thanks you for bringing it to my
> attention.

I found two instances with multiple child processes.

# child-pid / parent-pid / given-pid : exec name
process  parent PID  child PID  target PID   exec file
shell   1228   64721228  cmd.exe
child   5184   12281228  cmd.exe
child   6956   12281228  postgres.exe
> launcher shell executed multiple processes

process  parent PID  child PID  target PID   exec file
shell   4296   58804296  cmd.exe
child   5156   42964296  agent.exe
child   5640   42964296  postgres.exe
> launcher shell executed multiple processes

It looks like the environment has autorun setups for cmd.exe. There's
another known issue related to auto-launching chcp at
startup. Ideally, we would avoid such behavior in the
postmaster-launcher shell.  I think we should add "/D" flag to cmd.exe
command line, perhaps in a separate patch.

Even after making that change, I still see something being launched from the 
launcher cmd.exe...

process  parent PID  child PID  target PID   exec file
shell   2784   66682784  cmd.exe
child   6140   27842784  MicrosoftEdgeUpdate.exe
child   6260   27842784  postgres.exe
> launcher shell executed multiple processes

I'm not sure what triggers this; perhaps some other kind of hooks?  If
we cannot avoid this behavior, we'll have to verify the executable
file name. It should be fine, given that the file name is constant,
but I'm not fully convinced that this is the ideal solution.

Another issue is.. that I haven't been able to cause the false
positive of pg_ctl start..  Do you have a concise reproducer of the
issue?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From c6de7b541be55b01bbd0d2e8e8d95aac6799a82c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 21 Sep 2023 16:54:33 +0900
Subject: [PATCH 1/3] Disable autoruns for cmd.exe on Windows

On Windows, we use cmd.exe to launch the postmaster process for easy
redirection setup. However, cmd.exe may execute other programs at
startup due to autorun configurations. This patch adds /D flag to the
launcher cmd.exe command line to disable autorun settings written in
the registry.
---
 src/bin/pg_ctl/pg_ctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 807d7023a9..3ac2fcc004 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -553,11 +553,11 @@ start_postmaster(void)
 		else
 			close(fd);
 
-		cmd = psprintf("\"%s\" /C \"\"%s\" %s%s < \"%s\" >> \"%s\" 2>&1\"",
+		cmd = psprintf("\"%s\" /D /C \"\"%s\" %s%s < \"%s\" >> \"%s\" 2>&1\"",
 	   comspec, exec_path, pgdata_opt, post_opts, DEVNULL, log_file);
 	}
 	else
-		cmd = psprintf("\"%s\" /C \"\"%s\" %s%s < \"%s\" 2>&1\"",
+		cmd = psprintf("\"%s\" /D /C \"\"%s\" %s%s < \"%s\" 2>&1\"",
 	   comspec, exec_path, pgdata_opt, post_opts, DEVNULL);
 
 	if (!CreateRestrictedProcess(cmd, , false))
-- 
2.39.3

>From f1484974f5bcf14d5ac3c6a702dcb02d0eb45f9f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 20 Sep 2023 13:16:35 +0900
Subject: [PATCH 2/3] Improve pg_ctl postmaster process check on Windows

Currently pg_ctl on Windows does not verify that it actually executed
a postmaster process due to lack of process ID knowledge. This can
lead to false positives in cases where another pg_ctl instance starts
a different server simultaneously.
This patch adds the capability to identify the process ID of the
launched postmaster on Windows, similar to other OS versions, ensuring
more reliable detection of concurrent server startups.
---
 src/bin/pg_ctl/pg_ctl.c | 109 
 1 file changed, 100 insertions(+), 9 deletions(-)

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 3ac2fcc004..ed1b0c43fc 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -132,6 +132,7 @@ static void adjust_data_dir(void);
 
 #ifdef WIN32
 #include 
+#include 
 static bool pgwin32_IsInstalled(SC_HANDLE);
 static char *pgwin32_CommandLine(bool);
 static void pgwin32_doRegister(void);
@@ -142,6 +143,7 @@ static void WINAPI pgwin32_ServiceMain(DWORD, LPTSTR *);
 static void pgwin32_doRunAsService(void);
 static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
 static PTOKEN_PRIVILEGES GetPrivilegesToDelete(HANDLE hToken);
+static pid_t pgwin32_find_postmaster_pid(pid_t shell_pid);
 #endif
 
 static pid_t get_pgpid(bool is_status_request);
@@ -609,7 +611,11 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint)
 			/* File is complete enough for us, parse it */
 			pid_t		pmpid;
 			time_t		pmstart;
-

Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-22 Thread Amit Kapila
On Fri, Sep 22, 2023 at 11:59 AM Bharath Rupireddy
 wrote:
>
> On Thu, Sep 21, 2023 at 6:54 PM Hayato Kuroda (Fujitsu)
>  wrote:
> >
>
> 1.
> +/*
> + * Use max_slot_wal_keep_size as -1 to prevent the WAL removal by the
> + * checkpointer process.  If WALs required by logical replication slots
> + * are removed, the slots are unusable.  This setting prevents the
> + * invalidation of slots during the upgrade. We set this option when
>
> IIUC, during upgrade we don't want the checkpointer to remove WAL that
> may be needed by logical slots, for that the patch overrides the user
> set value for max_slot_wal_keep_size. What if the WAL is removed
> because of the wal_keep_size setting?
>

We are fine with the WAL removal unless it can invalidate the slots
which is prevented by max_slot_wal_keep_size.

>
> 3. Does this patch support upgrading of logical replication slots on a
> streaming standby?
>

No, and a note has been added by the patch for the same.

-- 
With Regards,
Amit Kapila.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-22 Thread Amit Kapila
On Fri, Sep 22, 2023 at 10:57 AM Bharath Rupireddy
 wrote:
>
> On Thu, Sep 21, 2023 at 5:45 PM Amit Kapila  wrote:
> > > 3.
> > > +/* Quick exit if the given lsn is larger than current one */
> > > +if (start_lsn >= GetFlushRecPtr(NULL))
> > > +PG_RETURN_BOOL(false);
> > > +
> > >
> > > An LSN that doesn't exists yet is an error IMO, may be an error better 
> > > here?
> > >
> >
> > It will anyway lead to error at later point but we will provide more
> > information about all the slots that have invalid value of
> > confirmed_flush LSN.
>
> I disagree with the function returning false for non-existing LSN.
> IMO, failing fast when an LSN that doesn't exist yet is supplied to
> the function is the right approach. We never know, the slots on disk
> content can get corrupted for some reason and confirmed_flush_lsn is
> '/' or a non-existing LSN.
>

I don't think it is big deal to either fail immediately or slightly
later with more information about slot. It could be better if we do
later because various slots can have the same problem, so we can
mention all such slots together.

>
> > > 5. Trying to understand the interaction of this feature with custom
> > > WAL records that a custom WAL resource manager puts in. Is it okay to
> > > have custom WAL records after the "logical WAL end"?
> > > +/*
> > > + * There is a possibility that following records may be generated
> > > + * during the upgrade.
> > > + */
> > >
> >
> > I don't think so. The only valid records for the checks in this
> > function are probably the ones that can get generated by the upgrade
> > process because we ensure that walsender sends all the records before
> > it exits at shutdown time.
>
> Can you help me understand how the list of WAL records that pg_upgrade
> can generate is put up? Identified them after running some tests?
>

Yeah, both by tests and manually verifying the WAL records. Basically,
we need to care about records that could be generated by background
processes like checkpointer/bgwriter or can be generated during system
table scans. You may want to read my latest email for a summary on how
we reached at this design choice [1].

[1] - 
https://www.postgresql.org/message-id/CAA4eK1JVKZGRHLOEotWi%2Be%2B09jucNedqpkkc-Do4dh5FTAU%2B5w%40mail.gmail.com
-- 
With Regards,
Amit Kapila.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-09-22 Thread Bharath Rupireddy
On Thu, Sep 21, 2023 at 6:54 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> > 6.
> > +if (PQntuples(res) != 1)
> > +pg_fatal("could not count the number of logical replication 
> > slots");
> > +
> >
> > Not existing a single logical replication slot an error? I think it
> > must be if (PQntuples(res) == 0) return;?
> >
>
> The query executes "SELECT count(*)...", IIUC it exactly returns 1 row.

Ah, got it.

> > 7. A nit:
> > +nslots_on_new = atoi(PQgetvalue(res, 0, 0));
> > +
> > +if (nslots_on_new)
> >
> > Just do if(atoi(PQgetvalue(res, 0, 0)) > 0) and get rid of nslots_on_new?
>
> Note that the vaule would be used for upcoming pg_fatal. I prefer current 
> style
> because multiple atoi(PQgetvalue(res, 0, 0)) was not so beautiful.

+1.

> You mentioned at line 118, but at that time logical replication system is not 
> created.
> The subscriber is created at line 163.
> Therefore WALs would not be consumed automatically.

So, not calling pg_logical_slot_get_changes() on test_slot1 won't
consume the WAL?

A few more comments:

1.
+/*
+ * Use max_slot_wal_keep_size as -1 to prevent the WAL removal by the
+ * checkpointer process.  If WALs required by logical replication slots
+ * are removed, the slots are unusable.  This setting prevents the
+ * invalidation of slots during the upgrade. We set this option when

IIUC, during upgrade we don't want the checkpointer to remove WAL that
may be needed by logical slots, for that the patch overrides the user
set value for max_slot_wal_keep_size. What if the WAL is removed
because of the wal_keep_size setting?

2.
+++ b/src/bin/pg_upgrade/t/003_logical_replication_slots.pl

How about a more descriptive and pointed name for the TAP test file,
something like 003_upgrade_logical_replication_slots.pl?

3. Does this patch support upgrading of logical replication slots on a
streaming standby? If yes, isn't it a good idea to add one test for
upgrading standby with logical replication slots?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: New WAL record to detect the checkpoint redo location

2023-09-22 Thread Amit Kapila
On Thu, Sep 21, 2023 at 9:06 PM Robert Haas  wrote:
>
> On Thu, Sep 21, 2023 at 4:22 AM Amit Kapila  wrote:
> > After the 0003 patch, do we need acquire exclusive lock via
> > WALInsertLockAcquireExclusive() for non-shutdown checkpoints. Even the
> > comment "We must block concurrent insertions while examining insert
> > state to determine the checkpoint REDO pointer." seems to indicate
> > that it is not required. If it is required then we may want to change
> > the comments and also acquiring the locks twice will have more cost
> > than acquiring it once and write the new WAL record under that lock.
>
> I think the comment needs updating. I don't think we can do curInsert
> = XLogBytePosToRecPtr(Insert->CurrBytePos) without taking the locks.
> Same for Insert->fullPageWrites.
>

If we can't do those without taking all the locks then it is fine but
just wanted to give it a try to see if there is a way to avoid in case
of online (non-shutdown) checkpoints. For example, curInsert is used
only for the shutdown path, so we don't need to acquire all locks for
it in the cases except for the shutdown case. Here, we are reading
Insert->fullPageWrites which requires an insertion lock but not all
the locks (as per comments in structure XLogCtlInsert). Now, I haven't
done detailed analysis for
XLogCtl->InsertTimeLineID/XLogCtl->PrevTimeLineID but some places
reading InsertTimeLineID have a comment like "Given that we're not in
recovery, InsertTimeLineID is set and can't change, so we can read it
without a lock." which suggests that some analysis is required whether
reading those requires all locks in this code path. OTOH, it won't
matter to acquire all locks in this code path for the reasons
mentioned by you and it may help in keeping the code simple. So, it is
up to you to take the final call on this matter. I am fine with your
decision.

>
> > BTW, I would like to mention that there is a slight interaction of
> > this work with the patch to upgrade/migrate slots [1]. Basically in
> > [1], to allow slots migration from lower to higher version, we need to
> > ensure that all the WAL has been consumed by the slots before clean
> > shutdown. However, during upgrade we can generate few records like
> > checkpoint which we will ignore for the slot consistency checking as
> > such records doesn't matter for data consistency after upgrade. We
> > probably need to add this record to that list. I'll keep an eye on
> > both the patches so that we don't miss that interaction but mentioned
> > it here to make others also aware of the same.
>
> If your approach requires a code change every time someone adds a new
> WAL record that doesn't modify table data, you might want to rethink
> the approach a bit.
>

I understand your hesitation and we have discussed several approaches
that do not rely on the WAL record type to determine if the slots have
caught up but the other approaches seem to have different other
downsides. I know it may not be a good idea to discuss those here but
as there was a slight interaction with this work, so I thought to
bring it up. To be precise, we need to ensure that we ignore WAL
records that got generated during pg_upgrade operation (say during
pg_upgrade --check).

The approach we initially followed was to check if the slot's
confirmed_flush_lsn is equal to the latest checkpoint in
pg_controldata (which is the shutdown checkpoint after stopping the
server). This approach doesn't work for the use case where the user
runs pg_upgrade --check before actually performing the upgrade [1].
This is because during the upgrade check, the server will be
stopped/started and update the position of the latest checkpoint,
causing the check to fail in the actual upgrade and leading pg_upgrade
to believe that the slot has not been caught up.

To address the issues in the above approach, we also discussed several
alternative approaches[2][3]: a) Adding a new field in pg_controldata
to record the last checkpoint that happens in non-upgrade mode, so
that we can compare the slot's confirmed_flush_lsn with this value.
However, we were not sure if this was a good enough reason to add a
new field in controldata field and sprinkle IsBinaryUpgrade check in
checkpointer code path. b) Advancing each slot's confirmed_flush_lsn
to the latest position if the first upgrade check passes. This way,
when performing the actual upgrade, the confirmed_flush_lsn will also
pass. However, internally advancing the LSN seems unconventional. c)
Introducing a new pg_upgrade option to skip the check for slot
catch-up so that if it is already done at the time of pg_upgrade
--check, we can avoid rechecking during actual upgrade. Although this
might work, the user would need to specify this manually, which is not
ideal. d) Document this and suggest users consume the WALs, but this
doesn't look acceptable to users.

All the above approaches have their downsides, prompting us to
consider the WAL scan approach which is to scan the end of the WAL for

Re: Remove MSVC scripts from the tree

2023-09-22 Thread Peter Eisentraut

On 22.09.23 03:12, Michael Paquier wrote:

It has been mentioned a few times now that, as Meson has been
integrated in the tree, the next step would be to get rid of the
custom scripts in src/tools/msvc/ and moving forward only support
Meson when building with VS compilers.


First we need to fix things so that we can build using meson from a 
distribution tarball, which is the subject of 
.






Re: information_schema and not-null constraints

2023-09-22 Thread Peter Eisentraut

On 19.09.23 09:01, Peter Eisentraut wrote:
While testing this, I noticed that the way the check_clause of regular 
check constraints is computed appears to be suboptimal.  It currently does


CAST(substring(pg_get_constraintdef(con.oid) from 7) AS character_data)

which ends up with an extra set of parentheses, which is ignorable, but 
it also leaves in suffixes like "NOT VALID", which don't belong into 
that column.  Earlier in this thread I had contemplated a fix for the 
first issue, but that wouldn't address the second issue.  I think we can 
fix this quite simply by using pg_get_expr() instead.  I don't know why 
it wasn't done like that to begin with, maybe it was just a (my?) 
mistake.  See attached patch.


committed