Re: Helper functions for wait_for_catchup() in Cluster.pm

2023-01-19 Thread Drouvot, Bertrand

Hi,

On 1/18/23 10:59 AM, Alvaro Herrera wrote:

On 2023-Jan-18, Drouvot, Bertrand wrote:


The current calls are done that way:

wait_for_replay_catchup called:
- 8 times with write LSN as an argument
- 1 time with insert LSN as an argument
- 16 times with flush LSN as an argument



So it looks like providing a node as a second argument
would not help for the wait_for_replay_catchup() case.


... unless we changed the calls that wait for reply that use write or
insert so that they use flush instead.  


That's a good idea, thanks! Please find attached V2 doing so.


Surely everything should still
work, right?  


Right.


Flushing would still occur, either right after the write
(as the transaction commits) or ~200ms afterwards when WAL writer
catches up to that point.

I suppose this may fail to be true if there is some test that is
specifically testing whether writing WAL without flushing works, which
should rare enough, but if it does exist, 


I don't see this kind of test.

Please note that V2 does not contain wait_for_flush_catchup() and
wait_for_sent_catchup() anymore as: 1) they are not used yet
and 2) it lets their author (if any) decide which node->lsn() mode to use.

This is also mentioned as a comment in V2.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

diff --git a/src/bin/pg_rewind/t/007_standby_source.pl 
b/src/bin/pg_rewind/t/007_standby_source.pl
index 52644c2c0d..7236a3b177 100644
--- a/src/bin/pg_rewind/t/007_standby_source.pl
+++ b/src/bin/pg_rewind/t/007_standby_source.pl
@@ -74,9 +74,8 @@ $node_a->safe_psql('postgres',
"INSERT INTO tbl1 values ('in A, before promotion')");
 $node_a->safe_psql('postgres', 'CHECKPOINT');
 
-my $lsn = $node_a->lsn('write');
-$node_a->wait_for_catchup('node_b', 'write', $lsn);
-$node_b->wait_for_catchup('node_c', 'write', $lsn);
+$node_a->wait_for_write_catchup('node_b', $node_a);
+$node_b->wait_for_write_catchup('node_c', $node_a);
 
 # Promote C
 #
@@ -160,7 +159,7 @@ in A, after C was promoted
 $node_a->safe_psql('postgres',
"INSERT INTO tbl1 values ('in A, after rewind')");
 
-$node_b->wait_for_catchup('node_c', 'replay', $node_a->lsn('write'));
+$node_b->wait_for_replay_catchup('node_c', $node_a);
 
 check_query(
'SELECT * FROM tbl1',
diff --git a/src/test/modules/brin/t/02_wal_consistency.pl 
b/src/test/modules/brin/t/02_wal_consistency.pl
index 5983ef208e..7211ba8d8d 100644
--- a/src/test/modules/brin/t/02_wal_consistency.pl
+++ b/src/test/modules/brin/t/02_wal_consistency.pl
@@ -70,6 +70,6 @@ my ($ret, $out, $err) = $whiskey->psql(
});
 cmp_ok($out, '>=', 1);
 
-$whiskey->wait_for_catchup($charlie, 'replay', $whiskey->lsn('insert'));
+$whiskey->wait_for_replay_catchup($charlie, $whiskey);
 
 done_testing();
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm 
b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 04921ca3a3..cea9796c0c 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2709,6 +2709,44 @@ sub wait_for_catchup
return;
 }
 
+# Now defining helper functions wait_for_replay_catchup() and
+# wait_for_write_catchup().
+# Please note that wait_for_flush_catchup() and wait_for_sent_catchup() are not
+# defined as: 1) they are not used yet and 2) it lets their author (if any)
+# decide the node->lsn() mode to be used.
+
+=pod
+
+=item $node->wait_for_replay_catchup(standby_name, node)
+
+Helper function for wait_for_catchup when waiting for the replay_lsn column
+to catchup.
+
+=cut
+
+sub wait_for_replay_catchup
+{
+   my ($self, $standby_name, $node) = @_;
+
+   $self->wait_for_catchup($standby_name, 'replay', $node->lsn('flush'));
+}
+
+=pod
+
+=item $node->wait_for_write_catchup(standby_name, node)
+
+Helper function for wait_for_catchup when waiting for the write_lsn column
+to catchup.
+
+=cut
+
+sub wait_for_write_catchup
+{
+   my ($self, $standby_name, $node) = @_;
+
+   $self->wait_for_catchup($standby_name, 'write', $node->lsn('write'));
+}
+
 =pod
 
 =item $node->wait_for_slot_catchup(slot_name, mode, target_lsn)
diff --git a/src/test/recovery/t/001_stream_rep.pl 
b/src/test/recovery/t/001_stream_rep.pl
index 23a90dd85b..5b15a20d54 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -47,9 +47,8 @@ $node_primary->safe_psql('postgres',
"CREATE TABLE tab_int AS SELECT generate_series(1,1002) AS a");
 
 # Wait for standbys to catch up
-my $primary_lsn = $node_primary->lsn('write');
-$node_primary->wait_for_catchup($node_standby_1, 'replay', $primary_lsn);
-$node_standby_1->wait_for_catchup($node_standby_2, 'replay', $primary_lsn);
+$node_primary->wait_for_replay_catchup($node_standby_1, $node_primary);
+$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
 
 my $result =
   $node_standby_1->safe_psql('postgres', "SELECT count(*) FROM tab_int");
@@ -66,9 +65,8 @@

RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-19 Thread Takamichi Osumi (Fujitsu)
On Thursday, January 19, 2023 10:42 AM Peter Smith  
wrote:
> On Wed, Jan 18, 2023 at 6:06 PM Peter Smith 
> wrote:
> >
> >  Here are my review comments for the latest patch v16-0001. (excluding
> > the test code)
> >
> ...
> >
> > 8. AlterSubscription (general)
> >
> > I observed during testing there are 3 different errors….
> >
> > At subscription CREATE time you can get this error:
> > ERROR:  min_apply_delay > 0 and streaming = parallel are mutually
> > exclusive options
> >
> > If you try to ALTER the min_apply_delay when already streaming =
> > parallel you can get this error:
> > ERROR:  cannot enable min_apply_delay for subscription in streaming =
> > parallel mode
> >
> > If you try to ALTER the streaming to be parallel if there is already a
> > min_apply_delay > 0 then you can get this error:
> > ERROR:  cannot enable streaming = parallel mode for subscription with
> > min_apply_delay
> >
> > ~
> >
> > IMO there is no need to have 3 different error message texts.  I think
> > all these cases are explained by just the first text (ERROR:
> > min_apply_delay > 0 and streaming = parallel are mutually exclusive
> > options)
> >
> >
> 
> After checking the regression test output I can see the merit of your separate
> error messages like this, even if they are maybe not strictly necessary. So 
> feel
> free to ignore my previous review comment.
Thank you for your notification.

I described another reason why we wrote those messages in [1],
so please have a look at it.

[1] - 
https://www.postgresql.org/message-id/TYCPR01MB8373447440202B248BB63805EDC49%40TYCPR01MB8373.jpnprd01.prod.outlook.com


Best Regards,
Takamichi Osumi





RE: Exit walsender before confirming remote flush in logical replication

2023-01-19 Thread Hayato Kuroda (Fujitsu)
Dear Amit, hackers,

> Let me try to summarize the discussion till now. The problem we are
> trying to solve here is to allow a shutdown to complete when walsender
> is not able to send the entire WAL. Currently, in such cases, the
> shutdown fails. As per our current understanding, this can happen when
> (a) walreceiver/walapply process is stuck (not able to receive more
> WAL) due to locks or some other reason; (b) a long time delay has been
> configured to apply the WAL (we don't yet have such a feature for
> logical replication but the discussion for same is in progress).

Thanks for summarizing.
While analyzing the stuck cases, I noticed that there are two types of shutdown failures.
They could be characterized by the back trace. They are shown at the bottom.

Type i)
The walsender executes WalSndDone() but cannot satisfy its exit condition.
All WAL has been sent to the subscriber but has not been flushed there, so
sentPtr is not the same as replicatedPtr. This hang can happen when the
delayed transaction is small or streamed.

Type ii)
The walsender never reaches WalSndDone(); it is stuck in ProcessPendingWrites().
The send buffer becomes full while replicating a transaction, so
pq_is_send_pending() keeps returning true and the walsender cannot break out
of the loop. This hang can happen when the delayed transaction is large and
not streamed.

If we choose modification (1), we can only fix type (i), because in type (ii)
the pending WAL in the send buffer still causes the failure. IIUC, if we want
to shut down walsender processes even in case (ii), we must choose (2), and
additional fixes are needed.

Based on the above, I prefer modification (2) because it can rescue more cases. 
Thoughts?
PSA the patch for it. It is almost the same as the previous version, but the 
comments are updated.


Appendix:

The backtrace for type i)

```
#0  WalSndDone (send_data=0x87f825 ) at 
../../PostgreSQL-Source-Dev/src/backend/replication/walsender.c:3111
#1  0x0087ed1d in WalSndLoop (send_data=0x87f825 ) at 
../../PostgreSQL-Source-Dev/src/backend/replication/walsender.c:2525
#2  0x0087d40a in StartLogicalReplication (cmd=0x1f49030) at 
../../PostgreSQL-Source-Dev/src/backend/replication/walsender.c:1320
#3  0x0087df29 in exec_replication_command (
cmd_string=0x1f15498 "START_REPLICATION SLOT \"sub\" LOGICAL 0/0 
(proto_version '4', streaming 'on', origin 'none', publication_names 
'\"pub\"')")
at ../../PostgreSQL-Source-Dev/src/backend/replication/walsender.c:1830
#4  0x0091b032 in PostgresMain (dbname=0x1f4c938 "postgres", 
username=0x1f4c918 "postgres")
at ../../PostgreSQL-Source-Dev/src/backend/tcop/postgres.c:4561
#5  0x0085390b in BackendRun (port=0x1f3d0b0) at 
../../PostgreSQL-Source-Dev/src/backend/postmaster/postmaster.c:4437
#6  0x0085322c in BackendStartup (port=0x1f3d0b0) at 
../../PostgreSQL-Source-Dev/src/backend/postmaster/postmaster.c:4165
#7  0x0084f7a2 in ServerLoop () at 
../../PostgreSQL-Source-Dev/src/backend/postmaster/postmaster.c:1762
#8  0x0084f0a2 in PostmasterMain (argc=3, argv=0x1f0ff30) at 
../../PostgreSQL-Source-Dev/src/backend/postmaster/postmaster.c:1452
#9  0x0074a4d6 in main (argc=3, argv=0x1f0ff30) at 
../../PostgreSQL-Source-Dev/src/backend/main/main.c:200
```

The backtrace for type ii)

```
#0  ProcessPendingWrites () at 
../../PostgreSQL-Source-Dev/src/backend/replication/walsender.c:1438
#1  0x0087d635 in WalSndWriteData (ctx=0x1429ce8, lsn=22406440, 
xid=731, last_write=true)
at ../../PostgreSQL-Source-Dev/src/backend/replication/walsender.c:1405
#2  0x00888420 in OutputPluginWrite (ctx=0x1429ce8, last_write=true) at 
../../PostgreSQL-Source-Dev/src/backend/replication/logical/logical.c:669
#3  0x7f022dfe43a7 in pgoutput_change (ctx=0x1429ce8, txn=0x1457d40, 
relation=0x7f0245075268, change=0x1460ef8)
at 
../../PostgreSQL-Source-Dev/src/backend/replication/pgoutput/pgoutput.c:1491
#4  0x00889125 in change_cb_wrapper (cache=0x142bcf8, txn=0x1457d40, 
relation=0x7f0245075268, change=0x1460ef8)
at 
../../PostgreSQL-Source-Dev/src/backend/replication/logical/logical.c:1077
#5  0x0089507c in ReorderBufferApplyChange (rb=0x142bcf8, 
txn=0x1457d40, relation=0x7f0245075268, change=0x1460ef8, streaming=false)
at 
../../PostgreSQL-Source-Dev/src/backend/replication/logical/reorderbuffer.c:1969
#6  0x00895866 in ReorderBufferProcessTXN (rb=0x142bcf8, txn=0x1457d40, 
commit_lsn=23060624, snapshot_now=0x1440150, command_id=0, streaming=false)
at 
../../PostgreSQL-Source-Dev/src/backend/replication/logical/reorderbuffer.c:2245
#7  0x00896348 in ReorderBufferReplay (txn=0x1457d40, rb=0x142bcf8, 
xid=731, commit_lsn=23060624, end_lsn=23060672, commit_time=727353664342177, 
origin_id=0, origin_lsn=0) at 
../../PostgreSQL-Source-Dev/src/backend/replication/logical/reorderbuffer.c:2675
#8  0x008963d0 in ReorderBufferCommit (rb=0x142bcf8, xid=731, 
commit_lsn=23060624, end

Re: Generating code for query jumbling through gen_node_support.pl

2023-01-19 Thread Peter Eisentraut

On 18.01.23 08:04, Michael Paquier wrote:

On Tue, Jan 17, 2023 at 04:52:28PM +0900, Michael Paquier wrote:

On Tue, Jan 17, 2023 at 08:43:44AM +0100, Peter Eisentraut wrote:

Ok, I understand now, and I agree with this approach over the opposite. I
was confused because the snippet you showed above used "jumble_ignore", but
your patch is correct as it uses "jumble_location".


Okay.  I'll refresh the patch set so as we have only "jumble_ignore",
then, like v1, with preparatory patches for what you mentioned and
anything that comes into mind.


This is done as of the patch series v3 attached:
- 0001 reformats all the comments of the nodes.
- 0002 moves the current files for query jumble as of queryjumble.c ->
queryjumblefuncs.c and utils/queryjumble.h -> nodes/queryjumble.h.
- 0003 is the core feature, where I have done a second pass over the
nodes to make sure that things map with HEAD, incorporating the extra
docs coming from v2, adding a bit more.


This patch structure looks good.


That said, the term "jumble" is really weird, because in the sense that we
are using it here it means, approximately, "to mix together", "to unify".
So what we are doing with the Const nodes is really to *not* jumble the
location, but for all other node types we are jumbling the location.  At
least that is my understanding.


I am quite familiar with this term, FWIW.  That's what we've inherited
from the days when it was introduced in pg_stat_statements.


I have renamed the node attributes to query_jumble_ignore and
no_query_jumble at the end, after considering Peter's point that only
"jumble" could be fuzzy here.  The file names are changed in
consequence.


I see that in the 0003 patch, most location fields now have an explicit 
markup with query_jumble_ignore.  I thought we had previously resolved 
to consider location fields to be automatically ignored unless 
explicitly included (like for the Const node).  This appears to invert 
that?  Am I missing something?






Re: Generating code for query jumbling through gen_node_support.pl

2023-01-19 Thread Michael Paquier
On Thu, Jan 19, 2023 at 09:42:03AM +0100, Peter Eisentraut wrote:
> I see that in the 0003 patch, most location fields now have an explicit
> markup with query_jumble_ignore.  I thought we had previously resolved to
> consider location fields to be automatically ignored unless explicitly
> included (like for the Const node).  This appears to invert that?  Am I
> missing something?

My misunderstanding then, I thought that you were OK with what was
part of v1, where all these fields were marked as "ignore".  But you
actually prefer v2, with the second field "location" on top of
"ignore".  I can update 0003 to refresh that.

Would you be OK if I apply 0001 (with the comments of the locations
still reshaped to ease future property additions) and 0002?
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Constify proclist.h

2023-01-19 Thread Peter Eisentraut

On 17.01.23 13:18, Aleksander Alekseev wrote:

This is a follow-up to [1] and c8ad4d81.


Additionally Bharath pointed out that there are other pieces of code
that we may want to change in a similar fashion,
proclist.h/proclist_types.h as one example. I didn't do this yet
because I would like to know the community opinion first on whether we
should do this at all.


Since the consensus seems to be to constify everything possible here
is the patch for proclist.h. There is nothing to change in
proclist_types.h.

[1]: 
https://postgr.es/m/CAJ7c6TM2%3D08mNKD9aJg8vEY9hd%2BG4L7%2BNvh30UiNT3kShgRgNg%40mail.gmail.com


committed





Re: Improve GetConfigOptionValues function

2023-01-19 Thread Nitin Jadhav
> Yes, the existing caller isn't using the fetched values when noshow is
> set to true. However, I think returning from GetConfigOptionValues()
> when noshow is set to true without fetching values limits the use of
> the function. What if some other caller wants to use the function to
> get the values with noshow passed in and use the values when noshow is
> set to true?

I agree that it limits the use of the function, but I feel it is good
to focus on the existing use cases and modify the functions
accordingly. In the future, if such a use case arises, the author
should take care of modifying the required functions. The idea
suggested by Tom to refactor the function looks good, as it aligns with
the current use cases and can also be used in future cases like the one
you were suggesting.


> Also, do we gain anything with the patch? I mean, can
> show_all_settings()/pg_settings/pg_show_all_settings() get faster, say
> with a non-superuser without pg_read_all_settings predefined role or
> with a superuser? I see there're about 6 GUC_NO_SHOW_ALL GUCs and 20
> GUC_SUPERUSER_ONLY GUCs, I'm not sure if it leads to some benefit with
> the patch.

As the number of such parameters (GUC_NO_SHOW_ALL and
GUC_SUPERUSER_ONLY) is small, we may not see improvements in
performance. We can treat it as a kind of refactoring.

Thanks & Regards,
Nitin Jadhav

On Wed, Jan 18, 2023 at 1:47 PM Bharath Rupireddy
 wrote:
>
> On Wed, Jan 18, 2023 at 1:24 PM Nitin Jadhav
>  wrote:
> >
> > Attaching the patch.
> >
> > On Wed, Jan 18, 2023 at 1:21 PM Nitin Jadhav
> >  wrote:
> > >
> > > Hi,
> > >
> > > GetConfigOptionValues function extracts the config parameters for the
> > > given variable irrespective of whether it results in noshow or not.
> > > But the parent function show_all_settings ignores the values parameter
> > > if it results in noshow. It's unnecessary to fetch all the values
> > > during noshow. So a return statement in GetConfigOptionValues() when
> > > noshow is set to true is needed. Attached the patch for the same.
> > > Please share your thoughts.
>
> Yes, the existing caller isn't using the fetched values when noshow is
> set to true. However, I think returning from GetConfigOptionValues()
> when noshow is set to true without fetching values limits the use of
> the function. What if some other caller wants to use the function to
> get the values with noshow passed in and use the values when noshow is
> set to true?
>
> Also, do we gain anything with the patch? I mean, can
> show_all_settings()/pg_settings/pg_show_all_settings() get faster, say
> with a non-superuser without pg_read_all_settings predefined role or
> with a superuser? I see there're about 6 GUC_NO_SHOW_ALL GUCs and 20
> GUC_SUPERUSER_ONLY GUCs, I'm not sure if it leads to some benefit with
> the patch.
>
> Having said above, I don't mind keeping the things the way they're right now.
>
> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com




Re: Stack overflow issue

2023-01-19 Thread Egor Chindyaskin
Hello! Continuing this topic, I have prepared, under the guidance of
Alexander Lakhin, patches that fix these problems.
We decided that these checks would be enough and put them in the places
we saw fit.

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index fd1a7ac5d5..318f620ed5 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -2812,6 +2812,9 @@ CommitTransactionCommand(void)
 {
 	TransactionState s = CurrentTransactionState;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	switch (s->blockState)
 	{
 			/*
@@ -5194,6 +5197,9 @@ ShowTransactionStateRec(const char *str, TransactionState s)
 {
 	StringInfoData buf;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	initStringInfo(&buf);
 
 	if (s->nChildXids > 0)
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index f8510a1483..d1bbc11d3a 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -74,6 +74,7 @@
 #include "nodes/nodeFuncs.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteRemove.h"
+#include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
@@ -489,6 +490,11 @@ findDependentObjects(const ObjectAddress *object,
 	if (stack_address_present_add_flags(object, objflags, stack))
 		return;
 
+	/* since this function recurses, it could be driven to stack overflow,
+	 * because of the deep dependency tree, not only due to dependency loops.
+	 */
+	check_stack_depth();
+
 	/*
 	 * It's also possible that the target object has already been completely
 	 * processed and put into targetObjects.  If so, again we just add the
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 846c530154..2f243bcf5e 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -513,6 +513,9 @@ CheckAttributeType(const char *attname,
 	char		att_typtype = get_typtype(atttypid);
 	Oid			att_typelem;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	if (att_typtype == TYPTYPE_PSEUDO)
 	{
 		/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f56219973e..4cf08a1e04 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5578,6 +5578,9 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	AclResult	aclresult;
 	ObjectAddress address;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	/* At top level, permission check was done in ATPrepCmd, else do it */
 	if (recursing)
 		ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
@@ -7004,6 +7007,9 @@ ATExecDropColumn(List **wqueue, Relation rel, const char *colName,
 	ObjectAddress object;
 	bool		is_expr;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	/* At top level, permission check was done in ATPrepCmd, else do it */
 	if (recursing)
 		ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
@@ -8563,6 +8569,9 @@ ATExecAlterConstrRecurse(Constraint *cmdcon, Relation conrel, Relation tgrel,
 	Oid			conoid;
 	bool		changed = false;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	currcon = (Form_pg_constraint) GETSTRUCT(contuple);
 	conoid = HeapTupleGetOid(contuple);
 
@@ -9552,6 +9561,9 @@ ATExecDropConstraint(Relation rel, const char *constrName,
 	bool		is_no_inherit_constraint = false;
 	char		contype;
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	/* At top level, permission check was done in ATPrepCmd, else do it */
 	if (recursing)
 		ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index edb56ab3a4..67b3a1f4bf 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2650,6 +2650,10 @@ static Node *
 eval_const_expressions_mutator(Node *node,
 			   eval_const_expressions_context *context)
 {
+
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	if (node == NULL)
 		return NULL;
 	switch (nodeTag(node))
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 22da98c19d..800c1ffac0 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -216,6 +216,9 @@ MemoryContextDelete(MemoryContext context)
 	/* And not CurrentMemoryContext, either */
 	Assert(context != CurrentMemoryContext);
 
+	/* since this function recurses, it could be driven to stack overflow */
+	check_stack_depth();
+
 	/* save a function call in common case where there are no children */
 	if (context->firstchild != NULL

Support plpgsql multi-range in conditional control

2023-01-19 Thread 2903807...@qq.com
Dear hackers, my good friend Hou Jiaxing and I have implemented a version of
the code that supports multiple integer range conditions in the IN clause of
the FOR loop statement in the PL/pgSQL procedural language. A typical
example is as follows:

postgres=# do $$
declare
i int := 10;
begin
for i in 1..10 by 3, reverse i+10..i+1 by 3 loop
   raise info '%', i;
end loop;
end $$;
INFO: 1
INFO: 4
INFO: 7
INFO: 10
INFO: 20
INFO: 17
INFO: 14
INFO: 11
do
postgres=#

Hope to get your feedback, thank you!



2903807...@qq.com


0001-Support-plpgsql-multi-range-in-conditional-control.patch
Description: Binary data


Re: CREATEROLE users vs. role properties

2023-01-19 Thread tushar

On 1/19/23 4:47 AM, Nathan Bossart wrote:

This seems like a clear improvement to me.  However, as the attribute
system becomes more sophisticated, I think we ought to improve the error
messages in user.c.  IMHO messages like "permission denied" could be
greatly improved with some added context.
I observed this behavior where a role has CREATEROLE but is still
unable to grant it to another user.


postgres=# create role abc1 login createrole;
CREATE ROLE
postgres=# create user test1;
CREATE ROLE
postgres=# \c - abc1
You are now connected to database "postgres" as user "abc1".
postgres=> alter role test1 with createrole ;
ERROR:  permission denied
postgres=>

which was working previously without patch.

Is this an expected behavior?

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





Re: Minimal logical decoding on standbys

2023-01-19 Thread Drouvot, Bertrand

Hi,

On 1/19/23 3:46 AM, Andres Freund wrote:

Hi,

On 2023-01-18 11:24:19 +0100, Drouvot, Bertrand wrote:

On 1/6/23 4:40 AM, Andres Freund wrote:

Hm, that's quite expensive. Perhaps worth adding a C helper that can do that
for us instead? This will likely also be needed in real applications after all.



Not sure I got it. What would the C helper be supposed to do?


Call LogStandbySnapshot().



Got it, I like the idea, will do.
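
For readers following along, a minimal sketch of what such a C helper could
look like: a SQL-callable wrapper around LogStandbySnapshot(), built as a small
test extension. The function and module names here are hypothetical
illustrations, not part of any posted patch.

```
/* sketch of a SQL-callable test helper; the name is hypothetical */
#include "postgres.h"

#include "fmgr.h"
#include "storage/standby.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(test_log_standby_snapshot);

Datum
test_log_standby_snapshot(PG_FUNCTION_ARGS)
{
	/* write a running-xacts record so the standby can build/advance snapshots */
	(void) LogStandbySnapshot();

	PG_RETURN_VOID();
}
```

The TAP test could then invoke it through safe_psql() instead of generating the
record indirectly with an expensive SQL workload.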




With a reload in place in my testing, now I notice that the catalog_xmin
is updated on the primary physical slot after logical slots invalidation
when reloading hot_standby_feedback from "off" to "on".

This is not the case after a re-start (aka catalog_xmin is NULL).

I think a re-start and reload should produce identical behavior on
the primary physical slot. If so, I'm tempted to think that the catalog_xmin
should be updated in case of a re-start too (even if all the logical slots are 
invalidated)
because the slots are not dropped yet. What do you think?


I can't quite follow the steps leading up to the difference. Could you list
them in a bit more detail?




Sure, so with:

1) hot_standby_feedback set to off on the standby
2) create 2 logical replication slots on the standby and activate one
3) Invalidate the logical slots on the standby with VACUUM FULL on the primary
4) change hot_standby_feedback to on on the standby

If:

5) pg_reload_conf() on the standby, then on the primary we get a catalog_xmin
for the physical slot that the standby is attached to:

postgres=# select slot_type,xmin,catalog_xmin  from pg_replication_slots ;
 slot_type | xmin | catalog_xmin
---+--+--
 physical  |  822 |  748
(1 row)

But if:

5) re-start the standby, then on the primary we get an empty catalog_xmin
for the physical slot that the standby is attached to:

postgres=# select slot_type,xmin,catalog_xmin  from pg_replication_slots ;
 slot_type | xmin | catalog_xmin
---+--+--
 physical  |  816 |
(1 row)




Can we do something cheaper than rewriting the entire database? Seems
rewriting a single table ought to be sufficient?



While implementing the test at the table level, I discovered that there is
apparently no guarantee that, say, a "vacuum full pg_class;" would
produce a conflict.


I assume that's mostly when there weren't any removals



Indeed, from what I can see in my testing it could generate an XLOG_HEAP2_PRUNE
record with snapshotConflictHorizon set to 0:

"rmgr: Heap2   len (rec/tot):107/   107, tx:848, lsn: 0/03B98B30, 
prev 0/03B98AF0, desc: PRUNE snapshotConflictHorizon 0"


Having a snapshotConflictHorizon of zero leads to
ResolveRecoveryConflictWithSnapshot() simply returning
without any conflict handling.


That doesn't have to mean anything bad. Some row versions can be removed without
creating a conflict. See HeapTupleHeaderAdvanceConflictHorizon(), specifically

 * Ignore tuples inserted by an aborted transaction or if the tuple was
 * updated/deleted by the inserting transaction.




It does look like that, in the standby decoding case, this is not the right
behavior (and that the xid that generated the PRUNE record should be used
instead). What do you think?


That'd not work, because that'll be typically newer than the catalog_xmin. So
we'd start invalidating things left and right, despite not needing to.




Okay, thanks for the explanations, that makes sense.


Did you see anything else around this making you suspicious?



No, but a question still remains to me:

Given the fact that the row removal case is already covered
in the next test (aka Scenario 2), if we want to replace the "vacuum full" test
on the database (done in Scenario 1) with a cheaper one at the table level,
what could it be that guarantees an invalidation?

Same as scenario 2 but with "vacuum full pg_class" would not really add value
to the tests, right?


+##
+# Test standby promotion and logical decoding behavior
+# after the standby gets promoted.
+##
+


I think this also should test the streaming / walsender case.



Do you mean cascading standby?


I mean a logical walsender that starts on a standby and continues across
promotion of the standby.



Got it, thanks, will do.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Use appendStringInfoSpaces more

2023-01-19 Thread David Rowley
In [1] I noticed a somewhat poor usage of appendStringInfoString which
just appends 4 spaces in a loop, one for each indent level of the
jsonb.  It seems better just to use appendStringInfoSpaces and
append all the spaces in one go rather than appending 4 spaces per
iteration. That'll save having to check enlargeStringInfo() once for
each loop iteration.

I'm aiming this mostly as a cleanup patch, but after looking at the
appendStringInfoSpaces code, I thought it could be done a bit more
efficiently by using memset instead of using the while loop that keeps
track of 2 counters. memset has the option of doing more than a char
at a time, which should be useful for larger numbers of spaces.

It does seem a bit faster when appending 8 chars at least going by the
attached spaces.c file.

With -O1
$ ./spaces
while 0.536577 seconds
memset 0.326532 seconds

However, I'm not really expecting much of a performance increase from
this change. I do at least want to make sure I've not made anything
worse, so I used pgbench to run:

select jsonb_pretty(row_to_json(pg_class)::jsonb) from pg_class;

perf top says:

Master:
 0.96%  postgres  [.] add_indent.part.0

Patched
  0.25%  postgres  [.] add_indent.part.0

I can't really detect a certain enough TPS change over the noise. I
expect it might become more significant with more complex json that
has more than a single indent level.

I could only find 1 other instance where we use appendStringInfoString
to append spaces. I've adjusted that one too.

David

[1] 
https://postgr.es/m/caaphdvrrfnsm8df24tmyozpvo-r5zp+0foqvo2xcyhrfteh...@mail.gmail.com
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index e4621ef8d6..5212a64b1e 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -3324,7 +3324,7 @@ show_hashagg_info(AggState *aggstate, ExplainState *es)
if (!gotone)
ExplainIndentText(es);
else
-   appendStringInfoString(es->str, "  ");
+   appendStringInfoSpaces(es->str, 2);
 
 appendStringInfo(es->str, "Batches: %d  Memory Usage: " INT64_FORMAT "kB",
                  aggstate->hash_batches_used, memPeakKb);
diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 4ff2eced4c..0539f41c17 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -626,11 +626,8 @@ add_indent(StringInfo out, bool indent, int level)
 {
if (indent)
{
-   int i;
-
appendStringInfoCharMacro(out, '\n');
-   for (i = 0; i < level; i++)
-   appendBinaryStringInfo(out, "    ", 4);
+   appendStringInfoSpaces(out, level * 4);
}
 }
 
diff --git a/src/common/stringinfo.c b/src/common/stringinfo.c
index b3d3c99b8c..05b22b5c53 100644
--- a/src/common/stringinfo.c
+++ b/src/common/stringinfo.c
@@ -211,8 +211,8 @@ appendStringInfoSpaces(StringInfo str, int count)
enlargeStringInfo(str, count);
 
/* OK, append the spaces */
-   while (--count >= 0)
-   str->data[str->len++] = ' ';
+   memset(&str->data[str->len], ' ', count);
+   str->len += count;
str->data[str->len] = '\0';
}
 }

#include <stdio.h>
#include <string.h>
#include <time.h>

int f1(char *data, int count)
{
int len = 0;

/* OK, append the spaces */
while (--count >= 0)
data[len++] = ' ';
data[len] = '\0';
return len;
}

int f2(char *data, int count)
{
memset(data, ' ', count);
data[count] = '\0';
return count;
}

#define NLOOPS 1
int main(void)
{
char buffer[8];
clock_t start, end;
unsigned long long x = 0;

start = clock();
for (int i = 0; i < NLOOPS; i++)
{
x += f1(buffer, sizeof(buffer) - 1);
}
end = clock();

printf("while %g seconds\n", (double) (end - start) / CLOCKS_PER_SEC);

start = clock();
for (int i = 0; i < NLOOPS; i++)
{
x += f2(buffer, sizeof(buffer) - 1);
}
end = clock();

printf("memset %g seconds\n", (double) (end - start) / CLOCKS_PER_SEC);
printf("%lld\n", x);
return 0;
}


Re: Improve GetConfigOptionValues function

2023-01-19 Thread Nitin Jadhav
> Possibly a better answer is to refactor into separate functions,
> along the lines of
>
> static bool
> ConfigOptionIsShowable(struct config_generic *conf)
>
> static void
> GetConfigOptionValues(struct config_generic *conf, const char **values)

Nice suggestion. Attached a patch for the same. Please share the
comments if any.
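
To make the suggested shape concrete, here is a simplified standalone sketch of
the split (my illustration, not the attached patch): one predicate that decides
showability and one function that only extracts values. The struct fields and
the privilege argument are stand-ins; the real guc.c logic also has to consider
membership in pg_read_all_settings.

```
#include <stdbool.h>

/* simplified stand-ins for the real guc.c definitions */
#define GUC_NO_SHOW_ALL		0x0001
#define GUC_SUPERUSER_ONLY	0x0002

struct config_generic
{
	const char *name;
	int			flags;
	const char *setting;
};

/* showability decision, separated out as suggested */
static bool
ConfigOptionIsShowable(const struct config_generic *conf, bool has_privs)
{
	if (conf->flags & GUC_NO_SHOW_ALL)
		return false;
	if ((conf->flags & GUC_SUPERUSER_ONLY) && !has_privs)
		return false;
	return true;
}

/* value extraction only; no longer concerned with noshow at all */
static void
GetConfigOptionValues(const struct config_generic *conf, const char **values)
{
	values[0] = conf->name;
	values[1] = conf->setting;
}
```

The caller (show_all_settings() in the real code) would then skip a variable
entirely when ConfigOptionIsShowable() returns false and fetch values otherwise.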

Thanks & Regards,
Nitin Jadhav

On Wed, Jan 18, 2023 at 9:44 PM Tom Lane  wrote:
>
> Nitin Jadhav  writes:
> > GetConfigOptionValues function extracts the config parameters for the
> > given variable irrespective of whether it results in noshow or not.
> > But the parent function show_all_settings ignores the values parameter
> > if it results in noshow. It's unnecessary to fetch all the values
> > during noshow. So a return statement in GetConfigOptionValues() when
> > noshow is set to true is needed. Attached the patch for the same.
> > Please share your thoughts.
>
> I do not think this is an improvement: it causes GetConfigOptionValues
> to be making assumptions about how its results will be used.  If
> show_all_settings() were a big performance bottleneck, and there were
> a lot of no-show values that we could optimize, then maybe the extra
> coupling would be worthwhile.  But I don't believe either of those
> things.
>
> Possibly a better answer is to refactor into separate functions,
> along the lines of
>
> static bool
> ConfigOptionIsShowable(struct config_generic *conf)
>
> static void
> GetConfigOptionValues(struct config_generic *conf, const char **values)
>
> regards, tom lane


v2-0001-Refactor-GetConfigOptionValues-function.patch
Description: Binary data


Re: Record queryid when auto_explain.log_verbose is on

2023-01-19 Thread Julien Rouhaud
Hi,

On Tue, Jan 17, 2023 at 10:53:23PM +0900, torikoshia wrote:
> >
> > For interactive EXPLAIN the query identifier is printed just after the
> > plan,
> > before the triggers and the JIT summary so auto_explain should do the
> > same.
> Thanks for the comment!
> Agreed and updated the patch.

Thanks!

> On 2023-01-17 03:53, Justin Pryzby wrote:
> > On Mon, Jan 16, 2023 at 09:36:59PM +0900, torikoshia wrote:
> > > Attached patch makes auto_explain also print query identifiers.
> >
> > This was asked during the initial patch; does your patch address the
> > issues here ?
> >
> > https://www.postgresql.org/message-id/20200308142644.vlihk7djpwqjkp7w%40nol
>
> Thanks!
> I may be misunderstanding something, but it seems that the issue occurred because
> queryid was calculated in pgss_post_parse_analyze() at that time.
>
> ```
> --- queryid_exposure-v6.diff, which is the patch just before the discussion
> @@ -792,16 +801,34 @@ pgss_post_parse_analyze(ParseState *pstate, Query
> *query)
> ..snip..
>
> if (query->utilityStmt)
> {
> -   query->queryId = UINT64CONST(0);
> +   if (pgss_track_utility && PGSS_HANDLED_UTILITY(query->utilityStmt)
> +   && pstate->p_sourcetext)
> +   {
> +   const char *querytext = pstate->p_sourcetext;
> +   int query_location = query->stmt_location;
> +   int query_len = query->stmt_len;
> +
> +   /*
> +* Confine our attention to the relevant part of the string, if
> the
> +* query is a portion of a multi-statement source string.
> +*/
> +   querytext = pgss_clean_querytext(pstate->p_sourcetext,
> +&query_location,
> +&query_len);
> +
> +   query->queryId = pgss_compute_utility_queryid(querytext,
> query_len);
> ```
>
> Currently queryId is not calculated in pgss_post_parse_analyze(), so the
> issue does not occur, does it?
> I confirmed that the attached patch logged the queryid for some utility statements.
>
> ```
> LOG:  0: duration: 0.017 ms  plan:
> Query Text: prepare p1 as select 1;
> Result  (cost=0.00..0.01 rows=1 width=4)
>   Output: 1
> Query Identifier: -1801652217649936326
> ```

Yes, this problem was fixed a long time ago.  Just to be sure I checked that
auto_explain and explain agree on the queryid:

=# select count(*) from pg_class;
[...]
LOG:  duration: 0.239 ms  plan:
Query Text: select count(*) from pg_class;
Aggregate  (cost=15.45..15.46 rows=1 width=8)
  Output: count(*)
  ->  Index Only Scan using pg_class_tblspc_relfilenode_index on 
pg_catalog.pg_class  (cost=0.15..14.40 rows=417 width=0)
Output: reltablespace, relfilenode
Query Identifier: 2855866587085353326

=# explain (verbose) select count(*) from pg_class;
QUERY PLAN  
 >
->
 Aggregate  (cost=15.45..15.46 rows=1 width=8)
   Output: count(*)
   ->  Index Only Scan using pg_class_tblspc_relfilenode_index on 
pg_catalog.pg_class  (cost=0.15..14.40 rows>
 Output: reltablespace, relfilenode
 Query Identifier: 2855866587085353326

LOG:  duration: 0.000 ms  plan:
Query Text: explain (verbose) select count(*) from pg_class;
Aggregate  (cost=15.45..15.46 rows=1 width=8)
  Output: count(*)
  ->  Index Only Scan using pg_class_tblspc_relfilenode_index on 
pg_catalog.pg_class  (cost=0.15..14.40 rows=417 width=0)
Output: reltablespace, relfilenode
Query Identifier: 2855866587085353326

So the patch looks good to me.  I didn't find any entry in the commitfest, did
I miss it?  If not, could you create one (feel free to add me and Justin as
reviewers, and probably mark it as RfC).

It's a bit annoying that the info is missing since pg 14, but we probably can't
backpatch this as it might break log parser tools.




Re: [PATCH] Add native windows on arm64 support

2023-01-19 Thread Niyas Sait




On 17/01/2023 22:51, Andres Freund wrote:

Hi,

On 2022-12-16 10:52:23 +, Niyas Sait wrote:

Subject: [PATCH v7] Enable postgres native build for windows-arm64 platform



  elif host_cpu == 'arm' or host_cpu == 'aarch64'
  
-  prog = '''

+  if cc.get_id() == 'msvc'
+cdata.set('USE_ARMV8_CRC32C', false)
+cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 1)
+have_optimized_crc = true


I dimly recall that windows might actually require the relevant extension on
arm?


Do you mean we don't need the runtime checks for CRC?

CRC is an optional extension to the ARMv8 base architecture. I am not sure
whether Windows makes it mandatory to have this implementation.





+  else
+prog = '''
  #include <arm_acle.h>


I'd just make this include #ifdef _MSV_VER (or whatever it is).


The code snippet is not used for the MSVC path. I am not sure why we need to
add the #ifdef _MSC_VER.



  int main(void)
@@ -1960,18 +1966,19 @@ int main(void)
  }
  '''
  
-  if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd without -march=armv8-a+crc',

-  args: test_c_args)
-# Use ARM CRC Extension unconditionally
-cdata.set('USE_ARMV8_CRC32C', 1)
-have_optimized_crc = true
-  elif cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd 
with -march=armv8-a+crc',
-  args: test_c_args + ['-march=armv8-a+crc'])
-# Use ARM CRC Extension, with runtime check
-cflags_crc += '-march=armv8-a+crc'
-cdata.set('USE_ARMV8_CRC32C', false)
-cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 1)
-have_optimized_crc = true
+if cc.links(prog, name: '__crc32cb, __crc32ch, __crc32cw, and __crc32cd 
without -march=armv8-a+crc',
+args: test_c_args)


Seems like it'd be easier to read if you don't re-indent this, but just have
the cc.get_id() == 'msvc' part of this if/else-if.



Yes, that looks better. I will do it in the next patch.

Thanks Andres for the review.

--
Niyas
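
As an aside for readers unfamiliar with the probe being discussed: below is a
minimal standalone sketch of the kind of test program that exercises the
__crc32cb/__crc32ch/__crc32cw/__crc32cd intrinsics. It is an illustration based
on the quoted snippet, not the exact program in the tree; with GCC/Clang it
needs -march=armv8-a+crc unless the baseline architecture already includes the
CRC extension, which is exactly what the runtime-check question above is about.

```
#include <arm_acle.h>
#include <stdint.h>

int
main(void)
{
	uint32_t	crc = 0xFFFFFFFF;
	uint64_t	data = 42;

	/* one call per intrinsic so a link test covers them all */
	crc = __crc32cb(crc, (uint8_t) data);
	crc = __crc32ch(crc, (uint16_t) data);
	crc = __crc32cw(crc, (uint32_t) data);
	crc = __crc32cd(crc, data);

	return (int) crc;
}
```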




Re: Perform streaming logical transactions by background workers and parallel apply

2023-01-19 Thread shveta malik
On Thu, Jan 19, 2023 at 11:11 AM Amit Kapila  wrote:
>
> On Wed, Jan 18, 2023 at 12:09 PM Amit Kapila  wrote:
> >
> > On Fri, Jan 13, 2023 at 11:50 AM Peter Smith  wrote:
> > >
> > > Here are some review comments for patch v79-0002.
> > >
> >
> > So, this is about the latest v84-0001-Stop-extra-worker-if-GUC-was-changed.
> >
> > >
> > > I feel this patch just adds more complexity for almost no gain:
> > > - reducing the 'max_apply_workers_per_suibscription' seems not very
> > > common in the first place.
> > > - even when the GUC is reduced, at that point in time all the workers
> > > might be in use so there may be nothing that can be immediately done.
> > > - IIUC the excess workers (for a reduced GUC) are going to get freed
> > > naturally anyway over time as more transactions are completed so the
> > > pool size will reduce accordingly.
> > >
> >
> > I am still not sure if it is worth pursuing this patch because of the
> > above reasons. I don't think it would be difficult to add this even at
> > a later point in time if we really see a use case for this.
> > Sawada-San, IIRC, you raised this point. What do you think?
> >
> > The other point I am wondering is whether we can have a different way
> > to test partial serialization apart from introducing another developer
> > GUC (stream_serialize_threshold). One possibility could be that we can
> > have a subscription option (parallel_send_timeout or something like
> > that) with some default value (current_timeout used in the patch)
> > which will be used only when streaming = parallel. Users may want to
> > wait for more time before serialization starts depending on the
> > workload (say when resource usage is high on a subscriber-side
> > machine, or there are concurrent long-running transactions that can
> > block parallel apply for a bit longer time). I know with this as well
> > it may not be straightforward to test the functionality because we
> > can't be sure how many changes would be required for a timeout to
> > occur. This is just for brainstorming other options to test the
> > partial serialization functionality.
> >
>
> Apart from the above, we can also have a subscription option to
> specify parallel_shm_queue_size (queue_size used to determine the
> queue between the leader and parallel worker) and that can be used for
> this purpose. Basically, configuring it to a smaller value can help in
> reducing the test time but still, it will not eliminate the need for
> dependency on timing we have to wait before switching to partial
> serialize mode. I think this can be used in production as well to tune
> the performance depending on workload.
>
> Yet another way is to use the existing parameter logical_decode_mode
> [1]. If the value of logical_decoding_mode is 'immediate', then we can
> immediately switch to partial serialize mode. This will eliminate the
> dependency on timing. The one argument against using this is that it
> won't be as clear as a separate parameter like
> 'stream_serialize_threshold' proposed by the patch but OTOH we already
> have a few parameters that serve a different purpose when used on the
> subscriber. For example, 'max_replication_slots' is used to define the
> maximum number of replication slots on the publisher and the maximum
> number of origins on subscribers. Similarly,
> wal_retrieve_retry_interval' is used for different purposes on
> subscriber and standby nodes.
>
> [1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html
>
> --
> With Regards,
> Amit Kapila.

Hi Amit,

On rethinking the complete model, what I feel is that the name
logical_decoding_mode is not something which defines modes of logical
decoding. We, I think, picked it based on logical_decoding_work_mem.
As per the current implementation, the parameter 'logical_decoding_mode'
tells what happens when the work-memory used by logical decoding reaches
its limit. So it is in fact a 'logicalrep_workmem_vacate_mode' or
'logicalrep_trans_eviction_mode'. If it is set to immediate, that
means vacate the work-memory immediately, i.e. evict transactions
immediately. And 'buffered' means the reverse (i.e. keep on buffering
transactions until we reach the limit). Now coming to subscribers, we
can reuse the same parameter. On the subscriber as well, the shared-memory
queue could be considered its workmem, and thus the name
'logicalrep_workmem_vacate_mode' might look more relevant.

On publisher:
logicalrep_workmem_vacate_mode=immediate, buffered.

On subscriber:
logicalrep_workmem_vacate_mode=stream_serialize  (or if we want to
keep immediate here too, that will also be fine).

Thoughts?
And I am assuming it is possible to change the GUC name before the
coming release. If not, please let me know, we can brainstorm other
ideas.

thanks
Shveta




Re: Perform streaming logical transactions by background workers and parallel apply

2023-01-19 Thread shveta malik
On Thu, Jan 19, 2023 at 3:44 PM shveta malik  wrote:
>
> On Thu, Jan 19, 2023 at 11:11 AM Amit Kapila  wrote:
> >
> > On Wed, Jan 18, 2023 at 12:09 PM Amit Kapila  
> > wrote:
> > >
> > > On Fri, Jan 13, 2023 at 11:50 AM Peter Smith  
> > > wrote:
> > > >
> > > > Here are some review comments for patch v79-0002.
> > > >
> > >
> > > So, this is about the latest 
> > > v84-0001-Stop-extra-worker-if-GUC-was-changed.
> > >
> > > >
> > > > I feel this patch just adds more complexity for almost no gain:
> > > > - reducing the 'max_apply_workers_per_suibscription' seems not very
> > > > common in the first place.
> > > > - even when the GUC is reduced, at that point in time all the workers
> > > > might be in use so there may be nothing that can be immediately done.
> > > > - IIUC the excess workers (for a reduced GUC) are going to get freed
> > > > naturally anyway over time as more transactions are completed so the
> > > > pool size will reduce accordingly.
> > > >
> > >
> > > I am still not sure if it is worth pursuing this patch because of the
> > > above reasons. I don't think it would be difficult to add this even at
> > > a later point in time if we really see a use case for this.
> > > Sawada-San, IIRC, you raised this point. What do you think?
> > >
> > > The other point I am wondering is whether we can have a different way
> > > to test partial serialization apart from introducing another developer
> > > GUC (stream_serialize_threshold). One possibility could be that we can
> > > have a subscription option (parallel_send_timeout or something like
> > > that) with some default value (current_timeout used in the patch)
> > > which will be used only when streaming = parallel. Users may want to
> > > wait for more time before serialization starts depending on the
> > > workload (say when resource usage is high on a subscriber-side
> > > machine, or there are concurrent long-running transactions that can
> > > block parallel apply for a bit longer time). I know with this as well
> > > it may not be straightforward to test the functionality because we
> > > can't be sure how many changes would be required for a timeout to
> > > occur. This is just for brainstorming other options to test the
> > > partial serialization functionality.
> > >
> >
> > Apart from the above, we can also have a subscription option to
> > specify parallel_shm_queue_size (queue_size used to determine the
> > queue between the leader and parallel worker) and that can be used for
> > this purpose. Basically, configuring it to a smaller value can help in
> > reducing the test time but still, it will not eliminate the need for
> > dependency on timing we have to wait before switching to partial
> > serialize mode. I think this can be used in production as well to tune
> > the performance depending on workload.
> >
> > Yet another way is to use the existing parameter logical_decode_mode
> > [1]. If the value of logical_decoding_mode is 'immediate', then we can
> > immediately switch to partial serialize mode. This will eliminate the
> > dependency on timing. The one argument against using this is that it
> > won't be as clear as a separate parameter like
> > 'stream_serialize_threshold' proposed by the patch but OTOH we already
> > have a few parameters that serve a different purpose when used on the
> > subscriber. For example, 'max_replication_slots' is used to define the
> > maximum number of replication slots on the publisher and the maximum
> > number of origins on subscribers. Similarly,
> > wal_retrieve_retry_interval' is used for different purposes on
> > subscriber and standby nodes.
> >
> > [1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html
> >
> > --
> > With Regards,
> > Amit Kapila.
>
> Hi Amit,
>
> On rethinking the complete model, what I feel is that the name
> logical_decoding_mode is not something which defines modes of logical
> decoding. We, I think, picked it based on logical_decoding_work_mem.
> As per current implementation, the parameter 'logical_decoding_mode'
> tells what happens when work-memory used by logical decoding reaches
> its limit. So it is in-fact 'logicalrep_workmem_vacate_mode' or

A minor correction to what I said earlier:
as per the current implementation, the parameter 'logical_decoding_mode'
more or less tells how to deal with workmem, i.e. whether to keep buffering
txns in it until it reaches its limit or to vacate it immediately.

> 'logicalrep_trans_eviction_mode'. So if it is set to immediate,
> meaning vacate the work-memory immediately or evict transactions
> immediately. Add buffered means the reverse (i.e. keep on buffering
> transactions until we reach a limit). Now coming to subscribers, we
> can reuse the same parameter. On subscriber as well, shared-memory
> queue could be considered as its workmem and thus the name
> 'logicalrep_workmem_vacate_mode' might look more relevant.
>
> On publisher:
> logicalrep_workmem_vacate_mode=immediate, buffered.
>
> On subscriber:
> logicalrep_wo

Re: Rethinking the implementation of ts_headline()

2023-01-19 Thread Alvaro Herrera
On 2023-Jan-18, Tom Lane wrote:

> Alvaro Herrera  writes:

> > and was surprised that the match for the 'day & drink' arm of the OR
> > disappears from the reported headline.
> 
> I'd argue that that's exactly what should happen.  It's supposed to
> find as-short-as-possible cover strings that satisfy the query.

OK, that makes sense.

> I don't find anything particularly principled about the old behavior:
> 
> >  Day after day, day after day,↵
> >We stuck ... motion,   ↵
> >  As idle as a painted Ship  ↵
> >Upon
> 
> It's including hits for "day" into the cover despite the lack of any
> nearby match to "drink".

I suppose it would be possible to put 'day' and 'drink' in two different
fragments: since the query has a & operator for them, the words don't
necessarily have to be nearby.  But OK, your argument for this being the
shortest result is sensible.

I do wonder, though, if it's effectively usable for somebody building a
search interface on top.  If I'm ranking the results from several
documents, and this document comes on top of others that only match one
arm of the OR query, then I would like to be able to show the matches
for both arms of the OR.

> > Another thing I think might be a regression is the way fragments are
> > selected.  Consider what happens if I change the "idle & painted" in the
> > earlier query to "idle <-> painted", and MaxWords is kept low:
> 
> Of course, "idle <-> painted" is satisfied nowhere in this text
> (the words are there, but not adjacent).

Oh, I see the problem, and it is my misunderstanding: the <-> operator
is counting the words in between, even if they are stop words.  I
understood from the docs that those words were ignored, but that is not
so.  I misread the phraseto_tsquery doc as though they were explaining
the <-> operator.

Another experiment shows that the headline becomes "complete" only if I
specify the exact distance in the <N> operator:

SELECT dist, ts_headline('simple', 'As idle as a painted Ship', 
to_tsquery('simple', format('(idle <%s> painted)', dist)), 'MaxFragments=5, 
MaxWords=8, MinWords=4') from generate_series(1, 4) dist;
 dist │ ts_headline  
──┼──
1 │ As idle as a
2 │ As idle as a
3 │ idle as a painted Ship
4 │ As idle as a
(4 rows)

I again have to question how valuable in practice a <N> operator is
that's so strict that I have to know exactly how many stop words I want
there to be in between the words of the phrase search.  For some reason, in my mind I
had it as "at most N words, ignoring stop words", but that's not what it
is.

Anyway, I don't think this needs to stop your current patch.

> So the cover has to run from the last 'day' to the 'drink'.  I think
> v15 is deciding that it runs from the first 'day' to the 'drink',
> which while not exactly wrong is not the shortest cover.

Sounds fair.

> The rest of this is just minor variations in what mark_hl_fragments()
> decides to do based on the precise cover string it's given.  I don't
> dispute that mark_hl_fragments() could be improved, but this patch
> doesn't touch its algorithm and I think that doing so should be
> material for a different patch.

Understood and agreed.

> > (Both 15 and master highlight 'painted' in the "Upon a painted Ocean"
> > verse, which perhaps they shouldn't do, since it's not preceded by
> > 'idle'.)
> 
> Yeah, and 'idle' too.  Once it's chosen a string to show, it'll
> highlight all the query words within that string, whether they
> constitute part of the match or not.  I can see arguments on both
> sides of doing it that way; it was probably more sensible before
> we had phrase match than it is now.  But again, changing that phase
> of the processing is outside the scope of this patch.  I'm just
> trying to undo the damage I did to the cover-string-selection phase.

All clear then.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)




Re: Logical replication timeout problem

2023-01-19 Thread Ashutosh Bapat
On Wed, Jan 18, 2023 at 6:00 PM Amit Kapila  wrote:

> + */
> + ReorderBufferUpdateProgressCB update_progress;
>
> Are you suggesting changing the name of the above variable? If so, how
> about apply_progress, progress, or updateprogress? If you don't like
> any of these then feel free to suggest something else. If we change
> the variable name then accordingly, we need to update
> ReorderBufferUpdateProgressCB as well.
>

I would have liked all the callback names to be renamed with the prefix
"rbcb_xxx" so that they have much less chance of conflicting with
similar names in the code base. But it's probably too late to do that :).

How about update_txn_progress, since the CB is supposed to be used only
within a transaction? Or update_progress_txn?
update_progress_cb_wrapper needs a change of name as well.

-- 
Best Wishes,
Ashutosh Bapat




Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?

2023-01-19 Thread David Geier

Hi Andres,


I also couldn't help and hacked a bit on the rdtsc pieces. I did figure out
how to do the cycles->nanosecond conversion with integer shift and multiply in
the common case, which does show a noticeable speedup. But that's for another
day.
I also have code for that here. I decided against integrating it because 
we don't convert frequently enough to make it matter. Or am I missing 
something?
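
For anyone curious what that looks like, here is a small standalone sketch of
the integer multiply-and-shift idea (my own illustration of the technique
mentioned above, not code from either patch set): precompute
mult = (10^9 << shift) / tsc_frequency once, then each conversion is a multiply
and a shift instead of a division. The shift value and frequency below are
assumptions, and a real implementation would also have to guard against
overflow for very long intervals.

```
#include <stdint.h>
#include <stdio.h>

static uint64_t ns_per_cycle_mult;
static const unsigned int ns_shift = 24;

/* compute the fixed-point multiplier once, at startup */
static void
init_cycles_to_ns(uint64_t tsc_freq_hz)
{
	ns_per_cycle_mult = (UINT64_C(1000000000) << ns_shift) / tsc_freq_hz;
}

/* hot path: one multiply and one shift, no division */
static inline uint64_t
cycles_to_ns(uint64_t cycles)
{
	return (cycles * ns_per_cycle_mult) >> ns_shift;
}

int
main(void)
{
	init_cycles_to_ns(UINT64_C(3000000000));	/* pretend a 3 GHz TSC */
	printf("%llu ns\n", (unsigned long long) cycles_to_ns(4500));	/* ~1500 ns */
	return 0;
}
```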

I fought a bit with myself about whether to send those patches in this thread,
because it'll take over the CF entry. But decided that it's ok, given that
David's patches should be rebased over these anyway?

That's alright.
Though I would hope we attempt to bring in your patch set as well as the
RDTSC patch set.


--
David Geier
(ServiceNow)





Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-19 Thread vignesh C
On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
 wrote:
>
> Updated the comment and the function call.
>
> Kindly have a look at the updated patch v17.

Thanks for the updated patch, a few comments:
1) min_apply_delay was accepting values like '600 m s h'; I was not
sure if we should allow this (see the standalone sketch after these
review comments):
alter subscription sub1 set (min_apply_delay = ' 600 m s h');

+   /*
+    * If no unit was specified, then explicitly add 'ms' otherwise
+    * the interval_in function would assume 'seconds'.
+    */
+   if (strspn(tmp, "-0123456789 ") == strlen(tmp))
+       val = psprintf("%sms", tmp);
+   else
+       val = tmp;
+
+   interval = DatumGetIntervalP(DirectFunctionCall3(interval_in,
+                                                    CStringGetDatum(val),
+                                                    ObjectIdGetDatum(InvalidOid),
+                                                    Int32GetDatum(-1)));

2) How about adding current_txn_wait_time in
pg_stat_subscription_stats, we can update the current_txn_wait_time
periodically, this will help the user to check approximately how much
time is left(min_apply_delay - stat value) before this transaction
will be applied in the subscription. If you agree this can be 0002
patch.

3) There is one check at parse_subscription_options and another check
in AlterSubscription; this looks like a redundant check in the case of
ALTER SUBSCRIPTION. Can we try to merge them and keep the check in one place:
/*
 * The combination of parallel streaming mode and min_apply_delay is not
 * allowed.
 */
if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
    opts->min_apply_delay > 0)
{
    if (opts->streaming == LOGICALREP_STREAM_PARALLEL)
        ereport(ERROR,
                errcode(ERRCODE_SYNTAX_ERROR),
                errmsg("%s and %s are mutually exclusive options",
                       "min_apply_delay > 0", "streaming = parallel"));
}

if (IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY))
{
    /*
     * The combination of parallel streaming mode and
     * min_apply_delay is not allowed.
     */
    if (opts.min_apply_delay > 0)
        if ((IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
             opts.streaming == LOGICALREP_STREAM_PARALLEL) ||
            (!IsSet(opts.specified_opts, SUBOPT_STREAMING) &&
             sub->stream == LOGICALREP_STREAM_PARALLEL))
            ereport(ERROR,
                    errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                    errmsg("cannot enable %s for subscription in %s mode",
                           "min_apply_delay", "streaming = parallel"));

    values[Anum_pg_subscription_subminapplydelay - 1] =
        Int64GetDatum(opts.min_apply_delay);
    replaces[Anum_pg_subscription_subminapplydelay - 1] = true;
}

4) typo "execeeds" should be "exceeds"

+  time on the subscriber. Any overhead of time spent in
logical decoding
+  and in transferring the transaction may reduce the actual wait time.
+  It is also possible that the overhead already execeeds the requested
+  min_apply_delay value, in which case no additional
+  wait is necessary. If the system clocks on publisher and subscriber
+  are not synchronized, this may lead to apply changes earlier than

Regards,
Vignesh




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2023-01-19 Thread Jelte Fennema
Is there anything that is currently blocking this patch? I'd quite
like it to get into PG16.

Especially since I recently ran into another use case that I would want
to use this patch for: adding an async cancel function to Python's
psycopg3 library. This library exposes both a Connection class and an
AsyncConnection class (using Python's asyncio feature). But one downside
of the AsyncConnection type is that it doesn't have a cancel method.

I ran into this while changing the PgBouncer tests to use Python. And
the cancellation tests were the only tests that required me to use a
ThreadPoolExecutor instead of simply being able to use async-await
style programming:
https://github.com/pgbouncer/pgbouncer/blob/master/test/test_cancel.py#LL9C17-L9C17




Re: TAP output format in pg_regress

2023-01-19 Thread vignesh C
On Fri, 6 Jan 2023 at 11:20, vignesh C  wrote:
>
> On Tue, 3 Jan 2023 at 16:01, vignesh C  wrote:
> >
> > On Tue, 29 Nov 2022 at 00:57, Daniel Gustafsson  wrote:
> > >
> > > > On 28 Nov 2022, at 20:02, Nikolay Shaplov  wrote:
> > >
> > > > From my reviewer's point of view patch is ready for commit.
> > > >
> > > > Thank you for your patience :-)
> > >
> > > Thanks for review.
> > >
> > > The attached tweaks a few comments and attempts to address the compiler 
> > > warning
> > > error in the CFBot CI.  Not sure I entirely agree with the compiler there 
> > > but
> > > here is an attempt to work around it at least (by always copying the 
> > > va_list
> > > for the logfile). It also adds a missing va_end for the logfile va_list.
> > >
> > > I hope this is close to a final version of this patch (commitmessage 
> > > needs a
> > > bit of work though).
> > >
> > > > PS Should I change commitfest status?
> > >
> > > Sure, go ahead.
> > >
> >
> > The patch does not apply on top of HEAD as in [1], please post a rebased 
> > patch:
> > === Applying patches on top of PostgreSQL commit ID
> > 92957ed98c5c565362ce665266132a7f08f6b0c0 ===
> > === applying patch
> > ./v14-0001-Change-pg_regress-output-format-to-be-TAP-compli.patch
> > patching file meson.build
> > Hunk #1 FAILED at 2968.
> > 1 out of 1 hunk FAILED -- saving rejects to file meson.build.rej
>
> Attached a rebased patch on top of HEAD to try and see if we can close
> this patch in this commitfest.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== applying patch
./v15-0001-Change-pg_regress-output-format-to-be-TAP-compli.patch
patching file meson.build
patching file src/test/regress/pg_regress.c
...
Hunk #58 FAILED at 2584.
2 out of 58 hunks FAILED -- saving rejects to file
src/test/regress/pg_regress.c.rej

[1] - http://cfbot.cputube.org/patch_41_3837.log

Regards,
Vignesh




Re: CREATEROLE users vs. role properties

2023-01-19 Thread tushar

On 1/19/23 3:05 PM, tushar wrote:
which was working previously without patch. 

My bad, I was testing against PG v15 but this issue is not
reproducible on master (without patch).

As you mentioned, "This implements the standard idea that you can't give
permissions you don't have (but you can give the ones you do have)", but
here the role has the createrole privilege yet cannot pass it on to
another user? Is this expected?


postgres=# create role fff with createrole;
CREATE ROLE
postgres=# create role xxx;
CREATE ROLE
postgres=# set role fff;
SET
postgres=> alter role xxx with createrole;
ERROR:  permission denied
postgres=>

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





Re: ExecRTCheckPerms() and many prunable partitions (checkAsUser)

2023-01-19 Thread Amit Langote
On Tue, Jan 17, 2023 at 7:33 PM Alvaro Herrera  wrote:
> On 2022-Dec-12, Amit Langote wrote:
> > On Sun, Dec 11, 2022 at 6:25 PM Amit Langote  
> > wrote:
> > > I've attached 0001 to remove those extraneous code blocks and add a
> > > comment mentioning that userid need not be recomputed.
> > >
> > > While staring at the build_simple_rel() bit mentioned above, I
> > > realized that this code fails to set userid correctly in the
> > > inheritance parent rels that are child relations of subquery parent
> > > relations, such as UNION ALL subqueries.  In that case, instead of
> > > copying the userid (= 0) of the parent rel, the child should look up
> > > its own RTEPermissionInfo, which should be there, and use the
> > > checkAsUser value from there.  I've attached 0002 to fix this hole.  I
> > > am not sure whether there's a way to add a test case for this in the
> > > core suite.
> >
> > Ah, I realized we could just expand the test added by 553d2ec27 with a
> > wrapper view (to test checkAsUser functionality) and a UNION ALL query
> > over the view (to test this change).
>
> Hmm, but if I run this test without the code change in 0002, the test
> also passes (to wit: the plan still has two hash joins), so I understand
> that either you're testing something that's not changed by the patch,
> or the test case itself isn't really what you wanted.

Yeah, the test case is bogus. :-(.

It seems that, with the test as written, it's not the partitioned
table referenced in the view's query that becomes a child of the UNION
ALL parent subquery, but the subquery itself.  The bug being fixed in
0002 doesn't affect the planning of this query at all, because the
child subquery is planned independently of the main query involving
UNION ALL, since it cannot be pushed up into the latter.  We want the
partitioned table referenced in the child subquery to become a child
of the UNION ALL parent subquery for the bug to be relevant.

I tried rewriting the test so that the view's subquery does get pulled
up and the partitioned table becomes a child of the UNION ALL
subquery.  By attaching a debugger, I do see the bug affecting the
planning of this query, but still not in a way that changes the plan.
I will keep trying, but in the meantime I'm attaching 0001 to show the
rewritten query and the plan.

> As for 0001, it seems simpler to me to put the 'userid' variable in the
> same scope as 'onerel', and then compute it just once and don't bother
> with it at all afterwards, as in the attached.

That sounds better.  Attached as 0002.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v3-0001-Add-test-case-to-test-a-bug-of-build_simple_rel.patch
Description: Binary data


v3-0002-Remove-some-dead-code-in-selfuncs.c.patch
Description: Binary data


v3-0003-Correctly-set-userid-of-subquery-rel-s-child-rels.patch
Description: Binary data


Re: Adjust the description of OutputPluginCallbacks in pg-doc

2023-01-19 Thread Amit Kapila
On Wed, Jan 11, 2023 at 4:20 PM wangw.f...@fujitsu.com
 wrote:
>
> When I was reading the "Logical Decoding Output Plugins" chapter in pg-doc 
> [1],
> I think in the summary section, only the callback message_cb is not described
> whether it is required or optional, and the description of callback
> stream_prepare_cb seems inaccurate.
>
> And after the summary section, I think only the callback stream_xxx_cb section
> and the callback truncate_cb section are not described this tag (are they
> required or optional).
>
> I think we could improve these to be more reader friendly. So I tried to write
> a patch for these and attach it.
>
> Any thoughts?
>

This looks mostly good to me. I have made minor adjustments in the
attached. Do those make sense to you?

-- 
With Regards,
Amit Kapila.


v2-0001-Improve-the-description-of-Output-Plugin-Callback.patch
Description: Binary data


Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2023-01-19 Thread vignesh C
On Wed, 18 Jan 2023 at 03:30, Melanie Plageman
 wrote:
>
> v49 attached
>
> On Tue, Jan 17, 2023 at 2:12 PM Andres Freund  wrote:
> > On 2023-01-17 12:22:14 -0500, Melanie Plageman wrote:
> >
> > > > > +typedef struct PgStat_BackendIO
> > > > > +{
> > > > > + PgStat_Counter 
> > > > > data[IOCONTEXT_NUM_TYPES][IOOBJECT_NUM_TYPES][IOOP_NUM_TYPES];
> > > > > +} PgStat_BackendIO;
> > > >
> > > > Would it bother you if we swapped the order of iocontext and iobject 
> > > > here and
> > > > related places? It makes more sense to me semantically, and should now 
> > > > be
> > > > pretty easy, code wise.
> > >
> > > So, thinking about this I started noticing inconsistencies in other
> > > areas around this order:
> > > For example: ordering of objects mentioned in commit messages and 
> > > comments,
> > > ordering of parameters (like in pgstat_count_io_op() [currently in
> > > reverse order]).
> > >
> > > I think we should make a final decision about this ordering and then
> > > make everywhere consistent (including ordering in the view).
> > >
> > > Currently the order is:
> > > BackendType
> > >   IOContext
> > > IOObject
> > >   IOOp
> > >
> > > You are suggesting this order:
> > > BackendType
> > >   IOObject
> > > IOContext
> > >   IOOp
> > >
> > > Could you explain what you find more natural about this ordering (as I
> > > find the other more natural)?
> >
> > The object we're performing IO on determines more things than the context. 
> > So
> > it just seems like the natural hierarchical fit. The context is a 
> > sub-category
> > of the object. Consider how it'll look like if we also have objects for 
> > 'wal',
> > 'temp files'. It'll make sense to group by just the object, but it won't 
> > make
> > sense to group by just the context.
> >
> > If it were trivial to do I'd use a different IOContext for each IOObject. 
> > But
> > it'd make it much harder. So there'll just be a bunch of values of IOContext
> > that'll only be used for one or a subset of the IOObjects.
> >
> >
> > The reason to put BackendType at the top is pragmatic - one backend is of a
> > single type, but can do IO for all kinds of objects/contexts. So any other
> > hierarchy would make the locking etc much harder.
> >
> >
> > > This is one possible natural sentence with these objects:
> > >
> > > During COPY, a client backend may read in data from a permanent
> > > relation.
> > > This order is:
> > > IOContext
> > >   BackendType
> > > IOOp
> > >   IOObject
> > >
> > > I think English sentences are often structured subject, verb, object --
> > > but in our case, we have an extra thing that doesn't fit neatly
> > > (IOContext).
> >
> > "..., to avoid polluting the buffer cache it uses the bulk (read|write)
> > strategy".
> >
> >
> > > Also, IOOp in a sentence would be in the middle (as the
> > > verb). I made it last because a) it feels like the smallest unit b) it
> > > would make the code a lot more annoying if it wasn't last.
> >
> > Yea, I think pragmatically that is the right choice.
>
> I have changed the order and updated all the places using
> PgStat_BktypeIO as well as in all locations in which it should be
> ordered for consistency (that I could find in the pass I did) -- e.g.
> the view definition, function signatures, comments, commit messages,
> etc.
>
> > > > > +-- Change the tablespace so that the table is rewritten directly, 
> > > > > then SELECT
> > > > > +-- from it to cause it to be read back into shared buffers.
> > > > > +SET allow_in_place_tablespaces = true;
> > > > > +CREATE TABLESPACE regress_io_stats_tblspc LOCATION '';
> > > >
> > > > Perhaps worth doing this in tablespace.sql, to avoid the additional
> > > > checkpoints done as part of CREATE/DROP TABLESPACE?
> > > >
> > > > Or, at least combine this with the CHECKPOINTs above?
> > >
> > > I see a checkpoint is requested when dropping the tablespace if not all
> > > the files in it are deleted. It seems like if the DROP TABLE for the
> > > permanent table is before the explicit checkpoints in the test, then the
> > > DROP TABLESPACE will not cause an additional checkpoint.
> >
> > Unfortunately, that's not how it works :(. See the comment above mdunlink():
> >
> > > * For regular relations, we don't unlink the first segment file of the 
> > > rel,
> > > * but just truncate it to zero length, and record a request to unlink it 
> > > after
> > > * the next checkpoint.  Additional segments can be unlinked immediately,
> > > * however.  Leaving the empty file in place prevents that relfilenumber
> > > * from being reused.  The scenario this protects us from is:
> > > ...
> >
> >
> > > Is this what you are suggesting? Dropping the temporary table should not
> > > have an effect on this.
> >
> > I was wondering about simply moving that portion of the test to
> > tablespace.sql, where we already created a tablespace.
> >
> >
> > An alternative would be to propose splitting tablespace.sql into one portion
> > running at the start of pa

Re: Adjust the description of OutputPluginCallbacks in pg-doc

2023-01-19 Thread Amit Kapila
On Thu, Jan 19, 2023 at 4:47 PM Amit Kapila  wrote:
>
> On Wed, Jan 11, 2023 at 4:20 PM wangw.f...@fujitsu.com
>  wrote:
> >
> > When I was reading the "Logical Decoding Output Plugins" chapter in pg-doc 
> > [1],
> > I think in the summary section, only the callback message_cb is not 
> > described
> > whether it is required or optional, and the description of callback
> > stream_prepare_cb seems inaccurate.
> >
> > And after the summary section, I think only the callback stream_xxx_cb 
> > section
> > and the callback truncate_cb section are not described this tag (are they
> > required or optional).
> >
> > I think we could improve these to be more reader friendly. So I tried to 
> > write
> > a patch for these and attach it.
> >
> > Any thoughts?
> >
>
> This looks mostly good to me. I have made minor adjustments in the
> attached. Do those make sense to you?
>

I forgot to mention that I intend to commit this only on HEAD as this
is a doc improvement patch.

-- 
With Regards,
Amit Kapila.




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2023-01-19 Thread vignesh C
On Fri, 6 Jan 2023 at 00:19, Reid Thompson
 wrote:
>
> On Tue, 2023-01-03 at 16:22 +0530, vignesh C wrote:
> > 
> > The patch does not apply on top of HEAD as in [1], please post a
> > rebased patch:
> > ...
> > Regards,
> > Vignesh
> >
>
> Attached is rebased patch, with some updates related to committed changes.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
48880840f18cb75fcaecc77b5e7816b92c27157b ===
=== applying patch
./0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch

patching file src/test/regress/expected/rules.out
Hunk #2 FAILED at 1875.
Hunk #4 FAILED at 2090.
2 out of 4 hunks FAILED -- saving rejects to file
src/test/regress/expected/rules.out.rej

[1] - http://cfbot.cputube.org/patch_41_3867.log

Regards,
Vignesh




Re: Operation log for major operations

2023-01-19 Thread vignesh C
On Sat, 14 Jan 2023 at 15:47, Dmitry Koval  wrote:
>
> Hi!
>
>  >The patch does not apply on top of HEAD ...
>
> Here is a fixed version.
> Small additional fixes:
> 1) added CRC calculation for empty 'pg_control_log' file;
> 2) added saving 'errno' before calling LWLockRelease and restoring after
> that;
3) corrected pg_upgrade for the case where the old cluster does not have
the 'pg_control_log' file.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
14bdb3f13de16523609d838b725540af5e23ddd3 ===
=== applying patch ./v3-0001-Operation-log.patch
...
patching file src/tools/msvc/Mkvcbuild.pm
Hunk #1 FAILED at 134.
1 out of 1 hunk FAILED -- saving rejects to file src/tools/msvc/Mkvcbuild.pm.rej

[1] - http://cfbot.cputube.org/patch_41_4018.log

Regards,
Vignesh




Re: [EXTERNAL] Re: [PATCH] Support using "all" for the db user in pg_ident.conf

2023-01-19 Thread Jelte Fennema
Looks good to me. One tiny typo in a comment that I noticed when going
over the diff:

+* Mark the token as quoted, so it will only be compared literally
+* and not for some special meaning, such as "all" or a group
+* membership checks.

should be either:
1. a group membership check
2. group membership checks

Now it's mixed singular and plural.




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2023-01-19 Thread vignesh C
On Wed, 3 Aug 2022 at 09:04, David Rowley  wrote:
>
> On Wed, 3 Aug 2022 at 07:04, Jacob Champion  wrote:
> > This entry has been waiting on author input for a while (our current
> > threshold is roughly two weeks), so I've marked it Returned with
> > Feedback.
>
> Thanks for taking care of this. You dealt with this correctly based on
> the fact that I'd failed to rebase before or during the entire July
> CF.
>
> I'm still interested in having the LockReleaseAll slowness fixed, so
> here's a rebased patch.

CFBot shows some compilation errors as in [1], please post an updated
version for the same:
[15:40:00.287] [1239/1809] Linking target src/backend/postgres
[15:40:00.287] FAILED: src/backend/postgres
[15:40:00.287] cc @src/backend/postgres.rsp
[15:40:00.287] /usr/bin/ld:
src/backend/postgres_lib.a.p/replication_logical_launcher.c.o: in
function `logicalrep_worker_onexit':
[15:40:00.287] 
/tmp/cirrus-ci-build/build/../src/backend/replication/logical/launcher.c:773:
undefined reference to `LockReleaseAll'
[15:40:00.287] collect2: error: ld returned 1 exit status

[1] - https://cirrus-ci.com/task/4562493863886848

Regards,
Vignesh




Re: Split index and table statistics into different types of stats

2023-01-19 Thread vignesh C
On Tue, 3 Jan 2023 at 19:49, Drouvot, Bertrand
 wrote:
>
> Hi,
>
> On 12/10/22 10:54 AM, Drouvot, Bertrand wrote:
> > Hi,
> >
> > On 12/7/22 11:11 AM, Drouvot, Bertrand wrote:
> >> Hi,
> >>
> >>> Hi,
> >>>
> >>> As [1] mentioned above has been committed (83a1a1b566), please find 
> >>> attached V5 related to this thread making use of the new macros (namely 
> >>> PG_STAT_GET_RELENTRY_INT64 and PG_STAT_GET_RELENTRY_TIMESTAMPTZ).
> >>>
> >>> I switched from using "CppConcat" to using "##", as it looks to me it's 
> >>> easier to read now that we are adding another concatenation to the game 
> >>> (due to the table/index split).
> >>>
> >>> The (Tab,tab) or (Ind,ind) passed as arguments to the macros look "weird" 
> >>> (I don't have a better idea yet): purpose is to follow the naming 
> >>> convention for PgStat_StatTabEntry/PgStat_StatIndEntry and 
> >>> pgstat_fetch_stat_tabentry/pgstat_fetch_stat_indentry, thoughts?
> >>>
> >>> Looking forward to your feedback,
> >>>
> >>
> >> Attaching V6 (mandatory rebase due to 8018ffbf58).
> >>
> >> While at it, I got rid of the weirdness mentioned above by creating 2 sets 
> >> of macros (one for the tables and one for the indexes).
> >>
> >> Looking forward to your feedback,
> >>
> >> Regards,
> >>
> >
> > Attaching V7, mandatory rebase due to 66dcb09246.
> >
>
> Attaching V8, mandatory rebase due to c8e1ba736b.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
d540a02a724b9643205abce8c5644a0f0908f6e3 ===
=== applying patch ./v8-0001-split_tables_indexes_stats.patch

patching file src/backend/utils/activity/pgstat_table.c (renamed from
src/backend/utils/activity/pgstat_relation.c)
Hunk #25 FAILED at 759.

1 out of 29 hunks FAILED -- saving rejects to file
src/backend/utils/activity/pgstat_table.c.rej

[1] - http://cfbot.cputube.org/patch_41_3984.log

Regards,
Vignesh




Re: Logical replication timeout problem

2023-01-19 Thread Amit Kapila
On Thu, Jan 19, 2023 at 4:13 PM Ashutosh Bapat
 wrote:
>
> On Wed, Jan 18, 2023 at 6:00 PM Amit Kapila  wrote:
>
> > + */
> > + ReorderBufferUpdateProgressCB update_progress;
> >
> > Are you suggesting changing the name of the above variable? If so, how
> > about apply_progress, progress, or updateprogress? If you don't like
> > any of these then feel free to suggest something else. If we change
> > the variable name then accordingly, we need to update
> > ReorderBufferUpdateProgressCB as well.
> >
>
> I would liked to have all the callback names renamed with prefix
> "rbcb_xxx" so that they have very less chances of conflicting with
> similar names in the code base. But it's probably late to do that :).
>
> How are update_txn_progress since the CB is supposed to be used only
> within a transaction? or update_progress_txn?
>

Personally, I would prefer 'apply_progress' as it would be similar to
a few other callbacks like apply_change, apply_truncate, or as is
proposed by patch update_progress again because it is similar to
existing callbacks like commit_prepared. If you and others don't like
any of those then we can go for 'update_progress_txn' as well. Anybody
else has an opinion on this?

-- 
With Regards,
Amit Kapila.




Re: Inconsistency in vacuum behavior

2023-01-19 Thread Nikita Malakhov
Hi!

For the test Alexander described at the beginning of the discussion, the
results are:
postgres@postgres=# set role regress_vacuum_conflict;
SET
Time: 0.324 ms
postgres@postgres=> vacuum vacuum_tab;
WARNING:  permission denied to vacuum "vacuum_tab", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_1", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_2", skipping it
VACUUM
Time: 1.782 ms
postgres@postgres=> vacuum full;
WARNING:  permission denied to vacuum "pg_statistic", skipping it
WARNING:  permission denied to vacuum "vacuum_tab", skipping it
...
postgres@postgres=> cluster vacuum_tab;
ERROR:  must be owner of table vacuum_tab
Time: 0.384 ms

For the standard role "Postgres", the behavior is the same as before the patch.

On Thu, Jan 19, 2023 at 10:37 AM Alexander Pyhalov 
wrote:

> Justin Pryzby писал 2023-01-19 04:49:
> > On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote:
> >> Hi,
> >>
> >> Currently there is no error in this case, so additional thrown error
> >> would
> >> require a new test.
> >> Besides, throwing an error here does not make sense - it is just a
> >> check
> >> for a vacuum
> >> permission, I think the right way is to just skip a relation that is
> >> not
> >> suitable for vacuum.
> >> Any thoughts or objections?
> >
> > Could you check if this is consistent between the behavior of VACUUM
> > FULL and CLUSTER ?  See also Nathan's patches.
>
> Hi.
>
> Cluster behaves in a different way - it errors out immediately if
> relation is not owned by user. For partitioned rel it would anyway raise
> error later.
> VACUUM and VACUUM FULL behave consistently after applying Nikita's patch
> (for partitioned and regular tables) - issue warning "skipping
> TABLE_NAME --- only table or database owner can vacuum it" and return
> success status.
>
> --
> Best regards,
> Alexander Pyhalov,
> Postgres Professional
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: add PROCESS_MAIN to VACUUM

2023-01-19 Thread vignesh C
On Sat, 7 Jan 2023 at 10:37, Nathan Bossart  wrote:
>
> rebased for cfbot

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
d540a02a724b9643205abce8c5644a0f0908f6e3 ===
=== applying patch ./v2-0001-add-PROCESS_MAIN-to-VACUUM.patch
patching file src/backend/commands/vacuum.c

Hunk #8 FAILED at 2097.
1 out of 8 hunks FAILED -- saving rejects to file
src/backend/commands/vacuum.c.rej

[1] - http://cfbot.cputube.org/patch_41_4088.log

Regards,
Vignesh




Re: TAP output format in pg_regress

2023-01-19 Thread Daniel Gustafsson
> On 19 Jan 2023, at 12:14, vignesh C  wrote:

> The patch does not apply on top of HEAD as in [1], please post a rebased 
> patch:

Sorry for the silence, and thanks for your previous rebase, $life has kept me
too busy lately.  I'll post a rebased version shortly.

--
Daniel Gustafsson





Re: ExecRTCheckPerms() and many prunable partitions (checkAsUser)

2023-01-19 Thread Alvaro Herrera
On 2023-Jan-19, Amit Langote wrote:

> It seems that, with the test as written, it's not the partitioned
> table referenced in the view's query that becomes a child of the UNION
> ALL parent subquery, but the subquery itself.  The bug being fixed in
> 0002 doesn't affect the planning of this query at all, because the
> child subquery is planned independently of the main query involving
> UNION ALL, since it cannot be pushed up into the latter.  We want the
> partitioned table referenced in the child subquery to become a child
> of the UNION ALL parent subquery for the bug to be relevant.
> 
> I tried rewriting the test so that the view's subquery does get pulled
> up and the partitioned table becomes a child of the UNION ALL
> subquery.  By attaching a debugger, I do see the bug affecting the
> planning of this query, but still not in a way that changes the plan.
> I will keep trying, but in the meantime I'm attaching 0001 to show the
> rewritten query and the plan.

Thanks for spending time tracking down a test case.  I'll try to have a
look later today.

> > As for 0001, it seems simpler to me to put the 'userid' variable in the
> > same scope as 'onerel', and then compute it just once and don't bother
> > with it at all afterwards, as in the attached.
> 
> That sounds better.  Attached as 0002.

Pushed this one, thank you.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences

2023-01-19 Thread Alvaro Herrera
On 2023-Jan-18, Karl O. Pinc wrote:

> Oops.  Already sent a revised patch that includes starting each
> module on a new page, for PDF output.  I'll wait to rip that
> out after review and start a new thread if necessary.

Here's my review in the form of a delta patch.


I didn't find that a thing called "ISN" actually exists.  Is there a
reference to that?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it." (ncm, http://lwn.net/Articles/174769/)
commit fcdf3ceb0579f4b308b39ee6a957fe47a66bde32
Author: Álvaro Herrera
AuthorDate: Thu Jan 19 13:33:12 2023 +0100
CommitDate: Thu Jan 19 13:33:12 2023 +0100

Alvaro review

diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 672ac2ed19..19f2b172cc 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -1,7 +1,7 @@
 
 
 
- bloom — bloom filter index access
+ bloom — bloom filter index access method
 
  
   bloom
diff --git a/doc/src/sgml/btree-gin.sgml b/doc/src/sgml/btree-gin.sgml
index 0eaea0dbcd..c9efe1cccf 100644
--- a/doc/src/sgml/btree-gin.sgml
+++ b/doc/src/sgml/btree-gin.sgml
@@ -1,15 +1,14 @@
 
 
 
- btree_gin —
-   sample GIN B-tree equivalent operator classes [trusted]
+ btree_gin — GIN B-tree equivalent operator classes [trusted]
 
  
   btree_gin
  
 
  
-  btree_gin provides sample GIN operator classes that
+  btree_gin provides GIN operator classes that
   implement B-tree equivalent behavior for the data types
   int2, int4, int8, float4,
   float8, timestamp with time zone,
diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml
index 810ca73f81..8816e06337 100644
--- a/doc/src/sgml/contrib.sgml
+++ b/doc/src/sgml/contrib.sgml
@@ -1,7 +1,7 @@
 
 
 
- Additional Supplied Modules and Extensions
+ Additional Supplied Extensions and Modules
 
  
   This appendix and the next one contain information on the
@@ -55,7 +55,8 @@
  
 
  
-  Many components supply new user-defined functions, operators, or types.
+  Many components supply new user-defined functions, operators, or types,
+  packaged as extensions.
   To make use of one of these extensions, after you have installed the code
   you need to register the new SQL objects in the database system.
   This is done by executing
@@ -66,7 +67,7 @@
 CREATE EXTENSION extension_name;
 
 
-  This command registers the new SQL objects in the current database only,
+  This command only registers the new SQL objects in the current database,
   so you need to run it in every database in which you want
   the extension's facilities to be available.  Alternatively, run it in
+   database template1 so that the extension will be copied into
diff --git a/doc/src/sgml/isn.sgml b/doc/src/sgml/isn.sgml
index 0568436a26..27abdc8c66 100644
--- a/doc/src/sgml/isn.sgml
+++ b/doc/src/sgml/isn.sgml
@@ -2,7 +2,7 @@
 
 
  isn —
-   data types for various ISN standards (UPC, books, etc.) [trusted]
+   data types for item numbers (ISBN, EAN, UPC, etc.) [trusted]
 
  
   isn
diff --git a/doc/src/sgml/lo.sgml b/doc/src/sgml/lo.sgml
index c51781b0bb..511e576cac 100644
--- a/doc/src/sgml/lo.sgml
+++ b/doc/src/sgml/lo.sgml
@@ -1,7 +1,7 @@
 
 
 
- lo — manage large objects (BLOBs) [trusted]
+ lo — manage large objects [trusted]
 
  
   lo


Re: document the need to analyze partitioned tables

2023-01-19 Thread Laurenz Albe
On Wed, 2023-01-18 at 16:23 -0500, Bruce Momjian wrote:
> Is it possible to document when partition table statistics helps?

I think it would be difficult to come up with an exhaustive list.

Yours,
Laurenz Albe




Re: minor bug

2023-01-19 Thread Laurenz Albe
On Wed, 2023-01-18 at 15:03 -0500, Tom Lane wrote:
> Laurenz Albe  writes:
> > On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
> > > I seem to recall that the original idea was to report the timestamp
> > > of the commit/abort record we are stopping at.  Maybe my memory is
> > > faulty, but I think that'd be significantly more useful than the
> > > current behavior, *especially* when the replay stopping point is
> > > defined by something other than time.
> > > (Also, the wording of the log message suggests that that's what
> > > the reported time is supposed to be.  I wonder if somebody messed
> > > this up somewhere along the way.)
> 
> > recoveryStopTime is set to recordXtime (the time of the xlog record)
> > a few lines above that patch, so this is useful information if it is
> > present.
> 
> Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
> Digging in the git history, I see that this did use to work as
> I remember: we always extracted the record time before printing it.
> That was accidentally broken in refactoring in c945af80c.  I think
> the correct fix is more like the attached.

Yes, you are right.  Your patch looks fine to me.

Yours,
Laurenz Albe




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread tushar

On 1/19/23 2:44 AM, Nathan Bossart wrote:

On Wed, Jan 18, 2023 at 02:51:38PM -0500, Robert Haas wrote:

Should (nfree < SuperuserReservedBackends) be using <=, or am I confused?

I believe < is correct.  At this point, the new backend will have already
claimed a proc struct, so if the number of remaining free slots equals the
number of reserved slots, it is okay.


What's the deal with removing "and no new replication connections will
be accepted" from the documentation? Is the existing documentation
just wrong? If so, should we fix that first? And maybe delete
"non-replication" from the error message that says "remaining
connection slots are reserved for non-replication superuser
connections"? It seems like right now the comments say that
replication connections are a completely separate pool of connections,
but the documentation and the error message make it sound otherwise.
If that's true, then one of them is wrong, and I think it's the
docs/error message. Or am I just misreading it?

I think you are right.  This seems to have been missed in ea92368.  I moved
this part to a new patch that should probably be back-patched to v12.

On that note, I wonder if it's worth changing the "sorry, too many clients
already" message to make it clear that max_connections has been reached.
IME some users are confused by this error, and I think it would be less
confusing if it pointed to the parameter that governs the number of
connection slots.  I'll create a new thread for this.

There is one typo: in the doc changes, "pg_use_reserved_backends" is
mentioned, but I think it is supposed to be "pg_use_reserved_connections"
under Table 22.1. Predefined Roles.

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-19 Thread Amit Kapila
On Thu, Jan 19, 2023 at 4:25 PM vignesh C  wrote:
>
> On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
>  wrote:
> >
> > Updated the comment and the function call.
> >
> > Kindly have a look at the updated patch v17.
>
> Thanks for the updated patch, few comments:
> 1) min_apply_delay was accepting values like '600 m s h', I was not
> sure if we should allow this:
> alter subscription sub1 set (min_apply_delay = ' 600 m s h');
>

I think here we should have specs similar to recovery_min_apply_delay.

>
> 2) How about adding current_txn_wait_time in
> pg_stat_subscription_stats, we can update the current_txn_wait_time
> periodically, this will help the user to check approximately how much
> time is left(min_apply_delay - stat value) before this transaction
> will be applied in the subscription. If you agree this can be 0002
> patch.
>

Do we have any similar stats for recovery_min_apply_delay? If not, I
suggest let's postpone this to see if users really need such a
parameter.

-- 
With Regards,
Amit Kapila.




Re: Support plpgsql multi-range in conditional control

2023-01-19 Thread Pavel Stehule
Hi


On Thu, 19 Jan 2023 at 10:23, 2903807...@qq.com <2903807...@qq.com>
wrote:

> Dear hackers, my good friend Hou Jiaxing and I have implemented a version
> of the code that supports multiple integer range conditions in the in
> condition control of the for loop statement in the plpgsql procedural
> language. A typical example is as follows:
>
> postgres=# do $$
> declare
> i int := 10;
> begin
> for i in 1..10 by 3, reverse i+10..i+1 by 3 loop
>raise info '%', i;
> end loop;
> end $$;
> INFO: 1
> INFO: 4
> INFO: 7
> INFO: 10
> INFO: 20
> INFO: 17
> INFO: 14
> INFO: 11
> do
> postgres=#
>
> Hope to get your feedback, thank you!
>

I don't like it. The original design of the Ada language is to be a safe and
simple language. The proposed design is a 100% inversion of that.

What use case is it supposed to support?

Regards

Pavel


>
> --
> 2903807...@qq.com
>


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-19 Thread Amit Kapila
On Thu, Jan 19, 2023 at 12:06 PM Takamichi Osumi (Fujitsu)
 wrote:
>
> Kindly have a look at the updated patch v17.
>

Can we try to optimize the test time for this test? On my machine, it
is the second most time-consuming test in src/test/subscription. It
seems you are waiting twice for apply_delay, and both waits are for
streaming cases that vary the number of changes. I think it should be
just once, and for the non-streaming case. It would be good to test the
streaming code path interaction, but I'm not sure it is important
enough to have two test cases for apply_delay.

One minor comment that I observed while going through the patch.
+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed.
+ */
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ opts->min_apply_delay > 0)

I think it would be good if you can specify the reason for not
allowing this combination in the comments.

-- 
With Regards,
Amit Kapila.




Re: [PATCH] Teach planner to further optimize sort in distinct

2023-01-19 Thread David Rowley
On Wed, 18 Jan 2023 at 08:27, Ankit Kumar Pandey  wrote:
> There is bit confusion in wording here:
>
> "returns a List of pathkeys
> which are in keys1 but not in keys2 and NIL if keys2 has a pathkey
> that does not exist as a pathkey in keys1."
>
> You mean extract common keys without ordering right?

I think you should write a function like:

bool pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
List **reorderedkeys, int *n_common)

which works very similarly to pathkeys_count_contained_in, but
populates *reorderedkeys so it contains all of the keys in keys1, but
put the matching ones in the same order as they are in keys2.  If you
find a keys2 that does not exist in keys1 then just add the additional
unmatched keys1 keys to *reorderedkeys.  Set *n_common to the number
of common keys excluding any that come after a key2 key that does not
exist as a key1 key.
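
As an illustration, such a helper might be sketched as follows (names and the
exact semantics, including the return-value convention, are assumptions based
on the description above; it relies on pathkeys being canonical so pointer
comparison suffices):

#include "postgres.h"
#include "optimizer/paths.h"

static bool
pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
                                      List **reorderedkeys, int *n_common)
{
    List       *reordered = NIL;
    int         ncommon = 0;
    bool        all_found = true;
    ListCell   *lc;

    /* Emit the keys1 members that keys2 mentions, in keys2's order. */
    foreach(lc, keys2)
    {
        PathKey    *pk = (PathKey *) lfirst(lc);

        if (list_member_ptr(keys1, pk))
        {
            reordered = lappend(reordered, pk);
            if (all_found)
                ncommon++;      /* stop counting after a missing keys2 key */
        }
        else
            all_found = false;
    }

    /* Append any keys1 members that keys2 did not mention. */
    foreach(lc, keys1)
    {
        PathKey    *pk = (PathKey *) lfirst(lc);

        if (!list_member_ptr(reordered, pk))
            reordered = lappend(reordered, pk);
    }

    *reorderedkeys = reordered;
    *n_common = ncommon;
    return all_found;
}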

You can just switch to using that function in
create_final_distinct_paths(). You'll need to consider if the query is
a DISTINCT ON query and not try the unordered version of the function
in that case.

I also just noticed that in build_index_paths() we'll leave the index
path's pathkeys empty if we deem the pathkeys as useless.  I'm not
sure what the repercussions would be of setting those to the return
value of build_index_pathkeys() when useful_pathkeys is otherwise
empty.  It's possible that truncate_useless_pathkeys() needs to be
modified to check if the pathkeys might be useful for DISTINCT, but
seeing that we don't populate the IndexPath's pathkeys when we deem
them not useful makes me wonder if this entire patch is a good idea.
When I thought about it, I assumed that we always set IndexPath's
pathkeys to whatever (if any) sort order the index provides.

David




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread tushar
On Thu, Jan 19, 2023 at 6:28 PM tushar 
wrote:

> On 1/19/23 2:44 AM, Nathan Bossart wrote:
> > On Wed, Jan 18, 2023 at 02:51:38PM -0500, Robert Haas wrote:
> >> Should (nfree < SuperuserReservedBackends) be using <=, or am I
> confused?
> > I believe < is correct.  At this point, the new backend will have already
> > claimed a proc struct, so if the number of remaining free slots equals
> the
> > number of reserved slots, it is okay.
> >
> >> What's the deal with removing "and no new replication connections will
> >> be accepted" from the documentation? Is the existing documentation
> >> just wrong? If so, should we fix that first? And maybe delete
> >> "non-replication" from the error message that says "remaining
> >> connection slots are reserved for non-replication superuser
> >> connections"? It seems like right now the comments say that
> >> replication connections are a completely separate pool of connections,
> >> but the documentation and the error message make it sound otherwise.
> >> If that's true, then one of them is wrong, and I think it's the
> >> docs/error message. Or am I just misreading it?
> > I think you are right.  This seems to have been missed in ea92368.  I
> moved
> > this part to a new patch that should probably be back-patched to v12.
> >
> > On that note, I wonder if it's worth changing the "sorry, too many
> clients
> > already" message to make it clear that max_connections has been reached.
> > IME some users are confused by this error, and I think it would be less
> > confusing if it pointed to the parameter that governs the number of
> > connection slots.  I'll create a new thread for this.
> >
> There is one typo: in the doc changes, "pg_use_reserved_backends" is
> mentioned, but I think it is supposed to be "pg_use_reserved_connections"
> under Table 22.1. Predefined Roles.
>
> and in the error message too

[edb@centos7tushar bin]$ ./psql postgres -U r2

psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed:
FATAL:  remaining connection slots are reserved for roles with privileges
of pg_use_reserved_backends
[edb@centos7tushar bin]$

regards,


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-19 Thread vignesh C
On Thu, 19 Jan 2023 at 18:29, Amit Kapila  wrote:
>
> On Thu, Jan 19, 2023 at 4:25 PM vignesh C  wrote:
> >
> > On Thu, 19 Jan 2023 at 12:06, Takamichi Osumi (Fujitsu)
> >  wrote:
> > >
> > > Updated the comment and the function call.
> > >
> > > Kindly have a look at the updated patch v17.
> >
> > Thanks for the updated patch, few comments:
> > 1) min_apply_delay was accepting values like '600 m s h', I was not
> > sure if we should allow this:
> > alter subscription sub1 set (min_apply_delay = ' 600 m s h');
> >
>
> I think here we should have specs similar to recovery_min_apply_delay.
>
> >
> > 2) How about adding current_txn_wait_time in
> > pg_stat_subscription_stats, we can update the current_txn_wait_time
> > periodically, this will help the user to check approximately how much
> > time is left(min_apply_delay - stat value) before this transaction
> > will be applied in the subscription. If you agree this can be 0002
> > patch.
> >
>
> Do we have any similar stats for recovery_min_apply_delay? If not, I
> suggest let's postpone this to see if users really need such a
> parameter.

I did not find any statistics for recovery_min_apply_delay; OK, it can
be deferred to a later time.

Regards,
Vignesh




Re: vac_update_datfrozenxid will raise "wrong tuple length" if pg_database tuple contains toast attribute.

2023-01-19 Thread Arthur Nascimento
Hi Michael,

I got a customer case that matches this issue. The customer is on an
old version of 12, but I see the patch only went into 14:

On Tue, 8 Dec 2020 at 00:32, Michael Paquier  wrote:
> > Yes the attached patch looks good to me for PostgreSQL. Thanks Michael.
> Okay, committed to HEAD then.

commit 947789f1f5fb61daf663f26325cbe7cad8197d58
Author: Michael Paquier 
Date:   Tue Dec 8 12:13:19 2020 +0900

   Avoid using tuple from syscache for update of pg_database.datfrozenxid

Would it be possible to backport the patch to 12 and 13?

-- 
Arthur Nascimento
EDB




Re: Unicode grapheme clusters

2023-01-19 Thread Pavel Stehule
On Thu, 19 Jan 2023 at 1:20, Bruce Momjian wrote:

> Just my luck, I had to dig into a two-"character" emoji that came to me
> as part of a Google Calendar entry --- here it is:
>
> 👩🏼‍⚕️🩺
>
> Unicode  UTF8          libc len
> U+1F469  f0 9f 91 a9   2        woman
> U+1F3FC  f0 9f 8f bc   2        emoji modifier fitzpatrick type-3 (skin tone)
> U+200D   e2 80 8d      0        zero width joiner (ZWJ)
> U+2695   e2 9a 95      1        staff with snake
> U+FE0F   ef b8 8f      0        variation selector-16 (VS16) (previous character as emoji)
> U+1FA7A  f0 9f a9 ba   2        stethoscope
>
> Now, in Debian 11 character apps like vi, I see:
>
>   a woman(2) - a black box(2) - a staff with snake(1) - a stethoscope(2)
>
> Display widths are in parentheses.  I also see '<200d>' in blue.
>
> In current Firefox, I see a woman with a stethoscope around her neck,
> and then a stethoscope.  Copying the Unicode string above into a browser
> URL bar should show you the same thing, though it might be too small to
> see.
>
> For those looking for details on how these should be handled, see this
> for an explanation of grapheme clusters that use things like skin tone
> modifiers and zero-width joiners:
>
> https://tonsky.me/blog/emoji/
>
> These comments explain the confusion of the term character:
>
>
> https://stackoverflow.com/questions/27331819/whats-the-difference-between-a-character-a-code-point-a-glyph-and-a-grapheme
>
> and I think this comment summarizes it well:
>
>
> https://github.com/kovidgoyal/kitty/issues/3998#issuecomment-914807237
>
> This is by design. wcwidth() is utterly broken. Any terminal or
> terminal
> application that uses it is also utterly broken. Forget about emoji
> wcwidth() doesn't even work with combining characters, zero width
> joiners, flags, and a whole bunch of other things.
>
> I decided to see how Postgres, without ICU, handles it:
>
> show lc_ctype;
>   lc_ctype
> -
>  en_US.UTF-8
>
> select octet_length('👩🏼‍⚕️🩺');
>  octet_length
> --
>21
>
> select character_length('👩🏼‍⚕️🩺');
>  character_length
> --
> 6
>
> The octet_length() is verified as correct by counting the UTF8 bytes
> above.  I think character_length() is correct if we consider the number
> of Unicode characters, display and non-display.
>
> I then started looking at how Postgres computes and uses _display_
> width.  The display width, when properly processed, as by Firefox, is 4
> (two double-wide displayed characters).  Based on the libc display
> lengths above and incorrect displayed character lengths in Debian 11, it
> would be 7.
>
> libpq has PQdsplen(), which calls pg_encoding_dsplen(), which then calls
> the per-encoding width function stored in pg_wchar_table.dsplen --- for
> UTF8, the function is pg_utf_dsplen().
>
> There is no SQL API for display length, but PQdsplen() can be
> exercised with a string by calling pg_wcswidth() in the gdb debugger:
>
> pg_wcswidth(const char *pwcs, size_t len, int encoding)
> UTF8 encoding == 6
>
> (gdb) print (int)pg_wcswidth("abcd", 4, 6)
> $8 = 4
> (gdb) print (int)pg_wcswidth("👩🏼‍⚕️🩺", 21, 6))
> $9 = 7
>
> Here is the psql output:
>
> SELECT octet_length('👩🏼‍⚕️🩺'), '👩🏼‍⚕️🩺',
> character_length('👩🏼‍⚕️🩺');
>  octet_length | ?column? | character_length
> --+--+--
>21 | 👩🏼‍⚕️🩺  |6
>
> More often called from psql are pg_wcssize() and pg_wcsformat(), which
> also call PQdsplen().
>
> I think the question is whether we want to report a string width that
> assumes the display doesn't understand the more complex UTF8
> controls/"characters" listed above.
>
> tsearch has p_isspecial(), which calls pg_dsplen(), which also uses
> pg_wchar_table.dsplen.  p_isspecial() also has a small table of what it
> calls "strange_letter".
>
> Here is a report about Unicode variation selector and combining
> characters from May, 2022:
>
>
> https://www.postgresql.org/message-id/flat/013f01d873bb%24ff5f64b0%24fe1e2e10%24%40ndensan.co.jp
>
> Is this something people want improved?
>

Surely it should be fixed. Unfortunately, none of the terminals that I can
use support it. So at this moment it may be premature to fix it, because
the visual form will still be broken.

Regards

Pavel


> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
> Embrace your flaws.  They make you human, rather than perfect,
> which you will never be.
>
>
>


Re: HOT chain validation in verify_heapam()

2023-01-19 Thread Aleksander Alekseev
Hi hackers,

> Fixed.

I noticed that this patch stuck a little and decided to take another look.

It seems to be well written, covered with tests and my understanding
is that all the previous feedback was accounted for. To your knowledge
is there anything that prevents us from moving it to "Ready for
Committer"?

-- 
Best regards,
Aleksander Alekseev




Re: Re: Support plpgsql multi-range in conditional control

2023-01-19 Thread 2903807...@qq.com
Hello, thank you very much for your reply. But I think you may have 
misunderstood what we have done. 

What we have done this time is allow multiple ranges (condition_iterator)
after IN. Previously, only a single interval [lower, upper] could be used
after IN, but in some scenarios we may need a list:
condition_iterator [, condition_iterator ...]

condition_iterator:
[ REVERSE ] expression .. expression [ BY expression ] 

Thanks again!


songjinzhou (2903807...@qq.com)
 
From: Pavel Stehule
Date: 2023-01-19 21:04
To: 2903807...@qq.com
CC: pgsql-hackers; 1276576182
Subject: Re: Support plpgsql multi-range in conditional control
Hi


On Thu, 19 Jan 2023 at 10:23, 2903807...@qq.com <2903807...@qq.com> wrote:
Dear hackers, my good friend Hou Jiaxing and I have implemented a version of 
the code that supports multiple integer range conditions in the in condition 
control of the for loop statement in the plpgsql procedural language. A typical 
example is as follows:

postgres=# do $$
declare
i int := 10;
begin
for i in 1..10 by 3, reverse i+10..i+1 by 3 loop
   raise info '%', i;
end loop;
end $$;
INFO: 1
INFO: 4
INFO: 7
INFO: 10
INFO: 20
INFO: 17
INFO: 14
INFO: 11
do
postgres=#

Hope to get your feedback, thank you!

I don't like it. The original design of the Ada language is to be a safe and
simple language. The proposed design is a 100% inversion of that.

What use case is it supposed to support?

Regards

Pavel
 



2903807...@qq.com


Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread tushar
On Thu, Jan 19, 2023 at 6:50 PM tushar 
wrote:

> and in the error message too
>
> [edb@centos7tushar bin]$ ./psql postgres -U r2
>
> psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed:
> FATAL:  remaining connection slots are reserved for roles with privileges
> of pg_use_reserved_backends
> [edb@centos7tushar bin]$
>


I think there is also a need to improve the error message when
non-superusers are not able to connect due to slot unavailability.
--Connect to psql terminal, create a user
create user t1;

--set these GUC parameters in postgresql.conf and restart the server

max_connections = 3 # (change requires restart)

superuser_reserved_connections = 1  # (change requires restart)

reserved_connections = 1

psql terminal (connect as superuser):  ./psql postgres
psql terminal (try to connect as user t1):  ./psql postgres -U t1
The error message is:

psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed:
FATAL:  remaining connection slots are reserved for roles with privileges
of pg_use_reserved_backends



that is not true because the superuser is still able to connect;

probably in this case the message should be something like this -

"remaining connection slots are reserved for roles with privileges of
pg_use_reserved_connections and for superusers" or something better.

regards,


Re: ANY_VALUE aggregate

2023-01-19 Thread Peter Eisentraut

On 18.01.23 18:01, Vik Fearing wrote:

On 1/18/23 16:06, Peter Eisentraut wrote:

On 05.12.22 21:18, Vik Fearing wrote:

On 12/5/22 15:57, Vik Fearing wrote:
The SQL:2023 Standard defines a new aggregate named ANY_VALUE.  It 
returns an implementation-dependent (i.e. non-deterministic) value 
from the rows in its group.


PFA an implementation of this aggregate.


Here is v2 of this patch.  I had forgotten to update sql_features.txt.


In your patch, the documentation says the definition is 
any_value("any") but the catalog definitions are 
any_value(anyelement).  Please sort that out.


Since the transition function is declared strict, null values don't 
need to be checked.
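
To illustrate the point (a minimal sketch, not necessarily the patch's exact
code): with a strict transition function and a NULL initial condition, the
executor seeds the state with the first non-NULL input, so the function only
has to keep returning the existing state.

#include "postgres.h"
#include "fmgr.h"

/*
 * Sketch of a strict any_value() transition function: argument 0 is the
 * existing state (already the first non-NULL input, seeded by the
 * executor), so simply return it unchanged.
 */
Datum
any_value_transfn(PG_FUNCTION_ARGS)
{
    PG_RETURN_DATUM(PG_GETARG_DATUM(0));
}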


Thank you for the review.  Attached is a new version rebased to d540a02a72.


This looks good to me now.




Re: CREATEROLE users vs. role properties

2023-01-19 Thread Robert Haas
On Thu, Jan 19, 2023 at 6:15 AM tushar  wrote:
> postgres=# create role fff with createrole;
> CREATE ROLE
> postgres=# create role xxx;
> CREATE ROLE
> postgres=# set role fff;
> SET
> postgres=> alter role xxx with createrole;
> ERROR:  permission denied
> postgres=>

Here fff would need ADMIN OPTION on xxx to be able to make modifications to it.

See 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cf5eb37c5ee0cc54c80d95c1695d7fca1f7c68cb

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: CREATEROLE users vs. role properties

2023-01-19 Thread Robert Haas
On Wed, Jan 18, 2023 at 6:17 PM Nathan Bossart  wrote:
> > Here is a patch implementing the above proposal. Since this is fairly
> > closely related to already-committed work, I would like to get this
> > into v16. That way, all the changes to how CREATEROLE works will go
> > into a single release, which seems less confusing for users. It is
> > also fairly clear to me that this is an improvement over the status
> > quo. Sometimes things that seem clear to me turn out to be false, so
> > if this change seems like a problem to you, please let me know.
>
> This seems like a clear improvement to me.

Cool.

> However, as the attribute
> system becomes more sophisticated, I think we ought to improve the error
> messages in user.c.  IMHO messages like "permission denied" could be
> greatly improved with some added context.
>
> This probably shouldn't block your patch, but I think it's worth doing in
> v16 since there are other changes in this area.  I'm happy to help.

That would be great. I agree that it's good to try to improve the
error messages. It hasn't been entirely clear to me how to do that.
For instance, I don't think we want to say something like:

ERROR: must have CREATEROLE privilege and ADMIN OPTION on the target
role, or in lieu of both of those to be superuser, to set the
CONNECTION LIMIT for another role
ERROR: must have CREATEROLE privilege and ADMIN OPTION on the target
role, plus also CREATEDB, or in lieu of all that to be superuser, to
remove the CREATEDB property from another role

Such messages are long and we'd end up with a lot of variants. It's
possible that the messages could be multi-tier. For instance, if we
determine that you're trying to manage users and you don't have
permission to manage ANY user, we could say:

ERROR: permission to manage roles denied
DETAIL: You must have the CREATEROLE privilege or be a superuser to
manage roles.

If you could potentially manage some user, but not the one you're
trying to manage, we could say:

ERROR: permission to manage role "%s" denied
DETAIL: You need ADMIN OPTION on the target role to manage it.

If you have permission to manage the target role but not in the
requested manner, we could then say something like:

ERROR: permission to manage CREATEDB for role "%s" denied
DETAIL: You need CREATEDB to manage it.

This is just one idea, and maybe not the best one. I'm just trying to
say that I think this is basically an organizational problem. We need
a plan for how we're going to report errors that is not too
complicated to implement with reasonable effort, and that will produce
messages that users will understand. I'd be delighted if you wanted to
provide either ideas or patches...
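
As a purely illustrative sketch of the tiered approach (the helper name,
the 'roleid'/'rolename' variables, and the message texts are assumptions
for discussion, not what the patch does; it is assumed to live in
commands/user.c, where have_createrole_privilege() is defined):

/* Sketch only: tiered error reporting for role management. */
static void
check_role_management_permission(Oid roleid, const char *rolename)
{
    if (!have_createrole_privilege())
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("permission to manage roles denied"),
                 errdetail("You must have the CREATEROLE privilege or be a superuser to manage roles.")));

    if (!is_admin_of_role(GetUserId(), roleid))
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("permission to manage role \"%s\" denied", rolename),
                 errdetail("You need ADMIN OPTION on the target role to manage it.")));
}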

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support plpgsql multi-range in conditional control

2023-01-19 Thread Tom Lane
Pavel Stehule  writes:
> On Thu, 19 Jan 2023 at 10:23, 2903807...@qq.com <2903807...@qq.com>
> wrote:
>> Dear hackers, my good friend Hou Jiaxing and I have implemented a version
>> of the code that supports multiple integer range conditions in the in
>> condition control of the for loop statement in the plpgsql procedural
>> language. A typical example is as follows:

> I don't like it. The original design of the Ada language is to be a safe and
> simple language. The proposed design is a 100% inversion of that.

Yeah, I'm pretty dubious about this too.  plpgsql's FOR-loop syntax is
already badly overloaded, to the point where it's hard to separate
the true intent of a statement.  We have very ad-hoc rules in there
like "if the first thing after IN is a var of type refcursor, then
it's FOR-IN-cursor, otherwise it couldn't possibly be that".  (So
much for functions returning refcursor, for example.)  Similarly the
"FOR x IN m..n" syntax has a shaky assumption that ".." couldn't
possibly appear in mainline SQL.  If you make any sort of syntax
error you're likely to get a very unintelligible complaint --- or
worse, it might take it and do something you did not expect.
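
For reference, here are the two forms those heuristics have to tell apart,
as a self-contained example:

do $$
declare
    c cursor for select g as n from generate_series(1, 3) g;  -- bound cursor
    r record;
begin
    -- "c" is a cursor variable, so this is parsed as FOR-IN-cursor.
    for r in c loop
        raise notice 'cursor row: %', r.n;
    end loop;

    -- The ".." makes this the integer range form of FOR.
    for i in 1 .. 3 loop
        raise notice 'range value: %', i;
    end loop;
end
$$;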

I fear that allowing more complexity in "FOR x IN m..n" will make
those problems even worse.  The proposed patch gives comma a special
status akin to ".."'s, but comma definitely *can* appear within SQL
expressions --- admittedly, it should only appear within parentheses,
but now you're reliant on the user keeping their parenthesization
straight in order to avoid going off into the weeds.  I think this
change increases the chances of confusion with FOR-IN-SELECT as well.

If there were a compelling use-case for what you suggest then
maybe it'd be worth accepting those risks.  But I share Pavel's
opinion that there's little use-case.  We've not heard a request
for such a feature before, AFAIR.

regards, tom lane




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread Robert Haas
On Thu, Jan 19, 2023 at 9:21 AM tushar  wrote:
> that is not true because the superuser is still able to connect,

It is true; the superuser can still connect, but only because superusers
have all privileges.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add LZ4 compression in pg_dump

2023-01-19 Thread Tomas Vondra
On 1/18/23 20:05, gkokola...@pm.me wrote:
> 
> 
> 
> 
> 
> --- Original Message ---
> On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra 
>  wrote:
> 
> 
>>
>>
>> Hi,
>>
>> On 1/16/23 16:14, gkokola...@pm.me wrote:
>>
>>> Hi,
>>>
>>> I admit I am completely at a loss as to what is expected from me anymore.
>>
> 
>>
>> Unfortunately, this plays against this patch - I'm certainly in favor of
>> adding lz4 (and other compression algos) into pg_dump, but if I commit
>> 0001 we get little benefit, and the other parts actually adding lz4/zstd
>> are treated as "WIP / for completeness" so it's unclear when we'd get to
>> commit them.
> 
> Thank you for your kindness and for taking the time to explain.
>  
>> So if I could recommend one thing, it'd be to get at least one of those
>> WIP patches into a shape that's likely committable right after 0001.
> 
> This was clearly my fault. I misunderstood a suggestion upthread to focus
> on the first patch of the series and ignore documentation and comments on
> the rest.
> 
> Please find v21 to contain 0002 and 0003 in a state which I no longer consider
> as WIP but worthy of proper consideration. Some guidance on where is best to 
> add
> documentation in 0002 for the function pointers in CompressFileHandle will
> be welcomed.
> 

This is internal-only API, not meant for use by regular users and/or
extension authors, so I don't think we need sgml docs. I'd just add
regular code-level documentation to compress_io.h.

For inspiration see docs for "struct ReorderBuffer" in reorderbuffer.h,
or "struct _archiveHandle" in pg_backup_archiver.h.

Or what other kind of documentation did you have in mind?

>>
>>> I had posted v19-0001 for a committer's consideration and v19-000{2,3} for 
>>> completeness.
>>> Please find a rebased v20 attached.
>>
>>
>> I took a quick look at 0001, so a couple comments (sorry if some of this
>> was already discussed in the thread):
> 
> Much appreciated!
> 
>>
>> 1) I don't think a "refactoring" patch should reference particular
>> compression algorithms (lz4/zstd), and in particular I don't think we
>> should have "not yet implemented" messages. We only have a couple other
>> places doing that, when we didn't have a better choice. But here we can
>> simply reject the algorithm when parsing the options, we don't need to
>> do that in a dozen other places.
> 
> I have now removed lz4/zstd from where they were present with the exception
> of pg_dump.c which is responsible for parsing.
> 

I'm not sure I understand why lz4/zstd are left in this place?

>> 2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
>> "none" at the end. It might make backpatches harder.
> 
> Agreed. However a 'default' is needed in order to avoid compilation warnings.
> Also note that 0002 completely does away with cases within WriteDataToArchive.
> 

OK, although that's also a consequence of using a "switch" instead of
plan "if" branches.

Furthermore, I'm not sure we really need the pg_fatal() about invalid
compression method in these default blocks. I mean, how could we even
get to these places when the build does not support the algorithm? All
of this (ReadDataFromArchive, WriteDataToArchive, EndCompressor, ...)
happens looong after the compressor was initialized and the method
checked, no? So maybe either this should simply do Assert(false) or use
a different error message.
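
I.e. the default branch could be reduced to something like this (sketch):

    default:
        /* unreachable: the compression method was validated at init time */
        Assert(false);
        break;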

>> 3) While building, I get bunch of warnings about missing cfdopen()
>> prototype and pg_backup_archiver.c not knowing about cfdopen() and
>> adding an implicit prototype (so I doubt it actually works).
> 
> Fixed. cfdopen() got prematurely introduced in 5e73a6048 and then got removed
> in 69fb29d1af. v20 failed to properly take 69fb29d1af in consideration. Note
> that cfdopen is removed in 0002 which explains why cfbot didn't complain.
>  

OK.

>> 4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
>> FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
>> better to have a "union" of correct types?
> 
> Please find an updated comment and a union in place of the void *. Also
> note that 0002 completely does away with cfp in favour of a new struct
> CompressFileHandle. I maintained the void * there because it is used by
> private methods of the compressors. 0003 contains such an example with
> LZ4CompressorState.
> 

I wonder if this (and also the previous item) makes sense to keep 0001
and 0002 or to combine them. The "intermediate" state is a bit annoying.
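
(To be explicit about what I meant by a union in item 4, I had something
along these lines in mind; the member names are only illustrative:)

typedef struct cfp
{
    pg_compress_specification compression_spec; /* says which member is valid */
    union
    {
        FILE       *fp;         /* used when no compression is selected */
#ifdef HAVE_LIBZ
        gzFile      gzfp;       /* used for gzip compression */
#endif
    }           u;
} cfp;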

>> 5) cfopen/cfdopen are missing comments. cfopen_internal has an updated
>> comment, but that's a static function while cfopen/cfdopen are the
>> actual API.
> 
> Added comments to cfopen/cfdopen.
> 

OK.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Non-superuser subscription owners

2023-01-19 Thread Robert Haas
On Wed, Jan 18, 2023 at 3:58 PM Mark Dilger
 wrote:
> > On Jan 18, 2023, at 12:51 PM, Robert Haas  wrote:
> >
> > Unless I'm missing something, it seems like this could be a quite small 
> > patch.
>
> I didn't like the idea of the create/alter subscription commands needing to 
> parse the connection string and think about what it might do, because at some 
> point in the future we might extend what things are allowed in that string, 
> and we have to keep everything that contemplates that string in sync.  I may 
> have been overly hesitant to tackle that problem.  Or maybe I just ran short 
> of round tuits.

I wouldn't be OK with writing our own connection string parser for
this purpose, but using PQconninfoParse seems OK. We still have to
embed knowledge of which connection string parameters can trigger
local file access, but that doesn't seem like a massive problem to me.
If we already had (or have) that logic someplace else, it would
probably make sense to reuse it, but if we don't, writing new logic
doesn't seem prohibitively scary. I'm not 100% confident of my ability
to get those rules right on the first try, but I feel like whatever
problems are there are just bugs that can be fixed with a few lines of
code changes. The basic idea that by looking at which connection
string properties are set we can tell what kinds of things the
connection string is going to do seems sound to me.

If there's some reason that it isn't, that would be good to discover
now rather than later.
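
To make that a bit more concrete, the sort of check I have in mind would be
roughly like the sketch below; the list of file-touching keywords is a guess
for illustration, not a vetted set:

/*
 * Sketch only: would live wherever CREATE/ALTER SUBSCRIPTION validates the
 * connection string.  Requires libpq-fe.h.
 */
static bool
conninfo_may_touch_local_files(const char *conninfo)
{
    /* keywords assumed (not verified) to imply local file access */
    static const char *const file_keywords[] = {
        "passfile", "sslcert", "sslkey", "sslrootcert", "sslcrl", "service"
    };
    PQconninfoOption *opts;
    char       *parse_err = NULL;
    bool        found = false;

    opts = PQconninfoParse(conninfo, &parse_err);
    if (opts == NULL)
    {
        PQfreemem(parse_err);
        return true;            /* unparsable: treat as unsafe */
    }

    for (PQconninfoOption *opt = opts; opt->keyword != NULL; opt++)
    {
        if (opt->val == NULL || opt->val[0] == '\0')
            continue;
        for (int i = 0; i < (int) lengthof(file_keywords); i++)
        {
            if (strcmp(opt->keyword, file_keywords[i]) == 0)
                found = true;
        }
    }

    PQconninfoFree(opts);
    return found;
}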

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Re: Support plpgsql multi-range in conditional control

2023-01-19 Thread Pavel Stehule
On Thu, 19 Jan 2023 at 15:20, 2903807...@qq.com <2903807...@qq.com>
wrote:

> Hello, thank you very much for your reply. But I think you may have
> misunderstood what we have done.
>
> What we do this time is that we can use multiple ranges
> (condition_iterator) after IN. Previously, we could only use a single
> interval [lower, upper] after IN, but in some scenarios we may need a list:
> *condition_iterator[, condition_iterator ...]*
>
> condition_iterator:
> [ REVERSE ] expression .. expression [ BY expression ]
>

then you can use a second, outer FOR over an array, or just a WHILE loop

Regards

Pavel


>
> Thanks again!
> --
> songjinzhou (2903807...@qq.com)
>
>
> *From:* Pavel Stehule 
> *Date:* 2023-01-19 21:04
> *To:* 2903807...@qq.com
> *CC:* pgsql-hackers ; 1276576182
> <1276576...@qq.com>
> *Subject:* Re: Support plpgsql multi-range in conditional control
> Hi
>
>
> On Thu, 19 Jan 2023 at 10:23, 2903807...@qq.com <2903807...@qq.com>
> wrote:
>
>> Dear hackers, my good friend Hou Jiaxing and I have implemented a version
>> of the code that supports multiple integer range conditions in the in
>> condition control of the for loop statement in the plpgsql procedural
>> language. A typical example is as follows:
>>
>> postgres=# do $$
>> declare
>> i int := 10;
>> begin
>> for i in 1..10 by 3, reverse i+10..i+1 by 3 loop
>>raise info '%', i;
>> end loop;
>> end $$;
>> INFO: 1
>> INFO: 4
>> INFO: 7
>> INFO: 10
>> INFO: 20
>> INFO: 17
>> INFO: 14
>> INFO: 11
>> do
>> postgres=#
>>
>> Hope to get your feedback, thank you!
>>
>
> I don't like it. The original design goal of the Ada language is to be a
> safe and simple language. The proposed design is a 100% inversion of that.
>
> What use case it should to support?
>
> Regards
>
> Pavel
>
>
>>
>> --
>> 2903807...@qq.com
>>
>


RE: Modify the document of Logical Replication configuration settings

2023-01-19 Thread Takamichi Osumi (Fujitsu)
Hi,

On Thursday, January 19, 2023 3:14 PM Michael Paquier  
wrote:
> On Wed, Jan 18, 2023 at 02:04:16PM +0530, Bharath Rupireddy wrote:
> > [1]
> > diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index
> > 89d53f2a64..6f9509267c 100644
> > --- a/doc/src/sgml/config.sgml
> > +++ b/doc/src/sgml/config.sgml
> > @@ -4326,7 +4326,8 @@ restore_command = 'copy
> > "C:\\server\\archivedir\\%f" "%p"'  # Windows
> > 
> >  Terminate replication connections that are inactive for longer
> >  than this amount of time. This is useful for
> > -the sending server to detect a standby crash or network outage.
> > +the sending server to detect a standby crash or logical replication
> > +subscriber crash or network outage.
> >  If this value is specified without units, it is taken as 
> > milliseconds.
> >  The default value is 60 seconds.
> >  A value of zero disables the timeout mechanism.
> 
> Perhaps we could do that, I am not sure whether this brings much in this
> section, though.
This might slightly improve the comprehensiveness of the docs.

If we want to do this, it might be better to
add this kind of addition to other parameters such as
wal_receiver_timeout, wal_retrieve_retry_interval
and wal_receiver_status_interval, too.

BTW, thank you for having taken care of my patch, Michael-san!


Best Regards,
Takamichi Osumi





Re: Rethinking the implementation of ts_headline()

2023-01-19 Thread Tom Lane
Alvaro Herrera  writes:
> On 2023-Jan-18, Tom Lane wrote:
>> It's including hits for "day" into the cover despite the lack of any
>> nearby match to "drink".

> I suppose it would be possible to put 'day' and 'drink' in two different
> fragments: since the query has a & operator for them, the words don't
> necessarily have to be nearby.  But OK, your argument for this being the
> shortest result is sensible.

> I do wonder, though, if it's effectively usable for somebody building a
> search interface on top.  If I'm ranking the results from several
> documents, and this document comes on top of others that only match one
> arm of the OR query, then I would like to be able to show the matches
> for both arms of the OR.

The fundamental problem with the case you're posing is that MaxWords
is too small to allow the 'day & drink' match to be shown as a whole.
If you make MaxWords large enough then you do find it including
(and highlighting) 'drink', but I'm not sure we should stress out
about what happens otherwise.

> Oh, I see the problem, and it is my misunderstanding: the <-> operator
> is counting the words in between, even if they are stop words.

Yeah.  AFAICS this is a very deliberate, longstanding decision,
so I'm hesitant to change it.  Your test case with 'simple'
proves little, because there are no stop words in 'simple':

regression=# select to_tsvector('simple', 'As idle as a painted Ship');
                  to_tsvector                  
-----------------------------------------------
 'a':4 'as':1,3 'idle':2 'painted':5 'ship':6
(1 row)

However, when I switch to 'english':

regression=# select to_tsvector('english', 'As idle as a painted Ship');
         to_tsvector         
-----------------------------
 'idl':2 'paint':5 'ship':6
(1 row)

the stop words are gone, but the recorded positions remain the same.
So this is really a matter of how to_tsvector chooses to count word
positions, it's not the fault of the <-> construct in particular.

I'm not convinced that this particular behavior is wrong, anyway.
The user of text search isn't supposed to have to think about
which words are stopwords or not, so I think that it's entirely
sensible for 'idle as a painted' to match 'idle <3> painted'.
Maybe the docs need some adjustment?  But in any case that's
material for a different thread.
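
To spell that out:

regression=# SELECT to_tsvector('english', 'As idle as a painted Ship')
regression-#        @@ to_tsquery('english', 'idle <3> painted');
 ?column? 
----------
 t
(1 row)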

> I again have to question how valuable in practice a <N> operator is
> when it's so strict that I have to know exactly how many stop words I want
> there to be in between the phrase search.  For some reason, in my mind I
> had it as "at most N words, ignoring stop words", but that's not what it
> is.

Yeah, I recall discussing "up to N words" semantics for this in the
past, but nobody has made that happen.

> Anyway, I don't think this needs to stop your current patch.

Many thanks for looking at it!

regards, tom lane




Re: Re: Support plpgsql multi-range in conditional control

2023-01-19 Thread Pavel Stehule
On Thu, 19 Jan 2023 at 16:54, Pavel Stehule
wrote:

>
>
> On Thu, 19 Jan 2023 at 15:20, 2903807...@qq.com <2903807...@qq.com>
> wrote:
>
>> Hello, thank you very much for your reply. But I think you may have
>> misunderstood what we have done.
>>
>> What we do this time is that we can use multiple ranges
>> (condition_iterator) after IN. Previously, we could only use a single
>> interval [lower, upper] after IN, but in some scenarios we may need a list:
>> *condition_iterator[, condition_iterator ...]*
>>
>> condition_iterator:
>> [ REVERSE ] expression .. expression [ BY expression ]
>>
>
> then you can use a second, outer FOR over an array, or just a WHILE loop
>

I wrote a simple example:

create type range_expr as (r int4range, s int);

do
$$
declare re range_expr;
begin
  foreach re in array ARRAY[('[10, 20]', 1), ('[100, 200]', 10)]
  loop
for i in lower(re.r) .. upper(re.r) by re.s
loop
  raise notice '%', i;
end loop;
  end loop;
end;
$$;

But I just don't know what is wrong with

begin
  for i in 10..20
  loop
raise notice '%', i;
  end loop;

  for i in 100 .. 200 by 10
  loop
raise notice '%', i;
  end loop;
end;

And if some loop bodies are longer, you should use a function or procedure.
Each different loop is written separately. PL/pgSQL (like PL/SQL or Ada) is a
verbose language. There is no goal of having short, dense code.

Regards

Pavel


Re: Remove source code display from \df+?

2023-01-19 Thread Justin Pryzby
On Wed, Jan 18, 2023 at 10:27:46AM -0500, Isaac Morland wrote:
> On Wed, 18 Jan 2023 at 00:00, Pavel Stehule  wrote:
> 
> > On Tue, 17 Jan 2023 at 20:29, Isaac Morland
> > wrote:
> >
> >> I welcome comments and feedback. Now to try to find something manageable
> >> to review.
> >
> > looks well
> >
> > you missed updating the psql documentation
> >
> > https://www.postgresql.org/docs/current/app-psql.html
> >
> > If the form \df+ is used, additional information about each function is
> > shown, including volatility, parallel safety, owner, security
> > classification, access privileges, language, source code and description.
> 
> Thanks, and sorry about that, it just completely slipped my mind. I’ve
> attached a revised patch with a slight update of the psql documentation.
> 
> you should assign your patch to the commitfest app
> >
> > https://commitfest.postgresql.org/
> 
> I thought I had: https://commitfest.postgresql.org/42/4133/

This is failing tests:
http://cfbot.cputube.org/isaac-morland.html

It seems like any "make check" would fail .. but did you also try
cirrusci from your own github account?
./src/tools/ci/README

BTW, I think "ELSE NULL" could be omitted.

-- 
Justin




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread Robert Haas
On Wed, Jan 18, 2023 at 4:14 PM Nathan Bossart  wrote:
> On Wed, Jan 18, 2023 at 02:51:38PM -0500, Robert Haas wrote:
> > Should (nfree < SuperuserReservedBackends) be using <=, or am I confused?
>
> I believe < is correct.  At this point, the new backend will have already
> claimed a proc struct, so if the number of remaining free slots equals the
> number of reserved slots, it is okay.

OK. Might be worth a short comment.
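
Something along these lines, perhaps (the wording and names below are only
illustrative, not a concrete proposal):

    /*
     * This backend has already claimed its PGPROC, so it is fine for
     * exactly SuperuserReservedConnections slots to remain free; hence
     * "<" rather than "<=".
     */
    if (nfree < SuperuserReservedConnections)
        ereport(FATAL,
                (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
                 errmsg("remaining connection slots are reserved for superusers")));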

> > What's the deal with removing "and no new replication connections will
> > be accepted" from the documentation? Is the existing documentation
> > just wrong? If so, should we fix that first? And maybe delete
> > "non-replication" from the error message that says "remaining
> > connection slots are reserved for non-replication superuser
> > connections"? It seems like right now the comments say that
> > replication connections are a completely separate pool of connections,
> > but the documentation and the error message make it sound otherwise.
> > If that's true, then one of them is wrong, and I think it's the
> > docs/error message. Or am I just misreading it?
>
> I think you are right.  This seems to have been missed in ea92368.  I moved
> this part to a new patch that should probably be back-patched to v12.

I'm inclined to commit it to master and not back-patch. It doesn't
seem important enough to perturb translations.

Tushar seems to have a point about pg_use_reserved_connections vs.
pg_use_reserved_backends. I think we should standardize on the former,
as backends is an internal term.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add LZ4 compression in pg_dump

2023-01-19 Thread gkokolatos





--- Original Message ---
On Thursday, January 19th, 2023 at 4:45 PM, Tomas Vondra 
 wrote:


> 
> 
> On 1/18/23 20:05, gkokola...@pm.me wrote:
> 
> > --- Original Message ---
> > On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra 
> > tomas.von...@enterprisedb.com wrote:
> > 
> > > Hi,
> > > 
> > > On 1/16/23 16:14, gkokola...@pm.me wrote:
> > > 
> > > > Hi,
> > > > 
> > > > I admit I am completely at a loss as to what is expected from me anymore.
> > 
> > 
> > 
> > > Unfortunately, this plays against this patch - I'm certainly in favor of
> > > adding lz4 (and other compression algos) into pg_dump, but if I commit
> > > 0001 we get little benefit, and the other parts actually adding lz4/zstd
> > > are treated as "WIP / for completeness" so it's unclear when we'd get to
> > > commit them.
> > 
> > Thank you for your kindness and for taking the time to explain.
> > 
> > > So if I could recommend one thing, it'd be to get at least one of those
> > > WIP patches into a shape that's likely committable right after 0001.
> > 
> > This was clearly my fault. I misunderstood a suggestion upthread to focus
> > on the first patch of the series and ignore documentation and comments on
> > the rest.
> > 
> > Please find v21 to contain 0002 and 0003 in a state which I no longer 
> > consider
> > as WIP but worthy of proper consideration. Some guidance on where is best 
> > to add
> > documentation in 0002 for the function pointers in CompressFileHandle will
> > be welcomed.
> 
> 
> This is internal-only API, not meant for use by regular users and/or
> extension authors, so I don't think we need sgml docs. I'd just add
> regular code-level documentation to compress_io.h.
> 
> For inspiration see docs for "struct ReorderBuffer" in reorderbuffer.h,
> or "struct _archiveHandle" in pg_backup_archiver.h.
> 
> Or what other kind of documentation did you have in mind?

This is exactly what I was after. I was between compress_io.c and compress_io.h.
Thank you.

> > > > I had posted v19-0001 for a committer's consideration and v19-000{2,3} 
> > > > for completeness.
> > > > Please find a rebased v20 attached.
> > > 
> > > I took a quick look at 0001, so a couple comments (sorry if some of this
> > > was already discussed in the thread):
> > 
> > Much appreciated!
> > 
> > > 1) I don't think a "refactoring" patch should reference particular
> > > compression algorithms (lz4/zstd), and in particular I don't think we
> > > should have "not yet implemented" messages. We only have a couple other
> > > places doing that, when we didn't have a better choice. But here we can
> > > simply reject the algorithm when parsing the options, we don't need to
> > > do that in a dozen other places.
> > 
> > I have now removed lz4/zstd from where they were present with the exception
> > of pg_dump.c which is responsible for parsing.
> 
> 
> I'm not sure I understand why lz4/zstd are left in this place?

You are right, it is not obvious. Those were added in 5e73a60488, which is
already committed in master, and I didn't want to backtrack. Of course, I am
not opposed to doing so if you wish.

> 
> > > 2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
> > > "none" at the end. It might make backpatches harder.
> > 
> > Agreed. However a 'default' is needed in order to avoid compilation 
> > warnings.
> > Also note that 0002 completely does away with cases within 
> > WriteDataToArchive.
> 
> 
> OK, although that's also a consequence of using a "switch" instead of
> plan "if" branches.
> 
> Furthermore, I'm not sure we really need the pg_fatal() about invalid
> compression method in these default blocks. I mean, how could we even
> get to these places when the build does not support the algorithm? All
> of this (ReadDataFromArchive, WriteDataToArchive, EndCompressor, ...)
> happens looong after the compressor was initialized and the method
> checked, no? So maybe either this should simply do Assert(false) or use
> a different error message.

I like Assert(false).

> > > 3) While building, I get bunch of warnings about missing cfdopen()
> > > prototype and pg_backup_archiver.c not knowing about cfdopen() and
> > > adding an implicit prototype (so I doubt it actually works).
> > 
> > Fixed. cfdopen() got prematurely introduced in 5e73a6048 and then got 
> > removed
> > in 69fb29d1af. v20 failed to properly take 69fb29d1af in consideration. Note
> > that cfdopen is removed in 0002 which explains why cfbot didn't complain.
> 
> 
> OK.
> 
> > > 4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
> > > FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
> > > better to have a "union" of correct types?
> > 
> > Please find an updated comment and a union in place of the void *. Also
> > note that 0002 completely does away with cfp in favour of a new struct
> > CompressFileHandle. I maintained the void * there because it is used by
> > private methods of the compressors. 0003 conta

Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences

2023-01-19 Thread Karl O. Pinc
On Thu, 19 Jan 2023 13:35:17 +0100
Alvaro Herrera  wrote:

> On 2023-Jan-18, Karl O. Pinc wrote:
> 
> > Oops.  Already sent a revised patch that includes starting each
> > module on a new page, for PDF output.  I'll wait to rip that
> > out after review and start a new thread if necessary.

(I have not removed the PDF page breaks from the latest patch.)

> Here's my review in the form of a delta patch.

Love it.

> I didn't find that a thing called "ISN" actually exists.  Is there a
> reference to that?

Maybe.  I came across it somewhere and it seemed useful.  It's
an initialism for International Standard Number.
https://en.wikipedia.org/wiki/International_Standard_Number
It's the same ISN as in the file name, "isn.sgml".

I've frobbed the ISN related text in my response patch.
(And added a line break to btree-gin.)

Attached are 2 patches, a regular and a delta from your v4 review:

contrib_v5-delta.patch.txt
contrib_v5.patch.txt

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein
diff --git a/doc/src/sgml/btree-gin.sgml b/doc/src/sgml/btree-gin.sgml
index c9efe1cccf..f0cda0e32a 100644
--- a/doc/src/sgml/btree-gin.sgml
+++ b/doc/src/sgml/btree-gin.sgml
@@ -1,7 +1,8 @@
 
 
 
- btree_gin — GIN B-tree equivalent operator classes 
[trusted]
+ btree_gin —
+   GIN B-tree equivalent operator classes [trusted]
 
  
   btree_gin
diff --git a/doc/src/sgml/isn.sgml b/doc/src/sgml/isn.sgml
index 27abdc8c66..8008c84fe4 100644
--- a/doc/src/sgml/isn.sgml
+++ b/doc/src/sgml/isn.sgml
@@ -1,8 +1,8 @@
 
 
 
- isn —
-   data types for item numbers (ISBN, EAN, UPC, etc.) [trusted]
+ isn — data types for international standard numbers
+   (ISBN, EAN, UPC, etc.) [trusted]
 
  
   isn
diff --git a/doc/src/sgml/adminpack.sgml b/doc/src/sgml/adminpack.sgml
index 184e96d7a0..04f3b52379 100644
--- a/doc/src/sgml/adminpack.sgml
+++ b/doc/src/sgml/adminpack.sgml
@@ -1,7 +1,7 @@
 
 
 
- adminpack
+ adminpack — pgAdmin support toolpack
 
  
   adminpack
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 923cbde9dd..4006c75cdf 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -1,7 +1,7 @@
 
 
 
- amcheck
+ amcheck — tools to verify index consistency
 
  
   amcheck
diff --git a/doc/src/sgml/auth-delay.sgml b/doc/src/sgml/auth-delay.sgml
index 40629311b1..0571f2a99d 100644
--- a/doc/src/sgml/auth-delay.sgml
+++ b/doc/src/sgml/auth-delay.sgml
@@ -1,7 +1,7 @@
 
 
 
- auth_delay
+ auth_delay — pause on authentication failure
 
  
   auth_delay
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index bb7342b120..0c4656ee30 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -1,7 +1,7 @@
 
 
 
- auto_explain
+ auto_explain — log execution plans of slow queries
 
  
   auto_explain
diff --git a/doc/src/sgml/basebackup-to-shell.sgml 
b/doc/src/sgml/basebackup-to-shell.sgml
index 491368eb8f..b6a3b39541 100644
--- a/doc/src/sgml/basebackup-to-shell.sgml
+++ b/doc/src/sgml/basebackup-to-shell.sgml
@@ -1,7 +1,7 @@
 
 
 
- basebackup_to_shell
+ basebackup_to_shell — example "shell" pg_basebackup 
module
 
  
   basebackup_to_shell
diff --git a/doc/src/sgml/basic-archive.sgml b/doc/src/sgml/basic-archive.sgml
index 60f23d2855..b4d43ced20 100644
--- a/doc/src/sgml/basic-archive.sgml
+++ b/doc/src/sgml/basic-archive.sgml
@@ -1,7 +1,7 @@
 
 
 
- basic_archive
+ basic_archive — an example WAL archive module
 
  
   basic_archive
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 98d0316175..19f2b172cc 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -1,7 +1,7 @@
 
 
 
- bloom
+ bloom — bloom filter index access method
 
  
   bloom
diff --git a/doc/src/sgml/btree-gin.sgml b/doc/src/sgml/btree-gin.sgml
index 870c25559e..f0cda0e32a 100644
--- a/doc/src/sgml/btree-gin.sgml
+++ b/doc/src/sgml/btree-gin.sgml
@@ -1,14 +1,15 @@
 
 
 
- btree_gin
+ btree_gin —
+   GIN B-tree equivalent operator classes [trusted]
 
  
   btree_gin
  
 
  
-  btree_gin provides sample GIN operator classes that
+  btree_gin provides GIN operator classes that
   implement B-tree equivalent behavior for the data types
   int2, int4, int8, float4,
   float8, timestamp with time zone,
diff --git a/doc/src/sgml/btree-gist.sgml b/doc/src/sgml/btree-gist.sgml
index 92aa8e277e..c08d1fdfc3 100644
--- a/doc/src/sgml/btree-gist.sgml
+++ b/doc/src/sgml/btree-gist.sgml
@@ -1,7 +1,8 @@
 
 
 
- btree_gist
+ btree_gist —
+   B-tree equivalent GiST index operators [trusted]
 
  
   btree_gist
diff --git a/doc/src/sgml/citext.sgml b/doc/src/sgml/citext.sgml
index 3df2825592..710fbdddf2 100644
--- a/doc/src/sgml/citext.sgml
+++ b/doc/src/sgml/citext.sgml
@@ -1,7 +1,8 @@
 
 
 
- citext
+ citext —
+   a case-insensitive character string type [trusted]
 
  
   citext
diff --git a/doc/src/sgml/contrib-spi.sgml b/doc/src/sgml/contrib-spi.sgml
index 77328ae6e8..e7cae4e38d 100644
--- a/doc/src/sgml/contrib-s

Re: Experiments with Postgres and SSL

2023-01-19 Thread Greg Stark
On Thu, 19 Jan 2023 at 00:45, Andrey Borodin  wrote:

> But..do we have to treat any unknown start sequence of bytes as a TLS
> connection? Or is there some definite subset of possible first bytes
> that clearly indicates that this is a TLS connection or not?

Absolutely not, there's only one MessageType that can initiate a
connection, ClientHello, so the initial byte has to be a specific
value. (0x16)

And to implement HTTP/WebSocket it would probably only peek
at the first byte and check for things like G(ET) and H(EAD) and so
on, possibly only over SSL, but in theory it could be over any
connection if the request comes before the startup packet.
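
Very roughly, the dispatch could peek without consuming anything, something
like this (sketch only, not from any patch):

    char        firstbyte;
    ssize_t     n = recv(sock, &firstbyte, 1, MSG_PEEK);

    if (n == 1)
    {
        if (firstbyte == 0x16)
        {
            /* TLS handshake record: hand the socket over to the SSL code */
        }
        else if (firstbyte == 'G' || firstbyte == 'H')
        {
            /* looks like HTTP (GET/HEAD): not a normal startup packet */
        }
        else
        {
            /* assume a regular frontend startup packet */
        }
    }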

Personally I'm motivated by wanting to implement status and monitoring
data for things like Prometheus and the like. For that it would just
be simple GET requests to recognize. Tunneling the pg wire protocol
over websockets sounds cool too, but it's not really something I know a lot
about. I note that Neon is doing something similar with a proxy:
https://neon.tech/blog/serverless-driver-for-postgres


--
greg




Re: vac_update_datfrozenxid will raise "wrong tuple length" if pg_database tuple contains toast attribute.

2023-01-19 Thread Nathan Bossart
On Thu, Jan 19, 2023 at 10:42:47AM -0300, Arthur Nascimento wrote:
> Would it be possible to backport the patch to 12 and 13?

This was recently back-patched [0] [1] [2].

[0] https://postgr.es/m/y70xnvbuwqsr2...@paquier.xyz
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c0ee694
[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=72b6098

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: vac_update_datfrozenxid will raise "wrong tuple length" if pg_database tuple contains toast attribute.

2023-01-19 Thread Arthur Nascimento
On Thu, 19 Jan 2023 at 14:10, Nathan Bossart  wrote:
> This was recently back-patched [0] [1] [2].

Oh, I see that now. Thanks! Sorry for the noise.

--
Arthur Nascimento
EDB




Re: minor bug

2023-01-19 Thread Torsten Förtsch
If we never expect getRecordTimestamp to fail, then why put it in the
if-condition?

getRecordTimestamp can fail if the record is neither a restore point nor a
commit or abort record. A few lines earlier in the same function there is
this:

 /* Otherwise we only consider stopping before COMMIT or ABORT records. */
if (XLogRecGetRmid(record) != RM_XACT_ID)
return false;

I guess that makes sure getRecordTimestamp can never fail.

The way it is written in your patch invites it to be optimized out again.
The only thing that prevents it is the comment.

Why not

(void)getRecordTimestamp(record, &recordXtime);
if (recoveryTarget == RECOVERY_TARGET_TIME)
...




On Wed, Jan 18, 2023 at 9:03 PM Tom Lane  wrote:

> Laurenz Albe  writes:
> > On Tue, 2023-01-17 at 10:32 -0500, Tom Lane wrote:
> >> I seem to recall that the original idea was to report the timestamp
> >> of the commit/abort record we are stopping at.  Maybe my memory is
> >> faulty, but I think that'd be significantly more useful than the
> >> current behavior, *especially* when the replay stopping point is
> >> defined by something other than time.
> >> (Also, the wording of the log message suggests that that's what
> >> the reported time is supposed to be.  I wonder if somebody messed
> >> this up somewhere along the way.)
>
> > recoveryStopTime is set to recordXtime (the time of the xlog record)
> > a few lines above that patch, so this is useful information if it is
> > present.
>
> Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
> Digging in the git history, I see that this did use to work as
> I remember: we always extracted the record time before printing it.
> That was accidentally broken in refactoring in c945af80c.  I think
> the correct fix is more like the attached.
>
> regards, tom lane
>
>


Re: minor bug

2023-01-19 Thread Tom Lane
Laurenz Albe  writes:
> On Wed, 2023-01-18 at 15:03 -0500, Tom Lane wrote:
>> Ah, but that only happens if recoveryTarget == RECOVERY_TARGET_TIME.
>> Digging in the git history, I see that this did use to work as
>> I remember: we always extracted the record time before printing it.
>> That was accidentally broken in refactoring in c945af80c.  I think
>> the correct fix is more like the attached.

> Yes, you are right.  Your patch looks fine to me.

Pushed then.  Thanks for the report!

regards, tom lane




Re: minor bug

2023-01-19 Thread Tom Lane
=?UTF-8?Q?Torsten_F=C3=B6rtsch?=  writes:
> Why not

> (void)getRecordTimestamp(record, &recordXtime);
> if (recoveryTarget == RECOVERY_TARGET_TIME)
> ...

Could've done it like that, but I already pushed the other
version, and I don't think it's worth the trouble to change.

regards, tom lane




Re: Add semi-join pushdown to postgres_fdw

2023-01-19 Thread Tomas Vondra
Hi.

I took a quick look at the patch. It needs a rebase, although it applies
fine using patch.

A couple minor comments:

1) addl_conds seems a bit hard to understand, I'd use either the full
wording (additional_conds) or maybe extra_conds

2) some of the lines got quite long, and need a wrap

3) unknown_subquery_rels name is a bit misleading - AFAIK it's the rels
that can't be referenced from upper rels (per what the .h says). So they
are known, but hidden. Is there a better name?

4) joinrel_target_ok() needs a better comment, explaining *when* the
reltarget is safe for pushdown. The conditions are all on one line, but
the project style is to break after '&&'.

Also, I'd write

    if (!IsA(var, Var))
        continue;

which saves one level of nesting. IMHO that makes it more readable.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread Nathan Bossart
On Thu, Jan 19, 2023 at 11:40:53AM -0500, Robert Haas wrote:
> On Wed, Jan 18, 2023 at 4:14 PM Nathan Bossart  
> wrote:
>> On Wed, Jan 18, 2023 at 02:51:38PM -0500, Robert Haas wrote:
>> > Should (nfree < SuperuserReservedBackends) be using <=, or am I confused?
>>
>> I believe < is correct.  At this point, the new backend will have already
>> claimed a proc struct, so if the number of remaining free slots equals the
>> number of reserved slots, it is okay.
> 
> OK. Might be worth a short comment.

I added one.

>> > What's the deal with removing "and no new replication connections will
>> > be accepted" from the documentation? Is the existing documentation
>> > just wrong? If so, should we fix that first? And maybe delete
>> > "non-replication" from the error message that says "remaining
>> > connection slots are reserved for non-replication superuser
>> > connections"? It seems like right now the comments say that
>> > replication connections are a completely separate pool of connections,
>> > but the documentation and the error message make it sound otherwise.
>> > If that's true, then one of them is wrong, and I think it's the
>> > docs/error message. Or am I just misreading it?
>>
>> I think you are right.  This seems to have been missed in ea92368.  I moved
>> this part to a new patch that should probably be back-patched to v12.
> 
> I'm inclined to commit it to master and not back-patch. It doesn't
> seem important enough to perturb translations.

That seems reasonable to me.

> Tushar seems to have a point about pg_use_reserved_connections vs.
> pg_use_reserved_backends. I think we should standardize on the former,
> as backends is an internal term.

Oops.  This is what I meant to do.  I probably flubbed it because I was
wondering why the parameter uses "connections" and the variable uses
"backends," especially considering that the variable for max_connections is
called MaxConnections.  I went ahead and renamed everything to use
"connections."

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 7be7e70aaf488a924d61b21b351a3b4f7e33aedc Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 18 Jan 2023 12:43:41 -0800
Subject: [PATCH v4 1/3] Code review for ea92368.

This commit missed an error message and a line in the docs.
---
 doc/src/sgml/config.sgml  | 3 +--
 src/backend/utils/init/postinit.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 89d53f2a64..e019a1aac9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -725,8 +725,7 @@ include_dir 'conf.d'
 number of active concurrent connections is at least
 max_connections minus
 superuser_reserved_connections, new
-connections will be accepted only for superusers, and no
-new replication connections will be accepted.
+connections will be accepted only for superusers.

 

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ae5a85ed65..9145d96b38 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -931,7 +931,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 		!HaveNFreeProcs(ReservedBackends))
 		ereport(FATAL,
 (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
- errmsg("remaining connection slots are reserved for non-replication superuser connections")));
+ errmsg("remaining connection slots are reserved for superusers")));
 
 	/* Check replication permissions needed for walsender processes. */
 	if (am_walsender)
-- 
2.25.1

>From 058df2b3dcf50ecbe76c794f4f52751e6a9f765f Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 17 Jan 2023 13:58:56 -0800
Subject: [PATCH v4 2/3] Rename ReservedBackends to
 SuperuserReservedConnections.

This is in preparation for adding a new reserved_connections GUC.
---
 src/backend/postmaster/postmaster.c | 20 ++--
 src/backend/utils/init/postinit.c   |  4 ++--
 src/backend/utils/misc/guc_tables.c |  2 +-
 src/include/postmaster/postmaster.h |  2 +-
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 9cedc1b9f0..3f799c4ac8 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -204,15 +204,15 @@ char	   *Unix_socket_directories;
 char	   *ListenAddresses;
 
 /*
- * ReservedBackends is the number of backends reserved for superuser use.
- * This number is taken out of the pool size given by MaxConnections so
- * number of backend slots available to non-superusers is
- * (MaxConnections - ReservedBackends).  Note what this really means is
- * "if there are <= ReservedBackends connections available, only superusers
- * can make new connections" --- pre-existing superuser connections don't
- * count against the limit.
+ * SuperuserReservedConnections is the number of backends reserved for
+ * superuse

Re: Support logical replication of DDLs

2023-01-19 Thread Zheng Li
On Thu, Jan 19, 2023 at 2:05 AM Amit Kapila  wrote:
>
> On Thu, Jan 19, 2023 at 8:39 AM Zheng Li  wrote:
> >
> > On Wed, Jan 18, 2023 at 6:27 AM Amit Kapila  wrote:
> > >
> > > On Sat, Jan 7, 2023 at 8:58 PM Zheng Li  wrote:
> > > >
> >
> > Foreign Tables can also be considered replicated with DDL replication 
> > because we
> > don't even need to replicate the data as it resides on the external
> > server. Users
> > need to configure the external server to allow connection from the
> > subscriber for
> > foreign tables to work on the subscriber.
> >
>
> So, this would mean that we expect the subscriber will also have the
> same foreign server as the publisher because we will replicate the
> entire connection/user information of the foreign server for the
> publisher.

Yes, CREATE/ALTER SERVER commands are also supported by the current
DDL replication patch.

>But what about data inserted by the publisher on the
> foreign server?

I thought the data inserted to a foreign table will always be stored
on the foreign server unless I'm mistaken?

> > > We should also think
> > > about initial sync for all those objects as well.
> >
> > Agree, we're starting an investigation on initial sync. But I think
> > initial sync depends on
> > DDL replication to work reliably, not the other way around. DDL replication 
> > can
> > work on its own without the initial sync of schema, users just need to
> > setup the initial
> > schema just like they would today.
> >
>
> The difference is that today users need to take care of all schema
> setup on both nodes and follow schema changes on the publisher. But
> with DDL replication, there has to be a point prior to which both
> nodes have the same setup. For that, before setting up DDL
> replication, users need to ensure that both nodes have the same
> schema, and then during setup, the user doesn't perform any DDL on the
> publisher.

The users can perform DDL during the setup if they do the following:
1. Create a logical replication slot to capture changes on the publisher
2. Do a backup for the publisher
3. Restore the backup as the subscriber
4. Advance the logical slot to the last valid LSN of the restore
5. Create pub/sub and use the above logical slot.
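
In rough SQL terms (the slot name, LSN, publication contents and connection
string below are placeholders, not part of the patch):

-- 1. On the publisher: create a logical slot to start capturing changes.
SELECT pg_create_logical_replication_slot('setup_slot', 'pgoutput');

-- 2 and 3. Take a backup of the publisher and restore it as the subscriber
--          (for example with pg_basebackup, or a dump and restore).

-- 4. On the publisher: advance the slot past everything already contained
--    in the restored backup, so those changes are not streamed again.
SELECT pg_replication_slot_advance('setup_slot', '0/16B3748');

-- 5. Create the publication (publisher) and a subscription (subscriber)
--    that reuses the existing slot.
CREATE PUBLICATION setup_pub FOR ALL TABLES;
CREATE SUBSCRIPTION setup_sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION setup_pub
    WITH (create_slot = false, slot_name = 'setup_slot', copy_data = false);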

Regards,
Zane




Re: Add LZ4 compression in pg_dump

2023-01-19 Thread Tomas Vondra
Hi,

On 1/19/23 17:42, gkokola...@pm.me wrote:
> 
> --- Original Message ---
> On Thursday, January 19th, 2023 at 4:45 PM, Tomas Vondra 
>  wrote:
>>
>> On 1/18/23 20:05, gkokola...@pm.me wrote:
>>
>>> --- Original Message ---
>>> On Wednesday, January 18th, 2023 at 3:00 PM, Tomas Vondra 
>>> tomas.von...@enterprisedb.com wrote:
>>
>> I'm not sure I understand why lz4/zstd are left in this place?
> 
> You are right, it is not obvious. Those were added in 5e73a60488, which is
> already committed in master, and I didn't want to backtrack. Of course, I am
> not opposed to doing so if you wish.
> 

Ah, I didn't realize it was already added by an earlier commit. In that
case let's not worry about it.

>>
 2) I wouldn't reorder the cases in WriteDataToArchive, i.e. I'd keep
 "none" at the end. It might make backpatches harder.
>>>
>>> Agreed. However a 'default' is needed in order to avoid compilation 
>>> warnings.
>>> Also note that 0002 completely does away with cases within 
>>> WriteDataToArchive.
>>
>>
>> OK, although that's also a consequence of using a "switch" instead of
>> plan "if" branches.
>>
>> Furthermore, I'm not sure we really need the pg_fatal() about invalid
>> compression method in these default blocks. I mean, how could we even
>> get to these places when the build does not support the algorithm? All
>> of this (ReadDataFromArchive, WriteDataToArchive, EndCompressor, ...)
>> happens looong after the compressor was initialized and the method
>> checked, no? So maybe either this should simply do Assert(false) or use
>> a different error message.
> 
> I like Assert(false).
> 

OK, good. Do you agree we should never actually get there, if the
earlier checks work correctly?

>>
 4) "cfp" struct no longer wraps gzFile, but the comment was not updated.
 FWIW I'm not sure switching to "void *" is an improvement, maybe it'd be
 better to have a "union" of correct types?
>>>
>>> Please find and updated comment and a union in place of the void *. Also
>>> note that 0002 completely does away with cfp in favour of a new struct
>>> CompressFileHandle. I maintained the void * there because it is used by
>>> private methods of the compressors. 0003 contains such an example with
>>> LZ4CompressorState.
>>
>>
>> I wonder if this (and also the previous item) makes sense to keep 0001
>> and 0002 or to combine them. The "intermediate" state is a bit annoying.
> 
> Agreed. It was initially submitted as one patch. Then it was requested to be
> split up into two parts, one to expand the use of the existing API and one to
> replace it with the new interface. Unfortunately the expansion of usage of the
> existing API requires some tweaking, but that is not a very good reason for
> the current patch set. I should have done a better job there.
> 
> Please find v22 attach which combines back 0001 and 0002. It is missing the
> documentation that was discussed above as I wanted to give a quick feedback.
> Let me know if you think that the combined version is the one to move forward
> with.
> 

Thanks, I'll take a look.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Doc: Rework contrib appendix -- informative titles, tweaked sentences

2023-01-19 Thread Karl O. Pinc
On Thu, 19 Jan 2023 11:03:53 -0600
"Karl O. Pinc"  wrote:

> Attached are 2 patches, a regular and a delta from your v4 review:
> 
> contrib_v5-delta.patch.txt
> contrib_v5.patch.txt

I left your appendix title unchanged: "Additional Supplied 
Extensions and Modules".  

I had put "Extensions" after
"Modules", because, apparently, things that come last in the
sentence are most remembered by the reader. My impression is that
more people are looking for extensions than modules.

Regards,

Karl 
Free Software:  "You don't pay back, you pay forward."
 -- Robert A. Heinlein




Re: Remove source code display from \df+?

2023-01-19 Thread Isaac Morland
On Thu, 19 Jan 2023 at 11:30, Justin Pryzby  wrote:

> On Wed, Jan 18, 2023 at 10:27:46AM -0500, Isaac Morland wrote:
> >
> > I thought I had: https://commitfest.postgresql.org/42/4133/
>
> This is failing tests:
> http://cfbot.cputube.org/isaac-morland.html
>
> It seems like any "make check" would fail .. but did you also try
> cirrusci from your own github account?
> ./src/tools/ci/README
>

I definitely ran "make check" but I did not realize there is also cirrusci.
I will look at that. I'm having trouble interpreting the cfbot page to
which you linked but I'll try to run cirrusci myself before worrying too
much about that.

Running "make check" the first time I was surprised to see no failures - so
I added tests for \df+, which passed when I did "make check".


> BTW, I think "ELSE NULL" could be omitted.
>

Looks like; I'll update. Might as well reduce the visual size of the code.


Re: Non-superuser subscription owners

2023-01-19 Thread Jeff Davis
On Wed, 2023-01-18 at 14:38 -0500, Robert Haas wrote:
> I was just noticing that what was committed here didn't actually fix
> the problem implied by the subject line. That is, non-superuser still
> can't own subscriptions. To put that another way, there's no way for
> the superuser to delegate the setup and administration of logical
> replication to a non-superuser. That's a bummer.

Right, though as Mark pointed out, it does accomplish something even if
it's a bit unsatisfying. We could certainly do better here.

> 2. There was also quite a bit of discussion of what to do if a user
> who was previously eligible to own a subscription ceases to be
> eligible, in particular around a superuser who is made into a
> non-superuser, but the same problem would apply

Correct, that's not a new problem, but exists in only a few places now.
Our privilege system is focused on "what action can the user take right
now?", and gets weirder when it comes to object ownership, which is a
more permanent thing.

Extending that system to a subscription object, which has its own
capabilities including a long-lived process, is cause for some
hesitation. I agree it's not necessarily a blocker.

> 3. There was a bit of discussion of maybe wanting to allow users to
> create subscriptions with some connection strings but not others,

This was an alternative to trying to sanitize connection strings,
because it's a bit difficult to reason about what might be "safe"
connection strings for a non-superuser, given that it's environment-
dependent. But if we do identify a reasonable set of sanitization
rules, we can proceed without 3.

> I am very curious to know (a) why work on this was abandoned (perhaps
> the answer is just lack of round tuits, in which case there is no
> more
> to be said), and (b) what people think of (1)-(3) above, and (c)
> whether anyone knows of further problems that need to be considered
> here.

(a) Mostly round-tuits. There are problems and questions, but that is true
of any work, and they could be solved. Or, if they don't turn out to
be terribly serious, we could ignore them.

(b) When I pick this up again I would be inclined towards the
following: try to solve 4-5 (listed below) first, which are
independently useful; then look at both 1 and 3 to see which one
presents an agreeable solution faster. I'll probably ignore 2 because I
couldn't get agreement the last time around (I think Mark objected to
the idea of denying a drop in privileges).

(c) Let me add:

4. There are still differences between the subscription worker applying
a change and going through the ordinary INSERT paths, for instance with
RLS. Also solvable.

5. Andres raised in another thread the idea of switching to the table
owner when applying changes (perhaps in a
SECURITY_RESTRICTED_OPERATION?): 

https://www.postgresql.org/message-id/20230112033355.u5tiyr2bmuoc4...@awork3.anarazel.de

That seems related, and I like the idea.


-- 
Jeff Davis
PostgreSQL Contributor Team - AWS






Re: BF animal malleefowl reported an failure in 001_password.pl

2023-01-19 Thread Thomas Munro
On Tue, Jan 17, 2023 at 11:24 AM Thomas Munro  wrote:
> Another idea would be to teach the latch infrastructure itself to
> magically swap latch events to position 0.  Latches are usually
> prioritised; it's only in this rare race case that they are not.

I liked that idea for a while, but I suspect it is not really possible
to solve the problem completely this way, because it won't work on
Windows (see below) and the race I described earlier is probably not
the only one.  I think it must also be possible for poll() to ignore a
signal that becomes pending just as the system call begins and return
a socket fd that has also just become ready, without waiting (thus not
causing EINTR).  Then the handler would run after we return to
userspace, we'd see only the socket event, and a later call would see
the latch event.

So I think we probably need something like the attached, which I was
originally trying to avoid.

Looking into all that made me notice a related problem on Windows.
There's an interesting difference between the implementation of
select() in src/backend/port/win32/socket.c and the Windows
implementation of WaitEventSetBlock() in latch.c.  The latch.c code
only reports one event at a time, in event array order, because that's
WaitForMultipleObjects()'s contract and we expose that fairly
directly.  The older socket.c code uses that only for wakeup, and then
it polls *all* sockets to be able to report more than one at a time.
I was careful to use a large array of output events to preserve the
existing round-robin servicing of multiple server sockets, but I see
now that that only works on Unix.  On Windows, I suspect that one
socket receiving a fast enough stream of new connections could prevent
a second socket from ever being serviced.  I think we might want to do
something about that.
From 8b08f138a3286503e339b546d758ef683514a929 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 20 Jan 2023 05:50:39 +1300
Subject: [PATCH] Process pending postmaster work before connections.

Modify the new event loop code from commit 7389aad6 so that it checks
for work requested by signal handlers even if it doesn't see a latch
event yet.

This gives priority to shutdown and reload requests where the latch will
be reported later in the event array, or in a later call to
WaitEventSetWait(), due to scheduling details.  In particular, this
guarantees that a SIGHUP-then-connect sequence (as seen in
authentication tests) causes the postmaster to process the reload before
accepting the connection.  If the WaitEventSetWait() call saw the socket
as ready, and the reload signal was generated before the connection,
then the latest time the signal handler should be able to run is after
poll/epoll_wait/kevent returns but before we check the
pending_pm_reload_request flag.

Reported-by: Hou Zhijie 
Reported-by: Tom Lane 
Discussion: https://postgr.es/m/OS0PR01MB57163D3BF2AB42ECAA94E5C394C29%40OS0PR01MB5716.jpnprd01.prod.outlook.com

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 9cedc1b9f0..cec3fe8dc5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1739,20 +1739,24 @@ ServerLoop(void)
 		for (int i = 0; i < nevents; i++)
 		{
 			if (events[i].events & WL_LATCH_SET)
-			{
 ResetLatch(MyLatch);
 
-/* Process work requested via signal handlers. */
-if (pending_pm_shutdown_request)
-	process_pm_shutdown_request();
-if (pending_pm_child_exit)
-	process_pm_child_exit();
-if (pending_pm_reload_request)
-	process_pm_reload_request();
-if (pending_pm_pmsignal)
-	process_pm_pmsignal();
-			}
-			else if (events[i].events & WL_SOCKET_ACCEPT)
+			/*
+			 * The following is done unconditionally, even if we didn't see
+			 * WL_LATCH_SET.  This gives high priority to shutdown and reload
+			 * requests where the latch happens to appear later in events[] or
+			 * will be reported by a later call to WaitEventSetWait().
+			 */
+			if (pending_pm_shutdown_request)
+process_pm_shutdown_request();
+			if (pending_pm_child_exit)
+process_pm_child_exit();
+			if (pending_pm_reload_request)
+process_pm_reload_request();
+			if (pending_pm_pmsignal)
+process_pm_pmsignal();
+
+			if (events[i].events & WL_SOCKET_ACCEPT)
 			{
 Port	   *port;
 
-- 
2.35.1



Re: Non-superuser subscription owners

2023-01-19 Thread Jeff Davis
On Thu, 2023-01-19 at 10:45 -0500, Robert Haas wrote:
> I wouldn't be OK with writing our own connection string parser for
> this purpose, but using PQconninfoParse seems OK. We still have to
> embed knowledge of which connection string parameters can trigger
> local file access, but that doesn't seem like a massive problem to
> me.

Another idea (I discussed with Andres some time ago) was to have an
option to libpq to turn off file access entirely. That could be a new
API function or a new connection option.

That would be pretty valuable by itself. Though we might want to
support a way to pass SSL keys as values rather than file paths, so
that we can still do SSL.

So perhaps the answer is that it will be a small patch to get non-
superuser subscription owners, but we need three or four preliminary
patches first.


-- 
Jeff Davis
PostgreSQL Contributor Team - AWS






Re: add PROCESS_MAIN to VACUUM

2023-01-19 Thread Nathan Bossart
On Thu, Jan 19, 2023 at 05:28:25PM +0530, vignesh C wrote:
> The patch does not apply on top of HEAD as in [1], please post a rebased 
> patch:

rebased

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 7a7f96bf4eea5be6cc252dda6bc330e77a6a3316 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 29 Dec 2022 15:31:49 -0800
Subject: [PATCH v3 1/1] add PROCESS_MAIN to VACUUM

---
 doc/src/sgml/ref/vacuum.sgml | 13 +
 doc/src/sgml/ref/vacuumdb.sgml   | 15 +++
 src/backend/commands/vacuum.c| 28 ++--
 src/backend/postmaster/autovacuum.c  |  4 +++-
 src/bin/psql/tab-complete.c  |  4 ++--
 src/bin/scripts/t/100_vacuumdb.pl|  7 +++
 src/bin/scripts/vacuumdb.c   | 24 
 src/include/commands/vacuum.h|  1 +
 src/test/regress/expected/vacuum.out |  4 
 src/test/regress/sql/vacuum.sql  |  5 +
 10 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 545b23b54f..b6d30b5764 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -33,6 +33,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ boolean ]
 SKIP_LOCKED [ boolean ]
 INDEX_CLEANUP { AUTO | ON | OFF }
+PROCESS_MAIN [ boolean ]
 PROCESS_TOAST [ boolean ]
 TRUNCATE [ boolean ]
 PARALLEL integer
@@ -238,6 +239,18 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ defname, "process_toast") == 0)
 			process_toast = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "truncate") == 0)
@@ -226,7 +229,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		(disable_page_skipping ? VACOPT_DISABLE_PAGE_SKIPPING : 0) |
 		(process_toast ? VACOPT_PROCESS_TOAST : 0) |
 		(skip_database_stats ? VACOPT_SKIP_DATABASE_STATS : 0) |
-		(only_database_stats ? VACOPT_ONLY_DATABASE_STATS : 0);
+		(only_database_stats ? VACOPT_ONLY_DATABASE_STATS : 0) |
+		(process_main ? VACOPT_PROCESS_MAIN : 0);
 
 	/* sanity checks on options */
 	Assert(params.options & (VACOPT_VACUUM | VACOPT_ANALYZE));
@@ -367,9 +371,10 @@ vacuum(List *relations, VacuumParams *params,
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 	 errmsg("ONLY_DATABASE_STATS cannot be specified with a list of tables")));
-		/* don't require people to turn off PROCESS_TOAST explicitly */
+		/* don't require people to turn off PROCESS_TOAST/MAIN explicitly */
 		if (params->options & ~(VACOPT_VACUUM |
 VACOPT_VERBOSE |
+VACOPT_PROCESS_MAIN |
 VACOPT_PROCESS_TOAST |
 VACOPT_ONLY_DATABASE_STATS))
 			ereport(ERROR,
@@ -2031,10 +2036,12 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)
 	/*
 	 * Remember the relation's TOAST relation for later, if the caller asked
 	 * us to process it.  In VACUUM FULL, though, the toast table is
-	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it.
+	 * automatically rebuilt by cluster_rel so we shouldn't recurse to it
+	 * unless PROCESS_MAIN is disabled.
 	 */
 	if ((params->options & VACOPT_PROCESS_TOAST) != 0 &&
-		(params->options & VACOPT_FULL) == 0)
+		((params->options & VACOPT_FULL) == 0 ||
+		 (params->options & VACOPT_PROCESS_MAIN) == 0))
 		toast_relid = rel->rd_rel->reltoastrelid;
 	else
 		toast_relid = InvalidOid;
@@ -2053,7 +2060,8 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)
 	/*
 	 * Do the actual work --- either FULL or "lazy" vacuum
 	 */
-	if (params->options & VACOPT_FULL)
+	if (params->options & VACOPT_FULL &&
+		params->options & VACOPT_PROCESS_MAIN)
 	{
 		ClusterParams cluster_params = {0};
 
@@ -2067,7 +2075,7 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
 		cluster_rel(relid, InvalidOid, &cluster_params);
 	}
-	else
+	else if (params->options & VACOPT_PROCESS_MAIN)
 		table_relation_vacuum(rel, params, vac_strategy);
 
 	/* Roll back any GUC changes executed by index functions */
@@ -2094,7 +2102,15 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)
 	 * totally unimportant for toast relations.
 	 */
 	if (toast_relid != InvalidOid)
+	{
+		/* we force VACOPT_PROCESS_MAIN so vacuum_rel() processes it */
+		bool force_opt = ((params->options & VACOPT_PROCESS_MAIN) == 0);
+
+		params->options |= VACOPT_PROCESS_MAIN;
 		vacuum_rel(toast_relid, NULL, params, true);
+		if (force_opt)
+			params->options &= ~VACOPT_PROCESS_MAIN;
+	}
 
 	/*
 	 * Now release the session-level lock on the main table.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f5ea381c53..12dcb2b762 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2860,7 +2860,9 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * skip vac_

Re: HOT chain validation in verify_heapam()

2023-01-19 Thread Robert Haas
On Thu, Jan 19, 2023 at 8:55 AM Aleksander Alekseev
 wrote:
> I noticed that this patch stuck a little and decided to take another look.
>
> It seems to be well written, covered with tests and my understanding
> is that all the previous feedback was accounted for. To your knowledge
> is there anything that prevents us from moving it to "Ready for
> Committer"?

Thanks for taking a look, and for pinging the thread.

I think that the handling of lp_valid[] in the loop that begins with
"Loop over offset and populate predecessor array from all entries that
are present in successor array" is very confusing. I think that
lp_valid[] should be answering the question "is the line pointer
basically sane?". That is, if it's a redirect, it needs to point to
something within the line pointer array (and we also check that it
must be an entry in the line pointer array that is used, which seems
fine). If it's not a redirect, it needs to point to space that's
entirely within the block, properly aligned, and big enough to contain
a tuple. We determine the answers to all of these questions in the
first loop, the one that starts with /* Perform tuple checks */.

Nothing that happens in the second loop, where we populate the
predecessor array, can reverse our previous conclusion that the line
pointer is valid, so this loop shouldn't be resetting entries in
lp_valid[] to false. The reason that it's doing so seems to be that it
wants to use lp_valid[] to control the behavior of the third loop,
where we perform checks against things that have entries in the
predecessor array. As written, the code ensures that we always set
lp_valid[nextoffnum] to false unless we set predecessor[nextoffnum] to
a value other than InvalidOffsetNumber. But that is needlessly
complex: the third loop doesn't need to look at lp_valid[] at all. It
can just check whether predecessor[currentoffnum] is valid. If it is,
perform checks. Otherwise, skip it. It seems to me that this would be
significantly simpler.
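
For illustration, a rough sketch of that third-loop shape (not actual code
from the patch; "maxoff" stands for the page's PageGetMaxOffsetNumber()
result):

for (OffsetNumber currentoffnum = FirstOffsetNumber;
     currentoffnum <= maxoff;
     currentoffnum = OffsetNumberNext(currentoffnum))
{
    OffsetNumber predoffnum = predecessor[currentoffnum];

    /* no recorded predecessor: nothing to check for this entry */
    if (predoffnum == InvalidOffsetNumber)
        continue;

    /* ... perform the predecessor/current-tuple checks here ... */
}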

To put the above complaint another way, a variable shouldn't mean two
different things depending on where you are in the function. Right
now, at the end of the first loop, lp_valid[x] answers the question
"is line pointer x basically valid?". But by the end of the second
loop, it answers the question "is line pointer x valid and does it
also have a valid predecessor?". That kind of definitional change is
something to be avoided.

The test if (pred_in_progress || TransactionIdDidCommit(curr_xmin))
seems wrong to me. Shouldn't it be &&? Has this code been tested at
all?  It doesn't seem to have a test case. Some of these other errors
don't, either. Maybe there's some that we can't easily test in an
automated way, but we should test what we can. I guess maybe casual
testing wouldn't reveal the problem here because of the recheck, but
it's worrying to find logic that doesn't look right with no
corresponding comments or test cases.

Some error message kibitzing:

 psprintf("redirected tuple at line pointer offset %u is not heap only tuple",

It seems to me that this should say "redirected line pointer pointing
to a non-heap-only tuple at offset %u". There is no such thing as a
redirected tuple -- and even if there were, what we have here is
clearly a redirected line pointer.

psprintf("redirected tuple at line pointer offset %u is not heap only tuple",

And I think for the same reasons this one should say something like
"redirected line pointer pointing to a non-heap-only tuple at offset
%u".

 psprintf("redirected tuple at line pointer offset %u is not heap
updated tuple",

And this one should say something like "redirected line pointer
pointing to a tuple not produced by an update at offset %u".

Possibly all of these would sound better with "points" rather than
"pointing" -- if so, we'd need to change an existing message in the
same way.

 psprintf("tuple is root of chain but it is marked as heap-only tuple"));

I think this would sound better if you deleted the word "it".

I don't know whether it's worth arguing about -- it feels like we've
argued too much already about this sort of thing -- but I am not very
convinced by initializers like OffsetNumber
predecessor[MaxOffsetNumber] = {InvalidOffsetNumber}. That style is
only correct because InvalidOffsetNumber happens to be zero. If it
were up to me, I'd use memset to clear the predecessor array. I would
not bulk initialize successor and lp_valid but make sure that the first
loop always sets them, possibly by having the top of the loop set them
to InvalidOffsetNumber and false initially and then letting code later
in the loop change the value, or possibly in some other way.
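
For example, just a sketch of the memset alternative (it still depends on
InvalidOffsetNumber being zero, but at least the zeroing is explicit):

OffsetNumber predecessor[MaxOffsetNumber];

memset(predecessor, 0, sizeof(predecessor));	/* InvalidOffsetNumber == 0 */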

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Non-superuser subscription owners

2023-01-19 Thread Robert Haas
On Thu, Jan 19, 2023 at 1:40 PM Jeff Davis  wrote:
> On Thu, 2023-01-19 at 10:45 -0500, Robert Haas wrote:
> > I wouldn't be OK with writing our own connection string parser for
> > this purpose, but using PQconninfoParse seems OK. We still have to
> > embed knowledge of which connection string parameters can trigger
> > local file access, but that doesn't seem like a massive problem to
> > me.
>
> Another idea (I discussed with Andres some time ago) was to have an
> option to libpq to turn off file access entirely. That could be a new
> API function or a new connection option.
>
> That would be pretty valuable by itself. Though we might want to
> support a way to pass SSL keys as values rather than file paths, so
> that we can still do SSL.

Maybe all of that would be useful, but it doesn't seem that mandatory.

> So perhaps the answer is that it will be a small patch to get non-
> superuser subscription owners, but we need three or four preliminary
> patches first.

I guess I'm not quite seeing it. Why can't we write a small patch to
get this working right now, probably in a few hours, and deal with any
improvements that people want at a later time?
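
For what it's worth, here is a rough sketch (not an actual patch) of the
kind of check such a small patch could contain, using PQconninfoParse; the
list of file-referencing parameters below is an assumption and would need
careful vetting against the libpq documentation:

#include <stdbool.h>
#include <string.h>
#include <libpq-fe.h>

/* Illustrative list only of options that can trigger local file access. */
static const char *const file_access_params[] = {
    "passfile", "sslcert", "sslkey", "sslrootcert",
    "sslcrl", "sslcrldir", "service", NULL
};

/* Return true if the connection string sets any file-referencing option. */
static bool
conninfo_references_local_files(const char *conninfo)
{
    char       *errmsg = NULL;
    PQconninfoOption *opts = PQconninfoParse(conninfo, &errmsg);
    bool        found = false;

    if (opts == NULL)
    {
        if (errmsg)
            PQfreemem(errmsg);
        return true;            /* unparseable: treat as unsafe */
    }

    for (PQconninfoOption *opt = opts; opt->keyword != NULL; opt++)
    {
        if (opt->val == NULL || opt->val[0] == '\0')
            continue;
        for (int i = 0; file_access_params[i] != NULL; i++)
        {
            if (strcmp(opt->keyword, file_access_params[i]) == 0)
                found = true;
        }
    }

    PQconninfoFree(opts);
    return found;
}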

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread Robert Haas
On Thu, Jan 19, 2023 at 12:54 PM Nathan Bossart
 wrote:
> > OK. Might be worth a short comment.
>
> I added one.

Thanks. I'd move it to the inner indentation level so it's closer to
the test at issue.

I would also suggest reordering the documentation and the
postgresql.conf.sample file so that reserved_connections precedes
superuser_reserved_connections, instead of following it.

Other than that, this seems like it might be about ready to commit,
barring objections or bug reports.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Teach planner to further optimize sort in distinct

2023-01-19 Thread Ankit Kumar Pandey



On 19/01/23 18:49, David Rowley wrote:

> I think you should write a function like:
>
> bool pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
> List **reorderedkeys, int *n_common)
>
> which works very similarly to pathkeys_count_contained_in, but
> populates *reorderedkeys so it contains all of the keys in keys1, but
> put the matching ones in the same order as they are in keys2.  If you
> find a keys2 that does not exist in keys1 then just add the additional
> unmatched keys1 keys to *reorderedkeys.  Set *n_common to the number
> of common keys excluding any that come after a key2 key that does not
> exist as a key1 key.
>
> You can just switch to using that function in
> create_final_distinct_paths(). You'll need to consider if the query is
> a DISTINCT ON query and not try the unordered version of the function
> in that case.

Tried this, it worked as expected. Tests are green as well.

> I also just noticed that in build_index_paths() we'll leave the index
> path's pathkeys empty if we deem the pathkeys as useless.  I'm not
> sure what the repercussions of setting those to the return value of
> build_index_pathkeys() if useful_pathkeys is otherwise empty.

This is very rigid indeed.

> It's possible that truncate_useless_pathkeys() needs to be modified to
> check if the pathkeys might be useful for DISTINCT

We have pathkeys_useful_for_merging and pathkeys_useful_for_ordering.
Can we not have pathkeys_useful_for_distinct?
Also, pathkeys_useful_for_ordering calls pathkeys_count_contained_in.
We can add a code path on similar lines.

> When I
> thought about it I assumed that we always set IndexPath's pathkeys to
> whatever (if any) sort order that the index provides.

Can we not add the original pathkeys to IndexPath? It could be useful
at other places as well. At least, I can see it being useful in sorting cases.

> makes me wonder if this entire patch is a good idea.

We are still getting some benefit even without index paths for now.

I have attached a patch with pathkeys_count_contained_in_unordered
and corresponding changes in create_final_distinct_paths for reference.

Thanks,

Ankit
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index d2e241c983..f0bc977eff 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -2014,3 +2014,52 @@ has_useful_pathkeys(PlannerInfo *root, RelOptInfo *rel)
 		return true;			/* might be able to use them for ordering */
 	return false;/* definitely useless */
 }
+
+/*
+ * pathkeys_count_contained_in_unordered
+ *		reorders keys1 to match keys2
+ * Populates *reorderedkeys so it contains all of the keys in keys1, but
+ * puts the matching ones in the same order as they are in keys2.  Keys in
+ * keys1 which do not match any key in keys2 are appended at the end.
+ * *n_common denotes the number of matched keys.
+ */
+bool
+pathkeys_count_contained_in_unordered(List *keys1, List *keys2,
+	  List **reorderedkeys, int *n_common)
+{
+	ListCell*	lc1;
+	ListCell*	lc2;
+	int			n = 0;
+	List*		unmatched_keys = NIL;
+
+	foreach(lc1, keys1)
+	{
+		bool matched = false;
+		PathKey*pathkey1 = (PathKey *) lfirst(lc1);
+		foreach(lc2, keys2)
+		{
+			PathKey*pathkey2 = (PathKey *) lfirst(lc2);
+			if (pathkey1 == pathkey2)
+			{
+*reorderedkeys = lappend(*reorderedkeys, pathkey2);
+n++;
+matched = true;
+break;
+			}
+
+		}
+		/*
+		 * If no match is found, collect path key for appending it later
+		 */
+		if (!matched)
+			unmatched_keys = lappend(unmatched_keys, pathkey1);
+	}
+
+	Assert(n == list_length(*reorderedkeys));
+
+	*reorderedkeys = list_concat_unique(*reorderedkeys, unmatched_keys);
+	Assert(list_length(*reorderedkeys) == list_length(keys1));
+	*n_common = n;
+
+	return *n_common == list_length(keys1);
+}
\ No newline at end of file
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 05f44faf6e..d9904b046a 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4936,10 +4936,28 @@ create_final_distinct_paths(PlannerInfo *root, RelOptInfo *input_rel,
 			Path	   *sorted_path;
 			bool		is_sorted;
 			int			presorted_keys;
+			List*		reordered_keys = NIL;
 
-			is_sorted = pathkeys_count_contained_in(needed_pathkeys,
+			/*
+			 * Attempt to optimize non-DISTINCT ON queries when sorted keys
+			 * are present but not in the required order.  We can reorder
+			 * needed_pathkeys to match the input path's key ordering and
+			 * reuse the input sort.
+			 */
+			if (!parse->hasDistinctOn)
+			{
+is_sorted = pathkeys_count_contained_in_unordered(needed_pathkeys,
 	input_path->pathkeys,
+	&reordered_keys,
 	&presorted_keys);
+needed_pathkeys = reordered_keys;
+			}
+			else
+			{
+is_sorted = pathkeys_count_contained_in(needed_pathkeys,
+	input_path->pathkeys,
+	&presorted_keys);
+			}
 
 			if (is_sorted)
 sorted_path = input_path;
diff --git a/

Re: Extracting cross-version-upgrade knowledge from buildfarm client

2023-01-19 Thread Andrew Dunstan


On 2023-01-18 We 17:14, Tom Lane wrote:
> Andrew Dunstan  writes:
>> I think we can do what you want but it's a bit harder than what you've
>> done. If we're not going to save the current run's product then we need
>> to run the upgrade test from a different directory (probably directly in
>> "$buildroot/$this_branch/inst"). Otherwise we'll be testing upgrade to
>> the saved product of a previous run of this branch.
> Hmm, maybe that explains some inconsistent results I remember getting.
>
>> I'll take a stab at it tomorrow if you like.
> Please do.
>
>   


See



I tested it and it seems to be doing the right thing.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: generic plans and "initial" pruning

2023-01-19 Thread Tom Lane
I spent some time re-reading this whole thread, and the more I read
the less happy I got.  We are adding a lot of complexity and introducing
coding hazards that will surely bite somebody someday.  And after awhile
I had what felt like an epiphany: the whole problem arises because the
system is wrongly factored.  We should get rid of AcquireExecutorLocks
altogether, allowing the plancache to hand back a generic plan that
it's not certain of the validity of, and instead integrate the
responsibility for acquiring locks into executor startup.  It'd have
to be optional there, since we don't need new locks in the case of
executing a just-planned plan; but we can easily add another eflags
bit (EXEC_FLAG_GET_LOCKS or so).  Then there has to be a convention
whereby the ExecInitNode traversal can return an indicator that
"we failed because the plan is stale, please make a new plan".

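To make that concrete, callers might end up with a retry loop of roughly
this shape (purely illustrative: ExecutorStart does not return a status
today, and EXEC_FLAG_GET_LOCKS does not exist yet):

for (;;)
{
    CachedPlan *cplan = GetCachedPlan(plansource, params, NULL, queryEnv);

    /* ... build a QueryDesc from cplan's PlannedStmt ... */

    if (ExecutorStart(queryDesc, EXEC_FLAG_GET_LOCKS))
        break;              /* locks acquired, plan still valid */

    /* plan went stale during startup: discard it and replan */
    ExecutorEnd(queryDesc);
    ReleaseCachedPlan(cplan, NULL);
}
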
There are a couple reasons why this feels like a good idea:

* There's no need for worry about keeping the locking decisions in sync
with what executor startup does.

* We don't need to add the overhead proposed in the current patch to
pass forward data about what got locked/pruned.  While that overhead
is hopefully less expensive than the locks it saved acquiring, it's
still overhead (and in some cases the patch will fail to save acquiring
any locks, making it certainly a net negative).

* In a successfully built execution state tree, there will simply
not be any nodes corresponding to pruned-away, never-locked subplans.
As long as code like EXPLAIN follows the state tree and doesn't poke
into plan nodes that have no matching state, it's secure against the
sort of problems that Robert worried about upthread.

While I've not attempted to write any code for this, I can also
think of a few issues that'd have to be resolved:

* We'd be pushing the responsibility for looping back and re-planning
out to fairly high-level calling code.  There are only half a dozen
callers of GetCachedPlan, so there's not that many places to be
touched; but in some of those places the subsequent executor-start call
is not close by, so that the necessary refactoring might be pretty
painful.  I doubt there's anything insurmountable, but we'd definitely
be changing some fundamental APIs.

* In some cases (views, at least) we need to acquire lock on relations
that aren't directly reflected anywhere in the plan tree.  So there'd
have to be a separate mechanism for getting those locks and rechecking
validity afterward.  A list of relevant relation OIDs might be enough
for that.

* We currently do ExecCheckPermissions() before initializing the
plan state tree.  It won't do to check permissions on relations we
haven't yet locked, so that responsibility would have to be moved.
Maybe that could also be integrated into the initialization recursion?
Not sure.

* In the existing usage of AcquireExecutorLocks, if we do decide
that the plan is stale then we are able to release all the locks
we got before we go off and replan.  I'm not certain if that behavior
needs to be preserved, but if it does then that would require some
additional bookkeeping in the executor.

* This approach is optimizing on the assumption that we usually
won't need to replan, because if we do then we might waste a fair
amount of executor startup overhead before discovering we have
to throw all that state away.  I think that's clearly the right
way to bet, but perhaps somebody else has a different view.

Thoughts?

regards, tom lane




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread Nathan Bossart
On Thu, Jan 19, 2023 at 02:17:35PM -0500, Robert Haas wrote:
> On Thu, Jan 19, 2023 at 12:54 PM Nathan Bossart
>  wrote:
>> > OK. Might be worth a short comment.
>>
>> I added one.
> 
> Thanks. I'd move it to the inner indentation level so it's closer to
> the test at issue.

I meant for it to cover the call to HaveNFreeProcs() as well since the same
idea applies.  I left it the same for now, but if you still think it makes
sense to move it, I'll do so.

> I would also suggest reordering the documentation and the
> postgresql.conf.sample file so that reserved_connections precedes
> superuser_reserved_connections, instead of following it.

Makes sense.

> Other than that, this seems like it might be about ready to commit,
> barring objections or bug reports.

Awesome.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From a6811b643df94c9057373fd687398c85a807fd5e Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 18 Jan 2023 12:43:41 -0800
Subject: [PATCH v5 1/3] Code review for ea92368.

This commit missed an error message and a line in the docs.
---
 doc/src/sgml/config.sgml  | 3 +--
 src/backend/utils/init/postinit.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 89d53f2a64..e019a1aac9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -725,8 +725,7 @@ include_dir 'conf.d'
 number of active concurrent connections is at least
 max_connections minus
 superuser_reserved_connections, new
-connections will be accepted only for superusers, and no
-new replication connections will be accepted.
+connections will be accepted only for superusers.

 

diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index ae5a85ed65..9145d96b38 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -931,7 +931,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 		!HaveNFreeProcs(ReservedBackends))
 		ereport(FATAL,
 (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
- errmsg("remaining connection slots are reserved for non-replication superuser connections")));
+ errmsg("remaining connection slots are reserved for superusers")));
 
 	/* Check replication permissions needed for walsender processes. */
 	if (am_walsender)
-- 
2.25.1

>From e0390f0120315746ea04a9fa1bf709def76e6196 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 17 Jan 2023 13:58:56 -0800
Subject: [PATCH v5 2/3] Rename ReservedBackends to
 SuperuserReservedConnections.

This is in preparation for adding a new reserved_connections GUC.
---
 src/backend/postmaster/postmaster.c | 20 ++--
 src/backend/utils/init/postinit.c   |  4 ++--
 src/backend/utils/misc/guc_tables.c |  2 +-
 src/include/postmaster/postmaster.h |  2 +-
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 9cedc1b9f0..3f799c4ac8 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -204,15 +204,15 @@ char	   *Unix_socket_directories;
 char	   *ListenAddresses;
 
 /*
- * ReservedBackends is the number of backends reserved for superuser use.
- * This number is taken out of the pool size given by MaxConnections so
- * number of backend slots available to non-superusers is
- * (MaxConnections - ReservedBackends).  Note what this really means is
- * "if there are <= ReservedBackends connections available, only superusers
- * can make new connections" --- pre-existing superuser connections don't
- * count against the limit.
+ * SuperuserReservedConnections is the number of backends reserved for
+ * superuser use.  This number is taken out of the pool size given by
+ * MaxConnections so number of backend slots available to non-superusers is
+ * (MaxConnections - SuperuserReservedConnections).  Note what this really
+ * means is "if there are <= SuperuserReservedConnections connections
+ * available, only superusers can make new connections" --- pre-existing
+ * superuser connections don't count against the limit.
  */
-int			ReservedBackends;
+int			SuperuserReservedConnections;
 
 /* The socket(s) we're listening to. */
 #define MAXLISTEN	64
@@ -908,11 +908,11 @@ PostmasterMain(int argc, char *argv[])
 	/*
 	 * Check for invalid combinations of GUC settings.
 	 */
-	if (ReservedBackends >= MaxConnections)
+	if (SuperuserReservedConnections >= MaxConnections)
 	{
 		write_stderr("%s: superuser_reserved_connections (%d) must be less than max_connections (%d)\n",
 	 progname,
-	 ReservedBackends, MaxConnections);
+	 SuperuserReservedConnections, MaxConnections);
 		ExitPostmaster(1);
 	}
 	if (XLogArchiveMode > ARCHIVE_MODE_OFF && wal_level == WAL_LEVEL_MINIMAL)
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 9145d96b38..40f145e0ab 100644
--- 

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-19 Thread Robert Haas
On Wed, Jan 18, 2023 at 1:31 PM Peter Geoghegan  wrote:
> pgstat_report_analyze() will totally override the
> tabentry->dead_tuples information that drives autovacuum.c, based on
> an estimate derived from a random sample -- which seems to me to be an
> approach that just doesn't have any sound theoretical basis.

In other words, ANALYZE sometimes (but not always) produces wrong answers.

On Wed, Jan 18, 2023 at 4:08 PM Andres Freund  wrote:
> One complicating factor is that VACUUM sometimes computes an incrementally
> more bogus n_live_tup when it skips pages due to the VM, whereas ANALYZE
> computes something sane. I unintentionally encountered one when I was trying
> something while writing this email, reproducer attached.

In other words, VACUUM sometimes (but not always) produces wrong answers.

TL;DR: We're screwed.

I refuse to believe that any amount of math you can do on numbers that
can be arbitrarily inaccurate will result in an accurate answer
popping out the other end. Trying to update the reltuples estimate
incrementally based on an estimate derived from a non-random,
likely-to-be-skewed subset of the table is always going to produce
distortion that gets worse and worse the more times you do it. If you
could say, well, the existing estimate of let's say 100 tuples per
page is based on the density being 200 tuples per page in the pages I
just scanned and 50 tuples per page in the rest of the table, then you
could calculate a new estimate that keeps the value of 50 tuples per
page for the remainder of the table intact and just replaces the
estimate for the part you just scanned. But we have no way of doing
that, so we just make some linear combination of the old estimate with
the new one. That overweights the repeatedly-sampled portions of the
table more and more, making the estimate wronger and wronger.
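
To illustrate with a toy calculation (not PostgreSQL code, just the
linear-combination update described above): a 1000-page table where the
100 repeatedly-scanned pages hold 200 tuples/page and the other 900 pages
hold 50 tuples/page starts with the correct reltuples of 65000, and then
climbs toward 200000 on every pass even though the table never changes:

#include <stdio.h>

int
main(void)
{
    const double total_pages = 1000.0;
    const double scanned_pages = 100.0;     /* same pages every time */
    const double scanned_density = 200.0;   /* tuples/page in that region */
    double      reltuples = 100 * 200.0 + 900 * 50.0;   /* correct: 65000 */

    for (int i = 1; i <= 5; i++)
    {
        double      old_density = reltuples / total_pages;

        /* extrapolate the old density over the unscanned part */
        reltuples = old_density * (total_pages - scanned_pages) +
            scanned_density * scanned_pages;
        printf("after pass %d: reltuples = %.0f\n", i, reltuples);
    }
    return 0;
}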

Now, that is already quite bad. But if we accept the premise that
neither VACUUM nor ANALYZE is guaranteed to ever produce a new
actually-reliable estimate, then not only will we go progressively
more wrong as time goes by, but we have no way of ever fixing
anything. If you get a series of unreliable data points followed by a
reliable data point, you can at least get back on track when the
reliable data shows up. But it sounds like you guys are saying that
there's no guarantee that will ever happen, which is a bit like
discovering that not only do you have a hole in your gas tank but
there is no guarantee that you will arrive at a gas station ever again
regardless of distance travelled.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: almost-super-user problems that we haven't fixed yet

2023-01-19 Thread Robert Haas
On Thu, Jan 19, 2023 at 2:46 PM Nathan Bossart  wrote:
> > Thanks. I'd move it to the inner indentation level so it's closer to
> > the test at issue.
>
> I meant for it to cover the call to HaveNFreeProcs() as well since the same
> idea applies.  I left it the same for now, but if you still think it makes
> sense to move it, I'll do so.

Hmm, OK. If you want to leave it where it is, I won't argue further.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: BF animal malleefowl reported an failure in 001_password.pl

2023-01-19 Thread Tom Lane
Thomas Munro  writes:
> So I think we probably need something like the attached, which I was
> originally trying to avoid.

Yeah, something like that.  I also wonder if you don't need to think
a bit harder about the ordering of the flag checks, in particular
it seems like servicing reload_request before child_exit might be
a good idea (remembering that child_exit might cause launching of
new children, so we want to be up to speed on relevant settings).
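
Concretely, with the handlers from the proposed patch, something like this
ordering (illustrative only):

    if (pending_pm_shutdown_request)
        process_pm_shutdown_request();
    if (pending_pm_reload_request)
        process_pm_reload_request();
    if (pending_pm_child_exit)
        process_pm_child_exit();
    if (pending_pm_pmsignal)
        process_pm_pmsignal();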

regards, tom lane




Re: meson oddities

2023-01-19 Thread Peter Eisentraut

On 11.01.23 12:05, Peter Eisentraut wrote:
> I think there is also an adjacent issue:  The subdir options may be
> absolute or relative.  So if you specify --prefix=/usr/local and
> --sysconfdir=/etc/postgresql, then
>
>      config_paths_data.set_quoted('SYSCONFDIR', dir_prefix / dir_sysconf)
>
> would produce something like /usr/local/etc/postgresql.
>
> I think maybe we should make all the dir_* variables absolute right at
> the beginning, like
>
>      dir_lib = get_option('libdir')
>      if not fs.is_absolute(dir_lib)
>      dir_lib = dir_prefix / dir_lib
>      endif
>
> And then the appending stuff could be done after that, keeping the
> current code.

Here is a proposed patch.  This should fix all these issues.
From cad243b60e0a45514f48dfc189069eaf883fc5a6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 19 Jan 2023 21:05:01 +0100
Subject: [PATCH] meson: Make all subdirectory variables absolute

Meson subdirectory options may be returned as absolute or relative,
depending on whether they are under prefix.  Meson itself can handle
both, but we might need an absolute path in some cases, so to simplify
this turn all paths to absolute right at the beginning.
---
 meson.build | 45 +
 src/include/meson.build | 24 +++---
 2 files changed, 49 insertions(+), 20 deletions(-)

diff --git a/meson.build b/meson.build
index 45fb9dd616..edbec530e2 100644
--- a/meson.build
+++ b/meson.build
@@ -457,27 +457,46 @@ endif
 # Directories
 ###
 
-# These are set by the equivalent --xxxdir configure options.  We
-# append "postgresql" to some of them, if the string does not already
-# contain "pgsql" or "postgres", in order to avoid directory clutter.
+# These are set by the equivalent --xxxdir configure options.
+#
+# Subdirectory options may be returned as absolute or relative,
+# depending on whether they are under prefix.  Meson itself can handle
+# both, but we might need an absolute path in some cases, so to
+# simplify this we turn all paths to absolute here.
+#
+# We append "postgresql" to some of the directories, if the string
+# does not already contain "pgsql" or "postgres", in order to avoid
+# directory clutter.
 
 pkg = 'postgresql'
 
 dir_prefix = get_option('prefix')
 
 dir_bin = get_option('bindir')
+if not fs.is_absolute(dir_bin)
+  dir_bin = dir_prefix / dir_bin
+endif
 
 dir_data = get_option('datadir')
+if not fs.is_absolute(dir_data)
+  dir_data = dir_prefix / dir_data
+endif
 if not (dir_data.contains('pgsql') or dir_data.contains('postgres'))
   dir_data = dir_data / pkg
 endif
 
 dir_sysconf = get_option('sysconfdir')
+if not fs.is_absolute(dir_sysconf)
+  dir_sysconf = dir_prefix / dir_sysconf
+endif
 if not (dir_sysconf.contains('pgsql') or dir_sysconf.contains('postgres'))
   dir_sysconf = dir_sysconf / pkg
 endif
 
 dir_lib = get_option('libdir')
+if not fs.is_absolute(dir_lib)
+  dir_lib = dir_prefix / dir_lib
+endif
 
 dir_lib_pkg = dir_lib
 if not (dir_lib_pkg.contains('pgsql') or dir_lib_pkg.contains('postgres'))
@@ -487,21 +506,31 @@ endif
 dir_pgxs = dir_lib_pkg / 'pgxs'
 
 dir_include = get_option('includedir')
+if not fs.is_absolute(dir_include)
+  dir_include = dir_prefix / dir_include
+endif
 
 dir_include_pkg = dir_include
-dir_include_pkg_rel = ''
 if not (dir_include_pkg.contains('pgsql') or dir_include_pkg.contains('postgres'))
   dir_include_pkg = dir_include_pkg / pkg
-  dir_include_pkg_rel = pkg
 endif
 
 dir_man = get_option('mandir')
+if not fs.is_absolute(dir_man)
+  dir_man = dir_prefix / dir_man
+endif
 
 # FIXME: These used to be separately configurable - worth adding?
 dir_doc = get_option('datadir') / 'doc' / 'postgresql'
+if not fs.is_absolute(dir_doc)
+  dir_doc = dir_prefix / dir_doc
+endif
 dir_doc_html = dir_doc
 
 dir_locale = get_option('localedir')
+if not fs.is_absolute(dir_locale)
+  dir_locale = dir_prefix / dir_locale
+endif
 
 
 # Derived values
@@ -2560,9 +2589,9 @@ mod_install_rpaths = []
 if host_system != 'darwin'
   # Add absolute path to libdir to rpath. This ensures installed binaries /
   # libraries find our libraries (mainly libpq).
-  bin_install_rpaths += dir_prefix / dir_lib
-  lib_install_rpaths += dir_prefix / dir_lib
-  mod_install_rpaths += dir_prefix / dir_lib
+  bin_install_rpaths += dir_lib
+  lib_install_rpaths += dir_lib
+  mod_install_rpaths += dir_lib
 
   # Add extra_lib_dirs to rpath. This ensures we find libraries we depend on.
   #
diff --git a/src/include/meson.build b/src/include/meson.build
index 51ad06cecb..b70ffdf785 100644
--- a/src/include/meson.build
+++ b/src/include/meson.build
@@ -28,18 +28,18 @@ configure_files += pg_config
 
 
 config_paths_data = configuration_data()
-config_paths_data.set_quoted('PGBINDIR', dir_prefix / dir_bin)
-config_paths_data.set_quoted('PGSHAREDIR', dir_prefix / dir_data)
-config_paths_data.set_quoted('SYSCONFDIR', dir_prefix / dir_sysconf)
-config_paths_data.set_quoted(

Re: meson oddities

2023-01-19 Thread Andres Freund
Hi,

On 2023-01-19 21:37:15 +0100, Peter Eisentraut wrote:
> On 11.01.23 12:05, Peter Eisentraut wrote:
> > I think there is also an adjacent issue:  The subdir options may be
> > absolute or relative.  So if you specify --prefix=/usr/local and
> > --sysconfdir=/etc/postgresql, then
> > 
> >      config_paths_data.set_quoted('SYSCONFDIR', dir_prefix / dir_sysconf)
> > 
> > would produce something like /usr/local/etc/postgresql.

I don't think it would. The / operator understands absolute paths and doesn't
add the "first component" if the second component is absolute.


>  
>  dir_bin = get_option('bindir')
> +if not fs.is_absolute(dir_bin)
> +  dir_bin = dir_prefix / dir_bin
> +endif

Hm, I'm not sure this works entirely right on windows. A path like /blub isn't
absolute on windows, but it's not really relative either. It's a "drive local"
path. I.e. relative to the current drive (c:/), but not the subdirectory
therein.

Greetings,

Andres Freund




Re: Transparent column encryption

2023-01-19 Thread Jacob Champion
On 12/31/22 06:17, Peter Eisentraut wrote:
> On 21.12.22 06:46, Peter Eisentraut wrote:
>> And another update.  The main changes are that I added an 'unspecified' 
>> CMK algorithm, which indicates that the external KMS knows what it is 
>> but the database system doesn't.  This was discussed a while ago.  I 
>> also changed some details about how the "cmklookup" works in libpq. Also 
>> added more code comments and documentation and rearranged some code.

Trying to delay a review until I had "completed it" has only led to me
not reviewing, so here's a partial one. Let me know what pieces of the
implementation and/or architecture you're hoping to get more feedback on.

I like the existing "caveats" documentation, and I've attached a sample
patch with some more caveats documented, based on some of the upthread
conversation:

- text format makes fixed-length columns leak length information too
- you only get partial protection against the Evil DBA
- RSA-OAEP public key safety

(Feel free to use/remix/discard as desired.)

When writing the paragraph on RSA-OAEP I was reminded that we didn't
really dig into the asymmetric/symmetric discussion. Assuming that most
first-time users will pick the builtin CMK encryption method, do we
still want to have an asymmetric scheme implemented first instead of a
symmetric keywrap? I'm still concerned about that public key, since it
can't really be made public. (And now that "unspecified" is available, I
think an asymmetric CMK could be easily created by users that have a
niche use case, and then we wouldn't have to commit to supporting it
forever.)

For the padding caveat:

> +  There is no concern if all values are of the same length (e.g., credit
> +  card numbers).

I nodded along to this statement last year, and then this year I learned
that CCNs aren't fixed-length. So with a 16-byte block, you're probably
going to be able to figure out who has an American Express card.

The column encryption algorithm is set per-column -- but isn't it
tightly coupled to the CEK, since the key length has to match? From a
layperson perspective, using the same key to encrypt the same plaintext
under two different algorithms (if they happen to have the same key
length) seems like it might be cryptographically risky. Is there a
reason I should be encouraged to do that?

With the loss of \gencr it looks like we also lost a potential way to
force encryption from within psql. Any plans to add that for v1?

While testing, I forgot how the new option worked and connected with
`column_encryption=on` -- and then I accidentally sent unencrypted data
to the server, since `on` means "not enabled". :( The server errors out
after the damage is done, of course, but would it be okay to strictly
validate that option's values?

Are there plans to document client-side implementation requirements, to
ensure cross-client compatibility? Things like the "PG\x00\x01"
associated data are buried at the moment (or else I've missed them in
the docs). If you're holding off until the feature is more finalized,
that's fine too.

Speaking of cross-client compatibility, I'm still disconcerted by the
ability to write the value "hello world" into an encrypted integer
column. Should clients be required to validate the text format, using
the attrealtypid?

It occurred to me when looking at the "unspecified" CMK scheme that the
CEK doesn't really have to be an encryption key at all. In that case it
can function more like a (possibly signed?) cookie for lookup, or even
be ignored altogether if you don't want to use a wrapping scheme
(similar to JWE's "direct" mode, maybe?). So now you have three ways to
look up or determine a column encryption key (CMK realm, CMK name, CEK
cookie)... is that a concept worth exploring in v1 and/or the documentation?

Thanks,
--Jacob

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 55f33a2f5f..06e1c077d5 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1588,7 +1588,33 @@ export PGCMKLOOKUP
 card numbers).  But if there are signficant length differences between
 valid values and that length information is security-sensitive, then
 application-specific workarounds such as padding would need to be applied.
-How to do that securely is beyond the scope of this manual.
+How to do that securely is beyond the scope of this manual.  Note that
+column encryption is applied to the text representation of the stored value,
+so length differences can be leaked even for fixed-length column types (e.g.
+bigint, whose largest decimal representation is longer
+than 16 bytes).
+   
+
+   
+Column encryption provides only partial protection against a malicious
+user with write access to the table.  Once encrypted, any modifications to a
+stored value on the server side will cause a decryption failure on the
+client.  However, a user with write access may still freely swap encrypted
+values between rows or columns (o

Re: Experiments with Postgres and SSL

2023-01-19 Thread Vladimir Sitnikov
It would be great if PostgreSQL supported 'start with TLS'; however, how
could clients activate the feature?

I would like to spare users from having to configure the handshake mode, and I
would like to avoid degrading performance when a new client talks to
an old database.

What if a server that supports 'fast TLS' added an extra notification in
case a client connects with classic TLS?
Then a capable client could remember the host:port and try the newer TLS
approach the next time it connects.

It would be transparent to clients, and users won't need to
configure 'prefer classic or fast TLS'.
The old clients could discard the notification.

Vladimir

-- 
Vladimir


Re: document the need to analyze partitioned tables

2023-01-19 Thread Bruce Momjian
On Thu, Jan 19, 2023 at 01:50:05PM +0100, Laurenz Albe wrote:
> On Wed, 2023-01-18 at 16:23 -0500, Bruce Momjian wrote:
> > Is it possible to document when partition table statistics helps?
> 
> I think it would be difficult to come up with an exhaustive list.

I was afraid of that.  I asked only because most people assume
autovacuum handles _all_ statistics needs, but this case is not handled.
Do people even have any statistics maintenance process anymore, and if
not, how would they know they need to run a manual ANALYZE?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-19 Thread Andres Freund
Hi,

On 2023-01-19 15:12:12 -0500, Robert Haas wrote:
> On Wed, Jan 18, 2023 at 1:31 PM Peter Geoghegan  wrote:
> > pgstat_report_analyze() will totally override the
> > tabentry->dead_tuples information that drives autovacuum.c, based on
> > an estimate derived from a random sample -- which seems to me to be an
> > approach that just doesn't have any sound theoretical basis.
> 
> In other words, ANALYZE sometimes (but not always) produces wrong answers.

For dead tuples, but not live tuples.


> On Wed, Jan 18, 2023 at 4:08 PM Andres Freund  wrote:
> > One complicating factor is that VACUUM sometimes computes an incrementally
> > more bogus n_live_tup when it skips pages due to the VM, whereas ANALYZE
> > computes something sane. I unintentionally encountered one when I was trying
> > something while writing this email, reproducer attached.
> 
> In other words, VACUUM sometimes (but not always) produces wrong answers.

For live tuples, but not badly so for dead tuples.


> TL;DR: We're screwed.

We are, but perhaps not too badly so, because we can choose to believe analyze
more for live tuples, and vacuum for dead tuples. Analyze doesn't compute
reltuples incrementally and vacuum doesn't compute deadtuples incrementally.



> I refuse to believe that any amount of math you can do on numbers that
> can be arbitrarily inaccurate will result in an accurate answer
> popping out the other end. Trying to update the reltuples estimate
> incrementally based on an estimate derived from a non-random,
> likely-to-be-skewed subset of the table is always going to produce
> distortion that gets worse and worse the more times you do it. If
> could say, well, the existing estimate of let's say 100 tuples per
> page is based on the density being 200 tuples per page in the pages I
> just scanned and 50 tuples per page in the rest of the table, then you
> could calculate a new estimate that keeps the value of 50 tuples per
> page for the remainder of the table intact and just replaces the
> estimate for the part you just scanned. But we have no way of doing
> that, so we just make some linear combination of the old estimate with
> the new one. That overweights the repeatedly-sampled portions of the
> table more and more, making the estimate wronger and wronger.

Perhaps we should, at least occasionally, make vacuum do a cheaper version of
analyze's sampling to compute an updated reltuples. This could even happen
during the heap scan phase.

I don't like relying on analyze to fix vacuum's bogus reltuples, because
there's nothing forcing an analyze run soon after vacuum [incrementally]
screwed it up. Vacuum can be forced to run a lot of times due to xid horizons
preventing cleanup, after which there isn't anything forcing analyze to run
again.

But in contrast to dead_tuples, where I think we can just stop analyze from
updating it unless we crashed recently, I do think we need to update reltuples
in vacuum. So computing an accurate value seems like the least unreasonable
thing I can see.

Greetings,

Andres Freund




  1   2   >