Re: Use the enum value CRS_EXPORT_SNAPSHOT instead of "0"

2023-06-15 Thread Peter Smith
On Fri, Jun 16, 2023 at 4:10 PM Wei Wang (Fujitsu)
 wrote:
>
> Hi,
>
> In the function WalReceiverMain, when the function walrcv_create_slot is 
> called,
> the fourth parameter is assigned the value "0" instead of the enum value
> "CRS_EXPORT_SNAPSHOT". I think it would be better to use the corresponding 
> enum
> value.
>

+1

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Use the enum value CRS_EXPORT_SNAPSHOT instead of "0"

2023-06-15 Thread Wei Wang (Fujitsu)
Hi,

In the function WalReceiverMain, when the function walrcv_create_slot is called,
the fourth parameter is assigned the value "0" instead of the enum value
"CRS_EXPORT_SNAPSHOT". I think it would be better to use the corresponding enum
value.
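
For reference, the change is roughly of this shape (a sketch only; the other
arguments of the call are elided):

-	walrcv_create_slot(wrconn, slotname, ..., 0, NULL);
+	walrcv_create_slot(wrconn, slotname, ..., CRS_EXPORT_SNAPSHOT, NULL);

Since CRS_EXPORT_SNAPSHOT is the first member of CRSSnapshotAction, the
behavior is unchanged; only readability improves.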

Attached is a patch to make this change.

Regards,
Wang wei


0001-Use-the-enum-value-CRS_EXPORT_SNAPSHOT-instead-of-0.patch
Description:  0001-Use-the-enum-value-CRS_EXPORT_SNAPSHOT-instead-of-0.patch


Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-15 Thread Nathan Bossart
On Thu, Jun 15, 2023 at 04:57:00PM -0700, Nathan Bossart wrote:
> Here's an attempt at adjusting the documentation as I proposed yesterday.
> I think this is a good opportunity to simplify the privilege-related
> sections for these maintenance commands.

I noticed that it was possible to make the documentation changes in 0001
easier to read.  Apologies for the noise.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From ab9f6433d50a357d2df00f01567880d8f587a44c Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 14 Jun 2023 10:54:05 -0700
Subject: [PATCH v3 1/2] partial revert of ff9618e82a

---
 doc/src/sgml/ref/analyze.sgml | 11 +--
 doc/src/sgml/ref/cluster.sgml |  9 +-
 doc/src/sgml/ref/lock.sgml| 11 ++-
 .../sgml/ref/refresh_materialized_view.sgml   |  6 ++--
 doc/src/sgml/ref/reindex.sgml | 27 +++--
 doc/src/sgml/ref/vacuum.sgml  | 11 +--
 doc/src/sgml/user-manag.sgml  |  3 +-
 src/backend/commands/cluster.c| 10 ++-
 src/backend/commands/indexcmds.c  | 17 ---
 src/backend/commands/lockcmds.c   |  8 -
 src/backend/commands/tablecmds.c  | 29 +--
 src/backend/commands/vacuum.c |  6 +---
 src/include/commands/tablecmds.h  |  1 -
 .../expected/cluster-conflict-partition.out   | 14 -
 .../specs/cluster-conflict-partition.spec |  7 +++--
 src/test/regress/expected/cluster.out |  3 +-
 src/test/regress/expected/vacuum.out  | 18 
 17 files changed, 61 insertions(+), 130 deletions(-)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 20c6f9939f..954491b5df 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -183,17 +183,8 @@ ANALYZE [ VERBOSE ] [ table_and_columns
To analyze a table, one must ordinarily have the MAINTAIN
-   privilege on the table or be the table's owner, a superuser, or a role with
-   privileges of the
-   pg_maintain
-   role.  However, database owners are allowed to
+   privilege on the table.  However, database owners are allowed to
analyze all tables in their databases, except shared catalogs.
-   (The restriction for shared catalogs means that a true database-wide
-   ANALYZE can only be performed by superusers and roles
-   with privileges of pg_maintain.)  If a role has
-   permission to ANALYZE a partitioned table, it is also
-   permitted to ANALYZE each of its partitions, regardless
-   of whether the role has the aforementioned privileges on the partition.
ANALYZE will skip over any tables that the calling user
does not have permission to analyze.
   
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 29f0f1fd90..06f3d269e6 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -134,14 +134,7 @@ CLUSTER [VERBOSE]
 

 To cluster a table, one must have the MAINTAIN privilege
-on the table or be the table's owner, a superuser, or a role with
-privileges of the
-pg_maintain
-role.  If a role has permission to CLUSTER a partitioned
-table, it is also permitted to CLUSTER each of its
-partitions, regardless of whether the role has the aforementioned
-privileges on the partition.  CLUSTER will skip over any
-tables that the calling user does not have permission to cluster.
+on the table.

 

diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index 5b3b2b793a..070855da18 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -166,10 +166,8 @@ LOCK [ TABLE ] [ ONLY ] name [ * ]
 

 To lock a table, the user must have the right privilege for the specified
-lockmode, or be the table's
-owner, a superuser, or a role with privileges of the pg_maintain
-role. If the user has MAINTAIN,
+lockmode.
+If the user has MAINTAIN,
 UPDATE, DELETE, or
 TRUNCATE privileges on the table, any lockmode is permitted. If the user has
@@ -177,10 +175,7 @@ LOCK [ TABLE ] [ ONLY ] name [ * ]
 MODE (or a less-conflicting mode as described in ) is permitted. If a user has
 SELECT privileges on the table, ACCESS SHARE
-MODE is permitted.  If a role has permission to lock a
-partitioned table, it is also permitted to lock each of its partitions,
-regardless of whether the role has the aforementioned privileges on the
-partition.
+MODE is permitted.

 

diff --git a/doc/src/sgml/ref/refresh_materialized_view.sgml b/doc/src/sgml/ref/refresh_materialized_view.sgml
index 4d79b6ae7f..19737668cd 100644
--- a/doc/src/sgml/ref/refresh_materialized_view.sgml
+++ b/doc/src/sgml/ref/refresh_materialized_view.sgml
@@ -31,10 +31,8 @@ REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] name
REFRESH MATERIALIZED VIEW completely replaces the
-   conte

Re: add non-option reordering to in-tree getopt_long

2023-06-15 Thread Nathan Bossart
On Fri, Jun 16, 2023 at 10:30:09AM +0900, Michael Paquier wrote:
> On Thu, Jun 15, 2023 at 05:09:59PM -0700, Nathan Bossart wrote:
>> I've attached a new version of the patch that omits the
>> POSIXLY_CORRECT stuff.
> 
> This looks OK at quick glance, though you may want to document at the
> top of getopt_long() the reordering business with the non-options?

I added a comment to this effect in v3.  I also noticed that '-' wasn't
properly handled as a non-option, so I've tried to fix that as well.
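
To illustrate the intended behaviour (a sketch based on the patch below, with
process_non_option() being a hypothetical helper): after getopt_long() returns
-1, the non-options have been shifted to the tail of argv and optind points at
the first of them, so a caller can keep its usual loop:

	while ((c = getopt_long(argc, argv, "m:", long_options, NULL)) != -1)
	{
		/* handle options exactly as before */
	}

	/* argv[optind .. argc-1] now holds the non-options, in their original order */
	for (int i = optind; i < argc; i++)
		process_non_option(argv[i]);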

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From e31fa0ab5237d3ad35bdb44404fd5b5eeea3f5c6 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 9 Jun 2023 15:51:58 -0700
Subject: [PATCH v3 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

---
 src/bin/scripts/t/040_createuser.pl | 10 ---
 src/port/getopt_long.c  | 45 +
 2 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..ba40ab11a3 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,9 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5',
+		'-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +73,13 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails([ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..da233728e1 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,11 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ * (Internally, this means we must be sure to reset "place" to EMSG and
+ * nonopt_start to -1 before returning -1.)
+ *
+ * Note that this routine reorders the pointers in argv (despite the const
+ * qualifier) so that all non-options will be at the end when -1 is returned.
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,37 +63,55 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
 
 	if (!*place)
 	{			/* update scanning pointer */
 		if (optind >= argc)
 		{
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
+retry:
 		place = argv[optind];
 
-		if (place[0] != '-')
+		if (place[0] != '-' || place[1] == '\0')
 		{
-			place = EMSG;
-			return -1;
-		}
+			char	  **args = (char **) argv;
 
-		place++;
+			/*
+			 * If only non-options remain, return -1.  Else, move the
+			 * non-option to the end and try again.
+			 */
+			if (optind == nonopt_start)
+			{
+place = EMSG;
+nonopt_start = -1;
+return -1;
+			}
 
-		if (!*place)
-		{
-			/* treat "-" as not being an option */
-			place = EMSG;
-			return -1;
+			for (int i = optind; i < argc - 1; i++)
+args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+nonopt_start = argc - 1;
+			else
+nonopt_start--;
+
+			goto retry;
 		}
 
+		place++;
+
 		if (place[0] == '-' && place[1] == '\0')
 		{
 			/* found "--", treat it as end of options */
 			++optind;
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
-- 
2.25.1



Re: Add a perl function in Cluster.pm to generate WAL

2023-06-15 Thread Kyotaro Horiguchi
Thanks for the comments.

At Fri, 16 Jun 2023 11:30:15 +0900, Michael Paquier wrote:
> > -$node_primary->safe_psql(
> > -'postgres', "create table retain_test(a int);
> > - select pg_switch_wal();
> > - insert into retain_test values(1);
> > - checkpoint;");
> > +$node_primary->advance_wal(1);
> > +$node_primary->safe_psql('postgres', "checkpoint;");
> > 
> > The original test generated some WAL after the segment switch, which
> > appears to be a significant characteristic of the test.
> 
> Still it does not matter for this specific case?  The logical slot has
> been already invalidated, so we don't care much about logical changes
> in WAL, do we?

The change itself doesn't seem to matter, but it seems intended to let
the checkpoint trigger the removal of the last segment. However, I'm
unsure how the insert would influence that. If my understanding is
correct, then I'd support its removal.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Support to define custom wait events for extensions

2023-06-15 Thread Masahiro Ikeda

On 2023-06-16 01:13, Tristan Partin wrote:

We had this on our list of things to do at Neon, so it is a nice
surprise that you brought up an initial patchset :). It was also my
first time looking up the word tranche.


What a coincidence! I came up with the idea when I used Neon with
postgres_fdw. As a Neon user, I also feel the feature is important.

Same as you. Thanks to Michael and Drouvot, I got to know the word tranche
and the related existing code.


From 59a118402e5e59685fb9e0fb086872e25a405736 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda 
Date: Thu, 15 Jun 2023 12:57:29 +0900
Subject: [PATCH 2/3] Support to define custom wait events for 
extensions.



Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify bottlenecks.


"extensions are installed" should be "extensions installed".


+#define NUM_BUILDIN_WAIT_EVENT_EXTENSION   \
+   (WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)


Should that be NUM_BUILTIN_WAIT_EVENT_EXTENSION?


Thanks for your comments.
Yes, I'll fix it.

+   NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *)
+       MemoryContextAlloc(TopMemoryContext,
+                          NamedExtensionWaitEventTrancheRequestsAllocated
+                          * sizeof(NamedExtensionWaitEventTrancheRequest));


I can't tell from reading other Postgres code when one should cast the
return value of MemoryContextAlloc(). Seems unnecessary to me.


I referenced RequestNamedLWLockTranche() and it looks ok to me.

```
void
RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
NamedLWLockTrancheRequestArray = (NamedLWLockTrancheRequest *)
    MemoryContextAlloc(TopMemoryContext,
                       NamedLWLockTrancheRequestsAllocated
                       * sizeof(NamedLWLockTrancheRequest));
```


+   if (NamedExtensionWaitEventTrancheRequestArray == NULL)
+   {
+       NamedExtensionWaitEventTrancheRequestsAllocated = 16;
+       NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *)
+           MemoryContextAlloc(TopMemoryContext,
+                              NamedExtensionWaitEventTrancheRequestsAllocated
+                              * sizeof(NamedExtensionWaitEventTrancheRequest));
+   }
+
+   if (NamedExtensionWaitEventTrancheRequests >= NamedExtensionWaitEventTrancheRequestsAllocated)
+   {
+       int i = pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1);
+
+       NamedExtensionWaitEventTrancheRequestArray = (NamedExtensionWaitEventTrancheRequest *)
+           repalloc(NamedExtensionWaitEventTrancheRequestArray,
+                    i * sizeof(NamedExtensionWaitEventTrancheRequest));
+       NamedExtensionWaitEventTrancheRequestsAllocated = i;
+   }


Do you think this code would look better in an if/else if?


Same as above. I referenced RequestNamedLWLockTranche().
I don't know if it's a good idea, but it's better to refactor the
existing code separately from this patch.

But I plan to remove the code to focus implementing dynamic allocation
similar to LWLockNewTrancheId() and LWLockRegisterTranche() as
Michael's suggestion. I think it's good idea as a first step. Is it ok 
for you?


+   int i = pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1);


In the Postgres codebase, is an int always guaranteed to be at least 32
bits? I feel like a fixed-width type would be better for tracking the
length of the array, unless Postgres prefers the Size type.


Same as above.


+   Assert(strlen(tranche_name) + 1 <= NAMEDATALEN);
+   strlcpy(request->tranche_name, tranche_name, NAMEDATALEN);


A sizeof(request->tranche_name) would keep this code more in sync if
the size of tranche_name were ever to change, though I see sizeof
expressions in the codebase are not too common. Maybe just remove the +1
and make it a less-than rather than a less-than-or-equal? Seems like it
might be worth noting in the docs of the function that the event name
has to be less than NAMEDATALEN, but maybe this is something extension
authors are inherently aware of?


Same as above.
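
For reference, the sizeof-based variant suggested above would look roughly
like this (just a sketch; the names are taken from the quoted patch):

```
	Assert(strlen(tranche_name) < sizeof(request->tranche_name));
	strlcpy(request->tranche_name, tranche_name, sizeof(request->tranche_name));
```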


---

What's the Postgres policy on the following?

for (int i = 0; ...)
for (i = 0; ...)
You are using 2 different patterns in WaitEventShmemInit() and
InitializeExtensionWaitEventTranches().


I didn't pay attention to that; I'll unify them.
Michael's reply is interesting.


+   /*
+* Copy the info about any named tranches into shared memory

Re: Add a perl function in Cluster.pm to generate WAL

2023-06-15 Thread Michael Paquier
On Thu, Jun 15, 2023 at 01:40:15PM +0900, Kyotaro Horiguchi wrote:
> + "CREATE TABLE tt (); DROP TABLE tt; SELECT 
> pg_switch_wal();");
> 
> At least since 11, we can utilize pg_logical_emit_message() for this
> purpose. It's more lightweight and seems appropriate, not only because
> it doesn't cause any side effects but also because we don't have to
> worry about name conflicts.

Making this as cheap as possible by design is a good concept for a
common routine.  +1.

> -  SELECT 'finished';",
> - timeout => $PostgreSQL::Test::Utils::timeout_default));
> -is($result[1], 'finished', 'check if checkpoint command is not blocked');
> -
> +$node_primary2->advance_wal(1);
> +$node_primary2->safe_psql('postgres', 'CHECKPOINT;');
> 
> This test anticipates that the checkpoint could get blocked. Shouldn't
> we keep the timeout?

Indeed, this would partially invalidate what's getting tested in light
of 1816a1c6 where we run a secondary command after the checkpoint.  So
the last SELECT should remain around.

> -$node_primary->safe_psql(
> -'postgres', "create table retain_test(a int);
> - select pg_switch_wal();
> - insert into retain_test values(1);
> - checkpoint;");
> +$node_primary->advance_wal(1);
> +$node_primary->safe_psql('postgres', "checkpoint;");
> 
> The original test generated some WAL after the segment switch, which
> appears to be a significant characteristic of the test.

Still it does not matter for this specific case?  The logical slot has
been already invalidated, so we don't care much about logical changes
in WAL, do we?
--
Michael


signature.asc
Description: PGP signature


Re: Allow pg_archivecleanup to remove backup history files

2023-06-15 Thread Kyotaro Horiguchi
At Fri, 16 Jun 2023 11:22:31 +0900 (JST), Kyotaro Horiguchi wrote:
> AFAICS the main section of the "pg_rewind --help" output fits within 80
> columns. However, "initdb --help" does output a few lines exceeding
> the 80-column limit. Judging by the surrounding lines, I believe we're
> still aiming to keep these within the given width. I think we need to fix
> initdb in that regard.

Mmm, the message was introduced in 2012 by 8a02339e9b. I haven't
noticed this until now...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Allow pg_archivecleanup to remove backup history files

2023-06-15 Thread Kyotaro Horiguchi
At Thu, 15 Jun 2023 21:38:28 +0900, torikoshia wrote:
> On 2023-06-15 15:20, Kyotaro Horiguchi wrote:
> Thanks for your review!
> > + printf(_(" -x, --strip-extension=EXT strip this extention before
> > identifying files fo clean up\n"));
> > + printf(_(" -?, --help show this help, then exit\n"));
> > After this change, some of these lines cross the boundary of the 80-column
> > width.  (Is that policy viable nowadays? I am usually working
> > using terminal windows with such a width..) It's somewhat unrelated to
> > this patch, but a help line a few lines further down also exceeds the
> > width. We could shorten it by removing the "/mnt/server" portion, but
> > I'm not sure if it's worth doing.
> 
> I also highlight 80th column according to the wiki[1].
> Since usage() in other files like pg_rewind.c and initdb.c also
> exceeded the 80th column, I thought that was something like a guide.

I think the page is about program code, not the messages
that binaries print.

AFAICS the main section of the "pg_rewind --help" output fits within 80
columns. However, "initdb --help" does output a few lines exceeding
the 80-column limit. Judging by the surrounding lines, I believe we're
still aiming to keep these within the given width. I think we need to fix
initdb in that regard.

> [1]
> https://wiki.postgresql.org/wiki/Configuring_vim_for_postgres_development

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Support to define custom wait events for extensions

2023-06-15 Thread Masahiro Ikeda

On 2023-06-15 22:21, Drouvot, Bertrand wrote:

Hi,

On 6/15/23 10:00 AM, Michael Paquier wrote:

On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:

Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify which
extension is the bottleneck.


Thanks for taking the time to implement a patch to do that.


+1 thanks for it!




I want to know your feedbacks. Please feel free to comment.


I think that's been cruelly missed.


Yeah, that would clearly help to diagnose which extension(s) is/are
causing the waits (if any).

I did not look at the code yet (will do) but just wanted to chime in
to support the idea.


Great! Thanks.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: [PATCH] Slight improvement of worker_spi.c example

2023-06-15 Thread Julien Rouhaud
On Wed, Jun 14, 2023 at 02:08:03PM +0300, Aleksander Alekseev wrote:
>
> Unfortunately I'm not familiar with the problem in respect of naptime
> Julien is referring to. If you know what this problem is and how to
> fix it, go for it. I'll review and test the code then. I can write the
> part of the patch that fixes the part regarding dynamic workers if
> necessary.

Oh, sorry I thought it was somewhat evident.

The naptime GUC description says:

> Duration between each check (in seconds).

and the associated code does a single

WaitLatch(..., WL_LATCH_SET | WL_TIMEOUT, ...)

So unless I'm missing something, nothing prevents the check from being run way
more often than expected if the latch keeps being set.

Similarly, my understanding of "duration between checks" is that a naptime of 1
min means that the checks should be run a minute apart, assuming that's possible.
As is, the checks are run naptime + query execution time apart, which doesn't
seem right.  Obviously there isn't much you can do if the query execution
lasts for more than naptime, apart from detecting it and raising a warning to
let users know that their configuration isn't adequate (or that there's some
other problem like lock contention or something), similarly to e.g.
checkpoint_warning.
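
For the record, a loop shaped roughly like this would run the check at most
once per naptime even if the latch keeps getting set (just a sketch I haven't
tested against worker_spi; run_check() is a hypothetical stand-in for the
worker's query):

	TimestampTz	next_check = GetCurrentTimestamp();

	for (;;)
	{
		long		timeout_ms;

		CHECK_FOR_INTERRUPTS();

		timeout_ms = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
													 next_check);
		if (timeout_ms <= 0)
		{
			run_check();
			next_check = TimestampTzPlusMilliseconds(GetCurrentTimestamp(),
													 worker_spi_naptime * 1000L);
			continue;
		}

		(void) WaitLatch(MyLatch,
						 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
						 timeout_ms, PG_WAIT_EXTENSION);
		ResetLatch(MyLatch);
	}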

Note I haven't looked closely at this module otherwise, so I can't say whether
there are other problems lurking.




Re: Support to define custom wait events for extensions

2023-06-15 Thread Masahiro Ikeda

Thanks for replying and your kind advice!

On 2023-06-15 17:00, Michael Paquier wrote:

On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:
In the core, the requested wait events are dynamically registered in shared
memory. The extension then obtains the wait event information with
GetNamedExtensionWaitEventTranche() and uses the value to notify the core
that it is waiting.

When a string representation of the wait event is requested,
it returns the name defined by calling
RequestNamedExtensionWaitEventTranche().


So this implements the equivalent of RequestNamedLWLockTranche()
followed by GetNamedLWLockTranche() to get the wait event number,
which can be used only during postmaster startup.  Do you think that
we could focus on implementing something more flexible instead, that
can be used dynamically as well as statically?  That would be similar
to LWLockNewTrancheId() and LWLockRegisterTranche(), actually, where
we would get one or more tranche IDs, then do initialization actions
in shmem based on the tranche ID(s).


OK, I agree. I'll make a patch to only support
ExtensionWaitEventNewTrancheId() and ExtensionWaitEventRegisterTranche()
similar to LWLockNewTrancheId() and LWLockRegisterTranche().
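
For reference, the existing LWLock pattern that the new pair would mirror is
roughly the following (this is the current LWLock API; the signatures of the
wait-event functions themselves are still an assumption at this point):

```
	/* typically in the extension's shared-memory initialization code */
	int		tranche_id;

	tranche_id = LWLockNewTrancheId();
	LWLockRegisterTranche(tranche_id, "my_extension");
	/* tranche_id can then be passed to LWLockInitialize() for locks in shmem */
```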


4. check the custom wait event
You can see the following results of psql-1.

            query            | wait_event_type |    wait_event
-----------------------------+-----------------+--------------------
 SELECT inject_wait_event(); | Extension       | custom_wait_event   # <- requested wait event by the extension!
(1 row)

(..snip..)


A problem with this approach is that it is expensive as a test.  Do we
really need one?  There are three places that set PG_WAIT_EXTENSION in
src/test/modules/, more in /contrib, and there are modules like
pg_stat_statements that could gain from events for I/O operations, for
example.


Yes. Since it's hard to test, I thought the PoC extension
should not be committed. But I haven't figured out the best
way to test it yet.


# TODOs

* tests on windows (since I tested on Ubuntu 20.04 only)
* add custom wait events for existing contrib modules (ex. postgres_fdw)

* add regression code (but, it seems to be difficult)
* others? (Please let me know)


Hmm.  You would need to maintain a state in a rather stable manner,
and SQL queries can make that difficult in the TAP tests as the wait
event information is reset each time a query finishes.  One area where
I think this gets easier is with a background worker loaded with
shared_preload_libraries that has a configurable naptime.  Looking at
what's available in the tree, the TAP tests of pg_prewarm could use
one test on pg_stat_activity with a custom wait event name
(pg_prewarm.autoprewarm_interval is 0 hence the bgworker waits
infinitely).  Note that in this case, you would need to be careful of
the case where the wait event is loaded dynamically, but like LWLocks
this should be able to work as well?


Thanks for your advice!

I tried querying pg_stat_activity to check the background worker
invoked by pg_prewarm, but I found that pg_stat_activity doesn't show
it, although I may be missing something...

So I tried to implement TAP tests, but I have a problem with them:
I couldn't find a way to check the status of another backend
while that backend is waiting on the custom wait event.

```
# TAP test I've implemented.

# wait forever with custom wait events in session1
$session1->query_safe("SELECT test_custom_wait_events_wait()");

# I want to check the wait event from another backend process
# But, the following code is never reached because the above
# query is waiting forever.
$session2->poll_query_until('postgres',
qq[SELECT
(SELECT count(*) FROM pg_stat_activity
WHERE query ~ '^SELECT test_custom_wait_events_wait'
  AND wait_event_type = 'Extension'
  AND wait_event = 'custom_wait_event'
) > 0;]);
```

If I'm missing something or you have any idea,
please let me know.

Now, I plan to:

* look further into the existing tests that check wait events and locks
  (I have already checked a little, but couldn't find anything suitable)
* find another way to check the wait event of a background worker invoked
  by an extension
* look up the reason why pg_stat_activity doesn't show the background
  worker
* find a way to run asynchronous queries in TAP tests

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: Non-superuser subscription owners

2023-06-15 Thread Amit Kapila
On Thu, Jun 15, 2023 at 11:18 PM Alvaro Herrera  wrote:
>
> On 2023-Jun-13, Amit Kapila wrote:
>
> > I'll push this tomorrow unless there are any suggestions or comments.
>
> Note the proposed commit message is wrong about which commit is to blame
> for the original problem -- it mentions e7e7da2f8d57 twice, but one of
> them is actually c3afe8cf5a1e.
>

Right, I also noticed this and changed it before pushing, See
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b5c517379a40fa1af84c0852aa3730a5875a6482

-- 
With Regards,
Amit Kapila.




Re: Consistent coding for the naming of LR workers

2023-06-15 Thread Kyotaro Horiguchi
At Thu, 15 Jun 2023 12:42:33 +1000, Peter Smith wrote:
> It is better to have a *single* point where these worker names are
> defined, so then all output uses identical LR worker nomenclature.
> 
> PSA a small patch to modify the code accordingly. This is not intended
> to be a functional change - just a code cleanup.
> 
> Thoughts?

I generally like this direction when it actually decreases the number
of translatable messages without making grepping on the tree
excessively difficult. However, in this case, the patch doesn't seem
to reduce the translatable messages; instead, it makes grepping
difficult.

In addition to that, I'm inclined to concur with Alvaro regarding the
grammatical aspect.

Consequently, I'd prefer to leave these alone.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: subscription/033_run_as_table_owner is not listed in the meson.build

2023-06-15 Thread Masahiko Sawada
On Fri, Jun 16, 2023 at 7:42 AM Michael Paquier  wrote:
>
> On Fri, Jun 16, 2023 at 07:15:36AM +0900, Masahiko Sawada wrote:
> > On Thu, Jun 15, 2023 at 5:32 PM Masahiko Sawada  
> > wrote:
> >> Checking similar oversights,
> >> src/bin/pg_basebackup/t/011_in_place_tablespace.pl seems not to be
> >> listed in meson.build too.
> >
> > Here is the patch for that.
>
> Yes, good catch.

Pushed.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Fix a typo in rewriteHandler.c

2023-06-15 Thread Amit Langote
On Fri, Jun 16, 2023 at 10:25 AM Amit Langote  wrote:
> On Thu, Jun 15, 2023 at 5:07 PM Sho Kato (Fujitsu)  
> wrote:
> > I've attached the patch for the following rewriteTargetView comments.
> >
> > s/rewriteQuery/RewriteQuery
>
> Good catch and thanks for the patch.  Will push shortly.

Done.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: add non-option reordering to in-tree getopt_long

2023-06-15 Thread Michael Paquier
On Thu, Jun 15, 2023 at 05:09:59PM -0700, Nathan Bossart wrote:
> On Thu, Jun 15, 2023 at 02:30:34PM +0900, Kyotaro Horiguchi wrote:
>> Hmm, the discussion seems to be based on the assumption that argv[0]
>> can be safely redirected to a different memory location. If that's the
>> case, we can probably rearrange the array, even if there's a small
>> window where ps might display a confusing command line, right?
> 
> If that's the extent of the breakage, then it seems alright to me.

Okay by me to live with this burden.

> I've attached a new version of the patch that omits the
> POSIXLY_CORRECT stuff.

This looks OK at quick glance, though you may want to document at the
top of getopt_long() the reordering business with the non-options?
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Add loongarch native checksum implementation.

2023-06-15 Thread YANG Xudong

Updated the patch based on the comments.

On 2023/6/15 18:30, John Naylor wrote:


On Wed, Jun 14, 2023 at 9:20 AM YANG Xudong > wrote:

 >
 > Attached a new patch with fixes based on the comment below.

Note: It's helpful to pass "-v" to git format-patch, to have different 
versions.




Added v2


 > > For x86 and Arm, if it fails to link without an -march flag, we allow
 > > for a runtime check. The flags "-march=armv8-a+crc" and "-msse4.2" are
 > > for instructions not found on all platforms. The patch also checks both
 > > ways, and each one results in "Use LoongArch CRC instruction
 > > unconditionally". The -march flag here is general, not specific. In
 > > other words, if this only runs inside "+elif host_cpu == 
'loongarch64'",

 > > why do we need both with -march and without?
 > >
 >
 > Removed the elif branch.

Okay, since we've confirmed that no arch flag is necessary, some other 
places can be simplified:


--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -98,6 +98,11 @@ pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
  pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
  pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)

+# all versions of pg_crc32c_loongarch.o need CFLAGS_CRC
+pg_crc32c_loongarch.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_loongarch_shlib.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_loongarch_srv.o: CFLAGS+=$(CFLAGS_CRC)

This was copy-and-pasted from platforms that use a runtime check, so 
should be unnecessary.




Removed these lines.


+# If the intrinsics are supported, sets pgac_loongarch_crc32c_intrinsics,
+# and CFLAGS_CRC.

+# Check if __builtin_loongarch_crcc_* intrinsics can be used
+# with the default compiler flags.
+# CFLAGS_CRC is set if the extra flag is required.

Same here -- it seems we don't need to set CFLAGS_CRC at all. Can you 
confirm?




We don't need to set CFLAGS_CRC as commented. I have updated the
configure script to align it with the logic in the meson build script.



 > > Also, I don't have a Loongarch machine for testing. Could you show that
 > > the instructions are found in the binary, maybe using objdump and grep?
 > > Or a performance test?
 > >
 >
 > The output of the objdump command `objdump -dS
 > ../postgres-build/tmp_install/usr/local/pgsql/bin/postgres  | grep -B 30
 > -A 10 crcc` is attached.

Thanks for confirming.

--
John Naylor
EDB: http://www.enterprisedb.com
>From b2329478b5331e2aa9942c7ee4e23e3bfa871c1d Mon Sep 17 00:00:00 2001
From: YANG Xudong 
Date: Fri, 16 Jun 2023 09:22:08 +0800
Subject: [PATCH v2] Add loongarch native checksum implementation.

---
 config/c-compiler.m4   | 34 
 configure  | 73 ++
 configure.ac   | 33 +++
 meson.build| 24 +++
 src/include/pg_config.h.in |  3 ++
 src/include/port/pg_crc32c.h   |  9 +
 src/port/meson.build   |  3 ++
 src/port/pg_crc32c_loongarch.c | 72 +
 8 files changed, 236 insertions(+), 15 deletions(-)
 create mode 100644 src/port/pg_crc32c_loongarch.c

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 5be8f0f08d..eb3af009c4 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -661,3 +661,37 @@ if test x"$Ac_cachevar" = x"yes"; then
 fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_ARMV8_CRC32C_INTRINSICS
+
+# PGAC_LOONGARCH_CRC32C_INTRINSICS
+# ---
+# Check if the compiler supports the LoongArch CRCC instructions, using
+# __builtin_loongarch_crcc_w_b_w, __builtin_loongarch_crcc_w_h_w,
+# __builtin_loongarch_crcc_w_w_w and __builtin_loongarch_crcc_w_d_w
+# intrinsic functions.
+#
+# If the intrinsics are supported, sets pgac_loongarch_crc32c_intrinsics,
+# and CFLAGS_CRC.
+AC_DEFUN([PGAC_LOONGARCH_CRC32C_INTRINSICS],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_loongarch_crc32c_intrinsics_$1])])dnl
+AC_CACHE_CHECK(
+  [for __builtin_loongarch_crcc_w_b_w, __builtin_loongarch_crcc_w_h_w, 
__builtin_loongarch_crcc_w_w_w and __builtin_loongarch_crcc_w_d_w with 
CFLAGS=$1],
+  [Ac_cachevar],
+[pgac_save_CFLAGS=$CFLAGS
+CFLAGS="$pgac_save_CFLAGS $1"
+AC_LINK_IFELSE([AC_LANG_PROGRAM([],
+  [unsigned int crc = 0;
+   crc = __builtin_loongarch_crcc_w_b_w(0, crc);
+   crc = __builtin_loongarch_crcc_w_h_w(0, crc);
+   crc = __builtin_loongarch_crcc_w_w_w(0, crc);
+   crc = __builtin_loongarch_crcc_w_d_w(0, crc);
+   /* return computed value, to prevent the above being optimized away */
+   return crc == 0;])],
+  [Ac_cachevar=yes],
+  [Ac_cachevar=no])
+CFLAGS="$pgac_save_CFLAGS"])
+if test x"$Ac_cachevar" = x"yes"; then
+  CFLAGS_CRC="$1"
+  pgac_loongarch_crc32c_intrinsics=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_LOONGARCH_CRC32C_INTRINSICS
diff --git a/configure b/configure
index 1b415142d1..7c60a26c11 100755
--- a/configure
+++ b/configure
@@ -18114,6 +18114,51 @@ fi
 
 fi
 
+# Check for LoongArch CRC intrinsics to do CRC calculations.
+#
+

Re: Fix a typo in rewriteHandler.c

2023-06-15 Thread Amit Langote
Hello,

On Thu, Jun 15, 2023 at 5:07 PM Sho Kato (Fujitsu)  wrote:
>
> Hi,
>
> I've attached the patch for the following rewriteTargetView comments.
>
>
>   Assert(parsetree->resultRelation == new_rt_index);
>
> /*
>  * For INSERT/UPDATE we must also update resnos in the targetlist to refer
>  * to columns of the base relation, since those indicate the target
>  * columns to be affected.
>  *
>  * Note that this destroys the resno ordering of the targetlist, but that
>  * will be fixed when we recurse through rewriteQuery, which will invoke
>  * rewriteTargetListIU again on the updated targetlist.
>  */
> if (parsetree->commandType != CMD_DELETE)
> {
> foreach(lc, parsetree->targetList)
>
> s/rewriteQuery/RewriteQuery

Good catch and thanks for the patch.  Will push shortly.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 11:44, jian he wrote:
> In hashset/test/sql/order.sql, can we add the following to test whether 
> the optimizer will use our index.
>
> CREATE INDEX  ON test_int4hashset_order (int4hashset_col  
> int4hashset_btree_ops);
>
> -- to make sure that this work with just two rows
> SET enable_seqscan TO off;  
>
> explain(costs off) SELECT * FROM test_int4hashset_order WHERE 
> int4hashset_col = '{1,2}'::int4hashset;
> reset enable_seqscan;

Not sure I can see the value of that test,
since we've already tested the comparison functions,
which are used by the int4hashset_btree_ops operator class.

I think a test that verifies the btree index is actually used
would be more a test of the query planner than of hashset.

I might be missing something here, please tell me if so.

> Since most contrib modules have only one test file per module, maybe we
> need to consolidate all the test SQL files into one file
> (int4hashset.sql)?

I've also made the same observation; I wonder if it's by design
or by coincidence? I think multiple test files improve modularity,
isolation and the overall organisation of the testing.

As long as we are developing in the pre-release phase,
I think rigorous testing is beneficial and affordable.

However, if hashset were ever considered
for core inclusion, then we should consolidate all tests into
one file and retain only essential tests, thereby minimizing
the impact on PostgreSQL's overall test suite runtime,
where every millisecond matters.

> Attached is a patch that slightly modifies README.md. Feel free to change it,
> since I am not a native English speaker...
>
> Attachments:
> * 0001-add-instruction-using-PG_CONFIG-to-install-extension.patch

Thanks, improvements incorporated with some minor changes.

/Joel

hashset-0.0.1-61d572a.patch
Description: Binary data


Re: add non-option reordering to in-tree getopt_long

2023-06-15 Thread Nathan Bossart
On Thu, Jun 15, 2023 at 02:30:34PM +0900, Kyotaro Horiguchi wrote:
> At Wed, 14 Jun 2023 15:46:08 -0700, Nathan Bossart wrote:
>> Hm.  IIUC modifying the argv pointers on AIX will modify the process title,
>> which could cause 'ps' to temporarily show duplicate/missing arguments
>> during option parsing.  That doesn't seem too terrible, but if pointer
>> assignments aren't atomic, maybe 'ps' could be sent off to another part of
>> memory, which does seem terrible.
> 
> Hmm, the discussion seems to be based on the assumption that argv[0]
> can be safely redirected to a different memory location. If that's the
> case, we can probably rearrange the array, even if there's a small
> window where ps might display a confusing command line, right?

If that's the extent of the breakage, then it seems alright to me.  I've
attached a new version of the patch that omits the POSIXLY_CORRECT stuff.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 89ec098d2515d7cf03b630b787e8f53ca25916b9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 9 Jun 2023 15:51:58 -0700
Subject: [PATCH v2 1/1] Teach in-tree getopt_long() to move non-options to the
 end of argv.

---
 src/bin/scripts/t/040_createuser.pl | 10 +
 src/port/getopt_long.c  | 34 +
 2 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/src/bin/scripts/t/040_createuser.pl b/src/bin/scripts/t/040_createuser.pl
index 9ca282181d..ba40ab11a3 100644
--- a/src/bin/scripts/t/040_createuser.pl
+++ b/src/bin/scripts/t/040_createuser.pl
@@ -42,9 +42,9 @@ $node->issues_sql_like(
 	'add a role as a member with admin option of the newly created role');
 $node->issues_sql_like(
 	[
-		'createuser', '-m',
-		'regress_user3', '-m',
-		'regress user #4', 'REGRESS_USER5'
+		'createuser', 'REGRESS_USER5',
+		'-m', 'regress_user3',
+		'-m', 'regress user #4'
 	],
 	qr/statement: CREATE ROLE "REGRESS_USER5" NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS ROLE regress_user3,"regress user #4";/,
 	'add a role as a member of the newly created role');
@@ -73,11 +73,13 @@ $node->issues_sql_like(
 	qr/statement: CREATE ROLE regress_user11 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--role');
 $node->issues_sql_like(
-	[ 'createuser', '--member-of', 'regress_user1', 'regress_user12' ],
+	[ 'createuser', 'regress_user12', '--member-of', 'regress_user1' ],
 	qr/statement: CREATE ROLE regress_user12 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS IN ROLE regress_user1;/,
 	'--member-of');
 
 $node->command_fails([ 'createuser', 'regress_user1' ],
 	'fails if role already exists');
+$node->command_fails([ 'createuser', 'regress_user1', '-m', 'regress_user2', 'regress_user3' ],
+	'fails for too many non-options');
 
 done_testing();
diff --git a/src/port/getopt_long.c b/src/port/getopt_long.c
index c989276988..4bbd8e0b85 100644
--- a/src/port/getopt_long.c
+++ b/src/port/getopt_long.c
@@ -50,8 +50,8 @@
  * This implementation does not use optreset.  Instead, we guarantee that
  * it can be restarted on a new argv array after a previous call returned -1,
  * if the caller resets optind to 1 before the first call of the new series.
- * (Internally, this means we must be sure to reset "place" to EMSG before
- * returning -1.)
+ * (Internally, this means we must be sure to reset "place" to EMSG and
+ * nonopt_start to -1 before returning -1.)
  */
 int
 getopt_long(int argc, char *const argv[],
@@ -60,21 +60,45 @@ getopt_long(int argc, char *const argv[],
 {
 	static char *place = EMSG;	/* option letter processing */
 	char	   *oli;			/* option letter list index */
+	static int	nonopt_start = -1;
 
 	if (!*place)
 	{			/* update scanning pointer */
 		if (optind >= argc)
 		{
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
+retry:
 		place = argv[optind];
 
 		if (place[0] != '-')
 		{
-			place = EMSG;
-			return -1;
+			char	  **args = (char **) argv;
+
+			/*
+			 * If only non-options remain, return -1.  Else, move the
+			 * non-option to the end and try again.
+			 */
+			if (optind == nonopt_start)
+			{
+place = EMSG;
+nonopt_start = -1;
+return -1;
+			}
+
+			for (int i = optind; i < argc - 1; i++)
+args[i] = args[i + 1];
+			args[argc - 1] = place;
+
+			if (nonopt_start == -1)
+nonopt_start = argc - 1;
+			else
+nonopt_start--;
+
+			goto retry;
 		}
 
 		place++;
@@ -83,6 +107,7 @@ getopt_long(int argc, char *const argv[],
 		{
 			/* treat "-" as not being an option */
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
@@ -91,6 +116,7 @@ getopt_long(int argc, char *const argv[],
 			/* found "--", treat it as end of options */
 			++optind;
 			place = EMSG;
+			nonopt_start = -1;
 			return -1;
 		}
 
-- 
2.25.1



Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-15 Thread Nathan Bossart
On Wed, Jun 14, 2023 at 09:10:44PM -0700, Nathan Bossart wrote:
> IMO
> we should update the docs and leave out the ownership checks since MAINTAIN
> is now a grantable privilege like any other.  WDYT?

Here's an attempt at adjusting the documentation as I proposed yesterday.
I think this is a good opportunity to simplify the privilege-related
sections for these maintenance commands.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 70f76001891413d85ac97f54b787c2e1b508 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 14 Jun 2023 10:54:05 -0700
Subject: [PATCH v2 1/2] partial revert of ff9618e82a

---
 doc/src/sgml/ref/analyze.sgml | 13 ++---
 doc/src/sgml/ref/cluster.sgml |  9 +-
 doc/src/sgml/ref/lock.sgml| 24 ++-
 .../sgml/ref/refresh_materialized_view.sgml   | 17 +--
 doc/src/sgml/ref/reindex.sgml | 28 +++---
 doc/src/sgml/ref/vacuum.sgml  | 13 ++---
 doc/src/sgml/user-manag.sgml  |  3 +-
 src/backend/commands/cluster.c| 10 ++-
 src/backend/commands/indexcmds.c  | 17 ---
 src/backend/commands/lockcmds.c   |  8 -
 src/backend/commands/tablecmds.c  | 29 +--
 src/backend/commands/vacuum.c |  6 +---
 src/include/commands/tablecmds.h  |  1 -
 .../expected/cluster-conflict-partition.out   | 14 -
 .../specs/cluster-conflict-partition.spec |  7 +++--
 src/test/regress/expected/cluster.out |  3 +-
 src/test/regress/expected/vacuum.out  | 18 
 17 files changed, 74 insertions(+), 146 deletions(-)

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 20c6f9939f..ecc7c884b4 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -183,17 +183,8 @@ ANALYZE [ VERBOSE ] [ table_and_columns
To analyze a table, one must ordinarily have the MAINTAIN
-   privilege on the table or be the table's owner, a superuser, or a role with
-   privileges of the
-   pg_maintain
-   role.  However, database owners are allowed to
-   analyze all tables in their databases, except shared catalogs.
-   (The restriction for shared catalogs means that a true database-wide
-   ANALYZE can only be performed by superusers and roles
-   with privileges of pg_maintain.)  If a role has
-   permission to ANALYZE a partitioned table, it is also
-   permitted to ANALYZE each of its partitions, regardless
-   of whether the role has the aforementioned privileges on the partition.
+   privilege on the table.  However, database owners are allowed to analyze all
+   tables in their databases, except shared catalogs.
ANALYZE will skip over any tables that the calling user
does not have permission to analyze.
   
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index 29f0f1fd90..06f3d269e6 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -134,14 +134,7 @@ CLUSTER [VERBOSE]
 

 To cluster a table, one must have the MAINTAIN privilege
-on the table or be the table's owner, a superuser, or a role with
-privileges of the
-pg_maintain
-role.  If a role has permission to CLUSTER a partitioned
-table, it is also permitted to CLUSTER each of its
-partitions, regardless of whether the role has the aforementioned
-privileges on the partition.  CLUSTER will skip over any
-tables that the calling user does not have permission to cluster.
+on the table.

 

diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index 5b3b2b793a..d22c6f8384 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -166,21 +166,15 @@ LOCK [ TABLE ] [ ONLY ] name [ * ]
 

 To lock a table, the user must have the right privilege for the specified
-lockmode, or be the table's
-owner, a superuser, or a role with privileges of the pg_maintain
-role. If the user has MAINTAIN,
-UPDATE, DELETE, or
-TRUNCATE privileges on the table, any lockmode is permitted. If the user has
-INSERT privileges on the table, ROW EXCLUSIVE
-MODE (or a less-conflicting mode as described in ) is permitted. If a user has
-SELECT privileges on the table, ACCESS SHARE
-MODE is permitted.  If a role has permission to lock a
-partitioned table, it is also permitted to lock each of its partitions,
-regardless of whether the role has the aforementioned privileges on the
-partition.
+lockmode.  If the user has
+MAINTAIN, UPDATE,
+DELETE, or TRUNCATE privileges on the
+table, any lockmode is
+permitted. If the user has INSERT privileges on the
+table, ROW EXCLUSIVE MODE (or a less-conflicting mode as
+described in ) is permitted. If a user
+has SELECT privileges on the table,
+ACCESS SHARE MODE is permitted.

Re: Support to define custom wait events for extensions

2023-06-15 Thread Michael Paquier
On Thu, Jun 15, 2023 at 11:13:57AM -0500, Tristan Partin wrote:
> What's the Postgres policy on the following?
> 
> for (int i = 0; ...)
> for (i = 0; ...)
> 
> You are using 2 different patterns in WaitEventShmemInit() and
> InitializeExtensionWaitEventTranches().

C99 style is OK since v12, so the style of the patch is fine.  The
older style has no urgent need to change, either.  One argument to let
the code as-is is that it could generate backpatching conflicts, while
it does not hurt as it stands.  This also means that bug fixes that
need to be applied down to 12 would be able to use C99 declarations
freely without some of the buildfarm animals running REL_11_STABLE
complaining.  I have fallen into this trap recently, actually.  See
dbd25dd.
--
Michael


signature.asc
Description: PGP signature


Re: subscription/033_run_as_table_owner is not listed in the meson.build

2023-06-15 Thread Michael Paquier
On Fri, Jun 16, 2023 at 07:15:36AM +0900, Masahiko Sawada wrote:
> On Thu, Jun 15, 2023 at 5:32 PM Masahiko Sawada  wrote:
>> Checking similar oversights,
>> src/bin/pg_basebackup/t/011_in_place_tablespace.pl seems not to be
>> listed in meson.build too.
> 
> Here is the patch for that.

Yes, good catch.
--
Michael


signature.asc
Description: PGP signature


Re: Memory leak in incremental sort re-scan

2023-06-15 Thread Tomas Vondra



On 6/15/23 22:36, Tom Lane wrote:
> Tomas Vondra  writes:
>> On 6/15/23 22:11, Tom Lane wrote:
>>> I see zero leakage in that example after applying the attached quick
>>> hack.  (It might be better to make the check in the caller, or to just
>>> move the call to ExecInitIncrementalSort.)
> 
>> Thanks for looking. Are you planning to work on this and push the fix,
>> or do you want me to finish this up?
> 
> I'm happy to let you take it -- got lots of other stuff on my plate.
> 

OK, will do.

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: subscription/033_run_as_table_owner is not listed in the meson.build

2023-06-15 Thread Masahiko Sawada
On Thu, Jun 15, 2023 at 5:32 PM Masahiko Sawada  wrote:
>
> On Thu, Jun 15, 2023 at 5:04 PM Michael Paquier  wrote:
> >
> > On Thu, Jun 15, 2023 at 07:16:06AM +, Hayato Kuroda (Fujitsu) wrote:
> > > I have noticed that the testcase subscription/033_run_as_table_owner in 
> > > the
> > > subscription is not executed when meson build system is chosen. The case 
> > > is not
> > > listed in the meson.build.
> > >
> > > Do we have any reasons or backgrounds about it?
> > > PSA the patch to add the case. It works well on my env.
> >
> > Seems like a thinko of 4826759 to me, that's easy to miss.  Will fix
> > in a bit..
>
> Good catch.
>
> Checking similar oversights,
> src/bin/pg_basebackup/t/011_in_place_tablespace.pl seems not to be
> listed in meson.build too.

Here is the patch for that.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


0001-Add-missing-pg_basebackup-TAP-test-for-meson.patch
Description: Binary data


Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 06:29, jian he wrote:
> I am not sure the following results are correct.
> with cte as (
> select hashset(x) as x
> ,hashset_capacity(hashset(x))
> ,hashset_count(hashset(x))
> from generate_series(1,10) g(x))
> select *
> ,'|' as delim
> , hashset_add(x,1::int)
> ,hashset_capacity(hashset_add(x,1::int))
> ,hashset_count(hashset_add(x,1::int))
> from cte \gx
>
>
> results:  
> -[ RECORD 1 ]+-
> x| {8,1,10,3,9,4,6,2,1,5,7}
> hashset_capacity | 64
> hashset_count| 10
> delim| |
> hashset_add  | {8,1,10,3,9,4,6,2,1,5,7}
> hashset_capacity | 64
> hashset_count| 11

Nice catch, you found a bug!

Fixed in attached patch:

---
Ensure hashset_add and hashset_merge operate on copied data

Previously, the hashset_add() and hashset_merge() functions were
modifying the original hashset in-place. This was leading to unexpected
results because the original data in the hashset was being altered.

This commit introduces the macro PG_GETARG_INT4HASHSET_COPY(), ensuring
a copy of the hashset is created and modified, leaving the original
hashset untouched.

This adjustment ensures hashset_add() and hashset_merge() operate
correctly on the copied hashset and prevent modification of the
original data.

A new regression test file `reported_bugs.sql` has been added to
validate the proper functionality of these changes. Future reported
bugs and their corresponding tests will also be added to this file.
---

I wonder if this function:

static int4hashset_t *
int4hashset_copy(int4hashset_t *src)
{
return src;
}

...that was previously named hashset_copy(),
should be implemented to actually copy the struct,
instead of just returning the input?

It is being used by int4hashset_agg_combine() like this:

/* copy the hashset into the right long-lived memory context */
oldcontext = MemoryContextSwitchTo(aggcontext);
src = int4hashset_copy(src);
MemoryContextSwitchTo(oldcontext);
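
If it should be a real deep copy, something along these lines might be enough,
assuming int4hashset_t is a flat palloc'd varlena value (I haven't verified the
struct layout against the extension):

static int4hashset_t *
int4hashset_copy(int4hashset_t *src)
{
	int4hashset_t *dst;

	/* flat copy of the whole varlena value */
	dst = (int4hashset_t *) palloc(VARSIZE(src));
	memcpy(dst, src, VARSIZE(src));

	return dst;
}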

/Joel

hashset-0.0.1-da84659.patch
Description: Binary data


Re: Memory leak in incremental sort re-scan

2023-06-15 Thread Tom Lane
Tomas Vondra  writes:
> On 6/15/23 22:11, Tom Lane wrote:
>> I see zero leakage in that example after applying the attached quick
>> hack.  (It might be better to make the check in the caller, or to just
>> move the call to ExecInitIncrementalSort.)

> Thanks for looking. Are you planning to work on this and push the fix,
> or do you want me to finish this up?

I'm happy to let you take it -- got lots of other stuff on my plate.

regards, tom lane




Re: Memory leak in incremental sort re-scan

2023-06-15 Thread Tomas Vondra



On 6/15/23 22:11, Tom Lane wrote:
> Tomas Vondra  writes:
>> On 6/15/23 13:48, Laurenz Albe wrote:
>>> ExecIncrementalSort() calls tuplesort_begin_common(), which creates the 
>>> "TupleSort main"
>>> and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls 
>>> tuplesort_end(),
>>> which destroys them.
>>> But ExecReScanIncrementalSort() only resets the memory contexts.
> 
>> I think it's correct, but I need to look at the code more closely - it's
>> been a while. The code is a bit silly, as it resets the tuplesort and
>> then throws away all the pointers - so what could the _end() break?
> 
> The report at [1] seems to be the same issue of ExecReScanIncrementalSort
> leaking memory.

Funny how these reports often come in pairs ...

> I applied Laurenz's fix, and that greatly reduces the
> speed of leak but doesn't remove the problem entirely.  It looks like
> the remaining issue is that the data computed by preparePresortedCols() is
> recomputed each time we rescan the node.  This seems entirely gratuitous,
> because there's nothing in that that could change across rescans.

Yeah, I was wondering about that too when I skimmed over that code
earlier today.

> I see zero leakage in that example after applying the attached quick
> hack.  (It might be better to make the check in the caller, or to just
> move the call to ExecInitIncrementalSort.)
> 

Thanks for looking. Are you planning to work on this and push the fix,
or do you want me to finish this up?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Memory leak in incremental sort re-scan

2023-06-15 Thread Tom Lane
Tomas Vondra  writes:
> On 6/15/23 13:48, Laurenz Albe wrote:
>> ExecIncrementalSort() calls tuplesort_begin_common(), which creates the 
>> "TupleSort main"
>> and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls 
>> tuplesort_end(),
>> which destroys them.
>> But ExecReScanIncrementalSort() only resets the memory contexts.

> I think it's correct, but I need to look at the code more closely - it's
> been a while. The code is a bit silly, as it resets the tuplesort and
> then throws away all the pointers - so what could the _end() break?

The report at [1] seems to be the same issue of ExecReScanIncrementalSort
leaking memory.  I applied Laurenz's fix, and that greatly reduces the
speed of leak but doesn't remove the problem entirely.  It looks like
the remaining issue is that the data computed by preparePresortedCols() is
recomputed each time we rescan the node.  This seems entirely gratuitous,
because there's nothing in that that could change across rescans.
I see zero leakage in that example after applying the attached quick
hack.  (It might be better to make the check in the caller, or to just
move the call to ExecInitIncrementalSort.)

regards, tom lane

[1] 
https://www.postgresql.org/message-id/db03c582-086d-e7cd-d4a1-3bc722f81765%40inf.ethz.ch

diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 34257ce34b..655b1c30e1 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -166,6 +166,9 @@ preparePresortedCols(IncrementalSortState *node)
 {
 	IncrementalSort *plannode = castNode(IncrementalSort, node->ss.ps.plan);
 
+	if (node->presorted_keys)
+		return;
+
 	node->presorted_keys =
 		(PresortedKeyData *) palloc(plannode->nPresortedCols *
 	sizeof(PresortedKeyData));
@@ -1140,7 +1143,6 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
 	node->outerNodeDone = false;
 	node->n_fullsort_remaining = 0;
 	node->bound_Done = 0;
-	node->presorted_keys = NULL;
 
 	node->execution_status = INCSORT_LOADFULLSORT;
 
@@ -1154,12 +1156,12 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
 	 */
 	if (node->fullsort_state != NULL)
 	{
-		tuplesort_reset(node->fullsort_state);
+		tuplesort_end(node->fullsort_state);
 		node->fullsort_state = NULL;
 	}
 	if (node->prefixsort_state != NULL)
 	{
-		tuplesort_reset(node->prefixsort_state);
+		tuplesort_end(node->prefixsort_state);
 		node->prefixsort_state = NULL;
 	}
 


Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Konstantin Knizhnik




On 15.06.2023 12:04 PM, Hannu Krosing wrote:

So a fair bit of work, but also clearly defined benefits of
1) reduced memory usage
2) no need to rebuild caches for each new connection
3) no need to track PREPARE statements inside connection poolers.


A shared plan cache (not only a prepared statement cache) also opens the way to 
more sophisticated query optimizations.
Right now we are not performing some optimizations (like constant 
expression folding) just because they increase the time needed to process normal 
queries.
This is why queries generated by ORMs or wizards, which can contain a 
lot of dumb stuff, are not well simplified by Postgres.
With MS-SQL it is quite common for query execution time to be much 
smaller than query optimization time.
Having a shared plan cache allows us to spend more time on optimization 
without the risk of degrading performance.






Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Konstantin Knizhnik




On 15.06.2023 11:41 AM, James Addison wrote:

On Thu, 15 Jun 2023 at 08:12, Konstantin Knizhnik  wrote:



On 15.06.2023 1:23 AM, James Addison wrote:

On Tue, 13 Jun 2023 at 07:55, Konstantin Knizhnik  wrote:


On 12.06.2023 3:23 PM, Pavel Borisov wrote:

Is the following true or not?

1. If we switch processes to threads but leave the amount of session
local variables unchanged, there would be hardly any performance gain.
2. If we move some backend's local variables into shared memory then
the performance gain would be very near to what we get with threads
having equal amount of session-local variables.

In other words, the overall goal in principle is to gain from less
memory copying wherever it doesn't add the burden of locks for
concurrent variables access?

Regards,
Pavel Borisov,
Supabase


IMHO both statements are not true.
Switching to threads will cause less context-switch overhead (because
all threads share the same memory space and so preserve the TLB).
How big will this advantage be? In my prototype I got ~10%, but maybe
it is possible to find workloads where it is larger.

Hi Konstantin - do you have code/links that you can share for the
prototype and benchmarks used to gather those results?



Sorry, I have already shared the link:
https://github.com/postgrespro/postgresql.pthreads/

Nope, my mistake for not locating the existing link - thank you.

Is there a reason that parser-related files (flex/bison) are added as
part of the changeset?  (I'm trying to narrow it down to only the
changes necessary for the functionality.  so far it looks mostly
fairly minimal, which is good.  the adjustments to progname are
another thing that look a bit unusual/maybe unnecessary for the
feature)


Sorry, absolutely no reason - just my fault.


As you can see, the last commit was 6 years ago, when I stopped work on this project.
Why?  I already tried to explain it:
- The benefits from switching to threads were not so large. Maybe I just failed to
find a proper workload, but it was a more or less expected result,
because most of the code was not changed - it uses the same sync primitives,
the same local catalog/relation caches, ...
To take full advantage of the multithreading model it is necessary to rewrite many
components, especially those related to interprocess communication.
But maintaining such a fork of Postgres and synchronizing it with mainstream
requires too much effort, and I was not able to do it myself.

I get the feeling that there are probably certain query types or
patterns where a significant, order-of-magnitude speedup is possible
with threads - but yep, I haven't seen those described in detail yet
on the mailing list (but as hinted by my not noticing the github link
previously, maybe I'm not following the list closely enough).

What workloads did you try with your version of the project?


I do not remember precisely now (6 years have passed).
But I definitely tried pgbench, especially read-only pgbench (to be more
CPU- rather than disk-bound).







Re: [17] collation provider "builtin"

2023-06-15 Thread Joe Conway

On 6/14/23 19:20, Thomas Munro wrote:

On Thu, Jun 15, 2023 at 10:55 AM Jeff Davis  wrote:

The locale "C" (and equivalently, "POSIX") is not really a libc locale;
it's implemented internally with memcmp for collation and
pg_ascii_tolower, etc., for ctype.

The attached patch implements a new collation provider, "builtin",
which only supports "C" and "POSIX". It does not change the initdb
default provider, so it must be requested explicitly. The user will be
guaranteed that collations with provider "builtin" will never change
semantics; therefore they need no version and indexes are not at risk
of corruption. See previous discussion[1].


I haven't studied the details yet but +1 for this idea.  It models
what we are actually doing.


+1 agreed

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: psql: Add role's membership options to the \du+ command

2023-06-15 Thread David G. Johnston
Robert - can you please comment on what you are willing to commit in order
to close out your open item here.  My take is that the design for this, the
tabular form a couple of emails ago (copied here), is ready-to-commit, just
needing the actual (trivial) code changes to be made to accomplish it.

Tabular form

     rolname      |     memberof      |       options        |      grantor
------------------+-------------------+----------------------+-------------------
 postgres         |                   |                      |
 regress_du_admin | regress_du_role0 +| admin, inherit, set +| postgres         +
                  | regress_du_role1 +| admin, inherit, set +| postgres         +
                  | regress_du_role2  | admin, inherit, set  | postgres
 regress_du_role0 |                   |                      |
 regress_du_role1 | regress_du_role0 +| admin, inherit, set +| regress_du_admin +
                  | regress_du_role0 +| inherit             +| regress_du_role1 +
                  | regress_du_role0  | set                  | regress_du_role2
 regress_du_role2 | regress_du_role0 +| admin               +| regress_du_admin +
                  | regress_du_role0 +| inherit, set        +| regress_du_role1 +
                  | regress_du_role0 +| empty               +| regress_du_role2 +
                  | regress_du_role1  | admin, set           | regress_du_admin
(5 rows)


On Thu, May 18, 2023 at 6:07 AM Pavel Luzanov 
wrote:

> On 18.05.2023 05:42, Jonathan S. Katz wrote:
>
> That said, from a readability standpoint, it was easier for me to follow
> the tabular form vs. the sentence form.
>
> Maybe it is possible to reach an agreement on the sentence form. Similar
> descriptions are used for referential constraints in the \d command:
>
> I think we should consider the tabular form with translatable headers to
be the acceptable choice here.  I don't see enough value in the sentence
form to make it worth trying to overcome the i18n objection there.

> As for the tabular form, it looks more natural to have a separate psql command
> for the pg_auth_members system catalog, something based on this query. But is it
> worth inventing a new psql command for this?
>
>
IMO, no.  I'd much rather read "admin, inherit, set" than deal with
true/false in columns.  I think the newlines are better compared to
repetition of the rolname as well.

I'm also strongly in favor of explicitly writing out the word "empty"
instead of leaving the column blank for the case that no options are marked
true.  But it isn't a show-stopper for me.
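
For reference, a per-membership query over pg_auth_members along the lines
Pavel suggests might look roughly like this (a hypothetical sketch, not the
query from his mail; it relies on the v16 inherit_option and set_option
columns):

  SELECT m.rolname  AS member,
         r.rolname  AS memberof,
         am.admin_option, am.inherit_option, am.set_option,
         g.rolname  AS grantor
  FROM pg_auth_members am
       JOIN pg_roles m ON m.oid = am.member
       JOIN pg_roles r ON r.oid = am.roleid
       JOIN pg_roles g ON g.oid = am.grantor
  ORDER BY m.rolname, r.rolname;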

David J.


Re: Non-superuser subscription owners

2023-06-15 Thread Alvaro Herrera
On 2023-Jun-13, Amit Kapila wrote:

> I'll push this tomorrow unless there are any suggestions or comments.

Note the proposed commit message is wrong about which commit is to blame
for the original problem -- it mentions e7e7da2f8d57 twice, but one of
them is actually c3afe8cf5a1e.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: RFC: Logging plan of the running query

2023-06-15 Thread James Coleman
On Thu, Jun 15, 2023 at 9:00 AM torikoshia  wrote:
>
> On 2023-06-15 01:48, James Coleman wrote:
> > On Tue, Jun 13, 2023 at 11:53 AM James Coleman 
> > wrote:
> >>
> >> ...
> >> I'm going to re-run tests with my patch version + resetting the flag
> >> on SIGINT (and any other error condition) to be certain that the issue
> >> you uncovered (where backends get stuck after a SIGINT not responding
> >> to the requested plan logging) wasn't masking any other issues.
> >>
> >> As long as that run is clean also then I believe the patch is safe
> >> as-is even without the re-entrancy guard.
> >>
> >> I'll report back with the results of that testing.
> >
> > The tests have been running since last night, but have been apparently
> > hung now for many hours. I haven't been able to fully look into it,
> > but so far I know the hung (100% CPU) backend last logged this:
> >
> > 2023-06-14 02:00:30.045 UTC client backend[84461]
> > pg_regress/updatable_views LOG:  query plan running on backend with
> > PID 84461 is:
> > Query Text: SELECT table_name, column_name, is_updatable
> >   FROM information_schema.columns
> >  WHERE table_name LIKE E'r_\\_view%'
> >  ORDER BY table_name, ordinal_position;
> >
> > The last output from the regression test harness was:
> >
> > # parallel group (5 tests):  index_including create_view
> > index_including_gist create_index create_index_spgist
> > ok 66+ create_index36508 ms
> > ok 67+ create_index_spgist 38588 ms
> > ok 68+ create_view  1394 ms
> > ok 69+ index_including   654 ms
> > ok 70+ index_including_gist 1701 ms
> > # parallel group (16 tests):  errors create_cast drop_if_exists
> > create_aggregate roleattributes constraints hash_func typed_table
> > infinite_recurse
> >
> > Attaching gdb to the hung backend shows this:
> >
> > #0  0x5601ab1f9529 in ProcLockWakeup
> > (lockMethodTable=lockMethodTable@entry=0x5601ab6484e0
> > , lock=lock@entry=0x7f5325c913f0) at proc.c:1655
> > #1  0x5601ab1e99dc in CleanUpLock (lock=lock@entry=0x7f5325c913f0,
> > proclock=proclock@entry=0x7f5325d40d60,
> > lockMethodTable=lockMethodTable@entry=0x5601ab6484e0
> > ,
> > hashcode=hashcode@entry=573498161, wakeupNeeded=)
> > at lock.c:1673
> > #2  0x5601ab1e9e21 in LockRefindAndRelease
> > (lockMethodTable=lockMethodTable@entry=0x5601ab6484e0
> > , proc=,
> > locktag=locktag@entry=0x5601ac3d7998, lockmode=lockmode@entry=1,
> >
> > decrement_strong_lock_count=decrement_strong_lock_count@entry=false)
> > at lock.c:3150
> > #3  0x5601ab1edb27 in LockReleaseAll
> > (lockmethodid=lockmethodid@entry=1, allLocks=false) at lock.c:2295
> > #4  0x5601ab1f8599 in ProcReleaseLocks
> > (isCommit=isCommit@entry=true) at proc.c:781
> > #5  0x5601ab37f1f4 in ResourceOwnerReleaseInternal
> > (owner=, phase=phase@entry=RESOURCE_RELEASE_LOCKS,
> > isCommit=isCommit@entry=true, isTopLevel=isTopLevel@entry=true) at
> > resowner.c:618
> > #6  0x5601ab37f7b7 in ResourceOwnerRelease (owner=,
> > phase=phase@entry=RESOURCE_RELEASE_LOCKS,
> > isCommit=isCommit@entry=true, isTopLevel=isTopLevel@entry=true) at
> > resowner.c:494
> > #7  0x5601aaec1d84 in CommitTransaction () at xact.c:2334
> > #8  0x5601aaec2b22 in CommitTransactionCommand () at xact.c:3067
> > #9  0x5601ab200a66 in finish_xact_command () at postgres.c:2783
> > #10 0x5601ab20338f in exec_simple_query (
> > query_string=query_string@entry=0x5601ac3b0858 "SELECT table_name,
> > column_name, is_updatable\n  FROM information_schema.columns\n WHERE
> > table_name LIKE E'r__view%'\n ORDER BY table_name,
> > ordinal_position;") at postgres.c:1300
> >
> > I am unable to connect to the regression test Postgres instance --
> > psql just hangs, so the lock seems to have affected the postmaster
> > also.
> >
> > I'm wondering if this might represent a bug in the current patch.
>
> Thanks for running and analyzing the test!

Sure thing!

> Could you share me how you are running the test?
>
> I imagined something like below, but currently couldn't reproduce it.
> - apply both v26-0001 and v27-0002 and build
> - run PostgreSQL with default GUCs
> - make installcheck-world
> - run 'SELECT pg_log_query_plan(pid) FROM pg_stat_activity \watch 0.1'
> during make installcheck-world

Apologies, I should have attached my updated patch (with the fix for
the bug you'd reporting with the re-entrancy guard). Attached is v28
which sets ProcessLogQueryPlanInterruptActive to false in errfinish
when necessary. Once built with those two patches I'm simply running
`make check`.

Regards,
James Coleman


v28-0001-Add-function-to-log-the-plan-of-the-query.patch
Description: Binary data


v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch
Description: Binary data


Re: When IMMUTABLE is not.

2023-06-15 Thread Yura Sokolov



On 15.06.2023 17:49, Tom Lane wrote:

"David G. Johnston"  writes:

The failure to find and execute the function code itself is not a failure
mode that these markers need be concerned with.  Assuming one can execute
the function an immutable function will give the same answer for the same
input for all time.

The viewpoint taken in the docs I mentioned is that an IMMUTABLE
marker is a promise from the user to the system about the behavior
of a function.  While the system does provide a few simple tools
to catch obvious errors and to make it easier to write functions
that obey such promises, it's mostly on the user to get it right.

In particular, we've never enforced that an immutable function can't
call non-immutable functions.  While that would seem like a good idea
in the abstract, we've intentionally not tried to do it.  (I'm pretty
sure there is more than one round of previous discussions of the point
in the archives, although locating relevant threads seems hard.)
One reason not to is that polymorphic functions have to be marked
with worst-case volatility labels.  There are plenty of examples of
functions that are stable for some input types and immutable for
others (array_to_string, for instance); but the marking system can't
represent that so we have to label them stable.  Enforcing that a
user-defined immutable function can't use such a function might
just break things for no gain.


"Stable vs Immutable" is much lesser problem compared to "ReadOnly vs 
Volatile".


Executing fairly read-only function more times than necessary (or less 
times),

doesn't modify data in unexpecting way.

But executing immutable/stable function, that occasionally modifies 
data, could
lead to different unexpected effects due to optimizer decided to call 
them more

or less times than query assumes.

Some vulnerabilities were present due to user defined functions used in 
index
definitions started to modify data. If "read-only" execution were forced 
in index

operations, those issues couldn't happen.

> it's mostly on the user to get it right.

It is really bad premise. Users does strange things and aren't expected 
to be

professionals who really understand whole PostgreSQL internals.

And it is strange to hear it at the same time we don't allow users to do 
query hints

since "optimizer does better" :-D

Ok, I'd go and cool myself. Certainly I don't get some point.





Re: Support to define custom wait events for extensions

2023-06-15 Thread Tristan Partin
We had this on our list of things to do at Neon, so it is a nice
surprise that you brought up an initial patchset :). It was also my
first time looking up the word tranche.

>From 59a118402e5e59685fb9e0fb086872e25a405736 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda 
Date: Thu, 15 Jun 2023 12:57:29 +0900
Subject: [PATCH 2/3] Support to define custom wait events for extensions.

> Currently, only one PG_WAIT_EXTENSION event can be used as a
> wait event for extensions. Therefore, in environments with multiple
> extensions are installed, it could take time to identify bottlenecks.

"extensions are installed" should be "extensions installed".

> +#define NUM_BUILDIN_WAIT_EVENT_EXTENSION   \
> +   (WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)

Should that be NUM_BUILTIN_WAIT_EVENT_EXTENSION?

> +   NamedExtensionWaitEventTrancheRequestArray = 
> (NamedExtensionWaitEventTrancheRequest *)
> +   MemoryContextAlloc(TopMemoryContext,
> +  
> NamedExtensionWaitEventTrancheRequestsAllocated
> +  * 
> sizeof(NamedExtensionWaitEventTrancheRequest));

I can't tell from reading other Postgres code when one should cast the
return value of MemoryContextAlloc(). Seems unnecessary to me.

> +   if (NamedExtensionWaitEventTrancheRequestArray == NULL)
> +   {
> +   NamedExtensionWaitEventTrancheRequestsAllocated = 16;
> +   NamedExtensionWaitEventTrancheRequestArray = 
> (NamedExtensionWaitEventTrancheRequest *)
> +   MemoryContextAlloc(TopMemoryContext,
> +  
> NamedExtensionWaitEventTrancheRequestsAllocated
> +  * 
> sizeof(NamedExtensionWaitEventTrancheRequest));
> +   }
> +
> +   if (NamedExtensionWaitEventTrancheRequests >= 
> NamedExtensionWaitEventTrancheRequestsAllocated)
> +   {
> +   int i = 
> pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1);
> +
> +   NamedExtensionWaitEventTrancheRequestArray = 
> (NamedExtensionWaitEventTrancheRequest *)
> +   repalloc(NamedExtensionWaitEventTrancheRequestArray,
> +i * 
> sizeof(NamedExtensionWaitEventTrancheRequest));
> +   NamedExtensionWaitEventTrancheRequestsAllocated = i;
> +   }

Do you think this code would look better in an if/else if?

> +   int i = 
> pg_nextpower2_32(NamedExtensionWaitEventTrancheRequests + 1);

In the Postgres codebase, is an int always guaranteed to be at least 32
bits? I feel like a fixed-width type would be better for tracking the
length of the array, unless Postgres prefers the Size type.

> +   Assert(strlen(tranche_name) + 1 <= NAMEDATALEN);
> +   strlcpy(request->tranche_name, tranche_name, NAMEDATALEN);

A sizeof(request->tranche_name) would keep this code more in-sync if
size of tranche_name were to ever change, though I see sizeof
expressions in the codebase are not too common. Maybe just remove the +1
and make it less than rather than a less than or equal? Seems like it
might be worth noting in the docs of the function that the event name
has to be less than NAMEDATALEN, but maybe this is something extension
authors are inherently aware of?

---

What's the Postgres policy on the following?

for (int i = 0; ...)
for (i = 0; ...)

You are using 2 different patterns in WaitEventShmemInit() and
InitializeExtensionWaitEventTranches().

> +   /*
> +* Copy the info about any named tranches into shared memory (so that
> +* other processes can see it), and initialize the requested wait 
> events.
> +*/
> +   if (NamedExtensionWaitEventTrancheRequests > 0)

Removing this if would allow one less indentation level. Nothing would
have to change about the containing code either since the for loop will
then not run

> +   ExtensionWaitEventCounter = (int *) ((char *) 
> NamedExtensionWaitEventTrancheArray - sizeof(int));

>From 65e25d4b27bbb6d0934872310c24ee19f89e9631 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda 
Date: Thu, 15 Jun 2023 13:16:00 +0900
Subject: [PATCH 3/3] Add docs to define custom wait events

> +
> + wait events are reserved by calling:
> +
> +void RequestNamedExtensionWaitEventTranche(const char *tranche_name)
> +
> + from your shmem_request_hook.  This will ensure that
> + wait event is available under the name tranche_name,
> + which the wait event type is Extension.
> + Use GetNamedExtensionWaitEventTranche
> + to get a wait event information.
> +
> +
> + To avoid possible race-conditions, each backend should use the LWLock
> + AddinShmemInitLock when connecting to and 
> initializing
> + its allocation of shared memory, same as LWLocks rese

Re: run pgindent on a regular basis / scripted manner

2023-06-15 Thread Andrew Dunstan


On 2023-06-15 Th 11:26, Jelte Fennema wrote:

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan  wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent 
--show-diff. That would only need to run on one animal, so a failure wouldn't 
send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.



The short answer is that some high priority demands from $dayjob got in 
the way. However, I hope to have it done soon.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 11:44, jian he wrote:
> I didn't install the extension directly. I copied the 
> hashset--0.0.1.sql to another place, using gcc to compile these 
> functions. 
..
> Because even make 
> PG_CONFIG=/home/jian/postgres/2023_05_25_beta5421/bin/pg_config still 
> has an error.
>  fatal error: libpq-fe.h: No such file or directory
> 3 | #include 

What platform are you on?
You seem to be missing the postgresql dev package.
For instance, here is how to compile and install the extension on Ubuntu 
22.04.1 LTS:

sudo apt install postgresql-15 postgresql-server-dev-15 postgresql-client-15
git clone https://github.com/tvondra/hashset.git
cd hashset
make
sudo make install
make installcheck

> Is there any way to put test_send_recv.c to sql test file? 

Unfortunately, there doesn't seem to be a way to test *_recv() functions from 
SQL,
since they take `internal` as input. The only way I could figure out to test 
them
was to write a C-program using libpq's binary mode.

I also note that the test_send_recv test was broken; I had forgotten to change
the type from "hashset" to "int4hashset". Fixed in the attached commit.

On Ubuntu, you can now run the test by specifying to connect via the UNIX 
socket:

PGHOST=/var/run/postgresql make run_c_tests
cd test/c_tests && ./test_send_recv.sh
test test_send_recv   ... ok

> Attached is a patch that slightly modifies README.md. Feel free to change it,
> since I am not a native English speaker...
> Attachments:
> * 0001-add-instruction-using-PG_CONFIG-to-install-extension.patch

Thanks, will have a look later.

/Joel

hashset-0.0.1-bb0ece8.patch
Description: Binary data


Re: run pgindent on a regular basis / scripted manner

2023-06-15 Thread Jelte Fennema
On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan  wrote:
> Perhaps we should start with a buildfarm module, which would run pg_indent 
> --show-diff. That would only need to run on one animal, so a failure wouldn't 
> send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.




Re: When IMMUTABLE is not.

2023-06-15 Thread Isaac Morland
On Thu, 15 Jun 2023 at 10:49, Tom Lane  wrote:

In particular, we've never enforced that an immutable function can't
> call non-immutable functions.  While that would seem like a good idea
> in the abstract, we've intentionally not tried to do it.  (I'm pretty
> sure there is more than one round of previous discussions of the point
> in the archives, although locating relevant threads seems hard.)
> One reason not to is that polymorphic functions have to be marked
> with worst-case volatility labels.  There are plenty of examples of
> functions that are stable for some input types and immutable for
> others (array_to_string, for instance); but the marking system can't
> represent that so we have to label them stable.  Enforcing that a
> user-defined immutable function can't use such a function might
> just break things for no gain.
>

More sophisticated type systems (which I am *not* volunteering to graft
onto Postgres) can handle some of this, but even Haskell has
unsafePerformIO. The current policy is both wise and practical.


Re: When IMMUTABLE is not.

2023-06-15 Thread Tom Lane
"David G. Johnston"  writes:
> The failure to find and execute the function code itself is not a failure
> mode that these markers need be concerned with.  Assuming one can execute
> the function an immutable function will give the same answer for the same
> input for all time.

The viewpoint taken in the docs I mentioned is that an IMMUTABLE
marker is a promise from the user to the system about the behavior
of a function.  While the system does provide a few simple tools
to catch obvious errors and to make it easier to write functions
that obey such promises, it's mostly on the user to get it right.

In particular, we've never enforced that an immutable function can't
call non-immutable functions.  While that would seem like a good idea
in the abstract, we've intentionally not tried to do it.  (I'm pretty
sure there is more than one round of previous discussions of the point
in the archives, although locating relevant threads seems hard.)
One reason not to is that polymorphic functions have to be marked
with worst-case volatility labels.  There are plenty of examples of
functions that are stable for some input types and immutable for
others (array_to_string, for instance); but the marking system can't
represent that so we have to label them stable.  Enforcing that a
user-defined immutable function can't use such a function might
just break things for no gain.
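
For concreteness, a rough illustration (the wrapper function here is made up
for the example): the built-in function carries the worst-case label, yet an
IMMUTABLE user function may still call it for an input type where the result
really is fixed.

  -- both pg_proc entries for array_to_string show provolatile = 's' (stable)
  SELECT oid::regprocedure, provolatile
  FROM pg_proc
  WHERE proname = 'array_to_string';

  -- a user-defined IMMUTABLE wrapper over it is still accepted
  CREATE FUNCTION ints_csv(int[]) RETURNS text
      IMMUTABLE LANGUAGE sql
      AS $$ SELECT array_to_string($1, ',') $$;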

regards, tom lane




Re: When IMMUTABLE is not.

2023-06-15 Thread chap

On 2023-06-15 10:19, David G. Johnston wrote:
The failure to find and execute the function code itself is not a failure
mode that these markers need be concerned with.  Assuming one can execute
the function an immutable function will give the same answer for the same
input for all time.


That was the view I ultimately took, and just made PL/Java suppress that
SPI readonly flag when going to look for the function code.

Until that change, you could run into the not-uncommon situation
where you've just loaded a jar of new functions and try to use them
in the same transaction, and hey presto, the VOLATILE ones all work,
and the IMMUTABLE ones aren't there yet.
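
A rough sketch of that situation (the jar and function names are invented;
sqlj.install_jar and sqlj.set_classpath are PL/Java's deployment commands):

  BEGIN;
  SELECT sqlj.install_jar('file:///tmp/myfns.jar', 'myfns', true);
  SELECT sqlj.set_classpath('public', 'myfns');
  SELECT my_volatile_fn(42);    -- found: volatile => non-read-only code lookup
  SELECT my_immutable_fn(42);   -- before the change: not visible yet in this transaction
  COMMIT;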

Regards,
-Chap




Re: When IMMUTABLE is not.

2023-06-15 Thread David G. Johnston
On Thursday, June 15, 2023,  wrote:

>
> So one could take a strict view that "no PL/Java function should
> ever be marked IMMUTABLE" because every one depends on fetching
> something (once, at least).
>

The failure to find and execute the function code itself is not a failure
mode that these markers need be concerned with.  Assuming one can execute
the function an immutable function will give the same answer for the same
input for all time.

David J.


Re: When IMMUTABLE is not.

2023-06-15 Thread chap

On 2023-06-15 09:58, c...@anastigmatix.net wrote:

also influences what snapshot the
function is looking at, and therefore what it can see, which has
also struck me more as a tacked-on effect than something inherent
in the declaration's meaning.


I just re-read that and realized I should anticipate the obvious
response "but how can it matter what the function can see, if
it's IMMUTABLE and depends on no data?".

So, I ran into the effect while working on PL/Java, where the
code of a function isn't all found in pg_proc.prosrc; that just
indicates what code has to be fetched from sqlj.jar_entry.

So one could take a strict view that "no PL/Java function should
ever be marked IMMUTABLE" because every one depends on fetching
something (once, at least).

But on the other hand, it would seem punctilious to say that
f(int x, int y) { return x + y; } isn't IMMUTABLE, only because
it depends on a fetch /of its own implementation/, and overall
its behavior is better described by marking it IMMUTABLE.

Regards,
-Chap




Re: When IMMUTABLE is not.

2023-06-15 Thread Tom Lane
c...@anastigmatix.net writes:
> And also, isn't it the case that IMMUTABLE should mark a function,
> not merely that "doesn't manipulate data", but whose return value
> doesn't depend in any way on data (outside its own arguments)?

Right.  We can't realistically enforce that either, so it's
up to the user.

> The practice among PLs of choosing an SPI readonly flag based on
> the IMMUTABLE/STABLE/VOLATILE declaration seems to be a sort of
> peculiar heuristic, not something inherent in what that declaration
> means to the optimizer. (And also influences what snapshot the
> function is looking at, and therefore what it can see, which has
> also struck me more as a tacked-on effect than something inherent
> in the declaration's meaning.)

Well, it is a bit odd at first sight, but these properties play
together well.  See

https://www.postgresql.org/docs/current/xfunc-volatility.html

regards, tom lane




Re: When IMMUTABLE is not.

2023-06-15 Thread Yura Sokolov



On 15.06.2023 16:58, c...@anastigmatix.net wrote:

On 2023-06-15 09:21, Tom Lane wrote:

Yura Sokolov  writes:

not enough to be sure function doesn't manipulate data.


Of course not.  It is the user's responsibility to mark functions
properly.


And also, isn't it the case that IMMUTABLE should mark a function,
not merely that "doesn't manipulate data", but whose return value
doesn't depend in any way on data (outside its own arguments)?

The practice among PLs of choosing an SPI readonly flag based on
the IMMUTABLE/STABLE/VOLATILE declaration seems to be a sort of
peculiar heuristic, not something inherent in what that declaration
means to the optimizer. (And also influences what snapshot the
function is looking at, and therefore what it can see, which has
also struck me more as a tacked-on effect than something inherent
in the declaration's meaning.)


Documentation disagrees:

https://www.postgresql.org/docs/current/sql-createfunction.html#:~:text=IMMUTABLE%0ASTABLE%0AVOLATILE

> IMMUTABLE indicates that the function cannot modify the database and
> always returns the same result when given the same argument values
>
> STABLE indicates that the function cannot modify the database, and
> that within a single table scan it will consistently return the same
> result for the same argument values, but that its result could change
> across SQL statements.
>
> VOLATILE indicates that the function value can change even within a
> single table scan, so no optimizations can be made... But note that any
> function that has side-effects must be classified volatile, even if its
> result is quite predictable, to prevent calls from being optimized away






Re: When IMMUTABLE is not.

2023-06-15 Thread chap

On 2023-06-15 09:21, Tom Lane wrote:

Yura Sokolov  writes:

not enough to be sure function doesn't manipulate data.


Of course not.  It is the user's responsibility to mark functions
properly.


And also, isn't it the case that IMMUTABLE should mark a function,
not merely that "doesn't manipulate data", but whose return value
doesn't depend in any way on data (outside its own arguments)?

The practice among PLs of choosing an SPI readonly flag based on
the IMMUTABLE/STABLE/VOLATILE declaration seems to be a sort of
peculiar heuristic, not something inherent in what that declaration
means to the optimizer. (And also influences what snapshot the
function is looking at, and therefore what it can see, which has
also struck me more as a tacked-on effect than something inherent
in the declaration's meaning.)

Regards,
-Chap




Re: When IMMUTABLE is not.

2023-06-15 Thread Yura Sokolov

15.06.2023 16:21, Tom Lane wrote:

Yura Sokolov  writes:

I found that declaration of function as IMMUTABLE/STABLE is not enough to be
sure the function doesn't manipulate data.

Of course not.  It is the user's responsibility to mark functions
properly.  Trying to enforce that completely is a fool's errand


https://github.com/postgres/postgres/commit/b2c4071299e02ed96d48d3c8e776de2fab36f88c.patch

https://github.com/postgres/postgres/commit/cdf8b56d5463815244467ea8f5ec6e72b6c65a6c.patch





Re: obsolete filename reference in parser README

2023-06-15 Thread Amit Langote
On Thu, Jun 15, 2023 at 10:48 PM Tristan Partin  wrote:
> Nice catch. Looks good.

Thanks for checking.  As just mentioned, I've pushed this moments ago.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: obsolete filename reference in parser README

2023-06-15 Thread Tristan Partin
Nice catch. Looks good.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: obsolete filename reference in parser README

2023-06-15 Thread Amit Langote
On Thu, Jun 15, 2023 at 6:54 PM Amit Langote  wrote:
> I noticed that 2f2b18bd3f55 forgot to remove the mention of
> parse_jsontable.c in src/backend/parser/README.
>
> Attached a patch to fix that.  Will push that shortly to HEAD and v15.

Pushed to HEAD only.   9853bf6ab0e that added parse_jsontable.c to the
README was not back-patched.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: Avoid unncessary always true test (src/backend/storage/buffer/bufmgr.c)

2023-06-15 Thread Ranier Vilela
On Wed, Jun 14, 2023 at 13:32, Gurjeet Singh
wrote:

> On Wed, Jun 14, 2023 at 5:12 AM Ranier Vilela  wrote:
> >
> > On Wed, Jun 14, 2023 at 06:51, Richard Guo <
> guofengli...@gmail.com> wrote:
> >>
> >>
> >> On Tue, Jun 13, 2023 at 3:39 PM Kyotaro Horiguchi <
> horikyota@gmail.com> wrote:
> >>>
> >>> Gurjeet has mentioned that eb.rel cannot be modified by another
> >>> process since the value or memory is in the local stack, and I believe
> >>> he's correct.
> >>>
> >>> If the pointed Relation had been blown out, eb.rel would be left
> >>> dangling, not nullified. However, I don't believe this situation
> >>> happens (or it shouldn't happen) as the entire relation should already
> >>> be locked.
> >>
> >>
> >> Yeah, Gurjeet is right.  I had a thinko here.  eb.rel should not be NULL
> >> pointer in any case.  And as we've acquired the lock for it, it should
> >> not have been closed.  So I think we can remove the check for eb.rel in
> >> the two places.
> >
> > Ok,
> > As there is a consensus on removing the tests and the comment is still
> relevant,
> > here is a new version for analysis.
>
> LGTM.
>
Created an entry in commitfest to track this.
https://commitfest.postgresql.org/43/4371/

regards,
Ranier Vilela


Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-06-15 Thread Tristan Partin
On Mon Jun 12, 2023 at 4:13 AM CDT, Heikki Linnakangas wrote:
> There are a few uselocale() calls in ecpg, and they are protected by 
> HAVE_USELOCALE. Interestingly, the calls in pg_locale.c are not, but 
> they are protected by HAVE_LOCALE_T. Seems a little inconsistent.

Patch is attached. CC-ing hackers.

-- 
Tristan Partin
Neon (https://neon.tech)
From 02a2cdb83405b5aecaec2af02e379a81161f8372 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Wed, 14 Jun 2023 11:36:00 -0500
Subject: [PATCH v1] Make uselocale protection more consistent

In ecpg, uselocale uses are protected by checking if HAVE_USELOCALE is
defined. Use the same check in pg_locale.c. Since HAVE_USELOCALE implies
HAVE_LOCALE_T, the code should be the same on _all_ platforms that
Postgres supports. Otherwise, I am sure there would have been a bug
report with pg_locale.c failing to build due to the system having
locale_t, but not uselocale.
---
 src/backend/utils/adt/pg_locale.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 31e3b16ae0..3585afb298 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -2972,7 +2972,7 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 	}
 	else
 	{
-#ifdef HAVE_LOCALE_T
+#ifdef HAVE_USELOCALE
 #ifdef HAVE_WCSTOMBS_L
 		/* Use wcstombs_l for nondefault locales */
 		result = wcstombs_l(to, from, tolen, locale->info.lt);
@@ -2984,11 +2984,11 @@ wchar2char(char *to, const wchar_t *from, size_t tolen, pg_locale_t locale)
 
 		uselocale(save_locale);
 #endif			/* HAVE_WCSTOMBS_L */
-#else			/* !HAVE_LOCALE_T */
-		/* Can't have locale != 0 without HAVE_LOCALE_T */
+#else			/* !HAVE_USELOCALE */
+		/* Can't have locale != 0 without HAVE_LOCALE_T, which HAVE_USELOCALE implies */
 		elog(ERROR, "wcstombs_l is not available");
 		result = 0;/* keep compiler quiet */
-#endif			/* HAVE_LOCALE_T */
+#endif			/* HAVE_USELOCALE */
 	}
 
 	return result;
@@ -3049,7 +3049,7 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 		}
 		else
 		{
-#ifdef HAVE_LOCALE_T
+#ifdef HAVE_USELOCALE
 #ifdef HAVE_MBSTOWCS_L
 			/* Use mbstowcs_l for nondefault locales */
 			result = mbstowcs_l(to, str, tolen, locale->info.lt);
@@ -3061,11 +3061,11 @@ char2wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen,
 
 			uselocale(save_locale);
 #endif			/* HAVE_MBSTOWCS_L */
-#else			/* !HAVE_LOCALE_T */
-			/* Can't have locale != 0 without HAVE_LOCALE_T */
+#else			/* !HAVE_USELOCALE */
+			/* Can't have locale != 0 without HAVE_LOCALE_T, which HAVE_USELOCALE implies */
 			elog(ERROR, "mbstowcs_l is not available");
 			result = 0;			/* keep compiler quiet */
-#endif			/* HAVE_LOCALE_T */
+#endif			/* HAVE_USELOCALE */
 		}
 
 		pfree(str);
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Support to define custom wait events for extensions

2023-06-15 Thread Drouvot, Bertrand

Hi,

On 6/15/23 10:00 AM, Michael Paquier wrote:

On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:

Currently, only one PG_WAIT_EXTENSION event can be used as a
wait event for extensions. Therefore, in environments with multiple
extensions are installed, it could take time to identify which
extension is the bottleneck.


Thanks for taking the time to implement a patch to do that.


+1 thanks for it!




I want to know your feedbacks. Please feel free to comment.


I think that's been cruelly missed.


Yeah, that would clearly help to diagnose which extension(s) is/are causing the 
waits (if any).
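
For example, a monitoring query like the one below currently reports only the
generic Extension wait event for every extension; with the proposed tranches,
wait_event could instead show the name each extension registered (query shown
for illustration):

  SELECT pid, wait_event_type, wait_event, query
  FROM pg_stat_activity
  WHERE wait_event_type = 'Extension';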

I did not look at the code yet (will do) but just wanted to chime in to support 
the idea.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: When IMMUTABLE is not.

2023-06-15 Thread Tom Lane
Yura Sokolov  writes:
> I found that declaration of function as IMMUTABLE/STABLE is not enough to be
> sure the function doesn't manipulate data.

Of course not.  It is the user's responsibility to mark functions
properly.  Trying to enforce that completely is a fool's errand;
you soon get into trying to solve the halting problem.
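
For example, nothing stops a user from making a promise like this one, even
though the result obviously depends on table contents (table and function
names invented for illustration):

  CREATE TABLE rates (cur text PRIMARY KEY, rate numeric);

  -- accepted without complaint; the IMMUTABLE promise is taken on trust
  CREATE FUNCTION to_eur(amount numeric, currency text) RETURNS numeric
      IMMUTABLE LANGUAGE sql
      AS $$ SELECT amount * rate FROM rates WHERE cur = currency $$;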

I don't like anything about the proposed patch.  It's necessarily
only a partial solution, and it probably breaks cases that are
perfectly safe in context.

regards, tom lane




Re: Memory leak in incremental sort re-scan

2023-06-15 Thread Tomas Vondra
Hi,

On 6/15/23 13:48, Laurenz Albe wrote:
> ExecIncrementalSort() calls tuplesort_begin_common(), which creates the 
> "TupleSort main"
> and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls 
> tuplesort_end(),
> which destroys them.
> But ExecReScanIncrementalSort() only resets the memory contexts.  Since the 
> next call to
> ExecIncrementalSort() will create them again, we end up leaking these 
> contexts for every
> re-scan.
> 
> Here is a reproducer with the regression test database:
> 
>   SET enable_sort = off;
>   SET enable_hashjoin = off;
>   SET enable_mergejoin = off;
>   SET enable_material = off;
> 
>   SELECT t.unique2, t2.r
>   FROM tenk1 AS t 
>  JOIN (SELECT unique1, 
>   row_number() OVER (ORDER BY hundred, thousand) AS r 
>FROM tenk1 
>OFFSET 0) AS t2 
> ON t.unique1 + 0 = t2.unique1
>   WHERE t.unique1 < 1000;
> 
> The execution plan:
> 
>  Nested Loop
>Join Filter: ((t.unique1 + 0) = tenk1.unique1)
>->  Bitmap Heap Scan on tenk1 t
>  Recheck Cond: (unique1 < 1000)
>  ->  Bitmap Index Scan on tenk1_unique1
>Index Cond: (unique1 < 1000)
>->  WindowAgg
>  ->  Incremental Sort
>Sort Key: tenk1.hundred, tenk1.thousand
>Presorted Key: tenk1.hundred
>->  Index Scan using tenk1_hundred on tenk1
> 
> 
> A memory context dump at the end of the execution looks like this:
> 
>   ExecutorState: 262144 total in 6 blocks; 74136 free (29 chunks); 188008 
> used
> TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 
> used
>   TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 
> used
> Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 
> chunks); 208 used
> TupleSort main: 32832 total in 2 blocks; 7256 free (0 chunks); 25576 
> used
>   TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 
> used
> Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 
> chunks); 208 used
> TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 
> used
>   TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 
> used
> Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 
> chunks); 208 used
>   [many more]
> 1903 more child contexts containing 93452928 total in 7597 blocks; 
> 44073240 free (0 chunks); 49379688 used
> 
> 
> The following patch fixes the problem for me:
> 
> --- a/src/backend/executor/nodeIncrementalSort.c
> +++ b/src/backend/executor/nodeIncrementalSort.c
> @@ -1145,21 +1145,16 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
> node->execution_status = INCSORT_LOADFULLSORT;
>  
> /*
> -* If we've set up either of the sort states yet, we need to reset them.
> -* We could end them and null out the pointers, but there's no reason to
> -* repay the setup cost, and because ExecIncrementalSort guards presorted
> -* column functions by checking to see if the full sort state has been
> -* initialized yet, setting the sort states to null here might actually
> -* cause a leak.
> +* Release tuplesort resources.
>  */
> if (node->fullsort_state != NULL)
> {
> -   tuplesort_reset(node->fullsort_state);
> +   tuplesort_end(node->fullsort_state);
> node->fullsort_state = NULL;
> }
> if (node->prefixsort_state != NULL)
> {
> -   tuplesort_reset(node->prefixsort_state);
> +   tuplesort_end(node->prefixsort_state);
> node->prefixsort_state = NULL;
> }
>  
> 
> The original comment hints that this might not be the correct thing to do...
> 

I think it's correct, but I need to look at the code more closely - it's
been a while. The code is a bit silly, as it resets the tuplesort and
then throws away all the pointers - so what could the _end() break?

AFAICS the comment says that we can't just do tuplesort_reset and keep
the pointers, because some other code depends on them being NULL.

In hindsight, that's a bit awkward - it'd probably be better to have a
separate flag, which would allow us to just reset the tuplesort.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: RFC: Logging plan of the running query

2023-06-15 Thread torikoshia

On 2023-06-15 01:48, James Coleman wrote:
On Tue, Jun 13, 2023 at 11:53 AM James Coleman  
wrote:


...
I'm going to re-run tests with my patch version + resetting the flag
on SIGINT (and any other error condition) to be certain that the issue
you uncovered (where backends get stuck after a SIGINT not responding
to the requested plan logging) wasn't masking any other issues.

As long as that run is clean also then I believe the patch is safe
as-is even without the re-entrancy guard.

I'll report back with the results of that testing.


The tests have been running since last night, but have been apparently
hung now for many hours. I haven't been able to fully look into it,
but so far I know the hung (100% CPU) backend last logged this:

2023-06-14 02:00:30.045 UTC client backend[84461]
pg_regress/updatable_views LOG:  query plan running on backend with
PID 84461 is:
Query Text: SELECT table_name, column_name, is_updatable
  FROM information_schema.columns
 WHERE table_name LIKE E'r_\\_view%'
 ORDER BY table_name, ordinal_position;

The last output from the regression test harness was:

# parallel group (5 tests):  index_including create_view
index_including_gist create_index create_index_spgist
ok 66+ create_index36508 ms
ok 67+ create_index_spgist 38588 ms
ok 68+ create_view  1394 ms
ok 69+ index_including   654 ms
ok 70+ index_including_gist 1701 ms
# parallel group (16 tests):  errors create_cast drop_if_exists
create_aggregate roleattributes constraints hash_func typed_table
infinite_recurse

Attaching gdb to the hung backend shows this:

#0  0x5601ab1f9529 in ProcLockWakeup
(lockMethodTable=lockMethodTable@entry=0x5601ab6484e0
, lock=lock@entry=0x7f5325c913f0) at proc.c:1655
#1  0x5601ab1e99dc in CleanUpLock (lock=lock@entry=0x7f5325c913f0,
proclock=proclock@entry=0x7f5325d40d60,
lockMethodTable=lockMethodTable@entry=0x5601ab6484e0
,
hashcode=hashcode@entry=573498161, wakeupNeeded=)
at lock.c:1673
#2  0x5601ab1e9e21 in LockRefindAndRelease
(lockMethodTable=lockMethodTable@entry=0x5601ab6484e0
, proc=,
locktag=locktag@entry=0x5601ac3d7998, lockmode=lockmode@entry=1,

decrement_strong_lock_count=decrement_strong_lock_count@entry=false)

at lock.c:3150
#3  0x5601ab1edb27 in LockReleaseAll
(lockmethodid=lockmethodid@entry=1, allLocks=false) at lock.c:2295
#4  0x5601ab1f8599 in ProcReleaseLocks
(isCommit=isCommit@entry=true) at proc.c:781
#5  0x5601ab37f1f4 in ResourceOwnerReleaseInternal
(owner=, phase=phase@entry=RESOURCE_RELEASE_LOCKS,
isCommit=isCommit@entry=true, isTopLevel=isTopLevel@entry=true) at
resowner.c:618
#6  0x5601ab37f7b7 in ResourceOwnerRelease (owner=,
phase=phase@entry=RESOURCE_RELEASE_LOCKS,
isCommit=isCommit@entry=true, isTopLevel=isTopLevel@entry=true) at
resowner.c:494
#7  0x5601aaec1d84 in CommitTransaction () at xact.c:2334
#8  0x5601aaec2b22 in CommitTransactionCommand () at xact.c:3067
#9  0x5601ab200a66 in finish_xact_command () at postgres.c:2783
#10 0x5601ab20338f in exec_simple_query (
query_string=query_string@entry=0x5601ac3b0858 "SELECT table_name,
column_name, is_updatable\n  FROM information_schema.columns\n WHERE
table_name LIKE E'r__view%'\n ORDER BY table_name,
ordinal_position;") at postgres.c:1300

I am unable to connect to the regression test Postgres instance --
psql just hangs, so the lock seems to have affected the postmaster
also.

I'm wondering if this might represent a bug in the current patch.


Thanks for running and analyzing the test!

Could you share me how you are running the test?

I imagined something like below, but currently couldn't reproduce it.
- apply both v26-0001 and v27-0002 and build
- run PostgreSQL with default GUCs
- make installcheck-world
- run 'SELECT pg_log_query_plan(pid) FROM pg_stat_activity \watch 0.1' 
during make installcheck-world


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: Allow pg_archivecleanup to remove backup history files

2023-06-15 Thread torikoshia

On 2023-06-15 15:20, Kyotaro Horiguchi wrote:
Thanks for your review!


At Wed, 14 Jun 2023 00:49:39 +0900, torikoshia
 wrote in

On 2023-06-12 16:33, Michael Paquier wrote:
> On Fri, Jun 09, 2023 at 12:32:15AM +0900, Fujii Masao wrote:
Thanks for reviewing!

>>printf(_(" -n, --dry-run dry run, show the names of the files that
>>would be removed\n"));
>> + printf(_(" -b, --clean-backup-history clean up files including
>> backup history files\n"));
>> Shouldn't -b option be placed in alphabetical order?
> +1.

Modified the place.


-   printf(_("  -d generate debug output (verbose mode)\n"));
-   printf(_("  -n dry run, show the names of the files that
would be removed\n"));
-	printf(_("  -V, --version  output version information, then 
exit\n"));
-	printf(_("  -x EXT clean up files if they have this 
extension\n"));

-   printf(_("  -?, --help show this help, then exit\n"));
+   printf(_("  -d, --debug generate debug output
(verbose mode)\n"));
+   printf(_("  -n, --dry-run   dry run, show the names of
the files that would be removed\n"));
+   printf(_("  -V, --version   output version information,
then exit\n"));
+   printf(_("  -x, --strip-extension=EXT   strip this extention before
identifying files fo clean up\n"));
+	printf(_("  -?, --help  show this help, then 
exit\n"));


After this change, some of these lines cross the boundary of the 80-column
width.  (Is that policy viable nowadays? I am usually working
using terminal windows with such a width..) It's somewhat unrelated to
this patch, but a help line a few lines further down also exceeds the
width. We could shorten it by removing the "/mnt/server" portion, but
I'm not sure if it's worth doing.


I also highlight the 80th column, following the wiki[1].
Since usage() in other files like pg_rewind.c and initdb.c also
exceeds the 80th column, I thought that was more of a guideline.


Or for use as a standalone archive cleaner:
e.g.
  pg_archivecleanup /mnt/server/archiverdir 
00010010.0020.backup



+   printf(_("  -x, --strip-extension=EXT   strip this extention before
identifying files fo clean up\n"));

s/fo/for/ ?


Yeah, it's a typo. Fixed it.

[1] 
https://wiki.postgresql.org/wiki/Configuring_vim_for_postgres_development

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATIONFrom 06b5664d1093aba445332ad5420bd442b99a8f98 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Thu, 16 Jun 2023 21:13:02 +0900
Subject: [PATCH v7 1/3] Introduce pg_archivecleanup into getopt_long

This patch is a preliminary step to add an easy-to-understand option
to delete backup history files, but it also adds long options to
the existing options.
---
 doc/src/sgml/ref/pgarchivecleanup.sgml|  5 -
 src/bin/pg_archivecleanup/pg_archivecleanup.c | 20 ---
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/ref/pgarchivecleanup.sgml b/doc/src/sgml/ref/pgarchivecleanup.sgml
index 635e7c7685..09991c2fcd 100644
--- a/doc/src/sgml/ref/pgarchivecleanup.sgml
+++ b/doc/src/sgml/ref/pgarchivecleanup.sgml
@@ -95,6 +95,7 @@ pg_archivecleanup:  removing file "archive/00010037000E"
 
  
   -d
+  --debug
   

 Print lots of debug logging output on stderr.
@@ -104,6 +105,7 @@ pg_archivecleanup:  removing file "archive/00010037000E"
 
  
   -n
+  --dry-run
   

 Print the names of the files that would have been removed on stdout (performs a dry run).
@@ -122,7 +124,8 @@ pg_archivecleanup:  removing file "archive/00010037000E"
  
 
  
-  -x extension
+  -x extension
+  --strip-extension=extension
   

 Provide an extension
diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c
index 7726d05149..be62ffe75d 100644
--- a/src/bin/pg_archivecleanup/pg_archivecleanup.c
+++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c
@@ -17,7 +17,7 @@
 
 #include "access/xlog_internal.h"
 #include "common/logging.h"
-#include "pg_getopt.h"
+#include "getopt_long.h"
 
 const char *progname;
 
@@ -252,11 +252,11 @@ usage(void)
 	printf(_("Usage:\n"));
 	printf(_("  %s [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE\n"), progname);
 	printf(_("\nOptions:\n"));
-	printf(_("  -d generate debug output (verbose mode)\n"));
-	printf(_("  -n dry run, show the names of the files that would be removed\n"));
-	printf(_("  -V, --version  output version information, then exit\n"));
-	printf(_("  -x EXT clean up files if they have this extension\n"));
-	printf(_("  -?, --help show this help, then exit\n"));
+	printf(_("  -d, --debug generate debug output (verbose mode)\n"));
+	printf(_("  -n, --dry-run   dry run, show the names of the files that would be r

Re: When IMMUTABLE is not.

2023-06-15 Thread Laurenz Albe
On Thu, 2023-06-15 at 13:22 +0300, Yura Sokolov wrote:
> Good day, hackers.
> 
> I found that declaration of function as IMMUTABLE/STABLE is not enough to be
> sure the function doesn't manipulate data.
> 
> [...]
>
> + errmsg("Damn1! Update were done 
> in a non-volatile function")));

I think it is project policy to start error messages with a lower case 
character.

Yours,
Laurenz Albe




Memory leak in incremental sort re-scan

2023-06-15 Thread Laurenz Albe
ExecIncrementalSort() calls tuplesort_begin_common(), which creates the 
"TupleSort main"
and "TupleSort sort" memory contexts, and ExecEndIncrementalSort() calls 
tuplesort_end(),
which destroys them.
But ExecReScanIncrementalSort() only resets the memory contexts.  Since the 
next call to
ExecIncrementalSort() will create them again, we end up leaking these contexts 
for every
re-scan.

Here is a reproducer with the regression test database:

  SET enable_sort = off;
  SET enable_hashjoin = off;
  SET enable_mergejoin = off;
  SET enable_material = off;

  SELECT t.unique2, t2.r
  FROM tenk1 AS t 
 JOIN (SELECT unique1, 
  row_number() OVER (ORDER BY hundred, thousand) AS r 
   FROM tenk1 
   OFFSET 0) AS t2 
ON t.unique1 + 0 = t2.unique1
  WHERE t.unique1 < 1000;

The execution plan:

 Nested Loop
   Join Filter: ((t.unique1 + 0) = tenk1.unique1)
   ->  Bitmap Heap Scan on tenk1 t
 Recheck Cond: (unique1 < 1000)
 ->  Bitmap Index Scan on tenk1_unique1
   Index Cond: (unique1 < 1000)
   ->  WindowAgg
 ->  Incremental Sort
   Sort Key: tenk1.hundred, tenk1.thousand
   Presorted Key: tenk1.hundred
   ->  Index Scan using tenk1_hundred on tenk1


A memory context dump at the end of the execution looks like this:

  ExecutorState: 262144 total in 6 blocks; 74136 free (29 chunks); 188008 
used
TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 
used
  TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 
chunks); 208 used
TupleSort main: 32832 total in 2 blocks; 7256 free (0 chunks); 25576 
used
  TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 
chunks); 208 used
TupleSort main: 32832 total in 2 blocks; 7320 free (0 chunks); 25512 
used
  TupleSort sort: 8192 total in 1 blocks; 7928 free (0 chunks); 264 used
Caller tuples: 8192 total in 1 blocks (0 chunks); 7984 free (0 
chunks); 208 used
  [many more]
1903 more child contexts containing 93452928 total in 7597 blocks; 
44073240 free (0 chunks); 49379688 used


The following patch fixes the problem for me:

--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1145,21 +1145,16 @@ ExecReScanIncrementalSort(IncrementalSortState *node)
node->execution_status = INCSORT_LOADFULLSORT;
 
/*
-* If we've set up either of the sort states yet, we need to reset them.
-* We could end them and null out the pointers, but there's no reason to
-* repay the setup cost, and because ExecIncrementalSort guards presorted
-* column functions by checking to see if the full sort state has been
-* initialized yet, setting the sort states to null here might actually
-* cause a leak.
+* Release tuplesort resources.
 */
if (node->fullsort_state != NULL)
{
-   tuplesort_reset(node->fullsort_state);
+   tuplesort_end(node->fullsort_state);
node->fullsort_state = NULL;
}
if (node->prefixsort_state != NULL)
{
-   tuplesort_reset(node->prefixsort_state);
+   tuplesort_end(node->prefixsort_state);
node->prefixsort_state = NULL;
}
 

> The original comment hints that this might not be the correct thing to do...

Yours,
Laurenz Albe




Re: Skip collecting decoded changes of already-aborted transactions

2023-06-15 Thread Amit Kapila
On Tue, Jun 13, 2023 at 2:06 PM Masahiko Sawada  wrote:
>
> On Sun, Jun 11, 2023 at 5:31 AM Andres Freund  wrote:
> >
> > A separate issue is that TransactionIdDidAbort() can end up being very slow 
> > if
> > a lot of transactions are in progress concurrently. As soon as the clog
> > buffers are extended all time is spent copying pages from the kernel
> > pagecache.  I'd not at all be surprised if this changed causes a substantial
> > slowdown in workloads with lots of small transactions, where most 
> > transactions
> > commit.
> >
>
> Indeed. So it should check the transaction status less frequently. It
> doesn't benefit much even if we can skip collecting decoded changes of
> small transactions. Another idea is that we check the status of only
> large transactions. That is, when the size of decoded changes of an
> aborted transaction exceeds logical_decoding_work_mem, we mark it as
> aborted , free its changes decoded so far, and skip further
> collection.
>

Your idea might work for large transactions, but I have not come across
reports where this is reported as a problem. Do you see any such
reports, and can we see how much benefit there is with large
transactions? Because we do have the handling of concurrent aborts
during sys table scans and that might help sometimes for large
transactions.

-- 
With Regards,
Amit Kapila.




Re: When IMMUTABLE is not.

2023-06-15 Thread Yura Sokolov

Sorry, the previous message was mangled for some reason.
I'll try to repeat it.

I found that declaring a function as IMMUTABLE/STABLE is not enough to be sure
the function doesn't manipulate data.

In fact, SPI checks only the volatility of the directly called function, but
fails to check indirect calls.


Attached immutable_not.sql creates 3 functions:

- `immutable_direct` is IMMUTABLE and tries to insert into table directly.
  PostgreSQL correctly detects and forbids this action.

- `volatile_direct` is VOLATILE and inserts into table directly.
  It is allowed and executed well.

- `immutable_indirect` is IMMUTABLE and calls `volatile_direct`.
  PostgreSQL failed to detect and prevent this DML manipulation.
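
Since immutable_not.sql is attached rather than inlined, here is a rough
sketch of what the three functions could look like, reconstructed from the
description and the output below (the exact bodies are an assumption, not
the attachment itself):

create table xxx (i varchar);

create function volatile_direct(j varchar) returns varchar
language plpgsql volatile as $$
begin
    insert into xxx values(j);
    return j;
end $$;

create function immutable_direct(j varchar) returns varchar
language plpgsql immutable as $$
begin
    insert into xxx values(j);
    return j;
end $$;

create function immutable_indirect(j varchar) returns varchar
language plpgsql immutable as $$
begin
    perform volatile_direct(j);
    return j;
end $$;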

Output:

select immutable_direct('immutable_direct');
psql:immutable_not.sql:28: ERROR:  INSERT is not allowed in a non-volatile function
CONTEXT:  SQL statement "insert into xxx values(j)"
PL/pgSQL function immutable_direct(character varying) line 3 at SQL statement


select volatile_direct('volatile_direct');
 volatile_direct
-----------------
 volatile_direct
(1 row)

select immutable_indirect('immutable_indirect');
 immutable_indirect
--------------------
 immutable_indirect
(1 row)

select * from xxx;
         i
--------------------
 volatile_direct
 immutable_indirect
(2 rows)

The attached forbid-non-volatile-mutations.diff adds checks that a read-only
function didn't perform data manipulation.

Output for patched version:

select immutable_indirect('immutable_indirect');
psql:immutable_not.sql:32: ERROR:  Damn2! Update were done in a non-volatile function
CONTEXT:  SQL statement "SELECT volatile_direct(j)"
PL/pgSQL function immutable_indirect(character varying) line 3 at PERFORM

I doubt the check should be done this way. This check is necessary, but it
should be FATAL instead of ERROR. And the ERROR should be generated at the
same place where it is generated for `immutable_direct`, but with a check of
the "read_only" status through the whole call stack instead of just the
volatility of the directly called function.

-

regards,
Yura Sokolov
Postgres Professional






Re: Consistent coding for the naming of LR workers

2023-06-15 Thread Alvaro Herrera
On 2023-Jun-15, Peter Smith wrote:

> PSA a small patch to modify the code accordingly. This is not intended
> to be a functional change - just a code cleanup.

From a translation standpoint, this doesn't seem good.  Consider this
proposed message:
  "lost connection to the %s"

It's not possible to translate "to the" correctly to Spanish because it
depends on the grammatical gender of the %s.  At present this is not an
actual problem because all bgworker types have the same gender, but it
goes counter to translatability good practices.  Also, other languages may
have different considerations.

You could use a generic term and then add a colon-separated or a quoted
indicator for its type:
 lost connection to logical replication worker of type "parallel apply"
 lost connection to logical replication worker: "parallel apply worker"
 lost connection to logical replication worker: type "parallel apply worker"

and then make the type string (and nothing else in that message) be a
%s.  But I don't think this looks very good.
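
For illustration, the difference between the two styles at the code level
would be roughly this (message texts and variable names are placeholders,
not taken from the actual patch):

    /* %s is a grammatical part of the sentence: the article depends on it */
    ereport(ERROR,
            (errmsg("lost connection to the %s", worker_name)));

    /* fixed sentence; only the type string varies, and it is quoted */
    ereport(ERROR,
            (errmsg("lost connection to logical replication worker of type \"%s\"",
                    worker_type)));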

I'd leave this alone, except if there are any actual inconsistencies in
which case they should be fixed specifically.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)




Re: [PATCH] Add loongarch native checksum implementation.

2023-06-15 Thread John Naylor
On Wed, Jun 14, 2023 at 9:20 AM YANG Xudong  wrote:
>
> Attached a new patch with fixes based on the comment below.

Note: It's helpful to pass "-v" to git format-patch, to have different
versions.

> > For x86 and Arm, if it fails to link without an -march flag, we allow
> > for a runtime check. The flags "-march=armv8-a+crc" and "-msse4.2" are
> > for instructions not found on all platforms. The patch also checks both
> > ways, and each one results in "Use LoongArch CRC instruction
> > unconditionally". The -march flag here is general, not specific. In
> > other words, if this only runs inside "+elif host_cpu == 'loongarch64'",
> > why do we need both with -march and without?
> >
>
> Removed the elif branch.

Okay, since we've confirmed that no arch flag is necessary, some other
places can be simplified:

--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -98,6 +98,11 @@ pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
 pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
 pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)

+# all versions of pg_crc32c_loongarch.o need CFLAGS_CRC
+pg_crc32c_loongarch.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_loongarch_shlib.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_loongarch_srv.o: CFLAGS+=$(CFLAGS_CRC)

This was copy-and-pasted from platforms that use a runtime check, so should
be unnecessary.

+# If the intrinsics are supported, sets pgac_loongarch_crc32c_intrinsics,
+# and CFLAGS_CRC.

+# Check if __builtin_loongarch_crcc_* intrinsics can be used
+# with the default compiler flags.
+# CFLAGS_CRC is set if the extra flag is required.

Same here -- it seems we don't need to set CFLAGS_CRC at all. Can you
confirm?

> > Also, I don't have a Loongarch machine for testing. Could you show that
> > the instructions are found in the binary, maybe using objdump and grep?
> > Or a performance test?
> >
>
> The output of the objdump command `objdump -dS
> ../postgres-build/tmp_install/usr/local/pgsql/bin/postgres  | grep -B 30
> -A 10 crcc` is attached.

Thanks for confirming.

--
John Naylor
EDB: http://www.enterprisedb.com


When IMMUTABLE is not.

2023-06-15 Thread Yura Sokolov

Good day, hackers.

I found that declaring a function as IMMUTABLE/STABLE is not enough to be
sure the function doesn't manipulate data.

In fact, SPI checks only the volatility of the directly called function,
but fails to check indirect calls.

Attached immutable_not.sql creates 3 functions:

- `immutable_direct` is IMMUTABLE and tries to insert into table directly.
  PostgreSQL correctly detects and forbids this action.

- `volatile_direct` is VOLATILE and inserts into table directly.
  It is allowed and executed well.

- `immutable_indirect` is IMMUTABLE and calls `volatile_direct`.
  PostgreSQL failed to detect and prevent this DML manipulation.

Output:

select immutable_direct('immutable_direct');
psql:immutable_not.sql:28: ERROR:  INSERT is not allowed in a non-volatile function
CONTEXT:  SQL statement "insert into xxx values(j)"
PL/pgSQL function immutable_direct(character varying) line 3 at SQL statement

select volatile_direct('volatile_direct');
 volatile_direct
-----------------
 volatile_direct
(1 row)

select immutable_indirect('immutable_indirect');
 immutable_indirect
--------------------
 immutable_indirect
(1 row)

select * from xxx;
         i
--------------------
 volatile_direct
 immutable_indirect
(2 rows)

The attached forbid-non-volatile-mutations.diff adds checks that a read-only
function didn't perform data manipulation.

Output for patched version:

select immutable_indirect('immutable_indirect');
psql:immutable_not.sql:32: ERROR:  Damn2! Update were done in a non-volatile function
CONTEXT:  SQL statement "SELECT volatile_direct(j)"
PL/pgSQL function immutable_indirect(character varying) line 3 at PERFORM

I doubt the check should be done this way. This check is necessary, but it
should be FATAL instead of ERROR. And the ERROR should be generated at the
same place where it is generated for `immutable_direct`, but with a check of
the "read_only" status through the whole call stack instead of just the
volatility of the directly called function.

-
regards,
Yura Sokolov
Postgres Professional


immutable_not.sql
Description: application/sql
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 33975687b38..82b6127d650 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -1584,6 +1584,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	Portal		portal;
 	SPICallbackArg spicallbackarg;
 	ErrorContextCallback spierrcontext;
+	CommandId 	this_command = InvalidCommandId;
 
 	/*
 	 * Check that the plan is something the Portal code will special-case as
@@ -1730,6 +1731,7 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	if (read_only)
 	{
 		ListCell   *lc;
+		this_command = GetCurrentCommandId(false);
 
 		foreach(lc, stmt_list)
 		{
@@ -1778,6 +1780,14 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 	/* Pop the SPI stack */
 	_SPI_end_call(true);
 
+	if (read_only && this_command != GetCurrentCommandId(false))
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		/* translator: %s is a SQL statement name */
+		errmsg("Damn1! Update were done in a non-volatile function")));
+	}
+
 	/* Return the created portal */
 	return portal;
 }
@@ -2404,6 +2414,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 	ErrorContextCallback spierrcontext;
 	CachedPlan *cplan = NULL;
 	ListCell   *lc1;
+	CommandId   this_command = InvalidCommandId;
 
 	/*
 	 * Setup error traceback support for ereport()
@@ -2473,6 +2484,11 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 (errcode(ERRCODE_SYNTAX_ERROR),
  errmsg("empty query does not return tuples")));
 
+	if (options->read_only)
+	{
+		this_command = GetCurrentCommandId(false);
+	}
+
 	foreach(lc1, plan->plancache_list)
 	{
 		CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
@@ -2788,6 +2804,14 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
 			CommandCounterIncrement();
 	}
 
+	if (options->read_only && this_command != GetCurrentCommandId(false))
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+		/* translator: %s is a SQL statement name */
+		errmsg("Damn2! Update were done in a non-volatile function")));
+	}
+
 fail:
 
 	/* Pop the snapshot off the stack if we pushed one */


obsolete filename reference in parser README

2023-06-15 Thread Amit Langote
Hi,

I noticed that 2f2b18bd3f55 forgot to remove the mention of
parse_jsontable.c in src/backend/parser/README.

Attached a patch to fix that.  Will push that shortly to HEAD and v15.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v1-0001-Remove-outdated-reference-to-a-removed-file.patch
Description: Binary data


Re: Do we want a hashset type?

2023-06-15 Thread jian he
In hashset/test/sql/order.sql, can we add the following to test
whether the optimizer
will use our index.

CREATE INDEX  ON test_int4hashset_order (int4hashset_col
 int4hashset_btree_ops);

-- to make sure that this work with just two rows
SET enable_seqscan TO off;

explain(costs off) SELECT * FROM test_int4hashset_order WHERE
int4hashset_col = '{1,2}'::int4hashset;
reset enable_seqscan;

Since most contrib modules have only one test file per module, maybe we need
to consolidate all the test sql files into one sql file (int4hashset.sql)?
--
I didn't install the extension directly. I copied hashset--0.0.1.sql to
another place and used gcc to compile these functions:
gcc -I/home/jian/postgres/2023_05_25_beta5421/include/server -fPIC -c /home/jian/hashset/hashset.c
gcc -shared -o /home/jian/hashset/hashset.so /home/jian/hashset/hashset.o
then modified hashset--0.0.1.sql, and in psql ran \i fullsqlfilename to create
these functions and types.

Because even make
PG_CONFIG=/home/jian/postgres/2023_05_25_beta5421/bin/pg_config still has
an error:
 fatal error: libpq-fe.h: No such file or directory
    3 | #include <libpq-fe.h>

Is there any way to put test_send_recv.c into a sql test file?
Attached is a patch that slightly modifies README.md. Feel free to change it,
since I am not a native English speaker...
From 2bb310affed4a06c6fa38fb0e1b1ff39f330d88d Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Thu, 15 Jun 2023 15:50:00 +0800
Subject: [PATCH] add instruction using PG_CONFIG to install extension, also
 how to run installcheck

---
 README.md | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 3ff57576..30c7 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # hashset
 
 This PostgreSQL extension implements hashset, a data structure (type)
-providing a collection of integer items with fast lookup.
+providing a collection of unique, not null integer items with fast lookup.
 
 
 ## Version
@@ -103,11 +103,19 @@ a variable-length type.
 
 ## Installation
 
-To install the extension, run `make install` in the project root. Then, in your
+To install the extension, run `make install` in the project root. To use a different PostgreSQL installation, point configure to a different `pg_config`, using following command: 
+
+make PG_CONFIG=/else/where/pg_config
+make install PG_CONFIG=/else/where/pg_config
+
+Then, in your
 PostgreSQL connection, execute `CREATE EXTENSION hashset;`.
 
 This extension requires PostgreSQL version ?.? or later.
 
+## Test
+To run regression test, execute
+`make installcheck`
 
 ## License
 
-- 
2.34.1



Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Hannu Krosing
One more unexpected benefit of having shared caches would be easing
access to other databases.

If the system caches are there for all databases anyway, then it
becomes much easier to make queries using objects from multiple
databases.

Note that this does not strictly need threads, just shared caches.

On Thu, Jun 15, 2023 at 11:04 AM Hannu Krosing  wrote:
>
> On Thu, Jun 15, 2023 at 10:41 AM James Addison  wrote:
> >
> > This is making me wonder about other performance/scalability areas
> > that might not have been considered due to focus on the details of the
> > existing codebase, but I'll save that for another thread and will try
> > to learn more first.
>
> A gradual move to more shared structures seems to be a way forward
>
> It should get us all the benefits of threading minus the need for TLB
> reloading and (in some cases) reduction of per-process virtual memory
> mapping tables.
>
> In any case we would need to implement all the locking and parallelism
> management of these shared structures that are not there in the
> current process architecture.
>
> So a fair bit of work but also a clearly defined benefits of
> 1) reduced memory usage
> 2) no need to rebuild caches for each new connection
> 3) no need to track PREPARE statements inside connection poolers.
>
> There can be extra complexity when different connections use the same
> prepared statement name (say "PREP001") for different queries.
> For this we'll likely need good cooperation with the connection
> pooler, where it passes some kind of client connection id along at the
> transaction start.




Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Hannu Krosing
On Thu, Jun 15, 2023 at 10:41 AM James Addison  wrote:
>
> This is making me wonder about other performance/scalability areas
> that might not have been considered due to focus on the details of the
> existing codebase, but I'll save that for another thread and will try
> to learn more first.

A gradual move to more shared structures seems to be a way forward

It should get us all the benefits of threading minus the need for TLB
reloading and (in some cases) reduction of per-process virtual memory
mapping tables.

In any case we would need to implement all the locking and parallelism
management of these shared structures that are not there in the
current process architecture.

So a fair bit of work but also a clearly defined benefits of
1) reduced memory usage
2) no need to rebuild caches for each new connection
3) no need to track PREPARE statements inside connection poolers.

There can be extra complexity when different connections use the same
prepared statement name (say "PREP001") for different queries.
For this we'll likely need good cooperation with the connection
pooler, where it passes some kind of client connection id along at the
transaction start.




Re: Making empty Bitmapsets always be NULL

2023-06-15 Thread Yuya Watari
Hello,

On Tue, Jun 13, 2023 at 8:07 PM David Rowley  wrote:
> I've incorporated fixes for the bugs in the attached patch.  I didn't
> quite use the same approach as you did. I did the fix for 0003
> slightly differently and added two separate paths.  We've no need to
> track the last non-zero word when 'a' has more words than 'b' since we
> can't truncate any zero-words off for that case.  Not having to do
> that makes the do/while loop pretty tight.

I really appreciate your quick response and incorporating those fixes
into your patch. The fix for 0003 looks good to me. I believe your
change improves performance more.

> For the fix in the 0004 patch, I think we can do what you did more
> simply.  I don't think there's any need to perform the loop to find
> the last non-zero word.  We're only deleting a member from a single
> word here, so we only need to check if that word is the last word and
> remove it if it's become zero.  If it's not the last word then we
> can't remove it as there must be some other non-zero word after it.

If my thinking is correct, the do-while loop I added is still
necessary. Consider the following code. The Assertion in this code
passes in the master but fails in the new patch.

=
Bitmapset *x = bms_make_singleton(1000);

x = bms_del_member(x, 1000);
Assert(x == NULL);
=

In the code above, we get a new Bitmapset by bms_make_singleton(1000).
This Bitmapset has many words. Only the last word is non-zero, and all
the rest are zero. If we call bms_del_member(x, 1000) for the
Bitmapset, all words of the result will be zero, including the last
word, so we must return NULL. However, the new patch truncates only
the last word, leading to an incorrect result. Therefore, we need to
perform the loop to find the actual non-zero word after the deletion.
Of course, I agree that if we are not modifying the last word, we
don't have to truncate anything, so we can omit the loop.
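
To spell that out, the tail of bms_del_member would then look roughly like
this (an illustration of the logic under discussion, not the actual code):

    a->words[wordnum] &= ~((bitmapword) 1 << bitnum);

    /* Only if we touched the last word can the set need shortening. */
    if (wordnum == a->nwords - 1 && a->words[wordnum] == 0)
    {
        /* walk back over any trailing zero words */
        do
        {
            wordnum--;
        } while (wordnum >= 0 && a->words[wordnum] == 0);

        if (wordnum < 0)
        {
            /* every word is zero: an empty set must be returned as NULL */
            pfree(a);
            return NULL;
        }
        a->nwords = wordnum + 1;
    }
    return a;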

> I also made a small adjustment to bms_get_singleton_member() and
> bms_singleton_member() to have them Assert fail if result is < 0 after
> looping over the set.  This should no longer happen so I thought it
> would make more compact code if that check was just removed.  We'd
> likely do better if we got reports of Assert failures here than, in
> the case of bms_get_singleton_member, some code accidentally doing the
> wrong thing based on a corrupt Bitmapset.

I agree with your change. I think failing by Assertion is better than
a runtime error or unexpected behavior.

-- 
Best regards,
Yuya Watari




Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Hannu Krosing
On Thu, Jun 15, 2023 at 9:12 AM Konstantin Knizhnik  wrote:

> There are three different but related directions of improving current 
> Postgres:
> 1. Replacing processes with threads

Here we could likely start with making parallel query multi-threaded.

This would also remove the big blocker for parallelizing things like
CREATE TABLE AS SELECT ... where we are currently held back by the
restriction that only the leader process can write.

> 2. Builtin connection pooler

Would be definitely a nice thing to have. And we could even start by
integrating a non-threaded pooler like pgbouncer to run as a
postgresql worker process (or two).

> 3. Lightweight backends (shared catalog/relation/prepared statements caches)

Shared prepared statement caches (of course have to be per-user and
per-database) would give additional benefit of lightweight connection
poolers not needing to track these. Currently the missing support of
named prepared statements is one of the main hindrances of using
pgbouncer with JDBC in transaction pooling mode (you can use it, but
have to turn off automatic statement preparing)

>
> The motivation for such changes are also similar:
> 1. Increase Postgres scalability
> 2. Reduce memory consumption
> 3. Make Postgres better fit cloud and serverless requirements

The memory consumption reduction would be a big and clear win for many
workloads.

Also just moving more things in shared memory will also prepare us for
move to threaded server (if it will eventually happen)

> I am not sure now which one should be addressed first or them can be done 
> together.

Shared caches seem like a guaranteed win at least on memory usage.
There could be performance  (and complexity) downsides for specific
workloads, but they would be the same as for the threaded model, so
would also be a good learning opportunity.

> Replacing static variables with thread-local is the first and may be the 
> easiest step.

I think we got our first patch doing this (as part of patches for
running PG threaded on Solaris) quite early in the OSS development,
could have been even in the last century :)

> It requires more or less mechanical changes. More challenging thing is 
> replacing private per-backend data structures
> with shared ones (caches, file descriptors,...)

Indeed, sharing caches would also be part of the work that is needed
for the sharded model, so anyone feeling strongly about moving to
threads could start with this :)

---
Hannu




Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread James Addison
On Thu, 15 Jun 2023 at 08:12, Konstantin Knizhnik  wrote:
>
>
>
> On 15.06.2023 1:23 AM, James Addison wrote:
>
> On Tue, 13 Jun 2023 at 07:55, Konstantin Knizhnik  wrote:
>
>
> On 12.06.2023 3:23 PM, Pavel Borisov wrote:
>
> Is the following true or not?
>
> 1. If we switch processes to threads but leave the amount of session
> local variables unchanged, there would be hardly any performance gain.
> 2. If we move some backend's local variables into shared memory then
> the performance gain would be very near to what we get with threads
> having equal amount of session-local variables.
>
> In other words, the overall goal in principle is to gain from less
> memory copying wherever it doesn't add the burden of locks for
> concurrent variables access?
>
> Regards,
> Pavel Borisov,
> Supabase
>
>
> IMHO both statements are not true.
> Switching to threads will cause less context switch overhead (because
> all threads are sharing the same memory space and so preserve the TLB).
> How big will this advantage be? In my prototype I got ~10%. But maybe
> it is possible to find workloads where it is larger.
>
> Hi Konstantin - do you have code/links that you can share for the
> prototype and benchmarks used to gather those results?
>
>
>
> Sorry, I have already shared the link:
> https://github.com/postgrespro/postgresql.pthreads/

Nope, my mistake for not locating the existing link - thank you.

Is there a reason that parser-related files (flex/bison) are added as
part of the changeset?  (I'm trying to narrow it down to only the
changes necessary for the functionality.  so far it looks mostly
fairly minimal, which is good.  the adjustments to progname are
another thing that look a bit unusual/maybe unnecessary for the
feature)

> As you can see last commit was 6 years ago when I stopped work on this 
> project.
> Why?  I already tried to explain it:
> - benefits from switching to threads were not so large. Maybe I just failed
> to find a proper workload, but it was a more or less expected result,
> because most of the code was not changed - it uses the same sync primitives,
> the same local catalog/relation caches, ...
> To take full advantage of the multithreading model it is necessary to rewrite
> many components, especially those related to interprocess communication.
> But maintaining such a fork of Postgres and synchronizing it with mainstream
> requires too much effort, and I was not able to do it myself.

I get the feeling that there are probably certain query types or
patterns where a significant, order-of-magnitude speedup is possible
with threads - but yep, I haven't seen those described in detail yet
on the mailing list (but as hinted by my not noticing the github link
previously, maybe I'm not following the list closely enough).

What workloads did you try with your version of the project?

> There are three different but related directions of improving current 
> Postgres:
> 1. Replacing processes with threads
> 2. Builtin connection pooler
> 3. Lightweight backends (shared catalog/relation/prepared statements caches)
>
> The motivation for such changes are also similar:
> 1. Increase Postgres scalability
> 2. Reduce memory consumption
> 3. Make Postgres better fit cloud and serverless requirements
>
> I am not sure now which one should be addressed first or them can be done 
> together.
>
> Replacing static variables with thread-local is the first and may be the 
> easiest step.
> It requires more or less mechanical changes. More challenging thing is 
> replacing private per-backend data structures
> with shared ones (caches, file descriptors,...)

Thank you.  Personally I think that motivation two (reducing memory
consumption) -- as long as it can be done without detrimentally
affecting functionality or correctness, and without making the code
harder to develop/understand -- could provide benefits for all three
of the motivating cases (and, in fact, for non-cloud/serverful use
cases too).

This is making me wonder about other performance/scalability areas
that might not have been considered due to focus on the details of the
existing codebase, but I'll save that for another thread and will try
to learn more first.




Re: subscription/033_run_as_table_owner is not listed in the meson.build

2023-06-15 Thread Masahiko Sawada
On Thu, Jun 15, 2023 at 5:04 PM Michael Paquier  wrote:
>
> On Thu, Jun 15, 2023 at 07:16:06AM +, Hayato Kuroda (Fujitsu) wrote:
> > I have noticed that the testcase subscription/033_run_as_table_owner in the
> > subscription is not executed when meson build system is chosen. The case is 
> > not
> > listed in the meson.build.
> >
> > Do we have any reasons or backgrounds about it?
> > PSA the patch to add the case. It works well on my env.
>
> Seems like a thinko of 4826759 to me, that's easy to miss.  Will fix
> in a bit..

Good catch.

Checking similar oversights,
src/bin/pg_basebackup/t/011_in_place_tablespace.pl seems not to be
listed in meson.build too.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




MergeJoin beats HashJoin in the case of multiple hash clauses

2023-06-15 Thread Andrey Lepikhov

Hi, all.

Some of my clients use JOINs with three or four clauses. Quite
frequently, I see complaints about an unreasonable switch of the JOIN
algorithm to Merge Join instead of Hash Join. Quick research has shown one
weak place - the estimation of the average bucket size in
final_cost_hashjoin (see q2.sql in the attachment), which uses a very
conservative strategy.
Unlike the estimation of groups, here we use the smallest ndistinct value
across all buckets instead of multiplying them (or trying to do multivariate
analysis).
It works fine for the case of one clause. But if we have many clauses,
and if each has a high ndistinct value, we will overestimate the average
size of a bucket and, as a result, prefer to use Merge Join. As the
example in the attachment shows, it leads to a worse plan than necessary,
sometimes drastically worse.
I assume this is done out of fear of functional dependencies between hash
clause components. But in my opinion, we should go the same way here as in
the estimation of groups.

The attached patch shows a sketch of the solution.
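
Since q2.sql is only referenced as an attachment, here is a self-contained
illustration of the kind of data involved (schema and distribution are my
own example, not the attached test case): three join clauses, each with only
100 distinct values on its own, but mutually independent, so the combination
has about 1,000,000 distinct values:

create table a (x int, y int, z int);
create table b (x int, y int, z int);
insert into a select i % 100, (i / 100) % 100, (i / 10000) % 100
  from generate_series(0, 999999) i;
insert into b select i % 100, (i / 100) % 100, (i / 10000) % 100
  from generate_series(0, 999999) i;
vacuum analyze a;
vacuum analyze b;
-- each per-clause bucket-size estimate is ~1/100; taking their minimum makes
-- the hash join look expensive, while multiplying them (as the patch does)
-- is much closer to the ~1/1000000 reality
explain select * from a join b on a.x = b.x and a.y = b.y and a.z = b.z;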

--
regards,
Andrey Lepikhov
Postgres Professional

q2.sql
Description: application/sql
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ef475d95a1..26f26d6a40 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4033,11 +4033,12 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
thismcvfreq = restrictinfo->left_mcvfreq;
}
 
-   if (innerbucketsize > thisbucketsize)
-   innerbucketsize = thisbucketsize;
-   if (innermcvfreq > thismcvfreq)
-   innermcvfreq = thismcvfreq;
+   innerbucketsize *= thisbucketsize;
+   innermcvfreq *= thismcvfreq;
}
+
+   if (innerbucketsize > virtualbuckets)
+   innerbucketsize = 1.0 / virtualbuckets;
}
 
/*


Re: Bypassing shared_buffers

2023-06-15 Thread Vladimir Churyukin
On Thu, Jun 15, 2023 at 12:32 AM Konstantin Knizhnik 
wrote:

>
>
> On 15.06.2023 4:37 AM, Vladimir Churyukin wrote:
> > Ok, got it, thanks.
> > Is there any alternative approach to measuring the performance as if
> > the cache was empty?
> > The goal is basically to calculate the max possible I/O time for a
> > query, to get a range between min and max timing.
> > It's ok if it's done during EXPLAIN ANALYZE call only, not for regular
> > executions.
> > One thing I can think of is even if the data in storage might be
> > stale, issue read calls from it anyway, for measuring purposes.
> > For EXPLAIN ANALYZE it should be fine as it doesn't return real data
> > anyway.
> > Is it possible that some pages do not exist in storage at all? Is
> > there a different way to simulate something like that?
> >
>
> I do not completely understand what you want to measure: how fast the cache
> can be prewarmed, or what the performance is when the working set doesn't
> fit in memory?
>
>
No, it's not about working set or prewarming speed.
We're trying to see what is the worst performance in terms of I/O, i.e.
when the database just started up or the data/indexes being queried are not
cached at all.

Why not changing `shared_buffers` size to some very small values (i.e.
> 1MB) doesn't work?
>
As it was already noticed, there are levels of caching: shared buffers
> and OS file cache.
> By reducing size of shared buffers you rely mostly on OS file cache.
> And actually there is no big gap in performance here - at most workloads
> I didn't see more than 15% difference).
>

I thought about the option of setting minimal shared_buffers, but it
requires a server restart anyway, something I'd like to avoid.

> You can certainly flush OS cache `echo 3 > /proc/sys/vm/drop_caches` and
> so simulate cold start.
> But the OS cache will be prewarmed quite fast (unlike shared buffers, because
> of strange Postgres ring-buffer strategies which cause eviction of pages
> from shared buffers even if there is a lot of free space).
>
> So please more precisely specify the goal of your experiment.
> "max possible I/O time for a query" depends on so many factors...
> Do you consider just one client working in isolation or there will be
> many concurrent queries and background tasks like autovacuum and
> checkpointer  competing for the resources?
>

> My point is that if you need some deterministic result then you will
> have to exclude a lot of different factors which may affect performance
> and then ... you calculate speed of horse in vacuum, which has almost no
> relation to real performance.
>
>
Exactly, we need more or less deterministic results for how bad I/O timings
can be.
Even though it's not necessarily the numbers we will be getting in real
life, it gives us ideas about distribution,
and it's useful because we care about the long tail (p99+) of our queries.
For simplicity let's say it will be a single client only (it will be hard
to do the proposed solutions reliably with other stuff running in parallel
anyway).

-Vladimir Churyukin


Re: Replace (GUC_UNIT_MEMORY | GUC_UNIT_TIME) with GUC_UNIT in guc.c

2023-06-15 Thread Masahiko Sawada
On Thu, Jun 15, 2023 at 11:02 AM Masahiko Sawada  wrote:
>
> On Wed, Jun 14, 2023 at 4:47 PM Michael Paquier  wrote:
> >
> > On Wed, Jun 14, 2023 at 03:38:10PM +0800, Richard Guo wrote:
> > > +1.  BTW, it seems that GUC_UNIT_TIME is not used anywhere except in
> > > GUC_UNIT.  I was wondering if we can retire it, but maybe we'd better
> > > not.  It still indicates that we need to use time units table.
> >
> > Some out-of-core code declaring custom GUCs could rely on that, so
> > it is better not to remove it.
>
> +1 not to remove it.
>
> I've attached the patch to fix  (GUC_UNIT_MEMORY | GUC_UNIT_TIME)
> thing, and am going to push it later today to only master branch.

Pushed.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Fix a typo in rewriteHandler.c

2023-06-15 Thread Sho Kato (Fujitsu)
Hi,

I've attached the patch for the following rewriteTargetView comments.


  Assert(parsetree->resultRelation == new_rt_index);

/*
 * For INSERT/UPDATE we must also update resnos in the targetlist to refer
 * to columns of the base relation, since those indicate the target
 * columns to be affected.
 *
 * Note that this destroys the resno ordering of the targetlist, but that
 * will be fixed when we recurse through rewriteQuery, which will invoke
 * rewriteTargetListIU again on the updated targetlist.
 */
if (parsetree->commandType != CMD_DELETE)
{
foreach(lc, parsetree->targetList)

s/rewriteQuery/RewriteQuery

regards,
Sho Kato
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 5a7b914183..b486ab559a 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -3362,7 +3362,7 @@ rewriteTargetView(Query *parsetree, Relation view)
 * columns to be affected.
 *
 * Note that this destroys the resno ordering of the targetlist, but that
-* will be fixed when we recurse through rewriteQuery, which will invoke
+* will be fixed when we recurse through RewriteQuery, which will invoke
 * rewriteTargetListIU again on the updated targetlist.
 */
if (parsetree->commandType != CMD_DELETE)


Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 04:22, jian he wrote:
> Attachments:
> * temp.patch

Thanks for good suggestions.
New patch attached:

Enhance parsing and reorder headers in hashset module

Allow whitespaces in hashset input and reorder the inclusion of
header files, placing PostgreSQL headers first. Additionally, update
deprecated pq_sendint calls to pq_sendint32. Add tests for improved
parsing functionality.
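
For example, with the whitespace change an input written with spaces should
now be accepted (illustrative, using the type name from earlier in the
thread):

select '{1, 2, 3}'::int4hashset;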

/Joel

hashset-0.0.1-1fd790f
Description: Binary data


Re: subscription/033_run_as_table_owner is not listed in the meson.build

2023-06-15 Thread Michael Paquier
On Thu, Jun 15, 2023 at 07:16:06AM +, Hayato Kuroda (Fujitsu) wrote:
> I have noticed that the testcase subscription/033_run_as_table_owner in the
> subscription is not executed when meson build system is chosen. The case is 
> not
> listed in the meson.build.
> 
> Do we have any reasons or backgrounds about it?
> PSA the patch to add the case. It works well on my env.

Seems like a thinko of 4826759 to me, that's easy to miss.  Will fix
in a bit..
--
Michael


signature.asc
Description: PGP signature


Re: Support to define custom wait events for extensions

2023-06-15 Thread Michael Paquier
On Thu, Jun 15, 2023 at 03:06:01PM +0900, Masahiro Ikeda wrote:
> Currently, only one PG_WAIT_EXTENSION event can be used as a
> wait event for extensions. Therefore, in environments where multiple
> extensions are installed, it could take time to identify which
> extension is the bottleneck.

Thanks for taking the time to implement a patch to do that.

> I want to know your feedbacks. Please feel free to comment.

I think that's been cruelly missed.

> In the core, the requested wait events are dynamically registered in shared
> memory. The extension then obtains the wait event information with
> GetNamedExtensionWaitEventTranche() and uses the value to notify the core
> that it is waiting.
> 
> When a string representing of the wait event is requested,
> it returns the name defined by calling
> RequestNamedExtensionWaitEventTranche().

So this implements the equivalent of RequestNamedLWLockTranche()
followed by GetNamedLWLockTranche() to get the wait event number,
which can be used only during postmaster startup.  Do you think that
we could focus on implementing something more flexible instead, that
can be used dynamically as well as statically?  That would be similar
to LWLockNewTrancheId() and LWLockRegisterTranche(), actually, where
we would get one or more tranche IDs, then do initialization actions
in shmem based on the tranche ID(s).
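
To make the shape of that a bit more concrete, the extension-facing side
could end up looking something like this; the two registration functions are
hypothetical names mirroring the LWLock tranche API and do not exist today,
and naptime_ms stands in for whatever the extension actually waits on:

    /* hypothetical core API, analogous to LWLockNewTrancheId()/LWLockRegisterTranche() */
    extern uint32 WaitEventExtensionNew(void);
    extern void WaitEventExtensionRegisterName(uint32 wait_event_info, const char *name);

    /* in the extension: usable after startup, not only from shared_preload_libraries */
    static uint32 my_wait_event = 0;

    ...
    my_wait_event = WaitEventExtensionNew();
    WaitEventExtensionRegisterName(my_wait_event, "my_extension_main");

    /* later, around a wait; WaitLatch() reports the wait event for us */
    (void) WaitLatch(MyLatch,
                     WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                     naptime_ms,
                     my_wait_event);
    ResetLatch(MyLatch);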

> 4. check the custom wait event
> You can see the following results of psql-1.
> 
> query| wait_event_type |wait_event
> -+-+---
>  SELECT inject_wait_event(); | Extension   | custom_wait_event#
> requested wait event by the extension!
> (1 row)
> 
> (..snip..)

A problem with this approach is that it is expensive as a test.  Do we
really need one?  There are three places that set PG_WAIT_EXTENSION in
src/test/modules/, more in /contrib, and there are modules like
pg_stat_statements that could gain from events for I/O operations, for
example.

> # TODOs
> 
> * tests on windows (since I tested on Ubuntu 20.04 only)
> * add custom wait events for existing contrib modules (ex. postgres_fdw)
> * add regression code (but, it seems to be difficult)
> * others? (Please let me know)

Hmm.  You would need to maintain a state in a rather stable manner,
and SQL queries can make that difficult in the TAP tests as the wait
event information is reset each time a query finishes.  One area where
I think this gets easier is with a background worker loaded with
shared_preload_libraries that has a configurable naptime.  Looking at
what's available in the tree, the TAP tests of pg_prewarm could use
one test on pg_stat_activity with a custom wait event name
(pg_prewarm.autoprewarm_interval is 0 hence the bgworker waits
infinitely).  Note that in this case, you would need to be careful of
the case where the wait event is loaded dynamically, but like LWLocks
this should be able to work as well?
--
Michael


signature.asc
Description: PGP signature


Re: Consistent coding for the naming of LR workers

2023-06-15 Thread Amit Kapila
On Thu, Jun 15, 2023 at 8:13 AM Peter Smith  wrote:
>
> There are different types of Logical Replication workers -- e.g.
> tablesync workers, apply workers, and parallel apply workers.
>
> The logging and errors often name these worker types, but during a
> recent code review, I noticed some inconsistency in the way this is
> done:
> a) there is a common function get_worker_name() to return the name for
> the worker type,  -- OR --
> b) the worker name is just hardcoded in the message/error
>
> I think it is not ideal to cut/paste the same hardwired strings over
> and over. IMO it just introduces an unnecessary risk of subtle naming
> differences creeping in.
>
> ~~
>
> It is better to have a *single* point where these worker names are
> defined, so then all output uses identical LR worker nomenclature.
>

+1. I think it makes error strings in the code look a bit shorter.  I
think it is better to park the patch for the next CF to avoid
forgetting about it.


-- 
With Regards,
Amit Kapila.




Re: generate syscache info automatically

2023-06-15 Thread John Naylor
On Wed, May 31, 2023 at 4:58 AM Peter Eisentraut 
wrote:
>
> I want to report on my on-the-plane-to-PGCon project.
>
> The idea was mentioned in [0].  genbki.pl already knows everything about
> system catalog indexes.  If we add a "please also make a syscache for
> this one" flag to the catalog metadata, we can have genbki.pl produce
> the tables in syscache.c and syscache.h automatically.
>
> Aside from avoiding the cumbersome editing of those tables, I think this
> layout is also conceptually cleaner, as you can more easily see which
> system catalog indexes have syscaches and maybe ask questions about why
> or why not.

When this has come up before, one objection was that index declarations
shouldn't know about cache names and bucket sizes [1]. The second paragraph
above makes a reasonable case for that, however. I believe one alternative
idea was for a script to read the enum, which would look something like
this:

#define DECLARE_SYSCACHE(cacheid,indexname,numbuckets) cacheid

enum SysCacheIdentifier
{
DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
...
};

...which would then look up the other info in the usual way from Catalog.pm.

> As a possible follow-up, I have also started work on generating the
> ObjectProperty structure in objectaddress.c.  One of the things you need
> for that is making genbki.pl aware of the syscache information.  There
> is some more work to be done there, but it's looking promising.

I haven't studied this, but it seems interesting.

One other possible improvement: syscache.c has a bunch of #include's, one
for each catalog with a cache, so there's still a bit of manual work in
adding a cache, and the current #include list is a bit cumbersome. Perhaps
it's worth it to have the script emit them as well?

I also wonder if at some point it will make sense to split off a separate
script(s) for some things that are unrelated to the bootstrap data.
genbki.pl is getting pretty large, and there are additional things that
could be done with syscaches, e.g. inlined eq/hash functions for cache
lookup [2].

[1] https://www.postgresql.org/message-id/12460.1570734...@sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de

--
John Naylor
EDB: http://www.enterprisedb.com


Re: trying again to get incremental backup

2023-06-15 Thread Dilip Kumar
On Thu, Jun 15, 2023 at 2:11 AM Andres Freund  wrote:
>

> > I'm not really sure. I expect Dilip would be happy to post his patch,
> > and if you'd be willing to have a look at it and express your concerns
> > or lack thereof, that would be super valuable.
>
> Will do. Adding me to CC: might help, I have a backlog unfortunately :(.

Thanks, I have posted it here[1]

[1] 
https://www.postgresql.org/message-id/CAFiTN-s-K%3DmVA%3DHPr_VoU-5bvyLQpNeuzjq1ebPJMEfCJZKFsg%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




New WAL record to detect the checkpoint redo location

2023-06-15 Thread Dilip Kumar
As discussed [1 ][2] currently, the checkpoint-redo LSN can not be
accurately detected while processing the WAL. Although we have a
checkpoint WAL record containing the exact redo LSN, other WAL records
may be inserted between the checkpoint-redo LSN and the actual
checkpoint record. If we want to stop processing WAL exactly at the
checkpoint-redo location, we cannot do that because we would have
already processed some extra records that got added after the redo
LSN.

The patch inserts a special WAL record named CHECKPOINT_REDO,
which is located exactly at the checkpoint-redo location.  We can
guarantee this record to be exactly at the checkpoint-redo point
because we already hold the exclusive WAL insertion lock while
identifying the checkpoint redo point and can insert this special
record exactly at the same time so that there are no concurrent WAL
insertions.

[1] 
https://www.postgresql.org/message-id/CA%2BTgmoYOYZfMCyOXFyC-P%2B-mdrZqm5pP2N7S-r0z3_402h9rsA%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/20230614194717.jyuw3okxup4cvtbt%40awork3.anarazel.de

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v1-0001-New-WAL-record-to-identify-check-redo-location.patch
Description: Binary data


Re: Bypassing shared_buffers

2023-06-15 Thread Konstantin Knizhnik




On 15.06.2023 4:37 AM, Vladimir Churyukin wrote:

Ok, got it, thanks.
Is there any alternative approach to measuring the performance as if 
the cache was empty?
The goal is basically to calculate the max possible I/O time for a 
query, to get a range between min and max timing.
It's ok if it's done during EXPLAIN ANALYZE call only, not for regular 
executions.
One thing I can think of is even if the data in storage might be 
stale, issue read calls from it anyway, for measuring purposes.
For EXPLAIN ANALYZE it should be fine as it doesn't return real data 
anyway.
Is it possible that some pages do not exist in storage at all? Is 
there a different way to simulate something like that?




I do not completely understand what you want to measure: how fast the cache
can be prewarmed, or what the performance is when the working set doesn't
fit in memory?

Why doesn't changing the `shared_buffers` size to some very small value
(i.e. 1MB) work?
As it was already noticed, there are levels of caching: shared buffers 
and OS file cache.

By reducing size of shared buffers you rely mostly on OS file cache.
And actually there is no big gap in performance here - at most workloads 
I didn't see more than 15% difference).


You can certainly flush OS cache `echo 3 > /proc/sys/vm/drop_caches` and 
so simulate cold start.
But the OS cache will be prewarmed quite fast (unlike shared buffers, because
of strange Postgres ring-buffer strategies which cause eviction of pages
from shared buffers even if there is a lot of free space).

So please more precisely specify the goal of your experiment.
"max possible I/O time for a query" depends on so many factors...
Do you consider just one client working in isolation or there will be 
many concurrent queries and background tasks like autovacuum and 
checkpointer  competing for the resources?


My point is that if you need some deterministic result then you will 
have to exclude a lot of different factors which may affect performance
and then ... you calculate speed of horse in vacuum, which has almost no 
relation to real performance.










Re: Initial Schema Sync for Logical Replication

2023-06-15 Thread Peter Smith
On Thu, Jun 15, 2023 at 4:14 PM Peter Smith  wrote:
>
> On Thu, Jun 8, 2023 at 1:24 PM Masahiko Sawada  wrote:
> >
> ...
>
> > We also need to research how to integrate the initial schema
> > synchronization with tablesync workers.  We have a PoC patch[2].
> >
> > Regards,
> >
> > [1] 
> > https://wiki.postgresql.org/wiki/Logical_replication_of_DDLs#Initial_Schema_Sync
> > [2] 
> > https://www.postgresql.org/message-id/CAD21AoCdfg506__qKz%2BHX8vqfdyKgQ5qeehgqq9bi1L-6p5Pwg%40mail.gmail.com
> >
>
> FYI -- the PoC patch fails to apply using HEAD fetched today.
>
> git apply 
> ../patches_misc/0001-Poc-initial-table-structure-synchronization-in-logic.patch
> error: patch failed: src/backend/replication/logical/tablesync.c:1245
> error: src/backend/replication/logical/tablesync.c: patch does not apply
>

After rebasing the PoC patch locally, I found the 'make check' still
did not pass 100%.

# 2 of 215 tests failed.

Here are the differences:

diff -U3 /home/postgres/oss_postgres_misc/src/test/regress/expected/rules.out
/home/postgres/oss_postgres_misc/src/test/regress/results/rules.out
--- /home/postgres/oss_postgres_misc/src/test/regress/expected/rules.out
   2023-06-02 23:12:32.073864475 +1000
+++ /home/postgres/oss_postgres_misc/src/test/regress/results/rules.out
2023-06-15 16:53:29.352622676 +1000
@@ -2118,14 +2118,14 @@
 su.subname,
 st.pid,
 st.leader_pid,
-st.relid,
+st.subrelid,
 st.received_lsn,
 st.last_msg_send_time,
 st.last_msg_receipt_time,
 st.latest_end_lsn,
 st.latest_end_time
FROM (pg_subscription su
- LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid, relid,
pid, leader_pid, received_lsn, last_msg_send_time,
last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid
= su.oid)));
+ LEFT JOIN pg_stat_get_subscription(NULL::oid) st(subid,
subrelid, pid, leader_pid, received_lsn, last_msg_send_time,
last_msg_receipt_time, latest_end_lsn, latest_end_time) ON ((st.subid
= su.oid)));
 pg_stat_subscription_stats| SELECT ss.subid,
 s.subname,
 ss.apply_error_count,
diff -U3 /home/postgres/oss_postgres_misc/src/test/regress/expected/oidjoins.out
/home/postgres/oss_postgres_misc/src/test/regress/results/oidjoins.out
--- /home/postgres/oss_postgres_misc/src/test/regress/expected/oidjoins.out
2022-10-04 15:11:32.457834981 +1100
+++ /home/postgres/oss_postgres_misc/src/test/regress/results/oidjoins.out
 2023-06-15 16:54:07.159839010 +1000
@@ -265,4 +265,3 @@
 NOTICE:  checking pg_subscription {subdbid} => pg_database {oid}
 NOTICE:  checking pg_subscription {subowner} => pg_authid {oid}
 NOTICE:  checking pg_subscription_rel {srsubid} => pg_subscription {oid}
-NOTICE:  checking pg_subscription_rel {srrelid} => pg_class {oid}

--
Kind Regards,
Peter Smith.
Fujitsu Australia




subscription/033_run_as_table_owner is not listed in the meson.build

2023-06-15 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

I have noticed that the testcase subscription/033_run_as_table_owner in the
subscription is not executed when meson build system is chosen. The case is not
listed in the meson.build.

Do we have any reasons or backgrounds about it?
PSA the patch to add the case. It works well on my env.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



add_test.patch
Description: add_test.patch


Re: pg_collation.collversion for C.UTF-8

2023-06-15 Thread Thomas Munro
On Sun, Apr 23, 2023 at 5:22 AM Daniel Verite  wrote:
> I understand that my proposal to version C.* like any other collation
> might be erring on the side of caution, but ignoring these collation
> changes on at least one major OS does not feel right either.
> Maybe we should consider doing platform-dependent checks?

Hmm, OK let's explore that.  What could we do that would be helpful
here, without affecting users of the "true" C.UTF-8 for the rest of
time?  This is a Debian (+ downstream distro) only problem as far as
we know so far, and only for Debian 11 and older.  Debian 12 was just
released the other day as the new stable, so the window of opportunity
to actually help anyone with our actions in this area is now beginning
to close, because we're really talking about new databases initdb'd
and new COLLATIONs CREATEd on Debian old stable after our next
(August) release.  The window may be much wider for long term Ubuntu
releases.  You're right that we could do something platform-specific
to help with that: we could change that code so that it captures the
version for C.* #if __GLIBC_MAJOR__ == 2 && __GLIBC_MINOR__ < 35 (or
we could parse the string returned by the runtime version function).
I don't recall immediately what our warning code does if it sees a
change from versioned to not versioned, but we could make sure it does
what you want here.  That way we wouldn't burden all future users of
C.* with version warnings (because it'll be empty), but we'd catch
those doing Debian 11 -> 12 upgrades, and whatever Ubuntu upgrades
that corresponds to, etc.  Is it worth it?
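
Concretely, the compile-time variant would just be a guard in the code that
decides whether to record a collversion for C.*, something like the
following (purely illustrative, not a posted patch; the standard macro
spellings are __GLIBC__ and __GLIBC_MINOR__, and the runtime variant would
parse gnu_get_libc_version() instead):

#if defined(__GLIBC__) && __GLIBC__ == 2 && __GLIBC_MINOR__ < 35
    /* Downstream-patched C.* locales have been observed to change, so
     * capture a version for them like for any other libc collation. */
    record_version_for_c_locales = true;    /* hypothetical flag */
#else
    /* true C.UTF-8 (glibc >= 2.35): codepoint order, leave collversion empty */
    record_version_for_c_locales = false;
#endif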




Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Konstantin Knizhnik



On 15.06.2023 1:23 AM, James Addison wrote:

On Tue, 13 Jun 2023 at 07:55, Konstantin Knizhnik  wrote:



On 12.06.2023 3:23 PM, Pavel Borisov wrote:

Is the following true or not?

1. If we switch processes to threads but leave the amount of session
local variables unchanged, there would be hardly any performance gain.
2. If we move some backend's local variables into shared memory then
the performance gain would be very near to what we get with threads
having equal amount of session-local variables.

In other words, the overall goal in principle is to gain from less
memory copying wherever it doesn't add the burden of locks for
concurrent variables access?

Regards,
Pavel Borisov,
Supabase



IMHO both statements are not true.
Switching to threads will cause less context switch overhead (because
all threads are sharing the same memory space and so preserve the TLB).
How big will this advantage be? In my prototype I got ~10%. But maybe
it is possible to find workloads where it is larger.

Hi Konstantin - do you have code/links that you can share for the
prototype and benchmarks used to gather those results?



Sorry, I have already shared the link:
https://github.com/postgrespro/postgresql.pthreads/

As you can see last commit was 6 years ago when I stopped work on this 
project.

Why?  I already tried to explain it:
- benefits from switching to threads were not so large. Maybe I just
failed to find a proper workload, but it was a more or less expected result,
because most of the code was not changed - it uses the same sync
primitives, the same local catalog/relation caches, ...
To take full advantage of the multithreading model it is necessary to rewrite
many components, especially those related to interprocess communication.
But maintaining such a fork of Postgres and synchronizing it with mainstream
requires too much effort, and I was not able to do it myself.


There are three different but related directions of improving current 
Postgres:

1. Replacing processes with threads
2. Builtin connection pooler
3. Lightweight backends (shared catalog/relation/prepared statements caches)

The motivation for such changes are also similar:
1. Increase Postgres scalability
2. Reduce memory consumption
3. Make Postgres better fit cloud and serverless requirements

I am not sure now which one should be addressed first or them can be 
done together.


Replacing static variables with thread-local is the first and may be the 
easiest step.
It requires more or less mechanical changes. More challenging thing is 
replacing private per-backend data structures

with shared ones (caches, file descriptors,...)