Re: 15beta1 crash on mips64el in pg_regress/triggers

2022-05-20 Thread Christoph Berg
Re: Tom Lane
> Many kilowatt-hours later, I've entirely failed to reproduce this
> on gcc230.  Not sure how to investigate further.  Given that your
> original build machine is so slow, could it be timing-related?
> Hard to see how, given the location of the crash, but ...

My other rebuild (on yet another machine) also passed fine, so we can
possibly attribute that to some hardware glitch on the original
machine. But it's being used as a regular buildd for Debian, so I
guess it would have already been noticed if there was any general
problem with it. I'll try reaching out to the buildd folks to see if they
know anything.

https://buildd.debian.org/status/recent.php?bad_results_only=on&a=mips64el&suite=experimental

Christoph




Re: Handle infinite recursion in logical replication setup

2022-05-20 Thread vignesh C
On Wed, May 18, 2022 at 1:40 PM shiy.f...@fujitsu.com
 wrote:
>
> On Wed, May 4, 2022 2:47 PM vignesh C  wrote:
> >
> > Thanks for the comments, the attached v13 patch has the changes for the 
> > same.
> >
>
> Thanks for your patch. Here are some comments on v13-0001 patch.
>
> 1)
> @@ -152,6 +152,18 @@ CREATE SUBSCRIPTION  class="parameter">subscription_name  
> 
>
> +   
> +local_only (boolean)
> +
> + 
> +  Specifies whether the subscription will request the publisher to 
> send
> +  locally originated changes at the publisher node, or send any
> +  publisher node changes regardless of their origin. The default is
> +  false.
> + 
> +
> +   
> +
> 
>  slot_name (string)
>  
>
> I think this change should be put after "The following parameters control the
> subscription's replication behavior after it has been created", thoughts?

Modified

> 2)
> A new column "sublocalonly" is added to pg_subscription, so maybe we need add 
> it
> to pg_subscription document, too. (in doc/src/sgml/catalogs.sgml)

Modified

> 3)
> /*
>  * Currently we always forward.
>  */
> static bool
> pgoutput_origin_filter(LogicalDecodingContext *ctx,
>RepOriginId origin_id)
>
> Should we modify the comment of pgoutput_origin_filter()? It doesn't match the
> code.

Modified

Thanks for the comments; the attached v14 patch at [1] includes these
changes.
[1] - 
https://www.postgresql.org/message-id/CALDaNm0xuYy35vOudVHBjov3fQ%3DjBRHJHKUUN9VarqO%3DYqtaxg%40mail.gmail.com

Regards,
Vignesh




Re: Handle infinite recursion in logical replication setup

2022-05-20 Thread vignesh C
On Wed, May 18, 2022 at 4:22 PM Amit Kapila  wrote:
>
> On Wed, May 18, 2022 at 10:29 AM Amit Kapila  wrote:
> >
> > On Wed, May 4, 2022 at 12:17 PM vignesh C  wrote:
> > >
> > > Thanks for the comments, the attached v13 patch has the changes for the 
> > > same.
> > >
> >
> > Few comments on v13-0001
> > ==
> >
>
> Few comments on v13-0002
> ===
> 1.
> The steps
> + to create a two-node bidirectional replication are given below:
> +   
>
> The steps given after this will be successful only when there is no
> data in any of the nodes and that is not clear by reading docs.

Modified

> 2.
> +  
> +   Adding a new node when data is present in the existing 
> nodes
> +
> + Adding a new node node3 to the existing
> + node1 and node2 when data is 
> present
> + in existing nodes node1 and node2
> + needs similar steps. The only change required here is that
> + node3 should create a subscription with
> + copy_data = force to one of the existing nodes to
> + receive the existing data during initial data synchronization.
> +   
>
> I think the steps for these require the user to lock the required
> tables/database (or in some other way hold the operations on required
> tables) for node-2 till the time setup is complete, otherwise, node-3
> might miss some data. It seems the same is also missing in the
> section: "Adding a new node when data is present in the new node".

Modified. Mentioned that a lock is required on the new node-3 and on
node-2. I have mentioned that node-3 should also be locked so that no
operations happen on node-3; otherwise some DML operation could occur
after the truncate, get sent to node-1, and then the same data would be
synced again when the subscription is created in node-3 subscribing to
the publisher in node-1.

> 3.
> +
> +  
> +   Adding a new node when data is present in the new node
> ...
> ...
>
> +   
> +Create a subscription in node1 to subscribe to
> +node3. Use copy_data specified as
> +force so that the existing table data is copied during
> +initial sync:
> +
> +node1=# CREATE SUBSCRIPTION sub_node1_node3
> +node1-# CONNECTION 'dbname=foo host=node3 user=repuser'
> +node1-# PUBLICATION pub_node3
> +node1-# WITH (copy_data = force, local_only = on);
> +CREATE SUBSCRIPTION
> +
> +
> +   
> +Create a subscription in node2 to subscribe to
> +node3. Use copy_data specified as
> +force so that the existing table data is copied during
> +initial sync:
> +
> +node2=# CREATE SUBSCRIPTION sub_node2_node3
> +node2-# CONNECTION 'dbname=foo host=node3 user=repuser'
> +node2-# PUBLICATION pub_node3
> +node2-# WITH (copy_data = force, local_only = on);
> +CREATE SUBSCRIPTION
> +
>
> Why do we need to use "copy_data = force" here? AFAIU, unless, we
> create any subscription on node-3, we don't need the 'force' option.

Modified to on.

> 4. We should have a generic section to explain how users can add a new
> node using the new options to the existing set of nodes in all cases.
> For example, say when the existing set of nodes has some data and the
> new node also has some pre-existing data. I think the basic steps are
> something like: a. create a required publication(s) on the new node.
> (b) create subscriptions on existing nodes pointing to publication on
> the new node with the local_only option as true and copy_data = on.
> (c) wait for data to be copied from the new node to existing nodes.
> (d) Truncate the data on the new node. (e) create subscriptions
> corresponding to each of the existing publisher nodes on the new node
> with local_only as true and copy_data = force for one of the nodes.
> (f) One needs to ensure that there is no new activity on required
> tables/database (either by locking or in some other way) in all the
> nodes for which copy_data option is kept as false while creating
> subscriptions in the previous step to avoid any data loss.

Added

> We also need to mention in Notes that as all operations are not
> transactional, user is advised to take backup of existing data to
> avoid any inconsistency.

Modified

> 5.
>  * It is quite possible that subscriber has not yet pulled data to
> + * the tables, but in ideal cases the table data will be subscribed.
> + * To keep the code simple it is not checked if the subscriber table
> + * has pulled the data or not.
> + */
> + if (copydata == COPY_DATA_ON && local_only && !slot_attisnull(slot, 3))
>
> Sorry, but I don't understand what you intend to say by the above
> comment. Can you please explain?

When the user specifies copy_data as on, we should check whether the
publisher's publication tables are themselves being subscribed from a
remote publisher. If so, throw an error, because remote origin data may
be present.
Ex:
Node1 - pub1 for table t1 -- no data
Node2 - Sub1 subscribing to data from pub1
Node2 - pub2 for table t1 -- no data
Node3 - create subscription to Node2 with copy_data = ON

In this case even though the table does not have any remote origin
data,  as Node

Re: PG15 beta1 sort performance regression due to Generation context change

2022-05-20 Thread Heikki Linnakangas

On 20/05/2022 08:56, David Rowley wrote:

The problem is that generation chunks have a larger chunk header than
aset do due to having to store the block pointer that the chunk
belongs to so that GenerationFree() can increment the nfree chunks in
the block. aset.c does not require this as freed chunks just go onto a
freelist that's global to the entire context.


Could the 'context' field be moved from GenerationChunk to GenerationBlock?
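
Something along these lines, as a rough sketch (field names from memory and
simplified; MEMORY_CONTEXT_CHECKING parts and the exact member order ignored):

typedef struct GenerationBlock
{
    dlist_node  node;           /* in the context's list of blocks */
    GenerationContext *context; /* the 'context' field would move here */
    Size        blksize;
    int         nchunks;
    int         nfree;
    char       *freeptr;
    char       *endptr;
} GenerationBlock;

typedef struct GenerationChunk
{
    GenerationBlock *block;     /* block owning this chunk */
    /* no per-chunk 'context' anymore: reach it via chunk->block->context */
    Size        size;
} GenerationChunk;

The obvious wrinkle is that GetMemoryChunkContext() currently expects to find
the owning context pointer directly in front of the chunk, so the generic
lookup would need an extra indirection for this to work.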

- Heikki




Re: Handle infinite recursion in logical replication setup

2022-05-20 Thread vignesh C
On Thu, May 19, 2022 at 2:12 PM shiy.f...@fujitsu.com
 wrote:
>
> On Wed, May 4, 2022 2:47 PM vignesh C  wrote:
> >
> > Thanks for the comments, the attached v13 patch has the changes for the 
> > same.
> >
>
> Here are some comments on v13-0002 patch.
>
> 1)
> +* Throw an error so that the user can take care of the 
> initial data
> +* copying and then create subscription with copy_data off.
>
> Should "with copy_data off" be changed to "with copy_data off or force"?

Modified

> 2)
> case ALTER_SUBSCRIPTION_ADD_PUBLICATION:
> case ALTER_SUBSCRIPTION_DROP_PUBLICATION:
> ...
> /*
>  * See ALTER_SUBSCRIPTION_REFRESH for 
> details why this is
>  * not allowed.
>  */
> if (sub->twophasestate == 
> LOGICALREP_TWOPHASE_STATE_ENABLED && opts.copy_data)
>
> I think we need some changes here, too. Should it be modified to:
> if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED && 
> IS_COPY_DATA_ON_OR_FORCE(opts.copy_data))

Modified

Thanks for the comments; the attached v14 patch at [1] includes these
changes.
[1] - 
https://www.postgresql.org/message-id/CALDaNm0xuYy35vOudVHBjov3fQ%3DjBRHJHKUUN9VarqO%3DYqtaxg%40mail.gmail.com

Regards,
Vignesh




Re: Build-farm - intermittent error in 031_column_list.pl

2022-05-20 Thread Tomas Vondra



On 5/20/22 05:58, Amit Kapila wrote:
> On Fri, May 20, 2022 at 6:58 AM Kyotaro Horiguchi
>  wrote:
>>
 the apply worker will use the existing slot and replication origin
 corresponding to the subscription. Now, it is possible that before
 restart the origin has not been updated and the WAL start location
 points to a location prior to where PUBLICATION pub9 exists which can
 lead to such an error. Once this error occurs, apply worker will never
 be able to proceed and will always return the same error. Does this
 make sense?
>>
>> Wow. I didn't think of that line. That theory explains the silence and
>> makes sense even though I don't see LSN transitions that clearly
>> support it.  I dimly remember a similar kind of problem..
>>
 Unless you or others see a different theory, this seems to be the
 existing problem in logical replication which is manifested by this
 test. If we just want to fix these test failures, we can create a new
 subscription instead of altering the existing publication to point to
 the new publication.

>>>
>>> If the above theory is correct then I think allowing the publisher to
>>> catch up with "$node_publisher->wait_for_catchup('sub1');" before
>>> ALTER SUBSCRIPTION should fix this problem. Because if before ALTER
>>> both publisher and subscriber are in sync then the new publication
>>> should be visible to WALSender.
>>
>> It looks right to me.
>>
> 
> Let's wait for Tomas or others working in this area to share their thoughts.
> 

Are we really querying the publications (in get_rel_sync_entry) using
the historical snapshot? I hadn't really realized this, but yeah, that
might explain the issue.

The new TAP test does ALTER SUBSCRIPTION ... SET PUBLICATION much more
often than any other test (there are ~15 calls, 12 of which are in this
new test). That might be why we haven't seen failures before. Or maybe
the existing tests simply are not vulnerable to this, because they
either do wait_for_catchup late enough or don't do any DML right before
executing SET PUBLICATION.

>>  That timetravel seems unintuitive but it's the
>> (current) way it works.
>>
> 
> I have thought about it but couldn't come up with a good way to change
> the way currently it works. Moreover, I think it is easy to hit this
> in other ways as well. Say, you first create a subscription with a
> non-existent publication and then do operation on any unrelated table
> on the publisher before creating the required publication, we will hit
> exactly this problem of "publication does not exist", so I think we
> may need to live with this behavior and write tests carefully.
> 

Yeah, I think it pretty much requires ensuring the subscriber is fully
caught up with the publisher, otherwise ALTER SUBSCRIPTION may break the
replication in an unrecoverable way (actually, you can alter the
subscription and remove the publication again, right?).

But this is not just about tests, of course - the same issue applies to
regular replication. That's a bit unfortunate, so maybe we should think
about making this less fragile.

We might make sure the subscriber is not lagging (essentially the
wait_for_catchup) - which the users will have to do anyway (although
maybe they know the publisher is beyond the LSN where it was created).

The other option would be to detect such case, somehow - if you don't
see the publication yet, see if it exists in current snapshot, and then
maybe ignore this error. But that has other issues (the publication
might have been created and dropped, in which case you won't see it).
Also, we'd probably have to ignore RelationSyncEntry for a while, which
seems quite expensive.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Addition of PostgreSQL::Test::Cluster::pg_version()

2022-05-20 Thread Michael Paquier
On Thu, May 19, 2022 at 07:28:53AM -0400, Andrew Dunstan wrote:
> Looks ok. PostgreSQL::Version is designed so that the object behaves
> sanely in comparisons and when interpolated into a string.

Okay, I have applied this thing.  I'll move back to my business with
the tests of pg_upgrade...
--
Michael




Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog

2022-05-20 Thread Tomas Vondra



On 5/19/22 19:59, Matthias van de Meent wrote:
> On Mon, 16 May 2022 at 00:09, Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
>>
>> On 5/15/22 21:55, Matthias van de Meent wrote:
>> > Note: I am not (currently) planning on implementing this rough idea,
>> > just putting it up to share and document the idea, on request of Tomas
>> > (cc-ed).
>> >
>> > The excellent pgconf.de presentation on PostgreSQL's extended
>> > statistics system by Tomas Vondra [0] talked about how the current
>> > default statistics assume the MCVs of columns to be fully independent,
>> > i.e. values of column A do not imply any value of columns B and C, and
>> > that for accurate data on correlated values the user needs to
>> > manually create statistics on the combined columns (by either
>> > STATISTICS or by INDEX).
>> >
>> > This is said to be due to limitations in our statistics collector: to
>> > determine the fraction of the table that contains the value, we store
>> > the N most common values with the fraction of their occurrence in the
>> > table. This value is quite exact, but combining these values proves
>> > difficult: there is nothing in the stored value that can confidently
>> > include or exclude parts of the table from a predicate using that MCV,
>> > so we can only assume that the values of two columns are independent.
>> >
>> > After the presentation it came to me that if we were to add an
>> > estimator for the number of rows with that value to the MCV lists in
>> > the form of HLL sketches (in addition to or replacing the current
>> > most_common_elem_freqs fractions), we would be able to make better
>> > estimates for multi-column filters by combining the HLL row
>> > cardinality sketches for filters that filter on these MCVs. This would
>> > remove the immediate need for manual statistics with a cartesian
>> > product of the MCVs of those columns with their occurrence fractions,
>> > which significantly reduces the need for the creation of manual
>> > statistics - the need that exists due to planner mis-estimates in
>> > correlated columns. Custom statistics will still be required for
>> > expression statistics, but column correlation estimations _within
>> > MCVs_ is much improved.
>> >
>> > How I imagine this would work is that for each value in the MCV, an
>> > HLL is maintained that estimates the amount of distinct tuples
>> > containing that value. This can be h(TID) or h(PK), or anything else
>> > that would uniquely identify returned tuples. Because the keyspace of
>> > all HLLs that are generated are on the same table, you can apply join
>> > and intersection operations on the HLLs of the MCVs (for OR and
>> > AND-operations respectively), and provide fairly accurately estimates
>> > for the amount of tuples that would be returned by the filter on that
>> > table.
>> >
>> > The required size of the HLL sketches can be determined by the amount
>> > of tuples scanned during analyze, potentially reducing the size
>> > required to store these HLL sketches from the usual 1.5kB per sketch
>> > to something smaller - we'll only ever need to count nTuples distinct
>> > values, so low values for default_statistics_target would allow for
>> > smaller values for m in the HLL sketches, whilst still providing
>> > fairly accurate result estimates.
>> >
>>
>> I think it's an interesting idea. In principle it allows deducing the
>> multi-column MCV for arbitrary combination of columns, not determined in
>> advance. We'd have the MCV with HLL instead of frequencies for columns
>> A, B and C:
>>
>> (a1, hll(a1))
>> (a2, hll(a2))
>> (...)
>> (aK, hll(aK))
>>
>>
>> (b1, hll(b1))
>> (b2, hll(b2))
>> (...)
>> (bL, hll(bL))
>>
>> (c1, hll(c1))
>> (c2, hll(c2))
>> (...)
>> (cM, hll(cM))
>>
>> and from this we'd be able to build MCV for any combination of those
>> three columns.
>>
>> And in some sense it might even be more efficient/accurate, because the
>> MCV on (A,B,C) might have up to K*L*M items. If there's 100 items in
>> each column, that'd be 1,000,000 combinations, which we can't really
>> store (target is up to 10k). And even if we could, it'd be 1M
>> combinations with frequencies (so ~8-16B per combination).
>>
>> While with the MCV/HLL, we'd have 300 items and HLL. Assuming 256-512B
>> HLL would be enough, that's still way smaller than the multi-column MCV.
> 
> HLLs for statistics_target=100 could use 4 bits per bucket, but any
> target above 218 should use 5 bits: nbits = ceil(log2(log2(target *
> 300))), and this saves only 20% on storage.
> 

I think the size estimates are somewhat misleading, as they ignore how
correlated the columns actually are. If they are strongly correlated,
there are probably far fewer combinations than the cartesian product.
That is, given two correlated columns with 100-item MCVs, the combined
MCV is likely much smaller than the 10,000 items of the full cartesian
product.
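
As an aside, the combination step itself is pretty simple. A toy illustration
in C (entirely detached from any existing code; hashing, bias corrections,
sparse encoding etc. are ignored):

#include <math.h>
#include <stdint.h>

#define HLL_REGISTERS 256       /* m; would really be sized from the
                                 * statistics target, as discussed above */

typedef struct { uint8_t reg[HLL_REGISTERS]; } hll;

/* raw HyperLogLog estimate: alpha_m * m^2 / sum(2^-reg[j]) */
static double
hll_estimate(const hll *h)
{
    double  sum = 0.0;
    double  alpha = 0.7213 / (1.0 + 1.079 / HLL_REGISTERS);
    int     j;

    for (j = 0; j < HLL_REGISTERS; j++)
        sum += pow(2.0, -(double) h->reg[j]);
    return alpha * HLL_REGISTERS * HLL_REGISTERS / sum;
}

/* OR clause: the union sketch is just the register-wise maximum */
static double
estimate_or(const hll *a, const hll *b)
{
    hll     u;
    int     j;

    for (j = 0; j < HLL_REGISTERS; j++)
        u.reg[j] = (a->reg[j] > b->reg[j]) ? a->reg[j] : b->reg[j];
    return hll_estimate(&u);
}

/* AND clause: no direct intersection, so fall back to inclusion-exclusion */
static double
estimate_and(const hll *a, const hll *b)
{
    return hll_estimate(a) + hll_estimate(b) - estimate_or(a, b);
}

The AND case has to go through inclusion-exclusion, so that's where I'd expect
most of the estimation error to come from.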

And for non-correlated columns it doesn't really matter, because people
would not need to create the multi-c

PG15 beta1 fix pg_stat_statements view document

2022-05-20 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Hi hackers,

The attached patch modifies the pg_stat_statements view documentation updated 
in PostgreSQL 15 Beta 1.
The data type of the following columns in the pg_stat_statements view is bigint 
in the current document, 
but it is actually double precision.
jit_generation_time
jit_inlining_time
jit_optimization_time
jit_emission_time

Regards,
Noriyoshi Shinoda


pg_stat_statements_doc_v1.diff
Description: pg_stat_statements_doc_v1.diff


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2022-05-20 Thread Pavel Borisov
CFbot says v12 patch does not apply.
Rebased. PFA v13.
Your reviews are very much welcome!

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 
From 21fe45c0ae8479e0733bd8caeb5d2a19d715e0d9 Mon Sep 17 00:00:00 2001
From: Pavel Borisov 
Date: Wed, 11 May 2022 15:54:13 +0400
Subject: [PATCH v13] Add option for amcheck and pg_amcheck to check unique
 constraint for btree indexes.

With the 'checkunique' option, bt_index_check() and bt_index_parent_check()
will, for btree indexes that have a unique constraint, check that only one
heap entry for all equal keys in the index (including posting list entries)
is visible, and report an error if not.

pg_amcheck called with the --checkunique option will do the same for
all indexes it checks.

Authors:
Anastasia Lubennikova 
Pavel Borisov 
Maxim Orlov 
---
 contrib/amcheck/Makefile  |   2 +-
 contrib/amcheck/amcheck--1.3--1.4.sql |  29 ++
 contrib/amcheck/amcheck.control   |   2 +-
 contrib/amcheck/expected/check_btree.out  |  42 +++
 contrib/amcheck/sql/check_btree.sql   |  14 +
 contrib/amcheck/t/004_verify_nbtree_unique.pl | 234 +
 contrib/amcheck/verify_nbtree.c   | 329 +-
 doc/src/sgml/amcheck.sgml |  14 +-
 doc/src/sgml/ref/pg_amcheck.sgml  |  11 +
 src/bin/pg_amcheck/pg_amcheck.c   |  50 ++-
 src/bin/pg_amcheck/t/003_check.pl |  45 +++
 src/bin/pg_amcheck/t/005_opclass_damage.pl|  65 
 12 files changed, 813 insertions(+), 24 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/t/004_verify_nbtree_unique.pl

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50b..88271687a3e 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -7,7 +7,7 @@ OBJS = \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 000..1caba148aa4
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,29 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+-- In order to avoid issues with dependencies when updating amcheck to 1.4,
+-- create new, overloaded versions of the 1.2 bt_index_parent_check signature,
+-- and 1.1 bt_index_check signature.
+
+--
+-- bt_index_parent_check()
+--
+CREATE FUNCTION bt_index_parent_check(index regclass,
+heapallindexed boolean, rootdescend boolean, checkunique boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'bt_index_parent_check'
+LANGUAGE C STRICT PARALLEL RESTRICTED;
+--
+-- bt_index_check()
+--
+CREATE FUNCTION bt_index_check(index regclass,
+heapallindexed boolean, checkunique boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'bt_index_check'
+LANGUAGE C STRICT PARALLEL RESTRICTED;
+
+-- Don't want this to be available to public
+REVOKE ALL ON FUNCTION bt_index_parent_check(regclass, boolean, boolean, boolean) FROM PUBLIC;
+REVOKE ALL ON FUNCTION bt_index_check(regclass, boolean, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f754..e67ace01c99 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index 38791bbc1f4..9e257ac3bb2 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -199,6 +199,47 @@ SELECT bt_index_check('bttest_a_expr_idx', true);
  
 (1 row)
 
+-- UNIQUE constraint check
+SELECT bt_index_check('bttest_a_idx', true, true);
+ bt_index_check 
+
+ 
+(1 row)
+
+SELECT bt_index_check('bttest_b_idx', false, true);
+ bt_index_check 
+
+ 
+(1 row)
+
+SELECT bt_index_parent_check('bttest_a_idx', true, true, true);
+ bt_index_parent_check 
+---
+ 
+(1 row)
+
+SELECT bt_index_parent_check('bttest_b_idx', true, false, true);
+ bt_index_parent_check 
+---
+ 
+(1 row)
+
+-- Check null values in unique index are not treated as equal
+CREATE TABLE bttest_unique_nulls (a serial, b int, c int UNIQUE);
+INSERT INTO bttest_unique_nulls VALUES (generate_series(1,

RE: Build-farm - intermittent error in 031_column_list.pl

2022-05-20 Thread osumi.takami...@fujitsu.com
On Thursday, May 19, 2022 8:13 PM Amit Kapila  wrote:
> On Thu, May 19, 2022 at 3:16 PM Amit Kapila 
> wrote:
> >
> > On Thu, May 19, 2022 at 12:28 PM Kyotaro Horiguchi
> >  wrote:
> > >
> > > At Thu, 19 May 2022 14:26:56 +1000, Peter Smith
> > >  wrote in
> > > > Hi hackers.
> > > >
> > > > FYI, I saw that there was a recent Build-farm error on the
> > > > "grison" machine [1] [1]
> > > > https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=grison
> > > > &br=HEAD
> > > >
> > > > The error happened during "subscriptionCheck" phase in the TAP
> > > > test t/031_column_list.pl This test file was added by this [2]
> > > > commit.
> > > > [2]
> > > >
> https://github.com/postgres/postgres/commit/923def9a533a7d986acfb5
> > > > 24139d8b9e5466d0a5
> > >
> > > 2022-04-17 00:16:04.278 CEST [293659][client
> > > backend][4/270:0][031_column_list.pl] LOG:  statement: CREATE
> > > PUBLICATION pub9 FOR TABLE test_part_d (a) WITH
> > > (publish_via_partition_root = true);
> > > 2022-04-17 00:16:04.279 CEST [293659][client
> > > backend][:0][031_column_list.pl] LOG:  disconnection: session time:
> > > 0:00:00.002 user=bf database=postgres host=[local]
> > >
> > > "CREATE PUBLICATION pub9" is executed at 00:16:04.278 on 293659 then
> > > the session has been disconnected. But the following request for the
> > > same publication fails due to the absense of the publication.
> > >
> > > 2022-04-17 00:16:08.147 CEST [293856][walsender][3/0:0][sub1]
> > > STATEMENT:  START_REPLICATION SLOT "sub1" LOGICAL 0/153DB88
> > > (proto_version '3', publication_names '"pub9"')
> > > 2022-04-17 00:16:08.148 CEST [293856][walsender][3/0:0][sub1] ERROR:
> > > publication "pub9" does not exist
> > >
> >
> > This happens after "ALTER SUBSCRIPTION sub1 SET PUBLICATION pub9".
> The
> > probable theory is that ALTER SUBSCRIPTION will lead to restarting of
> > apply worker (which we can see in LOGS as well) and after the restart,
> > the apply worker will use the existing slot and replication origin
> > corresponding to the subscription. Now, it is possible that before
> > restart the origin has not been updated and the WAL start location
> > points to a location prior to where PUBLICATION pub9 exists which can
> > lead to such an error. Once this error occurs, apply worker will never
> > be able to proceed and will always return the same error. Does this
> > make sense?
> >
> > Unless you or others see a different theory, this seems to be the
> > existing problem in logical replication which is manifested by this
> > test. If we just want to fix these test failures, we can create a new
> > subscription instead of altering the existing publication to point to
> > the new publication.
> >
> 
> If the above theory is correct then I think allowing the publisher to catch 
> up with
> "$node_publisher->wait_for_catchup('sub1');" before ALTER SUBSCRIPTION
> should fix this problem. Because if before ALTER both publisher and
> subscriber are in sync then the new publication should be visible to
> WALSender.
Hi,


I've attached a patch for the fix proposed here.
First of all, thank you so much for helping me offlist, Amit-san.

I reproduced the failure as in [1] by commenting out
WalSndWaitForWal's call of WalSndKeepalive and running the test.
Commenting this out suppresses the advance of the confirmed_flush location
after creating a publication.

In short, my understanding of how the bug happened is:
1. we execute 'create publication pubX' and create one publication.
2. 'alter subscription subY set publication pubX' makes the apply worker exit.
3. the relaunched apply worker searches for pubX, but the slot position
   (confirmed_flush) has not been updated and points to a location before
   the CREATE PUBLICATION on the publisher node.

Applying the attached patch has made the test pass.


[1] the subscriber's log

2022-05-20 08:56:50.773 UTC [5153] 031_column_list.pl LOG: statement: ALTER 
SUBSCRIPTION sub1 SET PUBLICATION pub6
2022-05-20 08:56:50.801 UTC [5156] 031_column_list.pl LOG: statement: SELECT 
count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');
2022-05-20 08:56:50.846 UTC [5112] LOG: logical replication apply worker for 
subscription "sub1" will restart because of a parameter change
2022-05-20 08:56:50.915 UTC [5158] 031_column_list.pl LOG: statement: SELECT 
count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');
...
2022-05-20 08:56:51.257 UTC [5164] 031_column_list.pl LOG: statement: SELECT 
count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');
2022-05-20 08:56:51.353 UTC [5166] LOG: logical replication apply worker for 
subscription "sub1" has started
2022-05-20 08:56:51.366 UTC [5168] LOG: logical replication table 
synchronization worker for subscription "sub1", table "test_part_a" has started
2022-05-20 08:56:51.370 UTC [5171] 031_column_list.pl LOG: statement: SELECT 
count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');
2022-05-20 08:56:51.373 UTC [5166

Re: 15beta1 test failure on mips in isolation/expected/stats

2022-05-20 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> I think what might be happening is that the transactional stats updates get
>> reported by s2 *before* the non-transactional stats updates come in from
>> s1. I.e. the pgstat_report_stat() at the end of s2_commit_prepared_a does a
>> report, because the machine is slow enough for it to be "time to reports 
>> stats
>> again". Then s1 reports its non-transactional stats.

> Sounds plausible.  And I left the test loop running, and it's now past
> 100 consecutive successes, so I think this change definitely "fixes" it.

In the light of morning, at least half of the parameter dependency is
now obvious: the problematic test case involves a prepared transaction,
so it fails completely with max_prepared_transactions = 0.  The isolation
test harness masks that by matching against stats_1.out, but it's not
really a "success".

My numbers do still suggest that there's a weak dependence on
max_connections, but it's possible that that's a mirage.  I did not run
enough test cycles to be able to say positively that that's a real effect
(and the machine's slow enough that I'm not excited about doing so).

regards, tom lane




Re: Limiting memory allocation

2022-05-20 Thread Oleksii Kliukin
Hi,

> On 18. May 2022, at 17:11, Alvaro Herrera  wrote:
> 
> On 2022-May-18, Jan Wieck wrote:
> 
>> Maybe I'm missing something, but what is it that you would actually consider
>> a solution? Knowing your current memory consumption doesn't make the need
>> for allocating some right now go away. What do you envision the response of
>> PostgreSQL to be if we had that information about resource pressure?
> 
> 
> What they (Timescale) do, is have a LD_PRELOAD library that checks
> status of memory pressure, and return NULL from malloc().  This then
> leads to clean abort of transactions and all is well.  There's nothing
> that Postgres needs to do different than today.

Correct. The library we have reads a limit supplied in an environment variable
and stores per-process and total memory usage values in shared memory counters,
updated after each call to malloc/free/calloc/realloc by the process making the
call.  When updating totals, a process picks one of N counters to update
atomically with the difference between its old and new memory usage, avoiding
congested ones; those are summed  to determine current allocations for all
processes and to compare against the limit.
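
Schematically, the wrapper does something like this (heavily simplified and
hypothetical - the shared-memory setup, congestion avoidance and the usual
dlsym/bootstrap issues are all glossed over here):

#define _GNU_SOURCE
#include <dlfcn.h>
#include <malloc.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>

#define NCOUNTERS 16

/* in the real library these counters live in shared memory */
static _Atomic long counters[NCOUNTERS];
static long limit;              /* read from an env variable (setup not shown) */
static void *(*real_malloc)(size_t);
static void (*real_free)(void *);

static long
total_usage(void)
{
    long    sum = 0;

    for (int i = 0; i < NCOUNTERS; i++)
        sum += atomic_load(&counters[i]);
    return sum;
}

static void
account(long delta)
{
    /* pick a stripe; the real library also skips congested counters */
    atomic_fetch_add(&counters[getpid() % NCOUNTERS], delta);
}

void *
malloc(size_t size)
{
    if (!real_malloc)
        real_malloc = (void *(*)(size_t)) dlsym(RTLD_NEXT, "malloc");
    if (limit > 0 && total_usage() + (long) size > limit)
        return NULL;            /* PostgreSQL aborts the transaction cleanly */
    void   *p = real_malloc(size);
    if (p)
        account((long) malloc_usable_size(p));
    return p;
}

void
free(void *p)
{
    if (!real_free)
        real_free = (void (*)(void *)) dlsym(RTLD_NEXT, "free");
    if (p)
        account(-(long) malloc_usable_size(p));
    real_free(p);
}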

> 
> I suppose that what they would like, is a way to inquire into the memory
> pressure status at MemoryContextAlloc() time and return NULL if it is
> too high.

If we call user code just before malloc (and, presumably free and realloc), the
code would have to do just as much work as when it is called from the
malloc/free/realloc wrappers inside a preloaded library. Furthermore, I don’t
see why the user would want to customize that logic: a single Linux-specific
implementation would solve the problem for everyone.


>  How exactly this would work is unclear to me; maybe one
> process keeps an eye on it in an OS-specific manner,

We don’t need to do anything for non-Linux systems, as cgroups and OOM
killer doesn’t exist there.


> and if it does get
> near the maximum, set a bit in shared memory that other processes can
> examine when MemoryContextAlloc is called.  It doesn't have to be
> exactly accurate; an approximation is probably okay.

What would be the purpose of setting a bit in shared memory when the maximum is
about to be reached?

What would be useful is a way for Postgres to count the amount of memory
allocated by each backend.  This could be advantageous for giving per-backend
memory usage to the user, as well as for enforcing a limit on the total amount
of memory allocated by the backends.

—
Oleksii Kliukin



check for null value before looking up the hash function

2022-05-20 Thread Zhihong Yu
Hi,
I was looking at the code in hash_record()
of src/backend/utils/adt/rowtypes.c

It seems if nulls[i] is true, we don't need to look up the hash function.

Please take a look at the patch.
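
In short, the idea is to move the typcache lookup inside the non-null branch
so it is skipped for null attributes, roughly (a sketch, not the exact diff):

    if (nulls[i])
    {
        /* a null contributes a zero hash; no function lookup needed */
        element_hash = 0;
    }
    else
    {
        /* only now look up (and cache) the hash function for this column */
        typentry = lookup_type_cache(att->atttypid,
                                     TYPECACHE_HASH_PROC_FINFO);
        /* ... call the hash function to compute element_hash ... */
    }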

Thanks


hash-record-check-null-first.patch
Description: Binary data


CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, hackers.

Today I was doing some aggregates over pg_stat_statements in order to
find the types of queries consuming most of the CPU. The aggregates were
made over two pg_stat_statements snapshots taken 30 seconds apart.

The sum(total_time) had the biggest value for a very frequent query
with about 10ms execution time, so I assumed it was the biggest CPU
consumer.

But after reducing the frequency of queries a lot I was unable to see
any significant difference in server CPU usage...

So, it looks like clock_gettime is not a good way to measure real CPU
usage for some OLTP workloads. I suppose this is caused by the wall-time
vs CPU-time difference (IO, thread switches, etc.).

But what do you think about adding cpu_time (by calling getrusage) to
pg_stat_statements? Seems it could be very useful for CPU profiling.

I am probably able to prepare the patch, but it is always better to
get some feedback on the idea first :)

Best regards,
Michail.




Re: Limiting memory allocation

2022-05-20 Thread Stephen Frost
Greetings,

* Oleksii Kliukin (al...@hintbits.com) wrote:
> > On 18. May 2022, at 17:11, Alvaro Herrera  wrote:
> > On 2022-May-18, Jan Wieck wrote:
> >> Maybe I'm missing something, but what is it that you would actually 
> >> consider
> >> a solution? Knowing your current memory consumption doesn't make the need
> >> for allocating some right now go away. What do you envision the response of
> >> PostgreSQL to be if we had that information about resource pressure?
> > 
> > What they (Timescale) do, is have a LD_PRELOAD library that checks
> > status of memory pressure, and return NULL from malloc().  This then
> > leads to clean abort of transactions and all is well.  There's nothing
> > that Postgres needs to do different than today.

I'm not super fixated on exactly what this one implementation does, but
rather that the kernel is evidently not interested in trying to solve
this problem and therefore it's something which we need to address.  I
agree in general that we don't need to do much different except to have
a way to effectively have a limit where we treat an allocation attempt
as failing and then the rest of our existing machinery will handle
failing the transaction and doing cleanup and such just fine.

> Correct. The library we have reads a limit supplied in an environment variable
> and stores per-process and total memory usage values in shared memory 
> counters,
> updated after each call to malloc/free/calloc/realloc by the process making 
> the
> call.  When updating totals, a process picks one of N counters to update
> atomically with the difference between its old and new memory usage, avoiding
> congested ones; those are summed  to determine current allocations for all
> processes and to compare against the limit.

Would be interesting to know just how many of these counters are used
and how 'congested' ones are avoided.  Though, would certainly be easier
if one could simply review this library.

> > I suppose that what they would like, is a way to inquire into the memory
> > pressure status at MemoryContextAlloc() time and return NULL if it is
> > too high.

Not really concerned with what one specific implementation that's been
done would like but rather with solving the larger issue that exists,
which is that we aren't able to cap our memory usage today and that can
lead to the OOM killer coming into play, or excessive swap usage, or
causing issue for other processes running.  While I started this with
the crash case as the main concern, and I do feel it's still a big case
to consider, there are other valuable use-cases to consider where this
would help.

> If we call user code just before malloc (and, presumably free and realloc), 
> the
> code would have to do just as much work as when it is called from the
> malloc/free/realloc wrappers inside a preloaded library. Furthermore, I don’t
> see why the user would want to customize that logic: a single Linux-specific
> implementation would solve the problem for everyone.

If the problem is explicitly defined as "deal with the Linux OOM killer"
then, yes, a Linux-specific fix would address that.  I do think that's
certainly an important, and perhaps the most important, issue that this
solves, but there's other cases where this would be really helpful.

> >  How exactly this would work is unclear to me; maybe one
> > process keeps an eye on it in an OS-specific manner,

There seems to be a lot of focus on trying to implement this as "get the
amount of free memory from the OS and make sure we don't go over that
limit" and that adds a lot of OS-specific logic which complicates things
and also ignores the use-cases where an admin wishes to limit PG's
memory usage due to other processes running on the same system.  I'll
point out that the LD_PRELOAD library doesn't even attempt to do this,
even though it's explicitly for Linux, but uses an environment variable
instead.

In PG, we'd have that be a GUC that an admin is able to set and then we
track the memory usage (perhaps per-process, perhaps using some set of
buckets, perhaps locally per-process and then in a smaller number of
buckets in shared memory, or something else) and fail an allocation when
it would go over that limit, perhaps only when it's a regular user
backend or with other conditions around it.
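
As a very rough sketch of the shape I have in mind (all names here are
hypothetical, and the accounting side is hand-waved away entirely):

    /* hypothetical GUC: limit on tracked allocations, in kB; 0 = no limit */
    int         max_total_backend_memory_kb = 0;

    static bool
    allocation_would_exceed_limit(Size requested)
    {
        if (max_total_backend_memory_kb <= 0)
            return false;
        /* hypothetical accessor for whatever accounting we end up with */
        return tracked_total_memory() + requested >
            (Size) max_total_backend_memory_kb * 1024;
    }

    /*
     * Then, at the point where we call malloc() today, an over-limit request
     * is treated exactly like a failed malloc():
     *
     *      if (allocation_would_exceed_limit(size))
     *          return NULL;
     *
     * and the existing error machinery reports "out of memory" and aborts
     * the transaction with the usual cleanup.
     */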

> What would be useful is a way for Postgres to count the amount of memory
> allocated by each backend.  This could be advantageous for giving per-backend
> memory usage to the user, as well as for enforcing a limit on the total amount
> of memory allocated by the backends.

I agree that this would be independently useful.  

Thanks,

Stephen




Re: Inquiring about my GSoC Proposal.

2022-05-20 Thread Stephen Frost
Greetings,

* Israa Odeh (israa2...@hotmail.com) wrote:
> I was wondering if you could provide me with initial feedback on my GSoC 
> proposal, as well as if you have any comments about it. And would it be 
> possible to know whether I got accepted as a contributor?

Google published this information and all accepted contributors will be
hearing from the mentors soon.

Thanks!

Stephen




Re: Zstandard support for toast compression

2022-05-20 Thread Stephen Frost
Greetings,

* Nikolay Shaplov (dh...@nataraj.su) wrote:
> In a message of Tuesday, 17 May 2022 23:01:07 MSK, Tom Lane wrote:
> 
> Hi! I came to this branch looking for a patch to review, but I guess I would 
> join the discussion instead of reading the code.

Seems that's what would be helpful now; thanks for joining the
discussion.

> > >> Yeah - I think we had better reserve the fourth bit pattern for
> > >> something extensible e.g. another byte or several to specify the
> > >> actual method, so that we don't have a hard limit of 4 methods. But
> > >> even with such a system, the first 3 methods will always and forever
> > >> be privileged over all others, so we'd better not make the mistake of
> > >> adding something silly as our third algorithm.
> > > 
> > > In such a situation, would they really end up being properly distinct
> > > when it comes to what our users see..?  I wouldn't really think so.
> > 
> > It should be transparent to users, sure, but the point is that the
> > first three methods will have a storage space advantage over others.
> > Plus we'd have to do some actual work to create that extension mechanism.
> 
> Postgres is well known for extensibility. One can write one's own
> implementation of almost everything and make it an extension.
> One would hardly need more than one (or two) additional compression
> methods, but which method one will really need is hard to say.

A thought I've had before is that it'd be nice to specify a particular
compression method on a per-data-type basis.  That wasn't the direction
this was taken in, for various reasons, but I wonder about perhaps still
having a data type compression method, and perhaps one of these bits
might mean "the data type's (default?) compression method".  Even so,
having an extensible way to add new compression methods would be a good
thing.

For compression methods that we already support in other parts of the
system, it seems clear that we should allow those to be used for column
compression too.  We should certainly also support a way to specify the
compression level on a per-compression-method basis, though.

> So I guess it would be much better to create an API for creating and
> registering one's own compression method, and to create a built-in Zstd
> compression method that can be used (or optionally not used) via that API.

While I generally agree that we want to provide extensibility in this
area, given that we already have zstd as a compile-time option and it
exists in other parts of the system, I don't think it makes sense to
require users to install an extension to use it.

> Moreover I guess this API (maybe with some modification) could be used for
> seamless data encryption, for example.

Perhaps.. but this kind of encryption wouldn't allow indexing and
certainly lots of other metadata would still be unencrypted (the entire
system catalog being a good example..).

Thanks,

Stephen




Re: check for null value before looking up the hash function

2022-05-20 Thread Tom Lane
Zhihong Yu  writes:
> I was looking at the code in hash_record()
> of src/backend/utils/adt/rowtypes.c
> It seems if nulls[i] is true, we don't need to look up the hash function.

I don't think this is worth changing.  It complicates the logic,
rendering it unlike quite a few other functions written in the same
style.  In cases where the performance actually matters, the hash
function is cached across multiple calls anyway.  You might save
something if you have many calls in a query and not one of them
receives a non-null input, but how likely is that?

regards, tom lane




Re: CPU time for pg_stat_statement

2022-05-20 Thread Tom Lane
Michail Nikolaev  writes:
> So, looks like clock_gettime is not so accurate to measure real CPU
> usage for some OLTP workloads. I suppose it is caused by the wall time
> vs CPU time difference (IO, thread switch, etc).

This is a pretty broad claim to make on the basis of one undocumented
test case on one unmentioned platform.

> But what do you think about adding cpu_time (by calling getrusage) to
> pg_stat_statements? Seems it could be very useful for CPU profiling.

On what grounds do you claim getrusage will be better?  One thing we
can be pretty certain of is that it will be slower, since it has to
return many more pieces of information.  And the API for it only allows
time info to be specified to microseconds, versus nanoseconds for
clock_gettime, so it's also going to be taking a precision hit.

regards, tom lane




Re: CPU time for pg_stat_statement

2022-05-20 Thread Thomas Munro
On Sat, May 21, 2022 at 6:50 AM Michail Nikolaev
 wrote:
> But what do you think about adding cpu_time (by calling getrusage) to
> pg_stat_statements? Seems it could be very useful for CPU profiling.
>
> I am probably able to prepare the patch, but it is always better to
> get some feedback on the idea first :)

This might be interesting:

https://github.com/powa-team/pg_stat_kcache




Re: CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, Thomas.

> This might be interesting:
> https://github.com/powa-team/pg_stat_kcache

Oh, nice, looks like it could help me to reduce CPU and test my
assumption (using exec_user_time and exec_system_time).

BTW, do you know why the extension is not in standard contrib (it looks mature)?

Best regards,
Michail.




Re: CPU time for pg_stat_statement

2022-05-20 Thread Michail Nikolaev
Hello, Tom.

> This is a pretty broad claim to make on the basis of one undocumented
> test case on one unmentioned platform.

I'll try to use pg_stat_kcache to check the difference between Wall
and CPU for my case.

> On what grounds do you claim getrusage will be better?  One thing we
> can be pretty certain of is that it will be slower, since it has to
> return many more pieces of information.  And the API for it only allows
> time info to be specified to microseconds, versus nanoseconds for
> clock_gettime, so it's also going to be taking a precision hit.

My idea was not to replace the wall clock (clock_gettime) with the CPU clock (getrusage).
I am thinking about adding getrusage as an additional column (with a flag to
enable the actual measurement).
Looks like I need to be more precise with words :)

It is just two different clocks - and sometimes you need physical
time, sometimes CPU time (and sometimes, for example, amount of WAL
written).
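
For the measurement itself I mean something like this (a standalone sketch
only, not a patch - the per-statement bookkeeping inside pg_stat_statements
is left out):

#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

static double
tv_diff_seconds(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_usec - a.tv_usec) / 1e6;
}

int
main(void)
{
    struct rusage before, after;

    getrusage(RUSAGE_SELF, &before);

    /* ... run the statement here ... */

    getrusage(RUSAGE_SELF, &after);

    printf("user cpu:   %.6f s\n",
           tv_diff_seconds(before.ru_utime, after.ru_utime));
    printf("system cpu: %.6f s\n",
           tv_diff_seconds(before.ru_stime, after.ru_stime));
    return 0;
}

(And yes, as noted, getrusage only gives microsecond resolution, so it would
be an additional, optional measurement rather than a replacement for the
existing timings.)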

Best regards,
Michail.




Re: Add --{no-,}bypassrls flags to createuser

2022-05-20 Thread Nathan Bossart
On Thu, May 19, 2022 at 10:35:23AM +0900, Shinya Kato wrote:
> I created a new patch to test the new options!

Thanks for the new patch!  I attached a new version with a few small
changes.  What do you think?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f1c00de3e3daf0150fdaa6ed5fe4215ebf928b6a Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 20 May 2022 21:28:06 +
Subject: [PATCH v6 1/1] Add missing options to createuser.

This adds the following options to the createuser program:
	--admin		(ADMIN)
	--member	(IN ROLE)
	--valid-until	(VALID UNTIL)
	--bypassrls	(BYPASSRLS)
	--no-bypassrls	(NOBYPASSRLS)
---
 doc/src/sgml/ref/createuser.sgml| 56 ++
 src/bin/scripts/createuser.c| 72 -
 src/bin/scripts/t/040_createuser.pl | 20 
 3 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/createuser.sgml b/doc/src/sgml/ref/createuser.sgml
index 17579e50af..27efebbfe2 100644
--- a/doc/src/sgml/ref/createuser.sgml
+++ b/doc/src/sgml/ref/createuser.sgml
@@ -76,6 +76,20 @@ PostgreSQL documentation
   
  
 
+ 
+  -a role
+  --admin=role
+  
+   
+Indicates role that will be immediately added as a member of the new
+role with admin option, giving it the right to grant membership in the
+new role to others.  Multiple roles to add as members (with admin
+option) of the new role can be specified by writing multiple
+-a switches.
+   
+  
+ 
+
  
   -c number
   --connection-limit=number
@@ -204,6 +218,18 @@ PostgreSQL documentation
   
  
 
+ 
+  -m role
+  --member=role
+  
+   
+Indicates role that will be immediately added as a member of the new
+role.  Multiple roles to add as members of the new role can be specified
+by writing multiple -m switches.
+   
+  
+ 
+
  
   -P
   --pwprompt
@@ -258,6 +284,17 @@ PostgreSQL documentation
   
  
 
+ 
+  -v timestamp
+  --valid-until=timestamp
+  
+   
+Set a date and time after which the role's password is no longer valid.
+The default is to set no password expiry date.
+   
+  
+ 
+
  
-V
--version
@@ -290,6 +327,25 @@ PostgreSQL documentation
   
  
 
+ 
+  --bypassrls
+  
+   
+The new user will bypass every row-level security (RLS) policy.
+   
+  
+ 
+
+ 
+  --no-bypassrls
+  
+   
+The new user will not bypass row-level security (RLS) policies.  This is
+the default.
+   
+  
+ 
+
  
-?
--help
diff --git a/src/bin/scripts/createuser.c b/src/bin/scripts/createuser.c
index bfba0d09d1..7e3e61a9c8 100644
--- a/src/bin/scripts/createuser.c
+++ b/src/bin/scripts/createuser.c
@@ -51,6 +51,11 @@ main(int argc, char *argv[])
 		{"connection-limit", required_argument, NULL, 'c'},
 		{"pwprompt", no_argument, NULL, 'P'},
 		{"encrypted", no_argument, NULL, 'E'},
+		{"bypassrls", no_argument, NULL, 4},
+		{"no-bypassrls", no_argument, NULL, 5},
+		{"valid-until", required_argument, NULL, 'v'},
+		{"member", required_argument, NULL, 'm'},
+		{"admin", required_argument, NULL, 'a'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -62,6 +67,8 @@ main(int argc, char *argv[])
 	char	   *port = NULL;
 	char	   *username = NULL;
 	SimpleStringList roles = {NULL, NULL};
+	SimpleStringList members = {NULL, NULL};
+	SimpleStringList admins = {NULL, NULL};
 	enum trivalue prompt_password = TRI_DEFAULT;
 	ConnParams	cparams;
 	bool		echo = false;
@@ -69,6 +76,7 @@ main(int argc, char *argv[])
 	int			conn_limit = -2;	/* less than minimum valid value */
 	bool		pwprompt = false;
 	char	   *newpassword = NULL;
+	char	   *pwexpiry = NULL;
 
 	/* Tri-valued variables.  */
 	enum trivalue createdb = TRI_DEFAULT,
@@ -76,7 +84,8 @@ main(int argc, char *argv[])
 createrole = TRI_DEFAULT,
 inherit = TRI_DEFAULT,
 login = TRI_DEFAULT,
-replication = TRI_DEFAULT;
+replication = TRI_DEFAULT,
+bypassrls = TRI_DEFAULT;
 
 	PQExpBufferData sql;
 
@@ -89,7 +98,7 @@ main(int argc, char *argv[])
 
 	handle_help_version_opts(argc, argv, "createuser", help);
 
-	while ((c = getopt_long(argc, argv, "h:p:U:g:wWedDsSrRiIlLc:PE",
+	while ((c = getopt_long(argc, argv, "h:p:U:g:wWedDsSrRiIlLc:PEv:m:a:",
 			long_options, &optindex)) != -1)
 	{
 		switch (c)
@@ -165,6 +174,21 @@ main(int argc, char *argv[])
 			case 3:
 interactive = true;
 break;
+			case 4:
+bypassrls = TRI_YES;
+break;
+			case 5:
+bypassrls = TRI_NO;
+break;
+			case 'v':
+pwexpiry = pg_strdup(optarg);
+break;
+			case 'm':
+simple_string_list_append(&members, optarg);
+break;
+			case 'a':
+simple_string_list_append(&admins, optarg);
+break;
 			default:
 /* getopt_long already emitted a complaint */
 pg_log_e

allow building trusted languages without the untrusted versions

2022-05-20 Thread Nathan Bossart
Hi hackers,

Presently, if you want to only build trusted PL/Perl and PL/Tcl, you need
to make a couple of code changes to compile out the untrusted parts.  I
suspect many users (e.g., anyone who wants to disallow file system access)
would benefit from a better supported way to do this.  Thus, I've attached
some patches that introduce an optional argument for the --with-perl and
--with-tcl configuration options.  This new argument can be used to build
only the trusted or untrusted version of the language.  If the argument is
not provided, both the trusted and untrusted versions are built, so this
change is backward compatible.

The PL/Tcl patch (0003) is relatively straightforward, as there are already
separate handler functions for the trusted and untrusted versions of the
language.  PL/Perl, however, is slightly more complicated.  0001 first
modifies PL/Perl to use separate handler/validator functions for the trusted
and untrusted versions.  0002 then adds support for building only trusted
or untrusted PL/Perl in a similar fashion to 0003.  Since a few contrib
modules depend on PL/Perl, 0002 also modifies some modules' Makefiles to
handle whether trusted and/or untrusted PL/Perl is built.

I haven't made the required changes (if any) for MSVC, as I do not
currently have a way to test it.  For now, I am parking these patches in
the July commitfest while I gauge interest in this feature and await any
feedback on the proposed approach.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From fc3b083b752d31975bf8530c09497a89dd3d9a35 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 22 Feb 2022 15:26:35 -0800
Subject: [PATCH v1 1/3] Do not use pg_language to determine whether PL/Perl is
 trusted.

Presently, PL/Perl and PL/PerlU use the same handler and validator
functions, and we determine whether to use the trusted or untrusted
code paths by inspecting the language's pg_language tuple.  This
can lead to unexpected behavior (e.g., using the untrusted code
paths despite specifying the trusted handler/validator functions),
and it complicates building the trusted and untrusted versions of
the language separately, which we intend to support in a follow-up
change.

This change creates separate handler/validator functions for the
trusted and untrusted versions of PL/Perl so that we no longer need
to inspect pg_language to determine which code path to take.  Since
this makes it easier to tell whether we are using trusted or
untrusted PL/Perl, this change has the added benefit of simplifying
function lookup in plperl_proc_hash.
---
 src/pl/plperl/expected/plperl.out |   8 ++
 src/pl/plperl/plperl.c| 117 --
 src/pl/plperl/sql/plperl.sql  |   7 ++
 3 files changed, 76 insertions(+), 56 deletions(-)

diff --git a/src/pl/plperl/expected/plperl.out b/src/pl/plperl/expected/plperl.out
index d8a1ff5dd8..f92848602b 100644
--- a/src/pl/plperl/expected/plperl.out
+++ b/src/pl/plperl/expected/plperl.out
@@ -792,3 +792,11 @@ SELECT self_modify(42);
  126
 (1 row)
 
+-- make sure lanpltrusted is ignored
+CREATE OR REPLACE LANGUAGE mylang
+	HANDLER plperl_call_handler
+	INLINE plperl_inline_handler
+	VALIDATOR plperl_validator;
+CREATE OR REPLACE FUNCTION myfunc(TEXT) RETURNS TEXT LANGUAGE mylang AS $$ return `$_[0]`; $$;
+ERROR:  'quoted execution (``, qx)' trapped by operation mask at line 1.
+CONTEXT:  compilation of PL/Perl function "myfunc"
diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c
index edb93ec1c4..9bc6793a30 100644
--- a/src/pl/plperl/plperl.c
+++ b/src/pl/plperl/plperl.c
@@ -252,15 +252,20 @@ static void plperl_destroy_interp(PerlInterpreter **);
 static void plperl_fini(int code, Datum arg);
 static void set_interp_require(bool trusted);
 
-static Datum plperl_func_handler(PG_FUNCTION_ARGS);
-static Datum plperl_trigger_handler(PG_FUNCTION_ARGS);
-static void plperl_event_trigger_handler(PG_FUNCTION_ARGS);
+static Datum plperl_call_handler_internal(PG_FUNCTION_ARGS, bool trusted);
+static Datum plperl_inline_handler_internal(PG_FUNCTION_ARGS, bool trusted);
+static Datum plperl_validator_internal(PG_FUNCTION_ARGS, bool trusted);
+
+static Datum plperl_func_handler(PG_FUNCTION_ARGS, bool trusted);
+static Datum plperl_trigger_handler(PG_FUNCTION_ARGS, bool trusted);
+static void plperl_event_trigger_handler(PG_FUNCTION_ARGS, bool trusted);
 
 static void free_plperl_function(plperl_proc_desc *prodesc);
 
 static plperl_proc_desc *compile_plperl_function(Oid fn_oid,
  bool is_trigger,
- bool is_event_trigger);
+ bool is_event_trigger,
+ bool trusted);
 
 static SV  *plperl_hash_from_tuple(HeapTuple tuple, TupleDesc tupdesc, bool include_generated);
 static SV  *plperl_hash_from_datum(Datum attr);
@@ -1824,19 +1829,12 @@ plperl_modify_tuple(HV *hvTD, TriggerData *tdata, HeapTuple otup)
 }
 
 
-/*
- * There are three externally visible pieces to plperl: plperl_call_handler,
- * plperl_inl

Re: PG15 beta1 fix pg_stat_statements view document

2022-05-20 Thread Nathan Bossart
On Fri, May 20, 2022 at 12:46:03PM +, Shinoda, Noriyoshi (PN Japan FSIP) 
wrote:
> The attached patch modifies the pg_stat_statements view documentation updated 
> in PostgreSQL 15 Beta 1.
> The data type of the following columns in the pg_stat_statements view is 
> bigint in the current document, 
> but it is actually double precision.
>   jit_generation_time
>   jit_inlining_time
>   jit_optimization_time
>   jit_emission_time

I think there is a typo in the change to the jit_optimization_time section,
but otherwise it looks good to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: docs: mention "pg_read_all_stats" in "track_activities" description

2022-05-20 Thread Nathan Bossart
On Fri, May 20, 2022 at 03:17:29PM +0900, Ian Lawrence Barwick wrote:
> It seems reasonable to mention here that the information is also visible to
> members of "pg_read_all_stats", similar to what is done in the
> pg_stat_statements
> docs [2].
> 
> [2] 
> https://www.postgresql.org/docs/current/pgstatstatements.html#PGSTATSTATEMENTS-COLUMNS
> 
> Suggested wording:
> 
>> Note that even when enabled, this information is only visible to superusers,
>> members of the pg_read_all_stats role and the user owning
>> the session being reported on, so it should not represent a security risk.
> 
> Patch (for HEAD) with suggested wording attached; the change should
> IMO be applied
> all the way back to v10 (though as-is the patch only applies to HEAD,
> can provide
> others if needed).

LGTM

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Limiting memory allocation

2022-05-20 Thread Tomas Vondra
On 5/20/22 21:50, Stephen Frost wrote:
> Greetings,
> 
> ...
>
>>>  How exactly this would work is unclear to me; maybe one
>>> process keeps an eye on it in an OS-specific manner,
> 
> There seems to be a lot of focus on trying to implement this as "get the
> amount of free memory from the OS and make sure we don't go over that
> limit" and that adds a lot of OS-specific logic which complicates things
> and also ignores the use-cases where an admin wishes to limit PG's
> memory usage due to other processes running on the same system.  I'll
> point out that the LD_PRELOAD library doesn't even attempt to do this,
> even though it's explicitly for Linux, but uses an environment variable
> instead.
> 
> In PG, we'd have that be a GUC that an admin is able to set and then we
> track the memory usage (perhaps per-process, perhaps using some set of
> buckets, perhaps locally per-process and then in a smaller number of
> buckets in shared memory, or something else) and fail an allocation when
> it would go over that limit, perhaps only when it's a regular user
> backend or with other conditions around it.
> 

I agree a GUC setting a memory target is a sensible starting point.

I wonder if we might eventually use this to define memory budgets. One
of the common questions I get is how to restrict a user from setting
work_mem too high or doing too many memory-hungry things. Currently
there's no way to do that, because we have no way to limit work_mem
values, and even if we had one, the user could construct a more complex
query with more memory-hungry operations.

But I think it's also that we weren't sure what to do after hitting a
limit - should we try replanning the query with a lower work_mem value, or
what?

However, if just failing the malloc() is acceptable, maybe we could use
this mechanism to achieve something like that?

>> What would be useful is a way for Postgres to count the amount of memory
>> allocated by each backend.  This could be advantageous for giving per-backend
>> memory usage to the user, as well as for enforcing a limit on the total 
>> amount
>> of memory allocated by the backends.
> 
> I agree that this would be independently useful.
> 

Well, we already have memory accounting built into the memory context
infrastructure. It kinda does the same thing as the malloc() wrapper,
except that it does not publish the information anywhere and it's
per-context (so we have to walk the context tree recursively).

So maybe we could make this on-request somehow? Say, we'd send a signal
to the process, and it'd run MemoryContextMemAllocated() on the top
memory context and store the result somewhere.
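
A minimal sketch of that, assuming a hypothetical per-backend slot in
shared memory and a hypothetical signal handler (only
MemoryContextMemAllocated() and TopMemoryContext exist today), might look
like:

    /* Sketch only: publish this backend's total allocation on request. */
    static void
    ReportMemoryUsage(void)
    {
        /* walks the whole context tree under TopMemoryContext */
        Size    allocated = MemoryContextMemAllocated(TopMemoryContext, true);

        /* MyMemoryUsageSlot is a hypothetical per-backend shmem entry */
        MyMemoryUsageSlot->total_allocated = allocated;
    }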


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: allow building trusted languages without the untrusted versions

2022-05-20 Thread Tom Lane
Nathan Bossart  writes:
> Presently, if you want to only build trusted PL/Perl and PL/Tcl, you need
> to make a couple of code changes to compile out the untrusted parts.  I
> suspect many users (e.g., anyone who wants to disallow file system access)
> would benefit from a better supported way to do this.  Thus, I've attached
> some patches that introduce an optional argument for the --with-perl and
> --with-tcl configuration options.  This new argument can be used to build
> only the trusted or untrusted version of the language.  If the argument is
> not provided, both the trusted and untrusted versions are built, so this
> change is backward compatible.

I do not believe that this is worth the extra complication.  Nobody has
ever asked for it before, so I estimate the number of people who would
use it as near zero, and those folk are entirely capable of removing
the relevant extension files from their installations.

Moreover, if we accept this as a useful configure option, what other
things will we be called on to change?  It surely makes no sense to
install contrib/adminpack, for instance, if you're afraid of having
plperlu installed.

Lastly, you've offered no reason to think this would provide any real
security improvement.  Someone who's gained the ability to issue CREATE
EXTENSION on untrusted extensions has already got all the privileges he
needs; leaving out a few extension files is at most going to slow him
down a bit on the way to full filesystem access.  (See, eg, COPY TO
PROGRAM.)
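
For instance, with every untrusted PL removed, a superuser still has
arbitrary command execution with nothing more than (illustrative command
only):

    COPY (SELECT 1) TO PROGRAM 'id > /tmp/i-own-this-filesystem';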

regards, tom lane




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-05-20 Thread Noah Misch
On Wed, May 18, 2022 at 06:20:08PM +0900, Michael Paquier wrote:
> Attached is an updated patch to address your concerns.

Looks successful.




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-05-20 Thread Michael Paquier
On Fri, May 20, 2022 at 06:28:01PM -0700, Noah Misch wrote:
> Looks successful.

Thanks a lot for confirming.  I have applied that on HEAD, then.
--
Michael




Re: docs: mention "pg_read_all_stats" in "track_activities" description

2022-05-20 Thread Michael Paquier
On Fri, May 20, 2022 at 04:08:37PM -0700, Nathan Bossart wrote:
> LGTM

Indeed, it is a good idea to add this information.  Will apply and
backpatch accordingly.
--
Michael




Re: PG15 beta1 fix pg_stat_statements view document

2022-05-20 Thread Michael Paquier
On Fri, May 20, 2022 at 04:04:29PM -0700, Nathan Bossart wrote:
> I think there is a typo in the change to the jit_optimization_time section,
> but otherwise it looks good to me.

Yes, as of "double precisiodouble precision".  All these four fields
are indeed doubles in the code, for what looks like a copy-pasto from
57d6aea.  Will fix.
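
For anyone who wants to double-check on a running installation (assuming
the pg_stat_statements extension is installed), the catalogs agree;
illustrative query:

    SELECT attname, format_type(atttypid, atttypmod) AS data_type
    FROM pg_attribute
    WHERE attrelid = 'pg_stat_statements'::regclass
      AND attname IN ('jit_generation_time', 'jit_inlining_time',
                      'jit_optimization_time', 'jit_emission_time');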
--
Michael




Re: Build-farm - intermittent error in 031_column_list.pl

2022-05-20 Thread Amit Kapila
On Fri, May 20, 2022 at 4:01 PM Tomas Vondra
 wrote:
>
> On 5/20/22 05:58, Amit Kapila wrote:
>
> Are we really querying the publications (in get_rel_sync_entry) using
> the historical snapshot?
>

Yes.

> I haven't really realized this, but yeah, that
> might explain the issue.
>
> The new TAP test does ALTER SUBSCRIPTION ... SET PUBLICATION much more
> often than any other test (there are ~15 calls, 12 of which are in this
> new test). That might be why we haven't seen failures before. Or maybe
> the existing tests simply are not vulnerable to this,
>

Right, I have checked that the other cases are not vulnerable to this;
otherwise, I think we would have seen intermittent failures by now.
They either don't perform DML before the creation of a publication,
or they create a subscription pointing to the same publication beforehand.

> because they
> either do wait_for_catchup late enough or don't do any DML right before
> executing SET PUBLICATION.
>
> >>  That timetravel seems unintuitive but it's the
> >> (current) way it works.
> >>
> >
> > I have thought about it but couldn't come up with a good way to change
> > the way it currently works. Moreover, I think it is easy to hit this
> > in other ways as well. Say, you first create a subscription with a
> > non-existent publication and then perform an operation on any unrelated
> > table on the publisher before creating the required publication; we will
> > hit exactly this problem of "publication does not exist", so I think we
> > may need to live with this behavior and write tests carefully.
> >
>
> Yeah, I think it pretty much requires ensuring the subscriber is fully
> caught up with the publisher, otherwise ALTER SUBSCRIPTION may break the
> replication in an unrecoverable way (actually, you can alter the
> subscription and remove the publication again, right?).
>

Right.

> But this is not just about tests, of course - the same issue applies to
> regular replication. That's a bit unfortunate, so maybe we should think
> about making this less fragile.
>

Agreed, provided we find some reasonable solution.

> We might make sure the subscriber is not lagging (essentially the
> wait_for_catchup) - which the users will have to do anyway (although
> maybe they know the publisher is beyond the LSN where it was created).
>

This won't work for the case mentioned above where we create a
subscription with non-existent publications, then perform DML and then
'CREATE PUBLICATION'.
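
Roughly this sequence (object names invented for illustration):

    -- subscriber: pub1 does not exist on the publisher yet
    CREATE SUBSCRIPTION sub1
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION pub1;

    -- publisher: this change is decoded before pub1 becomes visible
    INSERT INTO t1 VALUES (1);

    -- publisher: the walsender still decodes the INSERT above using a
    -- historic snapshot taken before this point, so it keeps failing
    -- with "publication pub1 does not exist"
    CREATE PUBLICATION pub1 FOR TABLE t1;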

> The other option would be to detect such case, somehow - if you don't
> see the publication yet, see if it exists in current snapshot, and then
> maybe ignore this error. But that has other issues (the publication
> might have been created and dropped, in which case you won't see it).
>

True, the dropped case would again be tricky to deal with and I think
we will end up publishing some operations which are performed before
the publication is even created.

> Also, we'd probably have to ignore RelationSyncEntry for a while, which
> seems quite expensive.
>

Yet another option could be that we continue using a historic snapshot
but ignore publications that are not found for the purpose of
computing RelSyncEntry attributes. We won't mark such an entry as
valid till all the publications are loaded without anything missing. I
think such cases will be rare enough in practice not to matter. This
means we won't publish operations on tables corresponding to that
publication till we find such a publication, and that seems okay.

-- 
With Regards,
Amit Kapila.




RE: PG15 beta1 fix pg_stat_statements view document

2022-05-20 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Hi,

Thank you for your comment.
I attached the fixed patch.

-Original Message-
From: Michael Paquier  
Sent: Saturday, May 21, 2022 12:33 PM
To: Nathan Bossart 
Cc: Shinoda, Noriyoshi (PN Japan FSIP) ; 
PostgreSQL-development ; mag...@hagander.net
Subject: Re: PG15 beta1 fix pg_stat_statements view document

On Fri, May 20, 2022 at 04:04:29PM -0700, Nathan Bossart wrote:
> I think there is a typo in the change to the jit_optimization_time 
> section, but otherwise it looks good to me.

Yes, as of "double precisiodouble precision".  All these four fields are indeed 
doubles in the code, for what looks like a copy-pasto from 57d6aea.  Will fix.
--
Michael


pg_stat_statements_doc_v2.diff
Description: pg_stat_statements_doc_v2.diff


PG15 beta1 fix pg_stat_recovery_prefetch view manual

2022-05-20 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Hi hackers,
Thanks to all the developers. The attached patch updates the manual for the 
pg_stat_recovery_prefetch view.
The current pg_stat_recovery_prefetch view documentation is missing the
stats_reset column. The attached patch adds a description of the stats_reset
column.

https://www.postgresql.org/docs/15/monitoring-stats.html#MONITORING-PG-STAT-RECOVERY-PREFETCH
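
As an aside, an illustrative query against the view (column set as of
15beta1):

    SELECT stats_reset, prefetch, hit, skip_init, skip_new, skip_fpw, skip_rep
    FROM pg_stat_recovery_prefetch;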

Regards,
Noriyoshi Shinoda


pg_stat_recovery_prefetch_doc_v1.diff
Description: pg_stat_recovery_prefetch_doc_v1.diff


Re: PG15 beta1 fix pg_stat_recovery_prefetch view manual

2022-05-20 Thread Thomas Munro
On Sat, May 21, 2022 at 4:07 PM Shinoda, Noriyoshi (PN Japan FSIP)
 wrote:
> Thanks to all the developers. The attached patch updates the manual for the 
> pg_stat_recovery_prefetch view.
> The current pg_stat_recovery_prefetch view documentation is missing the
> stats_reset column. The attached patch adds a description of the stats_reset
> column.

Ahh, thank you.  I will push this soon.




Re: Skipping schema changes in publication

2022-05-20 Thread vignesh C
On Thu, May 19, 2022 at 1:49 PM Peter Smith  wrote:
>
> Below are my review comments for v6-0001.
>
> ==
>
> 1. General.
>
> The patch failed 'publication' tests in the make check phase.
>
> Please add this work to the commit-fest so that the 'cfbot' can report
> such errors sooner.

Added commitfest entry

> ~~~
>
> 2. src/backend/commands/publicationcmds.c - AlterPublicationReset
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, publication relations and
> publication schemas.
> + */
> +static void
> +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt,
> + Relation rel, HeapTuple tup)
>
> SUGGESTION (Make the comment similar to the sgml text instead of
> repeating "publication" 4x !)
> /*
>  * Reset the publication options, set the ALL TABLES flag to false, and
>  * drop all relations and schemas that are associated with the publication.
>  */

Modified

> ~~~
>
> 3. src/test/regress/expected/publication.out
>
> make check failed. The diff is below:
>
> @@ -1716,7 +1716,7 @@
>  -- Verify that only superuser can reset a publication
>  ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2;
>  SET ROLE regress_publication_user2;
> -ALTER PUBLICATION testpub_reset RESET; -- fail
> +ALTER PUBLICATION testpub_reset RESET; -- fail - must be superuser
>  ERROR:  must be superuser to RESET publication
>  SET ROLE regress_publication_user;
>  DROP PUBLICATION testpub_reset;

It passed for me locally because the change was present in the 002
patch. I have moved the change to 001.

The attached v7 patch includes these changes.
[1] - https://commitfest.postgresql.org/38/3646/

Regards,
Vignesh
From 1855e00f2d6cc19427c55eec2f1e40ecc8f8c1cc Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Sat, 14 May 2022 13:13:46 +0530
Subject: [PATCH v7 1/2] Add RESET clause to Alter Publication which will reset
 the publication with default values.

This patch adds a new RESET clause to ALTER PUBLICATION which will reset
the publication to the default state which includes resetting the publication
options, setting ALL TABLES flag to false and dropping the relations and
schemas that are associated with the publication.
Usage:
ALTER PUBLICATION pub1 RESET;
---
 doc/src/sgml/ref/alter_publication.sgml   |  38 ++--
 src/backend/commands/publicationcmds.c| 100 --
 src/backend/parser/gram.y |   9 ++
 src/bin/psql/tab-complete.c   |   2 +-
 src/include/nodes/parsenodes.h|   3 +-
 src/test/regress/expected/publication.out |  69 +++
 src/test/regress/sql/publication.sql  |  37 
 7 files changed, 242 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index e2cce49471..47bd15f1fa 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -27,6 +27,7 @@ ALTER PUBLICATION name DROP name SET ( publication_parameter [= value] [, ... ] )
 ALTER PUBLICATION name OWNER TO { new_owner | CURRENT_ROLE | CURRENT_USER | SESSION_USER }
 ALTER PUBLICATION name RENAME TO new_name
+ALTER PUBLICATION name RESET
 
 where publication_object is one of:
 
@@ -65,20 +66,33 @@ ALTER PUBLICATION name RENAME TO 
 
   
-   The remaining variants change the owner and the name of the publication.
+   The OWNER clause will change the owner of the publication.
+  
+
+  
+   The RENAME clause will change the name of the publication.
+  
+
+  
+   The RESET clause will reset the publication to the
+   default state which includes resetting the publication options, setting
+   ALL TABLES flag to false and
+   dropping all relations and schemas that are associated with the publication.
   
 
   
You must own the publication to use ALTER PUBLICATION.
Adding a table to a publication additionally requires owning that table.
-   The ADD ALL TABLES IN SCHEMA and
-   SET ALL TABLES IN SCHEMA to a publication requires the
-   invoking user to be a superuser.  To alter the owner, you must also be a
-   direct or indirect member of the new owning role. The new owner must have
-   CREATE privilege on the database.  Also, the new owner
-   of a FOR ALL TABLES or FOR ALL TABLES IN
-   SCHEMA publication must be a superuser. However, a superuser can
-   change the ownership of a publication regardless of these restrictions.
+   The ADD ALL TABLES IN SCHEMA,
+   SET ALL TABLES IN SCHEMA to a publication and
+   RESET of publication requires the invoking user to be a
+   superuser. To alter the owner, you must also be a direct or indirect member
+   of the new owning role. The new owner must have CREATE
+   privilege on the database.  Also, the new owner of a
+   FOR ALL TABLES or
+   FOR ALL TABLES IN SCHEMA publication must be a superuser.
+   However, a superuser can change the ownership of a publication regardless of
+   these restrictions.
   
 
   
@@ -207,6 +221,12 @@ ALTER PUBLICATION sales_publ

Re: Skipping schema changes in publication

2022-05-20 Thread vignesh C
On Fri, May 20, 2022 at 5:49 AM Peter Smith  wrote:
>
> FYI, although the v6-0002 patch applied cleanly, I found that the SGML
> was malformed and so the pg docs build fails.
>
> ~~~
> e.g.
>
> [postgres@CentOS7-x64 sgml]$ make STYLE=website html
> { \
>   echo ""; \
>   echo ""; \
> } > version.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl YES
> ../../../src/backend/catalog/sql_feature_packages.txt
> ../../../src/backend/catalog/sql_features.txt >
> features-supported.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl NO
> ../../../src/backend/catalog/sql_feature_packages.txt
> ../../../src/backend/catalog/sql_features.txt >
> features-unsupported.sgml
> '/usr/bin/perl' ./generate-errcodes-table.pl
> ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
> '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> ref/create_publication.sgml:171: parser error : Opening and ending tag
> mismatch: varlistentry line 166 and listitem
> 
>^
> ref/create_publication.sgml:172: parser error : Opening and ending tag
> mismatch: variablelist line 60 and varlistentry
>
>   ^
> ref/create_publication.sgml:226: parser error : Opening and ending tag
> mismatch: refsect1 line 57 and variablelist
>   
>  ^
> ...
>
> I will work around it locally, but for future patches please check that
> the SGML builds OK before posting.

Thanks for reporting this; I have fixed it.
The v7 patch attached at [1] includes the fix.

[1] - 
https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com

Regards,
Vignesh




Re: Skipping schema changes in publication

2022-05-20 Thread vignesh C
On Fri, May 20, 2022 at 11:23 AM Peter Smith  wrote:
>
> Below are my review comments for v6-0002.
>
> ==
>
> 1. Commit message.
> The psql \d family of commands to display excluded tables.
>
> SUGGESTION
> The psql \d family of commands can now display excluded tables.

Modified

> ~~~
>
> 2. doc/src/sgml/ref/alter_publication.sgml
>
> @@ -22,6 +22,7 @@ PostgreSQL documentation
>   
>  
>  ALTER PUBLICATION name
> ADD publication_object [,
> ...]
> +ALTER PUBLICATION name
> ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
>
> The "exception_object" font is wrong. Should look the same as
> "publication_object"

Modified

> ~~~
>
> 3. doc/src/sgml/ref/alter_publication.sgml - Examples
>
> @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL
> TABLES IN SCHEMA marketing, sales;
>  
>
>
> +  
> +   Alter publication production_publication to 
> publish
> +   all tables except users and
> +   departments tables:
> +
> +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE
> users, departments;
> +
>
> Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> show TABLE keyword is optional.

Modified

> ~~~
>
> 4. doc/src/sgml/ref/create_publication.sgml
>
> An SGML tag error caused building the docs to fail. My fix was
> previously reported [1].

Modified

> ~~~
>
> 5. doc/src/sgml/ref/create_publication.sgml
>
> @@ -22,7 +22,7 @@ PostgreSQL documentation
>   
>  
>  CREATE PUBLICATION name
> -[ FOR ALL TABLES
> +[ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ]
>
> The "exception_object" font is wrong. Should look the same as
> "publication_object"

Modified

> ~~~
>
> 6. doc/src/sgml/ref/create_publication.sgml - Examples
>
> @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR
> TABLE users, departments, ALL TABL
>  CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, 
> sales;
>  
>
> +  
> +   Create a publication that publishes all changes in all the tables except 
> for
> +   the changes of users and
> +   departments table:
> +
> +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, 
> departments;
> +
> +  
> +
>
> 6a.
> Typo: "FOR ALL TABLE" -> "FOR ALL TABLES"

Modified

> 6b.
> Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will
> show TABLE keyword is optional.

Modified

> ~~~
>
> 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
>
> @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List
> *ancestors, int *ancestor_level
>   }
>   else
>   {
> - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> - if (list_member_oid(aschemaPubids, puboid))
> + List*aschemapubids = NIL;
> + List*aexceptpubids = NIL;
> +
> + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
> + aexceptpubids = GetRelationPublications(ancestor, true);
> + if (list_member_oid(aschemapubids, puboid) ||
> + (puballtables && !list_member_oid(aexceptpubids, puboid)))
>   {
>
> You could re-write this as multiple conditions instead of one. That
> could avoid always assigning the 'aexceptpubids', so it might be a
> more efficient way to write this logic.

Modified
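
For reference, one possible way to split the condition (sketch only,
reusing the variables from the hunk above; topmost_found stands in for
whatever the matching branch actually does):

    aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor));
    if (list_member_oid(aschemapubids, puboid))
        topmost_found = true;
    else if (puballtables)
    {
        /* fetch the EXCEPT list only when it is actually needed */
        aexceptpubids = GetRelationPublications(ancestor, true);
        if (!list_member_oid(aexceptpubids, puboid))
            topmost_found = true;
    }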

> ~~~
>
> 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues
>
> +/*
> + * Check if the publication has default values
> + *
> + * Check the following:
> + * Publication is having default options
> + *  Publication is not associated with relations
> + *  Publication is not associated with schemas
> + *  Publication is not set with "FOR ALL TABLES"
> + */
> +static bool
> +CheckPublicationDefValues(HeapTuple tup)
>
> 8a.
> Remove the tab. Replace with spaces.

Modified

> 8b.
> It might be better if this comment order is the same as the logic order.
> e.g.
>
> * Check the following:
> *  Publication is not set with "FOR ALL TABLES"
> *  Publication is having default options
> *  Publication is not associated with schemas
> *  Publication is not associated with relations

Modified

> ~~~
>
> 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
>
> +/*
> + * Reset the publication.
> + *
> + * Reset the publication options, publication relations and
> publication schemas.
> + */
> +static void
> +AlterPublicationSetAllTables(Relation rel, HeapTuple tup)
>
> The function comment and the function name do not seem to match here;
> something looks like a cut/paste error ??

Modified

> ~~~
>
> 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables
>
> + /* set all tables option */
> + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true);
> + replaces[Anum_pg_publication_puballtables - 1] = true;
>
> SUGGEST (comment)
> /* set all ALL TABLES flag */

Modified

> ~~~
>
> 11. src/backend/catalog/pg_publication.c - AlterPublication
>
> @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate,
> AlterPublicationStmt *stmt)
>   aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION,
>  stmt->pubn

Re: tweak to a few index tests to hits ambuildempty() routine.

2022-05-20 Thread Noah Misch
On Mon, Apr 25, 2022 at 10:05:08AM -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Hmm, so 027_stream_regress.pl is not prepared to deal with any unlogged
> > tables that may be left in the regression database (which is what my
> > spgist addition did).  I first tried doing a TRUNCATE of the unlogged
> > table, but that doesn't work either, and it turns out that the
> > regression database does not have any UNLOGGED relations.  Maybe that's
> > something we need to cater for, eventually, but for now dropping the
> > table suffices.  I have pushed that.
> 
> It does seem like the onus should be on 027_stream_regress.pl to
> deal with that, rather than restricting what the core tests can
> leave behind.

Yeah.  Using "pg_dumpall --no-unlogged-table-data", as attached, suffices.

> Maybe we could have it look for unlogged tables and drop them
> before making the dumps?  Although I don't understand why
> TRUNCATE wouldn't do the job equally well.

After TRUNCATE, one still gets a setval for sequences and a zero-row COPY for
tables.  When dumping a standby or using --no-unlogged-table-data, those
commands are absent.
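
Concretely, after TRUNCATE the dump still contains lines like these for an
unlogged sequence and table (names invented):

    SELECT pg_catalog.setval('public.u_seq', 1, false);

    COPY public.u_tab (id) FROM stdin;
    \.
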
Author: Noah Misch 
Commit: Noah Misch 

Use --no-unlogged-table-data in t/027_stream_regress.pl.

This removes the need to drop unlogged relations in the src/test/regress
suite, like commit dec8ad367e46180f826d5b6dc820fbecba1b71d2 did.

Reviewed by FIXME.

Discussion: https://postgr.es/m/39945.1650895...@sss.pgh.pa.us

diff --git a/src/test/recovery/t/027_stream_regress.pl 
b/src/test/recovery/t/027_stream_regress.pl
index fdb4ea0..7982ac0 100644
--- a/src/test/recovery/t/027_stream_regress.pl
+++ b/src/test/recovery/t/027_stream_regress.pl
@@ -100,7 +100,8 @@ $node_primary->wait_for_catchup($node_standby_1, 'replay',
 command_ok(
[
'pg_dumpall', '-f', $outputdir . '/primary.dump',
-   '--no-sync', '-p', $node_primary->port
+   '--no-sync',  '-p', $node_primary->port,
+   '--no-unlogged-table-data'# if unlogged, standby has schema 
only
],
'dump primary server');
 command_ok(