Re: Skipping logical replication transactions on subscriber side

2021-11-16 Thread Masahiko Sawada
On Wed, Nov 17, 2021 at 3:52 PM vignesh C  wrote:
>
> On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada  
> wrote:
> >
> > On Mon, Nov 15, 2021 at 11:43 PM vignesh C  wrote:
> > >
> > > On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow  
> > > > wrote:
> > > > >
> > > > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada 
> > > > >  wrote:
> > > > > >
> > > > > > I've attached an updated patch that incorporates all comments I got
> > > > > > so far. Please review it.
> > > > > >
> > > > >
> > > > > Thanks for the updated patch.
> > > > > A few minor comments:
> > > > >
> > > > > doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
> > > > >
> > > > > (1) tab in doc updates
> > > > >
> > > > > There's a tab before "Otherwise,":
> > > > >
> > > > > +copy of the relation with relid.
> > > > > Otherwise,
> > > >
> > > > Fixed.
> > > >
> > > > >
> > > > > src/backend/utils/adt/pgstatfuncs.c
> > > > >
> > > > > (2) The function comment for "pg_stat_reset_subscription_worker_sub"
> > > > > seems a bit long and I expected it to be multi-line (did you run
> > > > > pg_indent?)
> > > >
> > > > I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line 
> > > > comment.
> > > >
> > > > >
> > > > > src/include/pgstat.h
> > > > >
> > > > > (3) Remove PgStat_StatSubWorkerEntry.dbid?
> > > > >
> > > > > The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
> > > > > seem to be used, so I think it should be removed.
> > > > > (I could remove it and everything builds OK and tests pass).
> > > > >
> > > >
> > > > Fixed.
> > > >
> > > > Thank you for the comments! I've attached an updated version of the patch.
> > >
> > > Thanks for the updated patch.
> > > I found one issue:
> > > This Assert can fail in a few cases:
> > > +void
> > > +pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
> > > +                              LogicalRepMsgType command, TransactionId xid,
> > > +                              const char *errmsg)
> > > +{
> > > +   PgStat_MsgSubWorkerError msg;
> > > +   int len;
> > > +
> > > +   Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
> > > +   len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
> > > +
> > >
> > > I could reproduce the problem with the following scenario:
> > > Publisher:
> > > create table t1 (c1 varchar);
> > > create publication pub1 for table t1;
> > > insert into t1 values(repeat('abcd', 5000));
> > >
> > > Subscriber:
> > > create table t1(c1 smallint);
> > > create subscription sub1 connection 'dbname=postgres port=5432'
> > > publication pub1 with ( two_phase = true);
> > > postgres=# select * from pg_stat_subscription_workers;
> > > WARNING:  terminating connection because of crash of another server 
> > > process
> > > DETAIL:  The postmaster has commanded this server process to roll back
> > > the current transaction and exit, because another server process
> > > exited abnormally and possibly corrupted shared memory.
> > > HINT:  In a moment you should be able to reconnect to the database and
> > > repeat your command.
> > > server closed the connection unexpectedly
> > >This probably means the server terminated abnormally
> > >before or while processing the request.
> > > The connection to the server was lost. Attempting reset: Failed.
> > >
> > > Subscriber logs:
> > > 2021-11-15 19:27:56.380 IST [15685] LOG:  logical replication apply
> > > worker for subscription "sub1" has started
> > > 2021-11-15 19:27:56.384 IST [15687] LOG:  logical replication table
> > > synchronization worker for subscription "sub1", table "t1" has started
> > > TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
> > > File: "pgstat.c", Line: 1946, PID: 15687)
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (ExceptionalCondition+0xd0)[0x55a18f3c727f]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (pgstat_report_subworker_error+0x7a)[0x55a18f126417]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (ApplyWorkerMain+0x493)[0x55a18f176611]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (+0x54efc0)[0x55a18f134fc0]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (+0x54f3af)[0x55a18f1353af]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (+0x54e338)[0x55a18f134338]
> > > /lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
> > > /lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
> > > postgres: logical replication worker for subscription 16387 sync 16384
> > > (+0x5498c2)[0x55a18f12f8c2]
> > > postgres:

Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-11-16 Thread Masahiko Sawada
On Tue, Nov 16, 2021 at 8:45 PM Amit Kapila  wrote:
>
> On Mon, Nov 15, 2021 at 12:38 PM Masahiko Sawada  
> wrote:
> >
> > I've updated the patch so that ProcArrayInstallRestoredXmin() sets
> > both xmin and statusFlags only when the source proc is still running
> > and xmin doesn't go backwards. IOW it doesn't happen that only one of
> > them is set by this function, which seems more understandable
> > behavior.
> >
>
> How have you tested this patch? As there was no test case presented in
> this thread, I used the below manual test to verify that the patch
> works. The idea is to generate a scenario where a parallel vacuum
> worker holds back the xmin from advancing.
>
> Setup:
> -- keep autovacuum = off in postgresql.conf
> create table t1(c1 int, c2 int);
> insert into t1 values(generate_series(1,1000),100);
> create index idx_t1_c1 on t1(c1);
> create index idx_t1_c2 on t1(c2);
>
> create table t2(c1 int, c2 int);
> insert into t2 values(generate_series(1,1000),100);
> create index idx_t2_c1 on t1(c1);
>
> Session-1:
> delete from t1 where c1 < 10; --this is to ensure that vacuum has some
> work to do
>
> Session-2:
> -- this is done just to ensure the Session-1's xmin captures the value
> of this xact
> begin;
> select txid_current(); -- say value is 725
> insert into t2 values(1001, 100);
>
> Session-1:
> set min_parallel_index_scan_size=0;
> -- attach a debugger and ensure to stop parallel worker somewhere
> before it completes and the leader after launching parallel worker
> vacuum t1;
>
> Session-2:
> -- commit the open transaction
> commit;
>
> Session-3:
> -- attach a debugger and break at the caller of vacuum_set_xid_limits.
> vacuum t2;

Yes, I've tested this patch in a similar way; while running pgbench in
the background in order to constantly consume XIDs, I checked whether
the oldest xmin in the VACUUM VERBOSE log advances even while a
parallel vacuum is running.

>
> I noticed that before the patch the value of oldestXmin in Session-3
> is 725 but after the patch it got advanced. I have made minor edits to
> the attached patch. See, if this looks okay to you then please prepare
> and test the patch for back-branches as well. If you have some other
> way to test the patch then do share the same and let me know if you
> see any flaw in the above verification method.

The patch looks good to me. But I can't come up with a stable test for
this. It seems hard without a way to stop and resume parallel vacuum
workers. Do you have any good ideas?

I've attached patches for back branches (13 and 14).

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


0001-Fix-parallel-operations-that-prevent-oldest-xmin-fro_PG14.patch
Description: Binary data


0001-Fix-parallel-operations-that-prevent-oldest-xmin-fro_PG13.patch
Description: Binary data


Re: CREATE tab completion

2021-11-16 Thread Michael Paquier
On Wed, Nov 17, 2021 at 10:44:44AM +0900, Ken Kato wrote:
> I made a patch for this, so please have a look.

+   else if (Matches("CREATE", "TRANSFORM", "FOR", MatchAny)
+   COMPLETE_WITH("LANGUAGE")
+   else if (Matches("CREATE", "TRANSFORM", "FOR", MatchAny, "LANGUAGE")

Those three lines are wrong, for two different reasons and three
mistakes.  You may want to compile your code before sending it :)
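For the record, the two reasons appear to be unbalanced parentheses on the Matches() calls and a missing semicolon after COMPLETE_WITH(). A version of the fragment that compiles would presumably look like the following sketch (the completion query used for LANGUAGE is an assumption):

```c
else if (Matches("CREATE", "TRANSFORM", "FOR", MatchAny))
    COMPLETE_WITH("LANGUAGE");
else if (Matches("CREATE", "TRANSFORM", "FOR", MatchAny, "LANGUAGE"))
    COMPLETE_WITH_QUERY(Query_for_list_of_languages);
```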

"CREATE [TEMP|TEMPORARY] SEQUENCE name AS" could be completed with the
supported types.  There are three of them.
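(The three types accepted by CREATE SEQUENCE ... AS are the integer types, so a sketch of that completion — with the exact Matches() pattern being an assumption — might be:)

```c
else if (TailMatches("CREATE", "SEQUENCE", MatchAny, "AS") ||
         TailMatches("CREATE", "TEMP|TEMPORARY", "SEQUENCE", MatchAny, "AS"))
    COMPLETE_WITH("smallint", "integer", "bigint");
```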

+   COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+                       " UNION SELECT 'AUTORIZATION'");
Incorrect completion here, s/AUTORIZATION/AUTHORIZATION/.

+   else if (Matches("CREATE", "CONVERSION", MatchAny))
+   COMPLETE_WITH("FOR");
Why didn't you consider DEFAULT for the set of changes with
conversions?

+   else if (Matches("CREATE", "SCHEMA"))
+   COMPLETE_WITH("AUTHORIZATION", "IF NOT EXISTS");
+   else if (Matches("CREATE", "SCHEMA", "IF", "NOT", "EXISTS"))
We don't do any completion for INE or IE in the other object types.

+   /* CREATE LANGUAGE */
+   else if (Matches("CREATE", "LANGUAGE", MatchAny))
+   COMPLETE_WITH("HANDLER");
+   else if (Matches("CREATE", "LANGUAGE", MatchAny, "HANDLER", MatchAny))
+   COMPLETE_WITH("INLINE", "VALIDATOR");
It looks like you forgot the case of "OR REPLACE" here?  This is done
for triggers, for example.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade test for binary compatibility of core data types

2021-11-16 Thread Michael Paquier
On Sun, Nov 07, 2021 at 01:22:00PM -0600, Justin Pryzby wrote:
> That may be good enough for test.sh, but if the kludges were moved to a .sql
> script which was also run by the buildfarm (instead of its hardcoded
> kludges), then
> it might be necessary to handle the additional stuff my patch did, like:
>
> +   DROP TRANSFORM FOR integer LANGUAGE 
> sql CASCADE;"
> +   DROP FUNCTION boxarea(box);"
> +   DROP FUNCTION funny_dup17();"

These apply for an old version <= v10.

> +   DROP TABLE abstime_tbl;"
> +   DROP TABLE reltime_tbl;"
> +   DROP TABLE tinterval_tbl;"

old version <= 9.3.

> +   DROP AGGREGATE 
> first_el_agg_any(anyelement);"

Not sure about this one.

> +   DROP AGGREGATE 
> array_cat_accum(anyarray);"
> +   DROP OPERATOR @#@(NONE,bigint);"

These are on 9.4.  It is worth noting that TestUpgradeXversion.pm
recreates those objects.  I'd agree with closing the gap completely rather
than just moving what test.sh does, so as to wipe out as much buildfarm
client code as possible.

> Or, maybe it's guaranteed that the animals all run latest version of old
> branches, in which case I think some of the BF's existing logic could be
> dropped, which would help to reconcile these two scripts:

That seems like a worthy goal to reduce the amount of duplication with
the buildfarm code, while allowing tests from upgrades with older
versions (the WAL segment size and group permission issue in test.sh
had better be addressed in a better way, perhaps once the pg_upgrade
tests are moved to TAP).  There are also things specific to contrib/
modules with older versions, but that may be too specific for this
exercise.

+\if :oldpgversion_le84
+DROP FUNCTION public.myfunc(integer);
+\endif

The oldest version tested by the buildfarm is 9.2, so we could ignore
this part I guess?

Andrew, what do you think about this part?  Based on my read of this
thread, there is an agreement that this approach makes the buildfarm
code more manageable, so that committers would not need to patch the
buildfarm code if their tests fail.  I agree with this conclusion, but
I wanted to double-check with you first.  This would need a backpatch
down to 10 so that we could clean up as much code as possible in
TestUpgradeXversion.pm without waiting an extra 5 years.  Please
note that I am fine with sending a patch for the buildfarm client.

> We wouldn't miss new core types, because of the 2nd part of type_sanity which
> tests that each core type was included in the "manytypes" table.

Thanks, I see your point now after a closer read.
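A sketch of what that second part of type_sanity could look like, for concreteness (illustrative only — the filters and the exact table name used by the real test may differ):

```sql
-- Core base types not covered by the pg_upgrade test table should come
-- back empty; a non-empty result flags a newly added type that was not
-- added to the "manytypes" table.
SELECT oid, typname
FROM pg_type t
WHERE typtype = 'b'              -- base types only, skipping pseudo-types
  AND typisdefined
  AND NOT EXISTS (SELECT 1
                  FROM pg_attribute a
                  JOIN pg_class c ON c.oid = a.attrelid
                  WHERE c.relname = 'manytypes'
                    AND a.atttypid = t.oid);
```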

There is still a pending question for contrib modules, but I think
that we need to think larger here with a better integration of
contrib/ modules in the upgrade testing process.  Making that cheap
would require running the set of regression tests on the instance
to-be-upgraded first.  I think that one step in this direction would
be to have unique databases for each contrib/ module, so that there is
no overlap between the objects dropped?

Having some checks with code types looks fine as a first step, so
let's do that.  I have reviewed 0001 and rewrote a couple of comments.
All the comments from upthread seem to be covered with that.  So I'd
like to get that applied on HEAD.  We could as well be less
conservative and backpatch that down to 12 to follow on 7c15cef so we
would be more careful with 15~ already (a backpatch down to 14 would
be enough for this purpose, actually thanks to the 14->15 upgrade
path).
--
Michael
From c4d766f9a461dad2d51cba3cb8d7d0c523267716 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 17 Nov 2021 15:57:04 +0900
Subject: [PATCH v7] pg_upgrade: test to exercise binary compatibility

Creating a table with columns of many different datatypes to notice if the
binary format is accidentally changed again, as happened at:
7c15cef86 Base information_schema.sql_identifier domain on name, not varchar.

I checked that if I cherry-pick to v11, and comment out
old_11_check_for_sql_identifier_data_type_usage(), then pg_upgrade/test.sh
detects the original problem:
pg_dump: error: Error message from server: ERROR:  invalid memory alloc request size 18446744073709551613

I understand the buildfarm has its own cross-version-upgrade test, which I
think would catch this on its own.
---
 src/test/regress/expected/sanity_check.out |   1 +
 src/test/regress/expected/type_sanity.out  | 102 +
 src/test/regress/sql/type_sanity.sql   | 100 
 3 files changed, 203 insertions(+)

diff --git a/src/test/regress/expected/sanity_check.out b/src/test/regress/expected/sanity_check.out
index d04dc66db9..63706a28cc 100644
--- a/src/test/regress/expected/sanity_check.out
+++ b/src/t

Re: Skipping logical replication transactions on subscriber side

2021-11-16 Thread vignesh C
On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada  wrote:
>
> On Mon, Nov 15, 2021 at 11:43 PM vignesh C  wrote:
> >
> > On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada  
> > wrote:
> > >
> > > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow  
> > > wrote:
> > > >
> > > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada  
> > > > wrote:
> > > > >
> > > > > I've attached an updated patch that incorporates all comments I got so
> > > > > far. Please review it.
> > > > >
> > > >
> > > > Thanks for the updated patch.
> > > > A few minor comments:
> > > >
> > > > doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
> > > >
> > > > (1) tab in doc updates
> > > >
> > > > There's a tab before "Otherwise,":
> > > >
> > > > +copy of the relation with relid.
> > > > Otherwise,
> > >
> > > Fixed.
> > >
> > > >
> > > > src/backend/utils/adt/pgstatfuncs.c
> > > >
> > > > (2) The function comment for "pg_stat_reset_subscription_worker_sub"
> > > > seems a bit long and I expected it to be multi-line (did you run
> > > > pg_indent?)
> > >
> > > I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line 
> > > comment.
> > >
> > > >
> > > > src/include/pgstat.h
> > > >
> > > > (3) Remove PgStat_StatSubWorkerEntry.dbid?
> > > >
> > > > The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
> > > > seem to be used, so I think it should be removed.
> > > > (I could remove it and everything builds OK and tests pass).
> > > >
> > >
> > > Fixed.
> > >
> > > Thank you for the comments! I've attached an updated version of the patch.
> >
> > Thanks for the updated patch.
> > I found one issue:
> > This Assert can fail in a few cases:
> > +void
> > +pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
> > +                              LogicalRepMsgType command, TransactionId xid,
> > +                              const char *errmsg)
> > +{
> > +   PgStat_MsgSubWorkerError msg;
> > +   int len;
> > +
> > +   Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
> > +   len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
> > +
> >
> > I could reproduce the problem with the following scenario:
> > Publisher:
> > create table t1 (c1 varchar);
> > create publication pub1 for table t1;
> > insert into t1 values(repeat('abcd', 5000));
> >
> > Subscriber:
> > create table t1(c1 smallint);
> > create subscription sub1 connection 'dbname=postgres port=5432'
> > publication pub1 with ( two_phase = true);
> > postgres=# select * from pg_stat_subscription_workers;
> > WARNING:  terminating connection because of crash of another server process
> > DETAIL:  The postmaster has commanded this server process to roll back
> > the current transaction and exit, because another server process
> > exited abnormally and possibly corrupted shared memory.
> > HINT:  In a moment you should be able to reconnect to the database and
> > repeat your command.
> > server closed the connection unexpectedly
> >This probably means the server terminated abnormally
> >before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> >
> > Subscriber logs:
> > 2021-11-15 19:27:56.380 IST [15685] LOG:  logical replication apply
> > worker for subscription "sub1" has started
> > 2021-11-15 19:27:56.384 IST [15687] LOG:  logical replication table
> > synchronization worker for subscription "sub1", table "t1" has started
> > TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
> > File: "pgstat.c", Line: 1946, PID: 15687)
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (ExceptionalCondition+0xd0)[0x55a18f3c727f]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (pgstat_report_subworker_error+0x7a)[0x55a18f126417]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (ApplyWorkerMain+0x493)[0x55a18f176611]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (+0x54efc0)[0x55a18f134fc0]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (+0x54f3af)[0x55a18f1353af]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (+0x54e338)[0x55a18f134338]
> > /lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
> > /lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (+0x5498c2)[0x55a18f12f8c2]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (PostmasterMain+0x134c)[0x55a18f12f1dd]
> > postgres: logical replication worker for subscription 16387 sync 16384
> > (+0x43c3d4)[0x55a18f0223d4]
> > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565]
> > postgres: logical replic

Re: Schema variables - new implementation for Postgres 15

2021-11-16 Thread Pavel Stehule
On Mon, Nov 15, 2021 at 21:23, Justin Pryzby wrote:

> On Mon, Nov 15, 2021 at 09:00:13PM +0100, Pavel Stehule wrote:
> > Thank you for review and fixes, I try to complete some version for next
> > work, and looks so your patch 0001 is broken
> >
> > gedit reports to me broken unicode \A0\A0\A0\A0\A0
> >
> > my last patch has 276KB and your patch has 293KB?
>
> On Mon, Nov 15, 2021 at 09:06:08PM +0100, Pavel Stehule wrote:
> > >
> > > my last patch has 276KB and your patch has 293KB?
> >
> > Please, can you resend your version of patch 0001?
>
> https://www.postgresql.org/message-id/20211106013904.gg17...@telsasoft.com
>
> 0001 is exactly your patch applied to HEAD, and 0002 are Tomas' changes
> relative to your patch.
>
> 0003 is my contribution on top.  My intent is that you wouldn't apply
> 0001, but
> rather apply my 0003 on top of your existing branch, and then review
> 0002/0003,
> and then squish the changes into your patch.
>
> I see the 0xa0 stuff in your original patch before my changes, but I'm not
> sure
> what went wrong.
>
> Let me know if you have any issue applying my changes on top of your
> existing,
> local branch ?
>

It is OK, I was able to apply all your patches to my local branch.
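As for locating stray 0xA0 (non-breaking space) bytes like the ones mentioned above, a quick grep over the patch file works — a sketch (the file name is just an example):

```shell
# Write a sample file containing a stray non-breaking space (octal \240,
# i.e. 0xA0) on line 2, standing in for a patch with the problem above.
printf 'clean line\nbroken\240line\n' > /tmp/sample.patch

# Pin the locale to C so grep matches the raw byte, then list offending
# lines with their line numbers.
LC_ALL=C grep -n "$(printf '\240')" /tmp/sample.patch
```

This prints each offending line prefixed with its line number (here, line 2).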

Regards

Pavel

>
> --
> Justin
>


Re: pg_get_publication_tables() output duplicate relid

2021-11-16 Thread Amit Langote
On Wed, Nov 17, 2021 at 12:15 PM houzj.f...@fujitsu.com
 wrote:
> On Wed, Nov 17, 2021 10:47 AM Amit Kapila  wrote:
> > On Tue, Nov 16, 2021 at 7:21 AM houzj.f...@fujitsu.com wrote:
> > > If we decide to disallow this case, we seem to need to handle some other
> > > cases as well; for example, we might also need an additional check when
> > > ATTACH a partition, because the partition's parent table could already
> > > be published in the same publication as the partition.
> > >
> >
> > What kind of additional checks you are envisioning and why?
>
> For example:
>
> create table tbl1 (a int) partition by range (a);
> create table tbl1_part1 (like tbl1);
> create table tbl1_part2 partition of tbl1 for values from (10) to (20);
> create publication pub for table
> tbl1, tbl1_part1 with (publish_via_partition_root=false);
>
> --- We might need addition check here
> Alter table tbl1 ATTACH partition tbl1_part1 for values from (1) to (10);
>
> In the above case, 'tbl1_part1' is not a partition of 'tbl1' when creating the
> publication. After the ATTACH, 'tbl1_part1' would become a partition of 'tbl1',
> which seems to be the case we want to disallow (both parent and child table in
> the publication). So, when ATTACHing, I thought we might need to check all the
> parent tables to see if any of them is in the same publication that the table
> being attached belongs to. Does that make sense?

I don't think creating or attaching a partition of a table that is
present in a publish_via_partition_root=false publication actually adds
the partition to pg_publication_rel, the base catalog.  A partition's
membership in the publication is implicit, unless of course you add it
to the publication explicitly, like all the examples we have been
discussing.  I guess we're only arguing about the problems with the
pg_publication_tables view, which does expand the partitioned table to
show the partitions that are not otherwise present in the base
catalog.
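To make the view problem concrete: with the earlier example publication (parent and partition both added explicitly, publish_via_partition_root=false), querying the view on the publisher exposes the overlap — an illustrative sketch:

```sql
SELECT * FROM pg_publication_tables WHERE pubname = 'pub';
-- tbl1_part1 can show up twice: once from expanding the partitioned
-- table tbl1, and once from its own explicit pg_publication_rel entry.
```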

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: pg_get_publication_tables() output duplicate relid

2021-11-16 Thread Amit Langote
On Tue, Nov 16, 2021 at 10:27 PM Amit Kapila  wrote:
> On Mon, Nov 15, 2021 at 7:12 PM Alvaro Herrera  
> wrote:
> > > On Mon, Nov 15, 2021 at 1:48 PM houzj.f...@fujitsu.com
> > >  wrote:
> >
> > > > create table tbl1 (a int) partition by range (a);
> > > > create table tbl1_part1 partition of tbl1 for values from (1) to (10);
> > > > create table tbl1_part2 partition of tbl1 for values from (10) to (20);
> > > > create publication pub for table
> > > > tbl1, tbl1_part1 with (publish_via_partition_root=false);
> >
> > In the name of consistency, I think this situation should be an error --
> > I mean, if we detect that the user is trying to add both the partitioned
> > table *and* its partition, then all manner of things are possibly going
> > to go wrong in some way, so my inclination is to avoid it altogether.
> >
> > Is there any reason to allow that?
>
> I think it could provide flexibility to users to later change
> "publish_via_partition_root" option. Because when that option is
> false, we use individual partitions schema to send changes and when it
> is true, we use root table's schema to send changes. Added Amit L. to
> know if he has any thoughts on this matter as he was the author of
> this work?

FWIW, I'm not sure that adding an error in this path, that is, when a
user adds both a partitioned table and its partitions to a
publication, helps much.  As for the safety of allowing it, if you
look at get_rel_sync_entry(), which handles much of the complexity of
determining whether to publish a relation's changes and the schema to
use when doing so, you may be able to see that a partition being added
duplicatively is harmless, modulo any undiscovered bugs.  At least as
far as the post-initial-sync replication functionality is concerned.

What IS problematic is what a subscriber sees in the
pg_publication_tables view and the problem occurs only in the initial
sync phase, where the partition is synced duplicatively because of
being found in the view along with the parent in this case, that is,
when publish_via_partition_root is true.  I was saying on the other
thread [1] that we should leave it up to the subscriber to decide what
to do when the view (duplicatively) returns both the parent and the
partition, citing the use case that a subscriber may want to replicate
the parent and the partition as independent tables.  Though I now tend
to agree with Amit K that that may not be such a meaningful or all that
common use case, and that the implementation on the subscriber side
would have to be unnecessarily complex.
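One way the duplicate entries could be filtered out of the view, expressed as SQL for illustration (purely a sketch of the idea — the actual fix would likely live in pg_get_publication_tables() itself, and the exact predicate is an assumption):

```sql
-- Keep a relation only if none of its ancestors is also published by the
-- same publication, so each partition tree contributes each table once.
SELECT p.pubname, t.relid::regclass
FROM pg_publication p,
     LATERAL pg_get_publication_tables(p.pubname) AS t(relid)
WHERE NOT EXISTS (
    SELECT 1
    FROM pg_partition_ancestors(t.relid) AS a(ancestor)
    JOIN pg_publication_rel pr ON pr.prrelid = a.ancestor
    WHERE pr.prpubid = p.oid
      AND a.ancestor <> t.relid);
```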

So maybe it makes sense to just do what has been proposed --
de-duplicate partitions out of the pg_publication_tables view, unless
we know of a bigger problem that requires us to hack the subscriber
side of things too.  Actually, I came to know of one such problem
while thinking about this today: when you ATTACH a partition to a
table that is present in a publish_via_partition_root=true
publication, it doesn't get copied via the initial sync even though
subsequent replication works just fine.  The reason for that is that
the subscriber only syncs the partitions that are known at the time
when the parent is first synced, that too via the parent (as SELECT
 FROM parent), and then marks the parent as sync-done.
Refreshing the subscription after ATTACHing doesn't help, because the
subscriber can't see any partitions to begin with.  Maybe a more
elaborate solution is needed for this one, though I haven't figured
what it is going to look like yet.

Thanks.

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] 
https://www.postgresql.org/message-id/CA%2BHiwqHnDHcT4OOcga9rDFyc7TvDrpN5xFH9J2pyHQo9ptvjmQ%40mail.gmail.com




Re: isolation test output format

2021-11-16 Thread Andy Fan
> Are you using recent sources?  The isolationtester's output hasn't
> looked like that since June (4a054069a et al).
>
> regards, tom lane

I am coding against "Stamp 14beta2", which is very close to having
4a054069a.  I have tried a newer version, and the issue here is fixed
perfectly.  Thanks for that!


-- 
Best Regards
Andy Fan




Re: Skipping logical replication transactions on subscriber side

2021-11-16 Thread Masahiko Sawada
On Wed, Nov 17, 2021 at 1:58 PM Amit Kapila  wrote:
>
> On Wed, Nov 17, 2021 at 9:13 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada  wrote:
> >
> > 2)
> > + 
> > +  
> > +   subrelid oid
> > +  
> > +  
> > +   OID of the relation that the worker is synchronizing; null for the
> > +   main apply worker
> > +  
> > + 
> >
> > Is the 'subrelid' only used for distinguishing the worker type ?
> >
>
> I think it will additionally tell which table sync worker as well.

Right.

>
> > If so, would it
> > be clearer to have a string value here? I recall that a previous version
> > of the patch had a failure_source column, but it was removed. Maybe I
> > missed something.
> >
>
> I also don't remember the reason for this but like to know.

I felt it's a bit redundant. A NULL subrelid already means that the
entry is for the main apply worker, and a non-NULL one means a
tablesync worker. If users want a value like "apply" or "tablesync"
for each entry, they can derive it from the subrelid value.

> I am also reviewing the latest version of the patch and will share
> comments/questions sometime today.

Thanks!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: [Patch] ALTER SYSTEM READ ONLY

2021-11-16 Thread Amul Sul
On Sat, Nov 13, 2021 at 2:18 AM Robert Haas  wrote:
>
> On Mon, Nov 8, 2021 at 8:20 AM Amul Sul  wrote:
> > Attached is the rebased version of refactoring as well as the
> > pg_prohibit_wal feature patches for the latest master head (commit #
> > 39a3105678a).
>
> I spent a lot of time today studying 0002, and specifically the
> question of whether EndOfLog must be the same as
> XLogCtl->replayEndRecPtr and whether EndOfLogTLI must be the same as
> XLogCtl->replayEndTLI.
>
> The answer to the former question is "no" because, if we don't enter
> redo, XLogCtl->replayEndRecPtr won't be initialized at all. If we do
> enter redo, then I think it has to be the same unless something very
> weird happens. EndOfLog gets set like this:
>
> XLogBeginRead(xlogreader, LastRec);
> record = ReadRecord(xlogreader, PANIC, false, replayTLI);
> EndOfLog = EndRecPtr;
>
> In every case that exists in our regression tests, EndRecPtr is the
> same before these three lines of code as it is afterward. However, if
> you test with recovery_target=immediate, you can get it to be
> different, because in that case we drop out of the redo loop after
> calling recoveryStopsBefore() rather than after calling
> recoveryStopsAfter(). Similarly I'm fairly sure that if you use
> recovery_target_inclusive=off you can likewise get it to be different
> (though I discovered the hard way that recovery_target_inclusive=off
> is ignored when you use recovery_target_name). It seems like a really
> bad thing that neither recovery_target=immediate nor
> recovery_target_inclusive=off have any tests, and I think we ought to
> add some.
>

recovery/t/003_recovery_targets.pl has a test for
recovery_target=immediate but not for recovery_target_inclusive=off; we
can add one for the recovery_target_lsn, recovery_target_time, and
recovery_target_xid cases, the only ones it affects.

> Anyway, in effect, these three lines of code have the effect of
> backing up the xlogreader by one record when we stop before rather
> than after a record that we're replaying. What that means is that
> EndOfLog is going to be the end+1 of the last record that we actually
> replayed. There might be one more record that we read but did not
> replay, and that record won't impact the value we end up with in
> EndOfLog. Now, XLogCtl->replayEndRecPtr is also that end+1 of the last
> record that we actually replayed. To put that another way, there's no
> way to exit the main redo loop after we set XLogCtl->replayEndRecPtr
> and before we change LastRec. So in the cases where
> XLogCtl->replayEndRecPtr gets initialized at all, it can only be
> different from EndOfLog if something different happens when we re-read
> the last-replayed WAL record than what happened when we read it the
> first time. That seems unlikely, and would be unfortunate it if it did
> happen. I am inclined to think that it might be better not to reread
> the record at all, though.

There are two reasons the record is reread: first, the one you have
just explained, where the redo loop drops out due to
recoveryStopsBefore(), and second, that InRecovery is false.

In the former case, at the end, the redo while-loop reads a new record,
which in effect updates EndRecPtr, and when we break out of the loop we
reach the place where we reread the record -- that is, we read the
record before the one the redo loop has read (i.e. LastRec), which
correctly sets EndRecPtr. In the latter case, definitely, we don't
need any adjustment to EndRecPtr.

So technically one case needs the reread, but even that is not needed,
since we have that value in XLogCtl->lastReplayedEndRecPtr. I agree
that we do not need to reread the record, but EndOfLog and EndOfLogTLI
should then be set conditionally, something like:

if (InRecovery)
{
EndOfLog = XLogCtl->lastReplayedEndRecPtr;
EndOfLogTLI = XLogCtl->lastReplayedTLI;
}
else
{
EndOfLog = EndRecPtr;
EndOfLogTLI = replayTLI;
}

> As far as this patch goes, I think we need
> a solution that doesn't involve fetching EndOfLog from a variable
> that's only sometimes initialized and then not doing anything with it
> except in the cases where it was initialized.
>

Another concern is that EndOfLog can change further on in the following case:

/*
 * Actually, if WAL ended in an incomplete record, skip the parts that
 * made it through and start writing after the portion that persisted.
 * (It's critical to first write an OVERWRITE_CONTRECORD message, which
 * we'll do as soon as we're open for writing new WAL.)
 */
if (!XLogRecPtrIsInvalid(missingContrecPtr))
{
Assert(!XLogRecPtrIsInvalid(abortedRecPtr));
EndOfLog = missingContrecPtr;
}

Now the only solution I can think of is to copy EndOfLog (and likewise
EndOfLogTLI) into shared memory.

> As for EndOfLogTLI, I'm afraid I don't think that's the same thing as
> XLogCtl->replayEndTLI. Now, it's hard to be sure, because I don't
> think the regression tests contain any scenarios where we run recovery
> and the values end up different. However, I th

Slow client can delay replication despite max_standby_streaming_delay set

2021-11-16 Thread Andrey Borodin
Hi hackers!

$subj was recently observed on one of our installations.

Startup process backtrace:
#0  0x7fd216660d27 in epoll_wait (epfd=525, events=0x55c688dfbde8, 
maxevents=maxevents@entry=1, timeout=timeout@entry=-1)
#1  0x55c687264be9 in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7ffdf8089f00, cur_timeout=-1, set=0x55c688dfbd78)
#2  WaitEventSetWait (set=set@entry=0x55c688dfbd78, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7ffdf8089f00, nevents=nevents@entry=1, 
wait_event_info=wait_event_info@entry=67108864) at 
/build/../src/backend/storage/ipc/latch.c:1000
#3  0x55c687265038 in WaitLatchOrSocket (latch=0x7fd1fa735454, 
wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0, 
wait_event_info=wait_event_info@entry=67108864)
#4  0x55c6872650f5 in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0, 
wait_event_info=wait_event_info@entry=67108864
#5  0x55c687276399 in ProcWaitForSignal 
(wait_event_info=wait_event_info@entry=67108864)
#6  0x55c68726c898 in ResolveRecoveryConflictWithBufferPin ()
#7  0x55c6872582c5 in LockBufferForCleanup (buffer=292159)
#8  0x55c687259447 in ReadBuffer_common (smgr=0x55c688deae40, 
relpersistence=relpersistence@entry=112 'p', 
forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=3751242, 
mode=mode@entry=RBM_ZERO_AND_CLEANUP_LOCK, strategy=strategy@entry=0x0, 
hit=0x7ffdf808a117 "\001")
#9  0x55c687259b6b in ReadBufferWithoutRelcache (rnode=..., 
forkNum=forkNum@entry=MAIN_FORKNUM, blockNum=blockNum@entry=3751242, 
mode=mode@entry=RBM_ZERO_AND_CLEANUP_LOCK, strategy=strategy@entry=0x0)
#10 0x55c68705655f in XLogReadBufferExtended (rnode=..., 
forknum=MAIN_FORKNUM, blkno=3751242, mode=RBM_ZERO_AND_CLEANUP_LOCK)
#11 0x55c687056706 in XLogReadBufferForRedoExtended 
(record=record@entry=0x55c688dd2378, block_id=block_id@entry=0 '\000', 
mode=mode@entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock@entry=1 '\001', 
buf=buf@entry=0x7ffdf808a218)
#12 0x55c68700728b in heap_xlog_clean (record=0x55c688dd2378)
#13 heap2_redo (record=0x55c688dd2378)
#14 0x55c68704a7eb in StartupXLOG ()

Backend holding the buffer pin:

#0  0x7fd216660d27 in epoll_wait (epfd=5, events=0x55c688d67ca8, 
maxevents=maxevents@entry=1, timeout=timeout@entry=-1)
#1  0x55c687264be9 in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7ffdf808e070, cur_timeout=-1, set=0x55c688d67c38)
#2  WaitEventSetWait (set=0x55c688d67c38, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7ffdf808e070, nevents=nevents@entry=1, 
wait_event_info=wait_event_info@entry=100663297)
#3  0x55c687185d7e in secure_write (port=0x55c688db4a60, 
ptr=ptr@entry=0x55c688dd338e, len=len@entry=2666)
#4  0x55c687190f4c in internal_flush ()
#5  0x55c68719118a in internal_putbytes (s=0x55c68f8dcc35 "", 
s@entry=0x55c68f8dcc08 "", len=65)
#6  0x55c687191262 in socket_putmessage (msgtype=, 
s=0x55c68f8dcc08 "", len=)
#7  0x55c687193431 in pq_endmessage (buf=buf@entry=0x7ffdf808e1a0)
#8  0x55c686fd1442 in printtup (slot=0x55c6894c2dc0, self=0x55c689326b40)
#9  0x55c687151962 in ExecutePlan (execute_once=, 
dest=0x55c689326b40, direction=, numberTuples=0, 
sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x55c689281a28, estate=0x55c6892816f8)
#10 standard_ExecutorRun (queryDesc=0x55c689289628, direction=, 
count=0, execute_once=)
#11 0x7fd2074100c5 in pgss_ExecutorRun (queryDesc=0x55c689289628, 
direction=ForwardScanDirection, count=0, execute_once=)
#12 0x55c68728a356 in PortalRunSelect (portal=portal@entry=0x55c688d858b8, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0x55c689326b40)
#13 0x55c68728b988 in PortalRun (portal=portal@entry=0x55c688d858b8, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', 
run_once=run_once@entry=1 '\001', dest=dest@entry=0x55c689326b40, 
altdest=altdest@entry=0x55c689326b40, completionTag=0x7ffdf808e580 "")
#14 0x55c687287425 in exec_simple_query (query_string=0x55c688ec5e38 
"select\n  '/capacity/created-at-counter-by-time-v2' as sensor,\n  
round(extract(epoch from shipment_date))
#15 0x55c687289418 in PostgresMain (argc=, 
argv=argv@entry=0x55c688dd3e28, dbname=, username=)


I think the problem here is that secure_write() uses an infinite timeout.
Presumably we rely on TCP keepalives here, but they did not fire for some
reason; it seems the client was alive but sluggish.
Does it make sense to look for infinite timeouts in communication and replace
them with a loop that checks for interrupts?

Best regards, Andrey Borodin.





Re: Deficient error handling in pg_dump and pg_basebackup

2021-11-16 Thread Michael Paquier
On Tue, Nov 16, 2021 at 10:26:11PM -0500, Tom Lane wrote:
> I feel like doing an immediate exit() for these errors and not other
> ones is a pretty terrible idea, mainly because it doesn't account for
> the question of what to do with a failure that prevents us from getting
> to the fsync() call in the first place.  So I'd like to see a better
> design rather than more quick hacking.  I confess I don't have a
> clear idea of what "a better design" would look like.

[ .. thinks .. ]

We cannot really have something equivalent to data_sync_retry in the
frontends.  But I'd like to think that it would be fine for
pg_basebackup to just exit() on this failure, so that callers would be
able to retry a base backup.  pg_receivewal is trickier though.  An
exit() would allow flush retries of a previous WAL segment where
things failed, but that only holds when using --no-loop (still, the
code path triggered by this option would not be used).  When not using
--no-loop, it may be better to just retry streaming from the previous
point, since the error should be reported from walmethods.c to the
upper stack anyway.

> However, that's largely orthogonal to any of the things my proposed
> patches are trying to fix.  If you want to review the patches without
> considering the fsync-error-handling problem, that'd be great.

I have looked at them upthread, FWIW:
https://www.postgresql.org/message-id/yytsj5vlwp5fa...@paquier.xyz
Your proposals still look rather sane to me, after a second look.
--
Michael


signature.asc
Description: PGP signature


Re: Skipping logical replication transactions on subscriber side

2021-11-16 Thread Amit Kapila
On Wed, Nov 17, 2021 at 9:13 AM houzj.f...@fujitsu.com
 wrote:
>
> On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada  wrote:
>
> 2)
> + 
> +  
> +   subrelid oid
> +  
> +  
> +   OID of the relation that the worker is synchronizing; null for the
> +   main apply worker
> +  
> + 
>
> Is the 'subrelid' only used for distinguishing the worker type ?
>

I think it will additionally tell us which table sync worker it is.

> If so, would it
> be clear to have a string value here. I recalled the previous version patch 
> has
> failure_source column but was removed. Maybe I missed something.
>

I also don't remember the reason for this but would like to know.

I am also reviewing the latest version of the patch and will share
comments/questions sometime today.

-- 
With Regards,
Amit Kapila.




RE: Failed transaction statistics to measure the logical replication progress

2021-11-16 Thread osumi.takami...@fujitsu.com
On Wednesday, November 17, 2021 12:19 PM Masahiko Sawada 
 wrote:
> On Tue, Nov 16, 2021 at 9:34 PM osumi.takami...@fujitsu.com
>  wrote:
> >
> > On Monday, November 15, 2021 9:14 PM I wrote:
> > > I've conducted some update for this.
> > > (The rebased part is only C code and checked by pgindent)
> > I'll update my patches since a new skip xid patch has been shared in
> > [1].
> >
> > This version includes some minor renames of functions that are related
> > to transaction sizes.
> 
> I've looked at v12-0001 patch. Here are some comments:
Thank you for paying attention to this thread!


> -   TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
> +   TupleDescInitEntry(tupdesc, (AttrNumber) 3, "last_error_relid",
>OIDOID, -1, 0);
> -   TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
> +   TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
>TEXTOID, -1, 0);
> -   TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
> +   TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
>XIDOID, -1, 0);
> -   TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
> +   TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
>INT8OID, -1, 0);
> -   TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
> +   TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",
> 
> If renaming column names clarifies those meanings, the above changes should
> be included into my patch that introduces pg_stat_subscription_workers
> view?
First of all, your column names for pg_stat_subscription_workers look totally
OK to me by themselves, and I had thought I would take care of the renaming
when my stats patches are committed.
But if you agree with the new names above and fixing your patch doesn't
bother you, I'd appreciate your help!

> I think that exporting PartitionTupleRouting should not be done in the one
> patch together with renaming the view columns. There is not relevance
> between them at all. If it's used by v12-0002 patch, I think it should be 
> included
> in that patch or in another separate patch.
Yes, it's used by v12-0002.

You are absolutely right. Once you update your patch as described above,
I would like to make this part independent.


Best Regards,
Takamichi Osumi




Re: Non-superuser subscription owners

2021-11-16 Thread Jeff Davis
On Wed, 2021-11-03 at 12:50 -0700, Mark Dilger wrote:
> The first two patches are virtually unchanged.  The third updates the
> behavior of the apply workers, and updates the documentation to
> match.

v2-0001 corrects some surprises, but may create others. Why is renaming
allowed, but not changing the options? What if we add new options, and
some of them seem benign for a non-superuser to change?

The commit message part of the patch says that it's to prevent non-
superusers from being able to (effectively) create subscriptions, but
don't we want privileged non-superusers to be able to create
subscriptions?

Regards,
Jeff Davis






Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

2021-11-16 Thread Etsuro Fujita
Kuroda-san,

On Thu, Nov 11, 2021 at 11:27 AM kuroda.hay...@fujitsu.com
 wrote:
> I love your proposal because it will remove a bottleneck
> for PostgreSQL build-in sharding.
>
> I read your patch briefly and I think basically it's good.

Great!  Thanks for reviewing!

> Currently I have only one comment.
>
> In your patch, postgres_fdw sends a COMMIT command to all entries in the hash 
> table
> and waits for the result without a timeout from the first entry.
> I think this specification is good because it's very simple,
> but if a COMMIT for a particular foreign server could take some time,
> I thought it might be more efficient to stop waiting for results and look at 
> the next entry.
> This is how it works. First, we define a function similar to 
> pgfdw_get_result()
> so that we can specify the timeout time as an argument to WaitLatchOrSocket().
> Then change the function called by do_sql_command_end () to the new one,
> and change the callback function to skip if the result has not yet arrived
>
> How is it? Is it an unnecessary assumption that COMMIT takes time? Or is this 
> the next step?
> I will put a PoC if needed.

Hmm, I'm not sure the cost-effectiveness of this optimization is
really high, because if the timeout expired, it would mean that
something unusual had happened and that the COMMIT command would take
a long time to complete (or abort, at worst).  So even if we processed
the rest of the entries while waiting for that command's result, we
could not reduce the total time very much.  Maybe I'm missing
something, though.

Best regards,
Etsuro Fujita




Re: Unnecessary global variable declared in xlog.c

2021-11-16 Thread Amul Sul
On Wed, Nov 17, 2021 at 7:36 AM Michael Paquier  wrote:
>
> On Tue, Nov 16, 2021 at 02:08:54AM -0500, Tom Lane wrote:
> > I think LastRec was originally referenced by multiple functions
> > in xlog.c.  But it does look like it could be a local now.
>
> Thanks for double-checking, applied this one as of f975fc3.

Thank you, Michael.

Regards,
Amul




RE: Skipping logical replication transactions on subscriber side

2021-11-16 Thread houzj.f...@fujitsu.com
On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada  wrote:
> Right. I've fixed this issue and attached an updated patch.

Hi,

Thanks for updating the patch.
Here are a few comments.

1)

+pg_stat_reset_subscription_worker ( 
subid oid,  
relid oid  )

It seems we should put '' before the comma(',').


2)
+ 
+  
+   subrelid oid
+  
+  
+   OID of the relation that the worker is synchronizing; null for the
+   main apply worker
+  
+ 

Is 'subrelid' only used for distinguishing the worker type? If so, would it
be clearer to have a string value here? I recall the previous version of the
patch had a failure_source column, but it was removed. Maybe I missed something.


3)
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool 
allstats);

I didn't find the definition of this function; maybe we can remove this declaration?

Best regards,
Hou zj


Re: Deficient error handling in pg_dump and pg_basebackup

2021-11-16 Thread Tom Lane
Michael Paquier  writes:
> Taking the issues with fsync() aside, which could be improved as a
> separate patch, is there anything I can do for this thread?  The error
> handling problems in walmethods.c introduced by the integration of LZ4
> are on me, so I'd like to fix it.  0002 depends on some parts of 0001,
> as well for walmethods.c and the new error code paths.  And I have
> been through this code quite a lot recently.

I feel like doing an immediate exit() for these errors and not other
ones is a pretty terrible idea, mainly because it doesn't account for
the question of what to do with a failure that prevents us from getting
to the fsync() call in the first place.  So I'd like to see a better
design rather than more quick hacking.  I confess I don't have a clear
idea of what "a better design" would look like.

However, that's largely orthogonal to any of the things my proposed
patches are trying to fix.  If you want to review the patches without
considering the fsync-error-handling problem, that'd be great.

regards, tom lane




Re: Failed transaction statistics to measure the logical replication progress

2021-11-16 Thread Masahiko Sawada
On Tue, Nov 16, 2021 at 9:34 PM osumi.takami...@fujitsu.com
 wrote:
>
> On Monday, November 15, 2021 9:14 PM I wrote:
> > I've conducted some update for this.
> > (The rebased part is only C code and checked by pgindent)
> I'll update my patches since a new skip xid patch
> has been shared in [1].
>
> This version includes some minor renames of functions
> that are related to transaction sizes.

I've looked at v12-0001 patch. Here are some comments:

-   TupleDescInitEntry(tupdesc, (AttrNumber) 3, "relid",
+   TupleDescInitEntry(tupdesc, (AttrNumber) 3, "last_error_relid",
   OIDOID, -1, 0);
-   TupleDescInitEntry(tupdesc, (AttrNumber) 4, "command",
+   TupleDescInitEntry(tupdesc, (AttrNumber) 4, "last_error_command",
   TEXTOID, -1, 0);
-   TupleDescInitEntry(tupdesc, (AttrNumber) 5, "xid",
+   TupleDescInitEntry(tupdesc, (AttrNumber) 5, "last_error_xid",
   XIDOID, -1, 0);
-   TupleDescInitEntry(tupdesc, (AttrNumber) 6, "error_count",
+   TupleDescInitEntry(tupdesc, (AttrNumber) 6, "last_error_count",
   INT8OID, -1, 0);
-   TupleDescInitEntry(tupdesc, (AttrNumber) 7, "error_message",
+   TupleDescInitEntry(tupdesc, (AttrNumber) 7, "last_error_message",

If renaming column names clarifies those meanings, the above changes
should be included into my patch that introduces
pg_stat_subscription_workers view?

---
I think that exporting PartitionTupleRouting should not be done in the
one patch together with renaming the view columns. There is not
relevance between them at all. If it's used by v12-0002 patch, I think
it should be included in that patch or in another separate patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




RE: pg_get_publication_tables() output duplicate relid

2021-11-16 Thread houzj.f...@fujitsu.com
On Wed, Nov 17, 2021 10:47 AM Amit Kapila  wrote:
> On Tue, Nov 16, 2021 at 7:21 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Mon, Nov 15, 2021 9:42 PM Alvaro Herrera 
> wrote:
> > > > On Mon, Nov 15, 2021 at 1:48 PM houzj.f...@fujitsu.com
> > > >  wrote:
> > >
> > > > > create table tbl1 (a int) partition by range (a); create table
> > > > > tbl1_part1 partition of tbl1 for values from (1) to (10); create
> > > > > table tbl1_part2 partition of tbl1 for values from (10) to (20);
> > > > > create publication pub for table tbl1, tbl1_part1 with
> > > > > (publish_via_partition_root=false);
> > >
> > > In the name of consistency, I think this situation should be an
> > > error -- I mean, if we detect that the user is trying to add both
> > > the partitioned table *and* its partition, then all manner of things
> > > are possibly going to go wrong in some way, so my inclination is to avoid 
> > > it
> > > altogether.
> > >
> > > Is there any reason to allow that?
> > >
> > > If we do that, then we have to be careful with later ALTER
> > > PUBLICATION
> > > too: adding a partition is not allowed if its partitioned table is
> > > there, and adding a partitioned table should behave in some sensible
> > > way if one of its partitions is there (either removing the partition
> > > at the same time, or outright rejecting the
> > > operation.)
> >
> > Thanks for the response.
> >
> > If we decide to disallow this case, we seem need to handle some other
> > cases as well, for example: We might also need additional check when
> > ATTACH a partition, because the partition's parent table could already
> > be published in the same publication as the partition.
> >
> 
> What kind of additional checks you are envisioning and why?

For example:

create table tbl1 (a int) partition by range (a);
create table tbl1_part1 (like tbl1);
create table tbl1_part2 partition of tbl1 for values from (10) to (20);
create publication pub for table
tbl1, tbl1_part1 with (publish_via_partition_root=false);

--- We might need addition check here
Alter table tbl1 ATTACH partition tbl1_part1 for values from (1) to (10);

In the above case, 'tbl1_part1' is not a partition of 'tbl1' when the
publication is created. After the ATTACH, 'tbl1_part1' becomes a partition of
'tbl1', which seems to be the case we want to disallow (both parent and child
table in the publication). So, when attaching, I thought we might need to
check all the parent tables to see if any of them is in the same publication
that the table being attached belongs to. Does it make sense?

Best regards,
Hou zj


Re: pg_get_publication_tables() output duplicate relid

2021-11-16 Thread Amit Kapila
On Tue, Nov 16, 2021 at 7:21 AM houzj.f...@fujitsu.com
 wrote:
>
> On Mon, Nov 15, 2021 9:42 PM Alvaro Herrera  wrote:
> > > On Mon, Nov 15, 2021 at 1:48 PM houzj.f...@fujitsu.com
> > >  wrote:
> >
> > > > create table tbl1 (a int) partition by range (a); create table
> > > > tbl1_part1 partition of tbl1 for values from (1) to (10); create
> > > > table tbl1_part2 partition of tbl1 for values from (10) to (20);
> > > > create publication pub for table tbl1, tbl1_part1 with
> > > > (publish_via_partition_root=false);
> >
> > In the name of consistency, I think this situation should be an error -- I 
> > mean, if
> > we detect that the user is trying to add both the partitioned table *and* 
> > its
> > partition, then all manner of things are possibly going to go wrong in some 
> > way,
> > so my inclination is to avoid it altogether.
> >
> > Is there any reason to allow that?
> >
> > If we do that, then we have to be careful with later ALTER PUBLICATION
> > too: adding a partition is not allowed if its partitioned table is there, 
> > and adding
> > a partitioned table should behave in some sensible way if one of its 
> > partitions is
> > there (either removing the partition at the same time, or outright 
> > rejecting the
> > operation.)
>
> Thanks for the response.
>
> If we decide to disallow this case, we seem need to handle some other cases as
> well, for example: We might also need additional check when ATTACH a 
> partition,
> because the partition's parent table could already be published in the same
> publication as the partition.
>

What kind of additional checks you are envisioning and why?

-- 
With Regards,
Amit Kapila.




Re: Deficient error handling in pg_dump and pg_basebackup

2021-11-16 Thread Michael Paquier
On Wed, Nov 10, 2021 at 02:03:11PM +0900, Michael Paquier wrote:
>> but this code seems about as fishy and ill-thought-out as anything
>> else I've touched here.  Why is this different from the half-dozen
>> other fsync-error cases in the same file?  Why, if fsync failure
>> here is so catastrophic, is it okay to just return a normal failure
>> code when tar_close doesn't even get to this point because of some
>> earlier error?
> 
> Hmm, I don't think that's fine.  There is an extra one in tar_finish()
> that would report a failure, but not exit() immediately.  All the
> callers of the sync callback in receivelog.c exit() on sight, but I am
> wondering if it would not be saner to just exit() from walmethods.c
> each time we see a failure with a fsync().

Taking the issues with fsync() aside, which could be improved as a
separate patch, is there anything I can do for this thread?  The error
handling problems in walmethods.c introduced by the integration of LZ4
are on me, so I'd like to fix it.  0002 depends on some parts of 0001,
as well for walmethods.c and the new error code paths.  And I have
been through this code quite a lot recently.
--
Michael


signature.asc
Description: PGP signature


Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display

2021-11-16 Thread Michael Paquier
On Tue, Nov 16, 2021 at 01:26:49PM -0300, Alvaro Herrera wrote:
> My opinion is that adding these things willy-nilly is not a solution to
> any actual problem.  Adding a few additional log lines that are
> low-volume at DEBUG1 might be useful, but below that (DEBUG2 etc) it's
> not good for anything other than specific development, IMO.  At least
> this particular one for streaming replication I think we should not
> include.

Looking at v2, I think that this leaves the additions of the DEBUG1
entries in SendBaseBackup() and WalRcvWaitForStartPosition(), then.
The one in pgarch.c does not provide any additional information as the
segment to-be-archived should be part of the command.
--
Michael


signature.asc
Description: PGP signature


Re: pg_waldump stucks with options --follow or -f and --stats or -z

2021-11-16 Thread Michael Paquier
On Tue, Nov 16, 2021 at 09:34:25AM +0530, Bharath Rupireddy wrote:
> It looks okay to be more responsive with the -f option so that the
> above commands keep emitting stats for every 1 second(the --follow
> option waits for 1 second). Should we really be emitting stats for
> every 1 second? Isn't there going to be too much information on the
> stdout? Or should we be emitting the stats for every 5 or 10 seconds?

I don't have a perfect answer to this question, but dumping the stats
with a fixed frequency is not going to be useful if we don't have in
those logs a current timestamp and/or a current LSN.  This is
basically about how much we want to call XLogDumpDisplayStats() and
how useful it is, but note that my argument is a bit different than
what you are describing here: we could try to make
XLogDumpDisplayStats() responsive on SIGINT or SIGTERM to show
statistics reports.  This way, a --follow without an --end LSN
specified could still provide some information rather than nothing.
That could also be useful if defining an --end but interrupting the
call.

At the same time, we could also just let things as they are.  --follow
and --stats being specified together is what the user looked for, so
they get what they wanted.

> In summary, we have the following options:
> 1) block  the combinations  "-s/-f/-z", "-s/-e/-f/-z"
> 2) be more responsive and keep emitting the stats per 1 sec default
> with -f option
> 3)  be more responsive and keep emitting the stats per user's choice
> of seconds (a new option that can be used with the -s/-e/-f/-z).

A frequency cannot be measured only in time here; it could also be
measured in bytes, as a minimum amount of WAL replayed between two
reports.  I don't much like the idea of multiple stats reports emitted
in a single pg_waldump call, TBH.  This makes things more complicated
than they should be, and the gain is not obvious, at least to me.
--
Michael


signature.asc
Description: PGP signature


Re: Unnecessary global variable declared in xlog.c

2021-11-16 Thread Michael Paquier
On Tue, Nov 16, 2021 at 02:08:54AM -0500, Tom Lane wrote:
> I think LastRec was originally referenced by multiple functions
> in xlog.c.  But it does look like it could be a local now.

Thanks for double-checking, applied this one as of f975fc3.
--
Michael


signature.asc
Description: PGP signature


Re: isolation test output format

2021-11-16 Thread Tom Lane
Andy Fan  writes:
> I'm using the isolation test to write some tests, and then I found the
> output is unclear some time.
> for example:

> a  b  c
> 1988-12-14 01:03:031988-12-13 22:03:03-011988-12-14

Are you using recent sources?  The isolationtester's output hasn't
looked like that since June (4a054069a et al).

regards, tom lane




CREATE tab completion

2021-11-16 Thread Ken Kato

Hi hackers,

I noticed that there are some tab completions missing for the following 
commands:

-CREATE CONVERSION : missing FOR, TO, FROM
-CREATE DOMAIN : missing after AS
-CREATE LANGUAGE : missing after HANDLER
-CREATE SCHEMA : missing AUTHORIZATION, IF NOT EXISTS
-CREATE SEQUENCE : missing AS
-CREATE TRANSFORM : missing after FOR

I made a patch for this, so please have a look.

Best wishes,

--
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f724e4428..0216b50946 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2572,6 +2572,20 @@ psql_completion(const char *text, int start, int end)
 			COMPLETE_WITH("true", "false");
 	}
 
+	/* CREATE CONVERSION */
+	else if (Matches("CREATE", "CONVERSION", MatchAny))
+		COMPLETE_WITH("FOR");
+	else if (Matches("CREATE", "CONVERSION", MatchAny, "FOR"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_encodings);
+	else if (Matches("CREATE", "CONVERSION", MatchAny, "FOR", MatchAny))
+		COMPLETE_WITH("TO");
+	else if (Matches("CREATE", "CONVERSION", MatchAny, "FOR", MatchAny, "TO"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_encodings);
+	else if (Matches("CREATE", "CONVERSION", MatchAny, "FOR", MatchAny, "TO", MatchAny))
+		COMPLETE_WITH("FROM");
+	else if (Matches("CREATE", "CONVERSION", MatchAny, "FOR", MatchAny, "TO", MatchAny, "FROM"))
+		COMPLETE_WITH_VERSIONED_SCHEMA_QUERY(Query_for_list_of_functions, NULL);
+
 	/* CREATE DATABASE */
 	else if (Matches("CREATE", "DATABASE", MatchAny))
 		COMPLETE_WITH("OWNER", "TEMPLATE", "ENCODING", "TABLESPACE",
@@ -2582,6 +2596,17 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("CREATE", "DATABASE", MatchAny, "TEMPLATE"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_template_databases);
 
+	/* CREATE DOMAIN */
+	else if (Matches("CREATE", "DOMAIN", MatchAny))
+		COMPLETE_WITH("AS");
+	else if (Matches("CREATE", "DOMAIN", MatchAny, "AS"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_datatypes, NULL);
+	else if (Matches("CREATE", "DOMAIN", MatchAny, "AS", MatchAny))
+		COMPLETE_WITH("COLLATE", "DEFAULT", "CONSTRAINT",
+	  "NOT NULL", "NULL", "CHECK (");
+	else if (Matches("CREATE", "DOMAIN", MatchAny, "COLLATE"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_collations, NULL);
+
 	/* CREATE EXTENSION */
 	/* Complete with available extensions rather than installed ones. */
 	else if (Matches("CREATE", "EXTENSION"))
@@ -2661,6 +2686,12 @@ psql_completion(const char *text, int start, int end)
 			 !TailMatches("FOR", MatchAny, MatchAny, MatchAny))
 		COMPLETE_WITH("(");
 
+	/* CREATE LANGUAGE */
+	else if (Matches("CREATE", "LANGUAGE", MatchAny))
+		COMPLETE_WITH("HANDLER");
+	else if (Matches("CREATE", "LANGUAGE", MatchAny, "HANDLER", MatchAny))
+		COMPLETE_WITH("INLINE", "VALIDATOR");
+
 	/* CREATE OR REPLACE */
 	else if (Matches("CREATE", "OR"))
 		COMPLETE_WITH("REPLACE");
@@ -2802,11 +2833,23 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches("AS", "ON", "SELECT|UPDATE|INSERT|DELETE", "TO"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
 
+/* CREATE SCHEMA */
+	else if (Matches("CREATE", "SCHEMA"))
+		COMPLETE_WITH("AUTHORIZATION", "IF NOT EXISTS");
+	else if (Matches("CREATE", "SCHEMA", "IF", "NOT", "EXISTS"))
+	{
+		completion_info_charp = prev2_wd;
+		COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+			" UNION SELECT 'AUTHORIZATION'");
+	}
+	else if (Matches("CREATE", "SCHEMA") && TailMatches("AUTHORIZATION"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_roles);
+
 /* CREATE SEQUENCE --- is allowed inside CREATE SCHEMA, so use TailMatches */
 	else if (TailMatches("CREATE", "SEQUENCE", MatchAny) ||
 			 TailMatches("CREATE", "TEMP|TEMPORARY", "SEQUENCE", MatchAny))
-		COMPLETE_WITH("INCREMENT BY", "MINVALUE", "MAXVALUE", "NO", "CACHE",
-	  "CYCLE", "OWNED BY", "START WITH");
+		COMPLETE_WITH("AS", "INCREMENT BY", "MINVALUE", "MAXVALUE", "NO",
+	  "CACHE", "CYCLE", "OWNED BY", "START WITH");
 	else if (TailMatches("CREATE", "SEQUENCE", MatchAny, "NO") ||
 			 TailMatches("CREATE", "TEMP|TEMPORARY", "SEQUENCE", MatchAny, "NO"))
 		COMPLETE_WITH("MINVALUE", "MAXVALUE", "CYCLE");
@@ -2882,6 +2925,19 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("CREATE", "TEXT", "SEARCH", "CONFIGURATION|DICTIONARY|PARSER|TEMPLATE", MatchAny))
 		COMPLETE_WITH("(");
 
+/* CREATE TRANSFORM */
+	else if (Matches("CREATE", "TRANSFORM"))
+		COMPLETE_WITH("FOR");
+	else if (Matches("CREATE", "TRANSFORM", "FOR"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_datatypes, NULL);
+	else if (Matches("CREATE", "TRANSFORM", "FOR", MatchAny))
+		COMPLETE_WITH("LANGUAGE");
+	else if (Matches("CREATE", "TRANSFORM", "FOR", MatchAny, "LANGUAGE"))
+	{
+		completion_info_charp = prev2_wd;
+		COMPLETE_WITH_QUERY(Query_for_list_of_languages);
+	}
+
 /* CREATE SUBSCRIPTION */
 	else if (Matches("CREATE

isolation test output format

2021-11-16 Thread Andy Fan
Hi:

I'm using the isolation tester to write some tests, and I found the
output is sometimes unclear.
For example:

a  b  c
1988-12-14 01:03:031988-12-13 22:03:03-011988-12-14

Actually what I want is there are some spaces among the fields. like:

a  b  c
1988-12-14 01:03:03  1988-12-13 22:03:03-01  1988-12-14

The simplest reproduce case is:

setup
{
CREATE TABLE c_tm(a timestamp, b timestamptz, c date);
}

session "s1"
step "s1inserttm"
{
INSERT INTO c_tm VALUES('1988-12-14 01:03:03 +2', '1988-12-14
01:03:03 +2', '1988-12-14 01:03:03 +2');
}

step "s1read"
{
SELECT * FROM c_tm;
}

Is there any simple way to improve this?

-- 
Best Regards
Andy Fan




Re: row filtering for logical replication

2021-11-16 Thread Greg Nancarrow
On Tue, Nov 16, 2021 at 7:33 PM tanghy.f...@fujitsu.com
 wrote:
>
> The second record can’t be replicated.
>
> By the way, when only applied 0001 patch, I couldn't reproduce this bug.
> So, I think it was related to the later patches.
>

The problem seems to be caused by the 0006 patch (when I remove that
patch, the problem doesn't occur).
Still needs investigation.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Granting SET and ALTER SYSTEM privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 2:12 PM, Tom Lane  wrote:
> 
> BTW, another objection to pg_config_param as designed here is that
> a "name" is not an appropriate way to store possibly-qualified
> custom GUC names.  It's not long enough (cf. valid_custom_variable_name).

I was aware of that, but figured not all GUCs have to be grantable.  If it 
doesn't fit in a NameData, you can't grant on it.

If we want to be more accommodating than that, we can store it as text, just 
like pg_db_role_setting does, but then we need more code complexity to look it up 
and to verify that it is unique.  (We wouldn't want multiple records for the 
same  pair.) 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Granting SET and ALTER SYSTEM privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 2:12 PM, Tom Lane  wrote:
> 
> The question is why you need pg_config_param at all, then.
> AFAICS it just adds maintenance complexity we could do without.
> I think we'd be better off with a catalog modeled on the design of
> pg_db_role_setting, which would have entries for roles and lists
> of GUC names that those roles could set.

Originally, I was trying to have dependency linkage between two proper types of 
objects, so that DROP ROLE and DROP CONFIGURATION PARAMETER would behave as 
expected, complaining about privileges remaining rather than dropping an object 
and leaving a dangling reference.

I've deleted the whole CREATE CONFIGURATION PARAMETER and DROP CONFIGURATION 
PARAMETER syntax and implementation, but it still seems odd to me that:

  CREATE ROLE somebody;
  GRANT SELECT ON TABLE sometable TO ROLE somebody;
  GRANT ALTER SYSTEM ON someguc TO ROLE somebody;
  DROP ROLE somebody;
  ERROR:  role "somebody" cannot be dropped because some objects depend on it
  DETAIL:  privileges for table sometable

would not mention privileges for "someguc" as well.  That's why I want 
configuration parameters to be proper objects with OIDs and with entries in 
pg_depend and/or pg_shdepend.  Maybe there is some better way to do it, but 
that's why I've been doing this.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Granting SET and ALTER SYSTEM privileges for GUCs

2021-11-16 Thread Tom Lane
Mark Dilger  writes:
> The patch already posted on this thread already works that way.  Robert and 
> Tom seemed to infer that all gucs need to be present in the catalog, but in 
> fact, not entering them in the catalog simply means that they aren't 
> grantable.  I think the confusion arose from the fact that I created a 
> mechanism for extension authors to enter additional gucs into the catalog, 
> which gave the false impression that such entries were required.  They are 
> not.  I didn't bother explaining that before, because Robert and Tom both 
> seem to expect all gucs to be grantable, an expectation you do not appear to 
> share.

The question is why you need pg_config_param at all, then.
AFAICS it just adds maintenance complexity we could do without.
I think we'd be better off with a catalog modeled on the design of
pg_db_role_setting, which would have entries for roles and lists
of GUC names that those roles could set.
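For illustration only, such a catalog might look roughly like this (the table
and column names here are hypothetical sketches, not a concrete proposal):

```sql
-- Modeled on pg_db_role_setting: one row per role, listing the GUC names
-- (stored as text, so long qualified custom names fit) that role may set.
CREATE TABLE pg_role_config_setting (
    setrole oid NOT NULL,       -- OID of the role being granted
    setgucs text[] NOT NULL,    -- GUC names this role may set
    UNIQUE (setrole)
);
```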

BTW, another objection to pg_config_param as designed here is that
a "name" is not an appropriate way to store possibly-qualified
custom GUC names.  It's not long enough (cf. valid_custom_variable_name).

>> To support pg_dump and pg_upgrade, it might be better to have an
>> enabled/disabled flag rather than to delete rows.

> I'm not really sure what this means.

I didn't see the point of this either.  We really need to KISS here.
Every bit of added complexity in the catalog representation is another
opportunity for bugs-of-omission, not to mention a detail that you
have to provide mechanisms to dump and restore.

regards, tom lane




Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Andres Freund
On 2021-11-16 16:30:27 -0500, Robert Haas wrote:
> I'm still not entirely clear on whether you prefer v1-0002, v2-0002,
> or something else.

I think it basically doesn't matter much. It's such a small piece of the cost
compared to either the cost of a single insert or the ratio between
RedoRecPtr/FPW changes and the number of inserted records.

I guess v2-0002 is mildly simpler, so I very weakly lean towards that.




Re: refactoring basebackup.c

2021-11-16 Thread Robert Haas
On Mon, Nov 15, 2021 at 2:23 PM Robert Haas  wrote:
> Yeah, that's what it should be doing. I'll commit a fix, thanks for
> the report and diagnosis.

Here's a new patch set.

0001 - When I committed the patch to add the missing 2 blocks of zero
bytes to the tar archives generated by the server, I failed to adjust
the documentation. So 0001 does that. This is the only new patch in
the series. I was not sure whether to just remove the statement from
the documentation saying that those blocks aren't included, or whether
to mention that we used to include them and no longer do. I went for
the latter; opinions welcome.

0002 - This adds a new COPY subprotocol for taking base backups. I've
improved it over the previous version by adding documentation. I'm
still seeking comments on the points I raised in
http://postgr.es/m/CA+TgmobrOXbDh+hCzzVkD3weV3R-QRy3SPa=frb_rv9wf5i...@mail.gmail.com
but what I'm leaning toward doing is committing the patch as is and
then submitting a patch - or maybe several patches - later to rip this
and a few other old things out. That way the debate - or lack thereof
- about what to do here doesn't have to block the main patch set, and
also, it feels safer to make removing the existing stuff a separate
effort rather than doing it now.

0003 - This adds "server" and "blackhole" as backup targets. In this
version, I've improved the documentation. Also, the previous version
only let you use a backup target with -Xnone, and I realized that was
stupid. -Xfetch is OK too. -Xstream still doesn't work, since that's
implemented via client-side logic. I think this still needs some work
to be committable, like adding tests, but I don't expect to make any
major changes.

0004 - Server-side gzip compression. Similar level of maturity to 0003.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v10-0001-Document-that-tar-archives-are-now-properly-term.patch
Description: Binary data


v10-0002-Modify-pg_basebackup-to-use-a-new-COPY-subprotoc.patch
Description: Binary data


v10-0003-Support-base-backup-targets.patch
Description: Binary data


v10-0004-Server-side-gzip-compression.patch
Description: Binary data


Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Robert Haas
On Tue, Nov 16, 2021 at 3:42 PM Andres Freund  wrote:
> On 2021-11-16 15:19:19 -0500, Robert Haas wrote:
> > > Hm. I think this might include a bunch of confounding factors that make it
> > > hard to judge the actual size of the performance difference.
> >
> > Yes, I think so, too.
>
FWIW I ran that pgbench thing I presented upthread, and I didn't see any
meaningful and repeatable performance difference between 354a1f8d220, ad26ee28250 and
0002 applied on top of ad26ee28250. The run-to-run variance is way higher than
the difference between the changes.

Thanks. I suspected that the results I was seeing were not meaningful,
but it's hard to be sure when the results seem to be repeatable
locally.

I'm still not entirely clear on whether you prefer v1-0002, v2-0002,
or something else.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Granting SET and ALTER SYSTEM privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 12:38 PM, Andrew Dunstan  wrote:
> 
> Your original and fairly simple set of patches used hardcoded role names
> and sets of GUCs they could update via ALTER SYSTEM. I suggested to you
> privately that a more flexible approach would be to drive this from a
> catalog table. I had in mind a table of more or less .

Right.  I prefer a table of  matching the structure of most of 
the rest of the grantable objects, so that entries from pg_depend or 
pg_shdepend can track the dependencies in the usual way and prevent dangling 
references when DROP ROLE is executed.  Is there a reason to prefer an ad hoc 
table structure?

> You could prepopulate it with the roles / GUCs from your original patch
> set. I don't think it needs to be initially empty. But DBAs would be
> able to modify and extend the settings. I agree with Tom that we
> shouldn't try to cover all GUCs in the table - any GUC without an entry
> can only be updated by a superuser.

The patch already posted on this thread already works that way.  Robert and Tom 
seemed to infer that all gucs need to be present in the catalog, but in fact, 
not entering them in the catalog simply means that they aren't grantable.  I 
think the confusion arose from the fact that I created a mechanism for 
extension authors to enter additional gucs into the catalog, which gave the 
false impression that such entries were required.  They are not.  I didn't 
bother explaining that before, because Robert and Tom both seem to expect all 
gucs to be grantable, an expectation you do not appear to share.

> To support pg_dump and pg_upgrade, it might be better to have an
> enabled/disabled flag rather than to delete rows.

I'm not really sure what this means.  Are you suggesting that the  (or in your formulation ) should have a "bool enabled" 
field, and when the guc gets dropped, the "enabled" field gets set to false?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Andres Freund
Hi,

On 2021-11-16 15:19:19 -0500, Robert Haas wrote:
> > Hm. I think this might include a bunch of confounding factors that make it
> > hard to judge the actual size of the performance difference.
> 
> Yes, I think so, too.

FWIW I ran that pgbench thing I presented upthread, and I didn't see any
meaningful and repeatable performance difference between 354a1f8d220, ad26ee28250 and
0002 applied on top of ad26ee28250. The run-to-run variance is way higher than
the difference between the changes.

Greetings,

Andres Freund




Re: Non-superuser subscription owners

2021-11-16 Thread Andrew Dunstan


On 11/16/21 15:08, Mark Dilger wrote:
>
>> On Nov 16, 2021, at 12:06 PM, Andrew Dunstan  wrote:
>>
>> There doesn't seem to be a CF item for it but I'm
>> inclined to commit it in a couple of days time.
> https://commitfest.postgresql.org/36/3414/
>

OK, got it, thanks.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Granting SET and ALTER SYSTEM privileges for GUCs

2021-11-16 Thread Andrew Dunstan


On 11/16/21 14:48, Mark Dilger wrote:
>
>> On Nov 16, 2021, at 8:44 AM, Tom Lane  wrote:
>>
>> My concern is not about performance, it's about the difficulty of
>> maintaining a catalog that expects to be a more-or-less exhaustive
>> list of GUCs.  I think you need to get rid of that expectation.
> I'm preparing a new version of the patch that has the catalog empty to begin 
> with, and only adds values in response to GRANT commands.  That also handles 
> the issues of extension upgrades, which I think the old patch handled better 
> than the discussion on this thread suggested, but no matter.  The new 
> behavior will allow granting privileges on non-existent gucs, just as ALTER 
> ROLE..SET allows registering settings on non-existent gucs.



Your original and fairly simple set of patches used hardcoded role names
and sets of GUCs they could update via ALTER SYSTEM. I suggested to you
privately that a more flexible approach would be to drive this from a
catalog table. I had in mind a table of more or less .
You could prepopulate it with the roles / GUCs from your original patch
set. I don't think it needs to be initially empty. But DBAs would be
able to modify and extend the settings. I agree with Tom that we
shouldn't try to cover all GUCs in the table - any GUC without an entry
can only be updated by a superuser.


To support pg_dump and pg_upgrade, it might be better to have an
enabled/disabled flag rather than to delete rows.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Robert Haas
On Tue, Nov 16, 2021 at 2:27 PM Andres Freund  wrote:
> Is this with assertions enabled? Because on my workstation (which likely is
> slower on a single-core basis than your laptop) I get around half of this with
> an optimized build (and a non-optimized build rarely is worth using as a
> measuring stick). It could also be that your psqlrc does something
> heavyweight?

Assertions disabled. ~/.psqlrc does not exist.

> Hm. I think this might include a bunch of confounding factors that make it
> hard to judge the actual size of the performance difference.

Yes, I think so, too.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Stephen Frost
Greetings,

On Tue, Nov 16, 2021 at 15:12 Tom Lane  wrote:

> Alvaro Herrera  writes:
> > On 2021-Nov-16, Stephen Frost wrote:
> >> Not against possibly changing that but I don’t get the point of
> including
> >> be-gssapi-common.h if it’s not enabled in the build and typically if
> GSSAPI
> >> is possible and the reason for including be-gssapi-common.h then there’s
> >> other things that need to be under an ifdef, again, as in auth.c
>
> > BTW, this is exactly why my first suggestion was to add an exclusion
> > rule to headerscheck so that be-gssapi-common.h is not verified by that
> > script.  After re-reading your response, that looks like a reasonable
> > answer too.
>
> I think adding #ifdef ENABLE_GSS as per your prior message is better.
> Headers have little business making assumptions about the context in
> which they're included --- which is exactly why headerscheck exists ---
> so I disagree with Stephen's argument.  In any case I am not in favor of
> making random exclusions from that script's testing.


I don’t feel all that strongly either way, so if you’d rather have it that
way then that’s fine.  Will still need the other ifdefs too anyway though,
but I guess it isn’t that big of a deal.

Thanks,

Stephen


Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Nov-16, Stephen Frost wrote:
>> Not against possibly changing that but I don’t get the point of including
>> be-gssapi-common.h if it’s not enabled in the build and typically if GSSAPI
>> is possible and the reason for including be-gssapi-common.h then there’s
>> other things that need to be under an ifdef, again, as in auth.c

> BTW, this is exactly why my first suggestion was to add an exclusion
> rule to headerscheck so that be-gssapi-common.h is not verified by that
> script.  After re-reading your response, that looks like a reasonable
> answer too.

I think adding #ifdef ENABLE_GSS as per your prior message is better.
Headers have little business making assumptions about the context in
which they're included --- which is exactly why headerscheck exists ---
so I disagree with Stephen's argument.  In any case I am not in favor of
making random exclusions from that script's testing.

regards, tom lane




Re: Non-superuser subscription owners

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 12:06 PM, Andrew Dunstan  wrote:
> 
> There doesn't seem to be a CF item for it but I'm
> inclined to commit it in a couple of days time.

https://commitfest.postgresql.org/36/3414/

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Non-superuser subscription owners

2021-11-16 Thread Andrew Dunstan


On 11/3/21 15:50, Mark Dilger wrote:
>> On Nov 1, 2021, at 10:58 AM, Mark Dilger  
>> wrote:
>>
>> ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop 
>> subscription workers.  The ALTER command updates the catalog's subenabled 
>> field, but workers only lazily respond to that.  Disabling and enabling the 
>> subscription as part of the OWNER TO would not reliably accomplish anything.
> I have rethought my prior analysis.  The problem in the previous patch was 
> that the subscription apply workers did not check for a change in ownership 
> the way they checked for other changes, instead only picking up the new 
> ownership information when the worker restarted for some other reason.  This 
> next patch set fixes that.  The application of a change record may continue 
> under the old ownership permissions when a concurrent command changes the 
> ownership of the subscription, but the worker will pick up the new 
> permissions before applying the next record.  I think that is consistent 
> enough with reasonable expectations.
>
> The first two patches are virtually unchanged.  The third updates the 
> behavior of the apply workers, and updates the documentation to match.


I'm generally happier about this than the previous patch set. With the
exception of some slight documentation modifications I think it's
basically committable. There doesn't seem to be a CF item for it but I'm
inclined to commit it in a couple of days time.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Granting SET and ALTER SYSTEM privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 8:44 AM, Tom Lane  wrote:
> 
> My concern is not about performance, it's about the difficulty of
> maintaining a catalog that expects to be a more-or-less exhaustive
> list of GUCs.  I think you need to get rid of that expectation.

I'm preparing a new version of the patch that has the catalog empty to begin 
with, and only adds values in response to GRANT commands.  That also handles 
the issues of extension upgrades, which I think the old patch handled better 
than the discussion on this thread suggested, but no matter.  The new behavior 
will allow granting privileges on non-existent gucs, just as ALTER ROLE..SET 
allows registering settings on non-existent gucs.
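For comparison, a sketch of the existing ALTER ROLE behavior being mirrored,
next to the analogous grant (the parameter name is made up, and the GRANT
syntax is the proposed one from this patch set, not anything released):

```sql
-- ALTER ROLE ... SET already accepts a custom parameter that no loaded
-- extension defines; the value is simply stored in pg_db_role_setting:
ALTER ROLE somebody SET myext.some_param = 'on';

-- Under the proposed patch, granting on a non-existent GUC would be
-- accepted in the same spirit (hypothetical parameter name):
GRANT ALTER SYSTEM ON myext.some_param TO ROLE somebody;
```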

The reason I had resisted allowing grants of privileges on non-existent gucs is 
that you can get the following sort of behavior (note the last two lines):

  DROP USER regress_priv_user7;
  ERROR:  role "regress_priv_user7" cannot be dropped because some objects 
depend on it
  DETAIL:  privileges for table persons2
  privileges for configuration parameter sort_mem
  privileges for configuration parameter no_such_param

Rejecting "no_such_param" in the original GRANT statement seemed cleaner to me, 
but this discussion suggests pretty strongly that I can't do it that way.  
Changing the historical "sort_mem" to the canonical "work_mem" name also seems 
better to me, as otherwise you could have different grants on the same GUC 
under different names.  I'm inclined to keep the canonicalization of names 
where known, but maybe that runs afoul of the rule that these grants should not be 
tied very hard to the GUC?
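To make the aliasing concern concrete (using the grant syntax from earlier in
this thread; the role name is illustrative):

```sql
-- Without canonicalization, the legacy alias and the canonical name would
-- carry two independent grants on the same underlying setting:
GRANT ALTER SYSTEM ON sort_mem TO ROLE somebody;
GRANT ALTER SYSTEM ON work_mem TO ROLE somebody;  -- same GUC, separate ACL
```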

> In the analogy to ALTER DATABASE/USER SET, we don't expect that
> pg_db_role_setting catalog entries will exist for all, or even
> very many, GUCs.  Also, the fact that pg_db_role_setting entries
> aren't tied very hard to the actual existence of a GUC is a good
> thing from the maintenance and upgrade standpoint.

Doing it that way

> BTW, if we did create such a catalog, there would need to be
> pg_dump and pg_upgrade support for its entries, and you'd have to
> think about (e.g.) whether pg_upgrade would attempt to preserve
> the same OIDs.  I don't see any indication that the patch has
> addressed that infrastructure ... which is probably just as well,
> since it's work that I'd be wanting to reject.

Yeah, that's why I didn't write it.  I wanted feedback on the basic 
implementation before doing that work.

>  (Hm, but actually,
> doesn't pg_dump need work anyway to dump this new type of GRANT?)

Yes, if the idea of this kind of grant isn't being outright rejected, then I'll 
need to write that.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Andres Freund
Hi,

On 2021-11-15 16:29:28 -0500, Robert Haas wrote:
> v1: 0.378 ms
> v2: 0.391 ms
> common base commit (10eae82b2): 0.376 ms

Is this with assertions enabled? Because on my workstation (which likely is
slower on a single-core basis than your laptop) I get around half of this with
an optimized build (and a non-optimized build rarely is worth using as a
measuring stick). It could also be that your psqlrc does something
heavyweight?


> methodology:
> for i in `seq 1 1000`; do psql -c '\timing' -c 'select
> txid_current();'; done | grep '^Time:' | awk '{total+=$2} END {print
> total/NR}'
> run twice, discarding the first result, since the very first
> connection attempt after starting a new server process is notably
> slower

Hm. I think this might include a bunch of confounding factors that make it
hard to judge the actual size of the performance difference. It'll
e.g. include a fair bit of syscache initialization overhead that won't change
between the two approaches. This could be addressed by doing something like -c
"SELECT 'txid_current'::regproc;" first.

Also, the psql invocations themselves use up a fair bit of time, even if you
exclude them from the result with the \timing trick. A pgbench -C doing 1k
SELECT 1;s is about ~1.5s for me, whereas psql is ~5.7s.

Just to size up the order of magnitude of overhead of the connection
establishment and syscache initialization, I ran a pgbench with a script of

SELECT 1;
SELECT 1;
SELECT 'txid_current()'::regprocedure;
SELECT 'txid_current()'::regprocedure;
SELECT txid_current();
SELECT txid_current();

and ran that with pgbench -n -f /tmp/txid.sql -C -t1000 -P1 --report-latencies
and got
statement latencies in milliseconds:
 0.125  SELECT 1;
 0.024  SELECT 1;
 0.095  SELECT 'txid_current()'::regprocedure;
 0.025  SELECT 'txid_current()'::regprocedure;
 0.033  SELECT txid_current();
 0.024  SELECT txid_current();

which nicely shows how much of the overhead is not related to the actual
txid_current() invocation.

Greetings,

Andres Freund




Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Stephen Frost
Greetings,

On Tue, Nov 16, 2021 at 13:16 Alvaro Herrera 
wrote:

> On 2021-Nov-16, Stephen Frost wrote:
>
> > Not against possibly changing that but I don’t get the point of including
> > be-gssapi-common.h if it’s not enabled in the build and typically if
> GSSAPI
> > is possible and the reason for including be-gssapi-common.h then there’s
> > other things that need to be under an ifdef, again, as in auth.c
>
> BTW, this is exactly why my first suggestion was to add an exclusion
> rule to headerscheck so that be-gssapi-common.h is not verified by that
> script.  After re-reading your response, that looks like a reasonable
> answer too.


Yeah, that seems better to me, though I’ve not yet had time to look deeply
into any of this.

Thanks,

Stephen

>


Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Andres Freund
Hi,

On 2021-11-11 12:15:03 -0500, Robert Haas wrote:
> The reason why RecoveryInProgress() has critical side effects is that
> it calls InitXLOGAccess(). It turns out that the only critical thing
> that InitXLOGAccess() does is call InitXLogInsert().[1] If a backend
> doesn't call InitXLogInsert(), max_rdatas won't be initialized, and
> the first attempt to insert WAL will probably fail with something like
> "ERROR: too much WAL data". But there's no real reason why
> InitXLogInsert() needs to happen only after recovery is finished.

I think we should consider allocating that memory in postmaster, on !windows
at least. Or, perhaps even better, to initially use some static variable, and
only use a separate memory context when XLogEnsureRecordSpace().  Reserving
all that memory in every process just needlessly increases our per-connection
memory usage, for no good reason.


> The
> work that it does is just setting up data structures in backend-local
> memory, and not much is lost if they are set up and not used. So, in
> 0001, I propose to move the call to InitXLogInsert() from
> InitXLOGAccess() to BaseInit(). That means every backend will do it
> the first time it starts up. It also means that CreateCheckPoint()
> will no longer need a special case call to InitXLogInsert(), which
> seems like a good thing.

Yes. Seems like making BaseInit() execute at a halfway consistent point is
starting to have some benefits at least ;)


> Nevertheless, what I did in 0002 is remove InitXLOGAccess() but move
> the initialization of RedoRecPtr and doPageWrites into
> GetFullPageWritesInfo(), only in the case where RedoRecPtr hasn't been
> set yet. This has one property that I really really like, which is
> that makes the code more understandable. There is no suggestion that
> the correctness of WAL insertion somehow depends on a
> RecoveryInProgress() call which may or may not have occurred at some
> arbitrarily distant time in the past. Initializing the value at the
> point of first use is just way clearer than initializing it as a
> side-effect of some other function that's not even that tightly
> connected. However, it does have the effect of introducing a branch
> into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough
> not to matter.

Hm. Compared to the other costs of the XLogInsert do/while loop it probably
doesn't matter. One nearly-always-false branch is vastly cheaper than going
through the loop unnecessarily. Sure, the unnecessary iteration saved will
only be the first (for now, it might be worth refreshing the values more
often), but there's enough branches in the body of the loop that I can't
really see it mattering.

Maybe kinda conceivably the cost of that branch could be an argument if
GetFullPageWritesInfo() were inline in XLogInsert(). But it isn't.


> I think the obvious alternative is to just not
> initialize RedoRecPtr and doPageWrites at all and document in the
> comments that the first time each backend does something with them
> it's going to end up retrying once; perhaps that is preferable. Going
> the other way, we could get rid of the global variables altogether and
> have GetFullPageWrites() read the data from shared memory. However,
> unless 8-byte atomicity is guaranteed, that requires a spinlock, so it
> seems likely unappealing.

I think it'd be fine to just make them 8byte atomics, which'd be lock-free on
relevant platforms (at least once the aarch64 thing is fixed, there's a
patch).

XLogCtlInsert already takes care to ensure that RedoRecPtr/fullPageWrites are
on a cacheline that's not constantly modified.


If we really wanted to optimize the no-8-byte-single-copy-atomicity path, we
could introduce a counter counting how many times RedoRecPtr/fullPageWrites
have changed. But it seems unlikely to be worth it.


Greetings,

Andres Freund




Re: Support for NSS as a libpq TLS backend

2021-11-16 Thread Joshua Brindle
On Tue, Nov 16, 2021 at 9:45 AM Joshua Brindle
 wrote:
>
> On Mon, Nov 15, 2021 at 5:37 PM Joshua Brindle
>  wrote:
> >
> > On Mon, Nov 15, 2021 at 4:44 PM Daniel Gustafsson  wrote:
> > >
> > > > On 15 Nov 2021, at 20:51, Joshua Brindle 
> > > >  wrote:
> > >
> > > > Apologies for the delay, this didn't go to my inbox and I missed it on 
> > > > list.
> > > >
> > > > The bitcode generation is still broken, this time for nspr.h:
> > >
> > > Interesting, I am unable to replicate that in my tree but I'll investigate
> > > further tomorrow using your Dockerfile.  For the sake of testing, does
> > > compilation pass for you in the same place without using --with-llvm?
> > >
> >
> > Yes, it builds and check-world passes. I'll continue testing with this
> > build. Thank you.
>
> The previous Dockerfile had some issues due to a hasty port from RHEL
> to Fedora, attached is one that works with your patchset, llvm
> currently disabled, and the llvm deps removed.
>
> The service file is also attached since it's referenced in the
> Dockerfile and you'd have had to reproduce it.
>
> After building, run with:
> docker run --name pg-test -p 5432:5432 --cap-add=SYS_ADMIN -v
> /sys/fs/cgroup:/sys/fs/cgroup:ro -d 

I think there is a typo in the docs here that prevents them from
building (this diff seems to fix it):

diff --git a/doc/src/sgml/pgcrypto.sgml b/doc/src/sgml/pgcrypto.sgml
index 56b73e033c..844aa31e86 100644
--- a/doc/src/sgml/pgcrypto.sgml
+++ b/doc/src/sgml/pgcrypto.sgml
@@ -767,7 +767,7 @@ pgp_sym_encrypt(data, psw, 'compress-algo=1,
cipher-algo=aes256')

 Which cipher algorithm to use.  cast5 is only available
 if PostgreSQL was built with
-OpenSSL.
+OpenSSL.

 
 Values: bf, aes128, aes192, aes256, 3des, cast5




Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Alvaro Herrera
On 2021-Nov-16, Stephen Frost wrote:

> Not against possibly changing that but I don’t get the point of including
> be-gssapi-common.h if it’s not enabled in the build and typically if GSSAPI
> is possible and the reason for including be-gssapi-common.h then there’s
> other things that need to be under an ifdef, again, as in auth.c

BTW, this is exactly why my first suggestion was to add an exclusion
rule to headerscheck so that be-gssapi-common.h is not verified by that
script.  After re-reading your response, that looks like a reasonable
answer too.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: Non-superuser subscription owners

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 10:08 AM, Jeff Davis  wrote:
> 
> On Mon, 2021-11-01 at 10:58 -0700, Mark Dilger wrote:
>> It is unclear .
>> 
>> Thoughts?
> 
> What if we just say that OWNER TO must be done by a superuser, changing
> from one superuser to another, just like today? That would preserve
> backwards compatibility, but people with non-superuser subscriptions
> would need to drop/recreate them.

The paragraph I wrote on 11/01 and you are responding to is no longer relevant. 
 The patch submission on 11/03 tackled the problem.  Have you had a chance to 
take a look at the new design?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Non-superuser subscription owners

2021-11-16 Thread Jeff Davis
On Mon, 2021-11-01 at 10:58 -0700, Mark Dilger wrote:
> It is unclear that I can make ALTER SUBSCRIPTION..OWNER TO
> synchronous without redesigning the way workers respond to
> pg_subscription catalog updates generally.  That may be a good
> project to eventually tackle, but I don't see that it is more
> important to close the race condition in an OWNER TO than for a
> DISABLE.
> 
> Thoughts?

What if we just say that OWNER TO must be done by a superuser, changing
from one superuser to another, just like today? That would preserve
backwards compatibility, but people with non-superuser subscriptions
would need to drop/recreate them.

When we eventually do tackle the problem, we can lift the restriction.

Regards,
Jeff Davis






Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Alvaro Herrera
On 2021-Nov-16, Stephen Frost wrote:

> I've not looked at this very closely, but no: normally it's the inclusion of
> be-gssapi-common.h that's wrapped in the ifdef, not the contents of it;
> again, see auth.c.

I don't think you read my original post very carefully.  I'm talking
about sanitizing the output of the headerscheck script.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming"  (A. Stepanov)




Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Stephen Frost
Greetings,

On Tue, Nov 16, 2021 at 12:33 Alvaro Herrera 
wrote:

> On 2021-Nov-16, Stephen Frost wrote:
>
> > Short answer is yes, inclusion of be-gssapi-common.h is typically
> > wrapped in a #ifdef, see src/backend/libpq/auth.c
>
> It'd be as in the attached, then.


I've not looked at this very closely, but no: normally it's the inclusion of
be-gssapi-common.h that's wrapped in the ifdef, not the contents of it;
again, see auth.c.

Not against possibly changing that, but I don't get the point of including
be-gssapi-common.h if it's not enabled in the build.  Typically, if GSSAPI
is possible (the reason for including be-gssapi-common.h), then there are
other things that need to be under an ifdef, again as in auth.c.

If there isn't any need for be-gssapi-common.h, then maybe just drop that
include instead of adding an ifdef..?
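For readers following along, the auth.c pattern being referred to can be sketched like this (a simplified illustration, not the actual auth.c code; ENABLE_GSS and the header path match the real names, but the helper function is invented):

```c
/* Sketch of the auth.c-style pattern: the header is included only when
 * the build was configured with GSS support, and code needing it sits
 * under the same guard.  ENABLE_GSS stands in for the symbol set by
 * configure in pg_config.h. */

#ifdef ENABLE_GSS
#include "libpq/be-gssapi-common.h"	/* pulls in gssapi/gssapi.h */
#endif

/* Report whether GSSAPI support was compiled in. */
int
gss_compiled_in(void)
{
#ifdef ENABLE_GSS
	return 1;
#else
	return 0;
#endif
}
```

Since the include itself is under the guard, headerscheck would only see the gssapi headers in builds where they actually exist.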

Thanks,

Stephen

>


Re: Commitfest 2021-11 Patch Triage - Part 2

2021-11-16 Thread Daniil Zakhlystov
Hi! It's been a while since the original patch release. Let me provide a brief 
overview of the current patch status.

The initial approach was to use streaming compression to compress all 
outgoing and decompress all incoming bytes. However, after the long discussion 
in the thread, that approach has been changed.

The current implementation allows compressing only specific message types, 
using different compression algorithms for different message types, and 
configuring the allowed compression methods and levels on both the server 
and client sides, via a GUC setting and a connection string parameter 
respectively.

Also, the current implementation combines (when possible) multiple protocol 
messages into a single CompressedData message for a better compression ratio.
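For context on what "combining" means here, a hand-written sketch (invented function and layout details, not code from the patch): consecutive messages are appended into one buffer, and that whole buffer would then be compressed as a single CompressedData payload, giving the compressor more context than any single small message would.

```c
/* Sketch of the framing idea.  Message layout (type byte + 4-byte
 * length + body) loosely mirrors the v3 wire protocol; byte order and
 * error handling are ignored for brevity. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append one message to the combined buffer; returns the new offset. */
size_t
append_msg(char *buf, size_t off, char type, const char *body, uint32_t len)
{
	buf[off++] = type;
	memcpy(buf + off, &len, sizeof(uint32_t));	/* wire byte order ignored */
	off += sizeof(uint32_t);
	memcpy(buf + off, body, len);
	off += len;
	/* the real code would eventually hand buf[0..off) to the compressor
	 * and wrap the result in one CompressedData message */
	return off;
}
```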

> 
> On 16 Nov 2021, at 01:23, Robert Haas  wrote:
> 
> To me, this feels like an attempt to move the goalposts far enough to
> kill the project. Sure, in a perfect world, that would be nice. But,
> we don't do it anywhere else. If you try to store a JPEG into a bytea
> column, we'll try to compress it just like we would any other data,
> and it may not work out. If you then take a pg_basebackup of the
> database using -Z, there's no attempt made to avoid the CPU overhead
> of compressing those TOAST table pages that contain
> already-compressed data and not the others. And it's easy to
> understand why that's the case: when you insert data into the
> database, there's no way for the database to magically know whether
> that data has been previously compressed by some means, and if so, how
> effectively. And when you back up a database, the backup doesn't know
> which relfilenodes contain TOAST tables or which pages of those
> relfilenodes contain data that is already pre-compressed. In both cases,
> your options are either (1) shut off compression yourself or (2) hope
> that the compressor doesn't waste too much effort on it.
> 
> I think the same approach ought to be completely acceptable here. I
> don't even really understand how we could do anything else. printtup()
> just gets datums, and it has no idea whether or how they are toasted.
> It calls the type output functions which don't know that data is being
> prepared for transmission to the client as opposed to some other
> hypothetical way you could call that function, nor do they know what
> compression method the client wants. It does not seem at all
> straightforward to teach them that ... and even if they did, what
> then? It's not like every column value is sent as a separate packet;
> the whole row is a single protocol message, and some columns may be
> compressed and others uncompressed. Trying to guess what to do about
> that seems to boil down to a sheer guess. Unless you try to compress
> that mixture of compressed and uncompressed values - and it's
> moderately uncommon for every column of a table to even be
> toastable - you aren't going to know how well it will compress. You
> could easily waste more CPU cycles trying to guess than you would have
> spent just doing what the user asked for.
> 

Agree. From my POV, it is OK to use the protocol message type and length to 
decide whether it should be compressed or not. Also, this can be optimized 
later without the need to change the protocol.

Regarding the LZ4 support patch, it still needs some minor polishing. 
Basically, it only adds LZ4 algorithm support and does not change anything 
fundamentally. So I would appreciate someone reviewing the current patch 
version.

The original thread is quite long, which I guess makes it hard to catch up 
with the current patch status. I can start a new one with a detailed summary 
if that would help.

Thanks,

Daniil Zakhlystov





Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Alvaro Herrera
On 2021-Nov-16, Stephen Frost wrote:

> Short answer is yes, inclusion of be-gssapi-common.h is typically
> wrapped in a #ifdef, see src/backend/libpq/auth.c

It'd be as in the attached, then.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
>From db4a303a8a7c0c1422c974b70c06f2776670216a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 16 Nov 2021 13:40:27 -0300
Subject: [PATCH] Harden gssapi.h inclusion for headerscheck

If the file is not in any of these places, headerscheck warns about the
inclusion.
---
 src/include/libpq/be-gssapi-common.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/include/libpq/be-gssapi-common.h b/src/include/libpq/be-gssapi-common.h
index c07d7e7c5a..c2215f6ce7 100644
--- a/src/include/libpq/be-gssapi-common.h
+++ b/src/include/libpq/be-gssapi-common.h
@@ -14,6 +14,8 @@
 #ifndef BE_GSSAPI_COMMON_H
 #define BE_GSSAPI_COMMON_H
 
+#ifdef ENABLE_GSS
+
 #if defined(HAVE_GSSAPI_H)
 #include 
 #else
@@ -23,4 +25,6 @@
 extern void pg_GSS_error(const char *errmsg,
 		 OM_uint32 maj_stat, OM_uint32 min_stat);
 
+#endif			/* ENABLE_GSS */
+
 #endif			/* BE_GSSAPI_COMMON_H */
-- 
2.30.2



Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> On 2021-Nov-16, Alvaro Herrera wrote:
> 
> > Fix headerscheck failure in replication/worker_internal.h
> 
> The other failure is in src/include/libpq/be-gssapi-common.h:
> 
> In file included from /tmp/headerscheck.a6f0um/test.c:2:
> ./src/include/libpq/be-gssapi-common.h:20:10: fatal error: gssapi/gssapi.h: 
> No existe el fichero o el directorio
>20 | #include 
>   |  ^
> compilation terminated.
> 
> One possible solution for this one is to add an exclusion in
> headerscheck (and cpluspluscheck?); the other possible solution seems to
> be to wrap the whole contents of the file in "#ifdef ENABLE_GSS".  I
> think the latter is roughly the approach used for OpenSSL inclusions.

Short answer is yes, inclusion of be-gssapi-common.h is typically
wrapped in a #ifdef, see src/backend/libpq/auth.c

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pgsql: Fix headerscheck failure in replication/worker_internal.h

2021-11-16 Thread Alvaro Herrera
On 2021-Nov-16, Alvaro Herrera wrote:

> Fix headerscheck failure in replication/worker_internal.h

The other failure is in src/include/libpq/be-gssapi-common.h:

In file included from /tmp/headerscheck.a6f0um/test.c:2:
./src/include/libpq/be-gssapi-common.h:20:10: fatal error: gssapi/gssapi.h: No 
existe el fichero o el directorio
   20 | #include 
  |  ^
compilation terminated.

One possible solution for this one is to add an exclusion in
headerscheck (and cpluspluscheck?); the other possible solution seems to
be to wrap the whole contents of the file in "#ifdef ENABLE_GSS".  I
think the latter is roughly the approach used for OpenSSL inclusions.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Tom Lane
Mark Dilger  writes:
> On Nov 16, 2021, at 7:28 AM, Tom Lane  wrote:
>> True; as long as the expectation is that entries will exist for only
>> a tiny subset of GUCs, it's probably fine.

> I understand that bloating a frequently used catalog can be pretty
> harmful to performance.  I wasn't aware that the size of an infrequently
> used catalog was critical.

My concern is not about performance, it's about the difficulty of
maintaining a catalog that expects to be a more-or-less exhaustive
list of GUCs.  I think you need to get rid of that expectation.
In the analogy to ALTER DATABASE/USER SET, we don't expect that
pg_db_role_setting catalog entries will exist for all, or even
very many, GUCs.  Also, the fact that pg_db_role_setting entries
aren't tied very hard to the actual existence of a GUC is a good
thing from the maintenance and upgrade standpoint.

BTW, if we did create such a catalog, there would need to be
pg_dump and pg_upgrade support for its entries, and you'd have to
think about (e.g.) whether pg_upgrade would attempt to preserve
the same OIDs.  I don't see any indication that the patch has
addressed that infrastructure ... which is probably just as well,
since it's work that I'd be wanting to reject.  (Hm, but actually,
doesn't pg_dump need work anyway to dump this new type of GRANT?)

regards, tom lane




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Tom Lane
Mark Dilger  writes:
> You are talking about mismatches in the other direction, aren't you?  I was 
> responding to Robert's point that new gucs could appear, and old gucs 
> disappear.  That seems analogous to new functions appearing and old functions 
> disappearing.  If you upgrade (not downgrade) the .so, the new gucs and 
> functions will be in the .so, but won't be callable/grantable from sql until 
> the upgrade script runs.

True, but in general users might not care about getting access to new
features (or at least, they might not care until much later, at which
point they'd bother to run ALTER EXTENSION UPDATE).

> The old gucs and functions will be missing from the .so, and attempts to call 
> them/grant them from sql before the upgrade will fail.  What am I missing 
> here?

Here, you are describing behavior that's against project policy and would
be rejected immediately if anyone submitted a patch that made an extension
do that.  Newer .so versions are expected to run successfully, not fail,
with an older version of their SQL objects.  Were this not so, it's almost
inevitable that a binary upgrade would fail, because the extension is
likely to be called into action somewhere before there is any opportunity
to issue ALTER EXTENSION UPDATE.  Even in-place updates of extensions
would become problematic: you can't assume that a rollout of new
executables will be instantly followed by ALTER EXTENSION UPDATE.

Basically, I think the proposed new system catalog is both unworkable
and entirely unnecessary.  Maybe it would have been okay if Peter had
designed GUCs like that a couple of decades ago, but at this point
we have a ton of infrastructure --- much of it not under the core
project's control --- that assumes that GUCs can be created with just
a bit of code.  I do not think that this feature is worth the cost of
changing that.  Or IOW: ALTER SYSTEM SET has gotten along fine without
such a catalog; why can't this feature?

I also notice that the patch seems to intend to introduce tracking
of which user "owns" a GUC, which I think is even more unworkable.
They are all going to wind up owned by the bootstrap superuser
(extension GUCs too), so why bother?

regards, tom lane




Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display

2021-11-16 Thread Alvaro Herrera
On 2021-Nov-16, Bossart, Nathan wrote:

> On 11/15/21, 7:14 PM, "Michael Paquier"  wrote:
> > +   ereport(DEBUG1,
> > +   (errmsg_internal("streaming %X/%X",
> > +   LSN_FORMAT_ARGS(sentPtr))));
> > Anyway, those two ones are going to make the logs much more noisy, no?
> > The same could be said about XLogFileRead(), joining the point of
> > Nathan's upthread.  So I cannot get excited by this change.
> 
> Yeah, this might even be too noisy for DEBUG5.

Nod.  And if you're at DEBUG5, the log contains so much other crap that
it is unusable for any purposes anyway.

My opinion is that adding these things willy-nilly is not a solution to
any actual problem.  Adding a few additional log lines that are
low-volume at DEBUG1 might be useful, but below that (DEBUG2 etc) it's
not good for anything other than specific development, IMO.  At least
this particular one for streaming replication I think we should not
include.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Time to drop plpython2?

2021-11-16 Thread Andrew Dunstan


On 11/14/21 21:24, Tom Lane wrote:
> ... btw, there's a fairly critical gating factor for any plan to drop
> python2 support: the buildfarm.  I just counted, and there are exactly
> as many members running python 2.x as 3.x (49 apiece), not counting
> Windows machines that aren't running configure.  We can't commit
> something that's going to make half the buildfarm go red.
>
> (It's likely that some fraction of them do already have python3 installed,
> in which case the search order change Peter recommended would be enough to
> fix it.  But I'm sure not all do.)
>
> This ties into the business about converting the build system to meson,
> as that also requires python 3 --- with, IIUC, a higher minimum version
> than we might otherwise need.  I'm disinclined to cause two separate
> flag days for buildfarm owners, so what I now think is we ought to put
> this idea on the shelf until we've finished that conversion or decided
> we're not gonna do it.  We need to identify exactly what needs to be
> installed before we start pestering the owners.
>
>   


crake has been using 2.7, but has 3.9.7 installed. I tried switching to
that but ran into an issue with upgrading things from 9.5 on. It would
have gone all the way back to 9.2, but the plpython tests drop the
extension even though the transform tests in contrib don't, and neither
do the plperl tests drop plperlu. I'm rather inclined to say we should
stop doing that, or at least be consistent about it.

Having rebuilt all the branches from 9.5 and up, the cross version
upgrade tests are now passing on crake.

My other machine with an old python instance is bowerbird. It has python
3.4 installed but not used, alongside 2.7, which is used. I will install
the latest and see if that can be made to work.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: support for MERGE

2021-11-16 Thread Álvaro Herrera
Hi Amit

On 2021-Nov-16, Amit Langote wrote:

> AFAICS, MERGE operating on an inheritance parent that is not
> partitioned should work mostly the same as the case where it is
> partitioned (good thing that it works at all without needing any
> special code!), though only the INSERT actions would have to be
> handled appropriately by the user using triggers and such.  And also I
> guess any UPDATE actions that need to move rows between child tables
> because that too involves tuple routing logic.  As long as we're clear
> on that in the documentation, I don't see why this case should not be
> covered in the initial version.

Yeah, I think the reason it works so cleanly is that the code you and/or
Tom added to be able to get rid of inheritance_planner is superb,
including the new row identity stuff.  For the same reason, I suspect
that adding support for foreign tables should be reasonably simple --
just add explicit support for handling "wholerow" in a few places.  I
have not tried.

> I thought for a second about the cases where child tables have columns
> not present in the root parent mentioned in the command, but I guess
> that possibility doesn't present problems given that the command
> wouldn't be able to mention such columns to begin with; it can only
> refer to the root parent's column which must be present in all of the
> affected child tables.

Right.  On the other hand, if we did have a problem with extra columns,
ISTM that would be on the user's head, not our responsibility.  In the
example I added, there is one child table with a variant column layout; it
did require that the insertion trigger explicitly list the columns in
the INSERT statement for that table, but otherwise it worked correctly.

> In any case, I have a feeling that the planner would catch any
> problematic cases if there're any while converting MergeAction
> expressions into the individual child table layouts.

Yeah, AFAICS it worked fine for the case I tried.  Maybe there are more
elaborate ones that I didn't think of, of course.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Robert Haas
On Tue, Nov 16, 2021 at 10:45 AM Mark Dilger
 wrote:
> You are talking about mismatches in the other direction, aren't you?  I was 
> responding to Robert's point that new gucs could appear, and old gucs 
> disappear.  That seems analogous to new functions appearing and old functions 
> disappearing.  If you upgrade (not downgrade) the .so, the new gucs and 
> functions will be in the .so, but won't be callable/grantable from sql until 
> the upgrade script runs.  The old gucs and functions will be missing from the 
> .so, and attempts to call them/grant them from sql before the upgrade will 
> fail.  What am I missing here?

It's true that we could impose such a restriction, but I don't think
we should. If I install a different .so, I want the new GUCs to be
grantable immediately, without running any separate DDL.

I also don't think we should burden extension authors with putting
stuff in their upgrade scripts for this. We should solve the problem
in our code rather than forcing them to do so in theirs.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Test::More version

2021-11-16 Thread Andrew Dunstan


On 11/15/21 14:14, Tom Lane wrote:
> =?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
>> Tom Lane  writes:
>>> Based on this, I'm inclined to think we should select 0.98 as the
>>> minimum version.  Anything later would inconvenience some people.
>>> OTOH, it doesn't look like there's any value in promising compatibility
>>> with 0.96 instead, especially since I see a couple of subplan-related
>>> bug fixes in 0.97 and 0.98 in Test::Simple's changelog.
>> FWIW, here's a list of Test::More versions shipped with various Perl
>> releases. Since v5.12.0, the version does not change in minor releases.
>> ...
>>   v5.12.00.94
>>   v5.14.00.98
> Ah-hah, that's a good way to think about it too.  There's surely
> little reason in worrying about a module version that did not ship
> in any Perl release, because what 99.9% of people will have is what
> shipped with their Perl release.
>
>   


And newer versions of Test::Simple/Test::More still work with very old
versions of perl. jacana runs prove with perl 5.8.8 but has Test::More
1.001014.

So I agree 0.98 seems like a perfectly reasonable target. You should be
able to drop a replacement on prairiedog quite simply. For jacana I just
unpacked it and pointed PERL5LIB at it.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: add recovery, backup, archive, streaming etc. activity messages to server logs along with ps display

2021-11-16 Thread Bossart, Nathan
On 11/15/21, 7:14 PM, "Michael Paquier"  wrote:
> +   ereport(DEBUG1,
> +   (errmsg_internal("streaming %X/%X",
> +   LSN_FORMAT_ARGS(sentPtr))));
> Anyway, those two ones are going to make the logs much more noisy, no?
> The same could be said about XLogFileRead(), joining the point of
> Nathan's upthread.  So I cannot get excited by this change.

Yeah, this might even be too noisy for DEBUG5.

Nathan



Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 7:28 AM, Tom Lane  wrote:
> 
> True; as long as the expectation is that entries will exist for only
> a tiny subset of GUCs, it's probably fine.

I understand that bloating a frequently used catalog can be pretty harmful to 
performance.  I wasn't aware that the size of an infrequently used catalog was 
critical.  This new catalog would be used during GRANT SET ... and GRANT ALTER 
SYSTEM commands, which should be rare, and potentially consulted when SET or 
ALTER SYSTEM commands are issued.  Is there a more substantial performance 
impact to this than I'm aware of?  It can be a bit challenging to run performance 
tests on such things, given the way everything interacts with everything else. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 7:32 AM, Tom Lane  wrote:
> 
> Mark Dilger  writes:
>> I don't think we support using a .so that is mismatched against the
>> version of the extension that is installed.
> 
> You are entirely mistaken.  That's not only "supported", it's *required*.
> Consider binary upgrades, where the SQL objects are transferred as-is
> but the receiving installation may have a different (hopefully newer)
> version of the .so.  That .so has to cope with the older SQL object
> definitions; it doesn't get to assume that the upgrade script has been
> run yet.

You are talking about mismatches in the other direction, aren't you?  I was 
responding to Robert's point that new gucs could appear, and old gucs 
disappear.  That seems analogous to new functions appearing and old functions 
disappearing.  If you upgrade (not downgrade) the .so, the new gucs and 
functions will be in the .so, but won't be callable/grantable from sql until 
the upgrade script runs.  The old gucs and functions will be missing from the 
.so, and attempts to call them/grant them from sql before the upgrade will 
fail.  What am I missing here?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Tom Lane
Mark Dilger  writes:
> I don't think we support using a .so that is mismatched against the
> version of the extension that is installed.

You are entirely mistaken.  That's not only "supported", it's *required*.
Consider binary upgrades, where the SQL objects are transferred as-is
but the receiving installation may have a different (hopefully newer)
version of the .so.  That .so has to cope with the older SQL object
definitions; it doesn't get to assume that the upgrade script has been
run yet.

regards, tom lane




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Tom Lane
Robert Haas  writes:
> On Tue, Nov 16, 2021 at 10:17 AM Tom Lane  wrote:
>> Right.  I think that any design that involves per-GUC catalog entries
>> is going to be an abysmal failure, precisely because the set of GUCs
>> is not stable enough.

> In practice it's pretty stable. I think it's just a matter of having a
> plan that covers the cases where it isn't stable reasonably elegantly.

Um.  Really the point comes down to having sane default behavior
when there's no entry, which ought to eliminate any need to do the
sort of "run over all the entries at startup" processing that you
seemed to be proposing.  So I guess I don't understand why such a
thing would be needed.

> We already embed GUC names in catalog entries when someone runs ALTER
> USER SET or ALTER DATABASE SET, so this proposal doesn't seem to be
> moving the goalposts in any meaningful way.

True; as long as the expectation is that entries will exist for only
a tiny subset of GUCs, it's probably fine.

regards, tom lane




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 7:03 AM, Robert Haas  wrote:
> 
> It's also going to be important to think about what happens with
> extension GUCs. If somebody installs an extension, we can't ask them
> to perform a manual step in order to be able to grant privileges.

The burden isn't on the installer of an extension.  As implemented, it's the 
extension's installation .sql file that sets it up, and the upgrade .sql files 
that make adjustments, if necessary.

> And
> if somebody then loads up a different .so for that extension, the set
> of GUCs that it provides can change without any DDL being executed.
> New GUCs could appear, and old GUCs could vanish.

Well, the same is true for functions, right?  If you add, remove, or redefine 
functions in the extension, you need an upgrade script that defines the new 
functions, removes the old ones, changes function signatures, or whatever.  The 
same is true here for GUCs.

I don't think we support using a .so that is mismatched against the version of 
the extension that is installed.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Robert Haas
On Tue, Nov 16, 2021 at 10:17 AM Tom Lane  wrote:
> Right.  I think that any design that involves per-GUC catalog entries
> is going to be an abysmal failure, precisely because the set of GUCs
> is not stable enough.

In practice it's pretty stable. I think it's just a matter of having a
plan that covers the cases where it isn't stable reasonably elegantly.

We already embed GUC names in catalog entries when someone runs ALTER
USER SET or ALTER DATABASE SET, so this proposal doesn't seem to be
moving the goalposts in any meaningful way.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Frontend error logging style

2021-11-16 Thread Robert Haas
On Mon, Nov 15, 2021 at 10:02 PM Michael Paquier  wrote:
> On Mon, Nov 15, 2021 at 02:40:10PM -0500, Robert Haas wrote:
> > Having different frontend utilities each invent their own
> > slightly-different way of doing this makes it hard to reuse code, and
> > hard to understand code. We need to find ways to make it more uniform,
> > not just observe that it isn't uniform today and give up.
>
> I agree with this sentiment, but this is a bit more complex than just
> calling exit() with pg_log_fatal(), no?  pg_dump likes playing a lot
> with its exit_nicely(), meaning that we may want to allow frontends to
> plug in callbacks.

Yep.

I think we need frontend facilities that look like the backend
facilities, so try/catch blocks, on-exit callbacks, and whatever else
there is. Otherwise code reuse is going to continue to be annoying.
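To make the idea concrete, here is a minimal sketch of a setjmp-based frontend try/catch, with invented names; it only illustrates the mechanism, not a proposed API:

```c
/* Purely illustrative sketch of how a frontend counterpart to the
 * backend's PG_TRY/PG_CATCH could work: a stack of setjmp contexts,
 * where "throwing" longjmps to the innermost installed handler. */
#include <setjmp.h>
#include <stdio.h>

static jmp_buf *exception_stack = NULL;

/* Analogue of ereport(ERROR): jump to the nearest handler, if any. */
void
fe_throw(void)
{
	if (exception_stack != NULL)
		longjmp(*exception_stack, 1);
	fprintf(stderr, "unhandled frontend error\n");	/* exit_nicely() territory */
}

/* Run fn inside a "try" block; returns 1 if the "catch" branch ran. */
int
run_with_catch(void (*fn) (void))
{
	jmp_buf		local;
	jmp_buf    *save = exception_stack;
	volatile int caught = 0;

	exception_stack = &local;
	if (setjmp(local) == 0)
		fn();					/* try */
	else
		caught = 1;				/* catch */
	exception_stack = save;		/* pop, as PG_END_TRY does */
	return caught;
}

void fe_failing_op(void) { fe_throw(); }
void fe_ok_op(void)      { }
```

On-exit callbacks would layer on top of this in the no-handler branch, which is roughly where pg_dump's exit_nicely() lives today.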

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Tom Lane
Robert Haas  writes:
> It's also going to be important to think about what happens with
> extension GUCs. If somebody installs an extension, we can't ask them
> to perform a manual step in order to be able to grant privileges. And
> if somebody then loads up a different .so for that extension, the set
> of GUCs that it provides can change without any DDL being executed.
> New GUCs could appear, and old GUCs could vanish.

Right.  I think that any design that involves per-GUC catalog entries
is going to be an abysmal failure, precisely because the set of GUCs
is not stable enough.  So I'm suspicious of this entire proposal.
Maybe there's a way to make it work, but that way isn't how.

regards, tom lane




Re: Time to drop plpython2?

2021-11-16 Thread Robert Haas
On Mon, Nov 15, 2021 at 4:12 PM Tom Lane  wrote:
> > It'd only be an issue if they want to compile from source, right?
> > We're not speaking of changing the runtime prerequisites, IIUC.
>
> I'm not sure.  Does it make sense to document that pl/python has
> a different Python version requirement than the build system does?
> If we do, who exactly is going to be testing that such a combination
> works?  Will it even be possible to compile pl/python against Python
> headers/libs of a different Python generation than meson is running
> under?

Hmm, that's true. I hadn't considered the fact that anyone who is
packaging PostgreSQL probably also wants to build plpython. However,
it's possible that a side-by-side install of a newer python version
could be used for the build system while building against the system
python for plpython. That might or might not be too exotic a
configuration for someone to consider.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Granting SET and ALTER SYSTE privileges for GUCs

2021-11-16 Thread Robert Haas
On Mon, Nov 15, 2021 at 3:37 PM Mark Dilger
 wrote:
> One disadvantage of this approach is that the GUC variables are represented 
> both in the list of C structures in guc.c and in the new system catalog 
> pg_config_param's .dat file.  Failure to enter a GUC in the .dat file will 
> result in the inability to grant privileges on the GUC, at least unless/until 
> you run CREATE CONFIGURATION PARAMETER on the GUC.  (This is, in fact, how 
> extension scripts deal with the issue.)  It would perhaps be better if the 
> list of GUCs were not duplicated, but I wasn't clever enough to find a clean 
> way to do that without greatly expanding the patch (nor did I complete 
> prototyping any such thing.)

I think this imposes an unnecessary burden on developers. It seems
like it would be easy to write some code that lives inside guc.c and
runs during bootstrap, and it could just iterate over each
ConfigureNamesWhatever array and insert catalog entries for everything
it finds.
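A sketch of that idea, using simplified stand-ins for guc.c's ConfigureNames* arrays and the catalog-insertion step (the real arrays are struct config_bool, config_int, etc., and insertion would go through the catalog machinery):

```c
/* Sketch: at bootstrap, walk every ConfigureNames* array and emit one
 * catalog row per GUC.  Types, names, and contents are invented. */
#include <stddef.h>

typedef struct ConfigEntry
{
	const char *name;			/* NULL entry terminates each array */
} ConfigEntry;

static const ConfigEntry ConfigureNamesBool[] = {
	{"fsync"}, {"log_checkpoints"}, {NULL}
};
static const ConfigEntry ConfigureNamesInt[] = {
	{"max_connections"}, {NULL}
};

/* Returns how many rows bootstrap would insert. */
int
bootstrap_insert_guc_rows(void)
{
	const ConfigEntry *arrays[] = {ConfigureNamesBool, ConfigureNamesInt};
	int			nrows = 0;

	for (size_t i = 0; i < sizeof(arrays) / sizeof(arrays[0]); i++)
		for (const ConfigEntry *e = arrays[i]; e->name != NULL; e++)
			nrows++;			/* real code: insert a pg_config_param row */
	return nrows;
}
```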

It's also going to be important to think about what happens with
extension GUCs. If somebody installs an extension, we can't ask them
to perform a manual step in order to be able to grant privileges. And
if somebody then loads up a different .so for that extension, the set
of GUCs that it provides can change without any DDL being executed.
New GUCs could appear, and old GUCs could vanish.

So I wonder if we should instead not do what I proposed above, and
instead just adjust the GRANT command to automatically insert a new
row into the relevant catalog if there isn't one already. That seems
nicer for extensions, and also nicer for core GUCs, since it avoids
bloating the catalog with a bunch of entries that aren't needed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support for NSS as a libpq TLS backend

2021-11-16 Thread Joshua Brindle
On Mon, Nov 15, 2021 at 5:37 PM Joshua Brindle
 wrote:
>
> On Mon, Nov 15, 2021 at 4:44 PM Daniel Gustafsson  wrote:
> >
> > > On 15 Nov 2021, at 20:51, Joshua Brindle  
> > > wrote:
> >
> > > Apologies for the delay, this didn't go to my inbox and I missed it on 
> > > list.
> > >
> > > The bitcode generation is still broken, this time for nspr.h:
> >
> > Interesting, I am unable to replicate that in my tree but I'll investigate
> > further tomorrow using your Dockerfile.  For the sake of testing, does
> > compilation pass for you in the same place without using --with-llvm?
> >
>
> Yes, it builds and check-world passes. I'll continue testing with this
> build. Thank you.

The previous Dockerfile had some issues due to a hasty port from RHEL
to Fedora; attached is one that works with your patchset, with llvm
currently disabled and the llvm deps removed.

The service file is also attached since it's referenced in the
Dockerfile and you'd have had to reproduce it.

After building, run with:
docker run --name pg-test -p 5432:5432 --cap-add=SYS_ADMIN -v
/sys/fs/cgroup:/sys/fs/cgroup:ro -d 


Dockerfile
Description: Binary data


postgresql-15.service
Description: Binary data


Re: RecoveryInProgress() has critical side effects

2021-11-16 Thread Robert Haas
On Mon, Nov 15, 2021 at 9:34 PM Michael Paquier  wrote:
> Patch 0001 means that the startup process would set up the structures
> to be able to build WAL records before running any kind of recovery
> action rather than doing it when it really needs it.  That's fine by
> me.

Great, thanks. I think I'll go ahead and commit this one, then.

> Is patch 0002 actually right regarding the handling of doPageWrites?
> Once applied, we finish by setting it when the startup process starts
> and not anymore at the end of recovery based on the state of Insert,
> but this could have changed while in recovery when replaying
> one or more XLOG_FPW_CHANGE records.

Maybe I'm not understanding you properly here, but it seems like you
might be forgetting that this is a local variable and thus every
backend is going to have something different. In the startup process,
it will be initialized by StartupXLOG(); in other processes, it's
currently initialized by RecoveryInProgress(), but with this patch it
wouldn't be. Either way, it's then updated by future calls to
XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect
the new value that gets set the next time XLogInsertRecord() is
called, but they don't directly affect doPageWrites.
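The per-backend caching being described can be sketched in a standalone way. The names below are illustrative (the real `doPageWrites` is a static in xlog.c, refreshed from shared state during record insertion); this is not the actual PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for shared state (XLogCtl->Insert.fullPageWrites) */
static bool shared_fullPageWrites = false;

/* Per-backend local copy, analogous to the static doPageWrites in xlog.c */
static bool doPageWrites = false;

/*
 * Each record insertion refreshes the local copy from shared state, so a
 * stale local value only survives until the backend's next insert; nothing
 * else needs to push updates to every backend.
 */
static void
XLogInsertRecordSketch(void)
{
	doPageWrites = shared_fullPageWrites;
}
```

The point is that changes to the shared value (e.g. from replaying XLOG_FPW_CHANGE) don't touch the local variable directly; they only influence what the next insert call picks up.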

> > I'm personally not too worried about a ~4% regression in this
> > particular benchmark...
>
> This is not a hot code path, that should be fine.

OK. I'll wait a while and see if anyone else wants to weigh in.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add regression coverage for REVOKE ADMIN OPTION

2021-11-16 Thread Mark Dilger



> On Nov 16, 2021, at 6:31 AM, Daniel Gustafsson  wrote:
> 
>> On 16 Nov 2021, at 00:58, Mark Dilger  wrote:
> 
>> While working on a fix for dangling references to dropped roles in the 
>> pg_auth_members.grantor field, I happened to notice we entirely lack 
>> regression test coverage of the REVOKE ADMIN OPTION FOR form of the 
>> RevokeRoleStmt.  I am unaware of any bugs in the current implementation, but 
>> future work on roles may benefit if we close the testing gap.
> 
> LGTM.  Reading this I realized that the GRANTED BY keyword for RevokeRoleStmt
> isn't working as documented, it's not checking the role at all.  I've sent a
> diff for that with tests on the relevant thread, but I think it would be
> good to get this in too to boost coverage.

Thanks for the review!

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Add regression coverage for REVOKE ADMIN OPTION

2021-11-16 Thread Daniel Gustafsson
> On 16 Nov 2021, at 00:58, Mark Dilger  wrote:

> While working on a fix for dangling references to dropped roles in the 
> pg_auth_members.grantor field, I happened to notice we entirely lack 
> regression test coverage of the REVOKE ADMIN OPTION FOR form of the 
> RevokeRoleStmt.  I am unaware of any bugs in the current implementation, but 
> future work on roles may benefit if we close the testing gap.

LGTM.  Reading this I realized that the GRANTED BY keyword for RevokeRoleStmt
isn't working as documented, it's not checking the role at all.  I've sent a
diff for that with tests on the relevant thread, but I think it would be good
to get this in too to boost coverage.

--
Daniel Gustafsson   https://vmware.com/





Re: Allow CURRENT_ROLE in GRANTED BY

2021-11-16 Thread Daniel Gustafsson
> On 16 Nov 2021, at 15:04, Daniel Gustafsson  wrote:

> ..or should the attached small diff be applied to fix it?

Actually it shouldn't, I realized when hitting Send that it was the wrong
version.  The attached is the proposed diff.

--
Daniel Gustafsson   https://vmware.com/



granted_by_v2.diff
Description: Binary data


Re: Allow CURRENT_ROLE in GRANTED BY

2021-11-16 Thread Daniel Gustafsson
> On 30 Jan 2021, at 09:51, Peter Eisentraut  
> wrote:
> 
> On 2020-12-30 13:43, Simon Riggs wrote:
>> On Thu, 10 Dec 2020 at 18:40, Peter Eisentraut
>>  wrote:
>>> 
>>> On 2020-06-24 20:21, Peter Eisentraut wrote:
 On 2020-06-24 10:12, Vik Fearing wrote:
> On 6/24/20 8:35 AM, Peter Eisentraut wrote:
>> I was checking some loose ends in SQL conformance, when I noticed: We
>> support GRANT role ... GRANTED BY CURRENT_USER, but we don't support
>> CURRENT_ROLE in that place, even though in PostgreSQL they are
>> equivalent.  Here is a trivial patch to add that.
> 
> 
> The only thing that isn't dead-obvious about this patch is the commit
> message says "[PATCH 1/2]".  What is in the other part?
 
 Hehe.  The second patch is some in-progress work to add the GRANTED BY
 clause to the regular GRANT command.  More on that perhaps at a later date.
>>> 
>>> Here is the highly anticipated and quite underwhelming second part of
>>> this patch set.
>> Looks great, but no test to confirm it works. I would suggest adding a
>> test and committing directly since I don't see any cause for further
>> discussion.
> 
> Committed with some tests.  Thanks.

While looking at the proposed privileges.sql test patch from Mark Dilger [0] I
realized that the commit above seems to have missed the RevokeRoleStmt syntax.
As per the SQL Spec it should be supported there as well AFAICT.  Was this
intentional or should the attached small diff be applied to fix it?

--
Daniel Gustafsson   https://vmware.com/

[0] 333b0203-d19b-4335-ae64-90eb0faf4...@enterprisedb.com



granted_by.diff
Description: Binary data


Re: pg_get_publication_tables() output duplicate relid

2021-11-16 Thread Amit Kapila
On Mon, Nov 15, 2021 at 7:12 PM Alvaro Herrera  wrote:
>
> > On Mon, Nov 15, 2021 at 1:48 PM houzj.f...@fujitsu.com
> >  wrote:
>
> > > create table tbl1 (a int) partition by range (a);
> > > create table tbl1_part1 partition of tbl1 for values from (1) to (10);
> > > create table tbl1_part2 partition of tbl1 for values from (10) to (20);
> > > create publication pub for table
> > > tbl1, tbl1_part1 with (publish_via_partition_root=false);
>
> In the name of consistency, I think this situation should be an error --
> I mean, if we detect that the user is trying to add both the partitioned
> table *and* its partition, then all manner of things are possibly going
> to go wrong in some way, so my inclination is to avoid it altogether.
>
> Is there any reason to allow that?
>

I think it could provide flexibility for users to later change the
"publish_via_partition_root" option. When that option is false, we use
the individual partitions' schemas to send changes, and when it is
true, we use the root table's schema to send changes. I have added
Amit L. to see if he has any thoughts on this matter, as he was the
author of this work.

-- 
With Regards,
Amit Kapila.




Re: CREATE PUBLICATION should "See Also" CREATE SUBSCRIPTION

2021-11-16 Thread Daniel Gustafsson
> On 15 Nov 2021, at 23:29, Peter Smith  wrote:

> I noticed recently that the CREATE PUBLICATION docs page [1] does not
> have any "See Also" reference to the CREATE SUBSCRIPTION docs page
> [2].

Nice catch.

>   
>
>
> +   

To make it analogous with how CREATE/ALTER for SUBSCRIPTION and PUBLICATION
reference each other, there should IMO be another xref to ALTER SUBSCRIPTION as
well.

Unless there are objections, I'll apply that.

--
Daniel Gustafsson   https://vmware.com/





RE: Failed transaction statistics to measure the logical replication progress

2021-11-16 Thread osumi.takami...@fujitsu.com
On Monday, November 15, 2021 9:14 PM I wrote:
> I've conducted some update for this.
> (The rebased part is only C code and checked by pgindent)
I'll update my patches since a new skip xid patch
has been shared in [1].

This version includes some minor renames of functions
that are related to transaction sizes.

[1] - 
https://www.postgresql.org/message-id/CAD21AoA5jupM6O%3DpYsyfaxQ1aMX-en8%3DQNgpW6KfXsg7_CS0CQ%40mail.gmail.com

Best Regards,
Takamichi Osumi



v12-0001-Rename-existing-columns-of-pg_stat_subscription_.patch
Description:  v12-0001-Rename-existing-columns-of-pg_stat_subscription_.patch


v12-0002-Extend-pg_stat_subscription_workers-to-include-g.patch
Description:  v12-0002-Extend-pg_stat_subscription_workers-to-include-g.patch


Re: WIP: System Versioned Temporal Table

2021-11-16 Thread Daniel Gustafsson
> On 15 Nov 2021, at 11:50, Simon Riggs  wrote:

> I have no plans on taking this patch further, but will give some help
> to anyone that wishes to do that.
> 
> I suggest we Return with Feedback.

Fair enough, done that way.

--
Daniel Gustafsson   https://vmware.com/





Re: [BUG]Invalidate relcache when setting REPLICA IDENTITY

2021-11-16 Thread Amit Kapila
On Mon, Nov 15, 2021 at 7:20 AM houzj.f...@fujitsu.com
 wrote:
>
> On Sat, Nov 13, 2021 6:50 PM Amit Kapila  wrote:
> >
> > The patch looks mostly good to me. I have slightly tweaked the comments in
> > the code (as per my previous suggestion) and test. Also, I have slightly
> > modified the commit message. If the attached looks good to you then kindly
> > prepare patches for back-branches.
>
> Thanks for reviewing, the latest patch looks good to me.
> Attach the patches for back branch (HEAD ~ v10).
>

Pushed.

-- 
With Regards,
Amit Kapila.




Re: RFC: Logging plan of the running query

2021-11-16 Thread torikoshia

On 2021-11-15 23:15, Bharath Rupireddy wrote:


I have another comment: isn't it a good idea that an overloaded
version of the new function pg_log_query_plan can take the available
explain command options as a text argument? I'm not sure if it is
possible to get the stats like buffers, costs etc of a running query,
if yes, something like pg_log_query_plan(pid, 'buffers',
'costs');? It looks to be an overkill at first sight, but these
can be useful to know more detailed plan of the query.


I also think the overloaded version would be useful.
However, as discussed in [1], it seems to introduce other difficulties.
I think it would be enough for the first version of pg_log_query_plan
not to take any parameters.


[1] 
https://www.postgresql.org/message-id/ce86e4f72f09d5497e8ad3a162861d33%40oss.nttdata.com


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-11-16 Thread Amit Kapila
On Mon, Nov 15, 2021 at 12:38 PM Masahiko Sawada  wrote:
>
> I've updated the patch so that ProcArrayInstallRestoredXmin() sets
> both xmin and statusFlags only when the source proc is still running
> and xmin doesn't go backwards. IOW it doesn't happen that only one of
> them is set by this function, which seems more understandable
> behavior.
>

How have you tested this patch? As there was no test case presented in
this thread, I used the below manual test to verify that the patch
works. The idea is to generate a scenario where a parallel vacuum
worker holds back the xmin from advancing.

Setup:
-- keep autovacuum = off in postgresql.conf
create table t1(c1 int, c2 int);
insert into t1 values(generate_series(1,1000),100);
create index idx_t1_c1 on t1(c1);
create index idx_t1_c2 on t1(c2);

create table t2(c1 int, c2 int);
insert into t2 values(generate_series(1,1000),100);
create index idx_t2_c1 on t2(c1);

Session-1:
delete from t1 where c1 < 10; --this is to ensure that vacuum has some
work to do

Session-2:
-- this is done just to ensure the Session-1's xmin captures the value
of this xact
begin;
select txid_current(); -- say value is 725
insert into t2 values(1001, 100);

Session-1:
set min_parallel_index_scan_size=0;
-- attach a debugger and ensure to stop parallel worker somewhere
before it completes and the leader after launching parallel worker
vacuum t1;

Session-2:
-- commit the open transaction
commit;

Session-3:
-- attach a debugger and break at the caller of vacuum_set_xid_limits.
vacuum t2;

I noticed that before the patch the value of oldestXmin in Session-3
is 725 but after the patch it got advanced. I have made minor edits to
the attached patch. See, if this looks okay to you then please prepare
and test the patch for back-branches as well. If you have some other
way to test the patch then do share the same and let me know if you
see any flaw in the above verification method.

-- 
With Regards,
Amit Kapila.


v5-0001-Fix-parallel-operations-that-prevent-oldest-xmin-.patch
Description: Binary data


Re: Add connection active, idle time to pg_stat_activity

2021-11-16 Thread Rafia Sabih
On Mon, 15 Nov 2021 at 12:40, Dilip Kumar  wrote:
>
> On Mon, Nov 15, 2021 at 4:46 PM Rafia Sabih  wrote:
> >
> > On Mon, 15 Nov 2021 at 10:24, Dilip Kumar  wrote:
> > >
> > > On Wed, Nov 10, 2021 at 1:47 PM Rafia Sabih  
> > > wrote:
> > > >
> > > > > It seems that in beentry->st_idle_time, you want to compute the
> > > > > STATE_IDLE, but that state is not handled in the outer "if", that
> > > > > means whenever it comes out of the
> > > > > STATE_IDLE, it will not enter inside this if check.  You can run and
> > > > > test, I am sure that with this patch the "idle_time" will always
> > > > > remain 0.
> > > > >
> > > > Thank you Dilip for your time on this.
> > > > And yes you are right in both your observations.
> > > > Please find the attached patch for the updated version.
> > >
> > > Looks fine now except these variable names,
> > >
> > >  PgStat_Counter pgStatTransactionIdleTime = 0;
> > > +PgStat_Counter pgStatTransactionIdleInTxnTime = 0;
> > >
> > > Now, pgStatTransactionIdleTime is collecting just the Idle time so
> > > pgStatTransactionIdleTime should be renamed to "pgStatIdleTime" and
> > > pgStatTransactionIdleInTxnTime should be renamed to
> > > "pgStatTransactionIdleTime"
> > >
> > Good point!
> > Done.
>
> @@ -1018,7 +1019,7 @@ pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg,
> TimestampTz now)
>   pgLastSessionReportTime = now;
>   tsmsg->m_session_time = (PgStat_Counter) secs * 1000000 + usecs;
>   tsmsg->m_active_time = pgStatActiveTime;
> - tsmsg->m_idle_in_xact_time = pgStatTransactionIdleTime;
> + tsmsg->m_idle_in_xact_time = pgStatIdleTime;
>
> I think this change is wrong,  basically, "tsmsg->m_idle_in_xact_time"
> is used for counting the database level idle in transaction count, you
> can check "pg_stat_get_db_idle_in_transaction_time" function for that.
> So "pgStatTransactionIdleTime" is the variable counting the idle in
> transaction time, pgStatIdleTime is just counting the idle time
> outside the transaction so if we make this change we are changing the
> meaning of tsmsg->m_idle_in_xact_time.

Got it.
Updated

-- 
Regards,
Rafia Sabih
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eb560955cd..4dfa33ffa9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -839,7 +839,10 @@ CREATE VIEW pg_stat_activity AS
 S.backend_xmin,
 S.query_id,
 S.query,
-S.backend_type
+S.backend_type,
+S.active_time,
+S.idle_in_transaction_time,
+S.idle_time
 FROM pg_stat_get_activity(NULL) AS S
 LEFT JOIN pg_database AS D ON (S.datid = D.oid)
 LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid);
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 8c166e5e16..fc5e58e06b 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -248,6 +248,7 @@ PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
 static PgStat_Counter pgLastSessionReportTime = 0;
 PgStat_Counter pgStatActiveTime = 0;
+PgStat_Counter pgStatIdleTime = 0;
 PgStat_Counter pgStatTransactionIdleTime = 0;
 SessionEndType pgStatSessionEndCause = DISCONNECT_NORMAL;
 
@@ -1031,7 +1032,7 @@ pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now)
 		pgStatBlockReadTime = 0;
 		pgStatBlockWriteTime = 0;
 		pgStatActiveTime = 0;
-		pgStatTransactionIdleTime = 0;
+		pgStatIdleTime = 0;
 	}
 	else
 	{
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 7229598822..560fa0fa0c 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -577,6 +577,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	 */
 	if ((beentry->st_state == STATE_RUNNING ||
 		 beentry->st_state == STATE_FASTPATH ||
+		 beentry->st_state == STATE_IDLE ||
 		 beentry->st_state == STATE_IDLEINTRANSACTION ||
 		 beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED) &&
 		state != beentry->st_state)
@@ -590,9 +591,21 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 
 		if (beentry->st_state == STATE_RUNNING ||
 			beentry->st_state == STATE_FASTPATH)
-			pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
-		else
+		{
+			pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
+			beentry->st_active_time = pgStatActiveTime;
+		}
+		else if (beentry->st_state == STATE_IDLEINTRANSACTION ||
+				 beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED)
+		{
 			pgstat_count_conn_txn_idle_time((PgStat_Counter) secs * 1000000 + usecs);
+			beentry->st_transaction_idle_time = pgStatTransactionIdleTime;
+		}
+		else
+		{
+			pgstat_count_conn_idle_time((PgStat_Counter) secs * 1000000 + usecs);
+			beentry->st_idle_time = pgStatIdleTime;
+		}
 	}
 
 	/*
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/uti

Re: Emit a warning if the extension's GUC is set incorrectly

2021-11-16 Thread Bharath Rupireddy
On Tue, Nov 16, 2021 at 2:50 PM Shinya Kato
 wrote:
> > I'm not sure if we have "how to create an extension/a bg worker?" in
> > the core docs, if we have, it is a good idea to make note of using
> > EmitWarningsOnPlaceholders so that it will not be missed in the future
> > modules.
>
> I cannot find any such docs, so I add a comment to
> src/backend/utils/misc/guc.c.
>
> > I observed an odd behaviour:
> > 1) I set postgres_fdw.XXX = 'I_messed_up_conf_file' in postgresql.conf
> > 2) With EmitWarningsOnPlaceholders("postgres_fdw"); in postgres_fdw
> > contrib module, I created the extension, have seen the following
> > warning:
> > 2021-11-15 06:02:31.198 UTC [2018111] WARNING:  unrecognized
> > configuration parameter "postgres_fdw.XXX"
> > 3) I further did, "alter system set
> > postgres_fdw.XXX='I_further_messed_up_conf_file';" and also "select
> > pg_reload_conf();", it silently accepts.
> >
> > postgres=# create extension postgres_fdw ;
> > WARNING:  unrecognized configuration parameter "postgres_fdw.XXX"
> > CREATE EXTENSION
> > postgres=# alter system set
> > postgres_fdw.XXX='I_further_messed_up_conf_file';
> > ALTER SYSTEM
> > postgres=# select pg_reload_conf();
> >  pg_reload_conf
> > 
> >  t
> > (1 row)
> >
> > My point is, why should the "create extension" succeed with a WARNING
> > when a wrong parameter related to it is set in the postgresql.conf
> > file so that we don't see further problems as shown in (3). I think
> > EmitWarningsOnPlaceholders should emit an error so that the extension
> > will not get created in the first place if a wrong configuration
> > parameter is set in the postgresql.conf file. Many times, users will
> > not have access to server logs so it is good to fail the "create
> > extension" statement.
> >
> > Thoughts?
> >
> > postgres=# create extension postgres_fdw ;
> > ERROR:  unrecognized configuration parameter "postgres_fdw.XXX"
> > postgres=#
>
> I confirmed it too, and I think that's definitely a problem.
> I also thought it would be a good idea to emit an ERROR as well as when
> an invalid normal GUC is set.
> I didn't change the function name because it would affect many third
> party extensions.

For backward compatibility you can retain the function
EmitWarningsOnPlaceholders as-is and have another similar function
that emits an error:

In guc.c, have something like below:
static void
check_conf_params(const char *className, bool emit_error)
{
	/* have the existing code of EmitWarningsOnPlaceholders here */
	ereport(emit_error ? ERROR : WARNING,
			(errcode(ERRCODE_UNDEFINED_OBJECT),
			 errmsg("unrecognized configuration parameter \"%s\"",
					var->name)));
}

void
EmitErrorOnPlaceholders(const char *className)
{
	check_conf_params(className, true);
}

void
EmitWarningsOnPlaceholders(const char *className)
{
	check_conf_params(className, false);
}

And change all the core extensions to use EmitErrorOnPlaceholders.
This way you don't break the backward compatibility of the outside
extensions, if they want they get to choose which behaviour they want
for their extension.

Actually, I don't quite like the function name
EmitWarningsOnPlaceholders (or EmitErrorOnPlaceholders); it could have
been something else. But let's not go in the direction of changing the
function name, for the sake of backward compatibility.

Regards,
Bharath Rupireddy.




Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

2021-11-16 Thread Etsuro Fujita
On Mon, Nov 8, 2021 at 1:13 PM Fujii Masao  wrote:
> On 2021/11/07 18:06, Etsuro Fujita wrote:
> > On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao  
> > wrote:
> >> Could you tell me why the parameter is necessary?
> >> Can't we always enable the feature?

> > I think it might be OK to enable it by
> > default.  But my concern about doing so is the remote side: during
> > those functions, if there are a lot of (sub)transactions on a single
> > remote server that need to be committed, parallel commit would
> > increase the remote server’s load at (sub)transaction end than serial
> > commit, which is the existing implementation, as the requests to
> > commit those (sub)transactions are sent to the remote server at the
> > same time; which some users might want to avoid.
>
> Thanks for explaining this! But probably I failed to get your point.
> Sorry... Whichever parallel or serial commit approach, the number of
> transactions to commit on the remote server is the same, isn't it?
> For example, please imagine the case where a client requests
> ten transactions per second to the local server. Each transaction
> accesses to the foreign table, so which means that ten transaction
> commit operations per second are requested to the remote server.
> Unless I'm missing something, this number doesn't change whether
> the foreign transaction is committed in parallel way or not.

Sorry, my explanation was not enough, but I don’t think this is always
true.  Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts.  This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled, but I might be too worried about that, so I want to hear
the opinions of people.

> IMO it's better to implement and commit these features gradually
> if possible. Which would simplify the patch and make it
> easier-to-review. So I think that it's better to implement
> this feature as a separate patch.

Ok, I'll create a patch for abort cleanup separately.

> >> +   /* Consume whatever data is available from the socket */
> >> +   if (!PQconsumeInput(conn))
> >> +   pgfdw_report_error(ERROR, NULL, conn, false, sql);
> >>
> >> Without the patch, PQconsumeInput() is not called before pgfdw_get_result()
> >> But could you tell me why you added PQconsumeInput() there?
> >
> > The reason is that there might be the result already before calling
> > pgfdw_get_result(), in which case PQconsumeInput() followed by
> > PQisBusy() would allow us to call PQgetResult() without doing
> > WaitLatchOrSocket(), which I think is rather expensive.
>
> Understood. It's helpful to add the comment about why PQconsumeInput()
> is called there.

Ok.

> Also could you tell me how much expensive it is?

IIUC I think the overheads of WaitLatchOrSocket() incurred by a series
of epoll system calls are much larger compared to the overheads of
PQconsumeInput() incurred by a recv system call in non-blocking mode
when no data is available.  I didn’t do testing, though.

Actually, don't we already use this optimization in libpqrcv_receive(),
so that the caller of that function can avoid doing WaitLatchOrSocket()?
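The drain-before-wait pattern under discussion can be sketched abstractly. This is a standalone model, not libpq code: `ConsumeInputSketch`, `IsBusySketch`, and `WaitLatchOrSocketSketch` are illustrative stand-ins for PQconsumeInput(), PQisBusy(), and WaitLatchOrSocket().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model: a complete result may already be buffered */
static bool buffered_result = false;
static int	wait_calls = 0;

static bool
ConsumeInputSketch(void)
{
	return true;				/* models a cheap non-blocking recv() */
}

static bool
IsBusySketch(void)
{
	return !buffered_result;	/* busy means no complete result yet */
}

static void
WaitLatchOrSocketSketch(void)
{
	wait_calls++;				/* models the expensive latch/socket wait */
	buffered_result = true;		/* data arrives while waiting */
}

/*
 * Drain already-available data first; only fall back to the expensive
 * wait when no complete result is buffered yet.
 */
static void
GetResultSketch(void)
{
	ConsumeInputSketch();
	while (IsBusySketch())
		WaitLatchOrSocketSketch();
}
```

When the result arrived before we asked for it, the wait is skipped entirely, which is the saving being argued for.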

> >> When ignore_errors argument is true, the error reported by
> >> PQconsumeInput() should be ignored?
> >
> > I’m not sure about that, because the error might be caused by e.g.,
> > OOM in the local server, in which case I don’t think it is safe to
> > ignore it and continue the (sub)transaction-end processing.
>
> But the existing code ignores the error at all, doesn't it?
> If it's unsafe to do that, probably we should fix that at first?

I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we reall

Re: Emit a warning if the extension's GUC is set incorrectly

2021-11-16 Thread Shinya Kato

On 2021-11-15 15:14, Bharath Rupireddy wrote:


Do we need to add them in the following too?

delay_execution/delay_execution.c
ssl_passphrase_func.c
worker_spi.c

Especially, worker_spi.c is important as it works as a template for
the bg worker code.


Thank you for pointing this out! I added it.


I'm not sure if we have "how to create an extension/a bg worker?" in
the core docs, if we have, it is a good idea to make note of using
EmitWarningsOnPlaceholders so that it will not be missed in the future
modules.


I cannot find any such docs, so I add a comment to 
src/backend/utils/misc/guc.c.



I observed an odd behaviour:
1) I set postgres_fdw.XXX = 'I_messed_up_conf_file' in postgresql.conf
2) With EmitWarningsOnPlaceholders("postgres_fdw"); in postgres_fdw
contrib module, I created the extension, have seen the following
warning:
2021-11-15 06:02:31.198 UTC [2018111] WARNING:  unrecognized
configuration parameter "postgres_fdw.XXX"
3) I further did, "alter system set
postgres_fdw.XXX='I_further_messed_up_conf_file';" and also "select
pg_reload_conf();", it silently accepts.

postgres=# create extension postgres_fdw ;
WARNING:  unrecognized configuration parameter "postgres_fdw.XXX"
CREATE EXTENSION
postgres=# alter system set 
postgres_fdw.XXX='I_further_messed_up_conf_file';

ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf

 t
(1 row)

My point is, why should the "create extension" succeed with a WARNING
when a wrong parameter related to it is set in the postgresql.conf
file so that we don't see further problems as shown in (3). I think
EmitWarningsOnPlaceholders should emit an error so that the extension
will not get created in the first place if a wrong configuration
parameter is set in the postgresql.conf file. Many times, users will
not have access to server logs so it is good to fail the "create
extension" statement.

Thoughts?

postgres=# create extension postgres_fdw ;
ERROR:  unrecognized configuration parameter "postgres_fdw.XXX"
postgres=#


I confirmed it too, and I think that's definitely a problem.
I also thought it would be a good idea to emit an ERROR as well as when 
an invalid normal GUC is set.
I didn't change the function name because it would affect many third 
party extensions.


I plan to change to emit an error when an invalid custom GUC is set in 
the SET or ALTER SYSTEM SET commands, but I haven't tackled this yet.

The patch as of now is attached.


--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

diff --git a/contrib/auth_delay/auth_delay.c b/contrib/auth_delay/auth_delay.c
index 5820ac328d..d11dd1e416 100644
--- a/contrib/auth_delay/auth_delay.c
+++ b/contrib/auth_delay/auth_delay.c
@@ -67,6 +67,9 @@ _PG_init(void)
 			NULL,
 			NULL,
 			NULL);
+
+	EmitWarningsOnPlaceholders("auth_delay");
+
 	/* Install Hooks */
 	original_client_auth_hook = ClientAuthentication_hook;
 	ClientAuthentication_hook = auth_delay_checks;
diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index fb38135f7a..0407c7dd64 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -100,6 +100,8 @@ _PG_init(void)
 			 NULL,
 			 NULL,
 			 NULL);
+
+	EmitWarningsOnPlaceholders("pg_trgm");
 }
 
 /*
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 48c7417e6e..36555398ec 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -469,4 +469,6 @@ _PG_init(void)
 			   NULL,
 			   NULL,
 			   NULL);
+
+	EmitWarningsOnPlaceholders("postgres_fdw");
 }
diff --git a/contrib/sepgsql/hooks.c b/contrib/sepgsql/hooks.c
index 19a3ffb7ff..44a09c35a7 100644
--- a/contrib/sepgsql/hooks.c
+++ b/contrib/sepgsql/hooks.c
@@ -455,6 +455,8 @@ _PG_init(void)
 			 NULL,
 			 NULL);
 
+	EmitWarningsOnPlaceholders("sepgsql");
+
 	/* Initialize userspace access vector cache */
 	sepgsql_avc_init();
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e91d5a3cfd..07647a0d84 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -9322,6 +9322,13 @@ DefineCustomEnumVariable(const char *name,
 	define_custom_variable(&var->gen);
 }
 
+/*
+ * For historical reasons this function is named EmitWarningsOnPlaceholders,
+ * but it now emits an error when an invalid custom GUC is set.
+ *
+ * If you define variables with DefineCustomXXXVariable, it is recommended
+ * to call this function afterwards.
+ */
 void
 EmitWarningsOnPlaceholders(const char *className)
 {
@@ -9336,7 +9343,7 @@ EmitWarningsOnPlaceholders(const char *className)
 			strncmp(className, var->name, classLen) == 0 &&
 			var->name[classLen] == GUC_QUALIFIER_SEPARATOR)
 		{
-			ereport(WARNING,
+			ereport(ERROR,
 	(errcode(ERRCODE_UNDEFINED_OBJECT),
 	 errmsg("unrecognized configuration parameter \"%s\"",
 			var->name)));
diff --git a/src/pl/tcl/pltcl.c b/src/pl/tcl/pltcl.c
index e11837559d..1389aa

Re: row filtering for logical replication

2021-11-16 Thread Dilip Kumar
On Mon, Nov 15, 2021 at 2:44 PM Dilip Kumar  wrote:
>
> On Fri, Nov 12, 2021 at 3:49 PM Ajin Cherian  wrote:
> >
> > Attaching version 39-

I have reviewed, 0001* and I have a few comments on it

---
>If you choose to do the initial table synchronization, only data that satisfies
>the row filters is sent.

I think this comment is not correct; the correct statement would be
"only data that satisfies the row filters is pulled by the
subscriber".

---

---
+   The WHERE clause should contain only columns that are
+   part of the primary key or be covered  by REPLICA
+   IDENTITY otherwise, DELETE operations will not
+   be replicated. That's because old row is used and it only contains primary
+   key or columns that are part of the REPLICA IDENTITY; the
+   remaining columns are NULL. For INSERT
+   and UPDATE operations, any column might be used in the
+   WHERE clause.

I think this statement is not correct, because for UPDATE we also cannot
have filters on non-key attributes, right?  Even with just the first
patch, if there are non-updated, non-key TOAST columns, we cannot
apply filters on those.  So this comment seems misleading to me.

---

-Oidrelid = RelationGetRelid(targetrel->relation);
..
+relid = RelationGetRelid(targetrel);
+

Why is this change required?  Instead of fetching the relid
in the variable declaration, why do we need to do it separately
now?

---

+if (expr == NULL)
+ereport(ERROR,
+(errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s that cannot be
coerced to the expected type %s",

Instead of "coerced to", can we use "cast to"?  That would be in sync
with other similar user-exposed error messages.
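
i.e., something like this (a sketch of the suggested wording only; the
surrounding code is unchanged):

```c
ereport(ERROR,
		(errcode(ERRCODE_CANNOT_COERCE),
		 errmsg("row filter returns type %s that cannot be cast to the expected type %s",
				...)));
```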


+static ExprState *
+pgoutput_row_filter_init_expr(Node *rfnode)
+{
.
+/*
+ * Cache ExprState using CacheMemoryContext. This is the same code as
+ * ExecPrepareExpr() but that is not used because it doesn't use an EState.
+ * It should probably be another function in the executor to handle the
+ * execution outside a normal Plan tree context.
+ */
+oldctx = MemoryContextSwitchTo(CacheMemoryContext);
+expr = expression_planner(expr);
+exprstate = ExecInitExpr(expr, NULL);
+MemoryContextSwitchTo(oldctx);
+
+return exprstate;
+}

I can see that the caller of this function already switches to
CacheMemoryContext, so what is the point in doing it again here?
If the caller is expected to do the switch, maybe we can instead
Assert on CurrentMemoryContext.
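
That is, something like the following sketch (illustrative only):

```c
static ExprState *
pgoutput_row_filter_init_expr(Node *rfnode)
{
	/* The caller is expected to have switched to CacheMemoryContext. */
	Assert(CurrentMemoryContext == CacheMemoryContext);

	...
}
```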

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




RE: row filtering for logical replication

2021-11-16 Thread tanghy.f...@fujitsu.com
On Friday, November 12, 2021 6:20 PM Ajin Cherian  wrote:
> 
> Attaching version 39-
>

I met another problem when filtering out with the operator '~'.
Data can't be replicated as expected.

For example:
-- publisher
create table t (a text primary key);
create publication pub for table t where (a ~ 'aaa');

-- subscriber
create table t (a text primary key);
create subscription sub connection 'port=5432' publication pub;

-- publisher
insert into t values ('ab');
insert into t values ('abc');
postgres=# select * from t where (a ~ 'aaa');
  a
-----
 ab
 abc
(2 rows)

-- subscriber
postgres=# select * from t;
  a
-----
 ab
(1 row)

The second record can’t be replicated.

By the way, when I applied only the 0001 patch, I couldn't reproduce this bug,
so I think it is related to the later patches.

Regards
Tang


Re: row filtering for logical replication

2021-11-16 Thread Greg Nancarrow
On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian  wrote:
>
> Attaching version 39-
>

Thanks for the updated patch.
Some review comments:

doc/src/sgml/ref/create_publication.sgml
(1) improve comment
+ /* Set up a pstate to parse with */

"pstate" is the variable name; it would be better to say "ParseState" here.

src/test/subscription/t/025_row_filter.pl
(2) rename TAP test 025 to 026
I suggest that the t/025_row_filter.pl TAP test should be renamed to
026 now because 025 is being used by some schema TAP test.

(3) whitespace errors
The 0006 patch applies with several whitespace errors.

(4) fix crash
The pgoutput.c patch that I previously posted on this thread needs to
be applied to fix the coredump issue reported by Tang-san.
While that fixes the crash, I haven't tracked through to see
where/whether the expression nodes are actually freed or whether now
there is a possible memory leak issue that may need further
investigation.


Regards,
Greg Nancarrow




RE: Optionally automatically disable logical replication subscriptions on error

2021-11-16 Thread osumi.takami...@fujitsu.com
Thank you for checking the patch !

On Friday, November 12, 2021 1:49 PM Greg Nancarrow  wrote:
> On Thu, Nov 11, 2021 at 8:20 PM osumi.takami...@fujitsu.com
>  wrote:
> Some comments on the v4 patch:
> 
> (1) Patch subject
> I think the patch subject should say "disable" instead of "disabling":
>Optionally disable subscriptions on error
Fixed.

 
> doc/src/sgml/ref/create_subscription.sgml
> (2) spelling mistake
> +  if replicating data from the publisher triggers
> + non-transicent errors
> 
> non-transicent -> non-transient
Fixed.

 
> (I notice Vignesh also pointed this out)
> 
> src/backend/replication/logical/worker.c
> (3) calling geterrcode()
> The new IsSubscriptionDisablingError() function calls geterrcode().
> According to the function comment for geterrcode(), it is only intended for 
> use
> in error callbacks.
> Instead of calling geterrcode(), couldn't the ErrorData from PG_CATCH block be
> passed to IsSubscriptionDisablingError() instead (from which it can get the
> sqlerrcode)?
> 
> (4) DisableSubscriptionOnError
> DisableSubscriptionOnError() is again calling MemoryContextSwitch() and
> CopyErrorData().
> I think the ErrorData from the PG_CATCH block could simply be passed to
> DisableSubscriptionOnError() instead of the memory-context, and the existing
> MemoryContextSwitch() and CopyErrorData() calls could be removed from it.
> 
> AFAICS, applying (3) and (4) above would make the code a lot cleaner.
Fixed.
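
After applying (3) and (4), the PG_CATCH path is shaped roughly as follows
(a simplified sketch; the exact code and control flow are in the patch
posted in [1]):

```c
PG_CATCH();
{
	ErrorData  *errdata;

	/* Collect the error information outside the failed context. */
	MemoryContextSwitchTo(ecxt);
	errdata = CopyErrorData();

	/* Both callees now take the ErrorData directly. */
	if (IsSubscriptionDisablingError(errdata))
		DisableSubscriptionOnError(errdata);
	...
}
PG_END_CATCH();
```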

The updated patch is shared in [1].

[1] - 
https://www.postgresql.org/message-id/TYCPR01MB8373771371B31E1E6CC74B0AED999%40TYCPR01MB8373.jpnprd01.prod.outlook.com


Best Regards,
Takamichi Osumi