Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

2021-12-30 Thread Bharath Rupireddy
On Tue, Nov 30, 2021 at 6:24 AM Cary Huang  wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, passed
> Documentation:not tested
>
> Hi
>
> The patch applies and tests fine. I don't think there is any harm in having a
> little extra statistical information about the checkpoint process. In fact,
> it could be useful in identifying a bottleneck during the checkpoint process,
> as the stats show exactly the time taken to do the file IO in the pg_logical dir.

Thanks for the review. Here's the CF entry
https://commitfest.postgresql.org/36/3389/, please feel free to add
yourself as a reviewer.

Regards,
Bharath Rupireddy.




Re: skip replication slot snapshot/map file removal during end-of-recovery checkpoint

2021-12-30 Thread Bharath Rupireddy
On Thu, Dec 23, 2021 at 4:46 PM Bharath Rupireddy
 wrote:
>
> Hi,
>
> Currently the end-of-recovery checkpoint can be much slower, impacting
> the server availability, if there are many replication slot files
> .snap or map- to be enumerated and deleted. How about skipping
> the .snap and map- file handling during the end-of-recovery
> checkpoint? It makes the server available faster and the next regular
> checkpoint can deal with these files. If required, we can have a GUC
> (skip_replication_slot_file_handling or some other better name) to
> control this default being the existing behavior.
>
> Thoughts?

Here's the v1 patch, please review it.
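To sketch the idea in code (illustrative only, not the attached v1 patch; the
GUC and the wrapper function below are assumed names), the end-of-recovery
checkpoint would simply skip the pg_logical/snapshots and pg_logical/mappings
scans and leave that work to the next regular checkpoint:

static void
CheckPointLogicalFiles(int flags)
{
	/* skip_replication_slot_file_handling is an assumed GUC name */
	if (skip_replication_slot_file_handling &&
		(flags & CHECKPOINT_END_OF_RECOVERY) != 0)
		return;					/* next regular checkpoint cleans up */

	CheckPointSnapBuild();			/* prunes pg_logical/snapshots */
	CheckPointLogicalRewriteHeap();	/* prunes pg_logical/mappings */
}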

Regards,
Bharath Rupireddy.


v1-0001-Skip-processing-snapshot-mapping-files-during-end.patch
Description: Binary data


RE: [PATCH]Comment improvement in publication.sql

2021-12-30 Thread tanghy.f...@fujitsu.com
On Monday, December 13, 2021 11:53 PM vignesh C  wrote:
> 
> On Wed, Dec 8, 2021 at 11:07 AM tanghy.f...@fujitsu.com
>  wrote:
> >
> > On Wednesday, December 8, 2021 1:49 PM, vignesh C 
> wrote:
> >
> > > The patch no longer applies, could you post a rebased patch.
> >
> > Thanks for your kindly reminder. Attached a rebased patch.
> > Some changes in v4 patch has been fixed by 5a28324, so I deleted those
> changes.
> 
> Thanks for the updated patch, should we make a similar change in
> AlterPublicationTables Function header to mention Set too:
> /*
> * Add or remove table to/from publication.
> */
> static void
> AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
> List *tables, List *schemaidlist)
> 

Sorry for the late reply.

I am not sure if we need this change, because the way SET handles the tables
in a publication is to drop tables and then add tables, instead of directly
replacing the list of tables in the publication. (We can see this in the
AlterPublicationTables function.) Thoughts?
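For reference, the SET branch of AlterPublicationTables looks roughly like
this (simplified sketch, not the exact code):

	List	   *oldrelids = GetPublicationRelations(pubid,
													PUBLICATION_PART_ROOT);
	List	   *delrels = NIL;

	/* ... collect previously published tables absent from the new list ... */

	/* drop the tables that are no longer listed */
	PublicationDropTables(pubid, delrels, true);

	/* then add the new list; tables already present are simply kept */
	PublicationAddTables(pubid, rels, true, stmt);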

Regards,
Tang


Re: Patch to avoid orphaned dependencies

2021-12-30 Thread Andres Freund
Hi,

On 2021-12-17 14:19:18 +0100, Drouvot, Bertrand wrote:
> Please find enclosed v1-0003-orphaned-dependencies.patch, that contains:
> 
> - a mandatory rebase
> 
> - a few isolation tests added in src/test/modules/test_dependencies (but I'm
> not sure at all that's the right place to add them, is it?)

This fails on windows w/ msvc:

https://cirrus-ci.com/task/5368174125252608?logs=configure#L102
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.157904#L12

Greetings,

Andres Freund




Re: Adding CI to our tree

2021-12-30 Thread Andres Freund
On 2021-12-30 17:46:52 -0800, Andres Freund wrote:
> I plan to push this after another cycle of tests passing (and driving
> over the bay bridge...) unless I hear protests.

Pushed.

Marked CF entry as committed.




Re: Adding CI to our tree

2021-12-30 Thread Andres Freund
Hi,

On 2021-12-30 20:28:46 -0600, Justin Pryzby wrote:
> [ language fixes]

Thanks!

> script uses a pseudo-tty, which do support locking.
> => which does

This didn't seem right either way - it's pseudo-ttys that don't support
locking, so plural seemed appropriate? I changed it to "script uses
pseudo-ttys, which do" instead...

Regards,

Andres




Re: Adding CI to our tree

2021-12-30 Thread Justin Pryzby
commit message: agithub

the the the buildfarm.
=> the

access too.
=> to

# Due to that it using concurrency within prove is helpful.
=> Due to that, it's useful to run prove with multiple jobs.

further details src/tools/ci/README
=> further details , see src/tools/ci/README

script uses a pseudo-tty, which do support locking.
=> which does

To limit unneccessary work only run this once normal linux test succeeded
=> unnecessary
=> succeeds

-- 
Justin




Re: generic plans and "initial" pruning

2021-12-30 Thread Amit Langote
On Tue, Dec 28, 2021 at 22:12 Ashutosh Bapat 
wrote:

> On Sat, Dec 25, 2021 at 9:06 AM Amit Langote 
> wrote:
> >
> > Executing generic plans involving partitions is known to become slower
> > as partition count grows due to a number of bottlenecks, with
> > AcquireExecutorLocks() showing at the top in profiles.
> >
> > Previous attempt at solving that problem was by David Rowley [1],
> > where he proposed delaying locking of *all* partitions appearing under
> > an Append/MergeAppend until "initial" pruning is done during the
> > executor initialization phase.  A problem with that approach that he
> > has described in [2] is that leaving partitions unlocked can lead to
> > race conditions where the Plan node belonging to a partition can be
> > invalidated when a concurrent session successfully alters the
> > partition between AcquireExecutorLocks() saying the plan is okay to
> > execute and then actually executing it.
> >
> > However, using an idea that Robert suggested to me off-list a little
> > while back, it seems possible to determine the set of partitions that
> > we can safely skip locking.  The idea is to look at the "initial" or
> > "pre-execution" pruning instructions contained in a given Append or
> > MergeAppend node when AcquireExecutorLocks() is collecting the
> > relations to lock and consider relations from only those sub-nodes
> > that survive performing those instructions.   I've attempted
> > implementing that idea in the attached patch.
> >
>
> In which cases, we will have "pre-execution" pruning instructions that
> can be used to skip locking partitions? Can you please give a few
> examples where this approach will be useful?


This is mainly to be useful for prepared queries, so something like:

prepare q as select * from partitioned_table where key = $1;

And only when execute q(…) uses a generic plan. Generic plans are
problematic because they must contain nodes for all partitions (without any
plan-time pruning), which means CheckCachedPlan() has to spend time
proportional to the number of partitions to determine that the plan is
still usable / has not been invalidated; most of that time is spent in
AcquireExecutorLocks().

Other bottlenecks, not addressed in this patch, pertain to some executor
startup/shutdown subroutines that process the range table of a PlannedStmt
in its entirety, whose length is also proportional to the number of
partitions when the plan is generic.

The benchmark is showing good results, indeed.


Thanks.
-- 
Amit Langote
EDB: http://www.enterprisedb.com


Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Japin Li


On Fri, 31 Dec 2021 at 00:24, Tom Lane  wrote:
> Japin Li  writes:
>> On Thu, 30 Dec 2021 at 18:53, Guillaume Lelarge  
>> wrote:
>>> pg_dump works in a single transaction, so it's already dealt with
>>> idle_in_transaction_timeout. Though I guess setting both would work too.
>
>> Attached a fix for this, please consider reviewing it.  Thanks.
>
> This seems rather pointless to me.  The idle-session timeout is only
> activated in PostgresMain's input loop, so it will never be reached
> in autovacuum or other background workers.  (The same is true for
> idle_in_transaction_session_timeout, so the fact that somebody made
> autovacuum.c clear that looks like cargo-cult programming from here,
> not useful code.)  And as for pg_dump, how would it ever trigger the
> timeout?  It's not going to sit there thinking, especially not
> outside a transaction.
>

Thanks for the clarification!  If the timeout can never be reached, should we
remove those settings?

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Adding CI to our tree

2021-12-30 Thread Andres Freund
Hi,

Attached is v5 of the CI patch. Not a lot of changes:
- a bunch of copy-editing, wrote a commit message etc
- use ccache for CXX/CLANG in the CompilerWarnings task, I had missed
  that when making the task use all --with-* flags
- switch linux to use ossp-uuid. I tried to switch macos at first, but
  that doesn't currently seem to work.
- minor other cleanups

This time I've only attached the main CI patch, not the one making core
dumps on windows work. That's not yet committable...

I plan to push this after another cycle of tests passing (and driving
over the bay bridge...) unless I hear protests.

Greetings,

Andres Freund
>From 124ec8dea062e95931d5d2f80633e025aa7733ad Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 15 Mar 2021 09:25:15 -0700
Subject: [PATCH v5] ci: Add continuous integration for github repositories via
 cirrus-ci.

Currently FreeBSD, Linux, macOS and Windows (Visual Studio) are tested.

The main goal of this integration is to make it easier to test in-development
patches across multiple platforms. This includes improving the testing done
automatically by cfbot [1] for commitfest entries.  It is *not* the goal to
supersede the buildfarm.

cirrus-ci [2] was chosen because it was already in use for cfbot, allows using
full VMs, has good OS coverage and allows accessing the full test results
without authentication (like agithub account).  It might be worth adding
support for further CI providers, particularly ones supporting other git
forges, in the future.

To keep CI times tolerable, most platforms use pre-generated images. Some
platforms use containers, others use full VMs.

For instructions on how to enable the CI integration in a repository and
further details src/tools/ci/README

[1] http://cfbot.cputube.org/
[2] https://cirrus-ci.org/

Author: Andres Freund 
Author: Thomas Munro 
Author: Melanie Plageman 
Reviewed-By: Melanie Plageman 
Reviewed-By: Justin Pryzby 
Reviewed-By: Thomas Munro 
Reviewed-By: Peter Eisentraut 
Discussion: https://postgr.es/m/20211001222752.wrz7erzh4cajv...@alap3.anarazel.de
---
 .cirrus.yml | 547 
 src/tools/ci/README |  63 +++
 src/tools/ci/cores_backtrace.sh |  50 +++
 src/tools/ci/gcp_freebsd_repartition.sh |  28 ++
 src/tools/ci/pg_ci_base.conf|  14 +
 src/tools/ci/windows_build_config.pl|  13 +
 6 files changed, 715 insertions(+)
 create mode 100644 .cirrus.yml
 create mode 100644 src/tools/ci/README
 create mode 100755 src/tools/ci/cores_backtrace.sh
 create mode 100755 src/tools/ci/gcp_freebsd_repartition.sh
 create mode 100644 src/tools/ci/pg_ci_base.conf
 create mode 100644 src/tools/ci/windows_build_config.pl

diff --git a/.cirrus.yml b/.cirrus.yml
new file mode 100644
index 000..f49799e9a32
--- /dev/null
+++ b/.cirrus.yml
@@ -0,0 +1,547 @@
+# CI configuration file for CI utilizing cirrus-ci.org
+#
+# For instructions on how to enable the CI integration in a repository and
+# further details src/tools/ci/README
+
+
+env:
+  # Source of images / containers
+  GCP_PROJECT: pg-ci-images
+  IMAGE_PROJECT: $GCP_PROJECT
+  CONTAINER_REPO: us-docker.pkg.dev/${GCP_PROJECT}/ci
+
+  # The lower depth accelerates git clone. Use a bit of depth so that
+  # concurrent tasks and retrying older jobs has a chance of working.
+  CIRRUS_CLONE_DEPTH: 500
+  # Useful to be able to analyse what in a script takes long
+  CIRRUS_LOG_TIMESTAMP: true
+
+  CCACHE_MAXSIZE: "250M"
+
+  # target to test, for all but windows
+  CHECK: check-world PROVE_FLAGS=$PROVE_FLAGS
+  CHECKFLAGS: -Otarget
+  PROVE_FLAGS: --timer
+  PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
+  TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
+  PG_TEST_EXTRA: kerberos ldap ssl
+
+
+# What files to preserve in case tests fail
+on_failure: &on_failure
+  log_artifacts:
+path: "**/**.log"
+type: text/plain
+  regress_diffs_artifacts:
+path: "**/**.diffs"
+type: text/plain
+  tap_artifacts:
+path: "**/regress_log_*"
+type: text/plain
+
+
+task:
+  name: FreeBSD - 13
+
+  env:
+# FreeBSD on GCP is slow when running with larger number of CPUS /
+# jobs. Using one more job than cpus seems to work best.
+CPUS: 2
+BUILD_JOBS: 3
+TEST_JOBS: 3
+
+CCACHE_DIR: /tmp/ccache_dir
+
+  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*freebsd.*'
+
+  compute_engine_instance:
+image_project: $IMAGE_PROJECT
+image: family/pg-ci-freebsd-13-0
+platform: freebsd
+cpu: $CPUS
+memory: 2G
+disk: 50
+
+  sysinfo_script: |
+id
+uname -a
+ulimit -a -H && ulimit -a -S
+export
+
+  ccache_cache:
+folder: $CCACHE_DIR
+  # Workaround around performance issues due to 32KB block size
+  repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
+  create_user_script: |
+pw useradd postgres
+chown -R postgres:postgres .
+mkdir 

RE: Failed transaction statistics to measure the logical replication progress

2021-12-30 Thread tanghy.f...@fujitsu.com
On Wednesday, December 22, 2021 6:14 PM osumi.takami...@fujitsu.com 
 wrote:
> 
> Attached the new patch v19.
> 

Thanks for your patch. I think it's better if you could add this patch to the 
commitfest.
Here are some comments:

1)
+   commit_count bigint
+  
+  
+   Number of transactions successfully applied in this subscription.
+   Both COMMIT and COMMIT PREPARED increment this counter.
+  
+ 
...

I think the commands (like COMMIT, COMMIT PREPARED ...) can be surrounded with
" ", thoughts?

2)
+extern void pgstat_report_subworker_xact_end(LogicalRepWorker *repWorker,
+											  LogicalRepMsgType command,
+											  bool bforce);

Should "bforce" be "force"?

3)
+ *  This should be called before the call of process_syning_tables() so to not

"process_syning_tables()" should be "process_syncing_tables()".

4)
+void
+pgstat_send_subworker_xact_stats(LogicalRepWorker *repWorker, bool force)
+{
+   static TimestampTz last_report = 0;
+   PgStat_MsgSubWorkerXactEnd msg;
+
+   if (!force)
+   {
...
+   if (!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
+   return;
+   last_report = now;
+   }
+
...
+   if (repWorker->commit_count == 0 && repWorker->abort_count == 0)
+   return;
...

I think it's better to check commit_count and abort_count first, then check
whether PGSTAT_STAT_INTERVAL has been reached.
Otherwise, if commit_count and abort_count are 0, it is possible that
last_report gets updated even though no stats were actually sent. In that
case, last_report no longer reflects the time the last message was sent.
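In other words, something like this reordering (rough sketch based on the
function quoted above):

void
pgstat_send_subworker_xact_stats(LogicalRepWorker *repWorker, bool force)
{
	static TimestampTz last_report = 0;

	/* Nothing to send: bail out before last_report is touched. */
	if (repWorker->commit_count == 0 && repWorker->abort_count == 0)
		return;

	if (!force)
	{
		TimestampTz now = GetCurrentTimestamp();

		if (!TimestampDifferenceExceeds(last_report, now,
										PGSTAT_STAT_INTERVAL))
			return;
		last_report = now;
	}

	/* ... fill in and send PgStat_MsgSubWorkerXactEnd as before ... */
}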

Regards,
Tang


Re: Apple's ranlib warns about protocol_openssl.c

2021-12-30 Thread Tom Lane
Andres Freund  writes:
> I also see it on an m1 mini I got when building against openssl 3.

Huh, I wonder why I'm not seeing it.

> There is -no_warning_for_no_symbols in apple's ranlib. But perhaps
> there's another way around this:
> We have ssl_protocol_version_to_openssl() in both be-secure-openssl.c
> and fe-secure-openssl.c.  Perhaps we should just move it to
> protocol_openssl.c?

Those functions have the same name, but not the same arguments,
so it'd take some refactoring to share any code.

regards, tom lane




Re: Apple's ranlib warns about protocol_openssl.c

2021-12-30 Thread Andres Freund
Hi,

On 2021-12-17 14:26:53 +1300, Thomas Munro wrote:
> On Fri, Dec 17, 2021 at 9:38 AM Tom Lane  wrote:
> > Could be.  I tried it on Monterey, but not anything older.
> > (longfin is still on Big Sur, because I've been lazy about
> > updating it.)
> 
> Hmm.  Happened[1] with Andres's CI scripts, which (at least on the
> version I used here, may not be his latest) runs on macOS Monterey and
> installs openssl from brew which is apparently 3.0.0.  Wild guess:
> some versions of openssl define functions, and some define macros, and
> here we're looking for the macros?

I also see it on an m1 mini I got when building against openssl 3.

There is -no_warning_for_no_symbols in apple's ranlib. But perhaps
there's another way around this:
We have ssl_protocol_version_to_openssl() in both be-secure-openssl.c
and fe-secure-openssl.c.  Perhaps we should just move it to
protocol_openssl.c?

Greetings,

Andres Freund




Re: Column Filtering in Logical Replication

2021-12-30 Thread Alvaro Herrera
On 2021-Dec-30, Justin Pryzby wrote:

Thank you!  I've incorporated your proposed fixes.

> > +   /*
> > +* Even if the user listed all columns in the column list, we 
> > cannot
> > +* allow a column list to be specified when REPLICA IDENTITY is 
> > FULL;
> > +* that would cause problems if a new column is added later, 
> > because
> > +* that could would have to be included (because of being part 
> > of the
> 
> could would is wrong

Hah, yeah, this was "that column would".

> > + * Gets a list of OIDs of all column-partial publications of the given
> > + * relation, that is, those that specify a column list.
> 
> I would call this a "partial-column" publication.

OK, done that way.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
>From dd2515ee7e0b37f82c76edc4fe890bb7be1abb3e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 30 Dec 2021 19:49:31 -0300
Subject: [PATCH v14 1/2] Avoid use of DEFELEM enum in AlterPublicationStmt

This allows to add new values for future functionality.

Discussion: https://postgr.es/m/202112302021.ca7ihogysgh3@alvherre.pgsql
---
 src/backend/commands/publicationcmds.c | 18 +-
 src/backend/parser/gram.y  |  6 +++---
 src/include/nodes/parsenodes.h | 11 +--
 3 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index f932f47a08..0f04969fd6 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -503,12 +503,12 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
 	 * possible that user has not specified any tables in which case we need
 	 * to remove all the existing tables.
 	 */
-	if (!tables && stmt->action != DEFELEM_SET)
+	if (!tables && stmt->action != AP_SetObjects)
 		return;
 
 	rels = OpenTableList(tables);
 
-	if (stmt->action == DEFELEM_ADD)
+	if (stmt->action == AP_AddObjects)
 	{
 		List	   *schemas = NIL;
 
@@ -521,9 +521,9 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
 			  PUBLICATIONOBJ_TABLE);
 		PublicationAddTables(pubid, rels, false, stmt);
 	}
-	else if (stmt->action == DEFELEM_DROP)
+	else if (stmt->action == AP_DropObjects)
 		PublicationDropTables(pubid, rels, false);
-	else		/* DEFELEM_SET */
+	else		/* AP_SetObjects */
 	{
 		List	   *oldrelids = GetPublicationRelations(pubid,
 		PUBLICATION_PART_ROOT);
@@ -598,7 +598,7 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
 	 * possible that user has not specified any schemas in which case we need
 	 * to remove all the existing schemas.
 	 */
-	if (!schemaidlist && stmt->action != DEFELEM_SET)
+	if (!schemaidlist && stmt->action != AP_SetObjects)
 		return;
 
 	/*
@@ -606,7 +606,7 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
 	 * concurrent schema deletion.
 	 */
 	LockSchemaList(schemaidlist);
-	if (stmt->action == DEFELEM_ADD)
+	if (stmt->action == AP_AddObjects)
 	{
 		List	   *rels;
 		List	   *reloids;
@@ -620,9 +620,9 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
 		CloseTableList(rels);
 		PublicationAddSchemas(pubform->oid, schemaidlist, false, stmt);
 	}
-	else if (stmt->action == DEFELEM_DROP)
+	else if (stmt->action == AP_DropObjects)
 		PublicationDropSchemas(pubform->oid, schemaidlist, false);
-	else		/* DEFELEM_SET */
+	else		/* AP_SetObjects */
 	{
 		List	   *oldschemaids = GetPublicationSchemas(pubform->oid);
 		List	   *delschemas = NIL;
@@ -657,7 +657,7 @@ CheckAlterPublication(AlterPublicationStmt *stmt, HeapTuple tup,
 {
 	Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
 
-	if ((stmt->action == DEFELEM_ADD || stmt->action == DEFELEM_SET) &&
+	if ((stmt->action == AP_AddObjects || stmt->action == AP_SetObjects) &&
 		schemaidlist && !superuser())
 		ereport(ERROR,
 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index f3c232842d..6dddc07947 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9828,7 +9828,7 @@ AlterPublicationStmt:
 	n->pubname = $3;
 	n->pubobjects = $5;
 	preprocess_pubobj_list(n->pubobjects, yyscanner);
-	n->action = DEFELEM_ADD;
+	n->action = AP_AddObjects;
 	$$ = (Node *)n;
 }
 			| ALTER PUBLICATION name SET pub_obj_list
@@ -9837,7 +9837,7 @@ AlterPublicationStmt:
 	n->pubname = $3;
 	n->pubobjects = $5;
 	preprocess_pubobj_list(n->pubobjects, yyscanner);
-	n->action = DEFELEM_SET;
+	n->action = AP_SetObjects;
 	$$ = (Node *)n;
 }
 			| ALTER PUBLICATION name DROP pub_obj_list
@@ -9846,7 +9846,7 @@ AlterPublicationStmt:
 	n->pubname = $3;
 	n->pubobjects = $5;
 	preprocess_pubobj_list(n->pubobjects, yyscanner);
-	n->action = DEFELEM_DROP;
+	n->action = AP_DropObjects;
 	$$ = (Node *)n;
 }
 		;
diff 

More pg_dump performance hacking

2021-12-30 Thread Tom Lane
Attached are a couple of patches for loose ends that I didn't
get to when I was working on pg_dump before the last CF.

0001 removes all the "username_subquery" subqueries in favor
of doing local username lookups.  On the regression database
with no extra roles, it seems to be more or less a wash ...
but if I create 100 roles, then the patch seems to save five
or ten percent compared to HEAD.

I also got rid of the rather-pointless-IMO checks for pg_authid
join failures, in favor of having the lookup subroutine just
fatal() if it doesn't find a match.  I don't think we need to
burden translators with all those strings for cases that shouldn't
happen.  Note that a lot of object types weren't checking
for this condition anyway, making it even more pointless.
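For illustration, the lookup subroutine amounts to a binary search over the
pre-collected, OID-sorted table, roughly like this (the actual patch may
differ in details):

static const char *
getRoleName(const char *roleoid_str)
{
	Oid			roleoid = atooid(roleoid_str);
	int			low = 0,
				high = nrolenames - 1;

	while (low <= high)
	{
		int			middle = low + (high - low) / 2;

		if (rolenames[middle].roleoid == roleoid)
			return rolenames[middle].rolename;
		if (rolenames[middle].roleoid < roleoid)
			low = middle + 1;
		else
			high = middle - 1;
	}

	fatal("role with OID %u does not exist", roleoid);
	return NULL;				/* keep compiler quiet */
}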

0002 is a very small patch that gets rid of an extra subquery
for identity-sequence checking, realizing that the LEFT JOIN
in the FROM clause will have picked up that row already,
if it exists.  This again saves a few percent for
"pg_dump -s regression", though the effects would depend a lot
on how many sequences you have.

These don't seem complicated enough to require real review,
so I plan to just push them, unless there are objections.

regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 94f1f32558..76226de485 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -66,6 +66,12 @@
 #include "pg_dump.h"
 #include "storage/block.h"
 
+typedef struct
+{
+	Oid			roleoid;		/* role's OID */
+	const char *rolename;		/* role's name */
+} RoleNameItem;
+
 typedef struct
 {
 	const char *descr;			/* comment for an object */
@@ -93,9 +99,6 @@ typedef enum OidOptions
 /* global decls */
 static bool dosync = true;		/* Issue fsync() to make dump durable on disk. */
 
-/* subquery used to convert user ID (eg, datdba) to user name */
-static const char *username_subquery;
-
 static Oid	g_last_builtin_oid; /* value of the last builtin oid */
 
 /* The specified names/patterns should to match at least one entity */
@@ -130,6 +133,10 @@ static const CatalogId nilCatalogId = {0, 0};
 static bool have_extra_float_digits = false;
 static int	extra_float_digits;
 
+/* sorted table of role names */
+static RoleNameItem *rolenames = NULL;
+static int	nrolenames = 0;
+
 /* sorted table of comments */
 static CommentItem *comments = NULL;
 static int	ncomments = 0;
@@ -174,6 +181,8 @@ static void expand_table_name_patterns(Archive *fout,
 static NamespaceInfo *findNamespace(Oid nsoid);
 static void dumpTableData(Archive *fout, const TableDataInfo *tdinfo);
 static void refreshMatViewData(Archive *fout, const TableDataInfo *tdinfo);
+static const char *getRoleName(const char *roleoid_str);
+static void collectRoleNames(Archive *fout);
 static void getAdditionalACLs(Archive *fout);
 static void dumpCommentExtended(Archive *fout, const char *type,
 const char *name, const char *namespace,
@@ -744,9 +753,6 @@ main(int argc, char **argv)
 	if (fout->isStandby)
 		dopt.no_unlogged_table_data = true;
 
-	/* Select the appropriate subquery to convert user IDs to names */
-	username_subquery = "SELECT rolname FROM pg_catalog.pg_roles WHERE oid =";
-
 	/*
 	 * Find the last built-in OID, if needed (prior to 8.1)
 	 *
@@ -814,6 +820,11 @@ main(int argc, char **argv)
 	if (dopt.include_everything && !dopt.schemaOnly && !dopt.dontOutputBlobs)
 		dopt.outputBlobs = true;
 
+	/*
+	 * Collect role names so we can map object owner OIDs to names.
+	 */
+	collectRoleNames(fout);
+
 	/*
 	 * Now scan the database and create DumpableObject structs for all the
 	 * objects we intend to dump.
@@ -2737,7 +2748,7 @@ dumpDatabase(Archive *fout)
 	int			i_tableoid,
 i_oid,
 i_datname,
-i_dba,
+i_datdba,
 i_encoding,
 i_collate,
 i_ctype,
@@ -2771,7 +2782,7 @@ dumpDatabase(Archive *fout)
 	if (fout->remoteVersion >= 90300)
 	{
 		appendPQExpBuffer(dbQry, "SELECT tableoid, oid, datname, "
-		  "(%s datdba) AS dba, "
+		  "datdba, "
 		  "pg_encoding_to_char(encoding) AS encoding, "
 		  "datcollate, datctype, datfrozenxid, datminmxid, "
 		  "datacl, acldefault('d', datdba) AS acldefault, "
@@ -2780,13 +2791,12 @@ dumpDatabase(Archive *fout)
 		  "shobj_description(oid, 'pg_database') AS description "
 
 		  "FROM pg_database "
-		  "WHERE datname = current_database()",
-		  username_subquery);
+		  "WHERE datname = current_database()");
 	}
 	else
 	{
 		appendPQExpBuffer(dbQry, "SELECT tableoid, oid, datname, "
-		  "(%s datdba) AS dba, "
+		  "datdba, "
 		  "pg_encoding_to_char(encoding) AS encoding, "
 		  "datcollate, datctype, datfrozenxid, 0 AS datminmxid, "
 		  "datacl, acldefault('d', datdba) AS acldefault, "
@@ -2795,8 +2805,7 @@ dumpDatabase(Archive *fout)
 		  "shobj_description(oid, 'pg_database') AS description "
 
 		  "FROM pg_database "
-		  "WHERE datname = current_database()",
-		  

Re: Column Filtering in Logical Replication

2021-12-30 Thread Justin Pryzby
> +	bool		am_partition = false;
>...
>   Assert(!isnull);
>   lrel->relkind = DatumGetChar(slot_getattr(slot, 3, &isnull));
>   Assert(!isnull);
> +	am_partition = DatumGetChar(slot_getattr(slot, 4, &isnull));

I think this needs to be GetBool.
You should Assert(!isnull) like the others.
Also, I think it doesn't need to be initialized to "false".
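Putting those together, e.g.:

	am_partition = DatumGetBool(slot_getattr(slot, 4, &isnull));
	Assert(!isnull);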

> + /*
> +  * Even if the user listed all columns in the column list, we 
> cannot
> +  * allow a column list to be specified when REPLICA IDENTITY is 
> FULL;
> +  * that would cause problems if a new column is added later, 
> because
> +  * that could would have to be included (because of being part 
> of the

could would is wrong

> + /*
> +  * Translate list of columns to attnums. We prohibit system attributes 
> and
> +  * make sure there are no duplicate columns.
> +  *
> +  */

extraneous line

> +/*
> + * Gets a list of OIDs of all column-partial publications of the given
> + * relation, that is, those that specify a column list.

I would call this a "partial-column" publication.

> + errmsg("cannot set REPLICA IDENTITY 
> FULL when column-partial publications exist"));
> +  * Check column-partial publications.  All publications have to include 
> all

same

> + /*
> +  * Store the column names only if they are contained in column filter

period(.)

> +  * LogicalRepRelation will only contain attributes corresponding to 
> those
> +  * specficied in column filters.

specified

> --- a/src/include/catalog/pg_publication_rel.h
> +++ b/src/include/catalog/pg_publication_rel.h
> @@ -31,6 +31,9 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId)
>   Oid oid;/* oid */
>   Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of 
> the publication */
>   Oid prrelid BKI_LOOKUP(pg_class);   /* Oid of the 
> relation */
> +#ifdef CATALOG_VARLEN
> + int2vector  prattrs;/* Variable length field starts 
> here */
> +#endif

The language in the pre-existing comments is better:
/* variable-length fields start here */

> @@ -791,12 +875,13 @@ fetch_remote_table_info(char *nspname, char *relname,
>
> ExecClearTuple(slot);
> }
> +
> ExecDropSingleTupleTableSlot(slot);
> +   walrcv_clear_result(res);
> +   pfree(cmd.data);
>
> lrel->natts = natt;
>
> -   walrcv_clear_result(res);
> -   pfree(cmd.data);
>  }

The blank line after "lrel->natts = natt;" should be removed.




Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2021-12-30 Thread Andrew Dunstan


On 12/30/21 15:01, Thomas Munro wrote:
> Hi,
>
> There's a wait for replay that is open coded (instead of using the
> wait_for_catchup() routine), and sometimes the second of two such
> waits at line 51 (in master) times out after 3 minutes with "standby
> never caught up".  It's happening on three particular Windows boxes,
> but once also happened on the AIX box "tern".
>
>     branch     |  animal   | count
> ---------------+-----------+-------
>  HEAD          | drongo    |     1
>  HEAD          | fairywren |     8
>  REL_10_STABLE | drongo    |     3
>  REL_10_STABLE | fairywren |    10
>  REL_10_STABLE | jacana    |     3
>  REL_11_STABLE | drongo    |     1
>  REL_11_STABLE | fairywren |     4
>  REL_11_STABLE | jacana    |     3
>  REL_12_STABLE | drongo    |     2
>  REL_12_STABLE | fairywren |     5
>  REL_12_STABLE | jacana    |     1
>  REL_12_STABLE | tern      |     1
>  REL_13_STABLE | fairywren |     3
>  REL_14_STABLE | drongo    |     2
>  REL_14_STABLE | fairywren |     6
>
>  


FYI, drongo and fairywren are run on the same AWS/EC2 Windows Server
2019 instance. Nothing else runs on it.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-12-30 Thread Melanie Plageman
On Tue, Dec 21, 2021 at 8:32 PM Melanie Plageman
 wrote:
> On Thu, Dec 16, 2021 at 3:18 PM Andres Freund  wrote:
> > > > > From 9f22da9041e1e1fbc0ef003f5f78f4e72274d438 Mon Sep 17 00:00:00 2001
> > > > > From: Melanie Plageman 
> > > > > Date: Wed, 24 Nov 2021 12:20:10 -0500
> > > > > Subject: [PATCH v17 6/7] Remove superfluous bgwriter stats
> > > > >
> > > > > Remove stats from pg_stat_bgwriter which are now more clearly 
> > > > > expressed
> > > > > in pg_stat_buffers.
> > > > >
> > > > > TODO:
> > > > > - make pg_stat_checkpointer view and move relevant stats into it
> > > > > - add additional stats to pg_stat_bgwriter
> > > >
> > > > When do you think it makes sense to tackle these wrt committing some of 
> > > > the
> > > > patches?
> > >
> > > Well, the new stats are a superset of the old stats (no stats have been
> > > removed that are not represented in the new or old views). So, I don't
> > > see that as a blocker for committing these patches.
> >
> > > Since it is weird that pg_stat_bgwriter had mostly checkpointer stats,
> > > I've edited this commit to rename that view to pg_stat_checkpointer.
> >
> > > I have not made a separate view just for maxwritten_clean (presumably
> > > called pg_stat_bgwriter), but I would not be opposed to doing this if
> > > you thought having a view with a single column isn't a problem (in the
> > > event that we don't get around to adding more bgwriter stats right
> > > away).
> >
> > How about keeping old bgwriter values in place in the view , but generated
> > from the new stats stuff?
>
> I tried this, but I actually don't think it is the right way to go. In
> order to maintain the old view with the new source code, I had to add
> new code to maintain a separate resets array just for the bgwriter view.
> It adds some fiddly code that will be annoying to maintain (the reset
> logic is confusing enough as is).
> And, besides the implementation complexity, if a user resets
> pg_stat_bgwriter and not pg_stat_buffers (or vice versa), they will
> see totally different numbers for "buffers_backend" in pg_stat_bgwriter
> than shared buffers written by B_BACKEND in pg_stat_buffers. I would
> find that confusing.

In a quick chat off-list, Andres suggested it might be okay to have a
single reset target for both the pg_stat_buffers view and legacy
pg_stat_bgwriter view. So, I am planning to share a new patchset which
has only the new "buffers" target which will also reset the legacy
pg_stat_bgwriter view.

I'll also remove the bgwriter stats I proposed and the
pg_stat_checkpointer view to keep things simple for now.

- Melanie




Re: Column Filtering in Logical Replication

2021-12-30 Thread Alvaro Herrera
On 2021-Dec-29, Alvaro Herrera wrote:

> This new stuff is not yet finished.  For example I didn't refactor
> handling of REPLICA IDENTITY, so the new command does not correctly
> check everything, such as the REPLICA IDENTITY FULL stuff.  Also, no
> tests have been added yet.  In manual tests it seems to behave as
> expected.

Fixing the lack of check for replica identity full didn't really require
much refactoring, so I did it that way.

I split it with some trivial fixes that can be committed separately
ahead of time.  I'm thinking in committing 0001 later today, perhaps
0002 tomorrow.  The interesting part is 0003.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
>From 0453eb6397803ce4dd607fd3a17a12d573eb2c90 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 30 Dec 2021 16:48:09 -0300
Subject: [PATCH v13 1/3] Small cleanups for publicationcmds.c and
 pg_publication.c

This fixes a typo in a local function name, completes an existing comment,
and renames an unhappily named local variable.
---
 src/backend/catalog/pg_publication.c   | 11 ++-
 src/backend/commands/publicationcmds.c | 16 +---
 src/backend/commands/tablecmds.c   |  3 +--
 src/backend/parser/gram.y  |  5 +++--
 4 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 62f10bcbd2..b307bc2ed5 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -287,7 +287,7 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 	Datum		values[Natts_pg_publication_rel];
 	bool		nulls[Natts_pg_publication_rel];
 	Oid			relid = RelationGetRelid(targetrel->relation);
-	Oid			prrelid;
+	Oid			pubreloid;
 	Publication *pub = GetPublication(pubid);
 	ObjectAddress myself,
 referenced;
@@ -320,9 +320,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 	memset(values, 0, sizeof(values));
 	memset(nulls, false, sizeof(nulls));
 
-	prrelid = GetNewOidWithIndex(rel, PublicationRelObjectIndexId,
- Anum_pg_publication_rel_oid);
-	values[Anum_pg_publication_rel_oid - 1] = ObjectIdGetDatum(prrelid);
+	pubreloid = GetNewOidWithIndex(rel, PublicationRelObjectIndexId,
+   Anum_pg_publication_rel_oid);
+	values[Anum_pg_publication_rel_oid - 1] = ObjectIdGetDatum(pubreloid);
 	values[Anum_pg_publication_rel_prpubid - 1] =
 		ObjectIdGetDatum(pubid);
 	values[Anum_pg_publication_rel_prrelid - 1] =
@@ -334,7 +334,8 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 	CatalogTupleInsert(rel, tup);
 	heap_freetuple(tup);
 
-	ObjectAddressSet(myself, PublicationRelRelationId, prrelid);
+	/* Register dependencies as needed */
+	ObjectAddressSet(myself, PublicationRelRelationId, pubreloid);
 
 	/* Add dependency on the publication */
 	ObjectAddressSet(referenced, PublicationRelationId, pubid);
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 404bb5d0c8..3466c57dc0 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -48,7 +48,7 @@
 #include "utils/syscache.h"
 #include "utils/varlena.h"
 
-static List *OpenReliIdList(List *relids);
+static List *OpenRelIdList(List *relids);
 static List *OpenTableList(List *tables);
 static void CloseTableList(List *rels);
 static void LockSchemaList(List *schemalist);
@@ -499,8 +499,9 @@ AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
 	Oid			pubid = pubform->oid;
 
 	/*
-	 * It is quite possible that for the SET case user has not specified any
-	 * tables in which case we need to remove all the existing tables.
+	 * Nothing to do if no objects, except in SET: for that it is quite
+	 * possible that user has not specified any schemas in which case we need
+	 * to remove all the existing schemas.
 	 */
 	if (!tables && stmt->action != DEFELEM_SET)
 		return;
@@ -593,8 +594,9 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
 	Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup);
 
 	/*
-	 * It is quite possible that for the SET case user has not specified any
-	 * schemas in which case we need to remove all the existing schemas.
+	 * Nothing to do if no objects, except in SET: for that it is quite
+	 * possible that user has not specified any schemas in which case we need
+	 * to remove all the existing schemas.
 	 */
 	if (!schemaidlist && stmt->action != DEFELEM_SET)
 		return;
@@ -610,7 +612,7 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
 		List	   *reloids;
 
 		reloids = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
-		rels = OpenReliIdList(reloids);
+		rels = OpenRelIdList(reloids);
 
 		CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
 			  PUBLICATIONOBJ_TABLE_IN_SCHEMA);
@@ -868,7 +870,7 @@ RemovePublicationSchemaById(Oid psoid)
  * add them to a publication.
  */
 static List *

Why is src/test/modules/committs/t/002_standby.pl flaky?

2021-12-30 Thread Thomas Munro
Hi,

There's a wait for replay that is open coded (instead of using the
wait_for_catchup() routine), and sometimes the second of two such
waits at line 51 (in master) times out after 3 minutes with "standby
never caught up".  It's happening on three particular Windows boxes,
but once also happened on the AIX box "tern".

    branch     |  animal   | count
---------------+-----------+-------
 HEAD          | drongo    |     1
 HEAD          | fairywren |     8
 REL_10_STABLE | drongo    |     3
 REL_10_STABLE | fairywren |    10
 REL_10_STABLE | jacana    |     3
 REL_11_STABLE | drongo    |     1
 REL_11_STABLE | fairywren |     4
 REL_11_STABLE | jacana    |     3
 REL_12_STABLE | drongo    |     2
 REL_12_STABLE | fairywren |     5
 REL_12_STABLE | jacana    |     1
 REL_12_STABLE | tern      |     1
 REL_13_STABLE | fairywren |     3
 REL_14_STABLE | drongo    |     2
 REL_14_STABLE | fairywren |     6

 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-30%2014:42:30
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-30%2013:13:22
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-30%2006:03:07
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-22%2011:37:37
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-22%2010:46:07
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-22%2009:03:06
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-17%2004:59:17
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-17%2003:59:51
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-16%2004:37:58
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-15%2009:57:14
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-15%2002:38:43
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-14%2020:42:15
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-14%2012:08:41
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-14%2000:35:32
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-13%2023:40:11
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-13%2022:47:25
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-09%2006:59:10
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-09%2006:04:04
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-09%2001:36:09
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-08%2019:20:35
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-08%2018:04:28
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-08%2014:12:32
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-12-08%2011:15:58
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-12-08%2004:04:22
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2021-12-03%2017:31:49
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-11-11%2015:58:55
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-10-02%2022:00:17
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-09-09%2005:16:43
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-08-24%2004:45:09
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2021-07-17%2010:57:49
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2021-06-12%2016:05:32
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-02-07%2012:59:43
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2020-03-24%2012:49:50
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2020-02-01%2018:00:27
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2020-02-01%2017:26:27
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2020-01-30%2023:49:49
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2019-12-22%2014:19:02
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-12-13%2000:12:11
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-12-09%2006:02:05
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-12-06%2003:07:42
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-11-02%2014:41:04
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-10-25%2013:12:08
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-10-24%2013:12:41
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-10-23%2023:10:00
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2019-10-23%2018:00:39
 

Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

2021-12-30 Thread Andres Freund
Hi,

On 2021-12-29 23:06:31 -0800, SATYANARAYANA NARLAPURAM wrote:
> I am afraid there are problems with making the RPO check post releasing the
> locks. By this time the transaction is committed and visible to the other
> backends (ProcArrayEndTransaction is already called) though the intention
> is to block committing transactions that violate the defined RPO.

Shrug. Anything transaction based has way bigger holes than this.


> Even though we block existing connections starting a new transaction, it is
> possible to do writes by opening a new connection / canceling the query.

If your threat model is users explicitly trying to circumvent this they can
cause problems much more easily. Trigger a bunch of vacuums, big COPYs etc.


> I am not too much worried about the lock contention as the system is already
> hosed because of the policy. This behavior is very similar to what happens
> when the Sync standby is not responding. Thoughts?

I don't see why we'd bury ourselves deeper in problems just because we already
have a problem. There's reasons why we want to do the delay for syncrep be
before xact completion - but I don't see those applying to WAL throttling to a
significant degree, particularly not when it's on a transaction level.

Greetings,

Andres Freund




Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Guillaume Lelarge
On Thu, Dec 30, 2021 at 17:25, Tom Lane  wrote:

> Japin Li  writes:
> > On Thu, 30 Dec 2021 at 18:53, Guillaume Lelarge 
> wrote:
> >> pg_dump works in a single transaction, so it's already dealt with
> >> idle_in_transaction_timeout. Though I guess setting both would work too.
>
> > Attached a fix for this, please consider reviewing it.  Thanks.
>
> This seems rather pointless to me.  The idle-session timeout is only
> activated in PostgresMain's input loop, so it will never be reached
> in autovacuum or other background workers.  (The same is true for
> idle_in_transaction_session_timeout, so the fact that somebody made
> autovacuum.c clear that looks like cargo-cult programming from here,
> not useful code.)  And as for pg_dump, how would it ever trigger the
> timeout?  It's not going to sit there thinking, especially not
> outside a transaction.
>
>
Agreed. It makes more sense. So no need for the patch. Thanks to both.


-- 
Guillaume.


Re: Strange path from pgarch_readyXlog()

2021-12-30 Thread Bossart, Nathan
On 12/29/21, 3:11 PM, "Tom Lane"  wrote:
> "Bossart, Nathan"  writes:
>> This crossed my mind, too.  I also think one of the arrays can be
>> eliminated in favor of just using the heap (after rebuilding with a
>> reversed comparator).  Here is a minimally-tested patch that
>> demonstrates what I'm thinking.
>
> I already pushed a patch that de-static-izes those arrays, so this
> needs rebased at least.  However, now that you mention it it does
> seem like maybe the intermediate arch_files[] array could be dropped
> in favor of just pulling the next file from the heap.
>
> The need to reverse the heap's sort order seems like a problem
> though.  I really dislike the kluge you used here with a static flag
> that inverts the comparator's sort order behind the back of the
> binary-heap mechanism.  It seems quite accidental that that doesn't
> fall foul of asserts or optimizations in binaryheap.c.  For
> instance, if binaryheap_build decided it needn't do anything when
> bh_has_heap_property is already true, this code would fail.  In any
> case, we'd need to spend O(n) time inverting the heap's sort order,
> so this'd likely be slower than the current code.
>
> On the whole I'm inclined not to bother trying to optimize this
> further.  The main thing that concerned me is that somebody would
> bump up NUM_FILES_PER_DIRECTORY_SCAN to the point where the static
> space consumption becomes really problematic, and we've fixed that.

Your assessment seems reasonable to me.  If there was a better way to
adjust the comparator for the heap, maybe there would be a stronger
case for this approach.  I certainly don't think it's worth inventing
something for just this use-case.

Thanks for fixing this!

Nathan



Pluggable toaster

2021-12-30 Thread Teodor Sigaev

Hi!

We are working on a custom toaster for JSONB [1], because the current TOAST is
universal for any data type and because of that it has some disadvantages:

   - "one toast fits all" may not be the best solution for a particular
     type or/and use cases
   - it doesn't know the internal structure of the data type, so it cannot
     choose an optimal toast strategy
   - it can't share common parts between different rows and even
     versions of rows

Modifying the current toaster for all tasks and cases looks too complex;
moreover, it will not work for custom data types. Postgres is an extensible
database, so why not extend its extensibility even further and have pluggable
TOAST! We propose to separate the toaster from the heap using a toaster API,
similar to the table AM API etc. The following patches apply on top of the
patch in [1].


1) 1_toaster_interface_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_interface
  Introduces syntax for storage and a formal toaster API. Adds a column
atttoaster to pg_attribute; by design this column should not be the invalid
OID for any toastable datatype, i.e. it must have a correct OID for any type
(not column) with non-plain storage. Since a toaster may support only
particular datatypes, core should check the correctness of the toaster being
set via the toaster's validate method. The new commands can be found in
src/test/regress/sql/toaster.sql.


The on-disk toast pointer structure now has one more possible struct,
varatt_custom, with a fixed header and a variable tail which is used as
storage for custom toasters. The format of the built-in toaster is kept to
allow simple pg_upgrade logic.


Since a column's toaster could be changed during the table's lifetime, we had
two options for the toaster drop operation:

   - if a column's toaster has been changed, then we need to re-toast all
     values, which could be extremely expensive. In any case,
     functions/operators should be ready to work with values toasted by
     different toasters, although any toaster should support the simple
     toast/detoast operations, which allows any existing code to
     work with the new approach. Tracking dependencies between toasters and
     rows looks like a bad idea.
   - disallow dropping a toaster. We don't believe that there will be many
     toasters at the same time (the number of AMs isn't very high either, and
     we don't believe that it will change significantly in the near
     future), so prohibiting the dropping of toasters looks reasonable.
In this patch set we chose the second option.

The toaster API includes a get_vtable method, which is intended to give
access to custom toaster features not covered by this API.  The idea is that
the toaster returns a structure with some values and/or pointers to the
toaster's methods, and the caller can use it for particular purposes, see
patch 4). The kind of structure is identified by a magic number, which should
be the first field of the structure.
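To make the shape of the API concrete, a routine table could look roughly
like the following sketch (the member names here are illustrative only, by
analogy with TableAmRoutine; see the patches for the real definitions):

typedef struct TsrRoutine
{
	NodeTag		type;

	/* toast a new value and return a toast pointer datum */
	Datum		(*toast) (Relation toastrel, Datum value, int options);

	/* reconstruct the original value from a toast pointer */
	Datum		(*detoast) (Datum toastptr);

	/* check that this toaster can handle the given type / storage mode */
	bool		(*toastervalidate) (Oid typeoid, char storage, bool is_default);

	/* return a magic-tagged struct of extra, toaster-specific methods */
	void	   *(*get_vtable) (Datum toastptr);
} TsrRoutine;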


Also added contrib/dummy_toaster to simplify checking.

psql/pg_dump are modified to support toaster object concept.

2) 2_toaster_default_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_default
The built-in toaster is implemented (with some refactoring) using the toaster
API as the generic (or default) toaster.  dummy_toaster here is a minimal
workable example: it stores the value directly in the toast pointer and fails
if the value is greater than 1kB.


3) 3_toaster_snapshot_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_snapshot
The patch implements a technique to distinguish row versions in toasted
values, so that common parts of toasted values can be shared between
different versions of rows.


4) 4_bytea_appendable_toaster_v1.patch.gz
https://github.com/postgrespro/postgres/tree/bytea_appendable_toaster
A contrib module that implements a toaster for non-compressed bytea columns,
which allows fast appending to an existing bytea value. The appended tail is
stored directly in the toast pointer if there is enough room for it.


Note: the patch modifies byteacat() to support the contrib toaster. That
looks ugly, and the contrib module should probably create a new concatenation
function instead.


We are open to any questions, discussions, objections and advice.
Thank you.

People behind:
Oleg Bartunov
Nikita Gluhov
Nikita Malakhov
Teodor Sigaev


[1]
https://www.postgresql.org/message-id/flat/de83407a-ae3d-a8e1-a788-920eb334f...@sigaev.ru 



--
Teodor Sigaev  E-mail: teo...@sigaev.ru
  WWW: http://www.sigaev.ru/

4_bytea_appendable_toaster_v1.patch.gz
Description: application/gzip


3_toaster_snapshot_v1.patch.gz
Description: application/gzip


2_toaster_default_v1.patch.gz
Description: application/gzip


1_toaster_interface_v1.patch.gz
Description: application/gzip


Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Tom Lane
Japin Li  writes:
> On Thu, 30 Dec 2021 at 18:53, Guillaume Lelarge  
> wrote:
>> pg_dump works in a single transaction, so it's already dealt with
>> idle_in_transaction_timeout. Though I guess setting both would work too.

> Attached a fix for this, please consider reviewing it.  Thanks.

This seems rather pointless to me.  The idle-session timeout is only
activated in PostgresMain's input loop, so it will never be reached
in autovacuum or other background workers.  (The same is true for
idle_in_transaction_session_timeout, so the fact that somebody made
autovacuum.c clear that looks like cargo-cult programming from here,
not useful code.)  And as for pg_dump, how would it ever trigger the
timeout?  It's not going to sit there thinking, especially not
outside a transaction.
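For illustration, the timeout is only armed when the backend goes idle
waiting for client input, roughly like this simplified bit of PostgresMain's
idle handling, which autovacuum and other background workers never execute:

	if (IdleSessionTimeout > 0)
	{
		idle_session_timeout_enabled = true;
		enable_timeout_after(IDLE_SESSION_TIMEOUT, IdleSessionTimeout);
	}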

regards, tom lane




Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Guillaume Lelarge
On Thu, Dec 30, 2021 at 12:01, Japin Li  wrote:

>
> On Thu, 30 Dec 2021 at 18:53, Guillaume Lelarge 
> wrote:
> > On Thu, Dec 30, 2021 at 11:44, Japin Li  wrote:
> >
> >>
> > pg_dump works in a single transaction, so it's already dealt with
> > idle_in_transaction_timeout. Though I guess setting both would work too.
>
> Attached a fix for this, please consider reviewing it.  Thanks.
>
>
I've read it and it really looks like what I would have done. Sounds good
to me.


-- 
Guillaume.


Re: Fix BUG #17335: Duplicate result rows in Gather node

2021-12-30 Thread Dilip Kumar
On Thu, Dec 30, 2021 at 4:44 PM Yura Sokolov 
wrote:

> Good day, hackers.
>
> Problem:
> - Append path is created with explicitly parallel_aware = true
> - It has two children: one is trivial, the other is parallel_aware = false.
>   The trivial child is dropped.
> - Gather/GatherMerge path takes the Append path as a child and thinks
>   its child is parallel_aware = true.
> - But the Append path is removed at the end since it has only one child.
> - Now Gather/GatherMerge thinks its child is parallel_aware, but it
>   is not.
>   Gather/GatherMerge runs its child twice: in a worker and in the leader,
>   and gathers the same rows twice.
>
> Reproduction code attached (repro.sql. Included as a test as well).
>

Yeah, this is a problem.


>
> Suggested quick (and valid) fix in the attached patch:
> - If an Append has a single child, then copy its parallel awareness.
>
> The bug was introduced with commit 8edd0e79460b414b1d971895312e549e95e12e4f
> "Suppress Append and MergeAppend plan nodes that have a single child."
>
> During discussion, it was supposed [1] that those fields should be copied:
>
> > I haven't looked into whether this does the right things for parallel
> > planning --- possibly create_[merge]append_path need to propagate up
> > parallel-related path fields from the single child?
>
> But it was not so obvious [2].
>
> A better fix could also remove the Gather/GatherMerge node if
> its child is not parallel aware.
>

The Gather path will only be created if we have an underlying partial path,
so I think if we are generating the append path only from the non-partial
paths, then we can check whether the number of child nodes is just 1 and, if
so, not generate the partial append path; from that we will not generate the
partial join, and eventually the Gather will be avoided.
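For reference, the quick fix described above would amount to something like
this in create_append_path() (sketch only; the exact placement is assumed):

	if (list_length(pathnode->subpaths) == 1)
	{
		Path	   *child = (Path *) linitial(pathnode->subpaths);

		/* single surviving child: inherit its parallel properties */
		pathnode->path.parallel_aware = child->parallel_aware;
		pathnode->path.parallel_safe = child->parallel_safe;
		pathnode->path.parallel_workers = child->parallel_workers;
	}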

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: ICU for global collation

2021-12-30 Thread Peter Eisentraut


There were a few inquiries about this topic recently, so I dug up the 
old thread and patch.  What we got stuck on last time was that we can't 
just swap out all locale support in a database for ICU.  We still need 
to set the usual locale environment, otherwise some things that are not 
ICU aware will break or degrade.  I had initially anticipated fixing 
that by converting everything that uses libc locales to ICU.  But that 
turned out to be tedious and ultimately not very useful as far as the 
user-facing result is concerned, so I gave up.


So this is a different approach: If you choose ICU as the default locale 
for a database, you still need to specify lc_ctype and lc_collate 
settings, as before.  Unlike in the previous patch, where the ICU 
collation name was written in datcollate, there is now a third column 
(daticucoll), so we can store all three values.  This fixes the 
described problem.  Other than that, once you get all the initial 
settings right, it basically just works: The places that have ICU 
support now will use a database-wide ICU collation if appropriate, the 
places that don't have ICU support continue to use the global libc 
locale settings.


I changed the datcollate, datctype, and the new daticucoll fields to 
type text (from name).  That way, the daticucoll field can be set to 
null if it's not applicable.  Also, the limit of 63 characters can 
actually be a problem if you want to use some combination of the options 
that ICU locales offer.  And for less extreme uses, having 
variable-length fields will save some storage, since typical locale 
names are much shorter.


For the same reasons and to keep things consistent, I also changed the
analogous pg_collation fields like that.  This also removes some weird
code that checks that colcollate and colctype are the same for ICU, so
it's overall cleaner.
From 4eb9fbac238c1abf481fa43431ecc22e782a5290 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 30 Dec 2021 12:47:24 +0100
Subject: [PATCH v3] Add option to use ICU as global collation provider

This adds the option to use ICU as the default collation provider for
either the whole cluster or a database.  New options for initdb,
createdb, and CREATE DATABASE are used to select this.

Discussion: 
https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com
---
 doc/src/sgml/catalogs.sgml|  13 +-
 doc/src/sgml/ref/create_database.sgml |  16 ++
 doc/src/sgml/ref/createdb.sgml|   9 +
 doc/src/sgml/ref/initdb.sgml  |  23 ++
 src/backend/access/hash/hashfunc.c|  18 +-
 src/backend/catalog/pg_collation.c|  24 +-
 src/backend/commands/collationcmds.c  | 120 +++---
 src/backend/commands/dbcommands.c |  93 ++--
 src/backend/regex/regc_pg_locale.c|   7 +-
 src/backend/utils/adt/formatting.c|   6 +
 src/backend/utils/adt/like.c  |  20 +-
 src/backend/utils/adt/like_support.c  |   2 +
 src/backend/utils/adt/pg_locale.c | 223 +++---
 src/backend/utils/adt/varchar.c   |  22 +-
 src/backend/utils/adt/varlena.c   |  26 +-
 src/backend/utils/init/postinit.c |  37 ++-
 src/bin/initdb/Makefile   |   2 +
 src/bin/initdb/initdb.c   |  62 -
 src/bin/initdb/t/001_initdb.pl|  18 +-
 src/bin/pg_dump/pg_dump.c |  16 ++
 src/bin/psql/describe.c   |  23 +-
 src/bin/psql/tab-complete.c   |   2 +-
 src/bin/scripts/Makefile  |   2 +
 src/bin/scripts/createdb.c|   9 +
 src/bin/scripts/t/020_createdb.pl |  20 +-
 src/include/catalog/pg_collation.dat  |   3 +-
 src/include/catalog/pg_collation.h|   6 +-
 src/include/catalog/pg_database.dat   |   2 +-
 src/include/catalog/pg_database.h |  16 +-
 src/include/utils/pg_locale.h |   6 +
 .../regress/expected/collate.icu.utf8.out |  10 +-
 src/test/regress/sql/collate.icu.utf8.sql |   8 +-
 32 files changed, 665 insertions(+), 199 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 03e2537b07..89e7279030 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2368,7 +2368,7 @@ pg_collation 
Columns
 
  
   
-   collcollate name
+   collcollate text
   
   
LC_COLLATE for this collation object
@@ -2377,13 +2377,22 @@ pg_collation 
Columns
 
  
   
-   collctype name
+   collctype text
   
   
LC_CTYPE for this collation object
   
  
 
+ 
+  
+   collicucoll text
+  
+  
+   ICU collation string
+  
+ 
+
  
   
collversion text
diff --git a/doc/src/sgml/ref/create_database.sgml 

Re: Pre-allocating WAL files

2021-12-30 Thread Maxim Orlov
I checked the patch too and found it to be OK. Check and check-world pass.
The overall idea seems good in my opinion, but I'm not sure where the
optimal place to put the pre-allocation is.

On Thu, Dec 30, 2021 at 2:46 PM Pavel Borisov 
wrote:

> > pre-allocating during checkpoints.  I've done a few pgbench runs, and
>> > it seems to work pretty well.  Initialization is around 15% faster,
>> > and I'm seeing about a 5% increase in TPS with a simple-update
>> > workload with wal_recycle turned off.  Of course, these improvements
>> > go away once segments can be recycled.
>>
>
> I've checked the patch v7. It applies cleanly, code is good, check-world
> tests passed without problems.
> I think it's ok to use checkpointer for this and the overall patch can be
> committed. But the seen performance gain makes me think again before adding
> this feature. I did tests myself a couple of months ago and got similar
> results.
> I really don't know whether it is worth the effort.
>
> Wish you and all hackers happy New Year!
> --
> Best regards,
> Pavel Borisov
>
> Postgres Professional: http://postgrespro.com 
>


-- 
---
Best regards,
Maxim Orlov.


Re: Pre-allocating WAL files

2021-12-30 Thread Pavel Borisov
>
> > pre-allocating during checkpoints.  I've done a few pgbench runs, and
> > it seems to work pretty well.  Initialization is around 15% faster,
> > and I'm seeing about a 5% increase in TPS with a simple-update
> > workload with wal_recycle turned off.  Of course, these improvements
> > go away once segments can be recycled.
>

I've checked the patch v7. It applies cleanly, code is good, check-world
tests passed without problems.
I think it's ok to use checkpointer for this and the overall patch can be
committed. But the observed performance gain makes me think again before
adding this feature. I did tests myself a couple of months ago and got
similar results.
I really don't know whether it is worth the effort.

Wish you and all hackers happy New Year!
-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Fix BUG #17335: Duplicate result rows in Gather node

2021-12-30 Thread Yura Sokolov
Good day, hackers.

Problem:
- An Append path is created explicitly with parallel_aware = true.
- It has two children: one is trivial, the other is parallel_aware = false.
  The trivial child is dropped.
- The Gather/GatherMerge path takes the Append path as a child and thinks
  its child is parallel_aware = true.
- But the Append path is removed at the end, since it has only one child.
- Now Gather/GatherMerge thinks its child is parallel_aware, but it
  is not.
  Gather/GatherMerge runs its child twice: in a worker and in the leader,
  and gathers the same rows twice.

Reproduction code attached (repro.sql. Included as a test as well).

Suggested quick (and valid) fix in the attached patch:
- If Append has a single child, copy its parallel awareness.

The bug was introduced by commit 8edd0e79460b414b1d971895312e549e95e12e4f
"Suppress Append and MergeAppend plan nodes that have a single child."

During discussion, it was suggested [1] that those fields should be copied:

> I haven't looked into whether this does the right things for parallel
> planning --- possibly create_[merge]append_path need to propagate up
> parallel-related path fields from the single child?

But it was not so obvious [2].

A better fix could also remove the Gather/GatherMerge node if
its child is not parallel aware.

The bug is reported at
https://postgr.es/m/flat/17335-4dc92e1aea3a78af%40postgresql.org
Since there is no way to add a thread from pgsql-bugs to the commitfest, I am
writing here.

[1] https://postgr.es/m/17500.1551669976%40sss.pgh.pa.us
[2] 
https://postgr.es/m/CAKJS1f_Wt_tL3S32R3wpU86zQjuHfbnZbFt0eqm%3DqcRFcdbLvw%40mail.gmail.com

 
regards
Yura Sokolov
y.soko...@postgrespro.ru
funny.fal...@gmail.com
From 47c6e161de4fc9d2d6eff45f427ebf49b4c9d11c Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Mon, 20 Dec 2021 11:48:10 +0300
Subject: [PATCH v1] Quick fix to duplicate result rows after Append path
 removal.

It could happen that an Append path is created with the "parallel_aware"
flag, but its single child is not. The Append path's parent (Gather or
Gather Merge) thinks its child is parallel_aware, but after Append path
removal the Gather's child becomes not parallel_aware. Then, when
Gather/Gather Merge decides to run the child in several workers or with
worker + leader participation, it gathers duplicate result rows from the
several child path invocations.

With this fix the Append path copies its single child's parallel_aware /
cost / workers values.

A copied `num_workers == 0` triggers the assert `num_workers > 0` in the
cost_gather_merge function. But changing this assert to `num_workers >= 0`
doesn't lead to any runtime and/or logical error.

Fixes bug 17335
https://postgr.es/m/flat/17335-4dc92e1aea3a78af%40postgresql.org
---
 src/backend/optimizer/path/costsize.c |   2 +-
 src/backend/optimizer/util/pathnode.c |   3 +
 .../expected/gather_removed_append.out| 131 ++
 src/test/regress/parallel_schedule|   1 +
 .../regress/sql/gather_removed_append.sql |  82 +++
 5 files changed, 218 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/gather_removed_append.out
 create mode 100644 src/test/regress/sql/gather_removed_append.sql

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 1e4d404f024..9949c3ab555 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -440,7 +440,7 @@ cost_gather_merge(GatherMergePath *path, PlannerInfo *root,
 	 * be overgenerous since the leader will do less work than other workers
 	 * in typical cases, but we'll go with it for now.
 	 */
-	Assert(path->num_workers > 0);
+	Assert(path->num_workers >= 0);
 	N = (double) path->num_workers + 1;
 	logN = LOG2(N);
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index af5e8df26b4..2ff4678937a 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1340,6 +1340,9 @@ create_append_path(PlannerInfo *root,
 		pathnode->path.startup_cost = child->startup_cost;
 		pathnode->path.total_cost = child->total_cost;
 		pathnode->path.pathkeys = child->pathkeys;
+		pathnode->path.parallel_aware = child->parallel_aware;
+		pathnode->path.parallel_safe = child->parallel_safe;
+		pathnode->path.parallel_workers = child->parallel_workers;
 	}
 	else
 		cost_append(pathnode);
diff --git a/src/test/regress/expected/gather_removed_append.out b/src/test/regress/expected/gather_removed_append.out
new file mode 100644
index 000..f6e861ce59d
--- /dev/null
+++ b/src/test/regress/expected/gather_removed_append.out
@@ -0,0 +1,131 @@
+-- Test correctness of parallel query execution after removal
+-- of Append path due to single non-trivial child.
+DROP TABLE IF EXISTS gather_append_1, gather_append_2;
+NOTICE:  table "gather_append_1" does not exist, skipping
+NOTICE:  table "gather_append_2" does not exist, skipping
+CREATE TABLE gather_append_1 (
+fk int,
+f bool
+);
+INSERT INTO 

Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Japin Li

On Thu, 30 Dec 2021 at 18:53, Guillaume Lelarge  wrote:
> Le jeu. 30 déc. 2021 à 11:44, Japin Li  a écrit :
>
>>
> pg_dump works in a single transaction, so it's already dealt with
> idle_in_transaction_timeout. Though I guess setting both would work too.

Attached a patch to fix this; please consider reviewing it.  Thanks.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f6d0562876..c7070b05c9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -602,6 +602,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 	SetConfigOption("lock_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
 	SetConfigOption("idle_in_transaction_session_timeout", "0",
 	PGC_SUSET, PGC_S_OVERRIDE);
+	SetConfigOption("idle_session_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
 
 	/*
 	 * Force default_transaction_isolation to READ COMMITTED.  We don't want
@@ -1624,6 +1625,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 	SetConfigOption("lock_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
 	SetConfigOption("idle_in_transaction_session_timeout", "0",
 	PGC_SUSET, PGC_S_OVERRIDE);
+	SetConfigOption("idle_session_timeout", "0", PGC_SUSET, PGC_S_OVERRIDE);
 
 	/*
 	 * Force default_transaction_isolation to READ COMMITTED.  We don't want
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 8903a694ae..15960ea644 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -3059,6 +3059,7 @@ _doSetFixedOutputState(ArchiveHandle *AH)
 	ahprintf(AH, "SET statement_timeout = 0;\n");
 	ahprintf(AH, "SET lock_timeout = 0;\n");
 	ahprintf(AH, "SET idle_in_transaction_session_timeout = 0;\n");
+	ahprintf(AH, "SET idle_session_timeout = 0;\n");
 
 	/* Select the correct character set encoding */
 	ahprintf(AH, "SET client_encoding = '%s';\n",
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index b52f3ccda2..fd4f7269c3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1146,6 +1146,8 @@ setup_connection(Archive *AH, const char *dumpencoding,
 		ExecuteSqlStatement(AH, "SET lock_timeout = 0");
 	if (AH->remoteVersion >= 90600)
 		ExecuteSqlStatement(AH, "SET idle_in_transaction_session_timeout = 0");
+	if (AH->remoteVersion >= 140000)
+		ExecuteSqlStatement(AH, "SET idle_session_timeout = 0");
 
 	/*
 	 * Quote all identifiers, if requested.


Re: Foreign key joins revisited

2021-12-30 Thread Joel Jacobson
On Wed, Dec 29, 2021, at 16:28, Tom Lane wrote:
>Peter Eisentraut  writes:
>> In the 1990s, there were some SQL drafts that included syntax like
>> JOIN ... USING PRIMARY KEY | USING FOREIGN KEY | USING CONSTRAINT ...
>> AFAICT, these ideas just faded away because of other priorities, so if 
>> someone wants to revive it, some work already exists.
>
> Interesting!  One thing that bothered me about this whole line of
> discussion is that we could get blindsided in future by the SQL
> committee standardizing the same idea with slightly different
> syntax/semantics.  I think borrowing this draft text would greatly
> improve the odds of that not happening.  Do you have access to
> full details?

Wisely said, I agree.

I have access to the ISO online document database, but the oldest SQL documents 
I could find there are from 2008-10-15.
I searched for document titles containing "SQL" in both ISO/IEC JTC 1/SC 32 and 
ISO/IEC JTC 1/SC 32/WG 3.

It would be very interesting to read these old SQL drafts from the 1990s, if 
they can be found.
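
In the meantime, to make the idea concrete, a join written with that kind of
syntax might look roughly like this (a purely hypothetical sketch; the actual
draft text and semantics may well differ):

    CREATE TABLE customer (
        id   int PRIMARY KEY,
        name text
    );
    CREATE TABLE orders (
        id          int PRIMARY KEY,
        customer_id int CONSTRAINT orders_customer_fk REFERENCES customer (id)
    );

    -- join along the declared foreign key instead of repeating the key columns
    SELECT o.id, c.name
    FROM orders AS o
    JOIN customer AS c USING CONSTRAINT orders_customer_fk;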

/Joel

Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Guillaume Lelarge
Le jeu. 30 déc. 2021 à 11:44, Japin Li  a écrit :

>
> On Thu, 30 Dec 2021 at 17:18, Guillaume Lelarge 
> wrote:
> > Hello,
> >
> > I've been reading the autovacuum code (the launcher and the worker) on
> the
> > 14 branch. As previously, I've seen some configuration at the beginning,
> > especially for statement_timeout, lock_timeout and
> > idle_in_transaction_session_timeout, and I was surprised to discover
> there
> > was no configuration for idle_session_timeout. I'm not sure the code
> should
> > set it to 0 as well (otherwise I'd have written a patch), but, if there
> was
> > a decision made to ignore its value, I'd be interested to know the
> reason.
> > I could guess for the autovacuum worker (it seems to work in a
> transaction,
> > so it's already handled by the idle_in_transaction_timeout), but I have
> no
> > idea for the autovacuum launcher.
> >
> > If it was just missed, I could write a patch this week to fix this.
> >
>
> Oh, it was just missed. I didn't notice that the autovacuum code sets those
> settings. I think we should also set idle_session_timeout to 0.
>
> Should we also change this for pg_dump and pg_backup_archiver?
>
>
pg_dump works in a single transaction, so it's already covered by
idle_in_transaction_session_timeout. Though I guess setting both would work too.


-- 
Guillaume.


Re: Autovacuum and idle_session_timeout

2021-12-30 Thread Japin Li


On Thu, 30 Dec 2021 at 17:18, Guillaume Lelarge  wrote:
> Hello,
>
> I've been reading the autovacuum code (the launcher and the worker) on the
> 14 branch. As previously, I've seen some configuration at the beginning,
> especially for statement_timeout, lock_timeout and
> idle_in_transaction_session_timeout, and I was surprised to discover there
> was no configuration for idle_session_timeout. I'm not sure the code should
> set it to 0 as well (otherwise I'd have written a patch), but, if there was
> a decision made to ignore its value, I'd be interested to know the reason.
> I could guess for the autovacuum worker (it seems to work in a transaction,
> so it's already handled by the idle_in_transaction_timeout), but I have no
> idea for the autovacuum launcher.
>
> If it was just missed, I could write a patch this week to fix this.
>

Oh, it was just missed. I didn't notice that the autovacuum code sets those
settings. I think we should also set idle_session_timeout to 0.

Should we also change this for pg_dump and pg_backup_archiver?

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Non-decimal integer literals

2021-12-30 Thread Peter Eisentraut
There has been some other refactoring going on, which made this patch 
set out of date.  So here is an update.


The old pg_strtouint64() has been removed, so there is no longer a 
naming concern with patch 0001.  That one should be good to go.


I also found that yet another way to parse integers, pg_atoi(), has 
mostly fallen into disuse, so I removed the last two callers and 
removed the function in 0002 and 0003.


The remaining patches are as before, with some of the review comments 
applied.  I still need to write some lexing unit tests for ecpg, which I 
haven't gotten to yet.  This affects patches 0004 and 0005.


As mentioned before, patches 0006 and 0007 are more feature previews at 
this point.
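
For readers who haven't followed the earlier versions, the kind of input 
those preview patches deal with looks roughly like this (the exact literal 
syntax accepted is an assumption here, not a claim about the patches):

    SELECT 0x2A AS hex_val, 0o52 AS oct_val, 0b101010 AS bin_val;
    -- each of these would evaluate to the integer 42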



On 01.12.21 16:47, Peter Eisentraut wrote:

On 25.11.21 18:51, John Naylor wrote:
If we're going to change the comment anyway, "the parser" sounds more 
natural. Aside from that, 0001 and 0002 can probably be pushed now, if 
you like.


done


--- a/src/interfaces/ecpg/preproc/pgc.l
+++ b/src/interfaces/ecpg/preproc/pgc.l
@@ -365,6 +365,10 @@ real ({integer}|{decimal})[Ee][-+]?{digit}+
  realfail1 ({integer}|{decimal})[Ee]
  realfail2 ({integer}|{decimal})[Ee][-+]

+integer_junk {integer}{ident_start}
+decimal_junk {decimal}{ident_start}
+real_junk {real}{ident_start}

A comment might be good here to explain these are only in ECPG for 
consistency with the other scanners. Not really important, though.


Yeah, it's a bit weird that not all the symbols are used in ecpg.  I'll 
look into explaining this better.



0006

+{hexfail} {
+ yyerror("invalid hexadecimal integer");
+ }
+{octfail} {
+ yyerror("invalid octal integer");
   }
-{decimal} {
+{binfail} {
+ yyerror("invalid binary integer");
+ }

It seems these could use SET_YYLLOC(), since the error cursor doesn't 
match other failure states:


ok

We might consider some tests for ECPG since lack of coverage has been 
a problem.


right

Also, I'm curious: how does the spec work as far as deciding the year 
of release, or feature-freezing of new items?


The schedule has recently been extended again, so the current plan is 
for SQL:202x with x=3, with feature freeze in mid-2022.


So the feature patches in this thread are in my mind now targeting 
PG15+1.  But the preparation work (up to v5-0005, and some other number 
parsing refactoring that I'm seeing) could be considered for PG15.


I'll move this to the next CF and come back with an updated patch set in 
a little while.



From 4aa1329c3aad512f33a56a05fcc465793ef19b1d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 30 Dec 2021 10:26:37 +0100
Subject: [PATCH v6 1/7] Move scanint8() to numutils.c

Move scanint8() to numutils.c and rename to pg_strtoint64().  We
already have a "16" and "32" version of that, and the code inside the
functions was aligned, so this move makes all three versions
consistent.  The API is also changed to no longer provide the errorOK
case.  Users that need the error checking can use strtoi64().

Discussion: 
https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb...@enterprisedb.com
---
 src/backend/parser/parse_node.c | 12 ++-
 src/backend/replication/pgoutput/pgoutput.c |  9 ++-
 src/backend/utils/adt/int8.c| 90 +
 src/backend/utils/adt/numutils.c| 84 +++
 src/bin/pgbench/pgbench.c   |  4 +-
 src/include/utils/builtins.h|  1 +
 src/include/utils/int8.h| 25 --
 7 files changed, 103 insertions(+), 122 deletions(-)
 delete mode 100644 src/include/utils/int8.h

diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 8cfe6f67c0..0eefd5427a 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -26,7 +26,6 @@
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
-#include "utils/int8.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 #include "utils/varbit.h"
@@ -353,7 +352,6 @@ make_const(ParseState *pstate, A_Const *aconst)
 {
Const  *con;
Datum   val;
-   int64   val64;
Oid typeid;
int typelen;
booltypebyval;
@@ -384,8 +382,15 @@ make_const(ParseState *pstate, A_Const *aconst)
break;
 
case T_Float:
+   {
/* could be an oversize integer as well as a float ... 
*/
-   if (scanint8(aconst->val.fval.val, true, ))
+
+   int64   val64;
+   char   *endptr;
+
+   errno = 0;
+   val64 = strtoi64(aconst->val.fval.val, , 10);
+   if (errno == 0 && *endptr == '\0')
{
/*
 * It might actually fit in int32. 

Autovacuum and idle_session_timeout

2021-12-30 Thread Guillaume Lelarge
Hello,

I've been reading the autovacuum code (the launcher and the worker) on the
14 branch. As previously, I've seen some configuration at the beginning,
especially for statement_timeout, lock_timeout and
idle_in_transaction_session_timeout, and I was surprised to discover there
was no configuration for idle_session_timeout. I'm not sure the code should
set it to 0 as well (otherwise I'd have written a patch), but, if there was
a decision made to ignore its value, I'd be interested to know the reason.
I could guess for the autovacuum worker (it seems to work in a transaction,
so it's already handled by the idle_in_transaction_timeout), but I have no
idea for the autovacuum launcher.

If it was just missed, I could write a patch this week to fix this.
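
For what it's worth, one way to see how these processes show up (and hence
which timeouts could plausibly apply to them) is:

    SELECT pid, backend_type, state
    FROM pg_stat_activity
    WHERE backend_type IN ('autovacuum launcher', 'autovacuum worker');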

Regards.


-- 
Guillaume.


Re: Add Boolean node

2021-12-30 Thread Peter Eisentraut

On 29.12.21 21:32, Andres Freund wrote:

On 2021-12-27 09:53:32 -0500, Tom Lane wrote:

Didn't really read the patch in any detail, but I did have one idea:
I think that the different things-that-used-to-be-Value-nodes ought to
use different field names, say ival, rval, bval, sval not just "val".
That makes it more likely that you'd catch any code that is doing the
wrong thing and not going through one of the access macros.


If we go around changing all these places, it might be worth to also change
Integer to be a int64 instead of an int.


I was actually looking into that, when I realized that most uses of 
Integer were actually Booleans.  Hence the current patch to clear those 
fake Integers out of the way.  I haven't gotten to analyze the int64 
question any further, but it should be easier hereafter.





RE: row filtering for logical replication

2021-12-30 Thread wangw.f...@fujitsu.com
On Mon, Dec 28, 2021 9:03 PM houzj.f...@fujitsu.com  
wrote:
> Attach a top up patch 0004 which did the above changes.

A few comments about v55-0001 and v55-0002.
v55-0001
1.
There is a typo in the last sentence of the comment for the function rowfilter_walker():
   * (b) a user-defined function can be used to access tables which could have
   * unpleasant results because a historic snapshot is used. That's why only
-  * non-immutable built-in functions are allowed in row filter expressions.
+ * immutable built-in functions are allowed in row filter expressions.

2.
There are two if statements at the end of fetch_remote_table_info.
+   if (!isnull)
+   *qual = lappend(*qual, 
makeString(TextDatumGetCString(rf)));
+
+   ExecClearTuple(slot);
+
+   /* Ignore filters and cleanup as necessary. */
+   if (isnull)
+   {
+   if (*qual)
+   {
+   list_free_deep(*qual);
+   *qual = NIL;
+   }
+   break;
+   }
What about using a format like the following:
if (!isnull)
...
else
...


v55-0002
In the function pgoutput_row_filter_init, I found that almost the whole
function is inside an if statement, written like this:
static void
pgoutput_row_filter_init()
{
Variable declaration and initialization;
if (!entry->exprstate_valid)
{
..
}
}
What about changing this if statement to the following:
if (entry->exprstate_valid)
return;


Regards,
Wang wei


Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

2021-12-30 Thread SATYANARAYANA NARLAPURAM
On Thu, Dec 30, 2021 at 12:20 AM Dilip Kumar  wrote:

> On Thu, Dec 30, 2021 at 1:41 PM Bharath Rupireddy <
> bharath.rupireddyforpostg...@gmail.com> wrote:
>
>>
>> >
>> > Yeah, that's true, but even if we are blocking the transactions from
>> committing then also it is possible that a new connection can come and
>> generate more WAL,  yeah but I agree with the other part that if you
>> throttle after committing then the user can cancel the queries and generate
>> more WAL from those sessions as well.  But that is an extreme case where
>> application developers want to bypass the throttling and want to generate
>> more WALs.
>>
>> How about having the new hook at the start of the new txn?  If we do
>> this, when the limit for the throttling is exceeded, the current txn
>> (even if it is a long running one) continues to do the WAL insertions,
>> the next txns would get blocked. Thoughts?
>>
>
> Do you mean while StartTransactionCommand or while assigning a new
> transaction id? If it is at StartTransactionCommand then we would be
> blocking the sessions which are only performing read queries right?
>

Definitely not at StartTransactionCommand but possibly while assigning the
transaction id in AssignTransactionId. Blocking readers is never the intent.


> If we are doing at the transaction assignment level then we might be
> holding some of the locks so this might not be any better than throttling
> inside the commit.
>

If we define the RPO as "no transaction can commit when the wal_distance is
more than the configured MB", we have to throttle the writes before committing
the transaction, and WAL generation by new or already-active connections
doesn't matter, since those transactions can't be committed and made visible
to the user. If the RPO is defined as "no new write transactions are allowed
when wal_distance > the configured MB", then we can block assigning new
transaction IDs until the RPO policy is met. IMHO, following the sync
replication semantics is easier and more explainable, as it is already
familiar to customers.
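
For reference, the lag both definitions act on (standby lag in bytes) can
already be observed with something like the following; this is just an
observation query, not part of the proposal:

    SELECT application_name,
           pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn)  AS flush_lag_bytes,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
    FROM pg_stat_replication;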


>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>


Re: warn if GUC set to an invalid shared library

2021-12-30 Thread Bharath Rupireddy
On Tue, Dec 28, 2021 at 11:15 PM Justin Pryzby  wrote:
>
> forking 
>
> On Mon, Dec 13, 2021 at 09:01:57AM -0500, Robert Haas wrote:
> > On Thu, Dec 9, 2021 at 2:32 AM Maciek Sakrejda  wrote:
> > > > Considering the vanishingly small number of actual complaints we've
> > > > seen about this, that sounds ridiculously over-engineered.
> > > > A documentation example should be sufficient.
> > >
> > > I don't know if this will tip the scales, but I'd like to lodge a
> > > belated complaint. I've gotten myself in this server-fails-to-start
> > > situation several times (in development, for what it's worth). The
> > > syntax (as Bharath pointed out in the original message) is pretty
> > > picky, there are no guard rails, and if you got there through ALTER
> > > SYSTEM, you can't fix it with ALTER SYSTEM (because the server isn't
> > > up). If you go to fix it manually, you get a scary "Do not edit this
> > > file manually!" warning that you have to know to ignore in this case
> > > (that's if you find the file after you realize what the fairly generic
> > > "FATAL: ... No such file or directory" error in the log is telling
> > > you). Plus you have to get the (different!) quoting syntax right or
> > > cut your losses and delete the change.
> >
> > +1. I disagree that trying to detect this kind of problem would be
> > "ridiculously over-engineered." I don't know whether it can be done
> > elegantly enough that we'd be happy with it and I don't know whether
> > it would end up just garden variety over-engineered. But there's
> > nothing ridiculous about trying to prevent people from putting their
> > system into a state where it won't start.
> >
> > (To be clear, I also think updating the documentation is sensible,
> > without taking a view on exactly what that update should look like.)
>
> Yea, I think documentation won't help to avoid this issue:
>
> If ALTER SYSTEM gives an ERROR, someone will likely check the docs after a
> few minutes if they know that they didn't get the correct syntax.
> But if it gives no error nor warning, then most likely they won't know to 
> check
> the docs.
>
> We should check session_preload_libraries too, right ?  It's PGC_SIGHUP, so if
> someone sets the variable and sends sighup, clients will be rejected, and they
> had no good opportunity to avoid that.
>
> 0001 adds WARNINGs when doing SET:
>
> postgres=# SET local_preload_libraries=xyz;
> WARNING:  could not load library: xyz: cannot open shared object 
> file: No such file or directory
> SET
>
> postgres=# ALTER SYSTEM SET shared_preload_libraries =asdf;
> WARNING:  could not load library: $libdir/plugins/asdf: cannot open 
> shared object file: No such file or directory
> ALTER SYSTEM
>
> 0002 adds context when failing to start.
>
> 2021-12-27 17:01:12.996 CST postmaster[1403] WARNING:  could not load 
> library: $libdir/plugins/asdf: cannot open shared object file: No such file 
> or directory
> 2021-12-27 17:01:14.938 CST postmaster[1403] FATAL:  could not access 
> file "asdf": No such file or directory
> 2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT:  guc 
> "shared_preload_libraries"
> 2021-12-27 17:01:14.939 CST postmaster[1403] LOG:  database system is 
> shut down
>
> But I wonder whether it'd be adequate context if dlopen were to fail rather
> than stat() ?
>
> Before 0003:
> 2021-12-18 23:13:57.861 CST postmaster[11956] FATAL:  could not 
> access file "asdf": No such file or directory
> 2021-12-18 23:13:57.862 CST postmaster[11956] LOG:  database system 
> is shut down
>
> After 0003:
> 2021-12-18 23:16:05.719 CST postmaster[13481] FATAL:  could not load 
> library: asdf: cannot open shared object file: No such file or directory
> 2021-12-18 23:16:05.720 CST postmaster[13481] LOG:  database system 
> is shut down

Overall the idea looks good to me. A warning on ALTER SYSTEM SET seems
more reasonable than nothing. ERROR isn't the way to go, as it prevents
users from setting extensions in shared_preload_libraries first and
installing them later. Is NOTICE here a better idea than WARNING?

I haven't looked at the patches though.

Regards,
Bharath Rupireddy.




Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

2021-12-30 Thread Dilip Kumar
On Thu, Dec 30, 2021 at 1:41 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

>
> >
> > Yeah, that's true, but even if we are blocking the transactions from
> committing then also it is possible that a new connection can come and
> generate more WAL,  yeah but I agree with the other part that if you
> throttle after committing then the user can cancel the queries and generate
> more WAL from those sessions as well.  But that is an extreme case where
> application developers want to bypass the throttling and want to generate
> more WALs.
>
> How about having the new hook at the start of the new txn?  If we do
> this, when the limit for the throttling is exceeded, the current txn
> (even if it is a long running one) continues to do the WAL insertions,
> the next txns would get blocked. Thoughts?
>

Do you mean while StartTransactionCommand or while assigning a new
transaction id?  If it is at StartTransactionCommand then we would be
blocking the sessions which are only performing read queries, right?  If we
are doing it at the transaction assignment level then we might be holding some
of the locks, so this might not be any better than throttling inside the
commit.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

2021-12-30 Thread Bharath Rupireddy
On Thu, Dec 30, 2021 at 1:21 PM Dilip Kumar  wrote:
>
> On Thu, Dec 30, 2021 at 12:36 PM SATYANARAYANA NARLAPURAM 
>  wrote:
>>>
>>>
>>> Yeah, I think that would make sense, even though we will be allowing a new 
>>> backend to get connected insert WAL, and get committed but after that, it 
>>> will be throttled.  However, if the number of max connections will be very 
>>> high then even after we detected a lag there a significant amount WAL could 
>>> be generated, even if we keep long-running transactions aside.  But I think 
>>> still it will serve the purpose of what Satya is trying to achieve.
>>
>>
>> I am afraid there are problems with making the RPO check post releasing the 
>> locks. By this time the transaction is committed and visible to the other 
>> backends (ProcArrayEndTransaction is already called) though the intention is 
>> to block committing transactions that violate the defined RPO. Even though 
>> we block existing connections starting a new transaction, it is possible to 
>> do writes by opening a new connection / canceling the query. I am not too 
>> much worried about the lock contention as the system is already hosed 
>> because of the policy. This behavior is very similar to what happens when 
>> the Sync standby is not responding. Thoughts?
>
>
> Yeah, that's true, but even if we are blocking the transactions from 
> committing then also it is possible that a new connection can come and 
> generate more WAL,  yeah but I agree with the other part that if you throttle 
> after committing then the user can cancel the queries and generate more WAL 
> from those sessions as well.  But that is an extreme case where application 
> developers want to bypass the throttling and want to generate more WALs.

How about having the new hook at the start of a new txn?  If we do
this, when the limit for the throttling is exceeded, the current txn
(even if it is a long-running one) continues to do the WAL insertions,
and the next txns would get blocked. Thoughts?

Regards,
Bharath Rupireddy.