Re: [Proposal] Global temporary tables

2022-02-26 Thread Justin Pryzby
I read through this.
Find attached some language fixes.  You should be able to apply each "fix"
patch on top of your own local branch with git am, and then squish them
together.  Let me know if you have trouble with that.

I think get_seqence_start_value() should be static.  (Or otherwise, it should
be in lsyscache.c).

The include added to execPartition.c seems to be unused.

+#define RELATION_IS_TEMP_ON_CURRENT_SESSION(relation) \
+#define RELATION_IS_TEMP(relation) \
+#define RelpersistenceTsTemp(relpersistence) \
+#define RELATION_GTT_ON_COMMIT_DELETE(relation)\

=> These macros can evaluate their arguments multiple times.
You should add a comment to warn about that.  And maybe avoid passing them a
function call, like: RelpersistenceTsTemp(get_rel_persistence(rte->relid))
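
For example, if the macro body mentions its argument more than once (the body
below is illustrative, not necessarily the patch's actual definition, and
RELPERSISTENCE_GLOBAL_TEMP is a placeholder name):

	#define RelpersistenceTsTemp(relpersistence) \
		((relpersistence) == RELPERSISTENCE_TEMP || \
		 (relpersistence) == RELPERSISTENCE_GLOBAL_TEMP)

then this calls get_rel_persistence() twice:

	if (RelpersistenceTsTemp(get_rel_persistence(rte->relid)))

whereas evaluating the call once into a local variable avoids that:

	char	relpersistence = get_rel_persistence(rte->relid);

	if (RelpersistenceTsTemp(relpersistence))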

+list_all_backend_gtt_frozenxids should return TransactionId not int.
The function name should say "oldest" and not "all" ?

I think the GUC should have a longer name.  max_active_gtt is too short for a
global var.

+#define MIN_NUM_ACTIVE_GTT		0
+#define DEFAULT_NUM_ACTIVE_GTT	1000
+#define MAX_NUM_ACTIVE_GTT		100

+int		max_active_gtt = MIN_NUM_ACTIVE_GTT

It's being initialized to MIN, but then the GUC machinery sets it to DEFAULT.
By convention, it should be initialized to default.
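That is, something like this (a sketch using the patch's own constant; the
trailing semicolon is assumed, since the quoted line appears truncated):

	int		max_active_gtt = DEFAULT_NUM_ACTIVE_GTT;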

fout->remoteVersion >= 14

=> should say 15

describe.c has gettext_noop("session"), which is a half-truth.  The data is
per-session but the table definition is persistent..

You redirect stats from pg_class and pg_statistic to a local hash table.
This is pretty hairy :(
I guess you'd also need to handle pg_statistic_ext and pg_statistic_ext_data.
pg_stats doesn't work, since the data isn't in pg_statistic - it'd need to look
at pg_get_gtt_statistics.

I wonder if there's a better way to do it, like updating pg_statistic but
forcing the changes to be rolled back when the session ends...  But I think
that would make longrunning sessions behave badly, the same as "longrunning
transactions".

Have you looked at Gilles Darold's GTT extension ?
From b89f3cc5c78e7b4c3e10ab39ef527b524d0d112d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 1 Jan 2022 17:02:30 -0600
Subject: [PATCH 02/11] f!0002-gtt-v64-doc.patch

---
 doc/src/sgml/ref/create_table.sgml | 42 --
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 408373323c..9cd5fc17f2 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -174,18 +174,18 @@ WITH ( MODULUS numeric_literal, REM
  
   If specified, the table is created as a temporary table.
   Optionally, GLOBAL or LOCAL
-  can be written before TEMPORARY or TEMP.
+  may be written before TEMPORARY or TEMP.
   They represent two types of temporary tables supported by PostgreSQL:
-  global temporary table and local temporary table. Without specified
-  GLOBAL or LOCAL, a local temporary table is created by default.
+  global temporary table and local temporary table. If neither
+  GLOBAL or LOCAL is specified, a local temporary table is created by default.
  
 
 
- Both types of temporary tables’ data are truncated at the
+ Data of both types of temporary tables is truncated at the
  end of a session or optionally at the end of the current transaction.
- (see ON COMMIT below). For global temporary table,
- its schema is reserved and reused by future sessions or transactions.
- For local temporary table, both its data and its schema are dropped.
+ (see ON COMMIT below). For a global temporary table,
+ its table definition is preserved and reused by future sessions or transactions.
+ For a local temporary table, both its data and its table definition are dropped.
 
 
 
@@ -194,10 +194,10 @@ WITH ( MODULUS numeric_literal, REM
   

 Global temporary table are defined just once and automatically exist
-(starting with empty contents) in every session that needs them.
-The schema definition of temporary tables is persistent and shared among sessions.
-However, the data in temporary tables are kept private to sessions themselves,
-even though they use same name and same schema.
+(initially with empty contents) in every session.
+The schema definition of a temporary table is persistent and common to all sessions.
+However, the data in temporary tables is private to each sessions,
+even though they share the same name and same table definition.

   
  
@@ -206,15 +206,15 @@ WITH ( MODULUS numeric_literal, REM
   Local Temporary Table
  
  
-  Local temporary table are automatically dropped at the end of a
-  session (include schema and data). Future sessions need to creat

Re: explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))

2022-02-26 Thread Justin Pryzby
Rebased over ebf6c5249b7db525e59563fb149642665c88f747.
It looks like that patch handles only query_id, and this patch also tries to
handle a bunch of other stuff.

If it's helpful, feel free to kick this patch to a future CF.
From e58fffedc6f1cf471228fb3234faba35898678c3 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 22 Feb 2020 21:17:10 -0600
Subject: [PATCH 1/6] Add GUC: explain_regress

This changes the defaults for explain to: costs off, timing off, summary off.
It'd be reasonable to use this for new regression tests which are not intended
to be backpatched.
---
 contrib/postgres_fdw/postgres_fdw.c |  2 +-
 src/backend/commands/explain.c  | 13 +++--
 src/backend/utils/misc/guc.c| 13 +
 src/bin/psql/describe.c |  2 +-
 src/include/commands/explain.h  |  2 ++
 src/test/regress/expected/explain.out   |  3 +++
 src/test/regress/expected/inherit.out   |  2 +-
 src/test/regress/expected/stats_ext.out |  2 +-
 src/test/regress/pg_regress.c   |  3 ++-
 src/test/regress/sql/explain.sql|  4 
 src/test/regress/sql/inherit.sql|  2 +-
 src/test/regress/sql/stats_ext.sql  |  2 +-
 12 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 56654844e8..54da40d662 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3124,7 +3124,7 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * values, so don't request params_list.
 		 */
 		initStringInfo(&sql);
-		appendStringInfoString(&sql, "EXPLAIN ");
+		appendStringInfoString(&sql, "EXPLAIN (COSTS)");
 		deparseSelectStmtForRel(&sql, root, foreignrel, fdw_scan_tlist,
 remote_conds, pathkeys,
 fpextra ? fpextra->has_final_sort : false,
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index de81379da3..22a3cf6349 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -154,6 +154,7 @@ static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
 static void escape_yaml(StringInfo buf, const char *str);
 
+bool explain_regress = false; /* GUC */
 
 
 /*
@@ -172,6 +173,7 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	ListCell   *lc;
 	bool		timing_set = false;
 	bool		summary_set = false;
+	bool		costs_set = false;
 
 	/* Parse options list. */
 	foreach(lc, stmt->options)
@@ -183,7 +185,11 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 		else if (strcmp(opt->defname, "verbose") == 0)
 			es->verbose = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "costs") == 0)
+		{
+			/* Need to keep track if it was explicitly set to ON */
+			costs_set = true;
 			es->costs = defGetBoolean(opt);
+		}
 		else if (strcmp(opt->defname, "buffers") == 0)
 			es->buffers = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "wal") == 0)
@@ -227,13 +233,16 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* if the costs option was not set explicitly, set default value */
+	es->costs = (costs_set) ? es->costs : es->costs && !explain_regress;
+
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  errmsg("EXPLAIN option WAL requires ANALYZE")));
 
 	/* if the timing was not set explicitly, set default value */
-	es->timing = (timing_set) ? es->timing : es->analyze;
+	es->timing = (timing_set) ? es->timing : es->analyze && !explain_regress;
 
 	/* check that timing is used with EXPLAIN ANALYZE */
 	if (es->timing && !es->analyze)
@@ -242,7 +251,7 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
  errmsg("EXPLAIN option TIMING requires ANALYZE")));
 
 	/* if the summary was not set explicitly, set default value */
-	es->summary = (summary_set) ? es->summary : es->analyze;
+	es->summary = (summary_set) ? es->summary : es->analyze && !explain_regress;
 
 	query = castNode(Query, stmt->query);
 	if (IsQueryIdEnabled())
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1e3650184b..87266cf7bd 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -46,6 +46,7 @@
 #include "catalog/pg_authid.h"
 #include "catalog/storage.h"
 #include "commands/async.h"
+#include "commands/explain.h"
 #include "commands/prepare.h"
 #include "commands/tablespace.h"
 #include "commands/trigger.h"
@@ -1474,6 +1475,18 @@ static struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+
+	{
+		{"explain_regress", PGC_USERSET, DEVELOPER_OPTIONS,
+			gettext_noop("Change defaults of 

Re: Missed condition-variable wakeups on FreeBSD

2022-02-26 Thread Justin Pryzby
On Sat, Feb 26, 2022 at 02:07:05PM -0500, Tom Lane wrote:
> I don't know much about how gdb interacts with kernel calls on
> FreeBSD, but I speculate that the poll(2) call returns with EINTR
> after gdb releases the process, and then things resume fine,
> suggesting that we lost an interrupt somewhere.

I've seen some similar interactions with strace under Linux causing a process
to be woken up or to otherwise behave differently (not necessarily
postgres).

> Thoughts?  Ideas on debugging this?

Before attaching a debugger, figure out what syscall each process is in.
In linux, that's ps O wchan PID.

>> Besides trying to make the issue more likely as suggested above, it might be
>> worth checking if signalling the stuck processes with SIGUSR1 gets them
>> unstuck.

And SIGCONT.

Maybe you already did this, but you can dump a corefile of the running
processes to allow future inspection.

-- 
Justin




Re: Adding CI to our tree

2022-02-25 Thread Justin Pryzby
This is the other half of my CI patches, which are unrelated to the TAP ones on
the other thread.
From 88c01c09ee26db2817629265fc12b2dbcd8c9a91 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/7] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index d10b0a82f9..1b7c36283e 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,11 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+#pkg install -y ...
 
   # NB: Intentionally build without --with-llvm. The freebsd image size is
   # already large enough to make VM startup slow, and even without llvm
@@ -180,10 +181,12 @@ task:
 chown -R postgres:postgres ${CCACHE_DIR}
 echo '* - memlock 134217728' > /etc/security/limits.d/postgres.conf
 su postgres -c "ulimit -l -H && ulimit -l -S"
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kernel.core_pattern='/tmp/cores/%e-%s-%p.core'
+#apt-get update
+#apt-get -y install ...
 
   configure_script: |
 su postgres <<-EOF
@@ -237,7 +240,7 @@ task:
 ulimit -a -H && ulimit -a -S
 export
 
-  setup_cores_script:
+  setup_os_script:
 - mkdir ${HOME}/cores
 - sudo sysctl kern.corefile="${HOME}/cores/core.%P"
 
@@ -384,6 +387,9 @@ task:
 powershell -Command get-psdrive -psprovider filesystem
 set
 
+  setup_os_script: |
+REM choco install -y ...
+
   configure_script:
 # copy errors out when using forward slashes
 - copy src\tools\ci\windows_build_config.pl src\tools\msvc\config.pl
@@ -479,6 +485,10 @@ task:
   ccache_cache:
 folder: $CCACHE_DIR
 
+  setup_os_script: |
+#apt-get update
+#apt-get -y install ...
+
   ###
   # Test that code can be built with gcc/clang without warnings
   ###
-- 
2.17.1

From 7a80dfccb17f454849679ebe9abc89bd0901828a Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 15 Jan 2022 11:27:28 -0600
Subject: [PATCH 2/7] cirrus: upload changed html docs as artifacts

Always run doc build; to allow them to be shown in cfbot, they should not be
skipped if the linux build fails.

This could be done on the client side (cfbot).  One advantage of doing it here
is that fewer docs are uploaded - many patches won't upload docs at all.

XXX: if this is run in the same task, the configure flags should probably be
consistent ?

https://cirrus-ci.com/task/5396696388599808

ci-os-only: linux
---
 .cirrus.yml | 46 ++
 1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 1b7c36283e..f21b249b6f 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -450,10 +450,6 @@ task:
 task:
   name: CompilerWarnings
 
-  # To limit unnecessary work only run this once the normal linux test succeeds
-  depends_on:
-- Linux - Debian Bullseye
-
   env:
 CPUS: 4
 BUILD_JOBS: 4
@@ -482,6 +478,13 @@ task:
 clang -v
 export
 
+git remote -v
+git branch -a
+git remote add postgres https://github.com/postgres/postgres
+time git fetch -v postgres master
+git log -1 postgres/master
+git diff --name-only postgres/master..
+
   ccache_cache:
 folder: $CCACHE_DIR
 
@@ -557,16 +560,35 @@ task:
   ###
   # Verify docs can be built
   ###
-  # XXX: Only do this if there have been changes in doc/ since last build
   always:
 docs_build_script: |
-  time ./configure \
---cache gcc.cache \
-CC="ccache gcc" \
-CXX="ccache g++" \
-CLANG="ccache clang"
-  make -s -j${BUILD_JOBS} clean
-  time make -s -j${BUILD_JOBS} -C doc
+  # Do nothing if the patch doesn't update doc:
+  git diff --name-only --cherry-pick --exit-code postgres/master... doc && exit
+
+  # Exercise HTML and other docs:
+  ./configure >/dev/null
+  make -s -C doc
+  cp -r doc new-docs
+
+  # Build HTML docs from the parent commit
+  git checkout postgres/master -- doc
+  make -s -C doc clean
+  make -s -C doc html
+  cp -r doc old-docs
+
+  # Copy HTML which differ into html_docs
+  # This will show any files which differ from files generated from postgres/master.
+  # Commits to postgres/master which affect doc/ will cause more files to be
+  # included until the patch is rebased.
+  mkdir html_docs

Re: set TESTDIR from perl rather than Makefile

2022-02-25 Thread Justin Pryzby
On Sun, Feb 20, 2022 at 04:39:08PM -0800, Andres Freund wrote:
> On 2022-02-19 17:53:09 -0600, Justin Pryzby wrote:
> > I also meant to also attach it.
> 
> Is the patch actually independent of the other patches in your stack?

Yes - I rearranged it that way for this thread.

However, it's best served/combined with the alltaptests patch :)

> > -   $expected = slurp_file_eval("traces/$testname.trace");
> > +   my $inputdir = "$ENV{'TESTDIR'}/tmp_check";
> > +   $expected = slurp_file_eval("$inputdir/traces/$testname.trace");
> 
> Why is this needed? Shouldn't we end up in exactly the same dir with/without
> this patch?

Right - I'd incorrectly set test_dir to t/ rather than the parent dir.

> > +++ b/src/tools/msvc/vcregress.pl
> > @@ -261,10 +261,8 @@ sub tap_check
> > $ENV{PG_REGRESS}= "$topdir/$Config/pg_regress/pg_regress";
> > $ENV{REGRESS_SHLIB} = "$topdir/src/test/regress/regress.dll";
> >  
> > -   $ENV{TESTDIR} = "$dir";
> > my $module = basename $dir;
> > -   # add the module build dir as the second element in the PATH
> > -   $ENV{PATH} =~ s!;!;$topdir/$Config/$module;!;
> > +   #$ENV{VCREGRESS_MODE} = $Config;
> 
> Hm. How does the module build dir end up on PATH after this?

That patch only dealt with TESTDIR.  PATH was still set by makefiles.
For MSVC, it was being set to top_builddir/tmp_install.

I've added a 2nd patch to also set PATH.  Maybe it should also set
PATH=$bindir:$PATH - I'm not sure.

https://github.com/justinpryzby/postgres/runs/5340884168
From 9268dc1f10eed605d584ade20e263a6c32439e2e Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 19 Feb 2022 13:06:52 -0600
Subject: [PATCH 1/7] wip: set TESTDIR from src/test/perl rather than
 Makefile/vcregress

TODO: PATH

These seem most likely to break:

make check -C src/bin/psql
make check -C src/bin/pgbench
make check -C src/test/modules/test_misc
make check -C src/test/modules/libpq_pipeline
PROVE_TESTS=t/027_stream_regress.pl make check -C src/test/recovery
---
 src/Makefile.global.in |  9 ++---
 src/test/perl/PostgreSQL/Test/Utils.pm | 16 +---
 src/tools/msvc/vcregress.pl|  1 -
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c980444233..92649d0193 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -451,8 +451,9 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' \
+   PG_SUBDIR='$(CURDIR)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
@@ -461,8 +462,9 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(top_builddir)' \
+   PG_SUBDIR='$(CURDIR)' \
PG_REGRESS='$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
@@ -472,7 +474,8 @@ define prove_check
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
+   $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
+   PG_SUBDIR='$(CURDIR)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 46cd746796..0e2abb6c9e 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -184,19 +184,21 @@ INIT
 	# test may still fail, but it's more likely to report useful facts.
 	$SIG{PIPE} = 'IGNORE';
 
-	# Determine output directories, and create them.  The base path is the
-	# TESTDIR environment variable, which is normally set by the invoking
-	# Makefile.
-	$tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
+	my $test_dir = File::Spec->rel2abs($ENV{PG_SUBDIR} || dirname(dirname($0)));
+	my $test_name = basename($0);
+	$test_name =~ s/\.[^.]+$//;
+
+	$ENV{TESTDIR} = $test_dir;
+
+	# Determine output directories, and create them.
+	$tmp_check = "$test_dir/tmp_check";
 	$log_path = "$tmp_check/log";
 
 	mkdir $tmp_check;
 	mkdir $log_path;
 
 

Re: Add parameter jit_warn_above_fraction

2022-02-25 Thread Justin Pryzby
On Fri, Feb 25, 2022 at 04:16:01PM +0100, Magnus Hagander wrote:
> +	{
> +		{"jit_warn_above_fraction", PGC_SUSET, LOGGING_WHEN,
> +			gettext_noop("Sets the fraction of query time spent on JIT before writing"
> +						 "a warning to the log."),
> +			gettext_noop("Write a message tot he server log if more than this"
> +						 "fraction of the query runtime is spent on JIT."
> +						 "Zero turns off the warning.")
> +		},
> +		&jit_warn_above_fraction,
> +		0.0, 0.0, 1.0,
> +		NULL, NULL, NULL
> +	},

Should be PGC_USERSET ?

+   gettext_noop("Write a message tot he server log if more 
than this"  
   

to the

+	if (jit_enabled && jit_warn_above_fraction > 0)
+	{
+		int64 jit_time =
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) +
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) +
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) +
+			INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter);
+
+		if (jit_time > msecs * jit_warn_above_fraction)
+		{
+			ereport(WARNING,
+					(errmsg("JIT time was %ld ms of %d ms",
+							jit_time, msecs)));
+		}
+	}

I think it should be a NOTICE (or less?)

Is it more useful if this is applied combined with log_min_duration_statement ?

It's easy to imagine a query for which the planner computes a high cost, but
actually runs quickly.  You might get a bunch of WARNINGs that the query took
10 MS and JIT was 75% of that, even if you don't care about queries that take
less than 10 SEC.
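
For example, something along these lines (a rough, untested sketch reusing the
variables from the quoted hunk; log_min_duration_statement is in milliseconds,
with -1 meaning disabled):

	if (jit_enabled && jit_warn_above_fraction > 0 &&
		log_min_duration_statement >= 0 &&
		msecs >= log_min_duration_statement)
	{
		/* ... compute jit_time as in the patch ... */

		if (jit_time > msecs * jit_warn_above_fraction)
			ereport(NOTICE,
					(errmsg("JIT time was %ld ms of %d ms",
							jit_time, msecs)));
	}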

I should say that this is already available by processing the output of
auto_explain.

-- 
Justin




Re: set TESTDIR from perl rather than Makefile

2022-02-24 Thread Justin Pryzby
On Mon, Feb 21, 2022 at 07:00:54AM -0500, Andrew Dunstan wrote:
> On 2/19/22 18:53, Justin Pryzby wrote:
> > On Sat, Feb 19, 2022 at 05:41:49PM -0600, Justin Pryzby wrote:
> >> I rebased and fixed the check-guc script to work, made it work with vpath
> >> builds, and cleaned it up some.
> > I also meant to also attach it.
> 
> This is going to break a bunch of stuff as written.
> 
> First, it's not doing the same thing. The current system sets TESTDIR to
> be the parent of the directory that holds the test. e.g. for
> src/bin/pg_ctl/t/001_start_stop.pl it's src/bin/pg_ctl in the build
> tree, not the 't' subdirectory. This patch apparently sets it to the 't'
> subdirectory. That will break anything that goes looking for log files
> in the current location, like the buildfarm client, and possibly some CI
> setups as well.

Yes, thanks.

I changed the patch to use ENV{CURDIR} || dirname(dirname($0)).  If I'm not
wrong, that seems to be doing the right thing.

> Also, unless I'm mistaken it appears to to the wrong thing for vpath
> builds:
> 
> my $test_dir = File::Spec->rel2abs(dirname($0));
> 
> is completely wrong for vpaths, since that will place it in the source
> tree, not the build tree.
> 
> Last, and for no explained reason that I can see, the patch undoes
> commit f4ce6c4d3a, but only for msvc builds. Even if that's justified it
> appears to have no relevance to this patch.

Andres' idea is that perl should set TESTDIR and PATH.  Here I commented out
PATH, and had the improbable issue that nothing seemed to be breaking,
including the pipeline test under msvc.  It'd be helpful to know what
configuration that breaks, so I can test that it's broken and then test that
it's fixed when PATH is set from within perl...

I got busy here, and may not be able to progress this for awhile.
From b46b405565a2fce3f96a1dcddf6d35ae7f3acc6d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 19 Feb 2022 13:06:52 -0600
Subject: [PATCH] wip: set TESTDIR from src/test/perl rather than
 Makefile/vcregress

These seem most likely to break:

make check -C src/bin/psql
make check -C src/bin/pgbench
make check -C src/test/modules/test_misc
make check -C src/test/modules/libpq_pipeline
PROVE_TESTS=t/027_stream_regress.pl make check -C src/test/recovery
---
 src/Makefile.global.in |  9 ++---
 src/test/perl/PostgreSQL/Test/Utils.pm | 17 ++---
 src/tools/msvc/vcregress.pl|  4 +---
 3 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c980444233..92649d0193 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -451,8 +451,9 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' \
+   PG_SUBDIR='$(CURDIR)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
@@ -461,8 +462,9 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(top_builddir)' \
+   PG_SUBDIR='$(CURDIR)' \
PG_REGRESS='$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
@@ -472,7 +474,8 @@ define prove_check
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
+   $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
+   PG_SUBDIR='$(CURDIR)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 46cd746796..74ccaa08d9 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -184,19 +184,22 @@ INIT
 	# test may still fail, but it's more likely to report useful facts.
 	$SIG{PIPE} = 'IGNORE';
 
-	# Determine output directories, and create them.  The base path is the
-	# TESTDIR environment variable, which is normally set by the invoking
-	# Makefile.
-	$tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
+	my $test_dir = File::Spec->rel2abs($ENV{PG_SUBDIR} || dirname(dirname($0)));
+
+	my $test_name = basename($0);
+	$test_name =~ s/\.[^.]+$//;
+
+	# Determine output d

Re: document that brin's autosummarize parameter is off by default

2022-02-24 Thread Justin Pryzby
Ten months ago, Jaime Casanova wrote:
> Hi everyone,
> 
> Just noted that the default value of autosummarize reloption for brin
> indexes is not documented, or at least not well documented.
> 
> I added the default value in create_index.sgml where other options
> mention their own defaults, also made a little change in brin.sgml to 
> make it more clear that is disabled by default (at least the way it 
> was written made no sense for me, but it could be that my english is 
> not that good).

It seems like "This last trigger" in the current text is intended to mean "The
second condition".  Your change improves that.

Should we also consider enabling autosummarize by default ?
It was added in v10, after BRIN was added in v9.5.  For us, brin wasn't usable
without autosummarize.

Also, note that vacuums are now triggered by insertions, since v13, so it might
be that autosummarize is needed much less.

-- 
Justin

PS. I hope there's a faster response to your pg_upgrade patch.




Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

2022-02-23 Thread Justin Pryzby
+   Whenever the checkpoint operation is running, the
+   pg_stat_progress_checkpoint view will contain a
+   single row indicating the progress of the checkpoint. The tables below

Maybe it should show a single row, unless the checkpointer isn't running at
all (like in single-user mode).

+   Process ID of a CHECKPOINTER process.

It's *the* checkpointer process.

pgstatfuncs.c has a whitespace issue (tab-space).

I suppose the functions should set provolatile.

-- 
Justin




Re: CLUSTER on partitioned index

2022-02-23 Thread Justin Pryzby
On Tue, Jul 20, 2021 at 08:27:02PM -0400, Alvaro Herrera wrote:
> I have to wonder if there really *is* a use case for CLUSTER in the
> first place on regular tables, let alone on partitioned tables, which
> are likely to be large and thus take a lot of time.  What justifies
> spending so much time on this implementation?  My impression is that
> CLUSTER is pretty much a fringe command nowadays, because of the access
> exclusive lock required.
> 
> Does anybody actually use it?

I hope that Alvaro will comment on the simplified patches.  If multiple people
think the patch isn't worth it, feel free to close it.  But I don't see how
complexity could be the reason.

-- 
Justin




Re: shared_preload_libraries = 'pg_stat_statements' failing with installcheck (compute_query_id = auto)

2022-02-22 Thread Justin Pryzby
On Tue, Feb 22, 2022 at 11:04:16AM +0900, Michael Paquier wrote:
> On Fri, Feb 18, 2022 at 05:38:56PM +0800, Julien Rouhaud wrote:
> > On Fri, Feb 18, 2022 at 05:22:36PM +0900, Michael Paquier wrote:
> >> So, I have been looking at this problem, and I don't see a problem in
> >> doing something like the attached, where we add a "regress" mode to
> >> compute_query_id that is a synonym of "auto". Or, in short, we have
> >> the default of letting a module decide if a query ID can be computed
> >> or not, at the exception that we hide its result in EXPLAIN outputs.

> Okay, thanks.  I have worked on that today and applied the patch down

While rebasing, I noticed that this patch does part of what I added in another
thread.

https://commitfest.postgresql.org/37/3409/
| explain_regress, explain(MACHINE), and default to explain(BUFFERS)

-   if (es->verbose && plannedstmt->queryId != UINT64CONST(0))
+   if (es->verbose && plannedstmt->queryId != UINT64CONST(0) && es->machine)
    {
        /*
         * Output the queryid as an int64 rather than a uint64 so we match

Apparently, I first wrote this two years ago today.

-- 
Justin




wal_compression=zstd

2022-02-22 Thread Justin Pryzby
Since 4035cd5d4, wal_compression=lz4 is supported.  I think pg15 should also
support wal_compression=zstd.  There are free bits in the WAL header.

The last message on that thread includes a patch doing just that, which I've
rebased.
https://www.postgresql.org/message-id/ynqwd2gsmrnqw...@paquier.xyz

It might be nice to also support wal_compression=zstd-level, but that seems
optional and could wait for pg16...

In [0], I showed a case where lz4 is just as fast as uncompressed data, and
writes ~60% as much.  zstd writes half as much as LZ4 and 12% slower.
[0] https://www.postgresql.org/message-id/20210614012412.GA31772%40telsasoft.com

I am not claiming that zstd is generally better for WAL.  Rather, if we're
going to support alternate compression methods, it's nice to give a couple
options (in addition to pglz).  Some use cases would certainly suffer from
slower WAL.  But providing the option will benefit some others.  Note that a
superuser can set wal_compression for a given session - this would probably
benefit bulk-loading in an otherwise OLTP environment.

As written, this patch uses zstd level=1 (whereas ZSTD's default compression
level is 6).

0001 adds support for zstd
0002 makes our use of bits in the WAL header more efficient
0003 makes it the default compression, to exercise on CI (windows will fail).
0004 adds support for zstd-level
0005-6 are needed to allow recovery tests to pass when wal compression is 
enabled;
0007 (not included) also adds support for zlib.  I'm of the impression nobody
 cares about this, otherwise it would've been included 5-10 years ago.

Only 0001 should be reviewed for pg15 - the others are optional/future work.
From 9253013c789ffb121272bfeeaa9dcdebbef79ced Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 18 Feb 2022 22:54:18 -0600
Subject: [PATCH 1/6] add wal_compression=zstd

---
 doc/src/sgml/config.sgml  |  9 +++---
 doc/src/sgml/installation.sgml| 10 +++
 src/backend/access/transam/xloginsert.c   | 30 ++-
 src/backend/access/transam/xlogreader.c   | 20 +
 src/backend/utils/misc/guc.c  |  3 ++
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/bin/pg_waldump/pg_waldump.c   |  2 ++
 src/include/access/xlog.h |  3 +-
 src/include/access/xlogrecord.h   |  5 +++-
 9 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7ed8c82a9d..97740c7b66 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3154,10 +3154,11 @@ include_dir 'conf.d'
 server compresses full page images written to WAL when
  is on or during a base backup.
 A compressed page image will be decompressed during WAL replay.
-The supported methods are pglz and
-lz4 (if PostgreSQL was
-compiled with --with-lz4). The default value is
-off. Only superusers can change this setting.
+The supported methods are pglz, and
+(if configured when PostgreSQL was built) 
+lz4 and zstd.
+The default value is off.
+Only superusers can change this setting.

 

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 0f74252590..d32767b229 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -271,6 +271,14 @@ su - postgres
  
 
 
+
+ 
+  The ZSTD library can be used to enable
+  compression using that method; see
+  .
+ 
+
+
 
  
   To build the PostgreSQL documentation,
@@ -988,6 +996,8 @@ build-postgresql:

 
  Build with ZSTD compression support.
+ This enables use of ZSTD for
+ compression of WAL data.
 

   
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index c260310c4c..847ce0c3fc 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -44,9 +44,17 @@
 #define LZ4_MAX_BLCKSZ		0
 #endif
 
+#ifdef USE_ZSTD
+#include 
+#define ZSTD_MAX_BLCKSZ		ZSTD_COMPRESSBOUND(BLCKSZ)
+#else
+#define ZSTD_MAX_BLCKSZ		0
+#endif
+
+/* Buffer size required to store a compressed version of backup block image */
 #define PGLZ_MAX_BLCKSZ		PGLZ_MAX_OUTPUT(BLCKSZ)
 
-#define COMPRESS_BUFSIZE	Max(PGLZ_MAX_BLCKSZ, LZ4_MAX_BLCKSZ)
+#define COMPRESS_BUFSIZE	Max(Max(PGLZ_MAX_BLCKSZ, LZ4_MAX_BLCKSZ), ZSTD_MAX_BLCKSZ)
 
 /*
  * For each block reference registered with XLogRegisterBuffer, we fill in
@@ -695,6 +703,14 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 #endif
 		break;
 
+	case WAL_COMPRESSION_ZSTD:
+#ifdef USE_ZSTD
+		bimg.bimg_info |= BKPIMAGE_COMPRESS_ZSTD;
+#else
+		elog(ERROR, "ZSTD is not supported by this build");
+#endif
+		break;
+
 	case WAL_COMPRESSION_NONE:
 		Assert(false);	/* 

Re: Adding CI to our tree (ccache)

2022-02-20 Thread Justin Pryzby
On Sun, Feb 20, 2022 at 12:47:31PM -0800, Andres Freund wrote:
> > Did you ever try to use clcache (or others) ?
> > 
> > When I tried, it refused to cache because of our debug settings
> > (DebugInformationFormat) - which seem to be enabled even in release mode.
> 
> > I wonder if that'll be an issue for ccache, too.  I think that line may 
> > need to
> > be conditional on debug mode.
> 
> That's relatively easily solvable by using a different debug format IIRC (/Z7
> or such).

Yes. I got that working for CI by overriding with a value from the environment.
https://cirrus-ci.com/task/6191974075072512

This is right after rebasing, so it doesn't save anything, but normally cuts
build time to 90sec, which isn't impressive, but it's something.

BTW, I think it's worth compiling the windows build with optimizations (as I
did here).  At least with all the tap tests, this pays for itself.  I suppose
you don't want to use a Release build, but optimizations could be enabled by
an(other) environment variable.

-- 
Justin




Re: Adding CI to our tree (ccache)

2022-02-20 Thread Justin Pryzby
Have you tried to use the yet-to-be-released ccache with MSVC ?

Also, do you know about msbuild /outputResultsCache ?
When I tried that, it gave a bunch of errors.

https://cirrus-ci.com/task/5697497241747456

|[16:35:13.605]  1>c:\cirrus\pgsql.sln.metaproj : error : MSB4252: Project 
"c:\cirrus\pgsql.sln" with global properties [c:\cirrus\pgsql.sln]
|[16:35:13.615] c:\cirrus\pgsql.sln.metaproj : error : 
(TrackFileAccess=false; CLToolExe=clcache.exe) [c:\cirrus\pgsql.sln]
|[16:35:13.615] c:\cirrus\pgsql.sln.metaproj : error : is building project 
"c:\cirrus\initdb.vcxproj" with global properties [c:\cirrus\pgsql.sln]
|[16:35:13.615] c:\cirrus\pgsql.sln.metaproj : error : 
(TrackFileAccess=false; CLToolExe=clcache.exe; BuildingSolutionFile=true; 
CurrentSolutionConfigurationContents= 
[c:\cirrus\pgsql.sln]
|[16:35:13.615] c:\cirrus\pgsql.sln.metaproj : error :   Debug|x64 
[c:\cirrus\pgsql.sln]
|...
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error :   
Debug|x64 
[c:\cirrus\pgsql.sln]
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error : 
; SolutionDir=c:\cirrus\; SolutionExt=.sln; 
SolutionFileName=pgsql.sln; SolutionName=pgsql; 
SolutionPath=c:\cirrus\pgsql.sln; Configuration=Debug; Platform=x64) 
[c:\cirrus\pgsql.sln]
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error : with the 
(default) target(s) but the build result for the built project is not in the 
engine cache. In isolated builds this could mean one of the following: 
[c:\cirrus\pgsql.sln]
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error : - the 
reference was called with a target which is not specified in the 
ProjectReferenceTargets item in project "c:\cirrus\pgsql.sln" 
[c:\cirrus\pgsql.sln]
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error : - the 
reference was called with global properties that do not match the static graph 
inferred nodes [c:\cirrus\pgsql.sln]
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error : - the 
reference was not explicitly specified as a ProjectReference item in project 
"c:\cirrus\pgsql.sln" [c:\cirrus\pgsql.sln]
|[16:35:14.518]c:\cirrus\pgsql.sln.metaproj : error :  
[c:\cirrus\pgsql.sln]
|[16:35:14.518] 
|[16:35:14.518] 0 Warning(s)
|[16:35:14.518] 149 Error(s)

Did you ever try to use clcache (or others) ?

When I tried, it refused to cache because of our debug settings
(DebugInformationFormat) - which seem to be enabled even in release mode.

I wonder if that'll be an issue for ccache, too.  I think that line may need to
be conditional on debug mode.

https://cirrus-ci.com/task/4808554103177216

|[17:14:28.765]   C:\ProgramData\chocolatey\lib\clcache\clcache\clcache.py 
Expanded commandline '['/c', '/Isrc/include', '/Isrc/include/port/win32', 
'/Isrc/include/port/win32_msvc', '/Ic:/openssl/1.1/\\include', '/Zi', 
'/nologo', '/W3', '/WX-', '/diagnostics:column', '/Ox', '/D', 'WIN32', '/D', 
'_WINDOWS', '/D', '__WINDOWS__', '/D', '__WIN32__', '/D', 
'WIN32_STACK_RLIMIT=4194304', '/D', '_CRT_SECURE_NO_DEPRECATE', '/D', 
'_CRT_NONSTDC_NO_DEPRECATE', '/D', 'FRONTEND', '/D', '_MBCS', '/GF', '/Gm-', 
'/EHsc', '/MD', '/GS', '/fp:precise', '/Zc:wchar_t', '/Zc:forScope', 
'/Zc:inline', '/Fo.\\Release\\libpgcommon\\', 
'/Fd.\\Release\\libpgcommon\\libpgcommon.pdb', '/external:W3', '/Gd', '/TC', 
'/wd4018', '/wd4244', '/wd4273', '/wd4101', '/wd4102', '/wd4090', '/wd4267', 
'/FC', '/errorReport:queue', '/MP', 'src/common/archive.c', 
'src/common/base64.c', 'src/common/checksum_helper.c', 
'src/common/config_info.c', 'src/common/controldata_utils.c', 
'src/common/cryptohash_openssl.c', 'src/common/d2s.c', 'src/common/encnames.c', 
'src/common/exec.c', 'src/common/f2s.c', 'src/common/fe_memutils.c', 
'src/common/file_perm.c', 'src/common/file_utils.c', 'src/common/hashfn.c', 
'src/common/hmac_openssl.c', 'src/common/ip.c', 'src/common/jsonapi.c', 
'src/common/keywords.c', 'src/common/kwlookup.c', 'src/common/link-canary.c', 
'src/common/logging.c', 'src/common/md5_common.c', 'src/common/pg_get_line.c', 
'src/common/pg_lzcompress.c', 'src/common/pg_prng.c', 'src/common/pgfnames.c', 
'src/common/protocol_openssl.c', 'src/common/psprintf.c', 
'src/common/relpath.c', 'src/common/restricted_token.c', 'src/common/rmtree.c', 
'src/common/saslprep.c', 'src/common/scram-common.c', 'src/common/sprompt.c', 
'src/common/string.c', 'src/common/stringinfo.c', 'src/common/unicode_norm.c', 
'src/common/username.c', 'src/common/wait_error.c', 'src/common/wchar.c']'
|[17:14:28.765]   C:\ProgramData\chocolatey\lib\clcache\clcache\clcache.py 
Cannot cache invocation as ['/c', '/Isrc/include', '/Isrc/include/port/win32', 
'/Isrc/include/port/win32_msvc', '/Ic:/openssl/1.1/\\include', '/Zi', 
'/nologo', '/W3', '/WX-', '/diagnostics:column', '/Ox', '/D', 'WIN32', '/D', 
'_WINDOWS', '/D', '__WINDOWS__', '/D', '__WIN32__', '/D', 
'WIN32_STACK_RLIMIT=4194304', '/D', '_CRT_SECURE_NO_DEPRECATE', '/D', 
'_CRT_NONSTDC_NO_DEPRECATE', '/D', 

Re: Uniforms the errors msgs at tuplestore paths

2022-02-20 Thread Justin Pryzby
On Sun, Feb 20, 2022 at 11:12:42AM -0300, Ranier Vilela wrote:
> Like how the commit
> https://github.com/postgres/postgres/commit/07daca53bfcad59618a9c6fad304e380cc9d2bc1
> The are some paths that were missed:

I think these are all unified by the existing tuplestore patch.
https://commitfest.postgresql.org/37/3420/

> -At jsonfuncs.c
>  The errors mgs do not report about the materialize mode.
> -At plperl.c
>  The function plperl_return_next_internal does not check rsi appropriately.
>  The error about materialize mode required, is missed.
> -At pl_exec.c
>  The error mgs do not report about the materialize mode
> -At pltcl.c:
>  The function pltcl_init_tuple_store does not check  rsi appropriately.




Re: set TESTDIR from perl rather than Makefile

2022-02-19 Thread Justin Pryzby
On Sat, Feb 19, 2022 at 05:41:49PM -0600, Justin Pryzby wrote:
> I rebased and fixed the check-guc script to work, made it work with vpath
> builds, and cleaned it up some.

I also meant to also attach it.

> This (and other) patches ran here.
> https://github.com/justinpryzby/postgres/runs/5261323874
> ...
> e806bcb280 wip: set TESTDIR from src/test/perl rather than Makefile/vcregress
From e806bcb280f1093c0a0e71c9a0b5617f938c4b86 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 19 Feb 2022 13:06:52 -0600
Subject: [PATCH] wip: set TESTDIR from src/test/perl rather than
 Makefile/vcregress

---
 src/Makefile.global.in   |  6 +++---
 .../libpq_pipeline/t/001_libpq_pipeline.pl   |  3 ++-
 src/test/modules/test_misc/t/003_check_guc.pl|  2 +-
 src/test/perl/PostgreSQL/Test/Utils.pm   | 16 +---
 src/tools/msvc/vcregress.pl  |  4 +---
 5 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index c980444233..ff20387c42 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -451,7 +451,7 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
@@ -461,7 +461,7 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(top_builddir)' \
PG_REGRESS='$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
@@ -472,7 +472,7 @@ define prove_check
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
+   $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
diff --git a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
index 0c164dcaba..2a0eca7744 100644
--- a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
+++ b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
@@ -49,7 +49,8 @@ for my $testname (@tests)
 		my $expected;
 		my $result;
 
-		$expected = slurp_file_eval("traces/$testname.trace");
+		my $inputdir = "$ENV{'TESTDIR'}/tmp_check";
+		$expected = slurp_file_eval("$inputdir/traces/$testname.trace");
 		next unless $expected ne "";
 		$result = slurp_file_eval($traceout);
 		next unless $result ne "";
diff --git a/src/test/modules/test_misc/t/003_check_guc.pl b/src/test/modules/test_misc/t/003_check_guc.pl
index 60459ef759..f4c1636240 100644
--- a/src/test/modules/test_misc/t/003_check_guc.pl
+++ b/src/test/modules/test_misc/t/003_check_guc.pl
@@ -35,7 +35,7 @@ my @not_in_sample_array = split("\n", lc($not_in_sample));
 
 # TAP tests are executed in the directory of the test, in the source tree,
 # even for VPATH builds, so rely on that to find postgresql.conf.sample.
-my $rootdir = "../../../..";
+my $rootdir = "$ENV{TESTDIR}/../../../../..";
 my $sample_file = "$rootdir/src/backend/utils/misc/postgresql.conf.sample";
 
 # List of all the GUCs found in the sample file.
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 31e2b0315e..8a8d95ca8c 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -184,19 +184,21 @@ INIT
 	# test may still fail, but it's more likely to report useful facts.
 	$SIG{PIPE} = 'IGNORE';
 
-	# Determine output directories, and create them.  The base path is the
-	# TESTDIR environment variable, which is normally set by the invoking
-	# Makefile.
-	$tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
+	my $test_dir = File::Spec->rel2abs(dirname($0));
+	my $test_name = basename($0);
+	$test_name =~ s/\.[^.]+$//;
+
+	# Determine output directories, and create them.
+	# TODO: set srcdir?
+	$tmp_check = "$test_dir/tmp_check";
 	$log_path = "$tmp_check/log";
+	$ENV{TESTDIR} = $test_dir;
 
 	mkdir $tmp_check;
 	mkdir $log_path;
 
 	# Open the test log file, whose name depends on the test name.
-	$test_logfile = basename($0)

set TESTDIR from perl rather than Makefile

2022-02-19 Thread Justin Pryzby
Forking: 

On Tue, Feb 15, 2022 at 10:42:09PM -0800, Andres Freund wrote:
> >> I was thinking that we should make Utils.pm's INIT block responsible for
> >> figuring out both the directory a test should run in and the log location,
> >> instead having that in vcregress.pl and Makefile.global.in. Mostly because
> >> doing it in the latter means we can't start tests with different TESTDIR 
> >> and
> >> working dir at the same time.
> >> 
> >> If instead we pass the location of the top-level build and top-level source
> >> directory from vcregress.pl / Makefile.global, the tap test infrastructure 
> >> can
> >> figure out that stuff themselves, on a per-test basis.
> >> 
> >> For msvc builds we probably would need to pass in some information that 
> >> allow
> >> Utils.pm to set up PATH appropriately. I think that might just require 
> >> knowing
> >> that a) msvc build system is used b) Release vs Debug.
> >
> >I'm totally unsure if this resembles what you're thinking of, and I'm 
> >surprised
> >I got it working so easily.  But it gets the tap test output in separate 
> >dirs,
> >and CI is passing for everyone (windows failed because I injected a "false" 
> >to
> >force it to upload artifacts).
> >
> >https://github.com/justinpryzby/postgres/runs/5211673291
> 
> Yes, that's along the lines I was thinking. I only checked it on my phone, so 
> it certainly isn't a careful look...
> 
> I think this should be discussed in a separate thread, for visibility.

I rebased and fixed the check-guc script to work, made it work with vpath
builds, and cleaned it up some.

There may be other reasons to do this, but the reason I did it is to implement
an alltaptests target for vcregress, for cirrus (and everyone else).  If all
the tap tests are run serially, it takes ~16min on cirrus; it takes ~13 to run
in parallel (the below run is slower than that since all the slow tests were
scheduled at once - which isn't always a good idea).

Running the tests in parallel uses a single invocation of prove, and starts all
the TAP tests from the same dir rather than their own dir.  But without this
patch, all the tap tests' output goes to the same dir, which is a pain to look
through.

This (and other) patches ran here.
https://github.com/justinpryzby/postgres/runs/5261323874
...
e806bcb280 wip: set TESTDIR from src/test/perl rather than Makefile/vcregress
a1bfa8e1a6 cirrus/windows: increase timout to 20min
5479b44198 vcregress: add alltaptests
fcef696c7d vcregress: run alltaptests in parallel
2d7dba13dd cirrus/windows: prove --state to run tests in order
7edf835d43 tmp: run tap tests first
6fb010c137 cirrus: include hints how to install OS packages..
28a25f12c3 cirrus/windows: add compiler_warnings_script
75bc8cff69 cirrus: upload changed html docs as artifacts
8008af2480 s!build docs as a separate task..
a3bf699a0e vcregress/ci: test modules/contrib with NO_INSTALLCHECK=1
d28ce46c2f wip: cirrus: code coverage

-- 
Justin




Re: pg_upgrade verbosity when redirecting output to log file

2022-02-18 Thread Justin Pryzby
+* If outputting to a tty / or , append newline. pg_log_v() will put the
+* individual progress items onto the next line.
+*/
+   if (log_opts.isatty || log_opts.verbose)

I guess the comment should say "or in verbose mode".
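That is, something like (a sketch of the suggested wording; the condition is
copied from the quoted hunk and only the comment changes):

	/*
	 * If outputting to a tty, or in verbose mode, append a newline.
	 * pg_log_v() will put the individual progress items onto the next line.
	 */
	if (log_opts.isatty || log_opts.verbose)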

-- 
Justin




Re: Assert in pageinspect with NULL pages

2022-02-17 Thread Justin Pryzby
BRIN can also crash if passed a non-brin index.

I've been sitting on this one for a while.  Feel free to include it in your
patchset.

commit 08010a6037fc4e24a9ba05e5386e766f4310d35e
Author: Justin Pryzby 
Date:   Tue Jan 19 00:25:15 2021 -0600

pageinspect: brin_page_items(): check that given relation is not only an index, but a brin one

diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index f1e64a39ef2..3de6dd943ef 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -16,6 +16,7 @@
 #include "access/brin_tuple.h"
 #include "access/htup_details.h"
 #include "catalog/index.h"
+#include "catalog/pg_am.h"
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "lib/stringinfo.h"
@@ -59,7 +60,7 @@ brin_page_type(PG_FUNCTION_ARGS)
if (raw_page_size != BLCKSZ)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg("input page too small"),
+errmsg("input page wrong size"),
 errdetail("Expected size %d, got %d",
   BLCKSZ, raw_page_size)));
 
@@ -97,7 +98,7 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
if (raw_page_size != BLCKSZ)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-errmsg("input page too small"),
+errmsg("input page wrong size"),
 errdetail("Expected size %d, got %d",
   BLCKSZ, raw_page_size)));
 
@@ -169,7 +170,14 @@ brin_page_items(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldcontext);
 
indexRel = index_open(indexRelid, AccessShareLock);
-   bdesc = brin_build_desc(indexRel);
+
+   /* Must be a BRIN index */
+   if (indexRel->rd_rel->relkind != RELKIND_INDEX ||
+   indexRel->rd_rel->relam != BRIN_AM_OID)
+   ereport(ERROR,
+   (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+errmsg("\"%s\" is not a BRIN index",
+RelationGetRelationName(indexRel))));
 
/* minimally verify the page we got */
page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular");
@@ -178,6 +186,7 @@ brin_page_items(PG_FUNCTION_ARGS)
 * Initialize output functions for all indexed datatypes; simplifies
 * calling them later.
 */
+   bdesc = brin_build_desc(indexRel);
columns = palloc(sizeof(brin_column_state *) * RelationGetDescr(indexRel)->natts);
for (attno = 1; attno <= bdesc->bd_tupdesc->natts; attno++)
{




Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

2022-02-17 Thread Justin Pryzby
On Wed, Feb 16, 2022 at 07:43:09PM -0800, Peter Geoghegan wrote:
> > I also have a hard time making heads or tails out of the commit message of
> > 44fa84881ff. It's quite long without being particularly descriptive. The
> > commit just changes a lot of things at once, making it hard to precisely
> > describe and very hard to review and understand.
> 
> The commit message is high level. The important point is that we can
> more or less treat all scanned_pages the same, without generally
> caring about whether or not a cleanup lock could be acquired (since
> the exceptions where that isn't quite true are narrow and rare). That
> is the common theme, for everything.

As for myself, I particularly appreciated the clarity and detail of this commit
message.  (I have looked at the associated change a bit, but not deeply).

Thanks,
-- 
Justin




Re: adding 'zstd' as a compression algorithm (initdb/lz4)

2022-02-16 Thread Justin Pryzby
On Tue, Feb 15, 2022 at 11:54:10AM -0800, Andres Freund wrote:
> > Isn't it an incontrovertible fact that LZ4 is superior to pglz in
> > every way? LZ4 is pretty much its successor. And so it seems totally
> > fine to assume that users will always want to use the clearly better
> > option, and that that option will be generally available going
> > forward. TOAST compression is applied selectively already, based on
> > various obscure implementation details, some of which are quite
> > arbitrary.
> 
> Yea, we should really default to lz4 in initdb when available.

This patch intends to implement that.  I have no particular interest in this,
but if anyone wants, I will add it to the next CF (or the one after that).

commit 2a3c5950e625ccfaebc49bbf71b8db16dc143cd2
Author: Justin Pryzby 
Date:   Tue Feb 15 19:14:33 2022 -0600

initdb: default_toast_compression=lz4 if available

TODO: consider same for wal_compression

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fc63172efde..f9cd2ef7229 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8537,7 +8537,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
 The supported compression methods are pglz and
 (if PostgreSQL was compiled with
 --with-lz4) lz4.
-The default is pglz.
+The default is lz4 if available at the time 
+PostgreSQL was compiled, otherwise
+pglz.

   
  
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f505413a7f9..6b6f6efaba1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4726,7 +4726,11 @@ static struct config_enum ConfigureNamesEnum[] =
NULL
},
&default_toast_compression,
+#ifdef USE_LZ4
+   TOAST_LZ4_COMPRESSION,
+#else
TOAST_PGLZ_COMPRESSION,
+#endif
default_toast_compression_options,
NULL, NULL, NULL
},
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index d78e8e67b8d..bb7c57e00fa 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -1185,6 +1185,12 @@ setup_config(void)
 							  "#update_process_title = off");
 #endif
 
+#ifdef USE_LZ4
+	conflines = replace_token(conflines,
+							  "#default_toast_compression = 'pglz'",
+							  "#default_toast_compression = 'lz4'");
+#endif
+
/*
 * Change password_encryption setting to md5 if md5 was chosen as an
 * authentication method, unless scram-sha-256 was also chosen.




Re: Adding CI to our tree

2022-02-15 Thread Justin Pryzby
On Sun, Feb 13, 2022 at 01:53:19PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2022-02-13 15:31:20 -0600, Justin Pryzby wrote:
> > Oh - I suppose you're right.  That's an unfortunate consequence of running a
> > single prove instance without chdir.
> 
> I don't think it's chdir that's relevant (that changes into the source
> directory after all). It's the TESTDIR environment variable.
> 
> I was thinking that we should make Utils.pm's INIT block responsible for
> figuring out both the directory a test should run in and the log location,
> instead having that in vcregress.pl and Makefile.global.in. Mostly because
> doing it in the latter means we can't start tests with different TESTDIR and
> working dir at the same time.
> 
> If instead we pass the location of the top-level build and top-level source
> directory from vcregress.pl / Makefile.global, the tap test infrastructure can
> figure out that stuff themselves, on a per-test basis.
> 
> For msvc builds we probably would need to pass in some information that allow
> Utils.pm to set up PATH appropriately. I think that might just require knowing
> that a) msvc build system is used b) Release vs Debug.

I'm totally unsure if this resembles what you're thinking of, and I'm surprised
I got it working so easily.  But it gets the tap test output in separate dirs,
and CI is passing for everyone (windows failed because I injected a "false" to
force it to upload artifacts).

https://github.com/justinpryzby/postgres/runs/5211673291

commit 899e562102dd7a663cb087cdf88f0f78f8302492
Author: Justin Pryzby 
Date:   Tue Feb 15 20:02:36 2022 -0600

wip: set TESTDIR from src/test/perl rather than Makefile/vcregress

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 05c54b27def..1e49d8c8c37 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -450,7 +450,7 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if 
$(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
@@ -460,7 +460,7 @@ define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
+   PATH="$(bindir):$(CURDIR):$$PATH" \
PGPORT='6$(DEF_PGPORT)' top_builddir='$(top_builddir)' \
PG_REGRESS='$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if 
$(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
@@ -471,7 +471,7 @@ define prove_check
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
 cd $(srcdir) && \
-   TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
+   $(with_temp_install) PGPORT='6$(DEF_PGPORT)' \
PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if 
$(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
diff --git a/src/bin/psql/t/010_tab_completion.pl 
b/src/bin/psql/t/010_tab_completion.pl
index 005961f34d4..a86dc78a365 100644
--- a/src/bin/psql/t/010_tab_completion.pl
+++ b/src/bin/psql/t/010_tab_completion.pl
@@ -70,7 +70,7 @@ delete $ENV{LS_COLORS};
 # to run in the build directory so that we can use relative paths to
 # access the tmp_check subdirectory; otherwise the output from filename
 # completion tests is too variable.
-if ($ENV{TESTDIR})
+if ($ENV{TESTDIR} && 0)
 {
chdir $ENV{TESTDIR} or die "could not chdir to \"$ENV{TESTDIR}\": $!";
 }
diff --git a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl 
b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
index facfec5cad4..2a0eca77440 100644
--- a/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
+++ b/src/test/modules/libpq_pipeline/t/001_libpq_pipeline.pl
@@ -49,9 +49,7 @@ for my $testname (@tests)
my $expected;
my $result;
 
-   # Hack to allow TESTDIR=. during parallel tap tests
-   my $inputdir = 
"$ENV{'TESTDIR'}/src/test/modules/libpq_pipeline";
-   $inputdir = "$ENV{'TESTDIR'}" if ! -e $inputdir;
+   my $inputdir = "$ENV{'TESTDIR'}/tmp_check";
$expected = slurp_file_eval("$inputdir/traces/$testname.trace");
next unless $expected ne "";
$result = slurp_file_eval($traceout);
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm 
b/src/test/perl/PostgreSQL/Test/Utils.pm
index 57fcb240898..5429de41ed5 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/Pos

Re: make tuplestore helper function

2022-02-15 Thread Justin Pryzby
On Tue, Jan 11, 2022 at 10:19:33AM -0500, Melanie Plageman wrote:
> > If the expectedDesc NULL check were an 0001 patch, then 0002 (the main 
> > patch)
> > would be even easier to review.  Only foreign.c is different.
> 
> I'll wait to do that if preferred by committer.
> Are you imagining that patch 0001 would only add the check for
> expectedDesc that is missing from pg_config() and text_to_table()?

Yes, that's what I meant, but I haven't been able to determine if a NULL check
should be added at all.

I found this commit message, which indicates that it shouldn't be used in every
case.  I'm not clear when it should/not be used, though.

commit 60651e4cddbb77e8f1a0c7fc0be6a7e7bf626fe0
Author: Tom Lane 
Date:   Sat Oct 28 14:02:21 2017 -0400

Support domains over composite types in PL/Perl.

In passing, don't insist on rsi->expectedDesc being set unless we
actually need it; this allows succeeding in a couple of cases where
PL/Perl functions returning setof composite would have failed before,
and makes the error message more apropos in other cases.




Re: Small and unaffected typo in pg_logical_slot_get_changes_guts()

2022-02-15 Thread Justin Pryzby
On Wed, Feb 16, 2022 at 11:27:58AM +0900, Kasahara Tatsuhito wrote:
> -   tuplestore_donestoring(tupstore);
> +   tuplestore_donestoring(p->tupstore);

Melanie's tuplestore patch also removes the bogus line.
https://www.postgresql.org/message-id/20220106005748.GT14051%40telsasoft.com

-- 
Justin




Re: adding 'zstd' as a compression algorithm

2022-02-15 Thread Justin Pryzby
On Tue, Feb 15, 2022 at 03:23:30PM -0500, Robert Haas wrote:
> On Tue, Feb 15, 2022 at 1:55 PM Justin Pryzby  wrote:
> > Regarding the patch:
> >
> > I suppose it should support windows, and whatever patches use zstd should
> > update the install instructions.  See 9ca40dcd4, 02a93e7ef, 4035cd5d4.
> 
> Thanks, this is helpful. To me, it looks like we need something
> similar to 9ca40dcd4, plus something like the last hunk of 02a93e7ef.

If you mean a patch which only adds the configure options, I don't think 02a9
is applicable at all.

What I mean is that any patches which *use* the zstd support should update
those for completeness.

 doc/src/sgml/config.sgml  | 14 +-
 doc/src/sgml/install-windows.sgml |  2 +-
 doc/src/sgml/installation.sgml|  5 +++--

> The rest of 02a93e7ef and all of 4035cd5d4 seem like they address the
> application of lz4 rather than just adding support for it, so I don't
> see the need for those elements in this patch. Am I missing something?




Re: adding 'zstd' as a compression algorithm

2022-02-15 Thread Justin Pryzby
I think the WAL patch (4035cd5d4) should support zstd if library support is
added.  A supplementary patch for that already exists.
https://www.postgresql.org/message-id/ynqwd2gsmrnqw...@paquier.xyz

There's also some in-progress patches:

 - Konstantin and Daniil have a patch to add libpq compression, which ultimately
   wants to use zstd.
   https://commitfest.postgresql.org/37/3499/

 - I had a patch to add zstd to pg_dump, but I ended up storing backups to
   ZFS+zstd rather than waiting to progress the patch.
   https://commitfest.postgresql.org/32/2888/

Regarding the patch:

I suppose it should support windows, and whatever patches use zstd should
update the install instructions.  See 9ca40dcd4, 02a93e7ef, 4035cd5d4.




Re: refactoring basebackup.c (zstd)

2022-02-15 Thread Justin Pryzby
+++ b/configure

@@ -801,6 +805,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir

There are superfluous changes to ./configure, unrelated to the changes in
configure.ac.  Probably because you're using a different version of autotools,
or a vendor's patched copy.  You can remove the changes with git checkout -p or
similar.

+++ b/src/backend/replication/basebackup_zstd.c
+bbsink *
+bbsink_zstd_new(bbsink *next, int compresslevel)
+{
+#ifndef HAVE_LIBZSTD
+   ereport(ERROR,
+   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+errmsg("zstd compression is not supported by this 
build")));
+#else

This should have a return; like what's added by 71ce8 and 302612a6c.
Also, the parens() around errcode aren't needed since last year.

+   bbsink_zstd *sink;
+
+   Assert(next != NULL);
+   Assert(compresslevel >= 0 && compresslevel <= 22);
+
+   if (compresslevel < 0 || compresslevel > 22)
+   ereport(ERROR,

This looks like dead code in assert builds.
If it's unreachable, it can be elog().

+ * Compress the input data to the output buffer until we run out of input
+ * data. Each time the output buffer falls below the compression bound for
+ * the input buffer, invoke the archive_contents() method for then next sink.

*the next sink ?

Does anyone plan to include this for pg15 ?  If so, I think at least the WAL
compression should have support added too.  I'd plan to rebase Michael's patch.
https://www.postgresql.org/message-id/ynqwd2gsmrnqw...@paquier.xyz

-- 
Justin




Re: Time to increase hash_mem_multiplier default?

2022-02-15 Thread Justin Pryzby
On Mon, Feb 14, 2022 at 10:32:43PM -0800, Peter Geoghegan wrote:
> On Wed, Jan 19, 2022 at 11:32 AM John Naylor
>  wrote:
> > I don't have anything really profound to say here, but in the last
> > year I did on a couple occasions recommend clients to raise
> > hash_mem_multiplier to 2.0 to fix performance problems.
> 
> I would like to push ahead with an increase in the default for
> Postgres 15, to 2.0.
> 
> Any objections to that plan?

The only reason not to is that a single-node hash-aggregate plan will now use
2x work_mem.  Which won't make sense to someone who doesn't deal with
complicated plans (and who doesn't know that work_mem is per-node and can be
used multiplicatively).  I don't see how one could address that other than to
change hash_mem_multiplier to nonhash_mem_divider.

It'll be in the release notes, so should be fine.

-- 
Justin




Re: Postgres 14.2 Windows can't rename temporary statistics file

2022-02-14 Thread Justin Pryzby
On Mon, Feb 14, 2022 at 09:19:54PM -0300, Ranier Vilela wrote:
> I've reported this issue, but without success in fixing it.

It'd be helpful to provide a link to the prior discussions, and summarize it.

https://wiki.postgresql.org/wiki/PostgreSQL_14_Open_Items
https://www.postgresql.org/message-id/CAEudQAoR5e7=umz0otzucvb25ztc8qqbe+2dt1jrsa3u+xu...@mail.gmail.com

See also e2f0f8ed251d02c1eda79e1ca3cb3db2681e7a86

-- 
Justin




Re: Adding CI to our tree

2022-02-13 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 02:26:25PM -0800, Andres Freund wrote:
> On 2022-02-12 16:06:40 -0600, Justin Pryzby wrote:
> > I had some success with that, but it doesn't seem to be significantly 
> > faster -
> > it looks a lot like the tests are not actually running in parallel.

Note that the total test time is close to the sum of the individual test times.
But I think that may be an artifact of how prove is showing/attributing times
to each test (which, if so, is misleading).

> Note that prove unfortunately serializes the test output to be in the order it
> started them, even when tests run concurrently. Extremely unhelpful, but ...

Are you sure ?  When I run it locally, I see:
rm -fr src/test/recovery/tmp_check ; time PERL5LIB=`pwd`/src/test/perl 
TESTDIR=`pwd`/src/test/recovery 
PATH=`pwd`/tmp_install/usr/local/pgsql/bin:$PATH 
PG_REGRESS=`pwd`/src/test/regress/pg_regress 
REGRESS_SHLIB=`pwd`/src/test/regress/regress.so prove --time -j4 --ext '*.pl' 
`find src -name t`
...
[15:34:48] src/bin/scripts/t/101_vacuumdb_all.pl ... ok 
 104 ms ( 0.00 usr  0.00 sys +  2.35 cusr  0.47 csys =  2.82 CPU)
[15:34:49] src/bin/scripts/t/090_reindexdb.pl .. ok 
8894 ms ( 0.06 usr  0.01 sys + 14.45 cusr  3.38 csys = 17.90 CPU)
[15:34:50] src/bin/pg_config/t/001_pg_config.pl  ok 
  79 ms ( 0.00 usr  0.01 sys +  0.23 cusr  0.04 csys =  0.28 CPU)
[15:34:50] src/bin/pg_waldump/t/001_basic.pl ... ok 
  35 ms ( 0.00 usr  0.00 sys +  0.26 cusr  0.02 csys =  0.28 CPU)
[15:34:51] src/bin/pg_test_fsync/t/001_basic.pl  ok 
 100 ms ( 0.01 usr  0.00 sys +  0.24 cusr  0.04 csys =  0.29 CPU)
[15:34:51] src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl  ok 
 177 ms ( 0.02 usr  0.00 sys +  0.26 cusr  0.03 csys =  0.31 CPU)
[15:34:55] src/bin/scripts/t/100_vacuumdb.pl ... ok
11267 ms ( 0.12 usr  0.04 sys + 13.47 cusr  3.20 csys = 16.83 CPU)
[15:34:57] src/bin/scripts/t/102_vacuumdb_stages.pl  ok 
5802 ms ( 0.06 usr  0.01 sys +  7.70 cusr  1.37 csys =  9.14 CPU)
...

=> scripts/ stuff, followed by other stuff, followed by more, slower, scripts/ 
stuff.

But I never saw that in cirrus.

> Isn't this kind of a good test time? I thought earlier your alltaptests target
> took a good bit longer?

The original alltaptests runs in 16m 21s.
https://cirrus-ci.com/task/6679061752709120

2 weeks ago, it was ~14min with your patch to cache initdb.
https://cirrus-ci.com/task/5439320633901056

As I recall, that didn't seem to improve runtime when combined with my parallel
patch.

> One nice bit is that the output is a *lot* easier to read.
> 
> You could try improving the total time by having prove remember slow tests and
> use that file to run the slowest tests first next time. --state slow,save or
> such I believe. Of course we'd have to save that state file...

In a test, this hurt rather than helped (13m 42s).
https://cirrus-ci.com/task/6359167186239488

I'm not surprised - it makes sense to run 10 fast tests at once, but usually
doesn't make sense to run 10 slow tests at once (at least a couple of
which are doing something intensive).  It was faster (12m16s) to do it
backwards (fastest tests first).
https://cirrus-ci.com/task/5745115443494912
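
For reference, a sketch of how that ordering can be requested via prove's
--state feature (the -j value and test list here are placeholders):

# the first run saves per-test timings into .prove; later runs reorder using them
prove -j8 --state=save t/*.pl
prove -j8 --state=slow,save t/*.pl   # slowest tests first (what was tried above)
prove -j8 --state=fast,save t/*.pl   # fastest tests first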

BTW, does it make sense to remove test_regress_parallel_script ?  The
pg_upgrade run would do the same things, no ?  If so, it might make sense to
run that first.  OTOH, you suggested to run the upgrade tests with checksums
enabled, which seems like a good idea.

Note that in the attached patches, I changed the msvc "warnings" to use "tee".

I don't know how to fix the pipeline test in a less hacky way...

You said the docs build should be a separate task, but then said that it'd be
okay to remove the dependency.  So I did it both ways.  There's currently some
duplication between the docs patch and code coverage patch.

-- 
Justin
>From 05c24a1e679e5ae0dd0cb6504d397f77c8b1fc5c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/8] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index dd96a97efe5..eda8ac9596c 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,11 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+#pkg install -y

Re: Adding CI to our tree

2022-02-13 Thread Justin Pryzby
On Sun, Feb 13, 2022 at 01:23:16PM -0800, Andres Freund wrote:
> If you're seeing this on windows on one of your test branches, that's much
> more likely to be caused by the alltaptests stuff, than by the change in
> artifact instruction.

Oh - I suppose you're right.  That's an unfortunate consequence of running a
single prove instance without chdir.

-- 
Justin




Re: Adding CI to our tree

2022-02-13 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 04:24:20PM -0800, Andres Freund wrote:
> > What I am excited about is that some of your other changes showed that we
> > don't need separate *_artifacts for separate directories anymore. That used 
> > to
> > be the case, but an array of paths is now supported. Putting log, diffs, and
> > regress_log in one directory will be considerably more convenient...
> 
> pushed together.

This change actually complicates things.

Before, there was log/src/test/recovery/tmp_check/log, with a few files for
001, a few for 002, a few for 003.  That's a lot of output files, but at
least they're all related.

Now, there's a single log/tmp_check/log, which has the logs for all of the tap
tests.  There are 3 pages of 001*, 2 pages of 002*, 3 pages of 003, etc.

-- 
Justin




Re: Adding CI to our tree

2022-02-12 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 05:20:08PM -0800, Andres Freund wrote:
> On 2022-02-03 23:04:04 -0600, Justin Pryzby wrote:
> > > I'm a bit worried about the increased storage and runtime overhead due to 
> > > the
> > > docs changes. We probably can make it a good bit cheaper though.
> >
> > If you mean overhead of additional disk operations, it shouldn't be an 
> > issue,
> > since the git clone uses shared references (not even hardlinks).
> 
> I was concerned about those and just the increased runtime of the script. Our
> sources are 130MB, leaving the shared .git aside. But maybe it's just fine.
> 
> We probably can just get rid of the separate clone and configure run though?
> Build the docs, copy the output, do a git co parent docs/, build again?

Yes - works great, thanks.

> What was the reason behind moving the docs stuff from the compiler warnings
> task to linux?

I wanted to build docs even if the linux task fails.  To allow CFBOT to link to
them, so somoene can always review the docs, in HTML (rather than reading SGML
with lines prefixed with +).

> Not that either fits very well... I think it might be worth
> moving the docs stuff into its own task, using a 1 CPU container (docs build
> isn't parallel anyway). Think that'll be easier to see in the cfbot page. I

Yeah.  The only drawback is the duplication (including the "git parent" stuff).

BTW, docs can be built in parallel, and CI is using BUILD_JOBS: 4.
/usr/bin/xmllint --path . --noout --valid postgres.sgml
/usr/bin/xmllint --path . --noout --valid postgres.sgml
/usr/bin/xsltproc --path . --stringparam pg.version '15devel'  stylesheet.xsl 
postgres.sgml
/usr/bin/xsltproc --path . --stringparam pg.version '15devel'  
stylesheet-man.xsl postgres.sgml

> 1) iterate over release branches and see which has the smallest diff

Maybe for each branch: do echo `git rev-list or merge-base |wc -l` $branch; done 
|sort -n |head -1
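
Spelled out (untested; the remote and branch names are illustrative), that idea
might look like:

# pick the branch whose tip is "closest" to HEAD by symmetric-difference commit count
for b in master REL_14_STABLE REL_13_STABLE REL_12_STABLE; do
    echo "$(git rev-list --count origin/$b...HEAD) $b"
done | sort -n | head -1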

> > > Is looking at the .c files in the change really a reliable predictor of 
> > > where
> > > code coverage changes? I'm doubtful. Consider stuff like removing the last
> > > user of some infrastructure or such. Or adding the first.
> >
> > You're right that it isn't particularly accurate, but it's a step forward if
> > lots of patches use it to check/improve coverage of new code.
> 
> Maybe it's good enough... The overhead in test runtime is noticeable (~5.30m
> -> ~7.15m), but probably acceptable. Although I also would like to enable
> -fsanitize=alignment and -fsanitize=alignment, which add about 2 minutes as
> well (-fsanitize=address is a lot more expensive), they both work best on
> linux.

There's other things that it'd be nice to enable, but maybe these don't need to
be on by default.  Maybe just have a list of optional compiler flags (and hints
when they're useful).  Like WRITE_READ_PARSE_PLAN_TREES.

> > In addition to the HTML generated for changed .c files, it's possible to 
> > create
> > a coverage.gcov output for everything, which could be retrieved to generate
> > full HTML locally.  It's ~8MB (or 2MB after gzip).
> 
> Note sure that doing doing it locally adds much over just running tests
> locally.

You're right, since one needs to have the patched source files to generate the
HTML.

On Thu, Feb 03, 2022 at 11:57:18AM -0800, Andres Freund wrote:
> I think it may be a bit cleaner to have the "install additional packages"
> "template step" be just that, and not merge in other contents into it?

I renamed the "cores" task since it consistently makes me think you're doing
with CPU cores.  It took it as an opportunity to generalize the task.

These patches are ready for review.  I'll plan to mail about TAP stuff
tomorrow.
>From 05c24a1e679e5ae0dd0cb6504d397f77c8b1fc5c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/3] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index dd96a97efe5..eda8ac9596c 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,11 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+#pkg install -y ...
 
   # NB: Intentionally build without --with-llvm. The freebsd image size is
   # already large enough to make VM startup slow, and even without

Re: Adding CI to our tree

2022-02-12 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 04:24:20PM -0800, Andres Freund wrote:
> > > e5286ede1b4 cirrus: avoid unnecessary double star **
> > 
> > Can't get excited about this, but whatever.
> > 
> > What I am excited about is that some of your other changes showed that we
> > don't need separate *_artifacts for separate directories anymore. That used 
> > to
> > be the case, but an array of paths is now supported. Putting log, diffs, and
> > regress_log in one directory will be considerably more convenient...
> 
> pushed together.

While rebasing, I noticed an error.

You wrote **/.diffs, but should be **/*.diffs

--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -30,15 +30,11 @@ env:
 # What files to preserve in case tests fail
on_failure: &on_failure
   log_artifacts:
-path: "**/**.log"
+paths:
+  - "**/*.log"
+  - "**/.diffs"
+  - "**/regress_log_*"
 type: text/plain
-  regress_diffs_artifacts:
-path: "**/**.diffs"
-type: text/plain
-  tap_artifacts:
-path: "**/regress_log_*"
-type: text/plain
-




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-12 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 06:00:44PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2022-02-11 16:19:12 -0500, Robert Haas wrote:
> > I somewhat hope we never end up with THREE strategies for creating a new
> > database, but now that I think about it, we might. Somebody might want to
> > use a fancy FS primitive that clones a directory at the FS level, or
> > something.
> 
> I think that'd be a great, and pretty easy to implement, feature. But it seems
> like it'd be mostly orthogonal to the "WAL log data" vs "checkpoint data"
> question? On the primary / single node system using "WAL log data" with "COW
> file copy" would work well.
> 
> I bet using COW file copies would speed up our own regression tests noticeably
> - on slower systems we spend a fair bit of time and space creating template0
> and postgres, with the bulk of the data never changing.
> 
> Template databases are also fairly commonly used by application developers to
> avoid the cost of rerunning all the setup DDL & initial data loading for
> different tests. Making that measurably cheaper would be a significant win.

+1

I ran into this last week and was still thinking about proposing it.

Would this help CI or any significant fraction of the buildfarm ?
Or just tests run locally on supporting filesystems ?

Note that pg_upgrade already supports copy/link/clone.  (Obviously, link
wouldn't do anything desirable for CREATE DATABASE).
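
For reference, a sketch of those existing pg_upgrade modes (paths are
placeholders):

pg_upgrade -b /usr/pgsql-14/bin -B /usr/pgsql-15/bin \
           -d /data/pg14 -D /data/pg15 \
           --clone      # or --link, or omit both for the default copy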

-- 
Justin




Re: Adding CI to our tree

2022-02-12 Thread Justin Pryzby
On Tue, Jan 18, 2022 at 05:16:26PM -0800, Andres Freund wrote:
> On 2022-01-18 15:08:47 -0600, Justin Pryzby wrote:
> > On Mon, Jan 17, 2022 at 12:16:19PM -0800, Andres Freund wrote:
> > > I think it might still be worth adding stopgap way of running all tap 
> > > tests on
> > > windows though. Having a vcregress.pl function to find all directories 
> > > with t/
> > > and run the tests there, shouldn't be a lot of code...
> > 
> > I started doing that, however it makes CI/windows even slower.
...
> > I think it'll be necessary to run prove with all the tap tests to
> > parallelize them, rather than looping around directories, many of which have
> > only a single file, and are run serially.
> 
> That's unfortunately not trivially possible. Quite a few tests currently rely
> on being called in a specific directory. We should fix this, but it's not a
> trivial amount of work.

On Sat, Feb 05, 2022 at 07:23:39PM -0800, Andres Freund wrote:
> On 2022-02-03 23:04:04 -0600, Justin Pryzby wrote:
> > > I assume this doesn't yet work to a meaningful degree? Last time I checked
> > > there were quite a few tests that needed to be invoked in a specific
> > > directory.
> > 
> > It works - tap_check() does chdir().
> 
> Ah, I thought you'd implemented a target that does it all in one prove
> invocation...

I had some success with that, but it doesn't seem to be significantly faster -
it looks a lot like the tests are not actually running in parallel.  I tried
some variations like passing the list of dirs vs the list of files, and
--jobs=9 vs -j9, without success.

https://cirrus-ci.com/task/5580584675180544

https://github.com/justinpryzby/postgres/commit/a865adc5b8c
fc7b3ea8bce vcregress/ci: test modules/contrib with NO_INSTALLCHECK=1
03adb043d16 wip: vcsregress: add alltaptests
63bf0796ffd wip: vcregress: run alltaptests in parallel
9dc327f6b30 f!wip: vcregress: run alltaptests in a single prove invocation
a865adc5b8c tmp: run tap tests first

> > It currently fails in 027_stream_regress.pl, although I keep hoping that it
> > had been fixed...
> 
> That's likely because you're not setting REGRESS_OUTPUTDIR like
> src/test/recovery/Makefile and recoverycheck() are doing.

Yes, thanks.

-- 
Justin




buildfarm warnings

2022-02-12 Thread Justin Pryzby
Is there any check for warnings from new code, other than those buildfarm
members with -Werror ?

It'd be better to avoid warnings, allowing members to use -Werror, rather than
to allow/ignore warnings, which preclude that possibility.  A circular problem.

I checked for warnings on master during "check" phase by opening many browser
tabs...

I'm of the impression that some people have SQL access to BF logs.  It seems
like it'd be easy to make a list of known/old/false-positive warnings and
ignore warnings matching (member,file,regex) to detect new warnings.

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=caiman=2022-02-12%2006%3A00%3A30=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=moonjelly=2022-02-12%2001%3A06%3A22=make
rawhide prerelease / gcc trunk
pg_basebackup.c:1261:35: warning: storing the address of local variable 
archive_filename in progress_filename [-Wdangling-pointer=]
=> new in 23a1c6578 - looks like a real error

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=wrasse=2022-02-12%2005%3A08%3A48=make
Oracle Developer Studio
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/libpq/auth.c",
 line 104: warning: initialization type mismatch
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/commands/copy.c",
 line 206: warning: true part of ?: is unused and is nonconstant
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/utils/fmgr/fmgr.c",
 line 400: warning: assignment type mismatch:
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/access/common/reloptions.c",
 line 742: warning: argument #2 is incompatible with prototype:
=> "Size relopt_struct_size" looks wrong since 911e7020 (v13)

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=haddock=2022-02-12%2000%3A23%3A17=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hake=2022-02-12%2003%3A00%3A35=make
illumos "rolling release"
pg_dump_sort.c:1232:3: warning: format not a string literal and no format 
arguments [-Wformat-security]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=prairiedog=2022-02-12%2009%3A52%3A07=make
OSX/PPC
A lot of these, but some from new code (ba59)
pg_receivewal.c:288: warning: 'wal_compression_method' may be used 
uninitialized in this function
pg_receivewal.c:289: warning: 'ispartial' may be used uninitialized in this 
function

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gharial=2022-02-12%2012%3A36%3A42=make
HPUX
compress_io.c:532:4: warning: implicit declaration of function gzopen64 
[-Wimplicit-function-declaration]
pg_backup_archiver.c:1521:4: warning: implicit declaration of function gzopen64 
[-Wimplicit-function-declaration]
pg_backup_archiver.c:1521:11: warning: assignment makes pointer from integer 
without a cast [enabled by default]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole=2022-02-12%2005%3A46%3A44=make
HPUX
"heapam_handler.c", line 2434: warning #2940-D: missing return statement at end 
of non-void function "heapam_scan_sample_next_tuple"
"regc_lex.c", line 762: warning #2940-D: missing return statement at end of 
non-void function "lexescape"
"fmgr.c", line 400: warning #2513-D: a value of type "void *" cannot be 
assigned to an entity of type "PGFunction"
"bbstreamer_gzip.c", line 92: warning #2223-D: function "gzopen64" declared 
implicitly

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=tern=2022-02-12%2006%3A48%3A14=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sungazer=2022-02-12%2009%3A26%3A58=make
ppc
auth.c:1876:6: warning: implicit declaration of function 'getpeereid'; did you 
mean 'getpgid'? [-Wimplicit-function-declaration]
missing #include ?

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=pollock=2022-02-12%2004%3A35%3A53=make
GCC10/illumos
pg_shmem.c:326:33: warning: passing argument 1 of 'shmdt' from incompatible 
pointer type [-Wincompatible-pointer-types]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=pogona=2022-02-12%2004%3A10%3A03=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=desmoxytes=2022-02-12%2018%3A24%3A03=make
LLVM: warning: void* memcpy(void*, const void*, size_t) writing to an object of 
type struct std::pair with no trivial 
copy-assignment; use copy-assignment or copy-initialization instead 
[-Wclass-memaccess]

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=rorqual=2022-02-12%2018%3A23%3A39=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=serinus=2022-02-12%2005%3A00%3A08=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=calliphoridae=2022-02-12%2004%3A07%3A11=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=culicidae=2022-02-12%2004%3A07%3A13=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=piculet=2022-02-12%2004%3A07%3A25=make

Re: refactoring basebackup.c

2022-02-12 Thread Justin Pryzby
The LZ4 patches caused new compiler warnings.
It's the same issue that was fixed at 71ce8 for gzip.
I think they would've been visible in the CI environment, too.

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=wrasse=2022-02-12%2005%3A08%3A48=make
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/replication/basebackup_lz4.c",
 line 87: warning: Function has no return statement : bbsink_lz4_new

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=bowerbird=2022-02-12%2013%3A11%3A20=make
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hamerkop=2022-02-12%2010%3A04%3A08=make
warning C4715: 'bbsink_lz4_new': not all control paths return a value

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole=2022-02-12%2005%3A46%3A44=make
"basebackup_lz4.c", line 87: warning #2940-D: missing return statement at end 
of non-void function "bbsink_lz4_new"

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=wrasse=2022-02-12%2005%3A08%3A48=make
"/export/home/nm/farm/studio64v12_6/HEAD/pgsql.build/../pgsql/src/backend/replication/basebackup_lz4.c",
 line 87: warning: Function has no return statement : bbsink_lz4_new




Re: support for MERGE

2022-02-11 Thread Justin Pryzby
On Fri, Feb 11, 2022 at 05:25:49PM -0300, Alvaro Herrera wrote:
> > I'm not sure git diff --cherry-pick is widely known/used, but I think
> > using that relative to master may be good enough.  
> 
> I had never heard of git diff --cherry-pick, and the manpages I found
> don't document it, so frankly I doubt it's known.  I still have no idea
> what does it do.

See git-log(1)

   --cherry-pick
   Omit any commit that introduces the same change as another commit on 
the “other side” when the set of commits are limited with symmetric difference.

The "symmetric difference" is the triple-dot notation.

The last few years I've used this to check for missing bits in the draft
release notes.  (Actually, I tend to start my own list of features before
that).  It's doing a generic version of what git_changelog does.

https://www.postgresql.org/message-id/20210510144045.gc27...@telsasoft.com

> I suppose there is an obvious reason why using git diff with
> $(git merge-base ...) as one of the arguments doesn't work for these purposes.
> 
> > Andres thinks that does the wrong thing if CI is run manually (not by CFBOT)
> > for patches against backbranches.
> 
> I wonder if it's sufficient to handle these things (coverage
> specifically) for branch master only.

Or default to master, and maybe try to parse the commit message and pull out
Backpatch-through: NN.  It's intended to be machine-readable, after all.
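
As a rough sketch (untested), pulling that trailer out of a commit message could
be as simple as:

# prints e.g. "10" when the message contains "Backpatch-through: 10"
git log -1 --format=%B | sed -n 's/^Backpatch-through:[[:space:]]*//p'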

Let's talk about it more on the/another thread :)

-- 
Justin




Re: support for MERGE

2022-02-11 Thread Justin Pryzby
On Fri, Feb 11, 2022 at 03:21:43PM -0300, Alvaro Herrera wrote:
> On 2022-Jan-28, Justin Pryzby wrote:
> > Have you looked at code coverage ?  I have an experimental patch to add 
> > that to
> > cirrus, and ran it with this patch; visible here:
> > https://cirrus-ci.com/task/6362512059793408
> 
> Ah, thanks, this is useful.  I think it is showing that the new code is
> generally well covered, but there are some exceptions, particularly
> ExecMergeMatched in some concurrent cases (TM_Deleted and
> TM_SelfModified after table_tuple_lock -- those code pages have
> user-facing errors but are not covered by any tests.)
> 
> How does this work?  I notice it is reporting for
> src/bin/pg_upgrade/relfilenode.c, but that file is not changed by this
> patch.

Because I put your patch on top of some other branch with the CI coverage (and
other stuff).

It tries to only show files changed by the branch being checked:
https://github.com/justinpryzby/postgres/commit/d668142040031915

But it has to figure out where the branch "starts".  Which I did by looking at
"git diff --cherry-pick origin..."

Andres thinks that does the wrong thing if CI is run manually (not by CFBOT)
for patches against backbranches.  I'm not sure git diff --cherry-pick is
widely known/used, but I think using that relative to master may be good
enough.  

Ongoing discussion here.
https://www.postgresql.org/message-id/20220203035827.GG23027%40telsasoft.com

-- 
Justin




Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-11 Thread Justin Pryzby
On Fri, Feb 11, 2022 at 10:41:27AM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> > On Fri, Feb 11, 2022 at 09:59:55AM -0500, Tom Lane wrote:
> >> This seems like a pretty bad idea even if it weren't failing outright.
> >> We should be examining the version of the file that's in the source
> >> tree; the one in the installation tree might have version-skew
> >> problems, if you've not yet done "make install".
> 
> > My original way used the source tree, but Michael thought it would be an 
> > issue
> > for "installcheck" where the config may not be available.
> 
> Yeah, you are at risk either way, but in practice nobody is going to be
> running these TAP tests without a source tree.
> 
> > This is what I had written:
> > FROM (SELECT regexp_split_to_table(pg_read_file('postgresql.conf'), '\n') 
> > AS ln) conf
> 
> That's not using the source tree either, but the copy in the
> cluster-under-test.  I'd fear it to be unstable in the buildfarm, where
> animals can append whatever they please to the config file being used by
> tests.

The important test is that "new GUCs exist in sample.conf".  The converse test
is less interesting, so I think the important half would be okay.

I see the BF client appends to postgresql.conf.
https://github.com/PGBuildFarm/client-code/blob/main/run_build.pl#L1374

It could use "include", which was added in v8.2.  Or it could add a marker,
like pg_regress does:
src/test/regress/pg_regress.c:  fputs("\n# Configuration added by 
pg_regress\n\n", pg_conf);

If it were desirable to also check that "things that look like GUCs in the conf
are actually GUCs", either of those would avoid false positives.

Or, is it okay to use ABS_SRCDIR?

+\getenv abs_srcdir PG_ABS_SRCDIR
+\set filename :abs_srcdir 
'../../../../src/backend/utils/misc/postgresql.conf.sample'
+-- test that GUCS are in postgresql.conf
+SELECT lower(name) FROM tab_settings_flags WHERE NOT not_in_sample EXCEPT
+SELECT regexp_replace(ln, '^#?([_[:alpha:]]+) (= .*|[^ ]*$)', '\1') AS guc
+FROM (SELECT regexp_split_to_table(pg_read_file(:'filename'), '\n') AS ln) conf
+WHERE ln ~ '^#?[[:alpha:]]'
+ORDER BY 1;
+
+-- test that lines in postgresql.conf that look like GUCs are GUCs
+SELECT regexp_replace(ln, '^#?([_[:alpha:]]+) (= .*|[^ ]*$)', '\1') AS guc
+FROM (SELECT regexp_split_to_table(pg_read_file(:'filename'), '\n') AS ln) conf
+WHERE ln ~ '^#?[[:alpha:]]'
+EXCEPT SELECT lower(name) FROM tab_settings_flags WHERE NOT not_in_sample
+ORDER BY 1;

-- 
Justin




Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-11 Thread Justin Pryzby
On Fri, Feb 11, 2022 at 09:59:55AM -0500, Tom Lane wrote:
> Christoph Berg  writes:
> > this test is failing at Debian package compile time:
> > Could not open /usr/share/postgresql/15/postgresql.conf.sample: No such 
> > file or directory at t/003_check_guc.pl line 47.
> 
> > So it's trying to read from /usr/share/postgresql which doesn't exist
> > yet at build time.
> 
> > The relevant part of the test is this:
> 
> > # Find the location of postgresql.conf.sample, based on the information
> > # provided by pg_config.
> > my $sample_file =
> >   $node->config_data('--sharedir') . '/postgresql.conf.sample';
> 
> This seems like a pretty bad idea even if it weren't failing outright.
> We should be examining the version of the file that's in the source
> tree; the one in the installation tree might have version-skew
> problems, if you've not yet done "make install".

My original way used the source tree, but Michael thought it would be an issue
for "installcheck" where the config may not be available.

https://www.postgresql.org/message-id/YfTg/WHNLVVygy8v%40paquier.xyz

This is what I had written:

-- test that GUCS are in postgresql.conf
SELECT lower(name) FROM tab_settings_flags WHERE NOT not_in_sample EXCEPT
SELECT regexp_replace(ln, '^#?([_[:alpha:]]+) (= .*|[^ ]*$)', '\1') AS guc
FROM (SELECT regexp_split_to_table(pg_read_file('postgresql.conf'), '\n') AS 
ln) conf
WHERE ln ~ '^#?[[:alpha:]]'
ORDER BY 1;
lower
-
 config_file
 plpgsql.check_asserts
 plpgsql.extra_errors
 plpgsql.extra_warnings
 plpgsql.print_strict_params
 plpgsql.variable_conflict
(6 rows)

-- test that lines in postgresql.conf that look like GUCs are GUCs
SELECT regexp_replace(ln, '^#?([_[:alpha:]]+) (= .*|[^ ]*$)', '\1') AS guc
FROM (SELECT regexp_split_to_table(pg_read_file('postgresql.conf'), '\n') AS 
ln) conf
WHERE ln ~ '^#?[[:alpha:]]'
EXCEPT SELECT lower(name) FROM tab_settings_flags WHERE NOT not_in_sample
ORDER BY 1;
guc
---
 include
 include_dir
 include_if_exists
(3 rows)

-- 
Justin




Re: refactoring basebackup.c

2022-02-11 Thread Justin Pryzby
On Fri, Feb 11, 2022 at 08:35:25PM +0530, Jeevan Ladhe wrote:
> Thanks Robert for the bravity :-)

FYI: there are a couple of typos in the last 2 patches.

I added them to my typos branch; feel free to wait until April if you'd prefer
to see them fixed in bulk.

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml 
b/doc/src/sgml/ref/pg_basebackup.sgml
index 53aa40dcd19..649b91208f3 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -419,7 +419,7 @@ PostgreSQL documentation

 The compression method can be set to gzip or
 lz4, or none for no
-compression. A compression level can be optionally specified, by
+compression. A compression level can optionally be specified, by
 appending the level number after a colon (:). If no
 level is specified, the default compression level will be used. If
 only a level is specified without mentioning an algorithm,
@@ -440,7 +440,7 @@ PostgreSQL documentation
 -Xstream, pg_wal.tar will
 be compressed using gzip if client-side gzip
 compression is selected, but will not be compressed if server-side
-compresion or LZ4 compresion is selected.
+compression or LZ4 compression is selected.

   
  




Re: can we add subscription TAP test option "vcregress subscriptioncheck" for MSVC builds?

2022-02-10 Thread Justin Pryzby
On Thu, Feb 10, 2022 at 10:21:08PM +0530, Bharath Rupireddy wrote:
> Thanks for the comments. Above looks good to me, changed that way, PSA v2.

I spy a typo: subcription

-- 
Justin




Re: warn if GUC set to an invalid shared library

2022-02-09 Thread Justin Pryzby
On Fri, Jan 28, 2022 at 09:42:17AM -0500, Robert Haas wrote:
> -1 from me on using "guc" in any user-facing error message. And even
> guc -> setting isn't a big improvement. If we're going to structure
> the reporting this way there, we should try to use a meaningful phrase
> there, probably beginning with the word "while"; see "git grep
> errcontext.*while" for interesting precedents.

On Wed, Feb 02, 2022 at 06:06:01AM +, Maciek Sakrejda wrote:
> I do agree that GUC is awkward here, and I like the "while..." wording 
> suggested both for consistency with other messages and how it could work here:
> CONTEXT: while loading "shared_preload_libraries"

FYI, it has said "while..." and hasn't said "guc" since the 2nd revision of the
patch.

> That said, that series of messages seems a bit suspect to me, because
> the WARNING seems to be stating the same problem as the subsequent
> FATAL and CONTEXT lines. Ideally we'd tighten that somehow.

I avoided the warning by checking IsUnderPostmaster, though I'm not sure if
that's the right condition..

On Wed, Feb 02, 2022 at 06:06:01AM +, Maciek Sakrejda wrote:
> I think this is great, but it would be really helpful to also indicate that 
> at this point the server will fail to come back up after a restart.

> I don't really know a good wording here, but maybe a hint like "The server or 
> session will not be able to start if it has been configured to use libraries 
> that cannot be loaded."?

postgres=# ALTER SYSTEM SET shared_preload_libraries =a,b;
WARNING:  could not access file "$libdir/plugins/a"
HINT:  The server will fail to start with the existing configuration.  If it is 
is shut down, it will be necessary to manually edit the postgresql.auto.conf 
file to allow it to start.
WARNING:  could not access file "$libdir/plugins/b"
HINT:  The server will fail to start with the existing configuration.  If it is 
is shut down, it will be necessary to manually edit the postgresql.auto.conf 
file to allow it to start.
ALTER SYSTEM
postgres=# ALTER SYSTEM SET session_preload_libraries =c,d;
WARNING:  could not access file "$libdir/plugins/c"
HINT:  New sessions will fail with the existing configuration.
WARNING:  could not access file "$libdir/plugins/d"
HINT:  New sessions will fail with the existing configuration.
ALTER SYSTEM

$ ./tmp_install/usr/local/pgsql/bin/postgres -D src/test/regress/tmp_check/data 
-clogging_collector=on
2022-02-09 19:53:48.034 CST postmaster[30979] FATAL:  could not access file 
"a": No such file or directory
2022-02-09 19:53:48.034 CST postmaster[30979] CONTEXT:  while loading shared 
libraries for setting "shared_preload_libraries"
from 
/home/pryzbyj/src/postgres/src/test/regress/tmp_check/data/postgresql.auto.conf:3
2022-02-09 19:53:48.034 CST postmaster[30979] LOG:  database system is shut down

Maybe it's enough to show the GucSource rather than file:line...

-- 
Justin
>From 69560e81bfbb43a67269f54b564fe8ac5ae1b5c6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 13 Dec 2021 08:42:38 -0600
Subject: [PATCH v4 1/3] warn when setting GUC to a nonextant library

---
 src/backend/utils/misc/guc.c  | 103 +-
 .../unsafe_tests/expected/rolenames.out   |   1 +
 .../worker_spi/expected/worker_spi.out|   2 +
 .../modules/worker_spi/sql/worker_spi.sql |   2 +
 src/test/regress/expected/rules.out   |   9 ++
 src/test/regress/sql/rules.sql|   1 +
 6 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f505413a7f9..c44b8ebbfd6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -170,6 +170,12 @@ static bool call_enum_check_hook(struct config_enum *conf, int *newval,
  void **extra, GucSource source, int elevel);
 
 static bool check_log_destination(char **newval, void **extra, GucSource source);
+static bool check_local_preload_libraries(char **newval, void **extra,
+		GucSource source);
+static bool check_session_preload_libraries(char **newval, void **extra,
+		GucSource source);
+static bool check_shared_preload_libraries(char **newval, void **extra,
+		GucSource source);
 static void assign_log_destination(const char *newval, void *extra);
 
 static bool check_wal_consistency_checking(char **newval, void **extra,
@@ -4198,7 +4204,7 @@ static struct config_string ConfigureNamesString[] =
 		},
&session_preload_libraries_string,
 		"",
-		NULL, NULL, NULL
+		check_session_preload_libraries, NULL, NULL
 	},
 
 	{
@@ -4209,7 +4215,7 @@ static struct config_string ConfigureNamesString[] =
 		},
&shared_preload_libraries_string,
 		"",
-		NULL, NULL, NULL
+		check_shared_preload_libraries, NULL, NULL
 	},
 
 	{
@@ -4220,7 +4226,7 @@ static st

Re: wrong fds used for refilenodes after pg_upgrade relfilenode changes Reply-To:

2022-02-09 Thread Justin Pryzby
On Wed, Feb 09, 2022 at 02:00:04PM -0800, Andres Freund wrote:
> On linux we can do so by a) checking if readlink(/proc/self/fd/$fd) points to
> a filename ending in " (deleted)", b) doing fstat(fd) and checking if st_nlink
> == 0.

You could also stat() the file in proc/self/fd/N and compare st_ino.  It
"looks" like a symlink (in which case that wouldn't work) but it's actually a
Very Special File.  You can even recover deleted, still-opened files that way..
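
For illustration, the same checks done by hand from a shell on Linux ($pid, $fd,
and the data-file path are placeholders):

readlink /proc/$pid/fd/$fd                        # target ends in " (deleted)" once unlinked
stat -L -c 'nlink=%h inode=%i' /proc/$pid/fd/$fd  # nlink drops to 0 for a deleted file
stat -c 'inode=%i' "$PGDATA/base/1234/5678"       # compare inodes with the file now at that path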

PS. I didn't know pg_upgrade knew about Reply-To ;)




Re: is the base backup protocol used by out-of-core tools?

2022-02-08 Thread Justin Pryzby
On Tue, Feb 08, 2022 at 11:26:41AM -0500, Robert Haas wrote:
> Patches for the nuking are attached. If nobody writes back, I'm going
> to assume that means nobody cares, and commit these some time before
> feature freeze. If one or more people do write back, then my plan is
> to see what they have to say and proceed accordingly.

I created this and added that for visibility and so it's not forgotten.
ISTM that could be done post-freeze, although I don't know if that's useful or
important.

https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items

-- 
Justin




Re: GUC flags

2022-02-07 Thread Justin Pryzby
On Tue, Feb 08, 2022 at 10:44:07AM +0900, Michael Paquier wrote:
> What do you think about the updated version attached?  I have applied
> the addition of config_data() separately.

Looks fine

> + # Check if this line matches a GUC parameter.
> + if ($line =~ m/^#?([_[:alpha:]]+) (= .*|[^ ]*$)/)

I think this is the regex I wrote to handle either "name = value" or "name
value", which was needed between f47ed79cc..4d7c3e344.   See skip_equals.

It's fine the way it is, but could also remove the 2nd half of the alternation
(|), since GUCs shouldn't be added to sample.conf without '='.

-- 
Justin




Re: 2022-02-10 release announcement draft

2022-02-06 Thread Justin Pryzby
On Sun, Feb 06, 2022 at 08:01:02PM -0500, Jonathan S. Katz wrote:
> Hi,
> 
> Attached is a draft for the release announcement for the 2022-02-10
> cumulative update release.
> 
> Please review for technical accuracy or if you believe any items should be
> added/removed. Please provide feedback no later than 2020-02-10 0:00 AoE[1].

I guess you mean 2022 ;)

> * The [`psql`](https://www.postgresql.org/docs/current/app-psql.html)
> `\password` command now defaults to setting the password for to the role 
> defined
> by `CURRENT_USER`. Additionally, the role name is now included in the password
> prompt.

s/for to/for/

> * Build extended statistics for partitioned tables. If you previously added
> extended statistics to a partitioned table, you can fix this by running
> [`ANALYZE`](https://www.postgresql.org/docs/current/sql-analyze.html).
> [`autovacuum`](https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM)
> currently does not process these statistics, so you must periodically run
> `ANALYZE` on any partitioned tables that are using extended statistics.

Instead of "you can fix this" , I suggest to say "you should run ANALYZE on
those tables to collect updated statistics."

It should say "not process these TABLES" (not stats), and should not say "that
are using extended statistics" since partitioned tables should be manually
analyzed whether or not they have extended stats.

It's good to say "..you should periodically run ANALYZE on partitioned tables
to collect updated statistics." (even though this patch doesn't change that).

> * Fix crash with 
> [multiranges](https://www.postgresql.org/docs/current/rangetypes.html)
> when extracting a variable-length data types.

remove "a" or s/types/types/

> * Disallow altering data type of a partitioned table's columns when the
> partitioned table's row type is used as a composite type elsewhere.

the data types ?

-- 
Justin




Re: GUC flags

2022-02-06 Thread Justin Pryzby
Thanks for working on it.

Your test is checking that stuff in sample.conf is actually a GUC and not
marked NOT_IN_SAMPLE.  But those are both unlikely mistakes to make.

The important/interesting test is the opposite: that all GUCs are present in
the sample file.  It's a lot easier for someone to forget to add a GUC to
sample.conf than it is for someone to accidentally add something that isn't a
GUC.

I'd first parse the GUC-like lines in the file, making a list of gucs_in_file
and then compare the two lists.
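
An untested sketch of that direction, in shell rather than the TAP test's perl
(the NOT_IN_SAMPLE GUCs would still need to be excluded somehow):

sed -nE 's/^#?([_[:alpha:]]+) = .*/\1/p' \
    src/backend/utils/misc/postgresql.conf.sample | sort -u > gucs_in_file
psql -AXqt -c "SELECT lower(name) FROM pg_settings" | sort -u > gucs_defined
comm -13 gucs_in_file gucs_defined    # GUCs that are missing from the sample file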

-- 
Justin




Re: Release notes for February minor releases

2022-02-06 Thread Justin Pryzby
On Sun, Feb 06, 2022 at 02:22:25PM -0500, Jonathan S. Katz wrote:
> I'm working on the release announcement and have been following this thread.
> 
> Are there steps we can provide to help a user detect that this occurred,
> even though it's a low-probability?

It's the same question as raised here:
https://www.postgresql.org/message-id/61fd9a5b.1c69fb81.88a14.5556%40mx.google.com

-- 
Justin




Re: Release notes for February minor releases

2022-02-04 Thread Justin Pryzby
On Fri, Feb 04, 2022 at 04:29:19PM -0500, Tom Lane wrote:
> > +  A previous bug fix disabled building of extended statistics for
> > +  old-style inheritance trees, but any existing statistics data was
> > +  not removed, and that data would become more and more out-of-date
> > +  over time.  Adjust the planner to ignore such data.  Extended
> > +  statistics for the individual child tables are still built and used,
> > +  however.
> 
> > The issue here isn't that old stats were never updated.  For inheritance, 
> > they
> > *were* updated with non-inherited stats (for SELECT FROM ONLY).  But then
> > "SELECT FROM tbl*" used the stats anyway...
> 
> I'm confused about this bit.  Are we still building bogus stats for
> inheritance parents, or has that stopped?

To make a long story long:
- before 859b3003de, an ERROR occurred when a stats object was created on an
  inheritance parent.
- To avoid the error, 859b3003de changed to no longer build "whole tree" stats
  on the table hierarchy.  Non-inherited stats were still collected.
- However, the stats were *also* applied to inherited queries (FROM tbl*).

36c4bc6 stops applying stats that shouldn't be applied (and doesn't change
their collection during ANALYZE).

20b9fa3 then changes to collect inherited stats on partitioned tables, since
they have no non-inherited stats, and since extended stats on partitioned
tables were intended to work since v10, and did work until 859b3003de stopped
collecting them.

In back branches, pg_statistic has inherited stats for partitioned tables, and
non-inherited stats otherwise.

-- 
Justin




Re: Release notes for February minor releases

2022-02-04 Thread Justin Pryzby
On Fri, Feb 04, 2022 at 02:58:59PM -0500, Tom Lane wrote:
> I've pushed the first draft for $SUBJECT at
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ab22eea83169c8d0eb15050ce61cbe3d7dae4de6
> 
> Please send comments/corrections by Sunday.

+ 
+  Build extended statistics for partitioned tables (Justin Pryzby)
+ 
+ 
+  A previous bug fix disabled building of extended statistics for
+  old-style inheritance trees, but it also prevented building them for
+  partitioned tables, which was an unnecessary restriction.
+  If you have created statistics objects for partitioned tables, you
+  may wish to explicitly ANALYZE those tables after
+  installing this update, rather than waiting for auto-analyze to do it.

Since autoanalyze still doesn't process partitioned tables, the last part
should be removed.

Probably it should say "..you *should* explicitly ANALYZE these tables..".

+ 
+  Ignore extended statistics for inheritance trees (Justin Pryzby)
+ 
+ 
+  A previous bug fix disabled building of extended statistics for
+  old-style inheritance trees, but any existing statistics data was
+  not removed, and that data would become more and more out-of-date
+  over time.  Adjust the planner to ignore such data.  Extended
+  statistics for the individual child tables are still built and used,
+  however.
+ 

The issue here isn't that old stats were never updated.  For inheritance, they
*were* updated with non-inherited stats (for SELECT FROM ONLY).  But then
"SELECT FROM tbl*" used the stats anyway...

+ 
+  Fix failure of SP-GiST indexes when indexed column's data type is
+  binary-compatible with the declared input type of the operator class
+  (Tom Lane)
+ 

maybe: when *the*

-- 
Justin




Re: Adding CI to our tree

2022-02-03 Thread Justin Pryzby
On Thu, Feb 03, 2022 at 11:57:18AM -0800, Andres Freund wrote:
> On 2022-02-02 21:58:28 -0600, Justin Pryzby wrote:
> > FYI: I've rebased these against your cirrus/windows changes.
> 
> Did you put then on a dedicated branch, or only intermixed with other changes?

Yes it's intermixed (because I have too many branches, and because in this case
it's useful to show the doc builds and coverage artifacts).

> > A recent cirrus result is here; this has other stuff in the branch, so you 
> > can
> > see the artifacts with HTML docs and code coverage.
> 
> I'm a bit worried about the increased storage and runtime overhead due to the
> docs changes. We probably can make it a good bit cheaper though.

If you mean overhead of additional disk operations, it shouldn't be an issue,
since the git clone uses shared references (not even hardlinks).
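
(For anyone unfamiliar, a "shared" clone just points the new working tree at the
existing object store, roughly like this; the target path is illustrative.)

# .git/objects/info/alternates in the clone points back at the source repo,
# so no objects are copied or hardlinked
git clone --shared . /tmp/postgres-docs-base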

If you meant storage capacity, I'm only uploading the *changed* docs as
artifacts.  The motivation being that it's a lot more convenient to look though
a single .html, and not hundreds.

> What's the idea behind
> #echo 'deb http://deb.debian.org/debian bullseye main' >>/etc/apt/sources.list
> That's already in sources.list, so I'm not sure what this shows?

At one point I thought I needed it - maybe all I needed was "apt-get update"..

> > 9d0f03d3450 wip: cirrus: code coverage
>
> I don't think it's good to just unconditionally reference the master branch
> here. It'll do bogus stuff once 15 is branched off. It works for cfbot, but
> not other uses.

That's only used for filtering changed files.  It uses git diff --cherry-pick
postgres/master..., which would *try* to avoid showing anything which is also
in master.

> Is looking at the .c files in the change really a reliable predictor of where
> code coverage changes? I'm doubtful. Consider stuff like removing the last
> user of some infrastructure or such. Or adding the first.

You're right that it isn't particularly accurate, but it's a step forward if
lots of patches use it to check/improve coverage of new code.

In addition to the HTML generated for changed .c files, it's possible to create
a coverage.gcov output for everything, which could be retrieved to generate
full HTML locally.  It's ~8MB (or 2MB after gzip).

> > bff64e8b998 cirrus: upload html docs as artifacts
> > 80f52c3b172 wip: only upload changed docs
> 
> Similar to the above.

Do you mean it's not reliable ?  This is looking at which .html have changed
(not sgml).  Surely that's reliable ?

> > 654b6375401 wip: vcsregress: add alltaptests
> 
> I assume this doesn't yet work to a meaningful degree? Last time I checked
> there were quite a few tests that needed to be invoked in a specific
> directory.

It works - tap_check() does chdir().  But it's slow, and maybe it should try to
implement a check-world target.  It currently fails in 027_stream_regress.pl,
although I keep hoping that it has been fixed...
https://cirrus-ci.com/task/6116235950686208

(BTW, I just realized that that commit should also remove the recoverycheck
call.)

-- 
Justin




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Justin Pryzby
On Thu, Feb 03, 2022 at 07:26:01PM -0800, Andres Freund wrote:
> Which reminds me: Perhaps we ought to hint about reducing / removing
> autovacuum cost limits in this situation? And perhaps make autovacuum absorb
> config changes while running? It's annoying that an autovac halfway into a
> huge table doesn't absorb changed cost limits for example.

I remembered this thread:

https://commitfest.postgresql.org/32/2983/
| Running autovacuum dynamic update to cost_limit and delay

https://www.postgresql.org/message-id/flat/13A6B954-5C21-4E60-BC06-751C8EA469A0%40amazon.com
https://www.postgresql.org/message-id/flat/0A3F8A3C-4328-4A4B-80CF-14CEBE0B695D%40amazon.com

-- 
Justin




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Justin Pryzby
On Tue, Feb 01, 2022 at 04:50:31PM -0500, John Naylor wrote:
> On Thu, Jan 27, 2022 at 8:28 PM Justin Pryzby  wrote:
> 
> > I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
> > +   if (options | VACOPT_MINIMAL)
> 
> Thanks for catching that! That copy-pasto was also masking my failure
> to process the option properly -- fixed in the attached as v5.

Thank the cfbot ;)

Actually, your most recent patch failed again without this:

-   if (params->VACOPT_EMERGENCY)
+   if (params->options & VACOPT_EMERGENCY)

> - It mentions the new command in the error hint in
> GetNewTransactionId(). I'm not sure if multi-word commands should be
> quoted like this.

Use  ?

> +  xid age or mxid age is older than 1 billion. To complete as quickly as 
> possible, an emergency
> +  vacuum will skip truncation and index cleanup, and will skip toast 
> tables whose age has not
> +  exceeded the cutoff.

Why does this specially mention toast tables ?

> +  While this option could be used while the postmaster is running, it is 
> expected that the wraparound
> +  failsafe mechanism will automatically work in the same way to prevent 
> imminent shutdown.
> +  When EMERGENCY is specified no tables may be 
> listed, since it is designed to

specified comma

> +  select candidate relations from the entire database.
> +  The only other option that may be combined with 
> VERBOSE, although in single-user mode no client messages 
> are

this is missing a word?
Maybe say: May not be combined with any other option, other than VERBOSE.

Should the docs describe that the vacuum is done with cost-based delay disabled
and with vacuum_freeze_table_age=0 (and other settings)?

> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("option 
> \"%s\" is incompatible with EMERGENCY", opt->defname),
> + 
> parser_errposition(pstate, opt->location)));

IMO new code should avoid using the outer parenthesis around errcode().

Maybe the errmsg should say: .. may not be specified with EMERGENCY.
EMERGENCY probably shouldn't be part of the translatable string.

+   if (strcmp(opt->defname, "emergency") != 0 &&
+   strcmp(opt->defname, "verbose") != 0 &&
+   defGetBoolean(opt))

It's wrong to call getBoolean(), since the options may not be boolean.
postgres=# VACUUM(EMERGENCY, INDEX_CLEANUP auto);
ERROR:  index_cleanup requires a Boolean value

I think EMERGENCY mode should disable process_toast.  It already processes
toast tables separately.  See 003.

Should probably exercise (EMERGENCY) in vacuum.sql.  See 003.

> > I still wonder if the relations should be processed in order of decreasing 
> > age.
> > An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
> > query might return thousands of tables, with a wide range of sizes and ages.
> 
> While that seems like a nice property to have, it does complicate
> things, so can be left for follow-on work.

I added that in the attached 003.

-- 
Justin
>From a870303f4bd62b7c653a4bef53ed6d2748268bc0 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Tue, 1 Feb 2022 16:50:31 -0500
Subject: [PATCH 1/3] do only critical work during single-user vacuum?

Feb 01 John Naylor
---
 doc/src/sgml/maintenance.sgml   |  12 ++--
 doc/src/sgml/ref/vacuum.sgml|  22 ++
 src/backend/access/transam/varsup.c |   4 +-
 src/backend/commands/vacuum.c   | 107 +---
 src/include/commands/vacuum.h   |   5 ++
 5 files changed, 134 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5b..5c360499504 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -629,17 +629,19 @@ HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that data
 
 
 ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
-HINT:  Stop the postmaster and vacuum that database in single-user mode.
+HINT:  Stop the postmaster and run "VACUUM (EMERGENCY)" in that database in single-user mode.
 
 
 The three-million-transaction safety margin exists to let the
 administrator recover without data loss, by manually executing the
-required VACUUM commands.  However, since the system will not
+required VACUUM command.  However, since the system will not
 execute commands once it has gone into the safety shutdown mode,
 the on

Re: Adding CI to our tree

2022-02-02 Thread Justin Pryzby
On Tue, Jan 18, 2022 at 03:08:47PM -0600, Justin Pryzby wrote:
> On Mon, Jan 17, 2022 at 12:16:19PM -0800, Andres Freund wrote:
> > I think it might still be worth adding stopgap way of running all tap tests 
> > on
> > windows though. Having a vcregress.pl function to find all directories with 
> > t/
> > and run the tests there, shouldn't be a lot of code...
> 
> I started doing that, however it makes CI/windows even slower.  I think it'll
> be necessary to run prove with all the tap tests to parallelize them, rather
> than looping around directories, many of which have only a single file, and 
> are
> run serially.

FYI: I've rebased these against your cirrus/windows changes.

A recent cirrus result is here; this has other stuff in the branch, so you can
see the artifacts with HTML docs and code coverage.

https://github.com/justinpryzby/postgres/runs/5046465342

95793675633 cirrus: spell ccache_maxsize
e5286ede1b4 cirrus: avoid unnecessary double star **
03f6de4643e cirrus: include hints how to install OS packages..
39cc2130e76 cirrus/linux: script test.log..
398b7342349 cirrus/macos: uname -a and export at end of sysinfo
9d0f03d3450 wip: cirrus: code coverage
bff64e8b998 cirrus: upload html docs as artifacts
80f52c3b172 wip: only upload changed docs
7f3b50f1847 pg_upgrade: show list of files copied only in vebose mode
ba229165fe1 wip: run pg_upgrade tests with data-checksums
97d7b039b9b vcregress/ci: test modules/contrib with NO_INSTALLCHECK=1
654b6375401 wip: vcsregress: add alltaptests

BTW, I think the double star added in f3feff825 probably doesn't need to be
double, either:

path: "crashlog-**.txt"

-- 
Justin




Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-01 Thread Justin Pryzby
> Subject: [PATCH v1 1/6] Rebaee David's patch against the latest code.

If you use git-am, then the author/commit information is preserved.
It's probably good to include a link to the patch in any case.

> Subject: [PATCH v1 4/6] remove duplicated qual executing.
> ---
>  src/backend/optimizer/path/equivclass.c | 22 +++
>  src/backend/optimizer/plan/createplan.c | 29 +++--
>  src/include/optimizer/paths.h           |  2 ++
>  src/test/regress/parallel_schedule      |  2 ++
>  4 files changed, 53 insertions(+), 2 deletions(-)

I think the ./ec_filter test is missing from this patch.

> Subject: [PATCH v1 6/6] adding some test cases for this feature and fix the 
> existing case 
>   
> 
The tests should be updated with the corresponding patches.  It's common for
the patches to be committed separately, like if 0001 is ready but the others are
still evolving.

I'm not sure whether you think this patch is ready to be added to a commitfest,
but do you know about the CI infrastructure ?  It allows running all the cfbot
tests for a github branch against 4 OSes, which helps catch portability issues,
including memory errors and unstable explain output.  See: src/tools/ci/README.
There's a failure in postgres_fdw; probably the output needs to be updated.

-- 
Justin




Re: GUC flags

2022-01-31 Thread Justin Pryzby
On Mon, Jan 31, 2022 at 02:17:41PM +0900, Michael Paquier wrote:
> With all those doc fixes, applied after an extra round of review.  So
> this makes us rather covered with the checks on the flags.

Thanks

> Now, what do we do with the rest of check_guc that involve a direct
> lookup at what's on disk.  We have the following:
> 1) Check the format of the option lists in guc.c.
> 2) Check the format of postgresql.conf.sample:
> -- Valid options preceded by a '#' character.
> -- Valid options followed by ' =', with at least one space before the
> equal sign.
> 3) Check that options not marked as NOT_IN_SAMPLE are in the sample
> file.
> 
> I have never seen 1) as a problem, and pgindent takes care of that at
> some degree.  2) is also mostly cosmetic, and committers are usually
> careful when adding a new GUC.  3) would be the most interesting
> piece, and would cover most cases if we consider that a default
> installation just copies postgresql.conf.sample over, as proposed
> upthread in 0002.
> 
> Now, 3) has also the problem that it would fail installcheck as one
> can freely add a developer option in the configuration.  We could

I'm not clear on what things are required/prohibited to allow/expect
"installcheck" to pass.  It's possible that postgresql.conf doesn't even exist
in the data dir, right ?

It's okay with me if the config_file-reading stuff isn't re-implemented.

-- 
Justin




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-01-31 Thread Justin Pryzby
On Mon, Jan 31, 2022 at 04:40:09PM -0500, Greg Stark wrote:
> 4) This isn't really an issue with your patch at all but why on earth
> do we have a bitvector for WAL compression methods?! Like, what does
> it mean to have multiple compression methods set? That should just be
> a separate field with values for each type of compression surely?

I don't have an answer to your question, but the discussion was here.

In the versions of the patches I sent on Mar 15, Mar 21, May 18, May 24, Jun
13, I avoided "one bit per compression method", but Michael thought this was
simpler.

https://www.postgresql.org/message-id/20210622031358.gf29...@telsasoft.com
On Mon, Jun 21, 2021 at 10:13:58PM -0500, Justin Pryzby wrote:
> +/* compression methods supported */
> +#define BKPIMAGE_COMPRESS_PGLZ 0x04
> +#define BKPIMAGE_COMPRESS_ZLIB 0x08
> +#define BKPIMAGE_COMPRESS_LZ4  0x10
> +#define BKPIMAGE_COMPRESS_ZSTD 0x20
> +#define BKPIMAGE_IS_COMPRESSED(info) \
> +   ((info & (BKPIMAGE_COMPRESS_PGLZ | BKPIMAGE_COMPRESS_ZLIB | \
> +             BKPIMAGE_COMPRESS_LZ4 | BKPIMAGE_COMPRESS_ZSTD)) != 0)
> 
> You encouraged saving bits here, so I'm surprised to see that your patches
> use one bit per compression method: 2 bits to support no/pglz/lz4, 3 to add
> zstd, and the previous patch used 4 bits to also support zlib.
> 
> There are spare bits available for that, but now there can be an inconsistency
> if two bits are set.  Also, 2 bits could support 4 methods (including "no").

On Tue, Jun 22, 2021 at 12:53:46PM +0900, Michael Paquier wrote:
> Yeah, I know.  I have just finished with that to get something
> readable for the sake of the tests.  As you say, the point is moot
> just we add one new method, anyway, as we need just one new bit.
> And that's what I would like to do for v15 with LZ4 as the resulting
> patch is simple.  It would be an idea to discuss more compression
> methods here once we hear more from users when this is released in the
> field, re-considering at this point if more is necessary or not.




Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

2022-01-30 Thread Justin Pryzby
On Thu, Jan 27, 2022 at 01:59:38PM -0800, Peter Geoghegan wrote:
> On Thu, Jan 27, 2022 at 12:20 PM Peter Geoghegan  wrote:
> > Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
> > to assume that it's only something that VACUUM can ever do. Why
> > wouldn't we expect a plain ANALYZE to have actually been the last
> > thing to update pg_class.reltuples for an append-only table? Wouldn't
> > that lead to less frequent (perhaps infinitely less frequent)
> > vacuuming for an append-only table, relative to the documented
> > behavior of autovacuum_vacuum_insert_scale_factor?
> 
> PgStat_StatTabEntry.inserts_since_vacuum will continue to grow and
> grow as more tuples are inserted, until VACUUM actually runs, no
> matter what. That largely explains why this bug was missed before now:
> it's inevitable that inserts_since_vacuum will become large at some
> point -- even large relative to a bogus scaled
> pg_class.reltuples-at-ANALYZE threshold (unless ANALYZE hasn't been
> run since the last VACUUM, in which case pg_class.reltuples will be at
> the expected value anyway). And so we'll eventually get to the point
> where so many unvacuumed inserted tuples have accumulated that an
> insert-driven autovacuum still takes place.

Maybe I missed your point, but I think this may not rise to the level of
being a "bug".

If we just vacuumed an insert-only table, and then insert some more, and
autoanalyze runs and updates reltuples, what's wrong with vac_ins_scale_factor
* reltuples + vac_ins_base_thresh ?
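
For reference, a rough way to see that threshold from SQL, using the default
scale factor 0.2 and base threshold 1000 (the table name is made up):

  SELECT c.relname, c.reltuples, s.n_ins_since_vacuum,
         0.2 * c.reltuples + 1000 AS insert_vacuum_threshold
  FROM pg_class c
  JOIN pg_stat_user_tables s ON s.relid = c.oid
  WHERE c.relname = 'append_only_tbl';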

You're saying reltuples should be the number of tuples at the last vacuum
instead of the most recent value from either vacuum or analyze ?

It's true that the vacuum threshold may be hit later than if reltuples hadn't
been updated by ANALYZE.  If that's what you're referring to, that's the
behavior of scale factor in general.  If a table is growing in size at a
constant rate, analyze will run at decreasing frequency.  With the default 20%
scale factor, it'll first run at 1.2x the table's initial size, then at 1.44
(not 1.4), then at 1.728 (not 1.6), then at 2.0736 (not 1.8).  That's not
necessarily desirable, but it's not necessarily wrong, either.  If your table
doubles in size, you might have to adjust these things.  Maybe there should be
another knob allowing perfect, "geometric" (or other) frequency, but the
behavior is not new in this patch.
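
To illustrate the series (plain arithmetic, not tied to any table):

  SELECT n, round(1.2 ^ n, 4) AS threshold_vs_initial_size
  FROM generate_series(1, 4) AS n;
  -- 1 | 1.2000
  -- 2 | 1.4400
  -- 3 | 1.7280
  -- 4 | 2.0736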

We talked about that here.
https://www.postgresql.org/message-id/flat/20200305172749.GK684%40telsasoft.com#edac59123843f9f8e1abbc2b570c76f1

With the default values, analyze happens after 10% growth, and vacuum happens
after 20% (which ends up being 22% of the initial table size).  The goal of
this patch was to have inserts trigger autovacuum *at all*.  This behavior may
be subtle or non-ideal, but not a problem?  The opposite thing could also
happen - *vacuum* could update reltuples, causing the autoanalyze threshold to
be hit a bit later.  

-- 
Justin




Re: GUC flags

2022-01-29 Thread Justin Pryzby
On Sat, Jan 29, 2022 at 03:38:53PM +0900, Michael Paquier wrote:
> +-- Three exceptions as of transaction_*
> +SELECT name FROM pg_settings_flags
> +  WHERE NOT no_show_all AND no_reset_all
> +  ORDER BY 1;
> +          name          
> +------------------------
> + transaction_deferrable
> + transaction_isolation
> + transaction_read_only
> +(3 rows)

I think "as of" is not the right phrase here.
Maybe say: Exceptions are transaction_*

> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -23596,6 +23596,45 @@ SELECT 
> pg_type_is_visible('myschema.widget'::regtype);
> +   
> +Returns an array of the flags associated with the given GUC, or
> +NULL if the  does not exist.  The result is

I guess it should say "if the GUC does not exist".

> +an empty array if the GUC exists but there are no flags to show,
> +as supported by the list below.

I'd say "...but none of the GUC's flags are exposed by this function."

> +The following flags are exposed (the most meaningful ones are
> +included):

"The most meaningful ones are included" doesn't seem to add anything.
Maybe it'd be useful to say "(Only the most useful flags are exposed)"

> +  EXPLAIN, parameters included in
> +  EXPLAIN commands.
> + 
> + 

I think the description is wrong, or just copied from the others.
EXPLAIN is for GUCs which are shown in EXPLAIN(SETTINGS).
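
For reference, what that flag controls: any GUC_EXPLAIN parameter changed from
its default shows up in the Settings line, e.g.:

  SET enable_seqscan = off;
  EXPLAIN (SETTINGS) SELECT 1;
  -- ...
  -- Settings: enable_seqscan = 'off'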

|EXPLAIN, parameters included in EXPLAIN commands.
|NO_SHOW_ALL, parameters excluded from SHOW ALL commands.
|NO_RESET_ALL, parameters excluded from RESET ALL commands.
|NOT_IN_SAMPLE, parameters not included in postgresql.conf by default.
|RUNTIME_COMPUTED, runtime-computed parameters. 

Instead of a comma, these should use a colon, or something else?

-- 
Justin




Re: warn if GUC set to an invalid shared library

2022-01-28 Thread Justin Pryzby
Thanks for looking

On Fri, Jan 28, 2022 at 11:36:20PM +, Cary Huang wrote:
> This is fine as this is what these patches are aiming to provide. However, 
> when I try to restart the server, it fails to start because abc.so and xyz.so 
> do not exist. Setting the parameters "local_preload_libraries" and 
> "local_preload_libraries" to something else in postgresql.conf does not seem 
> to take effect either.
> It still complains shared_preload_libraries abc.so does not exist even though 
> I have already set shared_preload_libraries to something else in 
> postgresql.conf. This seems a little strange to me 

Could you show exactly what you did and the output ?

The patches don't entirely prevent someone from putting the server config into
a bad state.  It only aims to tell them if they've done that, so they can fix
it, rather than letting someone (else) find the error at some later (probably
inconvenient) time.

ALTER SYSTEM adds config into postgresql.auto.conf.  If you stop the server
after adding bad config there (after getting a warning), the server won't
start.  Once the server is off, you have to remove it manually.
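
To illustrate that sequence (the WARNING is what these patches add; the library
name is made up):

  ALTER SYSTEM SET shared_preload_libraries = 'no_such_library';
  -- WARNING:  could not load library: ... No such file or directory
  -- The bad value is now in postgresql.auto.conf; the next restart will fail,
  -- and the file has to be edited manually while the server is down.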

The goal of the patch is to 1) warn someone that they've put a bad config in
place, so they don't leave it there; and, 2) if the server fails to start for
such a reason, provide a CONTEXT line to help them resolve it quickly.

Maybe you know all that and I didn't understand what you're saying.

-- 
Justin




Re: support for MERGE

2022-01-28 Thread Justin Pryzby
On Fri, Jan 28, 2022 at 05:27:37PM -0300, Alvaro Herrera wrote:
> The one thing I'm a bit bothered about is the fact
> that we expose a lot of executor functions previously static.  I am now
> wondering if it would be better to move the MERGE executor support
> functions into nodeModifyTable.c, which I think would mean we would not
> have to expose those function prototypes.

It's probably a good idea.  

If you wanted to avoid bloating nodeModifyTable.c, maybe you could
#include "execMerge.c"

>From commit message:

> MERGE does not yet support inheritance,

It does support it now, right ?

>From merge.sgml:

"If you specify an update action...":
=> should say "If an update action is specified, ..."

s/an delete/a delete/

".. the WHEN clause is executed"
=> should say "the WHEN clause's action is executed" ?

" If a later WHEN clause of that kind is specified"
=> + COMMA

> --- a/doc/src/sgml/ref/allfiles.sgml
> +++ b/doc/src/sgml/ref/allfiles.sgml
> @@ -159,6 +159,7 @@ Complete list of usable sgml source files in this 
> directory.
>  
>  
>  
> +
>  
>  
>  

Looks like this is intended to be in alpha order.

> +  insert, update, or delete rows of a table based upon source 
> data

based on ?

> --- a/src/backend/executor/README
> +++ b/src/backend/executor/README
> @@ -41,6 +41,19 @@ be used for other table types.)  For DELETE, the plan tree 
> need only deliver
>  junk row-identity column(s), and the ModifyTable node visits each of those
>  rows and marks the row deleted.
>  
> +MERGE runs one generic plan that returns candidate change rows. Each row
> +consists of the output of the data-source table or query, plus CTID and
> +(if the target table is partitioned) TABLEOID junk columns.  If the target

s/partitioned/has child tables/ ?

>   case CMD_INSERT:
>   case CMD_DELETE:
>   case CMD_UPDATE:
> + case CMD_MERGE:

Is it intended to stay in alpha order (?)

> + case WCO_RLS_MERGE_UPDATE_CHECK:
> + case WCO_RLS_MERGE_DELETE_CHECK:
> + if (wco->polname != NULL)
> + ereport(ERROR,
> + 
> (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +  errmsg("target 
> row violates row-level security policy \"%s\" (USING expression) for table 
> \"%s\"",
> + 
> wco->polname, wco->relname)));

The parens around errcode are optional and IMO should be avoided for new code.

> +  * This duplicates much of the logic in ExecInitMerge(), so something
> +  * changes there, look here too.

so *if* ?

>   case T_InsertStmt:
>   case T_DeleteStmt:
>   case T_UpdateStmt:
> + case T_MergeStmt:
>   lev = LOGSTMT_MOD;
>   break;

alphabetize (?)

> + /* selcondition */
> + "c.relkind IN (" CppAsString2(RELKIND_RELATION) ", "
> + CppAsString2(RELKIND_PARTITIONED_TABLE) ") AND "
> + "c.relhasrules = false AND "
> + "(c.relhassubclass = false OR "
> + " c.relkind = " CppAsString2(RELKIND_PARTITIONED_TABLE) ")",

relhassubclass=false is wrong now ?

> +-- prepare
> +RESET SESSION AUTHORIZATION;
> +DROP TABLE target, target2;
> +DROP TABLE source, source2;
> +DROP FUNCTION merge_trigfunc();
> +DROP USER merge_privs;
> +DROP USER merge_no_privs;

Why does it say "prepare" ?
I think it means to say "Clean up"

WRITE_READ_PARSE_PLAN_TREES exposes errors in make check:
+ERROR:  targetColnos does not match subplan target list

Have you looked at code coverage ?  I have an experimental patch to add that to
cirrus, and ran it with this patch; visible here:
https://cirrus-ci.com/task/6362512059793408

-- 
Justin




Re: GUC flags

2022-01-27 Thread Justin Pryzby
On Wed, Jan 26, 2022 at 03:29:29PM +0900, Michael Paquier wrote:
> On Tue, Jan 25, 2022 at 09:44:26PM -0600, Justin Pryzby wrote:
> > It seems like an arbitrary and short-sighted policy to expose a handful of
> > flags in the view for the purpose of retiring ./check_guc, but not expose 
> > other
> > flags, because we thought we knew that no user could ever want them.
> > 
> > We should either expose all the flags, or should put them into an 
> > undocumented
> > function.  Otherwise, how would we document the flags argument ?  "Shows 
> > some
> > of the flags" ?  An undocumented function avoids this issue.
> 
> My vote would be to have a documented function, with a minimal set of
> the flags exposed and documented, with the option to expand that in
> the future.  COMPUTED and EXPLAIN are useful, and allow some of the
> automated tests to happen.  NOT_IN_SAMPLE and GUC_NO_SHOW_ALL are less 
> useful for the user, and are more developer oriented, but are useful
> for the tests.  So having these four seem like a good first cut.

I implemented that (But my own preference would still be for an *undocumented*
function which returns whatever flags we find to be useful to include.  Or
alternately, a documented function which exposes every flag).

> +SELECT lower(name) FROM pg_settings_flags WHERE NOT not_in_sample EXCEPT
> +SELECT regexp_replace(ln, '^#?([_[:alpha:]]+) (= .*|[^ ]*$)', '\1') AS guc
> +FROM (SELECT regexp_split_to_table(pg_read_file('postgresql.conf'),
> '\n') AS ln) conf
> 
> Tests reading postgresql.conf would break on instances started with a
> custom config_file provided by a command line, no?

Maybe you misunderstood - I'm not reading the file specified by
current_setting('config_file').  Rather, I'm reading
tmp_check/data/postgresql.conf, which is copied from the sample conf.
Do you see an issue with that ?

The regression tests are only intended to be run from a postgres source dir; if
someone runs them from somewhere else and they "fail", I think that's because
that assumption was violated, not because of a problem with the test.

I wondered if it should chomp off anything added by pg_regress --temp-config.
However, that's either going to be a valid guc (or else it would fail some other
test), or an extension's guc (which this isn't testing), which has a dot and
which this regex doesn't match, so it doesn't cause false positives.

-- 
Justin
>From d09cc770a3e307f55843942a850f7859c63d4c76 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 5 Dec 2021 23:22:04 -0600
Subject: [PATCH 1/2] check_guc: fix absurd number of false positives

Avoid false positives;
Avoid various assumptions;
Avoid list of exceptions;
Simplify shell/awk/sed/grep;

This requires GNU awk for RS as a regex.
---
 src/backend/utils/misc/check_guc | 69 +---
 1 file changed, 10 insertions(+), 59 deletions(-)

diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index b171ef0e4ff..323ca13191b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -1,78 +1,29 @@
-#!/bin/sh
+#! /bin/sh
+set -e
 
-## currently, this script makes a lot of assumptions:
+## this script makes some assumptions about the formatting of files it parses.
 ## in postgresql.conf.sample:
 ##   1) the valid config settings may be preceded by a '#', but NOT '# '
 ##  (we use this to skip comments)
-##   2) the valid config settings will be followed immediately by  ' ='
-##  (at least one space preceding the '=')
-## in guc.c:
-##   3) the options have PGC_ on the same line as the option
-##   4) the options have '{' on the same line as the option
-
-##  Problems
-## 1) Don't know what to do with TRANSACTION ISOLATION LEVEL
-
-## if an option is valid but shows up in only one file (guc.c but not
-## postgresql.conf.sample), it should be listed here so that it
-## can be ignored
-INTENTIONALLY_NOT_INCLUDED="debug_deadlocks in_hot_standby \
-is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
-pre_auth_delay role seed server_encoding server_version server_version_num \
-session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
-trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
 
 ### What options are listed in postgresql.conf.sample, but don't appear
 ### in guc.c?
 
-# grab everything that looks like a setting and convert it to lower case
-SETTINGS=`grep ' =' postgresql.conf.sample |
-grep -v '^# ' | # strip comments
-sed -e 's/^#//' |
-awk '{print $1}'`
-
-SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'`
+# grab everything that looks like a setting
+SETTINGS=`sed '/^#[[:alnum:]]/!d; s/^#//; s/ =.*//; /^include/d' postgresql.conf.sample`
 
 for i in $SETTINGS ; do
-  hidden=0
   ## it sure would be

Re: do only critical work during single-user vacuum?

2022-01-27 Thread Justin Pryzby
On Fri, Jan 21, 2022 at 05:41:58PM -0500, John Naylor wrote:
> On Wed, Jan 19, 2022 at 5:26 PM Michael Paquier  wrote:
> >
> > Could you avoid introducing a new grammar pattern in VACUUM?  Any new
> > option had better be within the parenthesized part as it is extensible
> > at will with its set of DefElems.
> 
> This new behavior is not an option that one can sensibly mix with
> other options as the user sees fit, but rather hard-codes the
> parameters for its single purpose. That said, I do understand your
> objection.

This seems better, and it's shorter too.

I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
+   if (options | VACOPT_MINIMAL)   


It should either refuse to run if a list of tables is specified with MINIMAL,
or it should filter that list by XID condition.

As for the name, it could be MINIMAL or FAILSAFE or EMERGENCY or ??
I think the name should actually be a bit more descriptive, and maybe say XID,
like MINIMAL_XID or XID_EMERGENCY...

Normally, options are independent, but VACUUM (MINIMAL) is a "shortcut" to a
hardcoded set of options: freeze on, truncate off, cleanup off.  So it refuses
to be combined with other options - good.
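
In SQL terms, per selected table it amounts to something like this (a sketch;
the XID-based table selection is the part with no existing user-facing
equivalent):

  VACUUM (FREEZE, INDEX_CLEANUP off, TRUNCATE off) some_old_table;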

This is effectively a shortcut to hypothetical parameters for selecting tables
by XID/MXID age.  In the future, someone could debate adding user-facing knobs
for table selection by age.

I still wonder if the relations should be processed in order of decreasing age.
An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
query might return thousands of tables, with a wide range of sizes and ages.

Processing them in order of decreasing age would allow the admin to quickly
vacuum the oldest tables, and optionally interrupt vacuum to get out of single
user mode ASAP - even if they just want to run VACUUM(MINIMAL) in a normal
backend when services aren't offline.  Processing them out of order might be
pretty surprising - they might run vacuum for an hour (or overnight), cancel
it, attempt to start the DB in normal mode, and conclude that it made no
visible progress.
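
Something along these lines for building the candidate list, oldest first (a
sketch; the age-cutoff filter is omitted here):

  SELECT c.oid::regclass AS relation,
         greatest(age(c.relfrozenxid), mxid_age(c.relminmxid)) AS xid_age
  FROM pg_class c
  WHERE c.relkind IN ('r', 'm', 't')
  ORDER BY 2 DESC;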

On Fri, Jan 21, 2022 at 12:59 AM Masahiko Sawada  wrote:
> > The purpose of this thread is to provide a way for users to run vacuum
> > only very old tables (while skipping index cleanup, etc.),
> 
> Ah, thank you Sawada-san, now I understand why we have been talking
> past each other. The purpose is actually:
> 
> - to have a simple, easy to type, command
> - intended for single-user mode, but not limited to it (so it's easy to test)
> - to get out of single user mode as quickly as possible
>From 03c567bb534219acdd76b0acc40e544c76f938e5 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Fri, 21 Jan 2022 17:41:58 -0500
Subject: [PATCH] do only critical work during single-user vacuum?

Jan 21 John Naylor
---
 src/backend/commands/vacuum.c | 76 +--
 src/include/commands/vacuum.h |  1 +
 2 files changed, 74 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index d1dadc54e47..c7bc97d3f76 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -52,6 +52,7 @@
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/fmgroids.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -114,6 +115,7 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	bool		full = false;
 	bool		disable_page_skipping = false;
 	bool		process_toast = true;
+	bool		wraparound = false;
 	ListCell   *lc;
 
 	/* index_cleanup and truncate values unspecified for now */
@@ -200,6 +202,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	params.nworkers = nworkers;
 			}
 		}
+		else if (strcmp(opt->defname, "wraparound") == 0)
+			wraparound = defGetBoolean(opt);
 		else
 			ereport(ERROR,
 	(errcode(ERRCODE_SYNTAX_ERROR),
@@ -246,17 +250,51 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 		}
 	}
 
+	if (wraparound)
+	{
+		/* exclude incompatible options */
+		foreach(lc, vacstmt->options)
+		{
+			DefElem*opt = (DefElem *) lfirst(lc);
+
+			// WIP is there a better way?
+			if (strcmp(opt->defname, "wraparound") != 0 &&
+strcmp(opt->defname, "verbose") != 0 &&
+defGetBoolean(opt))
+
+ereport(ERROR,
+		(errcode(ERRCODE_SYNTAX_ERROR),
+errmsg("option \"%s\" is incompatible with WRAPAROUND", opt->defname),
+parser_errposition(pstate, opt->location)));
+		}
+
+		/* skip unnecessary work, as in failsafe mode */
+		params.index_cleanup = VACOPTVALUE_DISABLED;
+		params.truncate = VACOPTVALUE_DISABLED;
+	}
+
 	/*
-	 * All freeze ages are zero if the FREEZE option is given; otherwise pass
-	 * them as -1 which means to use the default values.
+	 * Set freeze ages to zero where appropriate; 

Re: warn if GUC set to an invalid shared library

2022-01-27 Thread Justin Pryzby
On Sun, Jan 09, 2022 at 11:58:18AM -0800, Maciek Sakrejda wrote:
> On Sat, Jan 8, 2022 at 2:07 PM Justin Pryzby  wrote:
> > Unfortunately, the output for dlopen() is not portable, which (I think) 
> > means
> > most of what I wrote can't be made to work..  Since it doesn't work to call
> > dlopen() when using SET, I tried using just stat().  But that also fails on
> > windows, since one of the regression tests has an invalid filename involving
> > unbalanced quotes, which cause it to return EINVAL rather than ENOENT.  So 
> > SET
> > cannot warn portably, unless it includes no details at all (or we specially
> > handle the windows case), or change the pre-existing regression test.  But
> > there's a 2nd instability, too, apparently having to do with timing.  So I'm
> > planning to drop the 0001 patch.
> 
> Hmm. I think 001 is a big part of the usability improvement here.

I agree - it helps people avoid causing a disruption, rather than just helping
them to fix it faster.

> Could we not at least warn generically, without relaying the
> underlying error? The notable thing in this situation is that the
> specified library could not be loaded (and that it will almost
> certainly cause problems on restart). The specific error would be nice
> to have, but it's less important. What is the timing instability?

I saw regression diffs like this, showing that the warning could be displayed
before or after the SELECT was echoed.

https://cirrus-ci.com/task/6301672321318912
-SELECT * FROM schema4.counted;
 WARNING:  could not load library: $libdir/plugins/worker_spi: cannot open 
shared object file: No such file or directory
+SELECT * FROM schema4.counted;

It's certainly possible to show a static message without additional text from
errno.

On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby  wrote:
> > > For whatever reason, I get slightly different (and somewhat redundant)
> > > output on failing to start:
> > >
> > > 2022-01-08 12:59:36.784 PST [324482] WARNING:  could not load library: 
> > > $libdir/plugins/totally bogus: cannot open shared object file: No such 
> > > file or directory
> > > 2022-01-08 12:59:36.787 PST [324482] FATAL:  could not load library: 
> > > totally bogus: cannot open shared object file: No such file or directory
> > > 2022-01-08 12:59:36.787 PST [324482] LOG:  database system is shut down
> >
> > I think the first WARNING is from the GUC mechanism "setting" the library.
> > And then the FATAL is from trying to apply the GUC.
> > It looks like you didn't apply the 0002 patch for that test so got no 
> > CONTEXT ?
> 
> I still had the terminal open where I tested this, and the scrollback
> did show me applying the patch (and building after). I tried a make
> clean and applying the patch again, and I do see the CONTEXT line now.
> I'm not sure what the problem was but seems like PEBKAC--sorry about
> that.

Maybe you missed "make install" or similar.

I took the liberty of adding you as a reviewer here:
https://commitfest.postgresql.org/36/3482/

-- 
Justin
>From 53bf0c82ea76a00fbf1c95c4833f1bd302228d25 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 18 Dec 2021 22:51:01 -0600
Subject: [PATCH v3 1/2] errcontext if server fails to start due to library
 GUCs

---
 src/backend/utils/fmgr/dfmgr.c| 20 +++-
 src/backend/utils/init/miscinit.c |  2 +-
 src/include/fmgr.h|  1 +
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/fmgr/dfmgr.c b/src/backend/utils/fmgr/dfmgr.c
index 050da780804..40eed5b5530 100644
--- a/src/backend/utils/fmgr/dfmgr.c
+++ b/src/backend/utils/fmgr/dfmgr.c
@@ -76,7 +76,7 @@ static DynamicFileList *file_tail = NULL;
 
 char	   *Dynamic_library_path;
 
-static void *internal_load_library(const char *libname);
+static void *internal_load_library(const char *libname, const char *gucname);
 static void incompatible_module_error(const char *libname,
 	  const Pg_magic_struct *module_magic_data) pg_attribute_noreturn();
 static void internal_unload_library(const char *libname);
@@ -115,7 +115,7 @@ load_external_function(const char *filename, const char *funcname,
 	fullname = expand_dynamic_library_name(filename);
 
 	/* Load the shared library, unless we already did */
-	lib_handle = internal_load_library(fullname);
+	lib_handle = internal_load_library(fullname, NULL);
 
 	/* Return handle if caller wants it */
 	if (filehandle)
@@ -144,6 +144,14 @@ load_external_function(const char *filename, const char *funcname,
  */
 void
 load_file(const char *filename, bool restricted)
+{
+	load_file_guc(filename, restricted, NULL);
+}
+
+/*
+ * Also accepts a GUC arg, for error reports
+ */
+void load_file_guc(const char *filename, bool restri

Re: Write visibility map during CLUSTER/VACUUM FULL

2022-01-27 Thread Justin Pryzby
On Sun, Dec 26, 2021 at 08:59:31PM -0600, Justin Pryzby wrote:
> Rebased on 8e1fae193864527c931a704bd7908e4fbc983f5c.
> 
> Would someone step up to "own" this patch ?
> 
> If not, its CF entry may need to be closed (there's no status for "needs
> author").

I'm planning to close this patch until someone can shepherd it.

-- 
Justin




Re: Output clause for Upsert aka INSERT...ON CONFLICT

2022-01-26 Thread Justin Pryzby
On Thu, Jan 27, 2022 at 10:24:14AM +0530, Anand Sowmithiran wrote:
> The INSERT...ON CONFLICT is used for doing upserts in one of our app.
> Our app works with both MS SQL and Postgresql, based on customer needs.
> 
> Unlike the MS SQL MERGE command's OUTPUT clause that gives the $action
> 
> [INSERT / UPDATE  /DELETE] that was done during the upsert, the RETURNING
> clause of the pgsql does not return the action done.
> We need this so that the application can use that for auditing and UI
> purposes.
> Is there any workaround to get this info ?

Thomas already answered about the xmax hack, but I dug these up in the
meantime.

https://www.postgresql.org/message-id/flat/CAA-aLv4d%3DzHnx%2BzFKqoszT8xRFpdeRNph1Z2uhEYA33bzmgtaA%40mail.gmail.com#899e15b8b357c6b29c51d94a0767a601
https://www.postgresql.org/message-id/flat/1565486215.7551.0%40finefun.com.au
https://www.postgresql.org/message-id/flat/20190724232439.lpxzjw2jg3ukgcqn%40alap3.anarazel.de
https://www.postgresql.org/message-id/flat/DE57F14C-DB96-4F17-9254-AD0AABB3F81F%40mackerron.co.uk
https://www.postgresql.org/message-id/CAM3SWZRmkVqmRCs34YtZPOCn%2BHmHqtcdEmo6%3D%3Dnqz1kNA43DVw%40mail.gmail.com

https://stackoverflow.com/questions/39058213/postgresql-upsert-differentiate-inserted-and-updated-rows-using-system-columns-x/39204667
https://stackoverflow.com/questions/40878027/detect-if-the-row-was-updated-or-inserted/40880200#40880200
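
For completeness, the xmax hack itself looks like this (table and column names
are made up); a row that was inserted comes back with xmax = 0, an updated row
does not:

  INSERT INTO t (id, val) VALUES (1, 'x')
  ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val
  RETURNING (xmax = 0) AS inserted;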




Re: GUC flags

2022-01-25 Thread Justin Pryzby
On Wed, Jan 26, 2022 at 09:54:43AM +0900, Michael Paquier wrote:
> On Tue, Jan 25, 2022 at 12:07:51PM -0600, Justin Pryzby wrote:
> > On Tue, Jan 25, 2022 at 11:47:14AM +0100, Peter Eisentraut wrote:
> >> Does this stuff have any value for users?  I'm worried we are exposing a
> >> bunch of stuff that is really just for internal purposes.  Like, what value
> >> does showing "not_in_sample" have?  On the other hand, "guc_explain" might
> >> be genuinely useful, since that is part of a user-facing feature.  (I don't
> >> like the "guc_*" naming though.)
> 
> EXPLAIN is useful to know which parameter could be part of an explain
> query, as that's not an information provided now, even if the category
> provides a hint.  COMPUTED is also useful for the purpose of postgres
> -C in my opinion.

It seems like an arbitrary and short-sighted policy to expose a handful of
flags in the view for the purpose of retiring ./check_guc, but not expose other
flags, because we thought we knew that no user could ever want them.

We should either expose all the flags, or should put them into an undocumented
function.  Otherwise, how would we document the flags argument ?  "Shows some
of the flags" ?  An undocumented function avoids this issue.

Should I update the patch to put the function back ?
Should I also make the function expose all of the flags ?

-- 
Justin




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-01-25 Thread Justin Pryzby
On Sun, Jan 02, 2022 at 01:07:29PM +0100, Fabien COELHO wrote:
> One liner doc improvement to tell that creation time is only available on 
> windows.
> It is indeed not available on Linux.

The change is about the "isdir" flag, not creation time.

 Returns a record containing the file's size, last access time stamp,
 last modification time stamp, last file status change time stamp (Unix
 platforms only), file creation time stamp (Windows only), and a flag
-indicating if it is a directory.
+indicating if it is a directory (or a symbolic link to a directory).

> # part 03
> ISTM that the "if (1) { if (2) continue; } else if(3) { if (4) continue; }" 
> structure"
> may be simplified to "if (1 && 2) continue; if (3 && 4) continue;", at least 
> if
> IS_DIR and IS_REG are incompatible?

No, what you suggested is not the same;

We talked about this before:
https://www.postgresql.org/message-id/20200315212729.gc26...@telsasoft.com

> Otherwise, at least "else if (3 & 4) continue"?

I could write the *final* "else if" like that, but then it would be different
from the previous case, which would be confusing and prone to mistakes.

If I wrote it like this, I think it'd just provoke suggestions from someone
else to change it differently:

/* Skip dirs or special files? */
if (S_ISDIR(attrib.st_mode) && !(flags & LS_DIR_SKIP_DIRS))
	continue;
if (!S_ISDIR(attrib.st_mode) && !S_ISREG(attrib.st_mode) &&
	!(flags & LS_DIR_SKIP_SPECIAL))
	continue;

...
<< Why don't you use "else if" instead of "if (a){} if (!a && b){}" >>

I'm going to leave it up to a committer.

> The ifdef WIN32 (which probably detects windows 64 bits…) overwrites 
> values[3]. ISTM
> it could be reordered so that there is no overwrite, and simpler single 
> assignements.
> 
>   #ifndef WIN32
> v = ...;
>   #else
> v = ... ? ... : ...;
>   #endif

I changed this but without using nested conditionals.

> Add a new "isdir" column to "pg_ls_tmpdir" output.  This is a small behavioral
> change.  I'm ok with it, however I'm unsure why we would not jump directly to
> the "type" char column done later in the patch series.

Because that depends on lstat().

> ISTM all such functions
> should be extended the same way for better homogeneity? That would also impact
> "waldir", "archive_status", "logical_*", "replslot" variants. "make check" ok.

I agree that makes sense, however others have expressed the opposite opinion.
https://www.postgresql.org/message-id/CALj2ACWtrt5EkHrY4WAZ4Cv42SidXAwpeQJU021bxaKpjmbGfA%40mail.gmail.com

The original motive for the patch was that pg_ls_tmpdir doesn't show shared
filesets.  This fixes that essential problem without immediately dragging
everything else along.  I think it's more likely that a committer would merge
them both.  But I don't know, and it's easy to combine patches if desired.

> This patch applies my previous advice:-) ISTM that parts 4 and 5 should be one
> single patch. The test changes show that only waldir has a test. Would it be
> possible to add minimal tests to other variants as well?  "make check" ok.

I have added tests, although some are duplicative.

> This part extends and adds a test for pg_ls_logdir. ISTM that it should
> be merged with the previous patches.  "make check" is ok.

It's separate to allow writing a separate commit message, since it does
something unrelated to the other patches.  What other patch would it be
merged with ?
| v32-0006-pg_ls_logdir-to-ignore-error-if-initial-top-dir-.patch 

> ISTM that the documentation should be clear about windows vs unix/cygwin 
> specific
> data provided (change/creation).

I preferred to refer to pg_stat_file rather than repeat it for all 7 functions
currently in v15 (and future functions added for new, toplevel dirs).

> # part 11
> 
> This part adds a recurse option. Why not. However, the true value does not
> seem to be tested? "make check" is ok.

WDYM the true value?  It's tested like:

+-- Exercise recursion
+select path, filename, type from pg_ls_dir_metadata('.', true, false, true) 
where
+path in ('base', 'base/pgsql_tmp', 'global', 'global/pg_control', 
'global/pg_filenode.map', 'PG_VERSION', 'pg_multixact', 'pg_multixact/members', 
'pg_multixact/offsets', 'pg_wal', 'pg_wal/archive_status')
+-- (type='d' or 
path~'^(global/.*|PG_VERSION|postmaster\.opts|postmaster\.pid|pg_logical/replorigin_checkpoint)$')
 and filename!~'[0-9]'
+order by path collate "C", filename collate "C";
+  path  |filename | type 
+-

Re: GUC flags

2022-01-25 Thread Justin Pryzby
On Tue, Jan 25, 2022 at 11:47:14AM +0100, Peter Eisentraut wrote:
> On 25.01.22 02:07, Justin Pryzby wrote:
> > +CREATE TABLE pg_settings_flags AS SELECT name, category,
> > +   'NO_SHOW_ALL'   =ANY(flags) AS no_show_all,
> > +   'NO_RESET_ALL'  =ANY(flags) AS no_reset_all,
> > +   'NOT_IN_SAMPLE' =ANY(flags) AS not_in_sample,
> > +   'EXPLAIN'   =ANY(flags) AS guc_explain,
> > +   'COMPUTED'  =ANY(flags) AS guc_computed
> > +   FROM pg_show_all_settings();
> 
> Does this stuff have any value for users?  I'm worried we are exposing a
> bunch of stuff that is really just for internal purposes.  Like, what value
> does showing "not_in_sample" have?  On the other hand, "guc_explain" might
> be genuinely useful, since that is part of a user-facing feature.  (I don't
> like the "guc_*" naming though.)
> 
> Your patch doesn't contain a documentation change, so I don't know how and
> to what extend this is supposed to be presented to users.

I want to avoid putting this in pg_settings.

The two options discussed so far are:
 - to add an function to return the flags;
 - to add the flags to pg_show_all_settings(), but not show it in pg_settings 
view;

I interpreted Michael's suggestion as adding it to pg_show_all_settings(), but
*not* including it in the pg_settings view.  Now it seems like I misunderstood,
and Michael wants to add it to the view.

But, even if we only handle the 5 flags we have an immediate use for, it makes
the user-facing view too "wide", just to accommodate this internal use.

If it were in the pg_settings view, I think it ought to have *all* the flags
(not just the flags that help us to retire ./check_guc).  That's much too much.

-- 
Justin




Re: pg_upgrade should truncate/remove its logs before running

2022-01-25 Thread Justin Pryzby
On Mon, Jan 24, 2022 at 10:59:40AM +0900, Michael Paquier wrote:
> On Thu, Jan 20, 2022 at 07:51:37PM +0900, Michael Paquier wrote:
> > Neat idea.  That would work fine for my case.  So I am fine to stick
> > with this suggestion. 
> 
> I have been looking at this idea, and the result is quite nice, being
> simpler than anything that has been proposed on this thread yet.  We
> get a simpler removal logic, and there is no need to perform any kind
> of sanity checks with the output path provided as long as we generate
> the paths and the dirs after adjust_data_dir().
> 
> Thoughts?

Andrew: you wanted to accommodate any change on the build client, right ?

-- 
Justin




Re: GUC flags

2022-01-24 Thread Justin Pryzby
On Wed, Jan 05, 2022 at 11:36:41PM -0600, Justin Pryzby wrote:
> On Thu, Jan 06, 2022 at 02:19:08PM +0900, Michael Paquier wrote:
> > > + initStringInfo(&ret);
> > > + appendStringInfoChar(&ret, '{');
> > > +
> > > + if (flags & GUC_NO_SHOW_ALL)
> > > + appendStringInfo(&ret, "NO_SHOW_ALL,");
> > > + if (flags & GUC_NO_RESET_ALL)
> > > + appendStringInfo(&ret, "NO_RESET_ALL,");
> > > + if (flags & GUC_NOT_IN_SAMPLE)
> > > + appendStringInfo(&ret, "NOT_IN_SAMPLE,");
> > > + if (flags & GUC_EXPLAIN)
> > > + appendStringInfo(&ret, "EXPLAIN,");
> > > + if (flags & GUC_RUNTIME_COMPUTED)
> > > + appendStringInfo(&ret, "RUNTIME_COMPUTED,");
> > > +
> > > + /* Remove trailing comma, if any */
> > > + if (ret.len > 1)
> > > + ret.data[--ret.len] = '\0';
> > 
> > The way of building the text array is incorrect here.  See

I think you'll find that this is how it's done elsewhere in postgres.
In the frontend, see appendPQExpBufferChar and appendPGArray and 3e6e86abc.
On the backend, see: git grep -F "'{'" |grep -w appendStringInfoChar

I updated the patch with a regex to accommodate GUCs without '=', as needed
since f47ed79cc8.

-- 
Justin
>From eee345a1685a04eeb0fedf3ba9ad0fc6fe6e3d81 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 5 Dec 2021 23:22:04 -0600
Subject: [PATCH 1/2] check_guc: fix absurd number of false positives

Avoid false positives;
Avoid various assumptions;
Avoid list of exceptions;
Simplify shell/awk/sed/grep;

This requires GNU awk for RS as a regex.
---
 src/backend/utils/misc/check_guc | 69 +---
 1 file changed, 10 insertions(+), 59 deletions(-)

diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index b171ef0e4ff..323ca13191b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -1,78 +1,29 @@
-#!/bin/sh
+#! /bin/sh
+set -e
 
-## currently, this script makes a lot of assumptions:
+## this script makes some assumptions about the formatting of files it parses.
 ## in postgresql.conf.sample:
 ##   1) the valid config settings may be preceded by a '#', but NOT '# '
 ##  (we use this to skip comments)
-##   2) the valid config settings will be followed immediately by  ' ='
-##  (at least one space preceding the '=')
-## in guc.c:
-##   3) the options have PGC_ on the same line as the option
-##   4) the options have '{' on the same line as the option
-
-##  Problems
-## 1) Don't know what to do with TRANSACTION ISOLATION LEVEL
-
-## if an option is valid but shows up in only one file (guc.c but not
-## postgresql.conf.sample), it should be listed here so that it
-## can be ignored
-INTENTIONALLY_NOT_INCLUDED="debug_deadlocks in_hot_standby \
-is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
-pre_auth_delay role seed server_encoding server_version server_version_num \
-session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
-trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
 
 ### What options are listed in postgresql.conf.sample, but don't appear
 ### in guc.c?
 
-# grab everything that looks like a setting and convert it to lower case
-SETTINGS=`grep ' =' postgresql.conf.sample |
-grep -v '^# ' | # strip comments
-sed -e 's/^#//' |
-awk '{print $1}'`
-
-SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'`
+# grab everything that looks like a setting
+SETTINGS=`sed '/^#[[:alnum:]]/!d; s/^#//; s/ =.*//; /^include/d' postgresql.conf.sample`
 
 for i in $SETTINGS ; do
-  hidden=0
   ## it sure would be nice to replace this with an sql "not in" statement
-  ## it doesn't seem to make sense to have things in .sample and not in guc.c
-#  for hidethis in $INTENTIONALLY_NOT_INCLUDED ; do
-#if [ "$hidethis" = "$i" ] ; then
-#  hidden=1
-#fi
-#  done
-  if [ "$hidden" -eq 0 ] ; then
-grep -i '"'$i'"' guc.c > /dev/null
-if [ $? -ne 0 ] ; then
-  echo "$i seems to be missing from guc.c";
-fi;
-  fi
+  grep -i "\"$i\"" guc.c >/dev/null ||
+echo "$i seems to be missing from guc.c";
 done
 
 ### What options are listed in guc.c, but don't appear
 ### in postgresql.conf.sample?
 
 # grab everything that looks like a setting and convert it to lower case
-
-SETTINGS=`grep '{.* PGC_' guc.c | awk '{print $1}' | \
-  sed -e 's/{//g' -e 's/"//g' -e 's/,//'`
-
-SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'`
-
+SETTINGS=`gawk -F '[",]' 'BEGIN{RS="\n\t{\n"} /",[[:space:]]*PGC_.*.*gettext_noop/ && !/NOT_IN_SAMPLE/{print tolower($2)}' guc.c`
 for i in $S

explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))

2022-01-24 Thread Justin Pryzby
I'm renaming this thread for better visibility, since buffers is a small,
optional part of the patches I sent.

I made a CF entry here.
https://commitfest.postgresql.org/36/3409/
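
For quick reference, the intended usage of the new GUC looks like this (a
sketch based on 0001/0002 below):

  SET explain_regress = on;  -- implies COSTS OFF, TIMING OFF, SUMMARY OFF,
                             -- and (with 0002) BUFFERS OFF
  EXPLAIN (ANALYZE) SELECT count(*) FROM pg_class;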

On Wed, Dec 01, 2021 at 06:58:20PM -0600, Justin Pryzby wrote:
> On Mon, Nov 15, 2021 at 01:09:54PM -0600, Justin Pryzby wrote:
> > Some time ago, I had a few relevant patches:
> > 1) add explain(REGRESS) which is shorthand for (BUFFERS OFF, TIMING OFF, 
> > COSTS OFF, SUMMARY OFF)
> > 2) add explain(MACHINE) which elides machine-specific output from explain;
> >for example, Heap Fetches, sort spaceUsed, hash nbuckets, and tidbitmap 
> > stuff.
> > 
> > https://www.postgresql.org/message-id/flat/20200306213310.gm...@telsasoft.com
> 
> The attached patch series now looks like this (some minor patches are not
> included in this list):
> 
>  1. add GUC explain_regress, which disables TIMING, SUMMARY, COSTS;
>  2. enable explain(BUFFERS) by default (but disabled by explain_regress);
>  3. Add explain(MACHINE) - which is disabled by explain_regress.
> This elides various machine-specific output like Memory and Disk use.
> Maybe it should be called something else like "QUIET" or "VERBOSE_MINUS_1"
> or ??
> 
> The regression tests now automatically run with explain_regress=on, which is
> shorthand for TIMING OFF, SUMMARY OFF, COSTS OFF, and then BUFFERS OFF.
> 
> There's a further option called explain(MACHINE) which can be disabled to hide
> portions of the output that are unstable, like Memory/Disk/Batches/
> Heap Fetches/Heap Blocks.  This allows "explain analyze" to be used more 
> easily
> in regression tests, and simplifies some existing tests that currently use
> plpgsql functions to filter the output.  But it doesn't handle all the
> variations from parallel workers.
> 
> (3) is optional, but simplifies some regression tests.  The patch series could
> be rephrased with (3) first.
> 
> Unfortunately, "COSTS OFF" breaks postgres_fdw remote_estimate.  If specifying
> "COSTS ON" in postgres_fdw.c is considered to be a poor fix , then I suppose
> this patch series could do as suggested and enable buffers by default only 
> when
> ANALYZE is specified.  Then postgres_fdw is not affected, and the
> explain_regress GUC is optional: instead, we'd need to specify BUFFERS OFF in
> ~100 regression tests which use EXPLAIN ANALYZE.  (3) still seems useful on 
> its
> own.
>From 747cb7a979cf96f99403a005053e635f3bf8c3f0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 22 Feb 2020 21:17:10 -0600
Subject: [PATCH 1/7] Add GUC: explain_regress

This changes the defaults for explain to: costs off, timing off, summary off.
It'd be reasonable to use this for new regression tests which are not intended
to be backpatched.
---
 contrib/postgres_fdw/postgres_fdw.c |  2 +-
 src/backend/commands/explain.c  | 13 +++--
 src/backend/utils/misc/guc.c| 13 +
 src/include/commands/explain.h  |  2 ++
 src/test/regress/expected/explain.out   |  3 +++
 src/test/regress/expected/inherit.out   |  2 +-
 src/test/regress/expected/stats_ext.out |  2 +-
 src/test/regress/pg_regress.c   |  3 ++-
 src/test/regress/sql/explain.sql|  4 
 src/test/regress/sql/inherit.sql|  2 +-
 src/test/regress/sql/stats_ext.sql  |  2 +-
 11 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index bf3f3d9e26e..71f5b145984 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3112,7 +3112,7 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * values, so don't request params_list.
 		 */
 		initStringInfo(&sql);
-		appendStringInfoString(&sql, "EXPLAIN ");
+		appendStringInfoString(&sql, "EXPLAIN (COSTS)");
 		deparseSelectStmtForRel(&sql, root, foreignrel, fdw_scan_tlist,
 remote_conds, pathkeys,
 fpextra ? fpextra->has_final_sort : false,
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b970997c346..43e8bc78778 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -154,6 +154,7 @@ static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
 static void escape_yaml(StringInfo buf, const char *str);
 
+bool explain_regress = false; /* GUC */
 
 
 /*
@@ -172,6 +173,7 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	ListCell   *lc;
 	bool		timing_set = false;
 	bool		summary_set = false;
+	bool		costs_set = false;
 
 	/* Parse options list. */
 	foreach(lc, stmt->options)
@@ -183,7 +185,11 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 		else if (strcmp(opt->defname, "verbose") == 0)
 			es->

Re: pg_upgrade should truncate/remove its logs before running

2022-01-24 Thread Justin Pryzby
On Mon, Jan 24, 2022 at 12:39:30PM -0500, Bruce Momjian wrote:
> On Mon, Jan 24, 2022 at 10:59:40AM +0900, Michael Paquier wrote:
> > On Thu, Jan 20, 2022 at 07:51:37PM +0900, Michael Paquier wrote:
> > > Neat idea.  That would work fine for my case.  So I am fine to stick
> > > with this suggestion. 
> > 
> > I have been looking at this idea, and the result is quite nice, being
> > simpler than anything that has been proposed on this thread yet.  We
> > get a simpler removal logic, and there is no need to perform any kind
> > of sanity checks with the output path provided as long as we generate
> > the paths and the dirs after adjust_data_dir().
> ...
> >  
> >
> > pg_upgrade creates various working files, 
> > such
> > -   as schema dumps, in the current working directory.  For security, be 
> > sure
> > -   that that directory is not readable or writable by any other users.
> > +   as schema dumps, stored within pg_upgrade_output.d in
> > +   the directory of the new cluster.
> >
> 
> Uh, how are we instructing people to delete that pg_upgrade output
> directory?  If pg_upgrade completes cleanly, would it be removed
> automatically?

Clearly.

@@ -689,28 +751,5 @@ cleanup(void)

 	/* Remove dump and log files? */
 	if (!log_opts.retain)
-	{
-		int			dbnum;
-		char	  **filename;
-
-		for (filename = output_files; *filename != NULL; filename++)
-			unlink(*filename);
-
-		/* remove dump files */
-		unlink(GLOBALS_DUMP_FILE);
-
-		if (old_cluster.dbarr.dbs)
-			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-			{
-				char		sql_file_name[MAXPGPATH],
-							log_file_name[MAXPGPATH];
-				DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
-
-				snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
-				unlink(sql_file_name);
-
-				snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
-				unlink(log_file_name);

Re: typos

2022-01-23 Thread Justin Pryzby
On Mon, Jan 24, 2022 at 04:01:47PM +0900, Michael Paquier wrote:
> On Sun, Jan 23, 2022 at 09:00:01PM -0600, Justin Pryzby wrote:
> > Feel free to ignore this for now and revisit in April...
> 
> I don't mind fixing that now.  That means less to do later.

Thanks.

> > @Michael: I'm not sure what this is trying to say.
> > 1e9475694b0ae2cf1204d01d2ef6ad86f3c7cac8
> > +  First, scan the directory where the WAL segment files are written and
> > +  find the newest completed segment file, using as starting point the
> > +  beginning of the next WAL segment file. This is calculated 
> > independently
> > +  on the compression method used to compress each segment.
> > 
> > I suppose it should say independently *of* the compression method, but then 
> > I
> > still don't know what it means.  I checked FindStreamingStart().
> > It that doesn't look like it's "calculated independently" - actually, it 
> > takes
> > the compression method into account and explicitly handles each compression
> > method.
> 
> This means that we are able to calculate the starting LSN even if the
> segments stored use different compression methods or are
> uncompressed.  Would you reword that differently?  Or perhaps removing
> the last sentence of this paragraph would be simpler in the long run?

different from what?  From each other ?

Maybe I would have written:
| This is calculated separately for each segment, which may each use
| different compression methods.

But probably I would just remove it.

-- 
Justin




Re: [PATCH] Implement INSERT SET syntax

2022-01-23 Thread Justin Pryzby
Hi,

On Fri, Jan 21, 2022 at 05:24:51PM +0800, wenjing zeng wrote:
> Since this feature adds INSERT OVERRIDING SET syntax, it is recommended to 
> add some related testcases.

Thanks for proposing some more tests.

Note that your patch caused Gareth's patches to break under the cfbot.
http://cfbot.cputube.org/gareth-palmer.html

You have to either include the prerequisite patches as 0001 and your patch as
0002 (as I'm doing now), or name your patch something other than *.diff or
*.patch, so cfbot doesn't think it's a new version of the patch to be tested.

Thanks,
-- 
Justin
>From 757b3013c58cc33bffd509988833588fc0806d6b Mon Sep 17 00:00:00 2001
From: Gareth Palmer 
Date: Wed, 22 Sep 2021 05:09:28 +
Subject: [PATCH 1/2] Implement INSERT SET syntax

Allow the target column and values of an INSERT statement to be specified
using a SET clause in the same manner as that of an UPDATE statement.

The advantage of using the INSERT SET style is that the columns and values
are kept together, which can make changing or removing a column or value
from a large list easier.

A simple example that uses SET instead of a VALUES() clause:

INSERT INTO t SET c1 = 'foo', c2 = 'bar', c3 = 'baz';

Values can also be sourced from other tables similar to the INSERT INTO
SELECT FROM syntax:

INSERT INTO t SET c1 = x.c1, c2 = x.c2 FROM x WHERE x.c2 > 10 LIMIT 10;

INSERT SET is not part of any SQL standard; however, this syntax is also
implemented by MySQL. Their implementation does not support specifying a
FROM clause.
---
 doc/src/sgml/ref/insert.sgml  | 56 +-
 src/backend/parser/gram.y | 77 ++-
 src/test/regress/expected/insert.out  | 13 +++-
 src/test/regress/expected/insert_conflict.out |  2 +
 src/test/regress/expected/with.out| 20 +
 src/test/regress/sql/insert.sql   |  1 +
 src/test/regress/sql/insert_conflict.sql  |  3 +
 src/test/regress/sql/with.sql |  9 +++
 8 files changed, 173 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 2973b72b815..63c0579d4b7 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -28,6 +28,16 @@ INSERT INTO table_name [ AS conflict_target ] conflict_action ]
 [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
 
+
+[ WITH [ RECURSIVE ] with_query [, ...] ]
+INSERT INTO table_name [ AS alias ]
+[ OVERRIDING { SYSTEM | USER } VALUE ]
+SET { column_name = { expression | DEFAULT } } [, ...]
+[ FROM from_clause ]
+[ ON CONFLICT [ conflict_target ] conflict_action ]
+[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
+
+
 where conflict_target can be one of:
 
 ( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ]
@@ -263,6 +273,18 @@ INSERT INTO table_name [ AS 
  
 
+ 
+  from_clause
+  
+   
+A list of table expressions, allowing columns from other tables
+to be used as values in the expression.
+Refer to the  statement for a
+description of the syntax.
+   
+  
+ 
+
  
   DEFAULT
   
@@ -643,6 +665,15 @@ INSERT INTO films (code, title, did, date_prod, kind) VALUES
 
   
 
+  
+Insert a row using SET syntax:
+
+
+INSERT INTO films SET code = 'MH832', title = 'Blade Runner',
+did = 201, date_prod = DEFAULT, kind = 'SciFi'; 
+
+  
+
   
This example inserts some rows into table
films from a table tmp_films
@@ -689,6 +720,16 @@ WITH upd AS (
 INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
 
   
+  
+   Insert multiple rows into employees_log containing
+the hours worked by each employee from time_sheets.
+
+INSERT INTO employees_log SET id = time_sheets.employee,
+total_hours = sum(time_sheets.hours) FROM time_sheets
+WHERE time_sheets.date > '2019-11-15' GROUP BY time_sheets.employee;
+
+  
+
   
Insert or update new distributors as appropriate.  Assumes a unique
index has been defined that constrains values appearing in the
@@ -745,6 +786,18 @@ INSERT INTO distributors (did, dname) VALUES (9, 'Antwerp Design')
 INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
 ON CONFLICT (did) WHERE is_active DO NOTHING;
 
+  
+   Insert a new film into watched_films or increment the
+   number of times seen. Returns the new seen count, example assumes a
+   unique index has been defined that constrains the values appearing in
+   the title and year columns and 
+   that seen_count defaults to 1.
+
+INSERT INTO watched_films SET title = 'Akira', year = 1988
+   ON CONFLICT (title, year) DO UPDATE SET seen_count = watched_films.seen_count + 1
+   RETURNING watched_films.seen_count;
+
+  
  
 
  
@@ -755,7 +808,8 @@ INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
the RETURNING clause is a
PostgreSQL 

makefiles writing to $@ should first write to $@.new

2022-01-23 Thread Justin Pryzby
There are many Makefile rules like

foo: bar
./tool $< > $@

If the rule is interrupted (due to ^C or ENOSPC), foo can be 0 bytes or
partially written, but won't be rebuilt until someone runs distclean or debugs
it and removes the individual file, as I did for errcodes.h.

It'd be better if these did

./tool $< > $@.new
mv $@.new $@

-- 
Justin




typos

2022-01-23 Thread Justin Pryzby
ckend/replication/logical/logicalfuncs.c
@@ -9,7 +9,7 @@
  * Copyright (c) 2012-2022, PostgreSQL Global Development Group
  *
  * IDENTIFICATION
- *	  src/backend/replication/logicalfuncs.c
+ *	  src/backend/replication/logical/logicalfuncs.c
  *-
  */
 
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index d317a626e1e..19b2ba2000c 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *	  src/backend/replication/reorderbuffer.c
+ *	  src/backend/replication/logical/reorderbuffer.c
  *
  * NOTES
  *	  This module gets handed individual pieces of transactions in the order
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index ab04e946bc2..83fca8a77d9 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -110,7 +110,7 @@
  * Copyright (c) 2012-2022, PostgreSQL Global Development Group
  *
  * IDENTIFICATION
- *	  src/backend/replication/snapbuild.c
+ *	  src/backend/replication/logical/snapbuild.c
  *
  *-
  */
diff --git a/src/backend/utils/activity/backend_progress.c b/src/backend/utils/activity/backend_progress.c
index e5958fa77c5..f29199725b7 100644
--- a/src/backend/utils/activity/backend_progress.c
+++ b/src/backend/utils/activity/backend_progress.c
@@ -1,11 +1,11 @@
 /* --
- * progress.c
+ * backend_progress.c
  *
  *	Command progress reporting infrastructure.
  *
  *	Copyright (c) 2001-2022, PostgreSQL Global Development Group
  *
- *	src/backend/postmaster/progress.c
+ *	src/backend/utils/activity/backend_progress.c
  * --
  */
 #include "postgres.h"
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 2fecf26a2c3..c7ed1e6d7ac 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -6,7 +6,7 @@
  *
  *
  * IDENTIFICATION
- *	  src/backend/postmaster/backend_status.c
+ *	  src/backend/utils/activity/backend_status.c
  * --
  */
 #include "postgres.h"
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 0f5f18f02e0..36c4c0fd052 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -6,7 +6,7 @@
  *
  *
  * IDENTIFICATION
- *	  src/backend/postmaster/wait_event.c
+ *	  src/backend/utils/activity/wait_event.c
  *
  * NOTES
  *
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index fe1d0892aa6..78aa9151ef5 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -6,7 +6,7 @@
  * Copyright (c) 2015-2022, PostgreSQL Global Development Group
  *
  * IDENTIFICATION
- *		pgoutput.h
+ *		src/include/replication/pgoutput.h
  *
  *-
  */
diff --git a/src/include/utils/dynahash.h b/src/include/utils/dynahash.h
index 28fa58a6629..4564fb2467a 100644
--- a/src/include/utils/dynahash.h
+++ b/src/include/utils/dynahash.h
@@ -1,13 +1,14 @@
 /*-
  *
- * dynahash
+ * dynahash.h
  *	  POSTGRES dynahash.h file definitions
  *
  *
  * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * src/include/utils/dynahash.h
+ * IDENTIFICATION
+ *		src/include/utils/dynahash.h
  *
  *-
  */
diff --git a/src/port/pgcheckdir.c b/src/port/pgcheckdir.c
index 75dbb147d71..6c09cf29810 100644
--- a/src/port/pgcheckdir.c
+++ b/src/port/pgcheckdir.c
@@ -1,6 +1,6 @@
 /*-
  *
- * src/port/pgcheckdir.c
+ * pgcheckdir.c
  *
  * A simple subroutine to check whether a directory exists and is empty or not.
  * Useful in both initdb and the backend.
@@ -8,6 +8,8 @@
  * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
+ * IDENTIFICATION
+ *		src/port/pgcheckdir.c
  *---------
  */
 
-- 
2.17.1

>From 98a9b6dab9b9554eee90b1b53dfba155d8d8d547 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 5 Jul 2021 11:16:38 -0500
Subject: [PATCH 02/33] Add some missing newlines after function definitions

---
 src/backend/access/gin/ginlogic.c  | 1 +
 src/backend/access/rmgrdesc/heapdesc.c | 1 +
 src/backend/utils/mmgr/freepage.c  | 1 +
 3 files changed, 3 insertions(+)

diff --git a/src

pg_upgrade/test.sh and v9.5

2022-01-22 Thread Justin Pryzby
In fa66b6dee, Michael fixed test.sh to work back to v11, so I suppose nobody is
trying to run it with older versions, as I was endeavoring to do.

With the attached patch, I'm able to test upgrades back to v9.6.

In 9.5, there are regression diffs from CONTEXT lines from non-error messages,
which is a v9.5 change (0426f349e).  The first "make check" fails before even
getting to the upgrade part:

|  NOTICE:  trigger_func(before_ins_stmt) called: action = INSERT, when = 
BEFORE, level = STATEMENT
|- CONTEXT:  SQL statement "INSERT INTO main_table VALUES (NEW.a, NEW.b)"
|- PL/pgSQL function view_trigger() line 17 at SQL statement

I tried a lot of things but couldn't find one that worked.  I just tried this,
which allows the "make check" to pass, but then fails due missing symbols in
libpq during the upgrade phase.  Maybe I'm missing something - Tom must have
tested psql against old versions somehow before de-supporting old versions (but
maybe not like this).

| time make check -C src/bin/pg_upgrade oldsrc=`pwd`/new/95 
oldbindir=`pwd`/new/95/tmp_install/usr/local/pgsql/bin 
with_temp_install="LD_LIBRARY_PATH=`pwd`/new/95/tmp_install/usr/local/pgsql/lib"

I tried installcheck, but then that fails because psql doesn't accept multiple
-c options (it runs the final -c command only).

| EXTRA_REGRESS_OPTS="--bindir `pwd`/new/95/tmp_install/usr/local/pgsql/bin" 
LD_LIBRARY_PATH=`pwd`/new/95/tmp_install/usr/local/pgsql/lib PGHOST=/tmp time 
make installcheck
| ...
| == creating database "regression" ==
| ERROR:  database "regression" does not exist
| STATEMENT:  ALTER DATABASE "regression" SET lc_messages TO 'C';ALTER DATABASE 
"regression" SET lc_monetary TO 'C';ALTER DATABASE "regression" SET lc_numeric 
TO 'C';ALTER DATABASE "regression" SET lc_time TO 'C';ALTER DATABASE 
"regression" SET bytea_output TO 'hex';ALTER DATABASE "regression" SET 
timezone_abbreviations TO 'Default';
| ERROR:  database "regression" does not exist
| command failed: 
"/home/pryzbyj/src/postgres/new/95/tmp_install/usr/local/pgsql/bin/psql" -X -c 
"CREATE DATABASE \"regression\" TEMPLATE=template0" -c "ALTER DATABASE 
\"regression\" SET lc_messages TO 'C';ALTER DATABASE \"regression\" SET 
lc_monetary TO 'C';ALTER DATABASE \"regression\" SET lc_numeric TO 'C';ALTER 
DATABASE \"regression\" SET lc_time TO 'C';ALTER DATABASE \"regression\" SET 
bytea_output TO 'hex';ALTER DATABASE \"regression\" SET timezone_abbreviations 
TO 'Default';" "postgres"

pg_regress was changed to do that recently:

commit f45dc59a38cab1d2af6baaedb79559fe2e9b3781
Author: Tom Lane 
Date:   Wed Oct 20 18:44:37 2021 -0400

Improve pg_regress.c's infrastructure for issuing psql commands.

-- 
Justin
>From ccc113ccaef45b09077faa250e126e7c480324d7 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 2 Jan 2022 15:00:19 -0600
Subject: [PATCH] wip: test.sh: allow pg_upgrade check from v10 and earlier

---
 src/bin/pg_upgrade/test.sh   | 32 +---
 src/bin/pg_upgrade/upgrade_adapt.sql |  8 +++
 2 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index ef328b3062f..c0a569ed6e3 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -24,7 +24,12 @@ standard_initdb() {
 	# without increasing test runtime, run these tests with a custom setting.
 	# Also, specify "-A trust" explicitly to suppress initdb's warning.
 	# --allow-group-access and --wal-segsize have been added in v11.
-	"$1" -N --wal-segsize 1 --allow-group-access -A trust
+	if [ "$newsrc" = "$oldsrc" ]; then
+		"$@" -N -A trust --wal-segsize 1 --allow-group-access
+	else
+		"$@" -N -A trust
+	fi
+
 	if [ -n "$TEMP_CONFIG" -a -r "$TEMP_CONFIG" ]
 	then
 		cat "$TEMP_CONFIG" >> "$PGDATA/postgresql.conf"
@@ -82,12 +87,25 @@ mkdir "$temp_root"
 oldsrc=`cd "$oldsrc" && pwd`
 newsrc=`cd ../../.. && pwd`
 
+# See similar logic in pg_upgrade/exec.c
+#oldpgversion0=`"$oldbindir"/pg_ctl --version`
+#oldpgversion=`echo "$oldpgversion0" |awk 'NF>=3 && $3~/^[0-9]+(\.[0-9]+|devel$)/{split($3, a, "."); print int(a[1])}'`
+#[ -n "$oldpgversion" ] || {
+	#echo "Could not determine version of old cluster: $oldpgversion0";
+	#exit 1
+#}
+oldpgversion=`grep '#define PG_VERSION_NUM' "$oldsrc"/src/include/pg_config.h | awk '{print $3}'`
+
 # We need to make pg_regress use psql from the desired installation
 # (likely a temporary one), because otherwise the installcheck run
 # below would try to use psql from t

Re: document the need to analyze partitioned tables

2022-01-21 Thread Justin Pryzby
Thanks for looking at this

On Fri, Jan 21, 2022 at 06:21:57PM +0100, Tomas Vondra wrote:
> Hi,
> 
> On 10/8/21 14:58, Justin Pryzby wrote:
> > Cleaned up and attached as a .patch.
> > 
> > The patch implementing autoanalyze on partitioned tables should
> > revert relevant portions of this patch.
> 
> I went through this patch and I'd like to propose a couple changes, per the
> 0002 patch:
> 
> 1) I've reworded the changes in maintenance.sgml a bit. It sounded a bit
> strange before, but I'm not a native speaker so maybe it's worse ...

+ autoanalyze on the parent table.  If your queries require statistics on   

+ parent relations for proper planning, it's necessary to periodically run  


You added two references to "relations", but everything else talks about
"tables", which is all that analyze processes.

> 2) Remove unnecessary whitespace changes in perform.sgml.

Those were a note to myself and to any reviewer - should that be updated too ?

> 3) Simplify the analyze.sgml changes a bit - it was trying to cram too much
> stuff into a single paragraph, so I split that.
> 
> Does that seem OK, or did omit something important?

+If the table being analyzed has one or more children,

I think you're referring to both legacy inheritance and partitioning.  That
should be more clear.

+ANALYZE gathers two sets of statistics: once on the rows
+of the parent table only, and a second one including rows of both the 
parent
+table and all child relations.  This second set of statistics is needed 
when

I think it should say ".. and all of its children".
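
A quick way to see the two sets (sketch with a hypothetical table name) is the
"inherited" flag in pg_stats:

SELECT tablename, attname, inherited, n_distinct
FROM pg_stats
WHERE tablename = 'parent_tbl' AND attname = 'id';
-- inherited = false: statistics over the parent's own rows only
-- inherited = true:  statistics over the parent plus all of its children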

> FWIW I think it's really confusing we have inheritance and partitioning, and
> partitions and child tables. And sometimes we use partitioning in the
> generic sense (i.e. including the inheritance approach), and sometimes only
> the declarative variant. Same for partitions vs child tables. I can't even
> imagine how confusing this has to be for people just learning this stuff.
> They must be in permanent WTF?! state ...

The docs were cleaned up some in 0c06534bd.  At least the word "partitioned"
should never be used for legacy inheritance - but "partitioning" is.

-- 
Justin




Re: Poor performance PostgreSQL13/PostGIS 3.x

2022-01-20 Thread Justin Pryzby
On Thu, Jan 20, 2022 at 10:31:15PM +, Lugosi, Jim wrote:
> We are struggling to figure out what is going on.  We are migrating from 
> PostgreSQL 9.6 to PostgreSQL 13 w/ PostGIS.  Our 9.6 version was compiled 
> from source and the new version (13) was installed using Yum.  BTW, the new 
> version is on a VM that has 16GB of memory, two cores, and 500 GB of disk.  
> In addition, we are using MapServer as our mapping engine and OpenLayers as 
> the client side interface.  Once we switch over to the new version of 
> PostgreSQL, the performance takes a big nose dive.  We have being tweaking 
> and tuning the database and it appears to be happy but the response times 
> from mapfile requests are 3 -7 seconds.  Previously, the response time was 
> below a second.
> 
> Another point is that we populated the new database from the old (9.6), using 
> pg_dump.  Could this be causing issues?  Should we load the data from 
> scratch?  We use ogr2ogr (GDAL) to help assist with loading of spatial data.  
> Anyway, not really sure what the problem is.
> 
> Lastly, why am I seeing so many requests as to the PostGIS version.  It 
> appears that every map request sends the following query "SELECT 
> PostGIS_Version();", which in turn takes up a connection.

This list is for postgres development and bug reports.

I suggest to post here, instead:
https://www.postgresql.org/list/pgsql-performance/

Here's a list of needed info:
https://wiki.postgresql.org/wiki/Slow_Query_Questions

It's important to include the slow query itself.

-- 
Justin




Re: pg_upgrade should truncate/remove its logs before running

2022-01-19 Thread Justin Pryzby
On Thu, Jan 20, 2022 at 12:01:29PM +0900, Michael Paquier wrote:
> On Wed, Jan 19, 2022 at 06:05:40PM -0600, Justin Pryzby wrote:
> 
> > I'm not sure these restrictions are needed ?
> 
> This could lead to issues with rmtree() if we are not careful enough,
> no?  We'd had our deal of argument injections with pg_upgrade commands
> in the past (fcd15f1).

We require that the dir not exist, by testing if (mkdir()).
So it's okay if someone specifies ../whatever or $CWD.

-- 
Justin




Re: pg_upgrade should truncate/remove its logs before running

2022-01-19 Thread Justin Pryzby
On Wed, Jan 19, 2022 at 05:13:18PM +0900, Michael Paquier wrote:
> On Tue, Jan 11, 2022 at 10:08:13PM -0600, Justin Pryzby wrote:
> > I asked about that before.  Right now, it'll exit(1) when mkdir fails.
> > 
> > I had written a patch to allow "." by skipping mkdir (or allowing it to 
> > fail if
> > errno == EEXIST), but it seems like an awfully bad idea to try to make that
> > work with rmtree().

I still don't know if it even needs to be configurable.

> - Add some sanity check about the path used, aka no parent reference
> allowed and the output path should not be a direct parent of the
> current working directory.

I'm not sure these restrictions are needed ?

+   outputpath = make_absolute_path(log_opts.basedir);  

+   if (path_contains_parent_reference(outputpath)) 

+   pg_fatal("reference to parent directory not allowed\n");


Besides, you're passing the wrong path here.

> I have noticed a couple of incorrect things in the docs, and some
> other things.  It is a bit late here, so I may have missed a couple of
> things but I'll look at this stuff once again in a couple of days.

> +  pg_upgrade, and is be removed after a successful

remove "be"

> +   if (mkdir(log_opts.basedir, S_IRWXU | S_IRWXG | S_IRWXO))

S_IRWXG | S_IRWXO are useless due to the umask, right ?
Maybe use PG_DIR_MODE_OWNER ?

> +   if (mkdir(log_opts.basedir, S_IRWXU | S_IRWXG | S_IRWXO))
> +   pg_fatal("could not create directory \"%s\": %m\n", 
> filename_path);
> +   if (mkdir(log_opts.dumpdir, S_IRWXU | S_IRWXG | S_IRWXO))
> +   pg_fatal("could not create directory \"%s\": %m\n", 
> filename_path);
> +   if (mkdir(log_opts.logdir, S_IRWXU | S_IRWXG | S_IRWXO))
> +   pg_fatal("could not create directory \"%s\": %m\n", 
> filename_path);

You're printing the wrong var.  filename_path is not initialized.

-- 
Justin




Re: Adding CI to our tree

2022-01-18 Thread Justin Pryzby
On Mon, Jan 17, 2022 at 12:16:19PM -0800, Andres Freund wrote:
> I think it might still be worth adding stopgap way of running all tap tests on
> windows though. Having a vcregress.pl function to find all directories with t/
> and run the tests there, shouldn't be a lot of code...

I started doing that; however, it makes CI/windows even slower.  I think it'll
be necessary to run prove with all the tap tests to parallelize them, rather
than looping over directories, many of which have only a single file and are
run serially.

https://github.com/justinpryzby/postgres/commits/citest-cirrus
This has the link to a recent cirrus result if you'd want to look.
I suppose I should start a new thread.  

There's a bunch of other stuff for cirrus in there (and bits and pieces of
other branches).

 .  cirrus: spell ccache_maxsize
 .  cirrus: avoid unnecessary double star **
 .  vcregress/ci: test modules/contrib with NO_INSTALLCHECK=1
 .  vcregress: style
 .  wip: vcsregress: add alltaptests
 .  wip: run upgrade tests with data-checksums
 .  pg_regress --initdb-opts
 .  wip: pg_upgrade: show list of files copied only in vebose mode
 .  wip: cirrus: include hints how to install OS packages..
 .  wip: cirrus: code coverage
 .  cirrus: upload html docs as artifacts
 .  wip: cirrus/windows: save compiled build as an artifact




Re: libpq compression (part 2)

2022-01-17 Thread Justin Pryzby
On Tue, Jan 18, 2022 at 02:06:32AM +0500, Daniil Zakhlystov wrote:
> > => Since March, errmsg doesn't need extra parenthesis around it (e3a87b4).

> I’ve resolved the stuck tests and added zlib support for CI Windows builds to 
> patch 0003-*.  Thanks
> for the suggestion, all tests seem to be OK now, except the macOS one.  It 
> won't schedule in Cirrus
> CI for some reason, but I guess it happens because of my GitHub account 
> limitation.

I don't know about your github account, but it works for cfbot, which is now
green.

Thanks for implementing zlib for windows.  Did you try this with default
compressions set to lz4 and zstd ?

The thread from 2019 is very long, and starts off with the guidance that
compression had been implemented at the wrong layer.  It looks like this hasn't
changed since then.  secure_read/write are passed as function pointers to the
ZPQ interface, which then calls back to them to read and flush its compression
buffers.  As I understand, the suggestion was to leave the socket reads and
writes alone.  And then conditionally de/compress buffers after reading /
before writing from the socket if compression was negotiated.

It's currently done like this
pq_recvbuf() => secure_read() - when compression is disabled 
pq_recvbuf() => ZPQ => secure_read() - when compression is enabled 

Dmitri sent a partial, POC patch which changes the de/compression to happen in
secure_read/write, which is changed to call ZPQ:  
https://www.postgresql.org/message-id/ca+q6zcuprssnars+fyobsd-f2stk1roa-4sahfofajowlzi...@mail.gmail.com
pq_recvbuf() => secure_read() => ZPQ

The same thing is true of the frontend: function pointers to
pqsecure_read/write are being passed to zpq_create, and then the ZPQ interface
called instead of the original functions.  Those are the functions which read
using SSL, so they should also handle compression.

That's where SSL is handled, and it seems like the right place to handle
compression.  Have you evaluated that way to do things ?

Konstantin said he put ZPQ at that layer to 1) avoid code duplication
between client/server; and 2) allow compression to happen before SSL, to
allow both (if the admin decides it's okay).  But I don't see why compression
can't happen before sending to SSL, or after reading from it?

-- 
Justin




Re: \d with triggers: more than one row returned by a subquery used as an expression

2022-01-17 Thread Justin Pryzby
On Mon, Jan 17, 2022 at 05:02:00PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> > On Fri, Dec 17, 2021 at 09:43:56AM -0600, Justin Pryzby wrote:
> >> I want to mention that the 2nd problem I mentioned here is still broken.
> >> https://www.postgresql.org/message-id/20210717010259.gu20...@telsasoft.com
> >> It happens if non-inherited triggers on child and parent have the same 
> >> name.
> 
> > This is the fix I was proposing
> > It depends on pg_partition_ancestors() to return its partitions in order:
> > this partition => parent => ... => root.
> 
> I don't think that works at all.  I might be willing to accept the
> assumption about pg_partition_ancestors()'s behavior, but you're also
> making an assumption about how the output of pg_partition_ancestors()
> is joined to "pg_trigger AS u", and I really don't think that's safe.

> ISTM the real problem is the assumption that only related triggers could
> share a tgname, which evidently isn't true.  I think this query needs to
> actually match on tgparentid, rather than taking shortcuts.

I don't think that should be needed - tgparentid should match
pg_partition_ancestors().

> If we don't
> want to use a recursive CTE, maybe we could define it as only looking up to
> the immediate parent, rather than necessarily finding the root?

I think that defeats the intent of c33869cc3.

Is there any reason why WITH ORDINALITY can't work ?
This is passing the smoke test.
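
To spell out the assumption as a standalone query (hypothetical partition
name): WITH ORDINALITY numbers the rows in the order the function returns
them, so the nearest ancestor can be chosen with ORDER BY depth LIMIT 1.

SELECT a.relid::pg_catalog.regclass, a.depth
FROM pg_catalog.pg_partition_ancestors('sales_2021'::regclass)
     WITH ORDINALITY AS a(relid, depth)
ORDER BY a.depth;
-- expected: sales_2021 at depth 1, then its parents in order up to the root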

-- 
Justin
>From 99f8ee921258d9b2e75299e69826004fda3b8919 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 3 Apr 2020 22:43:26 -0500
Subject: [PATCH 1/2] psql: Add newlines and spacing in trigger query for \d

cosmetic change to c33869cc3 to make the output of psql -E look pretty.
---
 src/bin/psql/describe.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 61cec118aca..ed9f320b015 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2993,12 +2993,12 @@ describeOneTableDetails(const char *schemaname,
 		  "t.tgenabled, t.tgisinternal, %s\n"
 		  "FROM pg_catalog.pg_trigger t\n"
 		  "WHERE t.tgrelid = '%s' AND ",
-		  (pset.sversion >= 13 ?
-		   "(SELECT (NULLIF(a.relid, t.tgrelid))::pg_catalog.regclass"
-		   " FROM pg_catalog.pg_trigger AS u, "
-		   "  pg_catalog.pg_partition_ancestors(t.tgrelid) AS a"
-		   " WHERE u.tgname = t.tgname AND u.tgrelid = a.relid"
-		   "   AND u.tgparentid = 0) AS parent" :
+		  (pset.sversion >= 13 ? "\n"
+		   "  (SELECT (NULLIF(a.relid, t.tgrelid))::pg_catalog.regclass\n"
+		   "   FROM pg_catalog.pg_trigger AS u,\n"
+		   "  pg_catalog.pg_partition_ancestors(t.tgrelid) AS a\n"
+		   "   WHERE u.tgname = t.tgname AND u.tgrelid = a.relid\n"
+		   "AND u.tgparentid = 0) AS parent" :
 		   "NULL AS parent"),
 		  oid);
 
-- 
2.17.1

>From d866624abbd523afdbc3a539f14c935c42fc345f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 16 Jul 2021 19:57:00 -0500
Subject: [PATCH 2/2] psql: fix \d for statement triggers with same name on
 partition and its parent

This depends on pg_partition_ancestors() to return its partitions in order:
this partition => parent => ... => root.

20210716193319.gs20...@telsasoft.com
---
 src/bin/psql/describe.c|  4 ++--
 src/test/regress/expected/triggers.out | 13 +
 src/test/regress/sql/triggers.sql  |  4 
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index ed9f320b015..8787849e8be 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2996,9 +2996,9 @@ describeOneTableDetails(const char *schemaname,
 		  (pset.sversion >= 13 ? "\n"
 		   "  (SELECT (NULLIF(a.relid, t.tgrelid))::pg_catalog.regclass\n"
 		   "   FROM pg_catalog.pg_trigger AS u,\n"
-		   "  pg_catalog.pg_partition_ancestors(t.tgrelid) AS a\n"
+		   "  pg_catalog.pg_partition_ancestors(t.tgrelid) WITH ORDINALITY AS a(relid,depth)\n"
 		   "   WHERE u.tgname = t.tgname AND u.tgrelid = a.relid\n"
-		   "AND u.tgparentid = 0) AS parent" :
+		   "AND u.tgparentid = 0 ORDER BY depth LIMIT 1) AS parent" :
 		   "NULL AS parent"),
 		  oid);
 
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index 5c0e7c2b79e..9278cc07237 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -2122,6 +2122,19 @@ Triggers:
 a

Re: Avoiding smgrimmedsync() during nbtree index builds

2022-01-17 Thread Justin Pryzby
On Sun, Jan 16, 2022 at 02:25:59PM -0600, Justin Pryzby wrote:
> On Thu, Jan 13, 2022 at 09:52:55AM -0600, Justin Pryzby wrote:
> > This is failing on windows CI when I use initdb --data-checksums, as 
> > attached.
> > 
> > https://cirrus-ci.com/task/5612464120266752
> > https://api.cirrus-ci.com/v1/artifact/task/5612464120266752/regress_diffs/src/test/regress/regression.diffs
> > 
> > +++ c:/cirrus/src/test/regress/results/bitmapops.out2022-01-13 
> > 00:47:46.704621200 +
> > ..
> > +ERROR:  could not read block 0 in file "base/16384/30310": read only 0 of 
> > 8192 bytes
> 
> The failure isn't consistent, so I double checked my report.  I have some more
> details:
> 
> The problem occurs maybe only ~25% of the time.
> 
> The issue is in the 0001 patch.
> 
> data-checksums isn't necessary to hit the issue.
> 
> errlocation says: LOCATION:  mdread, md.c:686 (the only place the error
> exists)
> 
> With Andres' windows crash patch, I obtained a backtrace - attached.
> https://cirrus-ci.com/task/5978171861368832
> https://api.cirrus-ci.com/v1/artifact/task/5978171861368832/crashlog/crashlog-postgres.exe_0fa8_2022-01-16_02-54-35-291.txt
> 
> Maybe its a race condition or synchronization problem that nowhere else tends
> to hit.

I meant to say that I had not seen this issue anywhere but windows.

But now, by chance, I still had the 0001 patch in my tree, and hit the same
issue on linux:

https://cirrus-ci.com/task/4550618281934848
+++ 
/tmp/cirrus-ci-build/src/bin/pg_upgrade/tmp_check/regress/results/tuplesort.out 
2022-01-17 16:06:35.759108172 +
 EXPLAIN (COSTS OFF)
 SELECT id, noabort_increasing, noabort_decreasing FROM abbrev_abort_uuids 
ORDER BY noabort_increasing LIMIT 5;
+ERROR:  could not read block 0 in file "base/16387/t3_36794": read only 0 of 
8192 bytes




Re: Map WAL segment files on PMEM as WAL buffers

2022-01-16 Thread Justin Pryzby
On Thu, Jan 06, 2022 at 10:43:37PM -0600, Justin Pryzby wrote:
> On Fri, Jan 07, 2022 at 12:50:01PM +0900, Takashi Menjo wrote:
> > > But in this case it really doesn't work :(
> > >
> > > running bootstrap script ... 2022-01-05 23:17:30.244 CST [12088] FATAL:  
> > > file not on PMEM: path "pg_wal/00010001"
> > 
> > Do you have a real PMEM device such as NVDIMM-N or Intel Optane PMem?
> 
> No - the point is that we'd like to have a way to exercise this patch on the
> cfbot.  Particularly the new code introduced by this patch, not just the
> --without-pmem case...
..
> I think you should add a patch which does what Thomas suggested: 1) add to
> ./.cirrus.yaml installation of the libpmem package for debian/bsd/mac/windows;
> 2) add setenv to main(), as above; 3) change configure.ac and guc.c to default
> to --with-libpmem and wal_pmem_map=on.  This should be the last patch, for
> cfbot only, not meant to be merged.

I was able to get the cirrus CI to compile on linux and bsd with the below
changes.  I don't know if there's an easy package installation for mac OSX.  I
think it's okay if mac CI doesn't use --enable-pmem for now.

> You can test that the package installation part works before mailing patches 
> to
> the list with the instructions here:
> 
> src/tools/ci/README:
> Enabling cirrus-ci in a github repository..

I ran the CI under my own github account.
Linux crashes in the recovery check.
And freebsd has been stuck for 45min.

I'm not sure, but maybe those are a legitimate consequence of using
PMEM_IS_PMEM_FORCE (?)  If so, maybe the recovery check would need to be
disabled for this patch to run on CI...  Or maybe my suggestion to enable it by
default for CI doesn't work for this patch.  It would need to be specially
tested with real hardware.

https://cirrus-ci.com/task/6245151591890944

https://cirrus-ci.com/task/6162551485497344?logs=test_world#L3941
#2  0x55ff43c6edad in ExceptionalCondition (conditionName=0x55ff43d18108 
"!XLogRecPtrIsInvalid(missingContrecPtr)", errorType=0x55ff43d151c4 
"FailedAssertion", fileName=0x55ff43d151bd "xlog.c", lineNumber=8297) at 
assert.c:69

commit 15533794e465a381eb23634d67700afa809a0210
Author: Justin Pryzby 
Date:   Thu Jan 6 22:53:28 2022 -0600

tmp: enable pmem by default, for CI

diff --git a/.cirrus.yml b/.cirrus.yml
index 677bdf0e65e..0cb961c8103 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -81,6 +81,7 @@ task:
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+pkg install -y devel/pmdk
 
   # NB: Intentionally build without --with-llvm. The freebsd image size is
   # already large enough to make VM startup slow, and even without llvm
@@ -99,6 +100,7 @@ task:
 --with-lz4 \
 --with-pam \
 --with-perl \
+--with-libpmem \
 --with-python \
 --with-ssl=openssl \
 --with-tcl --with-tclconfig=/usr/local/lib/tcl8.6/ \
@@ -138,6 +140,7 @@ LINUX_CONFIGURE_FEATURES: _CONFIGURE_FEATURES >-
   --with-lz4
   --with-pam
   --with-perl
+  --with-libpmem
   --with-python
   --with-selinux
   --with-ssl=openssl
@@ -188,6 +191,9 @@ task:
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kernel.core_pattern='/tmp/cores/%e-%s-%p.core'
+echo 'deb http://deb.debian.org/debian bullseye universe' 
>>/etc/apt/sources.list
+apt-get update
+apt-get -y install libpmem-dev
 
   configure_script: |
 su postgres <<-EOF
@@ -267,6 +273,7 @@ task:
   make \
   openldap \
   openssl \
+  pmem \
   python \
   tcl-tk
 
@@ -301,6 +308,7 @@ task:
   --with-libxslt \
   --with-lz4 \
   --with-perl \
+  --with-libpmem \
   --with-python \
   --with-ssl=openssl \
   --with-tcl --with-tclconfig=${brewpath}/opt/tcl-tk/lib/ \
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 9124060bde7..b814269675d 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -69,6 +69,7 @@ main(int argc, char *argv[])
 #endif
 
progname = get_progname(argv[0]);
+   setenv("PMEM_IS_PMEM_FORCE", "1", 0);
 
/*
 * Platform-specific startup hacks
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ffc55f33e86..32d650cb9b2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1354,7 +1354,7 @@ static struct config_bool ConfigureNamesBool[] =
 "traditional volatile ones."),
},
&wal_pmem_map,
-   false,
+   true,
NULL, NULL, NULL
},
 #endif




Re: pg_dump/restore --no-tableam

2022-01-16 Thread Justin Pryzby
On Mon, Jan 17, 2022 at 02:55:58PM +0900, Michael Paquier wrote:
> On Tue, Jan 11, 2022 at 10:09:07PM -0600, Justin Pryzby wrote:
> > I suppose you're right - I had previously renamed it from no-tableam.
> 
> Thanks for the new version.  I have noticed that support for the
> option with pg_dumpall was missing, but that looks useful to me like
> the other switches.

I saw that you added it to pg_dumpall.  But there's a typo in --help:

diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 1cab0dfdc75..94852e7cdbb 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -655,3 +655,3 @@ help(void)
printf(_("  --no-syncdo not wait for changes to be 
written safely to disk\n"));
-   printf(_("  --no-tables-access-methoddo not dump table access 
methods\n"));
+   printf(_("  --no-table-access-method do not dump table access 
methods\n"));
printf(_("  --no-tablespaces do not dump tablespace 
assignments\n"));

Feel free to leave it for now, and I'll add it to my typos branch.

> And, done.

Thanks!

-- 
Justin




Re: Avoiding smgrimmedsync() during nbtree index builds

2022-01-16 Thread Justin Pryzby
On Thu, Jan 13, 2022 at 09:52:55AM -0600, Justin Pryzby wrote:
> This is failing on windows CI when I use initdb --data-checksums, as attached.
> 
> https://cirrus-ci.com/task/5612464120266752
> https://api.cirrus-ci.com/v1/artifact/task/5612464120266752/regress_diffs/src/test/regress/regression.diffs
> 
> +++ c:/cirrus/src/test/regress/results/bitmapops.out  2022-01-13 
> 00:47:46.704621200 +
> ..
> +ERROR:  could not read block 0 in file "base/16384/30310": read only 0 of 
> 8192 bytes

The failure isn't consistent, so I double checked my report.  I have some more
details:

The problem occurs maybe only ~25% of the time.

The issue is in the 0001 patch.

data-checksums isn't necessary to hit the issue.

errlocation says: LOCATION:  mdread, md.c:686 (the only place the error
exists)

With Andres' windows crash patch, I obtained a backtrace - attached.
https://cirrus-ci.com/task/5978171861368832
https://api.cirrus-ci.com/v1/artifact/task/5978171861368832/crashlog/crashlog-postgres.exe_0fa8_2022-01-16_02-54-35-291.txt

Maybe it's a race condition or synchronization problem that nothing else tends
to hit.

Separate from this issue, I wonder if it'd be useful to write a DEBUG log
showing when btree uses shared_buffers vs fsync.  And a regression test which
first SETs client_min_messages=debug to capture the debug log to demonstrate
when/that the new code path is being hit.  I'm not sure if that would be good to
merge, but it may be useful for now.
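
A rough sketch of that test shape (the DEBUG message itself is hypothetical,
since it would only exist with the suggested logging added):

SET client_min_messages = debug1;
CREATE TABLE small_tbl (i int);
INSERT INTO small_tbl SELECT generate_series(1, 10);
CREATE INDEX ON small_tbl (i);
-- expect a DEBUG line saying whether the build went through shared_buffers or fsync
RESET client_min_messages;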

-- 
Justin
Opened log file 
'c:/cirrus/crashlog-postgres.exe_0fa8_2022-01-16_02-54-35-291.txt'

.  0  Id: ed4.728 Suspend: 1 Teb: `002fe000 Unfrozen
*** WARNING: Unable to verify checksum for 
c:\cirrus\tmp_install\bin\postgres.exe
Child-SP  RetAddr   Call Site
`007feb40 0001`408106bb ucrtbased!abort(void)+0x5a 
[minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 77]
`007feb80 0001`40603a56 postgres!errfinish(
char * filename = 0x0001`40c27a5b "md.c", 
int lineno = 0n686, 
char * funcname = 0x0001`40c275ac "mdread")+0x41b 
[c:\cirrus\src\backend\utils\error\elog.c @ 683]
`007febe0 0001`4060632d postgres!mdread(
struct SMgrRelationData * reln = 0x`0155f2c0, 
ForkNumber forknum = MAIN_FORKNUM (0n0), 
unsigned int blocknum = 0, 
char * buffer = 0x`0ab60b00 "")+0x286 
[c:\cirrus\src\backend\storage\smgr\md.c @ 682]
`007fec60 0001`405b67ec postgres!smgrread(
struct SMgrRelationData * reln = 0x`0155f2c0, 
ForkNumber forknum = MAIN_FORKNUM (0n0), 
unsigned int blocknum = 0, 
char * buffer = 0x`0ab60b00 "")+0x4d 
[c:\cirrus\src\backend\storage\smgr\smgr.c @ 505]
`007feca0 0001`405b1051 postgres!ReadBuffer_common(
struct SMgrRelationData * smgr = 0x`0155f2c0, 
char relpersistence = 0n112 'p', 
ForkNumber forkNum = MAIN_FORKNUM (0n0), 
unsigned int blockNum = 0, 
ReadBufferMode mode = RBM_NORMAL (0n0), 
struct BufferAccessStrategyData * strategy = 
0x`, 
bool * hit = 0x`007fee00)+0x8ac 
[c:\cirrus\src\backend\storage\buffer\bufmgr.c @ 1005]
`007fedc0 0001`405b0f1a postgres!ReadBufferExtended(
struct RelationData * reln = 0x`01592bc0, 
ForkNumber forkNum = MAIN_FORKNUM (0n0), 
unsigned int blockNum = 0, 
ReadBufferMode mode = RBM_NORMAL (0n0), 
struct BufferAccessStrategyData * strategy = 
0x`)+0x121 [c:\cirrus\src\backend\storage\buffer\bufmgr.c @ 762]
`007fee30 0001`400d51c7 postgres!ReadBuffer(
struct RelationData * reln = 0x`01592bc0, 
unsigned int blockNum = 0)+0x2a 
[c:\cirrus\src\backend\storage\buffer\bufmgr.c @ 697]
`007fee70 0001`400d4b4e postgres!_bt_getbuf(
struct RelationData * rel = 0x`01592bc0, 
unsigned int blkno = 0, 
int access = 0n1)+0x27 
[c:\cirrus\src\backend\access\nbtree\nbtpage.c @ 878]
`007feed0 0001`40452532 postgres!_bt_getrootheight(
struct RelationData * rel = 0x`01592bc0)+0x2e 
[c:\cirrus\src\backend\access\nbtree\nbtpage.c @ 680]
`007fef10 0001`4045a21a postgres!get_relation_info(
struct PlannerInfo * root = 0x`015ecd18, 
   

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-01-15 Thread Justin Pryzby
On Sat, Jan 15, 2022 at 08:39:14PM +0300, Michail Nikolaev wrote:
> Hello, Junien.
> 
> Thanks for your attention.
> 
> > The cfbot reports that this patch is currently failing at least on
> > Linux and Windows, e.g. https://cirrus-ci.com/task/6532060239101952.
> 
> Fixed. It was the issue with the test - hangs on Windows because of
> psql + spurious vacuum sometimes.

It looks like there's still a server crash that caused the CI or client to hang.

https://cirrus-ci.com/task/6350310141591552
2022-01-13 06:31:04.182 GMT [8636][walreceiver] FATAL:  could not receive data 
from WAL stream: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2022-01-13 06:31:04.182 GMT [6848][startup] LOG:  invalid record length at 
0/3014B58: wanted 24, got 0
2022-01-13 06:31:04.228 GMT [8304][walreceiver] FATAL:  could not connect to 
the primary server: connection to server on socket 
"C:/Users/ContainerAdministrator/AppData/Local/Temp/_7R9Pa5CwW/.s.PGSQL.58307" 
failed: Connection refused (0x274D/10061)




Re: MultiXact/SLRU buffers configuration

2022-01-15 Thread Justin Pryzby
On Sat, Jan 15, 2022 at 12:16:59PM +0500, Andrey Borodin wrote:
> > On 15 Jan 2022, at 03:20, Shawn Debnath  wrote:
> > On Fri, Jan 14, 2022 at 05:28:38PM +0800, Julien Rouhaud wrote:
> >>> PFA rebase of the patchset. Also I've added a patch to combine 
> >>> page_number, page_status, and page_dirty together to touch less 
> >>> cachelines.
> >> 
> >> The cfbot reports some errors on the latest version of the patch:
> >> 
> >> https://cirrus-ci.com/task/6121317215764480
> >> [...]
> >> Could you send a new version?  In the meantime I will switch the patch 
> >> status to Waiting on Author.
> > 
> > I was planning on running a set of stress tests on these patches. Could 
> > we confirm which ones we plan to include in the commitfest?
> 
> Many thanks for your interest. Here's the  latest version.

This is failing to compile under linux and windows due to bitfield syntax.
http://cfbot.cputube.org/andrey-borodin.html

And compile warnings:

slru.c: In function ‘SlruAdjustNSlots’:
slru.c:161:2: warning: ISO C90 forbids mixed declarations and code 
[-Wdeclaration-after-statement]
  161 |  int nbanks = 1;
  |  ^~~
slru.c: In function ‘SimpleLruReadPage_ReadOnly’:
slru.c:530:2: warning: ISO C90 forbids mixed declarations and code 
[-Wdeclaration-after-statement]
  530 |  int bankstart = (pageno & shared->bank_mask) * shared->bank_size;
  |  ^~~

Note that you can test the CI result using any github account without waiting
for the cfbot.  See ./src/tools/ci/README.

-- 
Justin




Re: Add 64-bit XIDs into PostgreSQL 15

2022-01-14 Thread Justin Pryzby
I tried to pg_upgrade from a v13 instance like:
time make check -C src/bin/pg_upgrade oldsrc=`pwd`/13 
oldbindir=`pwd`/13/tmp_install/usr/local/pgsql/bin

I had compiled and installed v13 into `pwd`/13.

First, test.sh failed, because of an option in initdb which doesn't exist in
the old version: -x 210

I patched test.sh so the option is used only with the "new" version.

The tab_core_types table has an XID column, so pg_upgrade --check complains and
refuses to run.  If I drop it, then pg_upgrade runs, but then fails like this:

|Files /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/new_xids.txt and 
/home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/old_xids.txt differ
|See /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/xids.diff
|
|--- /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/new_xids.txt   
 2022-01-15 00:14:23.035294414 -0600
|+++ /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/old_xids.txt   
 2022-01-15 00:13:59.634945012 -0600
|@@ -1,5 +1,5 @@
|  relfrozenxid | relminmxid 
| --+
|-3 |  3
|+15594 |  3
| (1 row)
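
(The exact query behind {old,new}_xids.txt is an assumption on my part, but it
is presumably something along these lines:)

SELECT relfrozenxid, relminmxid
FROM pg_catalog.pg_class
WHERE relname = 'pg_class';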

Also, the patch needs to be rebased over Peter's vacuum changes.

Here's the changes I used for my test:

diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index c6361e3c085..5eae42192b6 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -24,7 +24,8 @@ standard_initdb() {
# without increasing test runtime, run these tests with a custom 
setting.
# Also, specify "-A trust" explicitly to suppress initdb's warning.
# --allow-group-access and --wal-segsize have been added in v11.
-   "$1" -N --wal-segsize 1 --allow-group-access -A trust -x 210
+   "$@" -N --wal-segsize 1 --allow-group-access -A trust
+
if [ -n "$TEMP_CONFIG" -a -r "$TEMP_CONFIG" ]
then
cat "$TEMP_CONFIG" >> "$PGDATA/postgresql.conf"
@@ -237,7 +238,7 @@ fi
 
 PGDATA="$BASE_PGDATA"
 
-standard_initdb 'initdb'
+standard_initdb 'initdb' -x 210
 
 pg_upgrade $PG_UPGRADE_OPTS --no-sync -d "${PGDATA}.old" -D "$PGDATA" -b 
"$oldbindir" -p "$PGPORT" -P "$PGPORT"
 
diff --git a/src/bin/pg_upgrade/upgrade_adapt.sql 
b/src/bin/pg_upgrade/upgrade_adapt.sql
index 27c4c7fd011..c5ce8bc95b2 100644
--- a/src/bin/pg_upgrade/upgrade_adapt.sql
+++ b/src/bin/pg_upgrade/upgrade_adapt.sql
@@ -89,3 +89,5 @@ DROP OPERATOR public.#%# (pg_catalog.int8, NONE);
 DROP OPERATOR public.!=- (pg_catalog.int8, NONE);
 DROP OPERATOR public.#@%# (pg_catalog.int8, NONE);
 \endif
+
+DROP TABLE IF EXISTS tab_core_types;




Re: extended stats on partitioned tables

2022-01-14 Thread Justin Pryzby
On Mon, Dec 13, 2021 at 09:40:09PM +0100, Tomas Vondra wrote:
> 1) If the table is a separate relation (not part of an inheritance
> tree), this should make no difference. -> OK
> 
> 2) If the table is using "old" inheritance, this reverts back to
> pre-regression behavior. So people will keep using the old statistics
> until the ANALYZE, and we need to tell them to ANALYZE or something.
> 
> 3) If the table is using partitioning, it's guaranteed to be empty and
> there are no stats at all. Again, we should tell people to run ANALYZE.

I think these can be mentioned in the commit message, which can end up in the
minor release notes as a recommendation to rerun ANALYZE.

Thanks for pushing 0001.

-- 
Justin




Re: Adding CI to our tree

2022-01-14 Thread Justin Pryzby
On Fri, Jan 14, 2022 at 03:34:11PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2022-01-13 15:27:40 -0500, Andrew Dunstan wrote:
> > I can probably adjust to whatever we decide to do. But I think we're
> > really just tinkering at the edges here. What I think we really need is
> > the moral equivalent of `make check-world` in one invocation of
> > vcregress.pl.
> 
> I agree strongly that we need that. But I think a good chunk of Justin's
> changes are actually required to get there?
> 
> Specifically, unless we want lots of duplicated logic in vcregress.pl, we
> need to make vcregress know how to run NO_INSTALLCHECK test. The option added
> was just so the buildfarm doesn't start to run tests multiple times...

The main reason I made the INSTALLCHECK runs conditional (they only run if a
new option is specified) is because of these comments:

| # Disabled because these tests require 
"shared_preload_libraries=pg_stat_statements",
| # which typical installcheck users do not have (e.g. buildfarm clients).
| NO_INSTALLCHECK = 1

Also, I saw your note that Thomas pointed out that a bunch of TAP
tests aren't being run by CI.  I think vcregress should have an "alltap"
target that runs everything like glob("**/t/").  CI would use that instead of
the existing ssl, auth, subscription, recovery, and bin targets.  The buildfarm
could switch to that after it's been published.

https://www.postgresql.org/message-id/20220114234947.av4kkhuj7netsy5r%40alap3.anarazel.de

-- 
Justin




Re: [PATCH] psql: \dn+ to show size of each schema (and \dA+ for AMs)

2022-01-14 Thread Justin Pryzby
Rebased before Julien asks.
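
Usage sketch once the patch below is applied (pg_size_pretty() is existing
core; the new functions are what the patch adds):

SELECT pg_size_pretty(pg_namespace_size('public'));
SELECT pg_size_pretty(pg_am_size('heap'));
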
>From cf57143c85a2a6088fe6038e6d41771fd60aae34 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 13 Jul 2021 21:25:48 -0500
Subject: [PATCH 1/4] Add pg_am_size(), pg_namespace_size() ..

See also: 358a897fa, 528ac10c7
---
 src/backend/utils/adt/dbsize.c  | 130 
 src/include/catalog/pg_proc.dat |  19 +
 2 files changed, 149 insertions(+)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 3a2f2e1f99d..6cde01f0587 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -13,18 +13,23 @@
 
 #include 
 
+#include "access/genam.h"
 #include "access/htup_details.h"
 #include "access/relation.h"
+#include "access/table.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_tablespace.h"
 #include "commands/dbcommands.h"
+#include "commands/defrem.h"
 #include "commands/tablespace.h"
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/numeric.h"
 #include "utils/rel.h"
 #include "utils/relfilenodemap.h"
@@ -832,6 +837,131 @@ pg_size_bytes(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(result);
 }
 
+/*
+ * Return the sum of size of relations for which the given attribute of
+ * pg_class matches the specified OID value.
+ */
+static int64
+calculate_size_attvalue(int attnum, Oid attval)
+{
+	int64		totalsize = 0;
+	ScanKeyData 	skey;
+	Relation	pg_class;
+	SysScanDesc	scan;
+	HeapTuple	tuple;
+
+	ScanKeyInit(&skey, attnum,
+			BTEqualStrategyNumber, F_OIDEQ, attval);
+
+	pg_class = table_open(RelationRelationId, AccessShareLock);
+	scan = systable_beginscan(pg_class, InvalidOid, false, NULL, 1, &skey);
+	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+	{
+		Relation rel;
+		Form_pg_class classtuple = (Form_pg_class) GETSTRUCT(tuple);
+
+		rel = try_relation_open(classtuple->oid, AccessShareLock);
+		if (!rel)
+			continue;
+
+		for (ForkNumber forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
+			totalsize += calculate_relation_size(&rel->rd_node, rel->rd_backend, forkNum);
+
+		relation_close(rel, AccessShareLock);
+	}
+
+	systable_endscan(scan);
+	table_close(pg_class, AccessShareLock);
+	return totalsize;
+}
+
+/* Compute the size of relations in a schema (namespace) */
+static int64
+calculate_namespace_size(Oid nspOid)
+{
+	/*
+	 * User must be a member of pg_read_all_stats or have CREATE privilege for
+	 * target namespace.
+	 */
+	if (!is_member_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS))
+	{
+		AclResult aclresult;
+		aclresult = pg_namespace_aclcheck(nspOid, GetUserId(), ACL_CREATE);
+		if (aclresult != ACLCHECK_OK)
+			aclcheck_error(aclresult, OBJECT_SCHEMA,
+		   get_namespace_name(nspOid));
+	}
+
+	return calculate_size_attvalue(Anum_pg_class_relnamespace, nspOid);
+}
+
+Datum
+pg_namespace_size_oid(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Oid			nspOid = PG_GETARG_OID(0);
+
+	size = calculate_namespace_size(nspOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
+Datum
+pg_namespace_size_name(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Name		nspName = PG_GETARG_NAME(0);
+	Oid			nspOid = get_namespace_oid(NameStr(*nspName), false);
+
+	size = calculate_namespace_size(nspOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
+/* Compute the size of relations using the given access method */
+static int64
+calculate_am_size(Oid amOid)
+{
+	/* XXX acl_check? */
+
+	return calculate_size_attvalue(Anum_pg_class_relam, amOid);
+}
+
+Datum
+pg_am_size_oid(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Oid			amOid = PG_GETARG_OID(0);
+
+	size = calculate_am_size(amOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
+Datum
+pg_am_size_name(PG_FUNCTION_ARGS)
+{
+	int64		size;
+	Name		amName = PG_GETARG_NAME(0);
+	Oid			amOid = get_am_oid(NameStr(*amName), false);
+
+	size = calculate_am_size(amOid);
+
+	if (size < 0)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(size);
+}
+
 /*
  * Get the filenode of a relation
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d6bf1f3274b..18248799e73 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7230,6 +7230,25 @@
   descr => 'total disk space usage for the specified tablespace',
   proname => 'pg_tablespace_size', provolatile => 'v', prorettype => 'int8',
   proargtypes => 'name', prosrc => 'pg_tablespace_size_name' },
+
+{ oid => '9410',
+  descr => 'total disk space usage for the specified namespace',
+  proname => 'pg_namespace_size', provolatile => 'v', prorettype => 'in

Re: support for MERGE

2022-01-14 Thread Justin Pryzby
Resending some language fixes to the public documentation.
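
For context, a minimal statement of the shape this documentation describes
(hypothetical table names); for each candidate row, only the first WHEN clause
whose condition evaluates to true is executed:

MERGE INTO inventory AS t
USING stock_updates AS s ON t.item_id = s.item_id
WHEN MATCHED AND s.qty = 0 THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET qty = s.qty
WHEN NOT MATCHED THEN
    INSERT (item_id, qty) VALUES (s.item_id, s.qty);
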
>From 28b8532976ddb3e8b617ca007fae6b4822b36527 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 13 Nov 2021 12:11:46 -0600
Subject: [PATCH] f!typos

---
 doc/src/sgml/mvcc.sgml  |  8 ++---
 doc/src/sgml/ref/create_policy.sgml |  5 +--
 doc/src/sgml/ref/insert.sgml|  2 +-
 doc/src/sgml/ref/merge.sgml | 48 ++---
 doc/src/sgml/trigger.sgml   |  7 +++--
 src/test/regress/expected/merge.out |  4 +--
 src/test/regress/sql/merge.sql  |  4 +--
 7 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a1ae8423414..61a10c28120 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -441,16 +441,16 @@ COMMIT;
 can specify several actions and they can be conditional, the
 conditions for each action are re-evaluated on the updated version of
 the row, starting from the first action, even if the action that had
-originally matched was later in the list of actions.
+originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the conditions NOT MATCHED actions next,
+evaluate the condition's NOT MATCHED actions next,
 and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
-inserted then a uniqueness violation is raised.
+inserted, then a uniqueness violation is raised.
 MERGE does not attempt to avoid the
-ERROR by attempting an UPDATE.
+ERROR by executing an UPDATE.

 

diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 3db3908b429..db312681f78 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -96,10 +96,11 @@ CREATE POLICY name ON 
 
   
-   No separate policy exists for MERGE. Instead policies
+   No separate policy exists for MERGE. Instead, the policies
defined for SELECT, INSERT,
UPDATE and DELETE are applied
-   while executing MERGE, depending on the actions that are activated.
+   while executing MERGE, depending on the actions that are
+   performed.
   
  
 
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 477de2689b6..ad61d757af5 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -592,7 +592,7 @@ INSERT oid count
You may also wish to consider using MERGE, since that
-   allows mixed INSERT, UPDATE and
+   allows mixing INSERT, UPDATE and
DELETE within a single statement.
See .
   
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
index 7700d9b9bb1..149df9f0ad9 100644
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -73,11 +73,11 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED or NOT MATCHED
is set just once, after which WHEN clauses are evaluated
-   in the order specified.  The first clause to match each candidate change
-   row is executed.  No more than one WHEN clause is
-   executed for any candidate change row.  
+   in the order specified.  For each candidate change row, the first clause to
+   evaluate as true is executed.  No more than one WHEN clause
+   is executed for any candidate change row.  
   
 
   
@@ -85,14 +85,14 @@ DELETE
regular UPDATE, INSERT, or
DELETE commands of the same names. The syntax of
those commands is different, notably that there is no WHERE
-   clause and no tablename is specified.  All actions refer to the
+   clause and no table name is specified.  All actions refer to the
target_table_name,
though modifications to other tables may be made using triggers.
   
 
   
-   When DO NOTHING action is specified, the source row is
-   skipped. Since actions are evaluated in the given order, DO
+   When DO NOTHING is specified, the source row is
+   skipped. Since actions are evaluated in their specified order, DO
NOTHING can be handy to skip non-interesting source rows before
more fine-grained handling.
   
@@ -178,7 +178,7 @@ DELETE
 
  
   A substitute name for the data source. When an alias is
-  provided, it completely hides whether table or query was specified.
+  provided, it completely hides the actual name of the table or query.
  
 

@@ -202,7 +202,7 @@ DELETE
rows should appear in join_condition.
join_condition subexpressions that
only reference target_table_name
-   columns can only affect which action is taken, often in surprising ways.
+   columns can affect which action is taken, often in surprising w

Re: libpq compression (part 2)

2022-01-13 Thread Justin Pryzby
On Fri, Jan 14, 2022 at 02:12:17AM +0500, Daniil Zakhlystov wrote:
> Hi, Justin!
> 
> First of all, thanks for the detailed review. I’ve applied your patches to 
> the current version.

Note that my message had other comments that weren't addressed in this patch.

Your 0003 patch has a couple "noise" hunks that get rid of ^M characters added
in previous patches.  The ^M shouldn't be added in the first place.  Did you
apply my fixes using git-am or something else ?
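(As a concrete sketch, applying the fix patches with git am and then checking
for stray CRs could look something like this -- the directory and branch names
below are placeholders, not files from this thread:

    git checkout your-compression-branch
    git am ~/patches/fixups/*.patch      # each "fix" patch becomes its own commit
    git diff --check upstream/master     # --check flags trailing whitespace, including a trailing CR
    grep -rlI $'\r' src/                 # list any files that still contain ^M

git am also has --keep-cr/--no-keep-cr to control whether CRs at the ends of
lines in the mailbox are preserved when the mails are split apart.)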

On Fri, Jan 14, 2022 at 02:12:17AM +0500, Daniil Zakhlystov wrote:
> > On 12 Jan 2022, at 19:15, Justin Pryzby  wrote:
> > 
> > zlib still causes check-world to get stuck.  I first mentioned this last 
> > March:
> > 20210319062800.gi11...@telsasoft.com
> > 
...
> > I removed the thread from the CFBOT until that's resolved.
> 
> I’ve fixed the failing tests, now they should pass.

macos: passed
linux: timed out after 1hr
freebsd: failed in pg_rewind: ... <= replay_lsn AND state = 'streaming' FROM ...
windows: "Failed test 'data_checksums=on is reported on an offline cluster 
stdout /(?^:^on$)/'" / WARNING:  01000: algorithm zlib is not supported

Note that it's possible and easy to kick off a CI run using any github account:
see ./src/tools/ci/README

For me, it's faster than running check-world -j4 locally, and runs tests on 4
OSes.
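(The README has the details; the short version is roughly the following, where
the fork URL and branch name are placeholders -- you first enable the Cirrus CI
app for your fork on GitHub:

    git remote add myfork git@github.com:<your-account>/postgres.git
    git push myfork your-feature-branch

Cirrus then picks up the .cirrus.yml at the top of the tree and runs the
Linux, FreeBSD, macOS and Windows tasks against that branch.)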

I re-ran your branch under my own account and linux didn't get stuck (and the
compiler warnings tests passed).  But on a third attempt, macos failed the
pg_rewind test, and bsd failed the subscription test:
| SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('s', 
'r');

-- 
Justin



