Re: Inconsistent behavior of smart shutdown handling for queries with and without parallel workers

2020-08-13 Thread Bharath Rupireddy
I think the inconsistent behaviour reported in this thread gets
resolved with the approach and patch being discussed in [1].

>
> 1. In general, do we need to allow the postmaster to send different
> signals to bgworkers for fast and smart shutdowns and let them
> differentiate the two modes (if needed)?
>

Is there any way for bgworkers (or, for that matter, any of the postmaster's
child processes) to know that a smart shutdown is pending? This would be
useful if a bgworker (other than a parallel worker) wants to differentiate
between the two modes, i.e. smart and fast shutdown, and finish off its work
accordingly.

[1] - https://www.postgresql.org/message-id/469199.1597337108%40sss.pgh.pa.us

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: doc examples for pghandler

2020-08-13 Thread Michael Paquier
On Tue, Aug 11, 2020 at 01:01:10PM -0700, Mark Wong wrote:
> Ah, right.  For the moment I've added some empty conditionals for
> trigger and event trigger handling.
> 
> I've created a new entry in the commitfest app. [1]  I'll keep at it. :)

Thanks for the patch.  I have reviewed and reworked it as the
attached.  Some comments below.

+PGFILEDESC = "PL/Sample - procedural language"
+
+REGRESS = create_pl create_func select_func
+
+EXTENSION = plsample
+EXTVERSION = 0.1

This makefile has a couple of mistakes, and can be simplified a lot:
- make check does not work, as you forgot the PGXS part.
- MODULES can just be used, as there is only one source file (WIN32RES was
missing from OBJS, for example).
- DATA does not need the .control file.

.gitignore was missing.

We could just use 1.0 instead of 0.1 for the version number.  That's
not a big deal one way or another, but 1.0 is more consistent with the
other modules.

plsample--1.0.sql should complain if one attempts to load the file directly
from psql.  I have also cleaned up the README.

Not sure that there is a point in having three different files for the
regression tests.  create_pl.sql is actually not necessary as you
can do the same with CREATE EXTENSION.

The header list of plsample.c was inconsistent with the style normally used
in modules, and I have extended the handler function a bit so that it
returns the source text of the function as its result, but only if the
return type of the procedure is text; I also tweaked the results a bit, etc.
There was a family of small issues, like using ALLOCSET_SMALL_SIZES
for the context creation.  We could of course expand the sample
handler more in the future to check for pseudotype results, add a
validator, etc., but that can happen later, if necessary.
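
For readers without the attachment at hand, here is a rough, hedged sketch of
the behavior described above: a call handler that hands back the function's
source text, but only when the declared return type is text.  This is an
illustration under those assumptions, not the attached patch, and error
handling is kept minimal:

    #include "postgres.h"

    #include "access/htup_details.h"   /* GETSTRUCT */
    #include "catalog/pg_proc.h"       /* Form_pg_proc, Anum_pg_proc_prosrc */
    #include "catalog/pg_type.h"       /* TEXTOID */
    #include "fmgr.h"
    #include "utils/builtins.h"        /* cstring_to_text, TextDatumGetCString */
    #include "utils/syscache.h"        /* SearchSysCache1 */

    PG_MODULE_MAGIC;

    PG_FUNCTION_INFO_V1(plsample_call_handler);

    Datum
    plsample_call_handler(PG_FUNCTION_ARGS)
    {
        Oid         funcoid = fcinfo->flinfo->fn_oid;
        HeapTuple   proctup;
        Form_pg_proc procform;
        Datum       prosrc;
        bool        isnull;
        Datum       result = (Datum) 0;

        /* Fetch the pg_proc entry of the function being called. */
        proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcoid));
        if (!HeapTupleIsValid(proctup))
            elog(ERROR, "cache lookup failed for function %u", funcoid);
        procform = (Form_pg_proc) GETSTRUCT(proctup);

        prosrc = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_prosrc, &isnull);
        if (isnull)
            elog(ERROR, "null prosrc for function %u", funcoid);

        /* Only hand back a result when the declared return type is text. */
        if (procform->prorettype == TEXTOID)
            result = PointerGetDatum(cstring_to_text(TextDatumGetCString(prosrc)));

        ReleaseSysCache(proctup);
        PG_RETURN_DATUM(result);
    }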
--
Michael
From fb017d6277a76653385ae7179307177d20dbe194 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 14 Aug 2020 14:24:15 +0900
Subject: [PATCH] Add PL/sample to src/test/modules/

---
 src/test/modules/Makefile |   1 +
 src/test/modules/plsample/.gitignore  |   3 +
 src/test/modules/plsample/Makefile|  20 ++
 src/test/modules/plsample/README  |   6 +
 .../modules/plsample/expected/plsample.out|  36 
 src/test/modules/plsample/plsample--1.0.sql   |  14 ++
 src/test/modules/plsample/plsample.c  | 186 ++
 src/test/modules/plsample/plsample.control|   8 +
 src/test/modules/plsample/sql/plsample.sql|  15 ++
 doc/src/sgml/plhandler.sgml   |  62 +-
 10 files changed, 295 insertions(+), 56 deletions(-)
 create mode 100644 src/test/modules/plsample/.gitignore
 create mode 100644 src/test/modules/plsample/Makefile
 create mode 100644 src/test/modules/plsample/README
 create mode 100644 src/test/modules/plsample/expected/plsample.out
 create mode 100644 src/test/modules/plsample/plsample--1.0.sql
 create mode 100644 src/test/modules/plsample/plsample.c
 create mode 100644 src/test/modules/plsample/plsample.control
 create mode 100644 src/test/modules/plsample/sql/plsample.sql

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 1428529b04..a6d2ffbf9e 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -10,6 +10,7 @@ SUBDIRS = \
 		  delay_execution \
 		  dummy_index_am \
 		  dummy_seclabel \
+		  plsample \
 		  snapshot_too_old \
 		  test_bloomfilter \
 		  test_ddl_deparse \
diff --git a/src/test/modules/plsample/.gitignore b/src/test/modules/plsample/.gitignore
new file mode 100644
index 00..44d119cfcc
--- /dev/null
+++ b/src/test/modules/plsample/.gitignore
@@ -0,0 +1,3 @@
+# Generated subdirectories
+/log/
+/results/
diff --git a/src/test/modules/plsample/Makefile b/src/test/modules/plsample/Makefile
new file mode 100644
index 00..f1bc334bfc
--- /dev/null
+++ b/src/test/modules/plsample/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/plsample/Makefile
+
+MODULES = plsample
+
+EXTENSION = plsample
+DATA = plsample--1.0.sql
+PGFILEDESC = "PL/Sample - template for procedural language"
+
+REGRESS = plsample
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/plsample
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/plsample/README b/src/test/modules/plsample/README
new file mode 100644
index 00..7d44d7b3f2
--- /dev/null
+++ b/src/test/modules/plsample/README
@@ -0,0 +1,6 @@
+PL/Sample
+=========
+
+PL/Sample is an example template of a procedural-language handler.  It is
+kept as simple as possible, and demonstrates some of the things that can
+be done to build a fully functional procedural-language handler.
diff --git a/src/test/modules/plsample/expected/plsample.out b/src/test/modules/plsample/expected/plsample.out
new file mode 100644
index 00..a0c318b6df
--- /dev/null
+++ b/src/test/modules/plsample/expected/plsample.out
@@ -0,0 

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-08-13 Thread Thomas Munro
On Thu, Aug 13, 2020 at 6:38 PM Amit Kapila  wrote:
> I have pushed that patch last week and attached are the remaining
> patches. I have made a few changes in the next patch
> 0001-Extend-the-BufFile-interface.patch and have some comments on it
> which are as below:

Hi Amit,

I noticed that Konstantin Knizhnik's CF entry 2386 calls
table_scan_XXX() functions from an extension, namely
contrib/auto_explain, and started failing to build on Windows after
commit 7259736a.  This seems to be due to the new global variables
CheckXidAlive and bsysscan, which probably need PGDLLIMPORT if they
are accessed from inline functions that are part of the API that we
expect extensions to be allowed to call.
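
If that is the conclusion, the fix would presumably be just to add the marker
to the existing declarations, along these lines (a sketch only; the exact
header in which each variable is declared is whatever the tree uses today):

    extern PGDLLIMPORT TransactionId CheckXidAlive;
    extern PGDLLIMPORT bool bsysscan;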




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-08-13 Thread Masahiko Sawada
On Wed, 12 Aug 2020 at 22:27, Ashutosh Sharma  wrote:
>
> Thanks Robert for the review. Please find my comments inline below:
>
> On Fri, Aug 7, 2020 at 9:21 PM Robert Haas  wrote:
> >
> > On Thu, Aug 6, 2020 at 9:23 AM Ashutosh Sharma  
> > wrote:
> > > Attached v4 patch fixes the latest comments from Robert and Masahiko-san.
> >
> > Compiler warning:
> >
> > heap_surgery.c:136:13: error: comparison of unsigned expression < 0 is
> > always false [-Werror,-Wtautological-compare]
> > if (blkno < 0 || blkno >= nblocks)
> > ~ ^ ~
> >
>
> Fixed.
>
> > There's a certain inconsistency to these messages:
> >
> > rhaas=# create table foo (a int);
> > CREATE TABLE
> > rhaas=# insert into foo values (1);
> > INSERT 0 1
> > rhaas=# select heap_force_kill('foo'::regclass, array['(0,2)'::tid]);
> > NOTICE:  skipping tid (0, 2) because it contains an invalid offset
> >  heap_force_kill
> > -
> >
> > (1 row)
> >
> > rhaas=# select heap_force_kill('foo'::regclass, array['(1,0)'::tid]);
> > ERROR:  invalid item pointer
> > LOCATION:  tids_same_page_fetch_offnums, heap_surgery.c:347
> > rhaas=# select heap_force_kill('foo'::regclass, array['(1,1)'::tid]);
> > ERROR:  block number 1 is out of range for relation "foo"
> >
> > From a user perspective it seems like I've made three very similar
> > mistakes: in the first case, the offset is too high, in the second
> > case it's too low, and in the third case the block number is out of
> > range. But in one case I get a NOTICE and in the other two cases I get
> > an ERROR. In one case I get the relation name and in the other two
> > cases I don't. The two complaints about an invalid offset are phrased
> > completely differently from each other. For example, suppose you do
> > this:
> >
> > ERROR: tid (%u, %u) is invalid for relation "%s" because the block
> > number is out of range (%u..%u)
> > ERROR: tid (%u, %u) is invalid for relation "%s" because the item
> > number is out of range for this block (%u..%u)
> > ERROR: tid (%u, %u) is invalid for relation "%s" because the item is unused
> > ERROR: tid (%u, %u) is invalid for relation "%s" because the item is dead
> >
>
> Corrected.
>
> > I think I misled you when I said to use pg_class_aclcheck. I think it
> > should actually be pg_class_ownercheck.
> >
>
> okay, I've changed it to pg_class_ownercheck.
>
> > I think the relkind sanity check should permit RELKIND_MATVIEW also.
> >
>
> Yeah, actually we should allow MATVIEW, don't know why I thought of
> blocking it earlier.
>
> > It's unclear to me why the freeze logic here shouldn't do this part
> > what heap_prepare_freeze_tuple() does when freezing xmax:
> >
> > frz->t_infomask2 &= ~HEAP_HOT_UPDATED;
> > frz->t_infomask2 &= ~HEAP_KEYS_UPDATED;
> >
>
> Yeah, we should have these changes when freezing the xmax.
>
> > Likewise, why should we not freeze or invalidate xvac in the case
> > where tuple->t_infomask & HEAP_MOVED, as heap_prepare_freeze_tuple()
> > would do?
> >
>
> Again, we should have this as well.
>
> Apart from above, this time I've also added the documentation on
> pg_surgery module and added a few more test-cases.
>
> Attached patch with above changes.
>

Thank you for updating the patch! Here are my comments on the v5 patch:

--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -35,6 +35,7 @@ SUBDIRS = \
pg_standby  \
pg_stat_statements \
pg_trgm \
+   pg_surgery  \
pgcrypto\

I guess we use alphabetical order here. So pg_surgery should be placed
before pg_trgm.

---
+   if (heap_force_opt == HEAP_FORCE_KILL)
+   ItemIdSetDead(itemid);

I think that if the page is an all-visible page, we should clear the
all-visible bit on the visibility map corresponding to the page, as well as
PD_ALL_VISIBLE on the page header. Otherwise, an index-only scan would
return wrong results.
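
A hedged sketch of what that could look like next to the ItemIdSetDead()
call, using the usual visibility-map API; the surrounding variable names
(rel, blkno, page) are placeholders, and this is not the actual patch:

    if (PageIsAllVisible(page))
    {
        Buffer      vmbuffer = InvalidBuffer;

        /*
         * Clear both the VM bit and the page-level hint so that
         * index-only scans cannot rely on stale all-visible state.
         */
        visibilitymap_pin(rel, blkno, &vmbuffer);
        visibilitymap_clear(rel, blkno, vmbuffer, VISIBILITYMAP_VALID_BITS);
        PageClearAllVisible(page);
        if (BufferIsValid(vmbuffer))
            ReleaseBuffer(vmbuffer);
    }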

---
+   /*
+* We do not mark the buffer dirty or do WAL logging for unmodifed
+* pages.
+*/
+   if (!did_modify_page)
+   goto skip_wal;
+
+   /* Mark buffer dirty before we write WAL. */
+   MarkBufferDirty(buf);
+
+   /* XLOG stuff */
+   if (RelationNeedsWAL(rel))
+   log_newpage_buffer(buf, true);
+
+skip_wal:
+   END_CRIT_SECTION();
+

s/unmodifed/unmodified/

Do we really need to use goto? I think we can modify it as follows:

if (did_modify_page)
{
   /* Mark buffer dirty before we write WAL. */
   MarkBufferDirty(buf);

   /* XLOG stuff */
   if (RelationNeedsWAL(rel))
   log_newpage_buffer(buf, true);
}

END_CRIT_SECTION();

---
pg_force_freeze() can revive a tuple that is already deleted but not
vacuumed yet. Therefore, the user might need to reindex indexes after
using that function. For instance, with the following script, the last
two queries, an index scan and a seq scan, will return different results.

set enable_seqscan to off;
set enable_bitmapscan to off;

Re: massive FPI_FOR_HINT load after promote

2020-08-13 Thread Masahiko Sawada
On Wed, 12 Aug 2020 at 02:42, Alvaro Herrera  wrote:
>
> On 2020-Aug-11, Masahiko Sawada wrote:
>
> > On Tue, 11 Aug 2020 at 07:56, Alvaro Herrera  
> > wrote:
>
> > > So if you have some table where tuples gain hint bits in bulk, and
> > > rarely modify the pages afterwards, and promote before those pages are
> > > frozen, then you may end up with a massive amount of pages that will
> > > need hinting after the promote, which can become troublesome.
> >
> > Did the case you observed not use hot standby? I thought the impact of
> > this issue could be somewhat alleviated in hot standby cases since
> > read queries on the hot standby can set hint bits.
>
> Oh, interesting, I didn't know that.  However, it's not 100% true: the
> standby can set the bit in shared buffers, but it does not mark the page
> dirty.  So when the page is evicted, those bits that were set are lost.
> That's not great.  See MarkBufferDirtyHint:

Yeah, you're right.

>
> > > One simple idea to try to forestall this problem would be to modify the
> > > algorithm so that all tuples are scanned and hinted if the page is going
> > > to be dirtied -- then send a single FPI setting bits for all tuples,
> > > instead of just on the first tuple.
> >
> > This idea seems good to me but I'm concerned a bit that the
> > probability of concurrent processes writing FPI for the same page
> > might get higher since concurrent processes could set hint bits at the
> > same time. If it's true, I wonder if we can advertise hint bits are
> > being updated to prevent concurrent FPI writes for the same page.
>
> Hmm, a very good point.  Sounds like we would need to obtain an
> exclusive lock on the page .. but that would be very problematic.
>

I think that when the page is going to be dirtied, only the hint-bit
updates on the page and the FPI write need to be performed exclusively.
So perhaps we can add a flag, say BM_UPDATE_HINTBITS, to the buffer
descriptor indicating that the hint bits are being updated.
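
Purely as an illustration of the proposal (the flag name comes from the
paragraph above; the bit value below is arbitrary, and finding an actually
free bit in the buffer state would be part of the real work):

    /* hypothetical addition to buf_internals.h */
    #define BM_UPDATE_HINTBITS   (1U << 28)  /* hint bits being set, FPI pending */

    /*
     * A backend about to hint all tuples on the page (and emit the FPI) would
     * first try to set this flag in the buffer header; if it is already set,
     * another backend is doing the same work and this one can skip it,
     * avoiding a duplicate full-page image.
     */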

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/


PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Collation versioning

2020-08-13 Thread Thomas Munro
On Thu, Aug 13, 2020 at 9:52 PM Julien Rouhaud  wrote:
> Here's a rebased v27 that removes the current approach of ignoring indexes that
> don't rely on a stable ordering.  I'll start a new thread on that matter once
> the infrastructure pieces are committed.

Thanks Julien.  I'm planning to do a bit more testing and review, and
then hopefully commit this next week.  If anyone else has objections
to this design, now would be a good time to speak up.




Re: Asynchronous Append on postgres_fdw nodes.

2020-08-13 Thread Thomas Munro
On Thu, Jul 2, 2020 at 3:20 PM Etsuro Fujita  wrote:
> On Thu, Jul 2, 2020 at 11:14 AM Kyotaro Horiguchi
>  wrote:
> > As the result of a discussion with Fujita-san off-list, I'm going to
> > hold off development until he decides whether mine or Thomas' is
> > better.
>
> I'd like to join the party, but IIUC, we haven't yet reached a consensus
> on which one is the right way to go.  So I think we need to discuss
> that first.

Either way, we definitely need patch 0001.  One comment:

-CreateWaitEventSet(MemoryContext context, int nevents)
+CreateWaitEventSet(MemoryContext context, ResourceOwner res, int nevents)

I wonder if it's better to have it receive ResourceOwner like that, or
to have it capture CurrentResourceOwner.  I think the latter is more
common in existing code.
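
To make the two shapes concrete, a small sketch of the call sites as I
understand them (illustration only):

    /* (a) caller passes the owner explicitly, as in the quoted hunk: */
    WaitEventSet *setA = CreateWaitEventSet(CurrentMemoryContext,
                                            CurrentResourceOwner, 3);

    /*
     * (b) keep the existing signature and let CreateWaitEventSet() capture
     * CurrentResourceOwner internally, the more common pattern in the tree:
     */
    WaitEventSet *setB = CreateWaitEventSet(CurrentMemoryContext, 3);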




Re: fixing old_snapshot_threshold's time->xid mapping

2020-08-13 Thread Thomas Munro
On Fri, Aug 14, 2020 at 12:52 PM Thomas Munro  wrote:
> Here's a rebase.

And another, since I was too slow and v6 is already in conflict...
sorry for the high frequency patches.
From 1ced7b8c881676f21623c048f5e9e012ca8416ec Mon Sep 17 00:00:00 2001
From: Robert Haas 
Date: Thu, 16 Apr 2020 09:37:31 -0400
Subject: [PATCH v7 1/6] Expose oldSnapshotControl.

---
 src/backend/utils/time/snapmgr.c | 55 +--
 src/include/utils/old_snapshot.h | 75 
 2 files changed, 77 insertions(+), 53 deletions(-)
 create mode 100644 src/include/utils/old_snapshot.h

diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 752af0c10d..6cfb07e82b 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -63,6 +63,7 @@
 #include "storage/spin.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
+#include "utils/old_snapshot.h"
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/snapmgr.h"
@@ -74,59 +75,7 @@
  */
 int			old_snapshot_threshold; /* number of minutes, -1 disables */
 
-/*
- * Structure for dealing with old_snapshot_threshold implementation.
- */
-typedef struct OldSnapshotControlData
-{
-	/*
-	 * Variables for old snapshot handling are shared among processes and are
-	 * only allowed to move forward.
-	 */
-	slock_t		mutex_current;	/* protect current_timestamp */
-	TimestampTz current_timestamp;	/* latest snapshot timestamp */
-	slock_t		mutex_latest_xmin;	/* protect latest_xmin and next_map_update */
-	TransactionId latest_xmin;	/* latest snapshot xmin */
-	TimestampTz next_map_update;	/* latest snapshot valid up to */
-	slock_t		mutex_threshold;	/* protect threshold fields */
-	TimestampTz threshold_timestamp;	/* earlier snapshot is old */
-	TransactionId threshold_xid;	/* earlier xid may be gone */
-
-	/*
-	 * Keep one xid per minute for old snapshot error handling.
-	 *
-	 * Use a circular buffer with a head offset, a count of entries currently
-	 * used, and a timestamp corresponding to the xid at the head offset.  A
-	 * count_used value of zero means that there are no times stored; a
-	 * count_used value of OLD_SNAPSHOT_TIME_MAP_ENTRIES means that the buffer
-	 * is full and the head must be advanced to add new entries.  Use
-	 * timestamps aligned to minute boundaries, since that seems less
-	 * surprising than aligning based on the first usage timestamp.  The
-	 * latest bucket is effectively stored within latest_xmin.  The circular
-	 * buffer is updated when we get a new xmin value that doesn't fall into
-	 * the same interval.
-	 *
-	 * It is OK if the xid for a given time slot is from earlier than
-	 * calculated by adding the number of minutes corresponding to the
-	 * (possibly wrapped) distance from the head offset to the time of the
-	 * head entry, since that just results in the vacuuming of old tuples
-	 * being slightly less aggressive.  It would not be OK for it to be off in
-	 * the other direction, since it might result in vacuuming tuples that are
-	 * still expected to be there.
-	 *
-	 * Use of an SLRU was considered but not chosen because it is more
-	 * heavyweight than is needed for this, and would probably not be any less
-	 * code to implement.
-	 *
-	 * Persistence is not needed.
-	 */
-	int			head_offset;	/* subscript of oldest tracked time */
-	TimestampTz head_timestamp; /* time corresponding to head xid */
-	int			count_used;		/* how many slots are in use */
-	TransactionId xid_by_minute[FLEXIBLE_ARRAY_MEMBER];
-} OldSnapshotControlData;
-
-static volatile OldSnapshotControlData *oldSnapshotControl;
+volatile OldSnapshotControlData *oldSnapshotControl;
 
 
 /*
diff --git a/src/include/utils/old_snapshot.h b/src/include/utils/old_snapshot.h
new file mode 100644
index 00..284af7d508
--- /dev/null
+++ b/src/include/utils/old_snapshot.h
@@ -0,0 +1,75 @@
+/*-
+ *
+ * old_snapshot.h
+ *		Data structures for 'snapshot too old'
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/include/utils/old_snapshot.h
+ *
+ *-
+ */
+
+#ifndef OLD_SNAPSHOT_H
+#define OLD_SNAPSHOT_H
+
+#include "datatype/timestamp.h"
+#include "storage/s_lock.h"
+
+/*
+ * Structure for dealing with old_snapshot_threshold implementation.
+ */
+typedef struct OldSnapshotControlData
+{
+	/*
+	 * Variables for old snapshot handling are shared among processes and are
+	 * only allowed to move forward.
+	 */
+	slock_t		mutex_current;	/* protect current_timestamp */
+	TimestampTz current_timestamp;	/* latest snapshot timestamp */
+	slock_t		mutex_latest_xmin;	/* protect latest_xmin and next_map_update */
+	TransactionId latest_xmin;	/* latest snapshot xmin */
+	TimestampTz next_map_update;	/* latest 

Re: security_context_t marked as deprecated in libselinux 3.1

2020-08-13 Thread Michael Paquier
On Thu, Aug 13, 2020 at 08:47:28PM -0400, Tom Lane wrote:
> Should we back-patch that?  Usually I figure that people might want
> to build back PG branches on newer platforms at some point, so that
> it's useful to apply portability fixes across-the-board.  On the
> other hand, since it's only a compiler warning, maybe it's not worth
> the trouble.

Not sure that's worth the trouble as long as people don't complain
about it directly, and it does not prevent the contrib module from
working.
--
Michael




Re: pg_dump from v13 is slow

2020-08-13 Thread Tom Lane
Justin Pryzby  writes:
> I can reproduce the issue with generated data:
> pryzbyj=# SELECT format('create table t%s(i int)', i) FROM 
> generate_series(1,)i;

Hm, I tried this case and didn't really detect much runtime difference
between v12 and HEAD.

> I don't know if it's any issue, but I found that pg12 can process "null
> statements" almost 2x as fast:

Now I'm suspicious that you're comparing an assert-enabled v13 build
to a non-assert-enabled v12 build.  Check the output of
"pg_config --configure" from each installation to see if they're
configured alike.

regards, tom lane




Re: pg_dump from v13 is slow

2020-08-13 Thread Justin Pryzby
On Thu, Aug 13, 2020 at 08:53:46PM -0400, Alvaro Herrera wrote:
> Hmm, I wonder if you're comparing an assert-enabled pg13 build to a
> non-assert-enabled pg12 build, or something like that.

Great question - I thought of it myself but then forgot to look..

$ rpm -q postgresql1{2,3}-server
postgresql12-server-12.4-1PGDG.rhel7.x86_64
postgresql13-server-13-beta3_1PGDG.rhel7.x86_64

$ /usr/pgsql-12/bin/pg_config |grep -o cassert || echo not found
not found
$ /usr/pgsql-13/bin/pg_config |grep -o cassert || echo not found
cassert

It looks like the beta packages are compiled with cassert, which makes sense.

Thanks and sorry for noise.

-- 
Justin




Re: pg_dump from v13 is slow

2020-08-13 Thread Alvaro Herrera
Hmm, I wonder if you're comparing an assert-enabled pg13 build to a
non-assert-enabled pg12 build, or something like that.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: fixing old_snapshot_threshold's time->xid mapping

2020-08-13 Thread Thomas Munro
On Wed, Apr 22, 2020 at 5:39 PM Dilip Kumar  wrote:
> -   if (ts == update_ts)
> +   if (ts >= update_ts)

Hi Dilip, I didn't follow this bit -- could you explain?

Here's a rebase.  In the 0004 patch I chose to leave behind some
unnecessary braces to avoid reindenting a bunch of code after removing
an if branch, just for ease of review, but I'd probably remove those
in a committed version.  I'm going to add this stuff to the next CF so
we don't lose track of it.
From 628eb4cfb7a7d67125c5fd9ebc1ecd69b3d9cb82 Mon Sep 17 00:00:00 2001
From: Robert Haas 
Date: Thu, 16 Apr 2020 09:37:31 -0400
Subject: [PATCH v6 1/6] Expose oldSnapshotControl.

---
 src/backend/utils/time/snapmgr.c | 55 +--
 src/include/utils/old_snapshot.h | 75 
 2 files changed, 77 insertions(+), 53 deletions(-)
 create mode 100644 src/include/utils/old_snapshot.h

diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 604d823f68..9dc19b2f58 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -63,6 +63,7 @@
 #include "storage/spin.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
+#include "utils/old_snapshot.h"
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/snapmgr.h"
@@ -74,59 +75,7 @@
  */
 int			old_snapshot_threshold; /* number of minutes, -1 disables */
 
-/*
- * Structure for dealing with old_snapshot_threshold implementation.
- */
-typedef struct OldSnapshotControlData
-{
-	/*
-	 * Variables for old snapshot handling are shared among processes and are
-	 * only allowed to move forward.
-	 */
-	slock_t		mutex_current;	/* protect current_timestamp */
-	TimestampTz current_timestamp;	/* latest snapshot timestamp */
-	slock_t		mutex_latest_xmin;	/* protect latest_xmin and next_map_update */
-	TransactionId latest_xmin;	/* latest snapshot xmin */
-	TimestampTz next_map_update;	/* latest snapshot valid up to */
-	slock_t		mutex_threshold;	/* protect threshold fields */
-	TimestampTz threshold_timestamp;	/* earlier snapshot is old */
-	TransactionId threshold_xid;	/* earlier xid may be gone */
-
-	/*
-	 * Keep one xid per minute for old snapshot error handling.
-	 *
-	 * Use a circular buffer with a head offset, a count of entries currently
-	 * used, and a timestamp corresponding to the xid at the head offset.  A
-	 * count_used value of zero means that there are no times stored; a
-	 * count_used value of OLD_SNAPSHOT_TIME_MAP_ENTRIES means that the buffer
-	 * is full and the head must be advanced to add new entries.  Use
-	 * timestamps aligned to minute boundaries, since that seems less
-	 * surprising than aligning based on the first usage timestamp.  The
-	 * latest bucket is effectively stored within latest_xmin.  The circular
-	 * buffer is updated when we get a new xmin value that doesn't fall into
-	 * the same interval.
-	 *
-	 * It is OK if the xid for a given time slot is from earlier than
-	 * calculated by adding the number of minutes corresponding to the
-	 * (possibly wrapped) distance from the head offset to the time of the
-	 * head entry, since that just results in the vacuuming of old tuples
-	 * being slightly less aggressive.  It would not be OK for it to be off in
-	 * the other direction, since it might result in vacuuming tuples that are
-	 * still expected to be there.
-	 *
-	 * Use of an SLRU was considered but not chosen because it is more
-	 * heavyweight than is needed for this, and would probably not be any less
-	 * code to implement.
-	 *
-	 * Persistence is not needed.
-	 */
-	int			head_offset;	/* subscript of oldest tracked time */
-	TimestampTz head_timestamp; /* time corresponding to head xid */
-	int			count_used;		/* how many slots are in use */
-	TransactionId xid_by_minute[FLEXIBLE_ARRAY_MEMBER];
-} OldSnapshotControlData;
-
-static volatile OldSnapshotControlData *oldSnapshotControl;
+volatile OldSnapshotControlData *oldSnapshotControl;
 
 
 /*
diff --git a/src/include/utils/old_snapshot.h b/src/include/utils/old_snapshot.h
new file mode 100644
index 00..284af7d508
--- /dev/null
+++ b/src/include/utils/old_snapshot.h
@@ -0,0 +1,75 @@
+/*-
+ *
+ * old_snapshot.h
+ *		Data structures for 'snapshot too old'
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/include/utils/old_snapshot.h
+ *
+ *-
+ */
+
+#ifndef OLD_SNAPSHOT_H
+#define OLD_SNAPSHOT_H
+
+#include "datatype/timestamp.h"
+#include "storage/s_lock.h"
+
+/*
+ * Structure for dealing with old_snapshot_threshold implementation.
+ */
+typedef struct OldSnapshotControlData
+{
+	/*
+	 * Variables for old snapshot handling are shared among processes and are
+	 * only 

Re: security_context_t marked as deprecated in libselinux 3.1

2020-08-13 Thread Tom Lane
Michael Paquier  writes:
> Applied on HEAD then.  Thanks for the discussion!

Should we back-patch that?  Usually I figure that people might want
to build back PG branches on newer platforms at some point, so that
it's useful to apply portability fixes across-the-board.  On the
other hand, since it's only a compiler warning, maybe it's not worth
the trouble.

regards, tom lane




Re: pg_dump from v13 is slow

2020-08-13 Thread Justin Pryzby
I can reproduce the issue with generated data:

pryzbyj=# SELECT format('create table t%s(i int)', i) FROM 
generate_series(1,)i;
\set ECHO errors
\set QUIET
\gexec

$ time pg_dump --section=pre-data -d pryzbyj -h /tmp -p 5613 |wc   
 110015  240049 1577087
real    0m50.445s

$ time pg_dump --section=pre-data -d pryzbyj -h /tmp -p 5612 |wc
 110015  240049 1577084
real    0m11.203s

On Thu, Aug 13, 2020 at 04:30:14PM -0700, Andres Freund wrote:
> Would be worth knowing how much of the time pgbench is 100% CPU
> utilized, and how much of the time it is basically waiting for server
> side queries and largely idle.

Good question - I guess you mean pg_dump.

$ command time -v pg_dump --section=pre-data -d pryzbyj -h /tmp -p 5612 |wc   
Command being timed: "pg_dump --section=pre-data -d pryzbyj -h /tmp -p 
5612"
User time (seconds): 0.65
System time (seconds): 0.52
Percent of CPU this job got: 9%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:11.85

$ command time -v pg_dump --section=pre-data -d pryzbyj -h /tmp -p 5613 |wc
Command being timed: "pg_dump --section=pre-data -d pryzbyj -h /tmp -p 
5613"
User time (seconds): 0.79
System time (seconds): 0.49
Percent of CPU this job got: 2%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:48.51

So v13 was 4.5x slower and it seems to be all on the server side.

I looked at the queries like this:
time strace -ts999 -e sendto pg_dump --section=pre-data -d pryzbyj -h /tmp -p 5613 2>strace-13-3 |wc
cut -c1-66 strace-13-3 |sort |uniq |less

Most of the time is spent on these three queries:

|12:58:11 sendto(3, 
"Q\0\0\3\215SELECT\na.attnum,\na.attname,\na.atttypmod,\na.attstattarget,\na.attstorage,\nt.typstorage,\na.attn
|...
|12:58:30 sendto(3, 
"Q\0\0\3\215SELECT\na.attnum,\na.attname,\na.atttypmod,\na.attstattarget,\na.attstorage,\nt.typstorage,\na.attn

|12:58:32 sendto(3, "Q\0\0\1\314SELECT oid, tableoid, pol.polname, pol.polcmd, 
pol.polpermissive, CASE WHEN pol.polroles = '{0}' TH
|...
|12:58:47 sendto(3, "Q\0\0\1\314SELECT oid, tableoid, pol.polname, pol.polcmd, 
pol.polpermissive, CASE WHEN pol.polroles = '{0}' TH

|12:58:49 sendto(3, "Q\0\0\0\213SELECT pr.tableoid, pr.oid, p.pubname FROM 
pg_publication_rel pr, pg_publication p WHERE pr.prrelid
|...
|12:59:01 sendto(3, "Q\0\0\0\213SELECT pr.tableoid, pr.oid, p.pubname FROM 
pg_publication_rel pr, pg_publication p WHERE pr.prrelid

Compare with v12:

|12:57:58 sendto(3, 
"Q\0\0\3\215SELECT\na.attnum,\na.attname,\na.atttypmod,\na.attstattarget,\na.attstorage,\nt.typstorage,\na.attn
|...
|12:58:03 sendto(3, 
"Q\0\0\3\215SELECT\na.attnum,\na.attname,\na.atttypmod,\na.attstattarget,\na.attstorage,\nt.typstorage,\na.attn

|12:58:05 sendto(3, "Q\0\0\1\314SELECT oid, tableoid, pol.polname, pol.polcmd, 
pol.polpermissive, CASE WHEN pol.polroles = '{0}' TH
|...
|12:58:07 sendto(3, "Q\0\0\1\314SELECT oid, tableoid, pol.polname, pol.polcmd, 
pol.polpermissive, CASE WHEN pol.polroles = '{0}' TH

|12:58:09 sendto(3, "Q\0\0\0\213SELECT pr.tableoid, pr.oid, p.pubname FROM 
pg_publication_rel pr, pg_publication p WHERE pr.prrelid
|...
|12:58:11 sendto(3, "Q\0\0\0\213SELECT pr.tableoid, pr.oid, p.pubname FROM 
pg_publication_rel pr, pg_publication p WHERE pr.prrelid

The first query regressed the worst.

$ psql -h /tmp -Ap 5612 pryzbyj
psql (13beta3, server 12.4)
pryzbyj=# explain analyze SELECT 
a.attnum,a.attname,a.atttypmod,a.attstattarget,a.attstorage,t.typstorage,a.attnotnull,a.atthasdef,a.attisdropped,a.attlen,a.attalign,a.attislocal,pg_catalog.format_type(t.oid,
 a.atttypmod) AS atttypname,a.attgenerated,CASE WHEN a.atthasmissing AND NOT 
a.attisdropped THEN a.attmissingval ELSE null END AS 
attmissingval,a.attidentity,pg_catalog.array_to_string(ARRAY(SELECT 
pg_catalog.quote_ident(option_name) || ' ' || 
pg_catalog.quote_literal(option_value) FROM 
pg_catalog.pg_options_to_table(attfdwoptions) ORDER BY option_name), E',') 
AS attfdwoptions,CASE WHEN a.attcollation <> t.typcollation THEN a.attcollation 
ELSE 0 END AS attcollation,array_to_string(a.attoptions, ', ') AS attoptions 
FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t ON a.atttypid = 
t.oid WHERE a.attrelid = '191444'::pg_catalog.oid AND a.attnum > 
0::pg_catalog.int2 ORDER BY a.attnum;
QUERY PLAN
Nested Loop Left Join  (cost=0.58..16.72 rows=1 width=217) (actual 
time=0.205..0.209 rows=1 loops=1)
  ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  
(cost=0.29..8.31 rows=1 width=189) (actual time=0.030..0.032 rows=1 loops=1)
Index Cond: ((attrelid = '191444'::oid) AND (attnum > '0'::smallint))
  ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.29..8.30 rows=1 
width=9) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (oid = a.atttypid)
  SubPlan 1
->  Sort  (cost=0.09..0.09 rows=3 width=64) (actual time=0.119..0.119 
rows=0 loops=1)
  Sort Key: pg_options_to_table.option_name
  Sort Method: 

Re: security_context_t marked as deprecated in libselinux 3.1

2020-08-13 Thread Michael Paquier
On Thu, Aug 13, 2020 at 02:35:28PM +0900, Michael Paquier wrote:
> Okay, thanks for confirming.  Let's do so then, I'll just wait a bit
> to see if there are more comments from others.

Applied on HEAD then.  Thanks for the discussion!
--
Michael




Re: Reply: how to create index concurrently on partitioned table

2020-08-13 Thread Michael Paquier
On Wed, Aug 12, 2020 at 10:37:08PM +0900, Michael Paquier wrote:
> I have been able to work more on this patch today, and finally added
> an error context for the transaction block check, as that's cleaner.
> In my manual tests, I have also bumped into a case that failed with
> the original patch (where there were no session locks), and created
> an isolation test based on it: drop of a partition leaf concurrently
> to REINDEX done on the parent.  One last thing I have spotted is that
> we failed to discard properly foreign tables defined as leaves of a
> partition tree, causing a reindex to fail, so reindex_partitions()
> ought to just use RELKIND_HAS_STORAGE() to do its filtering work.  I
> am leaving this patch alone for a couple of days now, and I'll try to
> come back to it after and potentially commit it.  The attached has
> been indented by the way.

I got to think more about the locking strategy used in this patch, and
I am afraid that we should fix the bug with REINDEX SCHEMA/DATABASE
first.  What we have here is rather similar to a REINDEX SCHEMA with
all the partitions in the same schema, so it would be better to apply
the same set of assumptions and logic for the reindex of partitioned
relations as we do for the others.  This makes the whole logic more
consistent, but it also reduces the surface of bugs.  I have created a
separate thread for the problem, and posted a patch:
https://www.postgresql.org/message-id/20200813043805.ge11...@paquier.xyz

Once this gets done, we should then be able to get rid of the extra
session locking taken when building the list of partitions, limiting
session locks to only be taken during the concurrent reindex of a
single partition (the table itself for a partition table, and the
parent table for a partition index), making the whole operation less
invasive.
--
Michael




RE: Libpq support to connect to standby server as priority

2020-08-13 Thread tsunakawa.ta...@fujitsu.com
From: Robert Haas 
> I think it would be better to have read-write and read-only check
> trnasaction_read_only, and primary and standby can check the new
> thing. There can never be any real advantage in having synonyms for
> the same thing, but there can be an advantage to letting users choose
> the behavior they want.

+1
"primary" is not always equal to "read-write".  When normal users are only 
allowed to query data on a logically replicated database (ALTER USER SET 
default_transaction_read_only = on), it's the primary read-only server.


Regards
Takayuki Tsunakawa





Re: security_context_t marked as deprecated in libselinux 3.1

2020-08-13 Thread Michael Paquier
On Thu, Aug 13, 2020 at 06:54:41AM -0400, Joe Conway wrote:
> On 8/13/20 1:22 AM, Michael Paquier wrote:
>> Joe, what's the version of libselinux used in rhinoceros?  2.5?
>
> rpm -q libselinux
> libselinux-2.5-15.el7.x86_64

Thanks!
--
Michael




Re: pg_dump from v13 is slow

2020-08-13 Thread Andres Freund
Hi,

On 2020-08-13 17:48:23 -0500, Justin Pryzby wrote:
> I'm trying to narrow this down, but I'd be very happy for suggestions.

Would be worth knowing how much of the time pgbench is 100% CPU
utilized, and how much of the time it is basically waiting for server
side queries and largely idle.

If it's close to 100% busy for a significant part of that time, it'd be
useful to get a perf profile. If it's largely queries to the server that
are the issue, logging those would be relevant.

Greetings,

Andres Freund




Re: pg_dump from v13 is slow

2020-08-13 Thread Alvaro Herrera
On 2020-Aug-13, Justin Pryzby wrote:

> I'm trying to narrow this down, but I'd be very happy for suggestions.

Maybe you can time "pg_dump --binary-upgrade" to see if the slowness
comes from there.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




pg_dump from v13 is slow

2020-08-13 Thread Justin Pryzby
I ran into this while running pg_upgrade from beta2 to beta3:
$ time sudo -u postgres sh -c 'cd /var/lib/pgsql; /usr/pgsql-13/bin/pg_upgrade 
-b /usr/pgsql-13b2/bin/ -d ./13b2/data -D ./13/data --link'
real    94m18.335s

This instance has many table partitions, and the production instance uses
tablespaces.  Some of our tables are wide.  This VM is not idle, but that
does not account for being 20x slower.

pg_dump -v --section=pre-data ts |wc
1846659 4697507 59575253
real    39m8.524s

Compare v12 and v13:

|$ /usr/pgsql-12/bin/initdb -D 12
|$ /usr/pgsql-12/bin/postgres -D 12 -c shared_buffers=256MB -c 
max_locks_per_transaction=128 -c port=5678 -c unix_socket_directories=/tmp&
|$ psql -h /tmp -p 5678 postgres 

Re: [BUG] Error in BRIN summarization

2020-08-13 Thread Alvaro Herrera
On 2020-Aug-13, Anastasia Lubennikova wrote:

> Cool.
> This version looks much simpler than mine and passes the tests fine.

Thanks, pushed it to all branches.  Thanks for diagnosing this problem!

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: proposal: possibility to read dumped table's name from file

2020-08-13 Thread Justin Pryzby
On Mon, Jul 27, 2020 at 07:25:54AM +0200, Pavel Stehule wrote:
> so 25. 7. 2020 v 15:26 odesílatel vignesh C  napsal:
> 
> > On Tue, Jul 14, 2020 at 12:03 PM Pavel Stehule 
> > wrote:
> > >> I meant can this:
> > >> printf(_("  --filter=FILENAMEread object name filter
> > >> expressions from file\n"));
> > >> be changed to:
> > >> printf(_("  --filter=FILENAMEdump objects and data based
> > >> on the filter expressions from the filter file\n"));
> > >
> > > done in today patch

This looks good to my eyes - marked RFC.

-- 
Justin




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Aug 13, 2020 at 12:30 PM Stephen Frost  wrote:
> > So, in our world, wouldn't this translate to 'make cfbot complain'?
> 
> This seems like it would be useful, but we'd have to figure out what
> to do about typedefs.list. If the patch is indented with the current
> one (which is auto-generated by the entire build farm, remember) it's
> likely to mess up a patch that's otherwise properly formatted. We'd
> either need to insist that people include updates to typedefs.list in
> the patch, or else have the cfbot take a stab at doing those updates
> itself.

For my 2c, anyway, I like the idea of having folks update the typedefs
themselves when they've got a patch that needs a new typedef to be
indented correctly.  Having cfbot try to do that seems unlikely to work
well.

I also didn't mean to imply that we'd push back and ask for a rebase due
to indentation changes, but at the same time, I question whether it's really
that realistic a concern: either whoever posted the patch ran pgindent
on it, or they didn't, and I doubt cfbot's check of that would change
without there being a conflict between the patch and something that got
committed anyway.

I also disagree that it's that much of a burden to ask people who are
already hacking on PG to install pgindent.

All that said, it seems that others feel differently, and while I still
think it's a pretty reasonable idea to have cfbot check, if no one
agrees with me, that's fine too.  Having the pre-commit hook would help
with the downstream issue of pgindent pain from unrelated incorrect
indentation, so the cost of a patch author not properly indenting to
start with would fall only on the bits the patch is already modifying,
which is a lot better.

Thanks,

Stephen




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Andres Freund
Hi,

On 2020-08-13 12:50:16 -0400, Tom Lane wrote:
> Stephen Frost  writes:
> > So, in our world, wouldn't this translate to 'make cfbot complain'?
> 
> > I'm definitely a fan of the idea of having cfbot flag these and then we
> > maybe get to a point where it's not the committers dealing with fixing
> > patches that weren't pgindent'd properly, it's the actual patch
> > submitters being nagged about it...
> 
> Meh.  Asking all submitters to install pgindent is a bit of a burden.

+1. We could improve on that by slurping it into src/tools though. If
there were a 'make patchprep' target, it'd be a lot more realistic.


But even so, it'd probably further increase the rate of needing to
constantly rebase lingering patches if cfbot considered indentation
issues failures, e.g. because of typedefs.list updates etc.  So I'm
against that, even if we had a patchprep target that didn't need
external tools.

Greetings,

Andres Freund




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Robert Haas
On Thu, Aug 13, 2020 at 3:47 PM Tom Lane  wrote:
> The traditional reason for not doing pgindent too often has been that
> it'd cause more work for people who have to rebase their patches over
> pgindent's results.  If we want to do it more often, then in order to
> respond to that concern, I think we need to do it really often ---
> not necessarily quite continuously, but often enough that pgindent
> is only changing recently-committed code.  In this way, it'd be likely
> that anyone with a patch touching that same code would only need to
> rebase once not twice.  The approaches involving an automated run
> give a guarantee of that, otherwise we don't have a guarantee; but
> as long as it's not many days delay I think it wouldn't be bad.
>
> Intervals on the order of a month seem likely to be the worst of
> both worlds from this standpoint --- too long for people to wait
> before rebasing their patch, yet short enough that they'd have
> to do so repeatedly.

Yeah, I get the point. It depends somewhat on how often you think
people will rebase. The main thing against more frequent pgindent runs
is that it clutters the history. If done manually, it's also a lot of
work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Libpq support to connect to standby server as priority

2020-08-13 Thread Robert Haas
On Fri, Dec 27, 2019 at 8:08 AM Alvaro Herrera  wrote:
> So, we can know whether server is primary/standby by checking
> in_recovery, as opposed to knowing whether read-write which is done by
> checking transaction_read_only.  So we can keep read-write as a synonym
> for "primary", and check in_recovery when used in servers that support
> the new GUC, and check transaction_read_only in older servers.

I think it would be better to have read-write and read-only check
transaction_read_only, and primary and standby can check the new
thing. There can never be any real advantage in having synonyms for
the same thing, but there can be an advantage to letting users choose
the behavior they want.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Tom Lane
Robert Haas  writes:
> I am not sure whether weekly or after-every-commit pgindent runs is a
> good idea, but I think we should try to do it once a month or so. It's
> too annoying otherwise. I could go either way on the question of
> automation.

The traditional reason for not doing pgindent too often has been that
it'd cause more work for people who have to rebase their patches over
pgindent's results.  If we want to do it more often, then in order to
respond to that concern, I think we need to do it really often ---
not necessarily quite continuously, but often enough that pgindent
is only changing recently-committed code.  In this way, it'd be likely
that anyone with a patch touching that same code would only need to
rebase once not twice.  The approaches involving an automated run
give a guarantee of that, otherwise we don't have a guarantee; but
as long as it's not many days delay I think it wouldn't be bad.

Intervals on the order of a month seem likely to be the worst of
both worlds from this standpoint --- too long for people to wait
before rebasing their patch, yet short enough that they'd have
to do so repeatedly.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Robert Haas
On Thu, Aug 13, 2020 at 12:30 PM Stephen Frost  wrote:
> So, in our world, wouldn't this translate to 'make cfbot complain'?

This seems like it would be useful, but we'd have to figure out what
to do about typedefs.list. If the patch is indented with the current
one (which is auto-generated by the entire build farm, remember) it's
likely to mess up a patch that's otherwise properly formatted. We'd
either need to insist that people include updates to typedefs.list in
the patch, or else have the cfbot take a stab at doing those updates
itself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Robert Haas
On Wed, Aug 12, 2020 at 7:47 PM Tom Lane  wrote:
> I'm not in favor of unsupervised pgindent runs, really.  It can do a lot
> of damage to code that was written without thinking about it --- in
> particular, it'll make a hash of comment blocks that were manually
> formatted and not protected with dashes.

No committer should be committing code without thinking about
pgindent. If some are, they need to up their game.

I am not sure whether weekly or after-every-commit pgindent runs are a
good idea, but I think we should try to do it once a month or so. It's
too annoying otherwise. I could go either way on the question of
automation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-08-13 Thread Robert Haas
On Thu, Aug 13, 2020 at 3:52 AM Ashutosh Sharma  wrote:
> This looks like a very good suggestion to me. I will do this change in
> the next version. Just wondering if we should be doing similar changes
> in other contrib modules (like pgrowlocks, pageinspect and
> pgstattuple) as well?

It seems like it should be consistent, but I'm not sure the proposed
change is really an improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: jsonb, collection & postgres_fdw

2020-08-13 Thread Tom Lane
Konstantin Knizhnik  writes:
> Or funccollid=0 doesn't mean that collations of function arguments do 
> not affect function behavior?

No, it does not.  As I said already, there is no way to tell from outside
a function whether it pays attention to collation or not.  funccollid
is the collation to ascribe to the function's *output*, but that's always
zero for a non-collatable output type such as boolean.  An example
is text_lt(), which returns boolean but surely does depend on the input
collation.  We don't really have any way to distinguish between that and
jsonb_exists().

In hindsight, it was probably a bad idea not to have a way to mark whether
functions care about collation.  I don't know if it'd be practical to
retrofit such a marker now.

regards, tom lane




Re: jsonb, collection & postgres_fdw

2020-08-13 Thread Konstantin Knizhnik




On 13.08.2020 20:00, Tom Lane wrote:

Konstantin Knizhnik  writes:

Right now jsonb functions are treated as non-shippable by postgres_fdw
and so predicates with them are not pushed down to foreign server:

Yeah, that's kind of annoying, but breaking the collation check
is not an acceptable fix.  And what you're proposing *does* break it.
The issue here is that the function's input collation is coming from
the default collation applied to the text constant, and we can't assume
that that will be the same on the remote side.

In reality, of course, jsonb_exists doesn't care about its input collation
--- but postgres_fdw has no way to know that.  I don't see any easy way
around that.

One idea that would probably work in a lot of postgres_fdw usage scenarios
is to have a foreign-server-level flag that says "all the collations on
that server behave the same as the local ones, and the default collation
is the same too", and then we just skip the collation checking altogether.
But I'm a bit worried that if someone mistakenly sets that flag, the
misbehavior will be very hard to detect.

regards, tom lane

Thank you for the clarification.
And sorry for the typo in the subject (it should be "collation" instead of
"collection").
Actually, I do not know much about handling collations in Postgres, and
particularly in postgres_fdw.
Can you (or somebody else) provide more information about this fragment
of code:

                /*
                 * If function's input collation is not derived from a foreign
                 * Var, it can't be sent to remote.
                 */
                if (fe->inputcollid == InvalidOid)
                     /* OK, inputs are all noncollatable */ ;
                else if (inner_cxt.state != FDW_COLLATE_SAFE ||
                         fe->inputcollid != inner_cxt.collation)
                    return false;

So we have a function call expression whose arguments have an associated
collation, but the function itself is collation-neutral: funccollid = 0.
Why is it not safe to push this function call to the remote server?
Why does it break the collation check?
If there are some unsafe operations with collations during argument
evaluation, then we detect them while recursively processing the arguments.

I agree that my proposed fix is not correct.
But what about this check:

                if (fe->inputcollid == InvalidOid)
                     /* OK, inputs are all noncollatable */ ;
                else if (fe->funccollid == InvalidOid)
                     /* OK, function is noncollatable */ ;

Or does funccollid = 0 not mean that the collations of the function's
arguments do not affect the function's behavior?






Re: jsonb, collection & postgres_fdw

2020-08-13 Thread Tom Lane
Konstantin Knizhnik  writes:
> Right now jsonb functions are treated as non-shippable by postgres_fdw 
> and so predicates with them are not pushed down to foreign server:

Yeah, that's kind of annoying, but breaking the collation check
is not an acceptable fix.  And what you're proposing *does* break it.
The issue here is that the function's input collation is coming from
the default collation applied to the text constant, and we can't assume
that that will be the same on the remote side.

In reality, of course, jsonb_exists doesn't care about its input collation
--- but postgres_fdw has no way to know that.  I don't see any easy way
around that.

One idea that would probably work in a lot of postgres_fdw usage scenarios
is to have a foreign-server-level flag that says "all the collations on
that server behave the same as the local ones, and the default collation
is the same too", and then we just skip the collation checking altogether.
But I'm a bit worried that if someone mistakenly sets that flag, the
misbehavior will be very hard to detect.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Magnus Hagander
On Thu, Aug 13, 2020 at 6:30 PM Stephen Frost  wrote:

>
> * Magnus Hagander (mag...@hagander.net) wrote:
>
> > There are many solutions that do such things but that do it in the
> "github
> > workflow" way, which is they do change -> commit -> create pull request,
> > and then somebody eyeballs that pullrequest and approves it. We don't
> have
> > PRs, but we could either have a script that simply sends out a patch to a
> > mailinglist, or we could have a script that maintains a separate branch
> > which is auto-pgindented, and then have a committer squash-merge that
> > branch after having reviewed it.
> >
> > Maybe something like that in combination with a pre-commit hook per
> above.
>
> So, in our world, wouldn't this translate to 'make cfbot complain'?
>
> I'm definitely a fan of the idea of having cfbot flag these and then we
> maybe get to a point where it's not the committers dealing with fixing
> patches that weren't pgindent'd properly, it's the actual patch
> submitters being nagged about it...
>

Well, that's one thing, but what I was thinking of here was more of an
automated branch maintained for the committers, not for the individual
patch owners.

That is:
1. Whenever a patch is pushed to master on the main repo, a process is kicked
off (or maybe waits 5 minutes to coalesce multiple patches if there are several)
2. This process checks out master, and runs pgindent on it
3. When done, this gets committed to a new branch (or just overwrites an
existing branch of course, we don't need to maintain history here) like
"master-indented". This branch can be in a different repo, but one that
starts out as a clone of the main one
4. A committer (any committer) can then, on a regular basis, examine the
differences by fetch + diff. If they're happy with it, cherry-pick it in.
If not, figure out what needs to be done to adjust it.

Step 4 can be done at whatever interval we prefer, and we can have
something to nag us if head has been "off-indent"  for too long.

This would be the backup for things that aren't indented during patch
commit, not other things.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Tom Lane
Stephen Frost  writes:
> So, in our world, wouldn't this translate to 'make cfbot complain'?

> I'm definitely a fan of the idea of having cfbot flag these and then we
> maybe get to a point where it's not the committers dealing with fixing
> patches that weren't pgindent'd properly, it's the actual patch
> submitters being nagged about it...

Meh.  Asking all submitters to install pgindent is a bit of a burden.
Moreover, sometimes it's better to provide a patch that deliberately
hasn't reindented existing code, for ease of review (say, when you're
adding if() { ... } around some big hunk of code).  I think getting
committers to do this as part of commit is a better workflow.

(Admittedly, since I've been doing that for a long time, I don't
find it to be a burden.  I suppose some committers do.)

regards, tom lane




Re: Parallel query hangs after a smart shutdown is issued

2020-08-13 Thread Tom Lane
I wrote:
> Hmmm ... maybe that should be more like
> if (smartShutState != SMART_NORMAL_USAGE &&
> backend_type == BACKEND_TYPE_NORMAL)

After some more rethinking and testing, here's a v5 that feels
fairly final to me.  I realized that the logic in canAcceptConnections
was kind of backwards: it's better to check the main pmState restrictions
first and then the smart-shutdown restrictions afterwards.

I'm assuming we want to back-patch this as far as 9.6, where parallel
query began to be a thing.

regards, tom lane

diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index e31275a04e..3946fa52ea 100644
--- a/doc/src/sgml/ref/pg_ctl-ref.sgml
+++ b/doc/src/sgml/ref/pg_ctl-ref.sgml
@@ -185,8 +185,8 @@ PostgreSQL documentation
stop mode shuts down the server that is running in
the specified data directory.  Three different
shutdown methods can be selected with the -m
-   option.  Smart mode waits for all active
-   clients to disconnect and any online backup to finish.
+   option.  Smart mode disallows new connections, then waits
+   for all existing clients to disconnect and any online backup to finish.
If the server is in hot standby, recovery and streaming replication
will be terminated once all clients have disconnected.
Fast mode (the default) does not wait for clients to disconnect and
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 38e2c16ac2..8a179cd691 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -148,8 +148,6 @@
 #define BACKEND_TYPE_BGWORKER	0x0008	/* bgworker process */
 #define BACKEND_TYPE_ALL		0x000F	/* OR of all the above */
 
-#define BACKEND_TYPE_WORKER		(BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER)
-
 /*
  * List of active backends (or child processes anyway; we don't actually
  * know whether a given child has become a backend or is still in the
@@ -304,8 +302,7 @@ static bool FatalError = false; /* T if recovering from backend crash */
  * and we switch to PM_RUN state.
  *
  * Normal child backends can only be launched when we are in PM_RUN or
- * PM_HOT_STANDBY state.  (We also allow launch of normal
- * child backends in PM_WAIT_BACKUP state, but only for superusers.)
+ * PM_HOT_STANDBY state.  (connsAllowed can also restrict launching.)
  * In other states we handle connection requests by launching "dead_end"
  * child processes, which will simply send the client an error message and
  * quit.  (We track these in the BackendList so that we can know when they
@@ -319,10 +316,10 @@ static bool FatalError = false; /* T if recovering from backend crash */
  *
  * Notice that this state variable does not distinguish *why* we entered
  * states later than PM_RUN --- Shutdown and FatalError must be consulted
- * to find that out.  FatalError is never true in PM_RECOVERY_* or PM_RUN
- * states, nor in PM_SHUTDOWN states (because we don't enter those states
- * when trying to recover from a crash).  It can be true in PM_STARTUP state,
- * because we don't clear it until we've successfully started WAL redo.
+ * to find that out.  FatalError is never true in PM_RECOVERY, PM_HOT_STANDBY,
+ * or PM_RUN states, nor in PM_SHUTDOWN states (because we don't enter those
+ * states when trying to recover from a crash).  It can be true in PM_STARTUP
+ * state, because we don't clear it until we've successfully started WAL redo.
  */
 typedef enum
 {
@@ -331,8 +328,7 @@ typedef enum
 	PM_RECOVERY,/* in archive recovery mode */
 	PM_HOT_STANDBY,/* in hot standby mode */
 	PM_RUN,		/* normal "database is alive" state */
-	PM_WAIT_BACKUP,/* waiting for online backup mode to end */
-	PM_WAIT_READONLY,			/* waiting for read only backends to exit */
+	PM_STOP_BACKENDS,			/* need to stop remaining backends */
 	PM_WAIT_BACKENDS,			/* waiting for live backends to exit */
 	PM_SHUTDOWN,/* waiting for checkpointer to do shutdown
  * ckpt */
@@ -344,6 +340,21 @@ typedef enum
 
 static PMState pmState = PM_INIT;
 
+/*
+ * While performing a "smart shutdown", we restrict new connections but stay
+ * in PM_RUN or PM_HOT_STANDBY state until all the client backends are gone.
+ * connsAllowed is a sub-state indicator showing the active restriction.
+ * It is of no interest unless pmState is PM_RUN or PM_HOT_STANDBY.
+ */
+typedef enum
+{
+	ALLOW_ALL_CONNS,			/* normal not-shutting-down state */
+	ALLOW_SUPERUSER_CONNS,		/* only superusers can connect */
+	ALLOW_NO_CONNS/* no new connections allowed, period */
+} ConnsAllowedState;
+
+static ConnsAllowedState connsAllowed = ALLOW_ALL_CONNS;
+
 /* Start time of SIGKILL timeout during immediate shutdown or child crash */
 /* Zero means timeout is not running */
 static time_t AbortStartTime = 0;
@@ -2323,7 +2334,7 @@ retry1:
 	(errcode(ERRCODE_TOO_MANY_CONNECTIONS),
 	 errmsg("sorry, too many clients already")));
 			break;
-		case CAC_WAITBACKUP:

Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> There's another option here as well that is a bit "softer": using a
> pre-commit hook.

Yeah, +1 on a pre-commit idea to help address this.  I certainly agree
with Andres that it's quite annoying to deal with commits coming in that
aren't indented properly but are in some file that I'm working on.

> This obviously only works in the case where we can rely on the committers
> to remember to install such a hook, but given the few committers that we do
> have, I think we can certainly get that up to an "acceptable rate" fairly
> easily. FWIW, this is similar to what we do in the pgweb, pgeu and a few
> other repositories, to ensure python styleguides are followed.

Yeah, no guarantee, but definitely seems like it'd be a good
improvement.

> > I was objecting to (2).  (1) would perhaps work.  (3) could be pretty
> > darn annoying, especially if it blocks a time-critical security patch.
> 
> FWIW, I agree that (2) seems like a really bad option. In that suddenly a
> committer has committed something they were not aware of.

Yeah, I dislike (2) a lot too.

> Yeah, I'm definitely not a big fan of automated commits, regardless of if
> it's just indent or both indent+typedef. It's happened at least once, and I
> think more than once, that we've had to basically hard reset the upstream
> repository and clean things up after automated commits have gone bonkers
> (hi, Bruce!). Having an automated system do the whole flow of
> change->commit->push definitely invites this type of problem.

Agreed, automated commits seems terribly risky.

> There are many solutions that do such things but that do it in the "github
> workflow" way, which is they do change -> commit -> create pull request,
> and then somebody eyeballs that pullrequest and approves it. We don't have
> PRs, but we could either have a script that simply sends out a patch to a
> mailinglist, or we could have a script that maintains a separate branch
> which is auto-pgindented, and then have a committer squash-merge that
> branch after having reviewed it.
> 
> Maybe something like that in combination with a pre-commit hook per above.

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Thanks,

Stephen




Re: Switch to multi-inserts for pg_depend

2020-08-13 Thread Alvaro Herrera
On 2020-Aug-13, Michael Paquier wrote:

> On Thu, Aug 13, 2020 at 05:35:14AM -0400, Alvaro Herrera wrote:
> > Next to the API definition I guess, is that dependency.h?
> 
> We need something more central, see also MAX_PGATTRIBUTE_INSERT_BYTES
> and MAX_PGSHDEPEND_INSERT_BYTES.  And the definition should be named
> something like MAX_CATALOG_INSERT_BYTES or such I guess.

MAX_CATALOG_INSERT_BYTES sounds decent to me.  I mentioned dependency.h
because I was uncaffeinatedly thinking that this was used with API
defined there -- but in reality it's used with indexing.h functions, and
it seems to me that that file would be the place for it.

Looking at the existing contents of catalog.h, I would say it does not
fit in there.
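
Concretely, something like this near the top of indexing.h (just a sketch;
the value below is a placeholder, not a recommendation):

/*
 * Maximum number of bytes to accumulate per multi-insert batch when
 * inserting into a catalog (placeholder value, for illustration only).
 */
#define MAX_CATALOG_INSERT_BYTES	(64 * 1024)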

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




jsonb, collection & postgres_fdw

2020-08-13 Thread Konstantin Knizhnik

Hi hackers,

Right now jsonb functions are treated as non-shippable by postgres_fdw,
so predicates using them are not pushed down to the foreign server:


create table jt(content jsonb);
create extension postgres_fdw;
create server pg_fdw FOREIGN DATA WRAPPER postgres_fdw
    options (host '127.0.0.1', dbname 'postgres');
create user mapping for current_user server pg_fdw
    options (user 'postgres');
create foreign table fjt(content jsonb) server pg_fdw
    options (table_name 'jt');

postgres=# explain select * from fjt where jsonb_exists(content, 'some');
                          QUERY PLAN
---------------------------------------------------------------
 Foreign Scan on fjt  (cost=100.00..157.50 rows=487 width=32)
   Filter: jsonb_exists(content, 'some'::text)

It is because of the following check in postgres_fdw:

                /*
                 * If function's input collation is not derived from a foreign
                 * Var, it can't be sent to remote.
                 */
                if (fe->inputcollid == InvalidOid)
                    /* OK, inputs are all noncollatable */ ;
                else if (inner_cxt.state != FDW_COLLATE_SAFE ||
                         fe->inputcollid != inner_cxt.collation)
                    return false;

In my case
(gdb) p fe->inputcollid
$1 = 100
(gdb) p inner_cxt.collation
$3 = 0
(gdb) p inner_cxt.state
$4 = FDW_COLLATE_NONE


I wonder if there is some way of making postgres_fdw push this
function to the foreign server?

Maybe this check should be changed to:

                if (fe->inputcollid == InvalidOid ||
                    inner_cxt.state == FDW_COLLATE_NONE)
                    /* OK, inputs are all noncollatable */ ;





Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-08-13 Thread Amit Kapila
On Thu, Aug 13, 2020 at 12:08 PM Amit Kapila  wrote:
>
> On Fri, Aug 7, 2020 at 2:04 PM Dilip Kumar  wrote:
> >
> > On Thu, Aug 6, 2020 at 2:43 PM Amit Kapila  wrote:
> > >
> ..
> > > This patch's functionality can be independently verified by SQL APIs
> >
> > Your changes look fine to me.
> >
>
> I have pushed that patch last week and attached are the remaining
> patches. I have made a few changes in the next patch
> 0001-Extend-the-BufFile-interface.patch and have some comments on it
> which are as below:
>

A few more comments on the latest patches:
v48-0002-Add-support-for-streaming-to-built-in-replicatio
1. It appears to me that we don't remove the temporary folders created
by the apply worker. So we end up with folders like
pgsql_tmp15324.0.sharedfileset in the base/pgsql_tmp directory even after
the apply worker exits. I think we can remove these by calling
PathNameDeleteTemporaryDir in SharedFileSetUnregister while removing
the fileset from the registered fileset list.

2.
+typedef struct SubXactInfo
+{
+ TransactionId xid; /* XID of the subxact */
+ int fileno; /* file number in the buffile */
+ off_t offset; /* offset in the file */
+} SubXactInfo;
+
+static uint32 nsubxacts = 0;
+static uint32 nsubxacts_max = 0;
+static SubXactInfo *subxacts = NULL;
+static TransactionId subxact_last = InvalidTransactionId;

Would it be better to move all the subxact-related variables (like
nsubxacts, nsubxacts_max, and subxact_last) inside the SubXactInfo struct,
as all this information is anyway related to sub-transactions?
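
One possible shape, just as an illustrative sketch (the wrapper struct and
its name are invented here, not taken from the patch), would be to keep the
SubXactInfo array together with its bookkeeping fields:

typedef struct ApplySubXactData
{
	uint32		nsubxacts;		/* number of sub-transactions */
	uint32		nsubxacts_max;	/* allocated size of the subxacts array */
	TransactionId subxact_last;	/* xid of the last sub-transaction */
	SubXactInfo *subxacts;		/* sub-transaction offsets in the buffile */
} ApplySubXactData;

static ApplySubXactData subxact_data = {0, 0, InvalidTransactionId, NULL};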

3.
+ /*
+ * If there is no subtransaction then nothing to do,  but if already have
+ * subxact file then delete that.
+ */

extra space before 'but' in the above sentence is not required.

v48-0001-Extend-the-BufFile-interface
4.
- * SharedFileSets can also be used by backends when the temporary files need
- * to be opened/closed multiple times and the underlying files need to survive
+ * SharedFileSets can be used by backends when the temporary files need to be
+ * opened/closed multiple times and the underlying files need to survive
  * across transactions.
  *

No need of 'also' in the above sentence.

-- 
With Regards,
Amit Kapila.




Re: security_context_t marked as deprecated in libselinux 3.1

2020-08-13 Thread Joe Conway
On 8/13/20 1:22 AM, Michael Paquier wrote:
> On Wed, Aug 12, 2020 at 10:50:21PM -0400, Tom Lane wrote:
>> Ummm ... aren't you going to get some cast-away-const warnings now?
>> Or are all of the called functions declared as taking "const char *"
>> not just "char *"?
> 
> Let me see..  The function signatures we use have been visibly changed
> in 9eb9c932, which comes down to a point between 2.2.2 and 2.3, and
> there are two of them we care about, both use now "const char *":
> - security_check_context_raw()
> - security_compute_create_name_raw()
> We claim in the docs that the minimum version of libselinux supported
> is 2.1.10 (7a86fe1a from march 2012).
> 
> Then, the only buildfarm animal I know of testing selinux is
> rhinoceros, that uses CentOS 7.1, and this visibly already bundles
> libselinux 2.5 that was released in 2016 (2b69984), per the RPM list
> here:
> http://mirror.centos.org/centos/7/
> Joe, what's the version of libselinux used in rhinoceros?  2.5?


rpm -q libselinux
libselinux-2.5-15.el7.x86_64


Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [BUG] Error in BRIN summarization

2020-08-13 Thread Anastasia Lubennikova

On 12.08.2020 22:58, Alvaro Herrera wrote:

On 2020-Aug-12, Alvaro Herrera wrote:


'anyvisible' mode is not required AFAICS; reading the code, I think this
could also hit REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY, which
do not use that flag.  I didn't try to reproduce it there, though.
Anyway, I'm going to remove that Assert() I added.

So this is what I propose as the final form of the fix.


Cool.
This version looks much simpler than mine and passes the tests fine.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Switch to multi-inserts for pg_depend

2020-08-13 Thread Michael Paquier
On Thu, Aug 13, 2020 at 05:35:14AM -0400, Alvaro Herrera wrote:
> Next to the API definition I guess, is that dependency.h?

We need something more central, see also MAX_PGATTRIBUTE_INSERT_BYTES
and MAX_PGSHDEPEND_INSERT_BYTES.  And the definition should be named
something like MAX_CATALOG_INSERT_BYTES or such I guess.
--
Michael




Missing unknown status in PGresult

2020-08-13 Thread Hao Wu
Hi hackers,

PGRES_FATAL_ERROR is the result status seen on the client side in most cases
when the backend process raises an error: when a query fails to execute,
PGRES_FATAL_ERROR is returned.
But PGRES_FATAL_ERROR is also returned in another case: the network connection
between the client and the backend breaks immediately after the backend process
has committed locally. The client then gets the same result status
(PGRES_FATAL_ERROR) as for a normal error (transaction failed), even though the
transaction actually succeeded.

When libpq detects an EOF, a PGresult with status PGRES_FATAL_ERROR is
returned to the client.
We can inspect the error message in the PGresult to see why the error was
returned, but that is unreliable and tricky.

The result status should be "unknown" in the above case: the server has
received the request, but the client never gets a response, so the transaction
may have either succeeded or failed.
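
For illustration (just a sketch, not a proposal for libpq itself), the best a
client can currently do is to also check the connection status, and even then
the outcome of the transaction remains unknown:

#include <stdio.h>
#include <libpq-fe.h>

static void
finish_commit(PGconn *conn)
{
    PGresult   *res = PQexec(conn, "COMMIT");

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        if (PQstatus(conn) == CONNECTION_BAD)
            /* connection lost: the commit may have succeeded or failed */
            fprintf(stderr, "connection lost, transaction outcome unknown\n");
        else
            /* a genuine server-reported error */
            fprintf(stderr, "COMMIT failed: %s", PQerrorMessage(conn));
    }
    PQclear(res);
}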

Regards,
Hao Wu


Re: Switch to multi-inserts for pg_depend

2020-08-13 Thread Alvaro Herrera
On 2020-Aug-13, Michael Paquier wrote:

> Okay.  Would src/include/catalog/catalog.h be a suited location for
> this flag or somebody has a better idea?

Next to the API definition I guess, is that dependency.h?



-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-08-13 Thread Ashutosh Sharma
Hi Asim,

Thanks for having a look into the patch and for sharing your feedback.
Please find my comments inline below:

On Thu, Aug 13, 2020 at 12:36 PM Asim Praveen  wrote:
>
> Hi Ashutosh
>
> I stumbled upon this thread today, went through your patch and it looks good. 
>  A minor suggestion in sanity_check_relation():
>
> if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
> ereport(ERROR,
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>  errmsg("only heap AM is supported")));
>
> Instead of checking the access method OID, it seems better to check the 
> handler OID like so:
>
> if (rel->rd_amhandler != HEAP_TABLE_AM_HANDLER_OID)
>
> The reason is that the current version of sanity_check_relation() would emit
> an error for the following case even when the table structure is actually heap.
>
> create access method myam type table handler heap_tableam_handler;
> create table mytable (…) using myam;
>

This looks like a very good suggestion to me. I will make this change in
the next version. Should we also be making similar changes in other
contrib modules (like pgrowlocks, pageinspect, and pgstattuple)?

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com




Re: run pgindent on a regular basis / scripted manner

2020-08-13 Thread Magnus Hagander
On Thu, Aug 13, 2020 at 6:08 AM Tom Lane  wrote:

> Noah Misch  writes:
> > On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:
> >> If the workflow is commit first and re-indent later, then we can always
> >> revert the pgindent commit and clean things up manually; but an auto
> >> re-indent during commit wouldn't provide that history.
>
> > There are competing implementations of assuring pgindent-cleanliness at
> every
> > check-in:
>
> > 1. After each push, an automated followup commit appears, restoring
> >pgindent-cleanliness.
> > 2. "git push" results in a commit that melds pgindent changes into what
> the
> >committer tried to push.
> > 3. "git push" fails, for the master branch, if the pushed commit disrupts
> >pgindent-cleanliness.
>

There's another option here as well that is a bit "softer": using a
pre-commit hook.

That is, it's a hook that runs on the committer's machine prior to the
commit. This hook can then yell "hey, you need to run pgindent before
committing this", but it gives the committer the ability to do --no-verify
and commit anyway (so it won't block things that are urgent).

Since it allows a simple bypass, and very much relies on the committer to
remember to install the hook in their local repository, this is not a
guarantee in any way. So it might need to be combined with something else
running in the background, like a daily job, but it would make that
background work smaller and involve fewer changes.

This obviously only works in the case where we can rely on the committers
to remember to install such a hook, but given the few committers that we do
have, I think we can certainly get that up to an "acceptable rate" fairly
easily. FWIW, this is similar to what we do in the pgweb, pgeu and a few
other repositories, to ensure python styleguides are followed.


> Were you thinking of (2)?
>
> I was objecting to (2).  (1) would perhaps work.  (3) could be pretty
> darn annoying, especially if it blocks a time-critical security patch.
>

FWIW, I agree that (2) seems like a really bad option. In that suddenly a
committer has committed something they were not aware of.



>
> > Regarding typedefs.list, I would use the in-tree one, like you discussed
> here:
>
> Yeah, after thinking about that more, it seems like automated
> typedefs.list updates would be far riskier than automated reindent
> based on the existing typedefs.list.  The latter could at least be
> expected not to change code unrelated to the immediate commit.
> typedefs.list updates need some amount of adult supervision.
>
> (I'd still vote for nag-mail to the committer whose work got reindented,
> in case the bot made things a lot worse.)
>

Yeah, I'm definitely not a big fan of automated commits, regardless of if
it's just indent or both indent+typedef. It's happened at least once, and I
think more than once, that we've had to basically hard reset the upstream
repository and clean things up after automated commits have gone bonkers
(hi, Bruce!). Having an automated system do the whole flow of
change->commit->push definitely invites this type of problem.

There are many solutions that do such things but that do it in the "github
workflow" way, which is they do change -> commit -> create pull request,
and then somebody eyeballs that pullrequest and approves it. We don't have
PRs, but we could either have a script that simply sends out a patch to a
mailinglist, or we could have a script that maintains a separate branch
which is auto-pgindented, and then have a committer squash-merge that
branch after having reviewed it.

Maybe something like that in combination with a pre-commit hook per above.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)

2020-08-13 Thread Julien Rouhaud
On Fri, Aug 7, 2020 at 3:51 PM Pierre Giraud  wrote:
>
> Le 07/08/2020 à 14:52, Julien Rouhaud a écrit :
> > Hi,
> >
> > On Fri, Aug 7, 2020 at 2:30 PM Pierre Giraud  
> > wrote:
> >>
> >> Hi all,
> >>
> >> As far as I understand, in the upcoming version 13, information about
> >> buffers used during planning is now available in the explain plan.
> >
> > Indeed.
> >
> >> […]
> >>  Planning Time: 0.203 ms
> >>Buffers: shared hit=14
> >> […]
> >>
> >> In the JSON format, the data structure is a bit different:
> >>
> >> […]
> >>  "Planning": {
> >>"Planning Time": 0.533,
> >>"Shared Hit Blocks": 14,
> >>"Shared Read Blocks": 0,
> >>"Shared Dirtied Blocks": 0,
> >>"Shared Written Blocks": 0,
> >>"Local Hit Blocks": 0,
> >>"Local Read Blocks": 0,
> >>"Local Dirtied Blocks": 0,
> >>"Local Written Blocks": 0,
> >>"Temp Read Blocks": 0,
> >>"Temp Written Blocks": 0
> >>  },
> >> […]
> >>
> >> For a matter of consistency, I wonder if it would be possible to format
> >> it like the following:
> >>
> >> […]
> >>  Planning:
> >>Planning Time: 0.203 ms
> >>Buffers: shared hit=14
> >> […]
> >
> > I agree that this output looks more consistent with other output,
> > including JIT as you mentioned.  I'll send a patch for that if there's
> > no objection.
>
> Thanks a lot!
>
> >
> > Out of curiosity, is the current text output actually harder to parse
> > than the one you're suggesting?
> >
>
> I don't want to speak on behalf of the developers of other parsing tools,
> but parsing the output I'm proposing should not require a lot of work.
> It would be nice to have their opinion though, especially Hubert depesz
> Lubaczewski's, since he has already integrated the change:
> https://gitlab.com/depesz/Pg--Explain/-/commit/4a760136ee69ee4929625d4e4022f79ac60b763f
>
> However, as far as I know, he's not doing anything with the buffers
> information in the "Planning" section yet.
>
> To answer your question, I think that the new output would make the
> parser a little bit easier to write because it would make things a bit
> clearer (i.e. more separated) and thus less prone to errors.

Adding depesz.




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-08-13 Thread Asim Praveen
Hi Ashutosh

I stumbled upon this thread today, went through your patch and it looks good.  
A minor suggestion in sanity_check_relation():

if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("only heap AM is supported")));

Instead of checking the access method OID, it seems better to check the handler 
OID like so:

if (rel->rd_amhandler != HEAP_TABLE_AM_HANDLER_OID)

The reason is that the current version of sanity_check_relation() would emit
an error for the following case even when the table structure is actually heap.

create access method myam type table handler heap_tableam_handler;
create table mytable (…) using myam;
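
Put together, the check would then read (just a sketch):

if (rel->rd_amhandler != HEAP_TABLE_AM_HANDLER_OID)
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("only heap AM is supported")));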

Asim


Re: WIP: WAL prefetch (another approach)

2020-08-13 Thread Thomas Munro
On Thu, Aug 6, 2020 at 10:47 PM Tomas Vondra
 wrote:
> On Thu, Aug 06, 2020 at 02:58:44PM +1200, Thomas Munro wrote:
> >On Tue, Aug 4, 2020 at 3:47 AM Tomas Vondra
> >> Any luck trying to reproduce thigs? Should I try again and collect some
> >> additional debug info?
> >
> >No luck.  I'm working on it now, and also trying to reduce the
> >overheads so that we're not doing extra work when it doesn't help.
>
> OK, I'll see if I can still reproduce it.

Since someone else asked me off-list, here's a rebase, with no
functional changes.  Soon I'll post a new improved version, but this
version just fixes the bitrot and hopefully turns cfbot green.
From 630b329de06705c09ab2372f2fb8f102d1f1f701 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sat, 28 Mar 2020 11:42:59 +1300
Subject: [PATCH v10 1/3] Add pg_atomic_unlocked_add_fetch_XXX().

Add a variant of pg_atomic_add_fetch_XXX with no barrier semantics, for
cases where you only want to avoid the possibility that a concurrent
pg_atomic_read_XXX() sees a torn/partial value.

Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
---
 src/include/port/atomics.h | 24 ++
 src/include/port/atomics/generic.h | 33 ++
 2 files changed, 57 insertions(+)

diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index 4956ec55cb..2abb852893 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -389,6 +389,21 @@ pg_atomic_add_fetch_u32(volatile pg_atomic_uint32 *ptr, int32 add_)
 	return pg_atomic_add_fetch_u32_impl(ptr, add_);
 }
 
+/*
+ * pg_atomic_unlocked_add_fetch_u32 - atomically add to variable
+ *
+ * Like pg_atomic_unlocked_write_u32, guarantees only that partial values
+ * cannot be observed.
+ *
+ * No barrier semantics.
+ */
+static inline uint32
+pg_atomic_unlocked_add_fetch_u32(volatile pg_atomic_uint32 *ptr, int32 add_)
+{
+	AssertPointerAlignment(ptr, 4);
+	return pg_atomic_unlocked_add_fetch_u32_impl(ptr, add_);
+}
+
 /*
  * pg_atomic_sub_fetch_u32 - atomically subtract from variable
  *
@@ -519,6 +534,15 @@ pg_atomic_sub_fetch_u64(volatile pg_atomic_uint64 *ptr, int64 sub_)
 	return pg_atomic_sub_fetch_u64_impl(ptr, sub_);
 }
 
+static inline uint64
+pg_atomic_unlocked_add_fetch_u64(volatile pg_atomic_uint64 *ptr, int64 add_)
+{
+#ifndef PG_HAVE_ATOMIC_U64_SIMULATION
+	AssertPointerAlignment(ptr, 8);
+#endif
+	return pg_atomic_unlocked_add_fetch_u64_impl(ptr, add_);
+}
+
 #undef INSIDE_ATOMICS_H
 
 #endif			/* ATOMICS_H */
diff --git a/src/include/port/atomics/generic.h b/src/include/port/atomics/generic.h
index d60a0d9e7f..3e1598d8ff 100644
--- a/src/include/port/atomics/generic.h
+++ b/src/include/port/atomics/generic.h
@@ -234,6 +234,16 @@ pg_atomic_add_fetch_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 }
 #endif
 
+#if !defined(PG_HAVE_ATOMIC_UNLOCKED_ADD_FETCH_U32)
+#define PG_HAVE_ATOMIC_UNLOCKED_ADD_FETCH_U32
+static inline uint32
+pg_atomic_unlocked_add_fetch_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
+{
+	ptr->value += add_;
+	return ptr->value;
+}
+#endif
+
 #if !defined(PG_HAVE_ATOMIC_SUB_FETCH_U32) && defined(PG_HAVE_ATOMIC_FETCH_SUB_U32)
 #define PG_HAVE_ATOMIC_SUB_FETCH_U32
 static inline uint32
@@ -399,3 +409,26 @@ pg_atomic_sub_fetch_u64_impl(volatile pg_atomic_uint64 *ptr, int64 sub_)
 	return pg_atomic_fetch_sub_u64_impl(ptr, sub_) - sub_;
 }
 #endif
+
+#if defined(PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY) && \
+	!defined(PG_HAVE_ATOMIC_U64_SIMULATION)
+
+#ifndef PG_HAVE_ATOMIC_UNLOCKED_ADD_FETCH_U64
+#define PG_HAVE_ATOMIC_UNLOCKED_ADD_FETCH_U64
+static inline uint64
+pg_atomic_unlocked_add_fetch_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val)
+{
+	ptr->value += val;
+	return ptr->value;
+}
+#endif
+
+#else
+
+static inline uint64
+pg_atomic_unlocked_add_fetch_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val)
+{
+	return pg_atomic_add_fetch_u64_impl(ptr, val);
+}
+
+#endif
-- 
2.20.1
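
As a usage sketch (not part of the patch series), the new unlocked add-fetch
variant is intended for cases like a counter with a single writer, where
concurrent readers only need protection against torn values rather than any
ordering guarantee:

#include "postgres.h"
#include "port/atomics.h"

/* illustrative only: a counter with exactly one writer */
static pg_atomic_uint64 recs_read;

void
counter_init(void)
{
	pg_atomic_init_u64(&recs_read, 0);
}

void
counter_bump(void)
{
	/* single writer: no barrier needed, only torn-write protection */
	pg_atomic_unlocked_add_fetch_u64(&recs_read, 1);
}

uint64
counter_peek(void)
{
	/* readers see a consistent (possibly slightly stale) value */
	return pg_atomic_read_u64(&recs_read);
}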

From cdfaedd530b5b60171952dc21ad3e2a5a3e6451b Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 7 Apr 2020 22:56:27 +1200
Subject: [PATCH v10 2/3] Allow XLogReadRecord() to be non-blocking.

Extend read_local_xlog_page() to support non-blocking modes:

1. Reading as far as the WAL receiver has written so far.
2. Reading all the way to the end, when the end LSN is unknown.

Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
---
 src/backend/access/transam/xlogreader.c |  37 --
 src/backend/access/transam/xlogutils.c  | 149 +---
 src/backend/replication/walsender.c |   2 +-
 src/include/access/xlogreader.h |  14 ++-
 src/include/access/xlogutils.h  |  26 +
 5 files changed, 173 insertions(+), 55 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 67996018da..aad9fc2ce1 100644
--- a/src/backend/access/transam/xlogreader.c
+++