Re: archive_command / pg_stat_archiver & documentation

2021-02-28 Thread Julien Rouhaud
On Mon, Mar 1, 2021 at 3:36 PM Michael Paquier  wrote:
>
> On Fri, Feb 26, 2021 at 10:03:05AM +0100, Benoit Lobréau wrote:
> > Done here : https://commitfest.postgresql.org/32/3012/
>
> Documenting that properly for the archive command, as already done for
> restore_command, sounds good to me.  I am not sure that there is much
> point in doing a cross-reference to the archiving section for one
> specific field of pg_stat_archiver.

Agreed.

> For the second paragraph, I would recommend moving that to a
> different <para> to outline this special case, leading to the
> attached.

+1

> What do you think?

LGTM!




Re: proposal: schema variables

2021-02-28 Thread Pavel Stehule
út 16. 2. 2021 v 18:46 odesílatel Pavel Stehule 
napsal:

> Hi
>
> út 2. 2. 2021 v 9:43 odesílatel Pavel Stehule 
> napsal:
>
>> Hi
>>
>> rebase and set PK for pg_variable table
>>
>
> rebase
>

rebase



> Pavel
>
>
>> Regards
>>
>> Pavel
>>
>


schema-variables-20200301.patch.gz
Description: application/gzip


Re: archive_command / pg_stat_archiver & documentation

2021-02-28 Thread Michael Paquier
On Fri, Feb 26, 2021 at 10:03:05AM +0100, Benoit Lobréau wrote:
> Done here : https://commitfest.postgresql.org/32/3012/

Documenting that properly for the archive command, as already done for
restore_command, sounds good to me.  I am not sure that there is much
point in doing a cross-reference to the archiving section for one
specific field of pg_stat_archiver.

For the second paragraph, I would recommend moving that to a
different <para> to outline this special case, leading to the
attached.

What do you think?
--
Michael
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 21094c6a9d..c5557d5444 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -639,6 +639,15 @@ test ! -f /mnt/server/archivedir/000100A90065 && cp pg_wal/0
 it will try again periodically until it succeeds.

 
+   
+When the archive command is terminated by a signal (other than
+SIGTERM that is used as part of a server
+shutdown) or an error by the shell with an exit status greater than
+125 (such as command not found), the archiver process aborts and gets
+restarted by the postmaster. In such cases, the failure is
+not reported in .
+   
+

 The archive command should generally be designed to refuse to overwrite
 any pre-existing archive file.  This is an important safety feature to


signature.asc
Description: PGP signature


[PATCH] postgres-fdw: column option to override foreign types

2021-02-28 Thread Dian M Fay
Full use of a custom data type with postgres_fdw currently requires the
type be maintained in both the local and remote databases. `CREATE
FOREIGN TABLE` does not check declared types against the remote table,
but declaring e.g. a remote enum to be local text works only partway, as
seen here. A simple select query against alpha_items returns the enum
values as text; however, *filtering* on the column yields an error.

create database alpha;
create database beta;

\c alpha

create type itemtype as enum ('one', 'two', 'three');
create table items (
  id serial not null primary key,
  type itemtype not null
);
insert into items (type) values ('one'), ('one'), ('two');

\c beta

create extension postgres_fdw;
create server alpha foreign data wrapper postgres_fdw options (dbname 'alpha', 
host 'localhost', port '5432');
create user mapping for postgres server alpha options (user 'postgres');

create foreign table alpha_items (
  id int,
  type text
) server alpha options (table_name 'items');
select * from alpha_items; -- ok
select * from alpha_items where type = 'one';

ERROR:  operator does not exist: public.itemtype = text
HINT:  No operator matches the given name and argument types. You might need to 
add explicit type casts.
CONTEXT:  remote SQL command: SELECT id, type FROM public.items WHERE ((type = 
'one'::text))

The attached changeset adds a new boolean option for postgres_fdw
foreign table columns, `use_local_type`. When true, ColumnRefs for the
relevant attribute will be deparsed with a cast to the type defined in
`CREATE FOREIGN TABLE`.

create foreign table alpha_items (
  id int,
  type text options (use_local_type 'true')
) server alpha options (table_name 'items');
select * from alpha_items where type = 'one'; -- succeeds

This builds and checks, with a new regression test and documentation.
From 09961e10daf72e2c1fbbdddb05b2940b2da14df0 Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Mon, 1 Mar 2021 00:44:15 -0500
Subject: [PATCH] postgres_fdw: column option to override foreign types

Enabling the use_local_type option on a foreign table column will
deparse it with a cast to the locally-declared type, allowing overrides
of foreign types not known to the local database.
---
 contrib/postgres_fdw/deparse.c | 15 ---
 contrib/postgres_fdw/expected/postgres_fdw.out | 17 +
 contrib/postgres_fdw/option.c  |  1 +
 contrib/postgres_fdw/sql/postgres_fdw.sql  | 10 ++
 doc/src/sgml/postgres-fdw.sgml | 13 +
 5 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 6faf499f9a..c722d49316 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -2264,6 +2264,7 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
else
{
char   *colname = NULL;
+   bool   uselocaltype = false;
List   *options;
ListCell   *lc;
 
@@ -2280,10 +2281,9 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
DefElem*def = (DefElem *) lfirst(lc);
 
if (strcmp(def->defname, "column_name") == 0)
-   {
colname = defGetString(def);
-   break;
-   }
+   else if (strcmp(def->defname, "use_local_type") == 0)
+   uselocaltype = defGetBoolean(def);
}
 
/*
@@ -2297,6 +2297,15 @@ deparseColumnRef(StringInfo buf, int varno, int varattno, RangeTblEntry *rte,
ADD_REL_QUALIFIER(buf, varno);
 
appendStringInfoString(buf, quote_identifier(colname));
+
+   if (uselocaltype)
+   {
+   Oid coltype = get_atttype(rte->relid, varattno);
+   char *typname = deparse_type_name(coltype, -1);
+
+   appendStringInfoString(buf, "::");
+   appendStringInfoString(buf, typname);
+   }
}
 }
 
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 0649b6b81c..c3271bac8f 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4105,6 +4105,23 @@ ERROR:  invalid input syntax for type integer: "foo"
 CONTEXT:  processing expression at position 2 in select list
 ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE user_enum;
 -- ===
+-- conversion with use_local_type
+-- ===
+ALTER FOREIGN TABLE ft1 ALTER COLUMN c8 TYPE text;
+SELECT * FROM ft1 WHERE c8 = 'foo' LIMIT 1; -- ERROR
+ERROR:  operator does not 

Re: macOS SIP, next try

2021-02-28 Thread Peter Eisentraut


So, we haven't gotten anywhere satisfying with these proposed technical 
solutions.


I have since learned that there is a way to disable only the part of SIP 
that is relevant for us.  This seems like a useful compromise, and it 
appears that a number of other open-source projects are following the 
same route.  I suggest the attached documentation patch and then close 
this issue.
From 7efb0ec3e15f37f9c5e12845aeccd9cd8693c01d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 1 Mar 2021 07:58:17 +0100
Subject: [PATCH] doc: Update information on macOS SIP

On more recent versions of macOS, it is sufficient to disable only a
part of SIP in order to get make check to run before make install.
Document how.
---
 doc/src/sgml/installation.sgml | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 66ad4ba938..39adf1f5d9 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -2375,6 +2375,9 @@ macOS
 You may or may not wish to also install Xcode.

 
+   
+Sysroot
+

 On recent macOS releases, it's necessary to
 embed the sysroot path in the include switches used to
@@ -2419,6 +2422,10 @@ macOS
 to build with a non-Apple compiler, but beware that that case is not
 tested or supported by the PostgreSQL developers.

+   
+
+   
+System Integrity Protection
 

 macOS's System Integrity
@@ -2429,6 +2436,25 @@ macOS
 install before make check.
 Most PostgreSQL developers just turn off SIP, though.

+
+   
+To disable SIP, boot into recovery mode, open a terminal, and run
+
+csrutil disable
+
+   and reboot.  In macOS version 10.14 and later, it is sufficient to disable
+   the Debugging part of SIP, by running
+
+csrutil enable --without debug
+
+instead.  The status of SIP can be shown using
+
+csrutil status
+
+Note that that status display does not reflect changes until after a
+reboot.
+   
+   
   
 
   
-- 
2.30.1



Re: Side effect of remove_useless_groupby_columns

2021-02-28 Thread Richard Guo
Hi,

On Sun, Feb 28, 2021 at 5:15 PM David Rowley  wrote:

> On Sun, 28 Feb 2021 at 20:52, Richard Guo  wrote:
> > When looking at [1], I realized we may have a side effect when removing
> > redundant columns in the GROUP BY clause. Suppose we have a query with
> > ORDER BY 'b', and meanwhile column 'b' is also a group key. If we decide
> > that 'b' is redundant due to being functionally dependent on other GROUP
> > BY columns, we would remove it from group keys. This will make us lose
> > the opportunity to leverage the index on 'b'.
>
> > Any thoughts?
>
> I had initially thought that this was a non-issue before incremental
> sort was added, but the following case highlights that's wrong.
>
> # create table t (a int not null, b int);
> # insert into t select i, i%1000 from generate_series(1,100)i;
> # create index on t(b,a);
>
> # explain (analyze) select b from t group by a, b order by b,a limit 10;
> # alter table t add constraint t_pkey primary key (a);
> # explain (analyze) select b from t group by a, b order by b,a limit 10;
>
> Execution slows down after adding the primary key since that allows
> the functional dependency to be detected and the  "b" column removed
> from the GROUP BY clause.
>
> Perhaps we could do something like add:
>
> diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
> index 545b56bcaf..7e52212a13 100644
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> @@ -3182,7 +3182,8 @@ remove_useless_groupby_columns(PlannerInfo *root)
>  			if (!IsA(var, Var) ||
>  				var->varlevelsup > 0 ||
>  				!bms_is_member(var->varattno - FirstLowInvalidHeapAttributeNumber,
> -							   surplusvars[var->varno]))
> +							   surplusvars[var->varno]) ||
> +				list_member(parse->sortClause, sgc))
>  				new_groupby = lappend(new_groupby, sgc);
>  		}
>
> to remove_useless_groupby_columns().  It's not exactly perfect though
> since it's still good to remove the useless GROUP BY columns if
> there's no useful index to implement the ORDER BY.  We just don't know
> that when we call remove_useless_groupby_columns().
>

Or, rather than thinking of it as a side effect of removing useless groupby
columns, how about we do an additional optimization, 'adding useful
groupby columns', to add into the group keys a column which matches the
ORDER BY and is functionally dependent on other GROUP BY columns?
What I'm thinking of is a scenario like below.

# create table t (a int primary key, b int, c int);
# insert into t select i, i%1000, i%1000 from generate_series(1,100)i;
# create index on t(b);

# explain (analyze) select b from t group by a, c order by b limit 10;

For this query, we can first remove 'c' from groupby columns with the
existing optimization of 'removing useless groupby columns', and then
add 'b' to groupby columns with the new-added optimization of 'adding
useful groupby columns'.

We can do that because we know 'b' is functionally dependent on 'a', and
we are required to be sorted by 'b', and meanwhile there is an index on 'b'
that we can leverage.

Thanks
Richard


Re: Parallel INSERT (INTO ... SELECT ...)

2021-02-28 Thread Amit Langote
On Mon, Mar 1, 2021 at 12:38 PM Greg Nancarrow  wrote:
> On Fri, Feb 26, 2021 at 5:50 PM Amit Langote  wrote:
> >
> > On Fri, Feb 26, 2021 at 3:35 PM Greg Nancarrow  wrote:
> > > On Fri, Feb 26, 2021 at 4:07 PM Amit Langote  
> > > wrote:
> > > > The attached patch fixes this, although I am starting to have second
> > > > thoughts about how we're tracking partitions in this patch.  Wondering
> > > > if we should bite the bullet and add partitions into the main range
> > > > table instead of tracking them separately in partitionOids, which
> > > > might result in a cleaner patch overall.
> > >
> > > Thanks Amit,
> > >
> > > I was able to reproduce the problem using your instructions (though I
> > > found I had to run that explain an extra time, in order to hit the
> > > breakpoint).
> > > Also, I can confirm that the problem doesn't occur after application
> > > of your patch.
> > > I'll leave it to your better judgement as to what to do next  - if you
> > > feel the current tracking method is not sufficient
> >
> > Just to be clear, I think the tracking method added by the patch is
> > sufficient AFAICS for the problems we were able to discover.  The
> > concern I was trying to express is that we seem to be duct-taping
> > holes in our earlier chosen design to track partitions separately from
> > the range table.  If we had decided to add partitions to the range
> > table as "extra" target relations from the get-go, both the issues I
> > mentioned with cached plans -- partitions not being counted as a
> > dependency and partitions not being locked before execution -- would
> > have been prevented.  I haven't fully grasped how invasive that design
> > would be, but it sure sounds like it would be a bit more robust.
>
> Posting an updated set of patches that includes Amit Langote's patch
> to the partition tracking scheme...

Thanks Greg.

> (the alternative of adding partitions to the range table needs further
> investigation)

I looked at this today with an intention to write and post a PoC
patch, but was quickly faced with the task of integrating that design
with how ModifyTable works for inserts.  Specifically if, in addition
to adding partitions to the range table, we also add their RT indexes to
ModifyTable.resultRelations, then maybe we'll need to rethink our
executor tuple routing code.  That code currently tracks the insert's
target partitions separately from the range table, exactly because
they are not there to begin with.  So if we change the latter as in
this hypothetical PoC, maybe we should also revisit the former.  Also,
it would not be great that the planner's arrays in PlannerInfo would
get longer as a result of Query.rtable getting longer by adding
partitions, thus making all the loops over those arrays slower for no
benefit.

So, let's do this in a follow-on patch, if at all, instead of
considering this topic a blocker for committing the current set of
patches.




--
Amit Langote
EDB: http://www.enterprisedb.com




Re: [PATCH] refactor ATExec{En,Dis}ableRowSecurity

2021-02-28 Thread Michael Paquier
On Sun, Feb 28, 2021 at 02:27:44PM -0800, Zhihong Yu wrote:
> For 0002-Further-refactoring.patch, should there be assertion
> inside ATExecSetRowSecurity() on the values for rls and force_rls ?
> There could be 3 possible values: -1, 0 and 1.

0001 is a clean simplification and a good catch, so I'll see about
applying it.  0002 just makes the code more confusing to the reader
IMO, and its interface could easily lead to unwanted errors.
--
Michael


signature.asc
Description: PGP signature


Re: doc review for v14

2021-02-28 Thread Michael Paquier
On Sun, Feb 28, 2021 at 10:33:55PM -0600, Justin Pryzby wrote:
> I still have an issue with the sentence that begins:
> "Checksums are normally enabled..."

You could say here "Checksums can be enabled", but "normally" does not
sound bad to me either, as it emphasizes the fact that it is better to
do that when the cluster is initdb'd, since this has no downtime compared
to enabling checksums on an existing cluster.

> Note, the patch I sent said "create" but should be "created".

Initialized sounds better to me, FWIW.

> "page corruptions" is wrong .. you could say "corrupt pages"

"corruptED pages" would sound more correct to me as something that has
already happened.  Anyway, I'd rather keep what I am proposing
upthread.
--
Michael


signature.asc
Description: PGP signature


Re: simplifying foreign key/RI checks

2021-02-28 Thread Corey Huinker
>
> > It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I
> > know that doesn't mean the usefulness of the macro but the mechanism
> > the macro suggests, but it is confusing.) On the other hand,
> > RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no
> > longer used.  (Couldn't we remove them?)
>
> Yeah, better to just remove those _PK macros and say this module no
> longer runs any queries on the PK table.
>
> How about the attached?
>
>
Sorry for the delay.
I see that the changes were made as described.
Passes make check and make check-world yet again.
I'm marking this Ready For Committer unless someone objects.


Re: repeated decoding of prepared transactions

2021-02-28 Thread Amit Kapila
On Mon, Mar 1, 2021 at 7:23 AM Ajin Cherian  wrote:
>

Pushed, the first patch in the series.

-- 
With Regards,
Amit Kapila.




Re: [HACKERS] Custom compression methods

2021-02-28 Thread Dilip Kumar
On Sat, Feb 27, 2021 at 9:35 PM Justin Pryzby  wrote:
>
> > Subject: [PATCH v28 3/4] Add default_toast_compression GUC
>
> This part isn't working.  My first patch worked somewhat better: due to doing
> strcmp() with the default GUC, it avoided using the cached AM OID.  (But it
> would've failed with more than 2 AMs, since the cache wasn't invalidated, 
> since
> I couldn't tell when it was needed).
>
> Your patch does this:
>
> |postgres=# SET default_toast_compression=lz4 ;
> |postgres=# CREATE TABLE t(a text);
> |postgres=# \d+ t
> | a  | text |   |  | | extended | pglz|   
>|
>
> assign_default_toast_compression() should set
> default_toast_compression_oid=InvalidOid, rather than
> default_toast_compression=NULL.

I will fix this.

> In my original patch, that was commented, since I was confused, not realizing
> that the GUC machinery itself assigns to the string value.  We should assign 
> to
> the cached Oid, instead.

> Reading my own patch, I see that in AccessMethodCallback() should also say
> InvalidOid.
> | default_toast_compression_oid = false;
> The false assignment was copied from namespace.c: baseSearchPathValid.

I will fix this.

So as of now, we can make this patch work with just the built-in
methods, and later we can add another enhancement patch that can work
with the custom compression methods.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Custom compression methods

2021-02-28 Thread Justin Pryzby
On Mon, Mar 01, 2021 at 10:32:23AM +0530, Dilip Kumar wrote:
> On Sun, Feb 28, 2021 at 9:48 AM Justin Pryzby  wrote:
> >
> > On my PC, this new change is causing a test failure:
> >
> >  SELECT SUBSTR(f1, 2000, 50) FROM cmdata1;
> > -   substr
> > -
> > - 01234567890123456789012345678901234567890123456789
> > -(1 row)
> > -
> > +ERROR:  compressed lz4 data is corrupt
> 
> The older version of lz4 had this problem: while decompressing a
> partial slice, if we didn't give a buffer size up to the full data
> length it was failing[1], but it is solved in version 1.9.

Thanks.  It seems like that explains it.

   
I think if that's a problem with recent versions, then you'll have to
conditionally disable slicing.
https://packages.debian.org/liblz4-dev  

   

Slicing isn't generally usable if it sometimes makes people's data inaccessible
and gives errors about corruption.

I guess you could make it a compile time test on these constants (I don't know
the necessary version, though)

#define LZ4_VERSION_MAJOR    1    /* for breaking interface changes  */
#define LZ4_VERSION_MINOR    7    /* for new (non-breaking) interface capabilities */
#define LZ4_VERSION_RELEASE  1    /* for tweaks, bug-fixes, or development */
#define LZ4_VERSION_NUMBER (LZ4_VERSION_MAJOR *100*100 + LZ4_VERSION_MINOR *100 + LZ4_VERSION_RELEASE)

If the version is too low, either make it #error, or disable slicing.   
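
For illustration, a rough sketch of such a compile-time guard (assuming,
per the earlier message, that 1.9 is the first lz4 release whose
LZ4_decompress_safe_partial() copes with an output buffer smaller than
the full datum; the exact cutoff would need to be verified):

#include <lz4.h>

/*
 * Hypothetical guard, not from any posted patch: refuse to build sliced
 * decompression against an lz4 that is too old, or fall back to full
 * decompression instead.  10900 == 1.9.0 per LZ4_VERSION_NUMBER's formula.
 */
#if LZ4_VERSION_NUMBER < 10900
#error "lz4 >= 1.9 is required for partial (sliced) decompression"
/* ...or, instead of failing: #define DISABLE_LZ4_SLICING 1 */
#endif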

   
The OS's usual library version infrastructure will make sure the runtime version
is at least the MAJOR+MINOR of the compile time version.

-- 
Justin




Re: Add client connection check during the execution of the query

2021-02-28 Thread Thomas Munro
On Sat, Aug 3, 2019 at 4:40 AM Konstantin Knizhnik
 wrote:
> On 18.07.2019 6:19, Tatsuo Ishii wrote:
> > So the performance is about 5% down with the feature enabled in this
> > case.  For me, 5% down is not subtle. Probably we should warn this in
> > the doc.
>
> I also see some performance degradation, although it is not so large in
> my case (I used pgbench with scale factor 10 and run the same command as
> you).
> In my case difference is 103k vs. 105k TPS is smaller than 2%.

I didn't test, but hopefully the degradation is fixed by commit 09cf1d52?

> If OS detected closed connection, it should return POLLHUP, should not it?
> I am not sure if it is more portable or more efficient way - just seems
> to be a little bit more natural way (from my point of view) to check if
> connection is still alive.

That's if you're sleeping in epoll etc.  This patch is for CPU-bound
backends, running a long query.  We need to do something special to
find out if the kernel knows that the connection has been closed.

I've done a quick rebase of the patch and added it to the
commitfest.  No other changes.  Several things were mentioned earlier
that still need to be tidied up.
From 5f7c327c5896369b80529467a3f1f64eab690887 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 1 Mar 2021 18:08:23 +1300
Subject: [PATCH v3] Detect dropped connections while running queries.

Provide a new optional GUC that can be used to check whether the client
connection has gone away periodically while running very long queries.

Author: Sergey Cherkashin 
Author: 
Discussion: https://postgr.es/m/77def86b27e41f0efcba411460e929ae%40postgrespro.ru
---
 doc/src/sgml/config.sgml  |  20 
 src/backend/libpq/pqcomm.c|  31 +
 src/backend/tcop/postgres.c   |   5 +
 src/backend/utils/init/globals.c  |   1 +
 src/backend/utils/init/postinit.c |  13 +++
 src/backend/utils/misc/guc.c  |  10 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/include/libpq/libpq.h |   2 +
 src/include/miscadmin.h   |   3 +-
 src/include/utils/timeout.h   |   1 +
 src/test/modules/Makefile |   1 +
 src/test/modules/connection/Makefile  |  16 +++
 .../connection/t/001_close_connection.pl  | 107 ++
 13 files changed, 212 insertions(+), 1 deletion(-)
 create mode 100644 src/test/modules/connection/Makefile
 create mode 100644 src/test/modules/connection/t/001_close_connection.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b5718fc136..db8798eb95 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9220,6 +9220,26 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
   
  
 
+ 
+  client_connection_check_interval (integer)
+  
+   client_connection_check_interval configuration parameter
+  
+  
+  
+   
+Sets a time interval, in milliseconds, between periodic
+verification of client-server connection during query execution.
+If the client aborts the connection, the query is terminated.
+   
+   
+Default value is zero - it disables connection
+checks, so the backend will detect client disconnection only when trying
+to send a response to the query.
+   
+  
+ 
+
  
 

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 27a298f110..44fb79c2b5 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -118,6 +118,7 @@
  */
 int			Unix_socket_permissions;
 char	   *Unix_socket_group;
+int 		client_connection_check_interval;
 
 /* Where the Unix socket files are (list of palloc'd strings) */
 static List *sock_paths = NIL;
@@ -2022,3 +2023,33 @@ pq_settcpusertimeout(int timeout, Port *port)
 
 	return STATUS_OK;
 }
+
+/* 
+ *	pq_check_client_connection - check if client connected to socket or not
+ * 
+ */
+void pq_check_client_connection(void)
+{
+	CheckClientConnectionPending = false;
+	if (IsUnderPostmaster &&
+		MyProcPort != NULL && !PqCommReadingMsg && !PqCommBusy)
+	{
+		char nextbyte;
+		int r;
+
+#ifdef WIN32
+		pgwin32_noblock = 1;
+#endif
+		r = recv(MyProcPort->sock, &nextbyte, 1, MSG_PEEK);
+#ifdef WIN32
+		pgwin32_noblock = 0;
+#endif
+
+		if (r == 0 || (r == -1 &&
+			errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR))
+		{
+			ClientConnectionLost = true;
+			InterruptPending = true;
+		}
+	}
+}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index bb5ccb4578..4ce004a2d5 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3128,6 +3128,8 @@ ProcessInterrupts(void)
 	(errcode(ERRCODE_ADMIN_SHUTDOWN),
 	 errmsg("terminating connection due to administrator command")));
 	}
+	if (CheckClientConne

Re: Speeding up GIST index creation for tsvectors

2021-02-28 Thread Amit Khandekar
I have added this one in the March commitfest.
https://commitfest.postgresql.org/32/3023/




Re: Printing backtrace of postgres processes

2021-02-28 Thread torikoshia

Hi,

I also think this feature would be useful when supporting
environments that lack debugger or debug symbols.
I think such environments are not rare.


+ for more information. This
+will help in identifying where exactly the backend process is currently
+executing.

When I read this, I expected a backtrace would be generated at
the moment when it receives the signal, but actually it just
sets a flag that causes the next CHECK_FOR_INTERRUPTS to print
a backtrace.

How about explaining the timing of the backtrace generation?


+print backtrace of superuser backends. This feature is not supported
+for postmaster, logging and statistics process.

Since the current patch uses BackendPidGetProc(), it does not
support this feature not only for the postmaster, logging, and
statistics processes but also for the checkpointer, background
writer, and walwriter.

And when I specify pid of these PostgreSQL processes, it
says "PID  is not a PostgreSQL server process".

I think it may confuse users, so it might be worth
changing messages for those PostgreSQL processes.
AuxiliaryPidGetProc() may help to do it.
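
Just to illustrate the idea (a rough sketch only; the wrapper name is
made up, only BackendPidGetProc() and AuxiliaryPidGetProc() are real):

/*
 * Sketch: fall back to AuxiliaryPidGetProc() so auxiliary processes are
 * recognized instead of being reported as "not a PostgreSQL server
 * process".
 */
static void
check_backtrace_target(int pid)
{
    PGPROC *proc = BackendPidGetProc(pid);

    if (proc == NULL)
        proc = AuxiliaryPidGetProc(pid); /* checkpointer, bgwriter, ... */

    if (proc == NULL)
        ereport(WARNING,
                (errmsg("PID %d is not a PostgreSQL server process", pid)));
}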


diff --git a/src/backend/postmaster/checkpointer.c 
b/src/backend/postmaster/checkpointer.c

index 54a818b..5fae328 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -57,6 +57,7 @@
 #include "storage/shmem.h"
 #include "storage/smgr.h"
 #include "storage/spin.h"
+#include "tcop/tcopprot.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/resowner.h"
@@ -547,6 +548,13 @@ HandleCheckpointerInterrupts(void)
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();

+   /* Process printing backtrace */
+   if (PrintBacktracePending)
+   {
+   PrintBacktracePending = false;
+   set_backtrace(NULL, 0);
+   }
+

Although it implements backtrace for the checkpointer, when
I specified the pid of the checkpointer it was refused by
BackendPidGetProc().


Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: Is Recovery actually paused?

2021-02-28 Thread Dilip Kumar
On Fri, Feb 26, 2021 at 1:33 PM Kyotaro Horiguchi
 wrote:
>
> At Thu, 25 Feb 2021 13:22:53 +0530, Dilip Kumar  wrote 
> in
> > On Thu, Feb 25, 2021 at 12:42 PM Kyotaro Horiguchi
> >  wrote:
> > > The latest version applies (almost) cleanly to the current master and
> > > works fine.
> > > I don't have further comment on this.
> > >
> > > I'll wait for a day before marking this RfC in case anyone have
> > > further comments.
> >
> > Okay.
>
> Hearing nothing, done that.

Thanks.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Different compression methods for FPI

2021-02-28 Thread Justin Pryzby
On Mon, Mar 01, 2021 at 01:57:12PM +0900, Michael Paquier wrote:
> On Sun, Feb 28, 2021 at 05:08:17PM -0600, Justin Pryzby wrote:
> > Does this need to patch ./configure{,.ac} and Solution.pm for HAVE_LIBLZ4 ?
> > I suggest to also include an 0002 patch (not for commit) which changes to 
> > use a
> > non-default compression, to exercise this on the CIs - linux and bsd
> > environments now have liblz4 installed, and for windows you can keep 
> > "undef".
> 
> Good idea.  But are you sure that lz4 is available in the CF bot
> environments?

Yes, see here
https://www.postgresql.org/message-id/flat/20210119205643.GA1941%40telsasoft.com

-- 
Justin




Re: [HACKERS] Custom compression methods

2021-02-28 Thread Dilip Kumar
On Sun, Feb 28, 2021 at 9:48 AM Justin Pryzby  wrote:
>
> On my PC, this new change is causing a test failure:
>
>  SELECT SUBSTR(f1, 2000, 50) FROM cmdata1;
> -   substr
> -
> - 01234567890123456789012345678901234567890123456789
> -(1 row)
> -
> +ERROR:  compressed lz4 data is corrupt

The older version of lz4 had this problem: while decompressing a
partial slice, if we didn't give a buffer size up to the full data
length it was failing[1], but it is solved in version 1.9.

> @@ -119,15 +119,15 @@ lz4_cmdecompress_slice(const struct varlena *value, int32 slicelength)
>  	int32		rawsize;
>  	struct varlena *result;
>
> -	/* allocate memory for holding the uncompressed data */
> -	result = (struct varlena *) palloc(VARRAWSIZE_4B_C(value) + VARHDRSZ);
> +	/* allocate memory for the uncompressed data */
> +	result = (struct varlena *) palloc(slicelength + VARHDRSZ);
>
> -	/* decompress partial data using lz4 routine */
> +	/* decompress the data */
>  	rawsize = LZ4_decompress_safe_partial((char *) value + VARHDRSZ_COMPRESS,
>  										  VARDATA(result),
>  										  VARSIZE(value) - VARHDRSZ_COMPRESS,
>  										  slicelength,
> -										  VARRAWSIZE_4B_C(value));
> +										  slicelength);

This is done for the latest version: now we don't need to allocate
a buffer of the full size, it is enough to allocate just the
slicelength.

> Also, in the tests, you have this at both the top and bottom of the file:
>
> src/test/regress/sql/compression.sql:\set HIDE_COMPRESSAM false
> src/test/regress/sql/compression.sql:\set HIDE_COMPRESSAM false
>
> Whereas the patch I sent had at the end:
>
> +\set HIDE_COMPRESSAM on
>
> ("on" is the default when run under pg_regress)

I will fix this.

[1] 
https://docs.unrealengine.com/en-US/API/Runtime/Core/Compression/LZ4_decompress_safe_partial/index.html

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: repeated decoding of prepared transactions

2021-02-28 Thread vignesh C
On Mon, Mar 1, 2021 at 7:23 AM Ajin Cherian  wrote:
>
> On Sat, Feb 27, 2021 at 11:06 PM Amit Kapila  wrote:
>
> > Few comments on 0002 patch:
> > =
> > 1.
> > +
> > + /*
> > + * Disable two-phase here, it will be set in the core if it was
> > + * enabled whole creating the slot.
> > + */
> > + ctx->twophase = false;
> >
> > Typo, /whole/while. I think we don't need to initialize this variable
> > here at all.
> >
> > 2.
> > + /* If twophase is set on the slot at create time, then
> > + * make sure the field in the context is also updated
> > + */
> > + if (MyReplicationSlot->data.twophase)
> > + {
> > + ctx->twophase = true;
> > + }
> > +
> >
> > For multi-line comments, the first line of comment should be empty.
> > Also, I think this is not the right place because the WALSender path
> > needs to set it separately. I guess you can set it in
> > CreateInitDecodingContext/CreateDecodingContext by doing something
> > like
> >
> > ctx->twophase &= MyReplicationSlot->data.twophase
>
> Updated accordingly.
>
> >
> > 3. I think we can support this option at the protocol level in a
> > separate patch where we need to allow it via replication commands (say
> > when we support it in CreateSubscription). Right now, there is nothing
> > to test all the code you have added in repl_gram.y.
> >
>
> Removed that.
>
>
> > 4. I think we can expose this new option via pg_replication_slots.
> >
>
> Done. Added,
>

v7-0002-Add-option-to-enable-two-phase-commits-in-pg_crea.patch adds
twophase to pg_create_logical_replication_slot, I feel this option
should be documented in src/sgml/func.sgml.

Regards,
Vignesh




Re: Different compression methods for FPI

2021-02-28 Thread Michael Paquier
On Sun, Feb 28, 2021 at 05:08:17PM -0600, Justin Pryzby wrote:
> Does this need to patch ./configure{,.ac} and Solution.pm for HAVE_LIBLZ4 ?
> I suggest to also include an 0002 patch (not for commit) which changes to use 
> a
> non-default compression, to exercise this on the CIs - linux and bsd
> environments now have liblz4 installed, and for windows you can keep "undef".

Good idea.  But are you sure that lz4 is available in the CF bot
environments?

> Your patch looks fine, but I wonder if we should first implement a generic
> compression API.  Then, the callers don't need to have a bunch of #ifdef.
> If someone calls zs_create() for an algorithm which isn't enabled at compile
> time, it throws the error at a lower level.

Yeah, the patch feels incomplete with its footprint in xloginsert.c
for something that could be available in src/common/ like pglz,
particularly knowing that you will need to have this information 
available to frontend tools, no?
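
To sketch the kind of wrapper being suggested (all names here are
hypothetical, not from any posted patch): a small helper, usable from
both backend and frontend code, would keep the build-time knowledge in
one place so callers need no scattered #ifdefs:

typedef enum WalCompression
{
    WAL_COMPRESSION_PGLZ,
    WAL_COMPRESSION_LZ4,
    WAL_COMPRESSION_ZLIB
} WalCompression;

/* Sketch only: report whether an algorithm was enabled at build time. */
static inline bool
walcompress_supported(WalCompression algo)
{
    switch (algo)
    {
        case WAL_COMPRESSION_PGLZ:
            return true;        /* always built in */
        case WAL_COMPRESSION_LZ4:
#ifdef HAVE_LIBLZ4
            return true;
#else
            return false;
#endif
        case WAL_COMPRESSION_ZLIB:
#ifdef HAVE_LIBZ
            return true;
#else
            return false;
#endif
    }
    return false;
}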

> In some cases there's an argument that the compression ID should be globally
> defined constant, not like a dynamic "plugable" OID.  That may be true for the
> libpq compression, WAL compression, and pg_dump, since there's separate
> "producer" and "consumer".  I think if there's "pluggable" compression (like
> the TOAST patch), then it may have to map between the static or dynamic OIDs
> and the global compression ID.

This declaration should be frontend-aware.  As presented, the patch
would enforce zlib or lz4 on top of pglz, but I guess that it would be
more user-friendly to complain when attempting to set up
wal_compression_method (why not just using wal_compression?) to an
unsupported option rather than doing it after-the-fact for the first
full page generated once the new parameter value is loaded.

This stuff could just add tests in src/test/recovery/.  See for
example src/test/ssl and with_ssl to see how conditional tests happen
depending on the configure options.

Not much a fan either of assuming that it is just fine to add one byte
to XLogRecordBlockImageHeader for the compression_method field.
--
Michael


signature.asc
Description: PGP signature


Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?

2021-02-28 Thread Thomas Munro
On Sat, Feb 27, 2021 at 4:14 PM Thomas Munro  wrote:
> Here's a new version.  The condition variable patch 0001 fixes a bug:
> CleanupProcSignalState() also needs to broadcast.  The hunk that
> allows the interrupt handlers to use CVs while you're already waiting
> on a CV is now in a separate patch 0002.  I'm thinking of going ahead
> and committing those two.

Done.  Of course nothing in the tree reaches any of this code yet.
It'll be exercised by cfbot in this thread and (I assume) Amul's
"ALTER SYSTEM READ { ONLY | WRITE }" thread.

> The 0003 patch to achieve $SUBJECT needs
> more discussion.

Rebased.

The more I think about it, the more I think that this approach is good
enough for an initial solution to the problem.  It only affects
Windows, dropping tablespaces is hopefully rare, and it's currently
broken on that OS.  That said, it's complex enough, and I guess more
to the point, enough of a compromise, that I'm hoping to get some
explicit consensus about that.

A better solution would probably have to be based on the sinval queue,
somehow.  Perhaps with a new theory or rule making it safe to process
at every CFI(), or by deciding that we're prepared to have the operation
wait arbitrarily long until backends reach a point where it is known
to be safe (probably near ProcessClientReadInterrupt()'s call to
ProcessCatchupInterrupt()), or by inventing a new kind of lightweight
"sinval peek" that is safe to run at every CFI() point, being based on
reading (but not consuming!) the sinval queue and performing just
vfd-close of referenced smgr relations in this case.  The more I think
about all that complexity for a super rare event on only one OS, the
more I want to just do it the stupid way and close all the fds.
Robert opined similarly in an off-list chat about this problem.
From 5daf2244894bf5a399aa8022efbf71b5428335c7 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 1 Mar 2021 17:14:11 +1300
Subject: [PATCH v5] Use a global barrier to fix DROP TABLESPACE on Windows.

Previously, it was possible for DROP TABLESPACE to fail because a table
had been dropped, but other backends still had open file handles.
Windows won't allow a directory containing unlinked-but-still-open files
to be unlinked.  Tackle this problem by forcing all backends to close
all fds.  No change for Unix systems, which didn't suffer from the
problem, but simulate the need to do it in assertion builds just to
exercise the code.

Reviewed-by: Andres Freund 
Discussion: https://postgr.es/m/ca+hukgldemy2gbm80kz20gte6hnvwoere8kwcjk6-u56ost...@mail.gmail.com
---
 src/backend/commands/tablespace.c| 22 +++---
 src/backend/storage/ipc/procsignal.c | 22 ++
 src/backend/storage/smgr/md.c|  6 ++
 src/backend/storage/smgr/smgr.c  | 12 
 src/include/storage/md.h |  1 +
 src/include/storage/procsignal.h |  7 +--
 src/include/storage/smgr.h   |  1 +
 7 files changed, 42 insertions(+), 29 deletions(-)

diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 69ea155d50..8853f1b658 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -520,15 +520,23 @@ DropTableSpace(DropTableSpaceStmt *stmt)
 		 * but we can't tell them apart from important data files that we
 		 * mustn't delete.  So instead, we force a checkpoint which will clean
 		 * out any lingering files, and try again.
-		 *
-		 * XXX On Windows, an unlinked file persists in the directory listing
-		 * until no process retains an open handle for the file.  The DDL
-		 * commands that schedule files for unlink send invalidation messages
-		 * directing other PostgreSQL processes to close the files.  DROP
-		 * TABLESPACE should not give up on the tablespace becoming empty
-		 * until all relevant invalidation processing is complete.
 		 */
 		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
+		/*
+		 * On Windows, an unlinked file persists in the directory listing until
+		 * no process retains an open handle for the file.  The DDL
+		 * commands that schedule files for unlink send invalidation messages
+		 * directing other PostgreSQL processes to close the files, but nothing
+		 * guarantees they'll be processed in time.  So, we'll also use a
+		 * global barrier to ask all backends to close all files, and wait
+		 * until they're finished.
+		 */
+#if defined(WIN32) || defined(USE_ASSERT_CHECKING)
+		LWLockRelease(TablespaceCreateLock);
+		WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
+		LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
+#endif
+		/* And now try again. */
 		if (!destroy_tablespace_directories(tablespaceoid, false))
 		{
 			/* Still not empty, the files must be important then */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 8e5ee49fbd..236f6edd60 100644
--- a/src/backend/storage/ipc/procsignal.c

Re: doc review for v14

2021-02-28 Thread Justin Pryzby
On Mon, Mar 01, 2021 at 01:11:10PM +0900, Michael Paquier wrote:
> On Sun, Feb 28, 2021 at 06:46:47PM -0600, Justin Pryzby wrote:
> > It looks like you applied 0010...but I agree that it's not an improvement.  
> > It
> > appears that's something I intended to go back and revisit myself.
> > The rest of the patch looks right, to me.
> 
> Oops.  This was not intended.
> 
> > I'm suggesting to either revert that part, or apply these more polished 
> > changes
> > in 0002.
> 
> I would just group both things together.  Monday helping, I can see
> that the new wording is better on a couple of points after doing a
> diff of wal.sgml with c82d59d6:
> - "checksum protected" in the first sentence is weird, so I agree that
> using "By default, data pages are not protected by checksums" is an
> improvement.
> - "assigned" is indeed a bit strange, "includes" is an improvement,
> and I would tend to not use a passive form here.

+1

> - "to recover from corrupt data" is redundant with "to recover data"
> so the second one should be removed.  My take is to use "page
> corruptions" instead of "corrupt data", which should be corrupted data
> to be grammatically correct.

> -   Checksums verification is normally ENABLED when the cluster is initialized using 
> +   Checksums are normally enabled when the cluster is initialized using 
> -   When attempting to recover from corrupt data, it may be necessary to bypass
> -   the checksum protection. To do this, temporarily set the configuration
> -   parameter .
> +   When attempting to recover from page corruptions, it may be necessary to
> +   bypass the checksum protection. To do this, temporarily set the
> +   configuration parameter .

"page corruptions" is wrong .. you could say "corrupt pages"

-- 
Justin




Re: doc review for v14

2021-02-28 Thread Michael Paquier
On Sun, Feb 28, 2021 at 06:46:47PM -0600, Justin Pryzby wrote:
> It looks like you applied 0010...but I agree that it's not an improvement.  It
> appears that's something I intended to go back and revisit myself.
> The rest of the patch looks right, to me.

Oops.  This was not intended.

> I'm suggesting to either revert that part, or apply these more polished 
> changes
> in 0002.

I would just group both things together.  Monday helping, I can see
that the new wording is better on a couple of points after doing a
diff of wal.sgml with c82d59d6:
- "checksum protected" in the first sentence is weird, so I agree that
using "By default, data pages are not protected by checksums" is an
improvement.
- "assigned" is indeed a bit strange, "includes" is an improvement,
and I would tend to not use a passive form here.
- "to recover from corrupt data" is redundant with "to recover data"
so the second one should be removed.  My take is to use "page
corruptions" instead of "corrupt data", which should be corrupted data
to be grammatically correct.

This gives the attached, that has as result to not change the second
paragraph compared to the pre-c82d59d6 version of the area.
--
Michael
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 02f576a1a9..f75527f764 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -237,19 +237,19 @@
   
 
   
-   By default, data pages are not protected by checksums, but this can optionally be
-   enabled for a cluster.  When enabled, each data page will be ASSIGNED a
-   checksum that is updated when the page is written and verified each time
+   By default, data pages are not protected by checksums, but this can
+   optionally be enabled for a cluster. When enabled, each data page includes
+   a checksum that is updated when the page is written and verified each time
the page is read. Only data pages are protected by checksums; internal data
structures and temporary files are not.
   
 
   
-   Checksums verification is normally ENABLED when the cluster is initialized using initdb.
They can also be enabled or disabled at a later time as an offline
operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified for individual databases or tables.
+   level, and cannot be specified individually for databases or tables.
   
 
   
@@ -260,9 +260,9 @@
   
 
   
-   When attempting to recover from corrupt data, it may be necessary to bypass
-   the checksum protection. To do this, temporarily set the configuration
-   parameter .
+   When attempting to recover from page corruptions, it may be necessary to
+   bypass the checksum protection. To do this, temporarily set the
+   configuration parameter .
   
 
   


signature.asc
Description: PGP signature


Re: 64-bit XIDs in deleted nbtree pages

2021-02-28 Thread Masahiko Sawada
On Fri, Feb 26, 2021 at 9:58 AM Peter Geoghegan  wrote:
> > If we don't want btvacuumcleanup() to collect index statistics, we can
> > remove vacuum_cleanup_index_scale_factor (at least from btree
> > perspectives), as you mentioned. One thing that may be worth
> > mentioning is that the difference between the index statistics taken
> > by ANALYZE and btvacuumcleanup() is that the former statistics is
> > always an estimation. That’s calculated by compute_index_stats()
> > whereas the latter uses the result of an index scan. If
> > btvacuumcleanup() doesn’t scan the index and always returns NULL, it
> > would become hard to get accurate index statistics, for example in a
> > static table case. I've not checked which cases index statistics
> > calculated by compute_index_stats() are inaccurate, though.
>
> The historic context makes it easier to understand what to do here --
> it makes it clear that amvacuumcleanup() routine does not (or should
> not) do any index scan when the index hasn't (and won't) be modified
> by the current VACUUM operation. The relevant sgml doc sentence I
> quoted to you recently ("It is OK to return NULL if the index was not
> changed at all during the VACUUM operation...") was added by commit
> e57345975cf in 2006. Much of the relevant 2006 discussion is here,
> FWIW:
>
> https://www.postgresql.org/message-id/flat/26433.1146598265%40sss.pgh.pa.us#862ee11c24da63d0282e0025abbad19c
>
> So now we have the formal rules for index AMs, as well as background
> information about what various hackers (mostly Tom) were considering
> when the rules were written.
>
> > According to the doc, if amvacuumcleanup/btvacuumcleanup returns NULL,
> > it means the index is not changed at all. So do_analyze_rel() executed
> > by VACUUM ANALYZE also doesn't need to update the index statistics
> > even when amvacuumcleanup/btvacuumcleanup returns NULL. No?
>
> Consider hashvacuumcleanup() -- here it is in full (it hasn't really
> changed since 2006, when it was updated by that same commit I cited):
>
> /*
>  * Post-VACUUM cleanup.
>  *
>  * Result: a palloc'd struct containing statistical info for VACUUM displays.
>  */
> IndexBulkDeleteResult *
> hashvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
> {
> Relationrel = info->index;
> BlockNumber num_pages;
>
> /* If hashbulkdelete wasn't called, return NULL signifying no change */
> /* Note: this covers the analyze_only case too */
> if (stats == NULL)
> return NULL;
>
> /* update statistics */
> num_pages = RelationGetNumberOfBlocks(rel);
> stats->num_pages = num_pages;
>
> return stats;
> }
>
> Clearly hashvacuumcleanup() was considered by Tom when he revised the
> documentation in 2006. Here are some observations about
> hashvacuumcleanup() that seem relevant now:
>
> * There is no "analyze_only" handling, just like nbtree.
>
> "analyze_only" is only used by GIN, even now, 15+ years after it was
> added. GIN uses it to make autovacuum workers (never VACUUM outside of
> an AV worker) do pending list insertions for ANALYZE -- just to make
> it happen more often.  This is a niche thing -- clearly we don't have
> to care about it in nbtree, even if we make btvacuumcleanup() (almost)
> always return NULL when there was no btbulkdelete() call.
>
> * num_pages (which will become pg_class.relpages for the index) is not
> set when we return NULL -- hashvacuumcleanup() assumes that ANALYZE
> will get to it eventually in the case where VACUUM does no real work
> (when it just returns NULL).
>
> * We also use RelationGetNumberOfBlocks() to set pg_class.relpages for
> index relations during ANALYZE -- it's called when we call
> vac_update_relstats() (I quoted this do_analyze_rel() code to you
> directly in a recent email).
>
> * In general, pg_class.relpages isn't an estimate (because we use
> RelationGetNumberOfBlocks(), both in the VACUUM-updates case and the
> ANALYZE-updates case) -- only pg_class.reltuples is truly an estimate
> during ANALYZE, and so getting a "true count" seems to have only
> limited practical importance.
>
> I think that this sets a precedent in support of my view that we can
> simply get rid of vacuum_cleanup_index_scale_factor without any
> special effort to maintain pg_class.reltuples. As I said before, we
> can safely make btvacuumcleanup() just like hashvacuumcleanup(),
> except when there are known deleted-but-not-recycled pages, where a
> full index scan really is necessary for reasons that are not related
> to statistics at all (of course we still need the *logic* that was
> added to nbtree by the vacuum_cleanup_index_scale_factor commit --
> that is clearly necessary). My guess is that Tom would have made
> btvacuumcleanup() look identical to hashvacuumcleanup() in 2006 if
> nbtree didn't have page deletion to consider -- but that had to be
> considered.

Makes sense. If getting a true pg_class.reltuples is not important in
practice, it seems not to need btvacuumcle

Re: Update docs of logical replication for commit ce0fdbfe97.

2021-02-28 Thread Amit Kapila
On Fri, Feb 26, 2021 at 8:29 AM Amit Kapila  wrote:
>
> We forgot to update the logical replication configuration settings
> page in commit ce0fdbfe97. After commit ce0fdbfe97, table
> synchronization workers also started using replication origins to
> track the progress and the same should be reflected in docs.
>
> Attached patch for the same.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: Reducing WaitEventSet syscall churn

2021-02-28 Thread Thomas Munro
On Sat, Feb 27, 2021 at 2:48 PM Thomas Munro  wrote:
> Here's the alternative patch set, with no change to existing error
> message behaviour.  I'm going to commit this version and close this CF
> item soon if there are no objections.

Pushed.

That leaves just walreceiver and postgres_fdw in need of WaitEventSet
improvements along these lines.  The raw ingredients for that were
present in earlier patches, and I think Horiguchi-san had the right
idea: we should use the libpq event system to adjust our WES as
appropriate when it changes the socket underneath us.  I will leave
this CF entry open a bit longer in case he would like to post an
updated version of that part (considering Tom's feedback[1]).  If not,
we can close this and try again in the next cycle.

[1] https://www.postgresql.org/message-id/2446176.1594495351%40sss.pgh.pa.us




Re: locking [user] catalog tables vs 2pc vs logical rep

2021-02-28 Thread Amit Kapila
On Tue, Feb 23, 2021 at 12:00 PM Amit Kapila  wrote:
>
> On Tue, Feb 23, 2021 at 9:33 AM Andres Freund  wrote:
> >
> > On 2021-02-23 09:24:18 +0530, Amit Kapila wrote:
> > > Okay, so is it sufficient to add comments in code, or do we want to
> > > add something in docs? I am not completely sure if we need to add in
> > > docs till we have core-implementation of prepare waiting to get
> > > logically replicated.
> >
> > There's plenty users of logical decoding that aren't going through the
> > normal replication mechanism - so they can hit this. So I think it needs
> > to be documented somewhere.
> >
>
> As per discussion, the attached patch updates both docs and comments
> in the code.
>

I have pushed this patch.

-- 
With Regards,
Amit Kapila.




Re: alter table set TABLE ACCESS METHOD

2021-02-28 Thread Justin Pryzby
On Mon, Mar 01, 2021 at 11:16:36AM +0900, Michael Paquier wrote:
> On Sun, Feb 28, 2021 at 04:25:30PM -0600, Justin Pryzby wrote:
> > I called this "set TABLE access method" rather than just "set access method"
> > for the reasons given on the LIKE thread:
> > https://www.postgresql.org/message-id/20210119210331.gn8...@telsasoft.com
> 
> ALTER TABLE applies to a table (or perhaps a sequence, still..), so
> that sounds a bit weird to me to add again the keyword "TABLE" for
> that.

I don't know if you're following along the toast compression patch -
Alvaro had suggested that instead of making a new catalog just for a handful of
tuples for compression types, to instead store them in pg_am, with a new
am_type='c'.  So I proposed a patch for
| CREATE TABLE .. (LIKE other INCLUDING ACCESS METHOD),
but then decided that it should say INCLUDING *TABLE* ACCESS METHOD, since
otherwise it was somewhat strange that it didn't include the compression access
methods (which have a separate LIKE option).

> > I've tested this with zedstore, but the lack of 2nd, in-core table AM limits
> > testing possibilities.  It also limits at least my own ability to reason 
> > about
> > the AM API.
> >
> > For example, I was surprised to hear that toast is a concept that's
> > intended to be applied to AMs other than heap.
> > https://www.postgresql.org/message-id/flat/CA%2BTgmoYTuT4sRtviMLOOO%2B79VnDCpCNyy9rK6UZFb7KEAVt21w%40mail.gmail.com
> 
> What kind of advanced testing do you have in mind?  It sounds pretty
> much enough to me for a basic patch to use the trick with heap2 as
> your patch does.  That would be enough to be sure that the rewrite
> happens and that data is still around.

The issue is that the toast data can be compressed, so it needs to be detoasted
before pushing it to the other AM, which otherwise may not know how to
decompress it.
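
As a rough sketch of what detoasting before the rewrite could mean in
code (a hypothetical helper, not the patch's actual code; only
detoast_attr() and the varlena macros are real):

#include "postgres.h"
#include "access/detoast.h"

/*
 * Flatten a varlena datum so the destination AM never sees a TOAST
 * pointer or compressed data it may not know how to decode.
 */
static Datum
flatten_varlena(Datum value)
{
    struct varlena *va = (struct varlena *) DatumGetPointer(value);

    if (VARATT_IS_EXTENDED(va))
        va = detoast_attr(va);

    return PointerGetDatum(va);
}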

If it's not detoasted, this works with "COMPRESSION lz4" (since zedstore
happens to know how to decompress it) but that's just an accident, and it fails
when using pglz.  That's got to do with 2 non-core patches - when core has
only heap, I don't see how something like this can be exercised.

postgres=# DROP TABLE t; CREATE TABLE t (a TEXT COMPRESSION pglz) USING heap; 
INSERT INTO t SELECT repeat(a::text,) FROM generate_series(1,99)a; ALTER 
TABLE t SET ACCESS METHOD zedstore; SELECT * FROM t;
DROP TABLE
CREATE TABLE
INSERT 0 99
ALTER TABLE
2021-02-28 20:50:42.653 CST client backend[14958] psql ERROR:  compressed lz4 
data is corrupt
2021-02-28 20:50:42.653 CST client backend[14958] psql STATEMENT:  SELECT * 
FROM t;

-- 
Justin




Re: [PATCH] pgbench: Remove ecnt, a member variable of CState

2021-02-28 Thread miyake_kouta

2021-02-28 08:06, Michael Paquier wrote:

On Fri, Feb 26, 2021 at 04:36:41PM -0300, Alvaro Herrera wrote:

+1


Thanks, done.
--
Michael


Thanks for reviewing and committing this patch!

Regards
--
Kota Miyake




Re: [BUG] segfault during delete

2021-02-28 Thread Amit Langote
On Sun, Feb 28, 2021 at 6:14 AM Álvaro Herrera  wrote:
> Thanks Amit for working on this fix!  It seems correct to me, so I pushed it 
> with trivial changes.  Thanks Bertrand for reporting the problem.

Thanks Alvaro.

> In addition to backpatching the code fix to pg12, I backpatched the test case 
> to pg11. It worked fine for me (with no code changes), but it seems good to 
> have it there just to make sure the buildfarm agrees with us on this.

Ah, good call.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: alter table set TABLE ACCESS METHOD

2021-02-28 Thread Michael Paquier
On Sun, Feb 28, 2021 at 04:25:30PM -0600, Justin Pryzby wrote:
> I called this "set TABLE access method" rather than just "set access method"
> for the reasons given on the LIKE thread:
> https://www.postgresql.org/message-id/20210119210331.gn8...@telsasoft.com

ALTER TABLE applies to a table (or perhaps a sequence, still..), so
that sounds a bit weird to me to add again the keyword "TABLE" for
that.

> I've tested this with zedstore, but the lack of 2nd, in-core table AM limits
> testing possibilities.  It also limits at least my own ability to reason about
> the AM API.
>
> For example, I was surprised to hear that toast is a concept that's
> intended to be applied to AMs other than heap.
> https://www.postgresql.org/message-id/flat/CA%2BTgmoYTuT4sRtviMLOOO%2B79VnDCpCNyy9rK6UZFb7KEAVt21w%40mail.gmail.com

What kind of advanced testing do you have in mind?  It sounds pretty
much enough to me for a basic patch to use the trick with heap2 as
your patch does.  That would be enough to be sure that the rewrite
happens and that data is still around.  If you are worrying about
recovery, a TAP test with an immediate stop of the server could
equally be used to check after the FPWs produced for the new
relfilenode during the rewrite.
--
Michael


signature.asc
Description: PGP signature


Re: repeated decoding of prepared transactions

2021-02-28 Thread Ajin Cherian
On Sat, Feb 27, 2021 at 11:06 PM Amit Kapila  wrote:

> Few comments on 0002 patch:
> =
> 1.
> +
> + /*
> + * Disable two-phase here, it will be set in the core if it was
> + * enabled whole creating the slot.
> + */
> + ctx->twophase = false;
>
> Typo, /whole/while. I think we don't need to initialize this variable
> here at all.
>
> 2.
> + /* If twophase is set on the slot at create time, then
> + * make sure the field in the context is also updated
> + */
> + if (MyReplicationSlot->data.twophase)
> + {
> + ctx->twophase = true;
> + }
> +
>
> For multi-line comments, the first line of comment should be empty.
> Also, I think this is not the right place because the WALSender path
> needs to set it separately. I guess you can set it in
> CreateInitDecodingContext/CreateDecodingContext by doing something
> like
>
> ctx->twophase &= MyReplicationSlot->data.twophase

Updated accordingly.

>
> 3. I think we can support this option at the protocol level in a
> separate patch where we need to allow it via replication commands (say
> when we support it in CreateSubscription). Right now, there is nothing
> to test all the code you have added in repl_gram.y.
>

Removed that.


> 4. I think we can expose this new option via pg_replication_slots.
>

Done. Added.
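
For anyone following along, the net effect is roughly the following (a sketch
only; the exact argument position and the column name are whatever the attached
v7 patches use, I'm just illustrating the idea):

SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding', false, true);
SELECT slot_name, plugin, two_phase FROM pg_replication_slots;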

regards,
Ajin Cherian
Fujitsu Australia


v7-0001-Avoid-repeated-decoding-of-prepared-transactions-.patch
Description: Binary data


v7-0002-Add-option-to-enable-two-phase-commits-in-pg_crea.patch
Description: Binary data


Re: [PoC] Non-volatile WAL buffer

2021-02-28 Thread Takashi Menjo
Hi Sawada,

I am relieved to hear that the performance problem was solved.

And I added a tip about PMEM namespace and partitioning to the PG wiki[1].

Regards,

[1] 
https://wiki.postgresql.org/wiki/Persistent_Memory_for_WAL#Configure_and_verify_DAX_hugepage_faults

-- 
Takashi Menjo 




Re: Optimising latch signals

2021-02-28 Thread Thomas Munro
On Sat, Feb 27, 2021 at 12:04 AM Thomas Munro  wrote:
> I'm planning to commit this soon if there are no objections.

Pushed, with the addition of an SFD_CLOEXEC flag for the signalfd.
Time to watch the buildfarm to find out if my speculation about
illumos is correct...




Regex back-reference semantics and performance

2021-02-28 Thread Tom Lane
I noticed that some of the slowest cases in Joel's regex test corpus
had issues with back-reference matching, and along the way to fixing
that discovered what seems to me to be a bug in the engine's handling
of back-references.  To wit, what should happen if a back-reference
is to a subexpression that contains constraints?  A simple example is

SELECT regexp_match('foof', '(^f)o*\1');

To my mind, the back reference is only chartered to match the literal
characters matched by the referenced subexpression.  Here, since that
expression matches "f", the backref should too, and thus we should get
a match to "foof".  Perl gives that answer, anyway; but our existing
code says there's no match.  That's because it effectively copies the
constraints within the referenced subexpression, in addition to making
the data comparison.  The "^" can't match where the second "f" is, so
we lose.

0001 attached fixes this by stripping constraint arcs out of the NFA
that's applied to the backref subre tree node.

Now, as to the performance issue ... if you load up the data in
"trouble.sql" attached, and do

SELECT regexp_matches(subject, pattern, 'g') FROM trouble;

you'll be waiting a good long time, even with our recent improvements.
(Up to now I hadn't tried the 'g' flag with Joel's test cases, so
I hadn't noticed what a problem this particular example has got.)
The reason for the issue is that the pattern is

(["'`])(?:\\\1|.)*?\1

and the subject string has a mix of " and ' quote characters.  As
currently implemented, our engine tries to resolve the match at
any substring ending in either " or ', since the NFA created for
the backref can match either.  That leads to O(N^2) time wasted
trying to verify wrong matches.

I realized that this could be improved by replacing the NFA/DFA
match step for a backref node with a string literal match, if the
backreference match string is already known at the time we try
to apply the NFA/DFA.  That's not a panacea, but it helps in most
simple cases including this one.  The way to visualize what is
happening is that we have a tree of binary concatenation nodes:

          concat
         /      \
    capture    concat
              /      \
       other stuff  backref

Each concat node performs fast NFA/DFA checks on both its children
before recursing to the children to make slow exact checks.  When we
recurse to the capture node, it records the actual match substring,
so now we know whether the capture is " or '.  Then, when we recurse
to the lower concat node, the capture is available while it makes
NFA/DFA checks for its two children; so it will never mistakenly
guess that its second child matches a substring it doesn't, and
thus it won't try to do exact checking of the "other stuff" on a
match that's bound to fail later.

So this works as long as the tree of concat nodes is right-deep,
which fortunately is the normal case.  It won't help if we have
a left-deep tree:

            concat
           /      \
       concat    backref
      /      \
  capture   other stuff

because the upper concat node will do its NFA/DFA check on the backref
node before recursing to its left child, where the capture will occur.
But to get that tree, you have to have written extra parentheses:

((capture)otherstuff)\2

I don't see a way to improve that situation, unless perhaps with
massive rejiggering of the regex execution engine.  But 0002 attached
does help a lot in the simple case.

(BTW, the connection between 0001 and 0002 is that if we want to keep
the existing semantics that a backref enforces constraints, 0002
doesn't work, since it won't do that.)

regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 08f08322ca..6c189bfed2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6166,6 +6166,9 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
 The subexpression must entirely precede the back reference in the RE.
 Subexpressions are numbered in the order of their leading parentheses.
 Non-capturing parentheses do not define subexpressions.
+The back reference considers only the string characters matched by the
+referenced subexpression, not any constraints contained in it.  For
+example, (^\d)\1 will match 22.

 

diff --git a/src/backend/regex/regc_nfa.c b/src/backend/regex/regc_nfa.c
index a10a346e8f..77b860cb0f 100644
--- a/src/backend/regex/regc_nfa.c
+++ b/src/backend/regex/regc_nfa.c
@@ -1382,6 +1382,77 @@ duptraverse(struct nfa *nfa,
 	}
 }
 
+/*
+ * removeconstraints - remove any constraints in an NFA
+ *
+ * Constraint arcs are replaced by empty arcs, essentially treating all
+ * constraints as automatically satisfied.
+ */
+static void
+removeconstraints(struct nfa *nfa,
+  struct state *start,	/* process subNFA starting 

Re: doc review for v14

2021-02-28 Thread Justin Pryzby
On Wed, Feb 24, 2021 at 04:18:51PM +0900, Michael Paquier wrote:
> This leaves 0003, 0004, 0005, 0010, 0012, 0018, 0020 and 0021 as these
> did not look like improvements after review.

It looks like you applied 0010... but I agree that it's not an improvement.  It
appears that's something I had intended to go back and revisit myself.
The rest of the patch looks right to me.

Subject: [PATCH 10/21] doc review for checksum docs
 doc/src/sgml/wal.sgml | 18 +-  

   

I'm suggesting either reverting that part, or applying these more polished changes
in 0002.

-- 
Justin
>From 03d014809720d90ba43c780cb34fc82dd7173c8d Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 26 Feb 2021 19:42:54 -0600
Subject: [PATCH 1/2] Partially revert bcf2667bf62d72faced64cb60ffbd2e599a0ebe8

---
 doc/src/sgml/wal.sgml | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 02f576a1a9..66de1ee2f8 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -237,19 +237,19 @@
   
 
   
-   By default, data pages are not protected by checksums, but this can optionally be
-   enabled for a cluster.  When enabled, each data page will be ASSIGNED a
-   checksum that is updated when the page is written and verified each time
-   the page is read. Only data pages are protected by checksums; internal data
+   Data pages are not checksum protected by default, but this can optionally be
+   enabled for a cluster.  When enabled, each data page will be assigned a
+   checksum that is updated when the page is written and verified every time
+   the page is read. Only data pages are protected by checksums, internal data
structures and temporary files are not.
   
 
   
-   Checksums verification is normally ENABLED when the cluster is initialized using initdb.
They can also be enabled or disabled at a later time as an offline
operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified for individual databases or tables.
+   level, and cannot be specified individually for databases or tables.
   
 
   
@@ -260,9 +260,9 @@
   
 
   
-   When attempting to recover from corrupt data, it may be necessary to bypass
-   the checksum protection. To do this, temporarily set the configuration
-   parameter .
+   When attempting to recover from corrupt data it may be necessary to bypass
+   the checksum protection in order to recover data. To do this, temporarily
+   set the configuration parameter .
   
 
   
-- 
2.17.0

>From bed0d532bc2415cea1e2da54b6842053ecc27e04 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 26 Feb 2021 19:56:25 -0600
Subject: [PATCH 2/2] Second attempt to improve checksum docs

'assigned' is odd

Each not every

Colon not comma

Rephrase 'Checksums are normally enabled...', since this is easy to misinterpret and sounds like they're enabled by default

Do not repeat: 'to recover...data'
---
 doc/src/sgml/wal.sgml | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 66de1ee2f8..347be030b2 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -238,18 +238,18 @@
 
   
Data pages are not checksum protected by default, but this can optionally be
-   enabled for a cluster.  When enabled, each data page will be assigned a
-   checksum that is updated when the page is written and verified every time
-   the page is read. Only data pages are protected by checksums, internal data
+   enabled for a cluster.  When enabled, a checksum is included in each data page
+   when the page is written and verified each time
+   the page is read. Only data pages are protected by checksums; internal data
structures and temporary files are not.
   
 
   
-   Checksums are normally enabled when the cluster is initialized using initdb.
They can also be enabled or disabled at a later time as an offline
operation. Data checksums are enabled or disabled at the full cluster
-   level, and cannot be specified individually for databases or tables.
+   level, and cannot be specified for individual databases or tables.
   
 
   
@@ -260,8 +260,8 @@
   
 
   
-   When attempting to recover from corrupt data it may be necessary to bypass
-   the checksum protection in order to recover data. To do this, temporarily
+   When attempting to recover from corrupt data, it may be necessary to bypass
+   the checksum protection. To do this, temporarily
set the configuration parameter .
   
 
-- 
2.17.0



Re: Different compression methods for FPI

2021-02-28 Thread Justin Pryzby
On Sat, Feb 27, 2021 at 12:43:52PM +0500, Andrey Borodin wrote:
> So I think it worth to propose a patch to make wal_compression_method = 
> {"pglz", "lz4", "zlib"}. Probably, "zstd" can be added to the list.
> Attached is a draft taking CompressionId from "custom compression methods" 
> patch and adding zlib to it.

Thanks for submitting it.

Does this need to patch ./configure{,.ac} and Solution.pm for HAVE_LIBLZ4?
I suggest also including a 0002 patch (not for commit) which switches to a
non-default compression method, to exercise this on the CIs - the linux and bsd
environments now have liblz4 installed, and for windows you can keep "undef".

Daniil had a patch to add src/common/z_stream.c:
https://github.com/postgrespro/libpq_compression/blob/0a9c70d582cd4b1ef60ff39f8d535f6e800bd7d4/src/common/z_stream.c
https://www.postgresql.org/message-id/470e411e-681d-46a2-a1e9-6de11b5f5...@yandex-team.ru

Your patch looks fine, but I wonder if we should first implement a generic
compression API.  Then the callers wouldn't need a bunch of #ifdefs, and if
someone calls zs_create() for an algorithm which isn't enabled at compile
time, the error is thrown at a lower level.

That would also give a central place to handle options (compression
level, and things like zstd --long, --rsyncable, etc).

In some cases there's an argument that the compression ID should be a globally
defined constant, not a dynamic "pluggable" OID.  That may be true for the
libpq compression, WAL compression, and pg_dump, since there are separate
"producer" and "consumer" sides.  I think if there's "pluggable" compression (like
the TOAST patch), then it may have to map between the static or dynamic OIDs
and the global compression ID.

-- 
Justin




Re: Remove latch.c workaround for Linux < 2.6.27

2021-02-28 Thread Thomas Munro
On Sat, Feb 27, 2021 at 9:30 PM Thomas Munro  wrote:
> On Sat, Feb 27, 2021 at 9:01 PM Heikki Linnakangas  wrote:
> > What happens if you try to try to compile or run on such an ancient kernel? 
> > Does it fall back to something else? Can you still make it work with 
> > different configure options?
> >
> > I'm just curious, not objecting.
>
> With this patch, I guess you'd have to define WAIT_USE_POLL.  I
> suppose we could fall back to that automatically if EPOLL_CLOEXEC
> isn't defined, if anyone thinks that's worth bothering with.

I thought about doing:

 /* don't overwrite manual choice */
-#elif defined(HAVE_SYS_EPOLL_H)
+#elif defined(HAVE_SYS_EPOLL_H) && defined(EPOLL_CLOEXEC)
 #define WAIT_USE_EPOLL

... but on reflection, I don't think we should expend energy on a
desupported OS vintage that isn't present in our build farm, at least
not without a reasonable field complaint; I wouldn't even know if it
worked.

So, pushed without that.




alter table set TABLE ACCESS METHOD

2021-02-28 Thread Justin Pryzby
I thought this was a good idea, but didn't hear back when I raised it before.
I was motivated to finally look into it by Dilip's toast compression patch,
which can also do a table rewrite when changing a column's toast compression.

I called this "set TABLE access method" rather than just "set access method"
for the reasons given on the LIKE thread:
https://www.postgresql.org/message-id/20210119210331.gn8...@telsasoft.com

I've tested this with zedstore, but the lack of a second in-core table AM limits
testing possibilities.  It also limits at least my own ability to reason about
the AM API.  For example, I was surprised to hear that toast is a concept
that's intended to be applied to AMs other than heap.
https://www.postgresql.org/message-id/flat/CA%2BTgmoYTuT4sRtviMLOOO%2B79VnDCpCNyy9rK6UZFb7KEAVt21w%40mail.gmail.com

I plan to add this to the CF - it seems like a minor addition, but may not be for v14.
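
To make the intended usage concrete, the patch lets you rewrite a table into
another AM with a single command, along these lines (zedstore here is just a
stand-in for any non-heap table AM; the exact keywords follow the grammar
change in the attached patch):

ALTER TABLE t SET ACCESS METHOD zedstore;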

https://www.postgresql.org/message-id/20190818193533.gl11...@telsasoft.com
On Sun, Aug 18, 2019 at 02:35:33PM -0500, Justin Pryzby wrote:
>  . What do you think about pg_restore --no-tableam; similar to
>--no-tablespaces, it would allow restoring a table to a different AM:
>PGOPTIONS='-c default_table_access_method=zedstore' pg_restore 
> --no-tableam ./pg_dump.dat -d postgres
>Otherwise, the dump says "SET default_table_access_method=heap", which
>overrides any value from PGOPTIONS and precludes restoring to new AM.
...
>  . it'd be nice if there was an ALTER TABLE SET ACCESS METHOD, to allow
>migrating data.  Otherwise I think the alternative is:
>   begin; lock t;
>   CREATE TABLE new_t LIKE (t INCLUDING ALL) USING (zedstore);
>   INSERT INTO new_t SELECT * FROM t;
>   for index; do CREATE INDEX...; done
>   DROP t; RENAME new_t (and all its indices). attach/inherit, etc.
>   commit;
>
>  . Speaking of which, I think LIKE needs a new option for ACCESS METHOD, which
>is otherwise lost.
>From b13dac0ffbe108c45c97095b69218e76bf02799f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 27 Feb 2021 20:40:54 -0600
Subject: [PATCH] ALTER TABLE SET ACCESS METHOD

This does a tuple-level rewrite to the new AM
---
 src/backend/catalog/heap.c  |  1 +
 src/backend/commands/cluster.c  | 17 +++--
 src/backend/commands/matview.c  |  1 +
 src/backend/commands/tablecmds.c| 86 ++---
 src/backend/parser/gram.y   | 10 +++
 src/include/commands/cluster.h  |  2 +-
 src/include/commands/event_trigger.h|  1 +
 src/include/nodes/parsenodes.h  |  1 +
 src/test/regress/expected/create_am.out | 10 +++
 src/test/regress/sql/create_am.sql  |  5 ++
 10 files changed, 119 insertions(+), 15 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 9abc4a1f55..9d5a9240e9 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1117,6 +1117,7 @@ AddNewRelationType(const char *typeName,
  *	reltypeid: OID to assign to rel's rowtype, or InvalidOid to select one
  *	reloftypeid: if a typed table, OID of underlying type; else InvalidOid
  *	ownerid: OID of new rel's owner
+ *	accessmtd: OID of new rel's access method
  *	tupdesc: tuple descriptor (source of column definitions)
  *	cooked_constraints: list of precooked check constraints and defaults
  *	relkind: relkind for new rel
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 096a06f7b3..426a5b83ba 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -577,6 +577,7 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose)
 {
 	Oid			tableOid = RelationGetRelid(OldHeap);
 	Oid			tableSpace = OldHeap->rd_rel->reltablespace;
+	Oid			accessMethod = OldHeap->rd_rel->relam;
 	Oid			OIDNewHeap;
 	char		relpersistence;
 	bool		is_system_catalog;
@@ -598,6 +599,7 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose)
 	/* Create the transient table that will receive the re-ordered data */
 	OIDNewHeap = make_new_heap(tableOid, tableSpace,
 			   relpersistence,
+			   accessMethod,
 			   AccessExclusiveLock);
 
 	/* Copy the heap data into the new table in the desired order */
@@ -619,15 +621,15 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose)
  * Create the transient table that will be filled with new data during
  * CLUSTER, ALTER TABLE, and similar operations.  The transient table
  * duplicates the logical structure of the OldHeap, but is placed in
- * NewTableSpace which might be different from OldHeap's.  Also, it's built
- * with the specified persistence, which might differ from the original's.
+ * NewTableSpace/accessMethod/persistence, which might differ from those
+ * of the OldHeap.
  *
  * After this, the caller should load the new heap with transferred/modified
  * data, then call finish_heap_swap to complete the operation.
  */
 Oid
 make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, char r

Re: [PATCH] refactor ATExec{En,Dis}ableRowSecurity

2021-02-28 Thread Zhihong Yu
Hi,
For 0002-Further-refactoring.patch, should there be an assertion
inside ATExecSetRowSecurity() on the values for rls and force_rls?
There could be 3 possible values: -1, 0 and 1.

Cheers

On Sun, Feb 28, 2021 at 1:19 PM Justin Pryzby  wrote:

> tablecmds.c is 17k lines long, this makes it ~30 lines shorter.
>


[PATCH] refactor ATExec{En,Dis}ableRowSecurity

2021-02-28 Thread Justin Pryzby
tablecmds.c is 17k lines long; this makes it ~30 lines shorter.
>From 2e9500227d45142eb00e9e1ebee001642a834518 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 27 Feb 2021 20:39:10 -0600
Subject: [PATCH 1/2] Refactor ATExec{En,Dis}ableRowSecurity in the style of
 ATExecForceNoForceRowSecurity

commit 088c83363a11200f2225f279d4a5c6cc6f9db3d2
ALTER TABLE .. FORCE ROW LEVEL SECURITY

commit 491c029dbc4206779cf659aa0ff986af7831d2ff
Row-Level Security Policies (RLS)
---
 src/backend/commands/tablecmds.c | 34 +---
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7adeaedd0e..69e3184c86 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -526,8 +526,7 @@ static ObjectAddress ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKM
 static void ATExecDropOf(Relation rel, LOCKMODE lockmode);
 static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode);
 static void ATExecGenericOptions(Relation rel, List *options);
-static void ATExecEnableRowSecurity(Relation rel);
-static void ATExecDisableRowSecurity(Relation rel);
+static void ATExecSetRowSecurity(Relation rel, bool rls);
 static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
 
 static void index_copy_data(Relation rel, RelFileNode newrnode);
@@ -4865,10 +4864,10 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
 			ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) cmd->def, lockmode);
 			break;
 		case AT_EnableRowSecurity:
-			ATExecEnableRowSecurity(rel);
+			ATExecSetRowSecurity(rel, true);
 			break;
 		case AT_DisableRowSecurity:
-			ATExecDisableRowSecurity(rel);
+			ATExecSetRowSecurity(rel, false);
 			break;
 		case AT_ForceRowSecurity:
 			ATExecForceNoForceRowSecurity(rel, true);
@@ -14883,30 +14882,7 @@ ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode
  * ALTER TABLE ENABLE/DISABLE ROW LEVEL SECURITY
  */
 static void
-ATExecEnableRowSecurity(Relation rel)
-{
-	Relation	pg_class;
-	Oid			relid;
-	HeapTuple	tuple;
-
-	relid = RelationGetRelid(rel);
-
-	pg_class = table_open(RelationRelationId, RowExclusiveLock);
-
-	tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
-
-	if (!HeapTupleIsValid(tuple))
-		elog(ERROR, "cache lookup failed for relation %u", relid);
-
-	((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = true;
-	CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
-
-	table_close(pg_class, RowExclusiveLock);
-	heap_freetuple(tuple);
-}
-
-static void
-ATExecDisableRowSecurity(Relation rel)
+ATExecSetRowSecurity(Relation rel, bool rls)
 {
 	Relation	pg_class;
 	Oid			relid;
@@ -14922,7 +14898,7 @@ ATExecDisableRowSecurity(Relation rel)
 	if (!HeapTupleIsValid(tuple))
 		elog(ERROR, "cache lookup failed for relation %u", relid);
 
-	((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = false;
+	((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = rls;
 	CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
 
 	table_close(pg_class, RowExclusiveLock);
-- 
2.17.0

>From bf200702203eebe08011eaa007e93cae46a63ccb Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 27 Feb 2021 20:52:35 -0600
Subject: [PATCH 2/2] Further refactoring

---
 src/backend/commands/tablecmds.c | 46 +---
 1 file changed, 13 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 69e3184c86..2e322e78ae 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -526,8 +526,7 @@ static ObjectAddress ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKM
 static void ATExecDropOf(Relation rel, LOCKMODE lockmode);
 static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode);
 static void ATExecGenericOptions(Relation rel, List *options);
-static void ATExecSetRowSecurity(Relation rel, bool rls);
-static void ATExecForceNoForceRowSecurity(Relation rel, bool force_rls);
+static void ATExecSetRowSecurity(Relation rel, int rls, int force_rls);
 
 static void index_copy_data(Relation rel, RelFileNode newrnode);
 static const char *storage_name(char c);
@@ -4864,16 +4863,16 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
 			ATExecReplicaIdentity(rel, (ReplicaIdentityStmt *) cmd->def, lockmode);
 			break;
 		case AT_EnableRowSecurity:
-			ATExecSetRowSecurity(rel, true);
+			ATExecSetRowSecurity(rel, true, -1);
 			break;
 		case AT_DisableRowSecurity:
-			ATExecSetRowSecurity(rel, false);
+			ATExecSetRowSecurity(rel, false, -1);
 			break;
 		case AT_ForceRowSecurity:
-			ATExecForceNoForceRowSecurity(rel, true);
+			ATExecSetRowSecurity(rel, -1, true);
 			break;
 		case AT_NoForceRowSecurity:
-			ATExecForceNoForceRowSecurity(rel, false);
+			ATExecSetRowSecurity(rel, -1, false);
 			break;
 		case AT_GenericOptions:
 			

Re: proposal: psql –help reflecting service or URI usage

2021-02-28 Thread Paul Förster
Hi Mark,

> On 28. Feb, 2021, at 17:54, Mark Dilger  wrote:
> 
> "definited" is a typo.

yes, definitely a typo, sorry. Thanks for pointing this out.

> Should this say "as defined in pg_service.conf"?  That's the default, but the 
> user might have $PGSERVICEFILE set to something else.  Perhaps you could 
> borrow the wording of other options and use "(default: as defined in 
> pg_service.conf)", or something like that, but of course being careful to 
> still fit in the line length limit.

I agree with all of that, thanks. What is the line length limit?

> Other client applications follow the same pattern as psql, so if this change 
> were adopted, it should apply to all of them.

Well, psql is central and IMHO the best place to start; I'd have to try out all 
of them then. What I do know, though, is that pg_isready does not understand a 
URI (why is that?), which is very unfortunate. So I'd have to try them all out 
and supply patches for them all?

Still, supporting a feature and not documenting it in its help is IMHO not a 
good idea.

> Your proposal seems like something that would have been posted to the list 
> before, possibly multiple times.  Any chance you could dig up past 
> conversations on this subject and post links here for context?

I don't know of any past discussions here. I only subscribed to pgsql-hackers 
today. It might have been mentioned on pgsql-general, though, but I'm not 
sure. This idea popped into my mind just yesterday when I was playing around 
with psql and URIs and noticed that psql --help doesn't show them.

Cheers,
Paul



Re: proposal: psql –help reflecting service or URI usage

2021-02-28 Thread Mark Dilger



> On Feb 28, 2021, at 1:57 AM, Paul Förster  wrote:
> 
> Hi,
> 
> I'd like to propose a patch to psql --help output:
> 
> Currently it is:
> 
> Usage:
>  psql [OPTION]... [DBNAME [USERNAME]]
> 
> ...
> 
> Connection options:
>  -h, --host=HOSTNAME  database server host or socket directory (default: 
> "local socket")
>  -p, --port=PORT  database server port (default: "5432")
>  -U, --username=USERNAME  database user name (default: "paul")
>  -w, --no-passwordnever prompt for password
>  -W, --password   force password prompt (should happen automatically)
> 
> I'd like to change it to the following to reflect the psql ability to process 
> a service name or a PostgreSQL URI:
> 
> Usage:
>  psql [OPTION]... [DBNAME [USERNAME]|service|uri]
> 
> ...
> 
> Connection options:
>  -h, --host=HOSTNAME  database server host or socket directory (default: 
> "local socket")
>  -p, --port=PORT  database server port (default: "5432")
>  -U, --username=USERNAME  database user name (default: "paul")
>  -w, --no-passwordnever prompt for password
>  -W, --password   force password prompt (should happen automatically)
>  service=name service name as definited in pg_service.conf

"definited" is a typo.

Should this say "as defined in pg_service.conf"?  That's the default, but the 
user might have $PGSERVICEFILE set to something else.  Perhaps you could borrow 
the wording of other options and use "(default: as defined in 
pg_service.conf)", or something like that, but of course being careful to still 
fit in the line length limit.
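
Put together, the line might read something like this (wording only; I haven't
checked it against the column widths used by the other options):

  service=NAME service name (default: as defined in pg_service.conf)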

>  uri  connection URI (postgresql://...)
> 
> ...
> 
> Attached is a patch for src/bin/psql/help.c for this. The file 
> doc/src/sgml/ref/psql-ref.sgml does not seem to need any changes for this.
> 
> Any thoughts on this?

Other client applications follow the same pattern as psql, so if this change 
were adopted, it should apply to all of them.

Your proposal seems like something that would have been posted to the list 
before, possibly multiple times.  Any chance you could dig up past 
conversations on this subject and post links here for context?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: regexp_positions()

2021-02-28 Thread Joel Jacobson
I had a bug in the function, and I see I also accidentally renamed it to 
regexp_ranges().

Attached is a fixed version of the PoC.

This function is useful e.g. when we're interested in patterns in meta-data,
where we're not actually finding patterns in the data itself,
but in a string where each character corresponds to an element
in an array containing the actual data.

In such cases, we need to know the positions of the matches,
since they tell us which corresponding array elements matched.

For instance, let's take the UNIX diff tool we all know as an example.

Let's say you have all the raw diff lines stored in a text[] array,
and we want to produce a unified diff, by finding hunks
with up to 3 unchanged lines before/after each hunk
containing changes.

Say we produce a text string containing one character per diff line,
using "=" for unchanged, "+" for addition, and "-" for deletion.

Example: =-===+=-+==

We could then find the hunks using this regex:

(={0,3}[-+]+={0,3})+

using regexp_positions() to find the start and end positions for each hunk:

SELECT * FROM 
regexp_positions('=-===+=-+==','(={0,3}[-+]+={0,3})+');
 start_pos | end_pos
-----------+---------
         3 |       9
        11 |      24
(2 rows)

/Joel

regexp_positions.sql
Description: Binary data


proposal: psql –help reflecting service or URI usage

2021-02-28 Thread Paul Förster
Hi,

I'd like to propose a patch to psql --help output:

Currently it is:

Usage:
  psql [OPTION]... [DBNAME [USERNAME]]

...

Connection options:
  -h, --host=HOSTNAME  database server host or socket directory (default: 
"local socket")
  -p, --port=PORT  database server port (default: "5432")
  -U, --username=USERNAME  database user name (default: "paul")
  -w, --no-passwordnever prompt for password
  -W, --password   force password prompt (should happen automatically)

I'd like to change it to the following to reflect psql's ability to process a 
service name or a PostgreSQL URI:

Usage:
  psql [OPTION]... [DBNAME [USERNAME]|service|uri]

...

Connection options:
  -h, --host=HOSTNAME  database server host or socket directory (default: 
"local socket")
  -p, --port=PORT  database server port (default: "5432")
  -U, --username=USERNAME  database user name (default: "paul")
  -w, --no-passwordnever prompt for password
  -W, --password   force password prompt (should happen automatically)
  service=name service name as definited in pg_service.conf
  uri  connection URI (postgresql://...)

...
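
For context, the two added lines refer to invocations like these (service name
and URI are made-up examples; the service entry would live in pg_service.conf,
or wherever $PGSERVICEFILE points):

  psql "service=mydb"
  psql "postgresql://user@dbhost:5432/mydb"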

Attached is a patch for src/bin/psql/help.c. The file 
doc/src/sgml/ref/psql-ref.sgml does not seem to need any changes for this.

Any thoughts on this?



psql-help.c.patch
Description: Binary data


Cheers,
Paul

Re: Side effect of remove_useless_groupby_columns

2021-02-28 Thread David Rowley
On Sun, 28 Feb 2021 at 20:52, Richard Guo  wrote:
> When looking at [1], I realized we may have a side effect when removing
> redundant columns in the GROUP BY clause. Suppose we have a query with
> ORDER BY 'b', and meanwhile column 'b' is also a group key. If we decide
> that 'b' is redundant due to being functionally dependent on other GROUP
> BY columns, we would remove it from group keys. This will make us lose
> the opportunity to leverage the index on 'b'.

> Any thoughts?

I had initially thought that this was a non-issue before incremental
sort was added, but the following case highlights that's wrong.

# create table t (a int not null, b int);
# insert into t select i, i%1000 from generate_series(1,100)i;
# create index on t(b,a);

# explain (analyze) select b from t group by a, b order by b,a limit 10;
# alter table t add constraint t_pkey primary key (a);
# explain (analyze) select b from t group by a, b order by b,a limit 10;

Execution slows down after adding the primary key since that allows
the functional dependency to be detected and the  "b" column removed
from the GROUP BY clause.

Perhaps we could do something like add:

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 545b56bcaf..7e52212a13 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3182,7 +3182,8 @@ remove_useless_groupby_columns(PlannerInfo *root)
 			if (!IsA(var, Var) ||
 				var->varlevelsup > 0 ||
 				!bms_is_member(var->varattno - FirstLowInvalidHeapAttributeNumber,
-							   surplusvars[var->varno]))
+							   surplusvars[var->varno]) ||
+				list_member(parse->sortClause, sgc))
 				new_groupby = lappend(new_groupby, sgc);
 		}

to remove_useless_groupby_columns().  It's not exactly perfect though
since it's still good to remove the useless GROUP BY columns if
there's no useful index to implement the ORDER BY.  We just don't know
that when we call remove_useless_groupby_columns().

FWIW, these also don't really seem to be all that great examples since
it's pretty useless to put the primary key columns in the GROUP BY
unless there's some join that's going to duplicate those columns.
Having the planner remove the GROUP BY completely in that case would
be nice.  That's a topic of discussion in the UniqueKeys patch thread.
However, it is possible to add a join with a join condition matching
the ORDER BY and have the same problem when a join type that preserves
the outer path's order is picked.

David

> [1] 
> https://www.postgresql.org/message-id/flat/16869-26346b77d6ccaeec%40postgresql.org