Re: [HACKERS] proposal: schema variables

2019-03-25 Thread Pavel Stehule
On Mon, Mar 25, 2019 at 20:40, Erik Rijkers  wrote:

> On 2019-03-24 10:32, Pavel Stehule wrote:
> > On Sun, Mar 24, 2019 at 10:25, Erik Rijkers  wrote:
> >
> >> On 2019-03-24 06:57, Pavel Stehule wrote:
> >> > Hi
> >> >
> >> > rebase against current master
> >>
> >> I ran into this:
> >>
> >> (schema 'varschema2' does not exist):
> >>
> >> drop variable varschema2.testv cascade;
> >> ERROR:  schema "varschema2" does not exist
> >> create variable if not exists testv as text;
> >> server closed the connection unexpectedly
> >> This probably means the server terminated abnormally
> >> before or while processing the request.
> >> connection to server was lost
> >>
> >>
> >> (both statements are needed to force the crash)
> >>
> >
> > I cannot reproduce it.
> >  [backtrace and stuff]
>
> Sorry, I don't have the wherewithal to get more info but I have repeated
> this now on 4 different machines (debian jessie/stretch; centos).
>
> I did notice that sometimes those two offending lines
> "
>drop variable varschema2.testv cascade;
>create variable if not exists testv as text;
> "
> have to be repeated a few times (never more than 4 or 5 times) before
> the crash occurs (signal 11: Segmentation fault).
>

Should be fixed now.

Thank you for the report.

Pavel


>
> Erik Rijkers
>
>
>


Re: [HACKERS] proposal: schema variables

2019-03-25 Thread Pavel Stehule
Hi

On Sun, Mar 24, 2019 at 6:57, Pavel Stehule 
wrote:

> Hi
>
> rebase against current master
>


Fixed an issue with IF NOT EXISTS and the related regression tests.

Regards

Pavel


> Regards
>
> Pavel
>


schema-variables-20190326.patch.gz
Description: application/gzip


RE: Timeout parameters

2019-03-25 Thread Nagaura, Ryohei
Hi, kirk-san.

> From: Jamison, Kirk
> Ok. So I'd only take a look at TCP_USER_TIMEOUT parameter for now (this
> CommitFest), and maybe we could resume the discussion on socket_timeout
> in the future.
Yes, please.

> Your patch applies; however, in the TCP_backend_v10 patch, your
> documentation is missing a closing tag, so it could not be tested.
> When that's fixed, it passes make check.
Oops! Fixed.

> what's the reason to emphasize that Windows is not supported in a separate
> paragraph?
I thought it needed a warning because there are some Windows users.
But I came to think that even that is unnecessary.
Thus the former doc you suggested seems better, and I made minor fixes to
my doc to match yours.


> (Note: I haven't checked which Linux versions are supported; I got it
> from your previous patch version.)
FYI, [1]

[1] http://man7.org/linux/man-pages/man7/tcp.7.html
Best regards,
-
Ryohei Nagaura



socket_timeout_v9.patch
Description: socket_timeout_v9.patch


TCP_backend_v11.patch
Description: TCP_backend_v11.patch


TCP_interface_v11.patch
Description: TCP_interface_v11.patch


Re: Refactoring the checkpointer's fsync request queue

2019-03-25 Thread Thomas Munro
On Wed, Mar 13, 2019 at 2:27 PM Thomas Munro  wrote:
> [...] Aside from some refactoring which I
> think looks good anyway and prepares for future patches, the main
> effect of this patch is to force the checkpointer to open and close
> the files every time which seems OK to me.

I've been trying to decide if that is a problem.  Maybe there is a
performance angle, and I'm also wondering if it might increase the
risk of missing a write-back error.  Of course we'll find a proper
solution to that problem (perhaps fd-passing or sync-on-close[1]), but
I don't want to commit anything in the name of refactoring that might
make matters worse incidentally.  Or perhaps those worries are bogus,
since the checkpointer calls smgrcloseall() at the end anyway.

On balance, I'm inclined to err on the side of caution and try to keep
things a bit closer to the way they are for now.

Here's a fixup patch.  0001 is the same as Shawn's v12 patch, and 0002
has my proposed changes to switch to callbacks that actually perform
the sync and unlink operations given a file tag, and do so via the
SMGR fd cache, rather than exposing the path to sync.c.  This moves us
back towards the way I had it in an earlier version of the patch, but
instead of using smgrsyncimmed() as I had it, it goes via Shawn's new
sync handler function lookup table, allowing for non-smgr components
to use this machinery in future (as requested by Andres).
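
To make that interface concrete, the callback-table idea looks roughly
like this (a sketch only; the type and function names here are
placeholders, not the actual definitions from the patches):

/* Sketch: a file tag identifies what to sync, without exposing a path. */
typedef struct FileTag
{
    int16       handler;        /* index into the sync handler table */
    RelFileNode rnode;          /* relation identity */
    ForkNumber  forknum;
    BlockNumber segno;          /* segment number within the fork */
} FileTag;

typedef struct SyncOps
{
    /* fsync the file named by "ftag", via the component's own fd cache;
     * "path" is filled in for error reporting */
    int         (*sync_filetag) (const FileTag *ftag, char *path);
    /* unlink the file named by "ftag" */
    int         (*unlink_filetag) (const FileTag *ftag, char *path);
} SyncOps;

/* one entry per component that can enqueue requests: smgr now, undo later */
static const SyncOps syncsw[] = {
    {smgr_sync_filetag, smgr_unlink_filetag},   /* hypothetical smgr entry */
};

The checkpointer would then dispatch on ftag->handler instead of having
paths exposed to sync.c.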

Thoughts?

It all needs to be pgindented, I'll do that later.  I'll post a rebase
of my undo stuff on top of this soon, to show how it looks with this
interface.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGLMPXMnSLDwgnNRFPyxvf_0bJ5HwXcHWjCp7Cvh7G%3DxEA%40mail.gmail.com

-- 
Thomas Munro
https://enterprisedb.com


0001-Refactor-the-fsync-mechanism-to-support-future-S-v13.patch
Description: Binary data


0002-fixup-v13.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-03-25 Thread Haribabu Kommi
On Tue, Mar 26, 2019 at 1:27 PM Michael Paquier  wrote:

> On Sun, Mar 24, 2019 at 10:30:47PM +1100, Haribabu Kommi wrote:
> > With the above additional options, pg_basebackup is able to control
> > the access permissions of the backup files. But in tar mode, all the
> > files are sent from the server and stored as-is in the backup; to
> > support group access control in tar mode, the BASE_BACKUP protocol is
> > enhanced with a new option, GROUP_MODE 'none' or GROUP_MODE 'group',
> > to control the file permissions before they are sent to the backup.
> > Sending GROUP_MODE to the server depends on the -g option passed to
> > the pg_basebackup utility.
>
>
Thanks for the review.


> Do we really want to extend the replication protocol to control that?
>

Since the backup data is sent in tar format, and pg_basebackup may also
store it in tar format, I feel that changing the permissions at tar
creation time is easier than regenerating the received tar with different
permissions on the pg_basebackup side.

For formats other than tar, a change in pg_basebackup alone can support
independent group access permissions for the standby directory.
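
To illustrate the tar-mode idea, the permission change at tar creation
time could look roughly like this (a sketch only; the function and flag
names are hypothetical, not the actual patch):

#include <stdbool.h>
#include <sys/stat.h>

/*
 * Sketch: adjust a file's mode before it is written into a tar member
 * header.  With GROUP_MODE 'none' only the owner bits survive; with
 * GROUP_MODE 'group' the owner's read (and exec, for directories) bits
 * are mirrored to the group, without ever granting group write.
 */
static mode_t
adjust_tar_mode(mode_t mode, bool group_mode_group)
{
    mode &= ~(S_IRWXG | S_IRWXO);   /* strip all group/other bits */
    if (group_mode_group)
        mode |= ((mode & S_IRWXU) >> 3) & (S_IRGRP | S_IXGRP);
    return mode;
}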

> I am really questioning if we should keep this stuff isolated within
> pg_basebackup or not.  At the same time, it may be confusing to have
> BASE_BACKUP only use the permissions inherited from the data
> directory, so some input from folks maintaining an external backup
> tool is welcome.
>

It would be good to hear what other external backup tool authors
think of this change.

Regards,
Haribabu Kommi
Fujitsu Australia


RE: Timeout parameters

2019-03-25 Thread Jamison, Kirk
Hi Nagaura-san,

On Monday, March 25, 2019 2:26 PM (GMT+9), Ryohei Nagaura wrote:

>Yes, I want to commit the TCP_USER_TIMEOUT patches in PG12.
>Also I'd like to continue discussing socket_timeout after this CF.
Ok. So I'd only take a look at the TCP_USER_TIMEOUT parameter for now (this
CommitFest), and maybe we could resume the discussion on socket_timeout in
the future.

> About TCP_USER_TIMEOUT:
> 1) Documentation and checking on UNIX systems: As far as I surveyed
> (Solaris and the BSD family), these UNIX OSes don't have a
> "TCP_USER_TIMEOUT" parameter.
> Accordingly, I have not checked on a real machine.
> Also, I modified the documentation to remove "equivalent to socket option"

Your patch applies; however, in the TCP_backend_v10 patch, your
documentation is missing a closing tag, so it could not be tested.
When that's fixed, it passes make check.

Also, regarding the docs,
what's the reason to emphasize that Windows is not supported in a separate
paragraph? How about simplifying and combining the two paragraphs as below?

Specifies the number of milliseconds after which a TCP connection can be
aborted by the operating system due to network problems when sending or
receiving data through this connection. A value of zero uses the system default.
For connections made via a Unix-domain socket, this parameter is ignored. This
parameter is supported only on systems that support 
TCP_USER_TIMEOUT;
on other systems such as Windows, it has no effect and must be zero.

Or, if you really want to emphasize which systems are supported and which
are not, you may remove the last sentence above, then indicate that in the
note tag.
Example:
This parameter is supported only on systems that support 
TCP_USER_TIMEOUT such as Linux version 2.6.37 or later;
on other systems such as Windows, it has no effect and must be zero.

(Note: I haven't checked which Linux versions are supported;
I got it from your previous patch version.)
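
For reference, on Linux the parameter boils down to a setsockopt() call;
here is a minimal sketch of how either side could apply it (error
handling elided; this is not the actual patch code):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Sketch: apply a TCP user timeout (in milliseconds) to a connected
 * socket.  TCP_USER_TIMEOUT is only defined on Linux 2.6.37 or later,
 * which is why other systems must only accept the value zero.
 */
static int
set_tcp_user_timeout(int sockfd, unsigned int timeout_ms)
{
#ifdef TCP_USER_TIMEOUT
    return setsockopt(sockfd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      (const void *) &timeout_ms, sizeof(timeout_ms));
#else
    return 0;                   /* nothing to do; parameter must be zero */
#endif
}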

Regards,
Kirk Jamison


Re: psql display of foreign keys

2019-03-25 Thread Alvaro Herrera
Patch tester didn't like that one bit.  Here's v10 with the fixup
applied.

-- 
Álvaro Herrera            https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 4e352a1b3a0730f2c3e91e6d62335d6db6823796 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 22 Mar 2019 19:13:16 -0300
Subject: [PATCH v10] fix psql display of FKs 

---
 src/bin/psql/describe.c   | 156 --
 src/test/regress/expected/foreign_key.out |  38 --
 src/test/regress/sql/foreign_key.sql  |   2 +
 3 files changed, 145 insertions(+), 51 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8ebee8cd3..91f2306734e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -1479,6 +1479,7 @@ describeOneTableDetails(const char *schemaname,
 		bool		rowsecurity;
 		bool		forcerowsecurity;
 		bool		hasoids;
+		bool		ispartition;
 		Oid			tablespace;
 		char	   *reloptions;
 		char	   *reloftype;
@@ -1502,7 +1503,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, "
-		  "false AS relhasoids, %s, c.reltablespace, "
+		  "false AS relhasoids, c.relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence, c.relreplident, am.amname\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1515,12 +1516,29 @@ describeOneTableDetails(const char *schemaname,
 		   : "''"),
 		  oid);
 	}
+	else if (pset.sversion >= 10)
+	{
+		printfPQExpBuffer(&buf,
+		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
+		  "c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, "
+		  "c.relhasoids, c.relispartition, %s, c.reltablespace, "
+		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
+		  "c.relpersistence, c.relreplident\n"
+		  "FROM pg_catalog.pg_class c\n "
+		  "LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid)\n"
+		  "WHERE c.oid = '%s';",
+		  (verbose ?
+		   "pg_catalog.array_to_string(c.reloptions || "
+		   "array(select 'toast.' || x from pg_catalog.unnest(tc.reloptions) x), ', ')\n"
+		   : "''"),
+		  oid);
+	}
 	else if (pset.sversion >= 90500)
 	{
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, "
-		  "c.relhasoids, %s, c.reltablespace, "
+		  "c.relhasoids, false as relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence, c.relreplident\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1537,7 +1555,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, false, false, c.relhasoids, "
-		  "%s, c.reltablespace, "
+		  "false as relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence, c.relreplident\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1554,7 +1572,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, false, false, c.relhasoids, "
-		  "%s, c.reltablespace, "
+		  "false as relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1571,7 +1589,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, false, false, c.relhasoids, "
-		  "%s, c.reltablespace, "
+		  "false as relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END\n"
 		  "FROM pg_catalog.pg_class c\n "
 		  "LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid)\n"
@@ -1587,7 +1605,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, false, false, c.relhasoids, "
-		  "%s, c.reltablespace\n"
+		  "false as relispartition, %s, c.reltablespace\n"
 		  "FROM pg_catalog.pg_class c\n "
 		  "LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid)\n"
 		  "WHERE c.oid = '%s';",
@@ -1602,7 +1620,7 

Re: monitoring CREATE INDEX [CONCURRENTLY]

2019-03-25 Thread Alvaro Herrera
Hi

On 2019-Mar-25, Andres Freund wrote:

> I've not followed this thread at all, so I might just be out of my depth
> here. From my POV, a later patch in the yet-to-be-applied patchqueue
> moves the main part of cluster below the table AM (as there are enough
> low-level details, e.g. dealing with HOT). Therefore I don't have a problem
> having heap's implementation interrogate the scan in a heap-specific
> manner.
> 
> Is that the angle you were wondering about? If not, any chance to point
> out more precisely what to look at?
>
> Obviously out of pure laziness, I'd prefer this to go in after my move
> of index creation scans & cluster below tableam.h. But admittedly,
> managing my exhaustion isn't the sole goal of the project.

Well, this is create index rather than cluster, but yes this conflicts
pretty heavily with patch 0008 in your v21 at
20190324031630.nt7numguo5ojq...@alap3.anarazel.de.  I wonder if I should
rather push and help merge your 0008, or wait until you push and deal
with it afterwards.  I'd rather do the former, I think.

Anyway I was thinking about the conceptual angle -- the progress
monitoring stuff is doing block-based reporting.  I think even if we get
a non-block-based heap, we can still report the number of physical
blocks already processed by the scan, which is what the index build
monitoring is interested in showing the user.
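
For what it's worth, the arithmetic stays simple even when a
synchronized scan starts mid-table; something like this sketch (the
function is illustrative, not necessarily what heapscan_get_blocks_done
does; BlockNumber is PostgreSQL's uint32):

/*
 * Sketch: blocks processed so far by a scan that started at startblock
 * and wraps around at nblocks (the relation size when the scan began).
 */
static BlockNumber
scan_blocks_done(BlockNumber startblock, BlockNumber current,
                 BlockNumber nblocks)
{
    if (current >= startblock)
        return current - startblock + 1;
    else
        return (nblocks - startblock) + current + 1;
}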

-- 
Álvaro Herrera            https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: REINDEX CONCURRENTLY 2.0

2019-03-25 Thread Michael Paquier
On Mon, Mar 25, 2019 at 04:23:34PM +0100, Peter Eisentraut wrote:
> Let's do it. :-)

I am pretty sure that this has been said at least once since 2012.

> I've gone over this patch a few more times.  I've read all the
> discussion since 2012 again and made sure all the issues were addressed.
>  I made particularly sure that during the refactoring nothing in CREATE
> INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY was inadvertently
> changed.  I checked all the steps again.  I'm happy with it.

From my side, I would be happy to look at this patch.  Unfortunately, I
won't have time to look at it this week, I think :(

But if you are happy with it that's fine by me, at least I can fix
anything which is broken :)
--
Michael


signature.asc
Description: PGP signature


Re: Special role for subscriptions

2019-03-25 Thread Michael Paquier
On Sat, Mar 23, 2019 at 10:52:52AM -0400, Stephen Frost wrote:
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> I agree the feature is important, it just does not seem like the patch
>> is RFC and given security implications I err on the side of safety here.
> 
> Agreed.

Based on the latest exchanges, I am marking the patch as returned with
feedback.
--
Michael


signature.asc
Description: PGP signature


Re: pg_basebackup ignores the existing data directory permissions

2019-03-25 Thread Michael Paquier
On Sun, Mar 24, 2019 at 10:30:47PM +1100, Haribabu Kommi wrote:
> With the above additional options, pg_basebackup is able to control
> the access permissions of the backup files. But in tar mode, all the
> files are sent from the server and stored as-is in the backup; to
> support group access control in tar mode, the BASE_BACKUP protocol is
> enhanced with a new option, GROUP_MODE 'none' or GROUP_MODE 'group',
> to control the file permissions before they are sent to the backup.
> Sending GROUP_MODE to the server depends on the -g option passed to
> the pg_basebackup utility.

Do we really want to extend the replication protocol to control that?
I am really questioning if we should keep this stuff isolated within
pg_basebackup or not.  At the same time, it may be confusing to have
BASE_BACKUP only use the permissions inherited from the data
directory, so some input from folks maintaining an external backup
tool is welcome.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-25 Thread Michael Paquier
On Mon, Mar 25, 2019 at 05:57:50PM -0400, Tom Lane wrote:
> In short, I'm not convinced that most of this patch is an improvement
> on the status quo.  I think we'd be best off to just take the idea
> of explicitly mentioning adding --jobs to a manual run, ie roughly

Yeah, no objections from here to keeping that stuff simple; the simpler
the better.  So I am on board with your suggestion.
--
Michael


signature.asc
Description: PGP signature


Re: monitoring CREATE INDEX [CONCURRENTLY]

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-25 23:11:00 -0300, Alvaro Herrera wrote:
> On 2019-Mar-25, Alvaro Herrera wrote:
> 
> > Here's v6 of this patch.  I have rebased on top of today's CLUSTER
> > monitoring, as well as on table AM commits.  The latter caused a bit of
> > trouble, as now the number of blocks processed by a scan is not as easy
> > to get as before; I added a new entry point heapscan_get_blocks_done on
> > heapam.c to help with that.  (I suppose this will need some fixups later
> > on.)
> 
> Andres, I suppose you have something to say about patch 0001 here?

I've not followed this thread at all, so I might just be out of my depth
here. From my POV, a later patch in the yet-to-be-applied patchqueue
moves the main part of cluster below the table AM (as there are enough
low-level details, e.g. dealing with HOT). Therefore I don't have a problem
having heap's implementation interrogate the scan in a heap-specific
manner.

Is that the angle you were wondering about? If not, any chance to point
out more precisely what to look at?

Obviously out of pure laziness, I'd prefer this to go in after my move
of index creation scans & cluster below tableam.h. But admittedly,
managing my exhaustion isn't the sole goal of the project.

Greetings,

Andres Freund



Re: [HACKERS] CLUSTER command progress monitor

2019-03-25 Thread Michael Paquier
On Tue, Mar 26, 2019 at 10:04:48AM +0900, Tatsuro Yamada wrote:
> Hope this feature will help DBA and user. :)

Congrats, Yamada-san.
--
Michael


signature.asc
Description: PGP signature


Re: Ordered Partitioned Table Scans

2019-03-25 Thread David Rowley
On Tue, 26 Mar 2019 at 09:02, Julien Rouhaud  wrote:
> FTR this patch doesn't apply since single child [Merge]Append
> suppression (8edd0e7946) has been pushed.

Thanks for letting me know.  I've attached v14 based on current master.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


mergeappend_to_append_conversion_v14.patch
Description: Binary data


Re: monitoring CREATE INDEX [CONCURRENTLY]

2019-03-25 Thread Alvaro Herrera
On 2019-Mar-25, Alvaro Herrera wrote:

> Here's v6 of this patch.  I have rebased on top of today's CLUSTER
> monitoring, as well as on table AM commits.  The latter caused a bit of
> trouble, as now the number of blocks processed by a scan is not as easy
> to get as before; I added a new entry point heapscan_get_blocks_done on
> heapam.c to help with that.  (I suppose this will need some fixups later
> on.)

Andres, I suppose you have something to say about patch 0001 here?

-- 
Álvaro Herrera            https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Misleading errors with column references in default expressions and partition bounds

2019-03-25 Thread Michael Paquier
Hi all,

I have just committed a fix for a crash with the handling of partition
bounds using column references which has been discussed here:
https://www.postgresql.org/message-id/15668-0377b1981aa1a...@postgresql.org

And while discussing on the matter with Amit, the point has been
raised that default expressions with column references can lead to
some funny error messages with the context.  For example, take that
with an undefined column:
=# create table foo (a int default (a.a));
ERROR:  42P01: missing FROM-clause entry for table "a"
LINE 1: create table foo (a int default (a.a));

This confusion is old, I think, and reproduces down to 9.4 and older.
If a reference to the column's own table is used directly, then things
behave correctly:
=# create table foo (a int default (foo.a));
ERROR:  42P10: cannot use column references in default expression
LOCATION:  cookDefault, heap.c:2948
=# create table foo (a int default (a));
ERROR:  42P10: cannot use column references in default expression
LOCATION:  cookDefault, heap.c:2948

We have the same problem for partition bounds actually, which is new
as of v12, as partition bound expressions now use the common expression
machinery for transformation:
=# CREATE TABLE list_parted (a int) PARTITION BY LIST (a);
CREATE TABLE
=# CREATE TABLE part_list_crash PARTITION OF list_parted
 FOR VALUES IN (somename.somename);
ERROR:  42P01: missing FROM-clause entry for table "somename"
LINE 2:   FOR VALUES IN (somename.somename)

One idea which came from Amit, and which seems to me a good one, would
be to have more context-specific error messages directly in
transformColumnRef(), so that we can discard at an early stage column
references that appear in contexts where it is meaningless to have
them.  The inconsistent part in HEAD is that cookDefault() and
transformPartitionBoundValue() already discard column references, so
if we move those checks to the transformation phase we can simplify the
error handling post-transformation.  This would make the whole thing
more consistent.
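
A rough sketch of what such an early check inside transformColumnRef()
could look like (placement and message wording are mine, not final):

/*
 * Sketch: fragment for transformColumnRef(), rejecting column
 * references in expression kinds where they can never be meaningful.
 */
switch (pstate->p_expr_kind)
{
    case EXPR_KIND_COLUMN_DEFAULT:
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
                 errmsg("cannot use column reference in default expression"),
                 parser_errposition(pstate, cref->location)));
        break;
    case EXPR_KIND_PARTITION_BOUND:
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
                 errmsg("cannot use column reference in partition bound expression"),
                 parser_errposition(pstate, cref->location)));
        break;
    default:
        break;
}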

While this takes care of the RTE issues, it has a downside though.
Take for example this case using an expression with an aggregate and
a column reference: 
=# CREATE TABLE part_bogus_expr_fail PARTITION OF list_parted
   FOR VALUES IN (sum(a));
-ERROR:  aggregate functions are not allowed in partition bound
+ERROR:  cannot use column reference in partition bound expression

So this would mean that we would first complain about the innermost
parts of the expression, which is actually more intuitive in my
opinion.  The difference can be seen using the patch attached for
partition bounds, as I have added more test coverage with a previous
commit.  We also don't have many tests in the code for default
expression patterns, so I have added some.

The docs of CREATE TABLE also look incorrect to me when it comes to
default expressions.  It says the following: "other columns in the
current table are not allowed".  However, *no* columns are allowed at
all, including the very column the expression is attached to.

Thoughts?
--
Michael
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index e94fe2c3b6..5fe16aa6f6 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -784,7 +784,7 @@ WITH ( MODULUS numeric_literal, REM
   The DEFAULT clause assigns a default data value for
   the column whose column definition it appears within.  The value
   is any variable-free expression (subqueries and cross-references
-  to other columns in the current table are not allowed).  The
+  to any columns in the current table are not allowed).  The
   data type of the default expression must match the data type of the
   column.
  
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index c7b5ff62f9..fc682e0b52 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -2939,19 +2939,11 @@ cookDefault(ParseState *pstate,
 	expr = transformExpr(pstate, raw_default, EXPR_KIND_COLUMN_DEFAULT);
 
 	/*
-	 * Make sure default expr does not refer to any vars (we need this check
-	 * since the pstate includes the target table).
-	 */
-	if (contain_var_clause(expr))
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
- errmsg("cannot use column references in default expression")));
-
-	/*
-	 * transformExpr() should have already rejected subqueries, aggregates,
-	 * window functions, and SRFs, based on the EXPR_KIND_ for a default
-	 * expression.
+	 * transformExpr() should have already rejected column references,
+	 * subqueries, aggregates, window functions, and SRFs, based on the
+	 * EXPR_KIND_ for a default expression.
 	 */
+	Assert(!contain_var_clause(expr));
 
 	/*
 	 * Coerce the expression to the correct type and typmod, if given. This
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index e559353529..3e648dc8ef 100644
--- a/src/backend/parser/parse_expr.c
+++ 

Re: monitoring CREATE INDEX [CONCURRENTLY]

2019-03-25 Thread Amit Langote
On 2019/03/26 1:53, Alvaro Herrera wrote:
> Here's v6 of this patch.  I have rebased on top of today's CLUSTER
> monitoring, as well as on table AM commits.  The latter caused a bit of
> trouble, as now the number of blocks processed by a scan is not as easy
> to get as before; I added a new entry point heapscan_get_blocks_done on
> heapam.c to help with that.  (I suppose this will need some fixups later
> on.)
> 
> I removed the "M of N" phase labels that Robert didn't like; those were
> suggested by Rahila and upvoted by Amit L.  I'm of two minds about
> those.  If you care about those and want them back, please speak up.

On second thought, I'm neutral on it too.

Thanks,
Amit




Re: psql display of foreign keys

2019-03-25 Thread Amit Langote
Hi,

On 2019/03/26 7:38, Alvaro Herrera wrote:
> v9 attached; this one's final AFAICT.

Agreed.

Thanks,
Amit




Re: psql display of foreign keys

2019-03-25 Thread Alvaro Herrera
On 2019-Mar-25, Alvaro Herrera wrote:

> v9 attached; this one's final AFAICT.

Actually, I propose this fixup.  It doesn't change the current output,
but of course it affects how this works with my patch in
https://postgr.es/m/20190321215420.GA22766@alvherre.pgsql
The v9 patch does not show anything for the partitions of the referenced
partitioned table; with this one it shows like this

-- verify psql behaves sanely
\d droppk
        Partitioned table "regress_fk.droppk"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
Partition key: RANGE (a)
Indexes:
    "droppk_pkey" PRIMARY KEY, btree (a)
Referenced by:
    TABLE "dropfk" CONSTRAINT "dropfk_a_fkey" FOREIGN KEY (a) REFERENCES droppk(a)
Number of partitions: 3 (Use \d+ to list them.)

\d droppk21
         Table "regress_fk.droppk21"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
Partition of: droppk2 FOR VALUES FROM (1000) TO (1400)
Indexes:
    "droppk21_pkey" PRIMARY KEY, btree (a)
Referenced by:
    TABLE "dropfk" CONSTRAINT "dropfk_a_fkey" FOREIGN KEY (a) REFERENCES droppk(a)

-- 
Álvaro Herrera            https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 12642d9ff66a2661d4f707cb15b0c3de6a9e1d4e
Author: Alvaro Herrera 
AuthorDate: Mon Mar 25 22:50:59 2019 -0300
CommitDate: Mon Mar 25 22:51:15 2019 -0300

fixup

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fc404bbaab8..91f2306734e 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2489,11 +2489,12 @@ describeOneTableDetails(const char *schemaname,
 printfPQExpBuffer(&buf,
   "SELECT conname, conrelid::pg_catalog.regclass AS ontable,\n"
   "   pg_catalog.pg_get_constraintdef(oid, true) AS condef\n"
-  "  FROM pg_catalog.pg_constraint c LEFT JOIN \n"
-  "   pg_catalog.pg_partition_ancestors(c.confrelid) ON (true)\n"
-  " WHERE confrelid = '%s' AND contype = 'f' AND conparentid = 0\n"
+  "  FROM pg_catalog.pg_constraint c\n"
+  " WHERE confrelid IN (SELECT pg_catalog.pg_partition_ancestors('%s')\n"
+  " UNION ALL VALUES (regclass '%s'))\n"
+  "   AND contype = 'f' AND conparentid = 0\n"
   "ORDER BY conname;",
-  oid);
+  oid, oid);
 			}
 			else
 			{


Re: BUG #15668: Server crash in transformPartitionRangeBounds

2019-03-25 Thread Amit Langote
Hi,

On 2019/03/26 10:15, Michael Paquier wrote:
> Done as you suggested, with a minimal set enough to trigger the crash,
> still the error message is rather misleading as you would expect :)

Thanks for committing.

>> A separate thread will definitely attract more attention, at least in due
>> time. :)
> 
> Sure.  For now I have committed a lighter version of 0001, with
> tweaked comments based on your suggestion, as well as a minimum set of
> test cases.  I have added on the way some tests for range partitions
> which have been missing from the start, and improved the existing set
> by removing the original "a.a" references, and switching to use
> max(date) for range partitions to bump correctly on the aggregate
> error.  I am just updating the second patch now and I'll begin a new
> thread soon.

Thanks.

Regards,
Amit




Re: [HACKERS] Block level parallel vacuum

2019-03-25 Thread Haribabu Kommi
On Fri, Mar 22, 2019 at 4:06 PM Masahiko Sawada 
wrote:

>
> Attached is the updated version patch. The 0001 patch allows all existing
> vacuum options a boolean argument. The 0002 patch introduces parallel
> lazy vacuum. The 0003 patch adds a -P (--parallel) option to the vacuumdb
> command.
>

Thanks for sharing the updated patches.

0001 patch:

+PARALLEL [ N ]

But this patch contains the syntax for PARALLEL with no explanation; I saw
that it is explained in 0002. It is not a problem, just mentioning it.

+  Specifies parallel degree for PARALLEL option. The
+  value must be at least 1. If the parallel degree
+  integer is omitted, then
+  VACUUM decides the number of workers based on
number of
+  indexes on the relation which further limited by
+  .

Can we also add some more details about backend participation? Parallel
workers come into the picture only when there are at least 2 indexes
on the table.

+ /*
+ * Do post-vacuum cleanup and statistics update for each index if
+ * we're not in parallel lazy vacuum. If in parallel lazy vacuum, do
+ * only post-vacum cleanup and then update statistics after exited
+ * from parallel mode.
+ */
+ lazy_vacuum_all_indexes(vacrelstats, Irel, nindexes, indstats,
+ lps, true);

How about renaming the above function, as it does the cleanup also?
lazy_vacuum_or_cleanup_all_indexes?


+ if (!IsInParallelVacuum(lps))
+ {
+ /*
+ * Update index statistics. If in parallel lazy vacuum, we will
+ * update them after exited from parallel mode.
+ */
+ lazy_update_index_statistics(Irel[idx], stats[idx]);
+
+ if (stats[idx])
+ pfree(stats[idx]);
+ }

The above check in lazy_vacuum_all_indexes can be combined with the outer
if check where the memcpy is happening. I still feel that the logic around
the stats makes it a little bit complex.

+ if (IsParallelWorker())
+ msg = "scanned index \"%s\" to remove %d row versions by parallel vacuum
worker";
+ else
+ msg = "scanned index \"%s\" to remove %d row versions";

I feel this way of building the error message may not be picked up for
translation. Is there any problem if we duplicate the entire ereport
call with the changed message?
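
That is, something like the following, so that each complete string is
visible to the translation tooling (a sketch; the surrounding variable
names are assumed from the existing lazy vacuum code):

if (IsParallelWorker())
    ereport(elevel,
            (errmsg("scanned index \"%s\" to remove %d row versions by parallel vacuum worker",
                    RelationGetRelationName(indrel),
                    vacrelstats->num_dead_tuples)));
else
    ereport(elevel,
            (errmsg("scanned index \"%s\" to remove %d row versions",
                    RelationGetRelationName(indrel),
                    vacrelstats->num_dead_tuples)));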

+ for (i = 0; i < nindexes; i++)
+ {
+ LVIndStats *s = &(copied_indstats[i]);
+
+ if (s->updated)
+ lazy_update_index_statistics(Irel[i], &(s->stats));
+ }
+
+ pfree(copied_indstats);

Why can't we use the shared memory directly to update the stats once all
the workers have finished, instead of copying them to local memory?

+ tab->at_params.nworkers = 0; /* parallel lazy autovacuum is not supported
*/

The user is not required to provide the number of workers for parallel
vacuum to work, so just setting the above parameter doesn't stop the
parallel workers; the user must also pass the PARALLEL option. So
mentioning that as well will be helpful later when we start supporting
it, or for someone who is reading the code.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: BUG #15668: Server crash in transformPartitionRangeBounds

2019-03-25 Thread Michael Paquier
On Fri, Mar 22, 2019 at 02:49:42PM +0900, Amit Langote wrote:
> This comment sounds a bit misleading.  The code above this "did" find
> field names, but there were too many.  What this comment should mention is
> that parsing didn't return a single field name, which is the format that
> the code below this can do something useful with.  I had proposed that
> upthread, but maybe that feedback got lost in the discussion about other
> related issues.

True.  I was reviewing that stuff yesterday and I have not been able
to finish wrapping it up.

> +CREATE TABLE part_bogus_expr_fail PARTITION OF range_parted
> +  FOR VALUES FROM (sum(a)) TO ('2019-01-01');
> +ERROR:  function sum(date) does not exist
> +LINE 2:   FOR VALUES FROM (sum(a)) TO ('2019-01-01');
> 
> Maybe, we should add to this patch only the tests relevant to the cases
> that would lead to crash without this patch.

Done as you suggested, with a minimal set enough to trigger the crash,
still the error message is rather misleading as you would expect :)

> A separate thread will definitely attract more attention, at least in due
> time. :)

Sure.  For now I have committed a lighter version of 0001, with
tweaked comments based on your suggestion, as well as a minimum set of
test cases.  I have added on the way some tests for range partitions
which have been missing from the start, and improved the existing set
by removing the original "a.a" references, and switching to use
max(date) for range partitions to bump correctly on the aggregate
error.  I am just updating the second patch now and I'll begin a new
thread soon.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] CLUSTER command progress monitor

2019-03-25 Thread Tatsuro Yamada

Hi Robert and Reviewers!

On 2019/03/26 0:02, Robert Haas wrote:

On Sun, Mar 24, 2019 at 9:02 PM Tatsuro Yamada
 wrote:

Please find attached files. :)


Committed.  Thanks!


Thank you!
Hope this feature will help DBA and user. :)

Regards,
Tatsuro Yamada
NTT Open Source Software Center




RE: Re: Log a sample of transactions

2019-03-25 Thread Kuroda, Hayato
Dear David,

I am willing, and I have already read the patch, but I thought it was not
my turn. Sorry.


Adrien,

>  I did not find any test for log_min_duration that could help me. LCOV
> indicates there are no tests on this part (look at check_log_duration()):
> https://coverage.postgresql.org/src/backend/tcop/postgres.c.gcov.html

I understand that some test cases are unnecessary. It's OK.

Finally, what do you think about the deviation of the randomness?
If this parameter is set very low, nothing may be logged at all because of
the deviation. We can avoid this phenomenon by counting up an internal
accumulator for each transaction and logging the transaction once the
accumulator becomes more than 1, as sketched below.
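
A minimal sketch of the idea (names hypothetical):

#include <stdbool.h>

/*
 * Sketch: deterministic sampling.  With rate r, almost exactly one in
 * every 1/r transactions is logged, with no random deviation.
 */
static double xact_sample_accum = 0.0;

static bool
xact_should_be_logged(double sample_rate)   /* 0.0 .. 1.0 */
{
    xact_sample_accum += sample_rate;
    if (xact_sample_accum >= 1.0)
    {
        xact_sample_accum -= 1.0;
        return true;            /* log this transaction */
    }
    return false;
}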

After considering this case and rebasing, I think this patch is enough.
Do I need to measure the change in throughput?


Best Regards,
Hayato Kuroda
Fujitsu LIMITED




setLastTid() and currtid()

2019-03-25 Thread Andres Freund
Hi,

For the tableam work I'd like to remove heapam.h from
nodeModifyTable.c. The only remaining impediment to that is a call to
setLastTid(), which is defined in tid.c but declared in heapam.h.

That doesn't seem like a particularly appropriate location; it doesn't
really have that much to do with heap. It seems more like a general
executor facility or something.  Does anybody have a good idea where to
put the declaration?


Looking at how this function is used led to some confusion on my part.


We currently call setLastTid in ExecInsert():

	if (canSetTag)
	{
		(estate->es_processed)++;
		setLastTid(&slot->tts_tid);
	}

And Current_last_tid, the variable setLastTid sets, is only used in
currtid_byreloid():


Datum
currtid_byreloid(PG_FUNCTION_ARGS)
{
	Oid			reloid = PG_GETARG_OID(0);
	ItemPointer tid = PG_GETARG_ITEMPOINTER(1);
	ItemPointer result;
	Relation	rel;
	AclResult	aclresult;
	Snapshot	snapshot;

	result = (ItemPointer) palloc(sizeof(ItemPointerData));
	if (!reloid)
	{
		*result = Current_last_tid;
		PG_RETURN_ITEMPOINTER(result);
	}

I've got to say I'm a bit baffled by this interface. If somebody passes
in a 0 reloid, we just ignore the passed in tid, and return the last tid
inserted into any table?

I then was even more baffled to find that there's no documentation of
this function, nor of this special-case behaviour, to be found
anywhere: not in the docs (which don't mention the function, nor its
special-case behaviour for relation 0), nor in the code.


It's unfortunately used in psqlobdc:

else if ((flag & USE_INSERTED_TID) != 0)
printfPQExpBuffer(, "%s where ctid = (select 
currtid(0, '(0,0)'))", load_stmt);

I gotta say, all that currtid code looks to me like it just should be
ripped out.  It really doesn't make a ton of sense to just walk the tid
chain for a random tid - without an open snapshot, there's absolutely no
guarantee that you get back anything meaningful.  Nor am I convinced
it's perfectly alright to just return the latest inserted tid for a
relation the user might not have any permission for.

OTOH, we probably can't just break psqlodbc, so we probably have to hold
our noses a bit longer and just move the prototype elsewhere?  But I'm
inclined to just say that this functionality is going to get ripped out
soon, unless somebody from the odbc community works on making it make a
bit more sense (tests, docs at the very very least).

Greetings,

Andres Freund



Re: selecting from partitions and constraint exclusion

2019-03-25 Thread Amit Langote
On 2019/03/26 0:21, Robert Haas wrote:
> On Wed, Mar 20, 2019 at 12:37 AM Amit Langote
>  wrote:
>> That's because get_relation_constraints() no longer (as of PG 11) includes
>> the partition constraint for SELECT queries.
> 
> What commit made that change?

That would be 9fdb675fc5d2 (faster partition pruning) that got into PG 11.

> This sounds to me like maybe it should be an open item.

I've added this under Older Bugs.

Thanks,
Amit




Re: Usage of epoch in txid_current

2019-03-25 Thread Thomas Munro
On Tue, Mar 26, 2019 at 3:23 AM Heikki Linnakangas  wrote:
> Looks good.
>
> I started to write a patch to use XID & epoch in dealing with GiST page
> deletions [1], and I really could've used an epoch to go with
> RecentGlobalXmin. I presume that would be pretty straightforward to have
> with this, too.

Yeah.  A simple way to compute that would be to use FullTransactionId
in PGXACT, so that GetSnapshotData() could find the minimum value and
put that into GlobalRecentFullXmin, and set GlobalRecentXmin with the
truncated version (or perhaps #define it as
(XidFromFullTransactionId(GlobalRecentFullXmin))).  Increasing the
number of cachelines that GetSnapshotData() touches may not be
popular, though.  I think you could probably reclaim that space by
using a more compact representation of vacuumFlags, overflowed,
delayChkpt, nxids (it's funny, the comment says "as tightly as
possible", which clearly isn't the case).  You'd probably also need
atomic 64 bit reads for the FullTransactionIds, which would be ugly on
ancient systems but otherwise no problem.

Instead of all that, you might want to just infer the epoch.
I'm not sure of the exact logic required for that off-hand.  You'd
probably want nextFullXid as a reference to compute the epoch, but
you'd either need to acquire a lock to read it while you already hold
ProcArrayLock (!), or read it atomically after releasing ProcArrayLock
... but then a Deathstation 9000 might allocate 8 billion more xids
with all the necessary auto-vacuuming to allow that before scheduling
you back onto the CPU.  Admittedly, I haven't thought about this very
deeply at all; there may be a simple and correct way to do it.
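
For what it's worth, given a trustworthy reference full xid, the
widening step itself is easy; a sketch (helper names assumed, and this
glosses over the locking problem above):

/*
 * Sketch: widen a 32-bit xid to a FullTransactionId, assuming xid is
 * no more than about 2^31 transactions behind next_fxid.
 */
static FullTransactionId
widen_xid(FullTransactionId next_fxid, TransactionId xid)
{
    uint32      epoch = EpochFromFullTransactionId(next_fxid);

    /* an xid "ahead" of next_fxid's low half belongs to the prior epoch */
    if (xid > XidFromFullTransactionId(next_fxid))
        epoch--;

    return FullTransactionIdFromEpochAndXid(epoch, xid);
}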



--
Thomas Munro
https://enterprisedb.com



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-25 Thread Tomas Vondra



On 3/24/19 8:36 AM, Dean Rasheed wrote:
> On Sun, 24 Mar 2019 at 00:17, David Rowley  
> wrote:
>>
>> On Sun, 24 Mar 2019 at 12:41, Tomas Vondra  
>> wrote:
>>>
>>> On 3/21/19 4:05 PM, David Rowley wrote:
>>
 29. Looking at the tests I see you're testing that you get bad
 estimates without extended stats.  That does not really seem like
 something that should be done in tests that are meant for extended
 statistics.

>>>
>>> True, it might be a bit unnecessary. Initially the tests were meant to
>>> show old/new estimates for development purposes, but it might not be
>>> appropriate for regression tests. I don't think it's a big issue, it's
>>> not like it'd slow down the tests significantly. Opinions?
>>
>> My thoughts were that if someone did something to improve non-MV
>> stats, then is it right for these tests to fail? What should the
>> developer do in the case? update the expected result? remove the test?
>> It's not so clear.
>>
> 
> I think the tests are fine as they are. Don't think of these as "good"
> and "bad" estimates. They should both be "good" estimates, but under
> different assumptions -- one assuming no correlation between columns,
> and one taking into account the relationship between the columns. If
> someone does do something to "improve" the non-MV stats, then the
> former tests ought to tell us whether it really was an improvement. If
> so, then the test result can be updated and perhaps whatever was done
> ought to be factored into the MV-stats' calculation of base
> frequencies. If not, the test is providing valuable feedback that
> perhaps it wasn't such a good improvement after all.
> 

Yeah, I agree. I'm sure there are ways to further simplify (or otherwise
improve) the tests, but I think those tests are useful to demonstrate
what the "baseline" estimates are.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-25 Thread Tomas Vondra
Hi,

Attached is an updated patch, fixing all the issues pointed out so far.
Unless there are some objections, I plan to commit the 0001 part by the
end of this CF. Part 0002 is a matter for PG13, as previously agreed.


On 3/24/19 1:17 AM, David Rowley wrote:
> On Sun, 24 Mar 2019 at 12:41, Tomas Vondra  
> wrote:
>>
>> On 3/21/19 4:05 PM, David Rowley wrote:
>>> 11. In get_mincount_for_mcv_list() it's probably better to have the
>>> numerical literals of 0.0 instead of just 0.
>>
>> Why?
> 
> Isn't it what we do for float and double literals?
> 

OK, fixed.

>>
>>> 12. I think it would be better if you modified build_attnums_array()
>>> to add an output argument that sets the size of the array. It seems
>>> that most places you call this function you perform bms_num_members()
>>> to determine the array size.
>>
>> Hmmm. I've done this, but I'm not sure I like it very much - there's no
>> protection the value passed in is the right one, so the array might be
>> allocated either too small or too large. I think it might be better to
>> make it work the other way, i.e. pass the value out instead.
> 
> When I said "that sets the size", I meant "that gets set to the size",
> sorry for the confusion.  I mean, if you're doing bms_num_members()
> inside build_attnums_array() anyway, then this will save you from
> having to do that in the callers too.
> 

OK, I've done this now, and I'm fairly happy with it.

>>> 28. Can you explain what this is?
>>>
>>> uint32 type; /* type of MCV list (BASIC) */
>>>
>>> I see: #define STATS_MCV_TYPE_BASIC 1 /* basic MCV list type */
>>>
>>> but it's not really clear to me what else could exist. Maybe the
>>> "type" comment can explain there's only one type for now, but more
>>> might exist in the future?
>>>
>>
>> It's the same idea as for dependencies/mvdistinct stats, i.e.
>> essentially a version number for the data structure so that we can
>> perhaps introduce some improved version of the data structure in the future.
>>
>> But now that I think about it, it seems a bit pointless. We would only
>> do that in a major version anyway, and we don't keep statistics during
>> upgrades. So we could just as well introduce the version/flag/... if
>> needed. We can't do this for regular persistent data, but for stats it
>> does not matter.
>>
>> So I propose we just remove this thingy from both the existing stats and
>> this patch.
> 
> I see. I wasn't aware that existed for the other types.   It certainly
> gives some wiggle room if some mistakes were discovered after the
> release, but thinking about it, we could probably just change the
> "magic" number and add new code in that branch only to ignore the old
> magic number, perhaps with a WARNING to analyze the table again. The
> magic field seems sufficiently early in the struct that we could do
> that.  In the master branch we'd just error if the magic number didn't
> match, since we wouldn't have to deal with stats generated by the
> previous version's bug.
> 

OK. I've decided to keep the field for now, for sake of consistency with
the already existing statistic types. I think we can rethink that in the
future, if needed.
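
For the record, David's tolerate-and-warn idea for back branches would
look roughly like this at deserialization time (a sketch only; the
constants and messages are hypothetical):

/* Sketch: accept stats written before a hypothetical back-branch fix. */
if (mcvlist->magic == STATS_MCV_MAGIC_OLD)
{
    ereport(WARNING,
            (errmsg("MCV list was built by an older minor release"),
             errhint("Run ANALYZE on the table to rebuild the statistics.")));
    return NULL;                /* behave as if no MCV list existed */
}
else if (mcvlist->magic != STATS_MCV_MAGIC)
    elog(ERROR, "invalid MCV list magic number %u", mcvlist->magic);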

>>> 29. Looking at the tests I see you're testing that you get bad
>>> estimates without extended stats.  That does not really seem like
>>> something that should be done in tests that are meant for extended
>>> statistics.
>>>
>>
>> True, it might be a bit unnecessary. Initially the tests were meant to
>> show old/new estimates for development purposes, but it might not be
>> appropriate for regression tests. I don't think it's a big issue, it's
>> not like it'd slow down the tests significantly. Opinions?
> 
> My thoughts were that if someone did something to improve non-MV
> stats, then is it right for these tests to fail? What should the
> developer do in the case? update the expected result? remove the test?
> It's not so clear.
> 

I think such changes would affect a number of other places in regression
tests (changing plans, ...), so I don't see why fixing these tests would
be any different.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


0001-multivariate-MCV-lists-20190325.patch.gz
Description: application/gzip


0002-multivariate-histograms-20190325.patch.gz
Description: application/gzip


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2019-03-25 Thread David Rowley
On Tue, 26 Mar 2019 at 09:00, Tom Lane  wrote:
>
> David Rowley  writes:
> > [ drop-useless-merge-appends-15.patch ]
>
> Pushed with some minor adjustments.

Thanks for all your work on this, and thanks for the final push.

FWIW, you should probably have credited yourself as the main author
since I don't think there was much of my patch left.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pg_upgrade version checking questions

2019-03-25 Thread Daniel Gustafsson
On Tuesday, March 19, 2019 12:35 AM, Tom Lane  wrote:

> I noticed a few things that seem a bit fishy about pg_upgrade.

Attached are three patches which take a stab at the issues raised here (and
the discussion in this thread):

0001 - Enforces the version check to the full version including the minor
0002 - Tests for all the binaries that pg_upgrade executes
0003 - Make -B default to CWD and remove the exec_path check

Defaulting to CWD for the new bindir has the side effect that the default
sockdir is in the bin/ directory, which may be suboptimal.

cheers ./daniel



0002-Check-all-used-executables.patch
Description: Binary data


0003-Default-new-bindir-to-CWD.patch
Description: Binary data


0001-Only-allow-upgrades-by-the-same-exact-version-new-bi.patch
Description: Binary data


Re: Fix XML handling with DOCTYPE

2019-03-25 Thread Tom Lane
Chapman Flack  writes:
> Thanks for the review! Yes, that part of this commitfest entry has been
> committed already and will appear in the next minor releases of those
> branches.

Indeed, thanks for verifying that this fixes your problem.

> That leaves only one patch in this commitfest entry that is still in
> need of review, namely the update to the documentation.

Yeah.  Since it *is* in need of review, I changed the CF entry's
state back to Needs Review.

regards, tom lane



Re: Tid scan improvements

2019-03-25 Thread Tom Lane
Andres Freund  writes:
> I was kinda hoping to keep block numbers out of the "main" APIs, to
> avoid assuming everything is BLCKSZ based. I don't have a particular
> problem allowing an optional setscanlimits type callback that works with
> block numbers. The planner could check its presence and just not build
> tid range scans if not present.  Alternatively a bespoke scan API for
> tid range scans, like the later patches in the tableam series for
> bitmap, sample, analyze scans, might be an option.

Given Andres' API concerns, and the short amount of time remaining in
this CF, I'm not sure how much of this patch set we can expect to land
in v12.  It seems like it might be a good idea to scale back our ambitions
and see whether there's a useful subset we can push in easily.

With that in mind, I went ahead and pushed 0001+0004, since improving
the planner's selectivity estimate for a "ctid vs constant" qual is
likely to be helpful whether the executor is smart about it or not.

FWIW, I don't really see the point of treating 0002 as a separate patch.
If it had some utility on its own, then it'd be sensible, but what
would that be?  Also, it looks from 0002 like you are trying to overload
rs_startblock with a different meaning than it has for syncscans, and
I think that might be a bad idea. 

regards, tom lane



Re: Fix XML handling with DOCTYPE

2019-03-25 Thread Chapman Flack
On 03/25/19 18:03, Ryan Lambert wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   not tested
> Documentation:not tested

Hi,

Thanks for the review! Yes, that part of this commitfest entry has been
committed already and will appear in the next minor releases of those
branches.

That leaves only one patch in this commitfest entry that is still in
need of review, namely the update to the documentation.

If you happened to feel moved to look over a documentation patch, that
would be what this CF entry most needs in the waning days of the commitfest.

There seem to be community members reluctant to review it because of not
feeling sufficiently expert in XML to scrutinize every technical detail,
but there are other valuable angles for documentation review. (And the
reason there *is* a documentation patch is the plentiful room for
improvement in the documentation that's already there, so as far as
reviewing goes, the old yarn about the two guys, the running shoes, and
the bear comes to mind.)

I can supply pointers to specs, etc., for anyone who does see some technical
details in the patch and has questions about them.

Regards,
-Chap



Re: MacPorts support for "extra" tests

2019-03-25 Thread Thomas Munro
On Thu, Mar 21, 2019 at 4:39 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > Peter E added some nice tests for LDAP and Kerberos, but they assume
> > you have Homebrew when testing on a Mac.  Here's a patch to make them
> > work with MacPorts too (a competing open source port/package
> > distribution that happens to be the one that I use).  The third
> > "extra" test is ssl, but that was already working.
>
> +1, but could we comment that a bit?  I'm thinking of something like
>
>   # typical library location for Homebrew
>
> in each of the if-branches.

Pushed.

I tried half-heartedly to understand why Apple's /usr/libexec/slapd
doesn't work for our tests.  I noticed that it does actually start up
and run in the foreground if you use -d 255 (= debug level), and
prints similar debug output to upstream slapd, but it complains about
TLS stuff that AFAIK it should be happy with.  Perhaps it wants to do
TLS stuff via the Apple keychain technology?  Without the debug switch
it doesn't launch at all (making -d a bit of a heisendebug level if
you ask me), while upstream slapd double-forks a daemon, and that's
the first thing stopping our test from working (though of course the
TLS stuff would be the next problem).  One magical thing about it is
that it's one of those signed executables that won't let you dtruss it
unless you disable SIP.  I wonder if that's relevant.  Anyway, I frown
in the general direction of California, and hereby give up.

--
Thomas Munro
https://enterprisedb.com



Re: pg_basebackup ignores the existing data directory permissions

2019-03-25 Thread Michael Paquier
On Mon, Mar 25, 2019 at 09:08:23AM -0400, Robert Haas wrote:
> If we're going to go with -g {inherit|none|group} then -g inherit
> ought to do what was originally proposed on this thread -- i.e. set
> the directory permissions to match the contents.  I don't think that's
> a change that can or should be back-patched, but we should make it
> consistent as part of this cleanup.

No objections from me to do that actually.  That's a small
compatibility break, but with the new options it is possible to live
with.  Perhaps we should add the same switch to initdb?  Except that
"inherit" makes no sense there.
--
Michael


signature.asc
Description: PGP signature


Re: psql display of foreign keys

2019-03-25 Thread Alvaro Herrera
v9 attached; this one's final AFAICT.

On 2019-Mar-25, Peter Eisentraut wrote:

> relispartition was added in PG10, so the conditional in
> describeOneTableDetails() seems wrong.

Hmm, yeah, we weren't using it anyway (since we can only use the new
display with pg_partition_ancestors which is new in pg12), but I guess
you have a point that this could confuse somebody in the future.

> In the older branches of that same function, I'd prefer writing
> 
> false AS relispartition
> 
> for clarity.

Yeah, some previous commits in that area have added "false" flags here
and there without adding aliases.  We should fix those sometime.  And
also the new "amname" output column is conditional on the version number
and changes column numbering for any column that appears afterwards ...
that one definitely deserves a "NULL as amname" in the older branches.

I changed some code to use PQfnumber() the way pg_dump does it; that
code's support for back-branch compatibility is much more battle-tested
than psql's and I trust that to be more maintainable.  In fact, my
motivation for doing it that way is that I found psql's way to be
confusing.

> Some of the other queries could also use some column aliases, like
> 
> conrelid = '%s'::pg_catalog.regclass AS isroot (?)
> 
> or
> 
> pg_catalog.pg_get_constraintdef(oid, true) AS condef
> 
> (as in the other branch).

Agreed, added.

> A test case for the incoming foreign key display would be nice, as that
> was the original argument for the patch.

Agreed, added.

-- 
Álvaro Herrera            https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 58a8890daa0599be0dcb27c018ee31bafa2ee359 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 22 Mar 2019 19:13:16 -0300
Subject: [PATCH v9] fix psql display of FKs 

---
 src/bin/psql/describe.c   | 155 --
 src/test/regress/expected/foreign_key.out |  38 --
 src/test/regress/sql/foreign_key.sql  |   2 +
 3 files changed, 144 insertions(+), 51 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index fd8ebee8cd3..fc404bbaab8 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -1479,6 +1479,7 @@ describeOneTableDetails(const char *schemaname,
 		bool		rowsecurity;
 		bool		forcerowsecurity;
 		bool		hasoids;
+		bool		ispartition;
 		Oid			tablespace;
 		char	   *reloptions;
 		char	   *reloftype;
@@ -1502,7 +1503,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, "
-		  "false AS relhasoids, %s, c.reltablespace, "
+		  "false AS relhasoids, c.relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence, c.relreplident, am.amname\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1515,12 +1516,29 @@ describeOneTableDetails(const char *schemaname,
 		   : "''"),
 		  oid);
 	}
+	else if (pset.sversion >= 10)
+	{
+		printfPQExpBuffer(&buf,
+		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
+		  "c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, "
+		  "c.relhasoids, c.relispartition, %s, c.reltablespace, "
+		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
+		  "c.relpersistence, c.relreplident\n"
+		  "FROM pg_catalog.pg_class c\n "
+		  "LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid)\n"
+		  "WHERE c.oid = '%s';",
+		  (verbose ?
+		   "pg_catalog.array_to_string(c.reloptions || "
+		   "array(select 'toast.' || x from pg_catalog.unnest(tc.reloptions) x), ', ')\n"
+		   : "''"),
+		  oid);
+	}
 	else if (pset.sversion >= 90500)
 	{
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, "
-		  "c.relhasoids, %s, c.reltablespace, "
+		  "c.relhasoids, false as relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence, c.relreplident\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1537,7 +1555,7 @@ describeOneTableDetails(const char *schemaname,
 		printfPQExpBuffer(&buf,
 		  "SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, "
 		  "c.relhastriggers, false, false, c.relhasoids, "
-		  "%s, c.reltablespace, "
+		  "false as relispartition, %s, c.reltablespace, "
 		  "CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, "
 		  "c.relpersistence, c.relreplident\n"
 		  "FROM pg_catalog.pg_class c\n "
@@ -1554,7 +1572,7 @@ describeOneTableDetails(const char 

Re: reorder pg_rewind control file sync

2019-03-25 Thread Michael Paquier
On Mon, Mar 25, 2019 at 10:29:46AM +0100, Fabien COELHO wrote:
> I have run the non regression tests. I'm not sure of what scenarii are
> covered there, but probably not an interruption in the middle of a fsync,
> specially if fsync is usually disabled to ease the tests:-)

Forcing the tool to stop at a specific point requires a booby-trap.  And
even if fsync is not killed, you could just force the tool to stop once
before updating the control file, then attempt a re-run without the trap
and check whether it works on the second attempt, so the problem is
quite independent of the timing of fsync().
--
Michael




Re: Fix XML handling with DOCTYPE

2019-03-25 Thread Ryan Lambert
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

I tested the master branch (commit 8edd0e7), REL_11_STABLE (commit 24df866) and 
REL9_6_STABLE (commit 5097368) and verified functionality.  This patch fixes 
the bug I had reported [1] previously.

With this in the stable branches, is it safe to assume it will be included 
in the next minor releases?  Thanks for your work on this!!

Ryan

[1] 
https://www.postgresql.org/message-id/flat/153478795159.1302.9617586466368699403%40wrigleys.postgresql.org

The new status of this patch is: Ready for Committer


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-25 Thread Tom Lane
Jesper Pedersen  writes:
> [ v9_0001-Highlight-that-the-jobs-option-isn-t-passed-down-to-.patch ]

Now that I've looked at this, it seems like it's fairly confused about
what the proposed VACUUMDB_OPTS variable would actually do for you.
If you're going to run vacuumdb directly, it hardly seems like you'd
bother with setting such a variable; you'd just enter the options you
want directly on the command line.

Conversely, if what you plan to do is set VACUUMDB_OPTS and then
run this script, the fact that --analyze-in-stages is hard-wired
into the script's invocation doesn't seem right either: you probably
don't want that if your goal is to get done as fast as possible.

In short, I'm not convinced that most of this patch is an improvement
on the status quo.  I think we'd be best off to just take the idea
of explicitly mentioning adding --jobs to a manual run, ie roughly

fprintf(script, "echo %sthis script and run:%s\n",
ECHO_QUOTE, ECHO_QUOTE);
fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
new_cluster.bindir, user_specification.data,
/* Did we copy the free space files? */
(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
"--analyze-only" : "--analyze", ECHO_QUOTE);
+   fprintf(script, "echo %sYou may wish to add --jobs=N for parallel analyzing.%s\n",
+   ECHO_QUOTE, ECHO_QUOTE);
fprintf(script, "echo%s\n\n", ECHO_BLANK);

fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",


regards, tom lane



Re: Tid scan improvements

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-18 22:35:05 +1300, David Rowley wrote:
> The commit message in 8586bf7ed mentions:
> 
> > Subsequent commits will incrementally abstract table access
> > functionality to be routed through table access methods. That change
> > is too large to be reviewed & committed at once, so it'll be done
> > incrementally.
> 
> and looking at [1] I see patch 0004 introduces some changes in
> nodeTidscan.c to call a new tableam API function named
> heapam_fetch_row_version. I see this function does have a ItemPointer
> argument, so I guess we must be keeping those as unique row
> identifiers in the API.

Right, we are. At least for now - there are some discussions around
allowing different formats for TIDs, to enable things like index-organized
tables, but that's for later.


> Patch 0001 does change the signature of heap_setscanlimits() (appears
> to be committed already), and then in 0010 the only code that calls
> heap_setscanlimits() (IndexBuildHeapRangeScan()) is moved and renamed
> to heapam_index_build_range_scan() and set to be called via the
> index_build_range_scan TableAmRoutine method.  So it looks like out of
> that patch series nothing is there to allow you to access
> heap_setscanlimits() directly via the TableAmRoutine API, so perhaps
> for this to work heap_setscanlimits will need to be interfaced,
> however, I'm unsure if that'll violate any assumptions that Andres
> wants to keep out of the API...

I was kinda hoping to keep block numbers out of the "main" APIs, to
avoid assuming everything is BLCKSZ based. I don't have a particular
problem allowing an optional setscanlimits-type callback that works with
block numbers. The planner could check its presence and just not build
tid range scans if it is absent.  Alternatively, a bespoke scan API for
tid range scans, like the ones the later patches in the tableam series
add for bitmap, sample, and analyze scans, might be an option.
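
To make the optional-callback idea concrete, here is a sketch; the
member name scan_setlimits is hypothetical, nothing with that name
exists in the series yet:

#include "postgres.h"
#include "access/tableam.h"
#include "utils/rel.h"

/* Sketch: the planner would generate TID range paths only for table AMs
 * that advertise support via an optional (possibly-NULL) callback. */
static bool
table_supports_block_range_scan(Relation rel)
{
	return rel->rd_tableam != NULL &&
		rel->rd_tableam->scan_setlimits != NULL;	/* hypothetical member */
}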

Greetings,

Andres Freund



Re: Tid scan improvements

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-15 18:42:40 +1300, Edmund Horner wrote:
> I've had to adapt it to use the table scan API.  I've got it compiling
> and passing tests, but I'm uneasy about some things that still use the
> heapam API.
> 
> 1. I call heap_setscanlimits as I'm not sure there is a tableam
> equivalent.

There used to be, but it wasn't clear that it was useful. In core pg the
only callers are index range scans, and those are - in a later patch in
the series - moved into the AM as well, as they need to deal with things
like HOT.


> 2. I'm not sure whether non-heap tableam implementations can also be
> supported by my TID Range Scan: we need to be able to set the scan
> limits.  There may not be any other implementations yet, but when
> there are, how do we stop the planner using a TID Range Scan for
> non-heap relations?

I've not yet looked through your code, but if required we'd probably
need to add a new tableam callback. It'd be marked optional, and the
planner could just check for its presence. A later part of the pluggable
storage series does that for bitmap scans, perhaps it's worth looking at
that?


> 3. When fetching tuples, I see that nodeSeqscan.c uses
> table_scan_getnextslot, which saves dealing with HeapTuples.  But
> nodeTidrangescan wants to do some checking of the block and offset
> before returning the slot.  So I have it using heap_getnext and
> ExecStoreBufferHeapTuple.  Apart from being heapam-specific, it's just
> not as clean as the new API calls.

Yea, that's not ok.  Note that, since yesterday, nodeTidscan doesn't
call heap_fetch() anymore (there's still a heap dependency, but that's
just for heap_get_latest_tid(), which I'll move into execMain or such).


> Ideally, we can get to to support general tableam implementations
> rather than using heapam-specific calls.  Any advice on how to do
> this?

Not yet - could you perhaps look at the bitmap scan patch in the tableam
queue, and see if that gives you inspiration?

- Andres



Re: MSVC Build support with visual studio 2019

2019-03-25 Thread Andrew Dunstan

On 3/25/19 3:44 PM, Andrew Dunstan wrote:
> On 3/21/19 3:13 AM, Michael Paquier wrote:
>> On Thu, Mar 21, 2019 at 12:45:57PM +1100, Haribabu Kommi wrote:
>>> The commit f2ab389 was later back-patched to versions down to 9.3 in commit
>>> 19acfd65.  I guess that building the Windows installer for all the
>>> versions using the same Visual Studio may be the reason behind
>>> that back-patch.
>> I did not remember this one, thanks for pointing it out.  So my
>> memories on that were incorrect.  If it is possible to get the code to
>> build with MSVC 2019 on all the supported branches, we could do so.
>
>
>
> VS2019 is currently in preview. I think we'd probably be better off
> waiting until the full release. I don't know of any pressing urgency for
> us to support it.
>
>

I see Haribabu mentioned that in his original email.


I'll take a look at verifying the patch.


cheers


andrew


-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Planning counters in pg_stat_statements (using pgss_store)

2019-03-25 Thread legrand legrand
As there are now 3 lock acquisitions on the pgss hash struct, one day or
another somebody will ask for a GUC to disable this feature (to be able
to run pgss unchanged, with only one lock as today).

With this GUC, pgss_store should be able to store the query text and
accumulated execution duration in the same call (as today).

Will try to provide this soon.
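
A rough sketch of such a switch as an extension GUC follows; the
parameter name "pg_stat_statements.track_planning" is an assumption, not
a committed API:

#include "postgres.h"
#include "utils/guc.h"

/* Sketch: boolean GUC gating the extra planner-time bookkeeping;
 * when off, pgss takes a single lock per query, as today. */
static bool pgss_track_planning = false;	/* hypothetical */

static void
define_pgss_track_planning(void)
{
	DefineCustomBoolVariable("pg_stat_statements.track_planning",
							 "Track planning statistics in pg_stat_statements.",
							 NULL,
							 &pgss_track_planning,
							 false,
							 PGC_SUSET,
							 0,
							 NULL, NULL, NULL);
}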







Re: BUG #15708: RLS 'using' running as wrong user when called from a view

2019-03-25 Thread Stephen Frost
Greetings,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On Thu, 21 Mar 2019 at 00:39, PG Bug reporting form
>  wrote:
> >
> > This fails, seemingly because the RLS on 'bar' is being checked by alice,
> > instead of the view owner bob:
> 
> Yes I agree, that appears to be a bug. The subquery in the RLS policy
> should be checked as the view owner -- i.e., we need to propagate the
> checkAsUser for the RTE with RLS to any subqueries in its RLS
> policies.

Agreed.

> It looks like the best place to fix it is in
> get_policies_for_relation(), since that's where all the policies to be
> applied for a given RTE are pulled together. Patch attached.

Yes, on a quick review, that looks like a good solution to me as well.
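
For readers following along, the shape of the fix is roughly this inside
get_policies_for_relation() (a hedged sketch, not the exact patch text;
variable names follow that function's parameters, and the same loop would
run over the restrictive policies):

/* Sketch: stamp each policy expression with the RTE's checkAsUser so
 * that any sublink/subquery inside the policy is checked as the same
 * user as the RTE itself (the view owner in the reported case). */
foreach(item, *permissive_policies)
{
	RowSecurityPolicy *policy = (RowSecurityPolicy *) lfirst(item);

	if (policy->qual != NULL)
		setRuleCheckAsUser((Node *) policy->qual, user_id);
	if (policy->with_check_qual != NULL)
		setRuleCheckAsUser((Node *) policy->with_check_qual, user_id);
}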

Thanks!

Stephen




Re: Ordered Partitioned Table Scans

2019-03-25 Thread Julien Rouhaud
On Sun, Mar 24, 2019 at 11:06 AM David Rowley
 wrote:
>
> On Sat, 23 Mar 2019 at 19:42, Tom Lane  wrote:
> >
> > David Rowley  writes:
> > > On Sat, 23 Mar 2019 at 05:40, Tom Lane  wrote:
> > >> BTW, another thing we could possibly do to answer this objection is to
> > >> give the ordered-Append node an artificially pessimistic startup cost,
> > >> such as the sum or the max of its children's startup costs.  That's
> > >> pretty ugly and unprincipled, but maybe it's better than not having the
> > >> ability to generate the plan shape at all?
> >
> > > I admit to having thought of that while trying to get to sleep last
> > > night, but I was too scared to even suggest it.  It's pretty much how
> > > MergeAppend would cost it anyway.  I agree it's not pretty to lie
> > > about the startup cost, but it does kinda seem silly to fall back on a
> > > more expensive MergeAppend when we know fine well Append is cheaper.
> >
> > Yeah.  I'm starting to think that this might actually be the way to go,
>
> Here's a version with it done that way.

FTR this patch doesn't apply since single child [Merge]Append
suppression (8edd0e7946) has been pushed.
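
(For readers following the costing discussion quoted above: the
"artificially pessimistic startup cost" amounts to something like this in
create_append_path(), sketched here for the max-of-children variant with
illustrative variable names.)

/* Sketch: charge the ordered Append the worst child startup cost,
 * mirroring how a MergeAppend over the same children would be costed. */
Cost		startup_cost = 0;
ListCell   *l;

foreach(l, pathnode->subpaths)
{
	Path	   *subpath = (Path *) lfirst(l);

	startup_cost = Max(startup_cost, subpath->startup_cost);
}
pathnode->path.startup_cost = startup_cost;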



Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2019-03-25 Thread Tom Lane
David Rowley  writes:
> [ drop-useless-merge-appends-15.patch ]

Pushed with some minor adjustments.

> I can get a plan that does end up with Result nodes above a Scan node.

> create table rangep (a int, b int) partition by range (a);
> create table rangep1 partition of rangep for values from(0) to (100);
> explain select r1::text from rangep r1 inner join rangep r2 on
> r1::text = r2::Text order by r1::text;

> However, if we modify is_projection_capable_plan() similar to how
> ExecSupportsMarkRestore() has been modified then we'd need to ensure
> that any changes made to the Append/MergeAppend's tlist also are made
> to the underlying node's tlist.  I'm unsure if that's worth the
> complexity.  What do you think?

I'm inclined to think not, at least not without more compelling
examples than that one ;-).  Note that we had Result-above-the-Append
before, so we're not regressing for such cases, we're just leaving
something on the table.

>> One other remark is that the division of labor between
>> create_[merge]append_path and their costsize.c subroutines
>> seems pretty unprincipled.  I'd be inclined to push all the
>> relevant logic into costsize.c, but have not done so here.
>> Moving the existing cost calculations in create_mergeappend_path
>> into costsize.c would better be done as a separate refactoring patch,
>> perhaps.

> I've not touched that, but happy to do it as a separate patch.

After looking at this, I'm less excited than I was before.
It seems it'd be only marginally less crufty than the way it is now.
In particular, costsize.c would have to be aware of the rule about
single-child MergeAppend being removable, which seems a little outside
its purview: typically we only ask costsize.c to estimate the cost
of a plan node as-presented, not to make decisions about what the
shape of the plan tree will be.  So I left it alone.

regards, tom lane



Re: MSVC Build support with visual studio 2019

2019-03-25 Thread Andrew Dunstan


On 3/21/19 3:13 AM, Michael Paquier wrote:
> On Thu, Mar 21, 2019 at 12:45:57PM +1100, Haribabu Kommi wrote:
>> The commit f2ab389 was later back-patched to versions down to 9.3 in commit
>> 19acfd65.  I guess that building the Windows installer for all the
>> versions using the same Visual Studio may be the reason behind
>> that back-patch.
> I did not remember this one, thanks for pointing it out.  So my
> memories on that were incorrect.  If it is possible to get the code to
> build with MSVC 2019 on all the supported branches, we could do so.




VS2019 is currently in preview. I think we'd probably be better off
waiting until the full release. I don't know of any pressing urgency for
us to support it.


cheers


andrew


-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] proposal: schema variables

2019-03-25 Thread Erik Rijkers

On 2019-03-24 10:32, Pavel Stehule wrote:

ne 24. 3. 2019 v 10:25 odesílatel Erik Rijkers  napsal:


On 2019-03-24 06:57, Pavel Stehule wrote:
> Hi
>
> rebase against current master

I ran into this:

(schema 'varschema2' does not exist):

drop variable varschema2.testv cascade;
ERROR:  schema "varschema2" does not exist
create variable if not exists testv as text;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost


(both statements are needed to force the crash)



I cannot reproduce it.
 [backtrace and stuff]


Sorry, I don't have the wherewithal to get more info but I have repeated 
this now on 4 different machines (debian jessie/stretch; centos).


I did notice that sometimes those two offending lines
"
  drop variable varschema2.testv cascade;
  create variable if not exists testv as text;
"
have to be repeated a few times (never more than 4 or 5 times) before 
the crash occurs (signal 11: Segmentation fault).



Erik Rijkers





Re: Fix foreign key constraint check for partitioned tables

2019-03-25 Thread Hadi Moshayedi
Hello Edmund,

Thanks for the review.


> I was a bit curious about the need for "set role" in the reproduction,
> but I see that it's because RI_Initial_Check does some checks to see
> if a simple SELECT can be used, and one of the checks is for basic
> table permissions.
>

I think that to reproduce this, the current user must lack SELECT on at
least one of the tables, so that RI_Initial_Check fails. Setting the owner
of one of the tables isn't always enough, as the current user can be a
superuser.


> I wonder if the macro RELKIND_HAS_STORAGE should be used instead of
> checking for each relkind?  This would apply to the check on line 4405
> too.
>

done.
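
(Concretely, the per-relkind tests collapse to something like the
following; a sketch, not the exact hunk:)

/* Sketch: skip relations without storage, e.g. partitioned tables,
 * rather than enumerating individual relkinds. */
if (!RELKIND_HAS_STORAGE(rel->rd_rel->relkind))
	continue;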

This patch also changed the output of some of the tests: previously,
foreign key constraint failures reported the partitioned table itself,
but now the error message shows the child table's name. I hope that is
OK.

I also added a regression test which would fail without this patch.

Thanks,
Hadi


fix-foreign-key-check.patch
Description: Binary data


Re: warning to publication created and wal_level is not set to logical

2019-03-25 Thread Tom Lane
Andres Freund  writes:
> On 2019-03-25 13:53:32 -0400, Tom Lane wrote:
>> One idea that might be useful is to have walsenders refuse to transmit
>> any logical-replication data if they see wal_level is too low.  That
>> would get users' attention pretty quickly.

> They do:

Oh, OK, then this seems like it's basically covered already.  I think
the original suggestion to add a WARNING during CREATE PUBLICATION
isn't unreasonable.  But we don't need to do more than that (and it
shouldn't be higher than WARNING).
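
Something along these lines in CreatePublication() would do (a sketch;
the exact wording is an assumption):

/* Sketch: warn when WAL won't carry enough information for logical
 * decoding to ever stream the published changes. */
if (wal_level < WAL_LEVEL_LOGICAL)
	ereport(WARNING,
			(errmsg("wal_level is insufficient to publish logical changes"),
			 errhint("Set wal_level to \"logical\" before creating subscriptions.")));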

regards, tom lane



Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove pgbench "progress" test pending solution of its timing is (fwd)

2019-03-25 Thread Fabien COELHO


I pushed the refactoring now, to extract the progress reporting to a separate 
function. That seems like an improvement independently from the rest of the 
patch.


Sure.


I'll take another look at the rest, probably tomorrow.


Ok.

Attached is the remainder of the patch rebased to current head.

--
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b1afe44817..6a4c7396c9 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -6051,6 +6051,7 @@ threadRun(void *arg)
 	int			nstate = thread->nstate;
 	int			remains = nstate;	/* number of remaining clients */
 	socket_set *sockets = alloc_socket_set(nstate);
+	int			aborted = 0;
 	int			i;
 
 	/* for reporting progress: */
@@ -6280,6 +6281,10 @@ threadRun(void *arg)
 			 */
 			if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
 remains--;
+
+			/* count aborted clients */
+			if (st->state == CSTATE_ABORTED)
+aborted++;
 		}
 
 		/* progress report is made by thread 0 for all threads */
@@ -6296,7 +6301,10 @@ threadRun(void *arg)
 
 /*
  * Ensure that the next report is in the future, in case
- * pgbench/postgres got stuck somewhere.
+ * pgbench/postgres got stuck somewhere. This may skip
+ * some progress reports if the thread does not get enough
+ * cpu time. In such case, probably the whole bench should
+ * be ignored.
  */
 do
 {
@@ -6307,6 +6315,52 @@ threadRun(void *arg)
 	}
 
 done:
+	/*
+	 * Under -R, comply with -T and -P even if there is nothing to do,
+	 * (unless all clients aborted) and ensure that one report is printed.
+	 * This special behavior allows tap tests to check that a run lasts
+	 * as expected and that some progress is shown, even on very slow hosts.
+	 */
+	if (duration && throttle_delay && aborted < nstate && thread->tid == 0)
+	{
+		int64		thread_end = thread_start + (int64) duration * 100;
+		instr_time	now_time;
+		int64		now;
+
+		INSTR_TIME_SET_CURRENT(now_time);
+		now = INSTR_TIME_GET_MICROSEC(now_time);
+
+		while (now < thread_end)
+		{
+			if (progress && next_report <= thread_end)
+			{
+pg_usleep(next_report - now);
+INSTR_TIME_SET_CURRENT(now_time);
+now = INSTR_TIME_GET_MICROSEC(now_time);
+printProgressReport(thread, thread_start, now, &last, &last_report);
+
+/* schedule next report */
+do
+{
+	next_report += (int64) progress * 100;
+} while (now >= next_report);
+			}
+			else
+			{
+pg_usleep(thread_end - now);
+INSTR_TIME_SET_CURRENT(now_time);
+now = INSTR_TIME_GET_MICROSEC(now_time);
+			}
+		}
+
+		/*
+		 * Print a closing progress report if none were printed
+		 * and at least one was expected.
+		 */
+		if (progress && progress <= duration && last_report == thread_start)
+			printProgressReport(thread, thread_start, now, &last, &last_report);
+	}
+
 	INSTR_TIME_SET_CURRENT(start);
 	disconnect_all(state, nstate);
 	INSTR_TIME_SET_CURRENT(end);
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index ad3a7a79f8..95e227ec7b 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -4,13 +4,14 @@ use warnings;
 use PostgresNode;
 use TestLib;
 use Test::More;
+use Time::HiRes qw{time};
 
 # start a pgbench specific server
 my $node = get_new_node('main');
 $node->init;
 $node->start;
 
-# invoke pgbench, with parameters:
+# invoke pgbench, return elapsed time, with parameters:
 #   $opts: options as a string to be split on spaces
 #   $stat: expected exit status
 #   $out: reference to a regexp list that must match stdout
@@ -49,13 +50,14 @@ sub pgbench
 	}
 
 	push @cmd, @args;
+	my $t0 = time();
 
 	$node->command_checks_all(\@cmd, $stat, $out, $err, $name);
 
 	# cleanup?
 	#unlink @filenames or die "cannot unlink files (@filenames): $!";
 
-	return;
+	return time() - $t0;
 }
 
 # Test concurrent insertion into table with serial column.  This
@@ -894,7 +896,45 @@ sub check_pgbench_logs
 
 my $bdir = $node->basedir;
 
-# with sampling rate
+# The point of this test is coverage of various time-related features
+# (-T, -P, --aggregate-interval, --rate, --latency-limit...), so it is
+# somehow time sensitive.
+# The checks performed are quite loose so as to still pass even under
+# degraded (high load, slow host, valgrind run) testing conditions.
+# The main point is to provide coverage, most of the time.
+my $elapsed = pgbench(
+	"-T 2 -P 1 -l --aggregate-interval=1"
+	  . ' -S -b se@2 --rate=20 --latency-limit=1000 -j ' . $nthreads
+	  . ' -c 3 -r',
+	0,
+	[   qr{type: multiple},
+		qr{clients: 3},
+		qr{threads: $nthreads},
+		# the shown duration is really -T argument value
+		qr{duration: 2 s},
+		qr{script 1: .* select only},
+		qr{script 2: .* select only},
+		qr{statement latencies in milliseconds},
+		qr{FROM pgbench_accounts} ],
+	[	qr{vacuum},
+		qr{progress: \d\b} ],
+	'pgbench progress', undef,
+	"--log-prefix=$bdir/001_pgbench_log_1");
+
+# 

Re: warning to publication created and wal_level is not set to logical

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-25 13:53:32 -0400, Tom Lane wrote:
> One idea that might be useful is to have walsenders refuse to transmit
> any logical-replication data if they see wal_level is too low.  That
> would get users' attention pretty quickly.

They do:


/*
 * Load previously initiated logical slot and prepare for sending data (via
 * WalSndLoop).
 */
static void
StartLogicalReplication(StartReplicationCmd *cmd)
{
StringInfoData buf;

/* make sure that our requirements are still fulfilled */
CheckLogicalDecodingRequirements();

and CheckLogicalDecodingReqs contains:

	if (wal_level < WAL_LEVEL_LOGICAL)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("logical decoding requires wal_level >= logical")));


Greetings,

Andres Freund



Re: Assert failure when validating foreign keys

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-24 23:54:53 +1300, David Rowley wrote:
> This results in an Assert failure on master and an elog ERROR prior to
> c2fe139c201:
> 
> create role test_role with login;
> create table ref(a int primary key);
> grant references on ref to test_role;
> set role test_role;
> create table t1(a int, b int);
> insert into t1 values(1,1);
> alter table t1 add constraint t1_b_key foreign key (b) references ref(a);
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> 
> Fails in heapam_tuple_satisfies_snapshot() at
> Assert(BufferIsValid(bslot->buffer));
> 
> c2fe139c201~1:
> ERROR:  expected buffer tuple
> 
> The test case is just a variation of the case in [1], but a different
> bug, so reporting it on a different thread.
> 
> I've not looked into the cause or when it started happening.

I think the cause is stupidity of mine. In
validateForeignKeyConstraint() I passed true to the materialize argument
of ExecFetchSlotHeapTuple(). Which therefore is made independent of
buffers. Which this assert then notices.  Just changing that to false,
which is correct, fixes the issue for me.

I'm a bit confused as to how we have no tests for this code?  Is it just
that the left join codepath is "too good"?

I've also noticed that we should free the tuple - that doesn't matter
for heap, but it sure does for other callers.  But uh, is it actually ok
to validate an entire table's worth of foreign keys without a memory
context reset? I.e. shouldn't we have a memory context that we reset
after each iteration?

Also, why's there no CHECK_FOR_INTERRUPTS()? heap has some internally on
a page level, but that doesn't seem all that granular?

Greetings,

Andres Freund
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3183b2aaa12..6cb2b8079cf 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -9017,6 +9017,8 @@ validateForeignKeyConstraint(char *conname,
 	TableScanDesc scan;
 	Trigger		trig;
 	Snapshot	snapshot;
+	MemoryContext oldcxt,
+perTupCxt;
 
 	ereport(DEBUG1,
 			(errmsg("validating foreign key constraint \"%s\"", conname)));
@@ -9052,11 +9054,19 @@ validateForeignKeyConstraint(char *conname,
 	slot = table_slot_create(rel, NULL);
 	scan = table_beginscan(rel, snapshot, 0, NULL);
 
+	perTupCxt = AllocSetContextCreate(CurrentMemoryContext,
+	  "validateForeignKeyConstraint",
+	  ALLOCSET_SMALL_SIZES);
+
 	while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
 	{
 		LOCAL_FCINFO(fcinfo, 0);
 		TriggerData trigdata;
 
+		CHECK_FOR_INTERRUPTS();
+
+		oldcxt = MemoryContextSwitchTo(perTupCxt);
+
 		/*
 		 * Make a call to the trigger function
 		 *
@@ -9070,16 +9080,22 @@ validateForeignKeyConstraint(char *conname,
 		trigdata.type = T_TriggerData;
 		trigdata.tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
 		trigdata.tg_relation = rel;
-		trigdata.tg_trigtuple = ExecFetchSlotHeapTuple(slot, true, NULL);
+		trigdata.tg_trigtuple = ExecFetchSlotHeapTuple(slot, false, NULL);
 		trigdata.tg_trigslot = slot;
 		trigdata.tg_newtuple = NULL;
+		trigdata.tg_newslot = NULL;
		trigdata.tg_trigger = &trig;
 
		fcinfo->context = (Node *) &trigdata;
 
 		RI_FKey_check_ins(fcinfo);
+
+		MemoryContextReset(perTupCxt);
 	}
 
+	MemoryContextSwitchTo(oldcxt);
+	MemoryContextDelete(perTupCxt);
+
 	table_endscan(scan);
 	UnregisterSnapshot(snapshot);
 	ExecDropSingleTupleTableSlot(slot);


Re: warning to publication created and wal_level is not set to logical

2019-03-25 Thread Tom Lane
Robert Haas  writes:
> On Mon, Mar 25, 2019 at 10:15 AM David Fetter  wrote:
>>> I do not believe this is practical either.  GUC manipulation cannot
>>> look at the catalogs.

>> In this case, it really has to do something. Is setting GUCs a path so
>> critical it can't take one branch?

> No, but that has about zero to do with the actual problem that Tom is
> describing.

To clarify, the problems with that are

(1) Initial GUC settings are absorbed by the postmaster, which cannot
examine catalogs *at all*.  It is neither connected to any database
nor allowed to participate in transactions.  These are not things that
will change.

(2) wal_level is a global setting, but the catalogs we'd have to look
at to discover the existence of a publication are per-database.  Thus
for example there is no reliable way for "ALTER SYSTEM SET wal_level"
to detect whether publications exist in other databases of the cluster.
(To say nothing of race conditions against concurrent publication
creation commands.)

Adding the dump/restore issue on top of that, it seems clear to me that
we can't usefully prevent a conflicting setting of wal_level from being
established.  The best we can do is whine about it later.

One idea that might be useful is to have walsenders refuse to transmit
any logical-replication data if they see wal_level is too low.  That
would get users' attention pretty quickly.

regards, tom lane



Re: warning to publication created and wal_level is not set to logical

2019-03-25 Thread Robert Haas
On Mon, Mar 25, 2019 at 10:15 AM David Fetter  wrote:
> > I do not believe this is practical either.  GUC manipulation cannot
> > look at the catalogs.
>
> In this case, it really has to do something. Is setting GUCs a path so
> critical it can't take one branch?

No, but that has about zero to do with the actual problem that Tom is
describing.

> If, as I strongly suspect, no such circumstance exists, it should not
> be possible for someone to have both of those at once, however
> inconvenient it is for us to arrange it.

Uh, Tom already told you how it can happen.  You just take a pg_dump
of an existing database, run initdb to create a new cluster, and then
try to restore the dump on the new cluster.  That shouldn't fail just
because wal_level = 'logical' isn't configured yet.  If it did, that
would be creating a huge booby-trap for users that doesn't exist
today.  You can't just dismiss that as nothing.  I think users have
every right to expect that a dump and restore is going to work without
preconfiguring things like wal_level -- it's bad enough that you
already have to struggle with things like encoding to get dumps to
restore properly.  Adding more ways for dump restoration to fail is a
really bad idea.

Besides that, it is obviously impractical to stop somebody from
shutting down the server, changing wal_level, and then restarting the
server.  Nor can you make all publications magically go away if
someone does that.  Nor would it be a good idea if we could do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Robert Haas
On Mon, Mar 25, 2019 at 12:33 PM Tom Lane  wrote:
> Robert Haas  writes:
> > Maybe we don't really need the word "tuple".  Like we could just make
> > it slot_store_heap() or SlotStoreHeap().  A slot can only store a
> > tuple, after all.
>
> I don't think it's wise to think of these things as just "slots";
> that name is way too generic.  They are "tuple slots", and so that
> word has to stay in the relevant function names.

I suppose there is some potential for confusion with things like
logical replication slots, but I think that these are the most
widely-used type of slot in the backend, so it's not entirely crazy to
think that they deserve a bit of special consideration.  I'm not
violently opposed to using four words instead of three
(slot_store_heap_tuple vs. slot_store_heap) but to really spell out
the operation in full you'd need to say something like
HeapTupleTableSlotStoreHeapTuple, and I think that's pretty unwieldy
for what's likely to end up being a very common programming idiom.

It's not crazy that we type 'cd' to change directories rather than
'chdir' or 'change_directory'.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-25 12:45:36 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2019-03-25 12:33:38 -0400, Tom Lane wrote:
> >> I don't think it's wise to think of these things as just "slots";
> >> that name is way too generic.  They are "tuple slots", and so that
> >> word has to stay in the relevant function names.
> 
> > Hm. But we already have slot_{getsomeattrs, getallattrs, attisnull,
> > getattr, getsysattr}. But perhaps the att in there is enough additional
> > information?
> 
> I don't claim to be entirely innocent in this matter ;-)
> 
> If we're going to rename stuff in this area without concern for avoiding
> inessential code churn, then those are valid targets as well.

FWIW, I don't have much of a problem with current slot_ names.


> BTW, maybe it's worth drawing a naming distinction between
> slot-type-specific and slot-type-independent functions?
> (I assume there are still some of the latter.)

Hm, I guess that depends on what you classify as type independent. Is
something like slot_getattr type independent, even though it internally
calls slot_getsomeattrs, which'll call a callback if additional
deforming is necessary?

I'd say, if you exclude functions like that, ExecStoreVirtualTuple(),
ExecStoreAllNullTuple(), slot_getmissingattrs(), ExecSetSlotDescriptor()
are probably the only ones that have no awareness of the type of a
slot.

I'm not sure it matters that much however? Unless you store something in
a slot, code normally shouldn't have to care what type a slot is.

Greetings,

Andres Freund



Re: monitoring CREATE INDEX [CONCURRENTLY]

2019-03-25 Thread Alvaro Herrera
Here's v6 of this patch.  I have rebased on top of today's CLUSTER
monitoring work, as well as the table AM commits.  The latter caused a
bit of trouble, as the number of blocks processed by a scan is no longer
as easy to get as before; I added a new entry point,
heapscan_get_blocks_done, in heapam.c to help with that.  (I suppose this
will need some fixups later on.)

I removed the "M of N" phase labels that Robert didn't like; those were
suggested by Rahila and upvoted by Amit L.  I'm of two minds about
those.  If you care about those and want them back, please speak up.
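
For context, the new entry point lets the index-build scan report
progress along these lines (a sketch only; it assumes the shared
scan-progress counter introduced with the CLUSTER monitoring work and a
TableScanDesc named scan):

/* Sketch: report how many heap blocks the build scan has processed. */
pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
							 heapscan_get_blocks_done(scan));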

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From ccf8a84f785c96ac7ae01e17b4fb7999183fed9e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 25 Mar 2019 13:29:34 -0300
Subject: [PATCH v6 1/3] implement heapscan_get_blocks_done

---
 src/backend/access/heap/heapam.c | 34 
 src/include/access/heapam.h  |  1 +
 2 files changed, 35 insertions(+)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index fa747be73ab..a7f6c09dcd0 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1053,6 +1053,40 @@ heapgettup_pagemode(HeapScanDesc scan,
 	}
 }
 
+/*
+ * Return the number of blocks that have been read by this scan since
+ * starting.  This is not 100% accurate, because in a parallel scan the
+ * counter can be moving concurrently.
+ */
+BlockNumber
+heapscan_get_blocks_done(TableScanDesc sscan)
+{
+	HeapScanDesc	hscan = (HeapScanDesc) sscan;
+	BlockNumber		startblock;
+	BlockNumber		blocks_done;
+
+	if (hscan->rs_base.rs_parallel != NULL)
+	{
+		ParallelBlockTableScanDesc bpscan;
+
+		bpscan = (ParallelBlockTableScanDesc) hscan->rs_base.rs_parallel;
+		startblock = bpscan->phs_startblock;
+	}
+	else
+		startblock = hscan->rs_startblock;
+
+	/*
+	 * Might have wrapped around the end of the relation, if startblock was
+	 * not zero.
+	 */
+	if (hscan->rs_cblock > startblock)
+		blocks_done = hscan->rs_cblock - startblock;
+	else
+		blocks_done = hscan->rs_nblocks - startblock +
+			hscan->rs_cblock;
+
+	return blocks_done;
+}
 
 #if defined(DISABLE_COMPLEX_MACRO)
 /*
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index e72f9787517..d4f537b3791 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -118,6 +118,7 @@ extern TableScanDesc heap_beginscan(Relation relation, Snapshot snapshot,
 extern void heap_setscanlimits(TableScanDesc scan, BlockNumber startBlk,
    BlockNumber endBlk);
 extern void heapgetpage(TableScanDesc scan, BlockNumber page);
+extern BlockNumber heapscan_get_blocks_done(TableScanDesc sscan);
 extern void heap_rescan(TableScanDesc scan, ScanKey key, bool set_params,
 			bool allow_strat, bool allow_sync, bool allow_pagemode);
 extern void heap_rescan_set_params(TableScanDesc scan, ScanKey key,
-- 
2.17.1

>From 7adb314a8be321084c7f7255cefc53eaa8c2ab0e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Wed, 2 Jan 2019 16:14:39 -0300
Subject: [PATCH v6 2/3] Report progress of CREATE INDEX operations

---
 contrib/amcheck/verify_nbtree.c   |   2 +-
 contrib/bloom/blinsert.c  |   2 +-
 contrib/bloom/blutils.c   |   1 +
 doc/src/sgml/indexam.sgml |  13 ++
 doc/src/sgml/monitoring.sgml  | 224 +-
 src/backend/access/brin/brin.c|   5 +-
 src/backend/access/gin/gininsert.c|   2 +-
 src/backend/access/gin/ginutil.c  |   1 +
 src/backend/access/gist/gist.c|   1 +
 src/backend/access/gist/gistbuild.c   |   2 +-
 src/backend/access/hash/hash.c|   3 +-
 src/backend/access/nbtree/nbtree.c|   9 ++
 src/backend/access/nbtree/nbtsort.c   |  58 ++-
 src/backend/access/nbtree/nbtutils.c  |  24 +++
 src/backend/access/spgist/spginsert.c |   2 +-
 src/backend/access/spgist/spgutils.c  |   1 +
 src/backend/catalog/index.c   | 118 +-
 src/backend/catalog/system_views.sql  |  27 
 src/backend/commands/indexcmds.c  |  52 +-
 src/backend/storage/ipc/standby.c |   2 +-
 src/backend/storage/lmgr/lmgr.c   |  46 +-
 src/backend/storage/lmgr/lock.c   |   7 +-
 src/backend/utils/adt/amutils.c   |  23 +++
 src/backend/utils/adt/pgstatfuncs.c   |   2 +
 src/include/access/amapi.h|   4 +
 src/include/access/genam.h|   1 +
 src/include/access/nbtree.h   |  11 ++
 src/include/catalog/index.h   |   2 +
 src/include/catalog/pg_proc.dat   |  10 +-
 src/include/commands/progress.h   |  35 
 src/include/pgstat.h  |   5 +-
 src/include/storage/lmgr.h|   4 +-
 src/include/storage/lock.h|   2 +-
 src/test/regress/expected/rules.out   |  30 +++-
 34 files changed, 693 insertions(+), 38 deletions(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c

Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Tom Lane
Andres Freund  writes:
> On 2019-03-25 12:33:38 -0400, Tom Lane wrote:
>> I don't think it's wise to think of these things as just "slots";
>> that name is way too generic.  They are "tuple slots", and so that
>> word has to stay in the relevant function names.

> Hm. But we already have slot_{getsomeattrs, getallattrs, attisnull,
> getattr, getsysattr}. But perhaps the att in there is enough additional
> information?

I don't claim to be entirely innocent in this matter ;-)

If we're going to rename stuff in this area without concern for avoiding
inessential code churn, then those are valid targets as well.

BTW, maybe it's worth drawing a naming distinction between
slot-type-specific and slot-type-independent functions?
(I assume there are still some of the latter.)

regards, tom lane



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-25 12:33:38 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > Maybe we don't really need the word "tuple".  Like we could just make
> > it slot_store_heap() or SlotStoreHeap().  A slot can only store a
> > tuple, after all.
> 
> I don't think it's wise to think of these things as just "slots";
> that name is way too generic.  They are "tuple slots", and so that
> word has to stay in the relevant function names.

Hm. But we already have slot_{getsomeattrs, getallattrs, attisnull,
getattr, getsysattr}. But perhaps the att in there is enough additional
information?

Although now that I'm looking at consistency, I'm annoyed by
slot_attisnull (no "r") and slot_getattr (with "r") ;)

Greetings,

Andres Freund



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Tom Lane
Robert Haas  writes:
> Maybe we don't really need the word "tuple".  Like we could just make
> it slot_store_heap() or SlotStoreHeap().  A slot can only store a
> tuple, after all.

I don't think it's wise to think of these things as just "slots";
that name is way too generic.  They are "tuple slots", and so that
word has to stay in the relevant function names.

regards, tom lane



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-25 12:05:52 -0400, Robert Haas wrote:
> On Mon, Mar 25, 2019 at 11:57 AM Andres Freund  wrote:
> > I think I might go for slot_store_heap_tuple etc, given that a lot of
> > those accessors have been slot_ for about ever. What do you think?
> 
> I don't have a super-strong feeling about it, although changing the
> case convention might confuse a few people.

Right, but is that going to confuse more people than the existing
difference between the slot_ functions and the camel-cased stuff already
does?

Greetings,

Andres Freund



Re: Error message inconsistency

2019-03-25 Thread Robert Haas
On Sun, Mar 24, 2019 at 11:23 PM Amit Kapila  wrote:
> Fair point.  Can such an error message change break any application?
> I see some cases where users have check based on Error Code, but not
> sure if there are people who have check based on error messages.

I'm sure there are -- in fact, I've written code that does that.  But
I also don't think that's a reason not to improve the error messages.
If we start worrying about stuff like this, we'll be unable to ever
improve anything.

> Anyone else having an opinion on this matter?  Basically, I would like
> to hear if anybody thinks that this change can cause any sort of
> problem.

I don't think it's going to cause a problem for users, provided the
patch is correct.  I wondered whether it was always going to pick up
the relation name, e.g. if partitioning is involved, but I didn't
check into it at all, so it may be fine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove pgbench "progress" test pending solution of its timing is (fwd)

2019-03-25 Thread Heikki Linnakangas

On 10/03/2019 13:26, Fabien COELHO wrote:

Attached is a rebase.


I pushed the refactoring now, to extract the progress reporting to a 
separate function. That seems like an improvement independently from the 
rest of the patch. I'll take another look at the rest, probably tomorrow.


- Heikki



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Robert Haas
On Mon, Mar 25, 2019 at 11:57 AM Andres Freund  wrote:
> I think I might go for slot_store_heap_tuple etc, given that a lot of
> those accessors have been slot_ for about ever. What do you think?

I don't have a super-strong feeling about it, although changing the
case convention might confuse a few people.

Maybe we don't really need the word "tuple".  Like we could just make
it slot_store_heap() or SlotStoreHeap().  A slot can only store a
tuple, after all.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-25 11:20:13 -0400, Robert Haas wrote:
> While reviewing some zheap code last week, I noticed that the naming
> of the ExecStore*Tuple() functions seems a bit unfortunate.  One
> reason is that we are now using slots in all kinds of places other
> than the executor, so that the "Exec" prefix seems a bit out of place.
> However, you could argue that this is OK, on the grounds that the
> functions are defined in the executor.  What I noticed, though, is
> that every new table AM is probably going to need its own type of
> slot, and it's natural to define those slot support functions in the
> AM code rather than having every AM shove whatever it's got into
> execTuples.c.

Yea, it's not accurate. Nor was it before this change though - there's,
e.g., plenty of code outside the executor that used slots long before v12.


> But if you do that, then you end up with a function with a name like
> ExecStoreZHeapTuple() which isn't in src/backend/executor, and which
> furthermore will never be called from the executor, because the
> executor only knows about a short, hard-coded list of built-in slot
> types, and ZHeapTuples are not one of them.  That function will,
> rather, only be called from the zheap code.  And having a function
> whose name starts with Exec... that is neither defined in the executor
> nor used in the executor seems wrong.

Yea.

I feel if we go there we probably should also rename ExecClearTuple,
ExecMaterializeSlot, ExecCopySlotHeapTuple, ExecCopySlotMinimalTuple,
ExecCopySlot. Although there we probably can add a compat wrapper.


> So I think we should rename these functions before we get too used to
> the new names.  There is a significant advantage in doing that for v12
> because people are already going to have to adjust third-party code to
> compensate for the fact that we no longer have an ExecStoreTuple()
> function any more.

Indeed.


> I'm not sure exactly what names would be better.  Perhaps just change
> the "Exec" prefix to "Slot", e.g. SlotStoreHeapTuple().  Or maybe put
> InSlot at the end, like StoreHeapTupleInSlot().  Or just take those
> four words - slot, heap, tuple, and store - and put them in any order
> you like.  TupleSlotStoreHeap?

I think I might go for slot_store_heap_tuple etc, given that a lot of
those accessors have been slot_ for about ever. What do you think?

Greetings,

Andres Freund



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Alvaro Herrera
On 2019-Mar-25, Robert Haas wrote:

> On Mon, Mar 25, 2019 at 11:27 AM Alvaro Herrera
>  wrote:
> > Should we keep ExecStoreTuple as a compatibility macro for third party
> > code?
> 
> I think that might be rather dangerous, actually, because slots now
> have a type, which they didn't before.  You have to use the correct
> function for the kind of slot you've got.

Ah, right.  I was thinking of assuming that the un-updated third-party
code would always have a slot of type heap, but of course that's not
guaranteed.
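
To make the hazard concrete, a naive shim would have to assume a slot
type (sketch only, not a proposal):

/* Sketch: silently assumes the slot uses TTSOpsHeapTuple and drops the
 * old buffer argument; wrong for buffer, minimal, and virtual slots. */
#define ExecStoreTuple(tuple, slot, buffer, shouldFree) \
	ExecStoreHeapTuple((tuple), (slot), (shouldFree))

Un-updated callers would compile cleanly and then fail an assertion (or
worse) at runtime, which is worse than an honest compile error.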

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Robert Haas
On Mon, Mar 25, 2019 at 11:27 AM Alvaro Herrera
 wrote:
> Should we keep ExecStoreTuple as a compatibility macro for third party
> code?

I think that might be rather dangerous, actually, because slots now
have a type, which they didn't before.  You have to use the correct
function for the kind of slot you've got.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PATCH] kNN for btree

2019-03-25 Thread Nikita Glukhov

On 25.03.2019 11:17, David Steele wrote:


On 3/15/19 2:11 AM, Nikita Glukhov wrote:

Attached 10th versions of the patches.

Fixed two bugs in patches 3 and 10 (see below).
Patch 3 was extracted from the main patch 5 (patch 4 in v9).


This patch no longer applies so marking Waiting on Author.


Attached 11th version of the patches rebased onto current master.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-Fix-get_index_column_opclass-v11.patch.gz
Description: application/gzip


0002-Introduce-ammatchorderby-function-v11.patch.gz
Description: application/gzip


0003-Enable-ORDER-BY-operator-scans-on-ordered-indexes-v11.patch.gz
Description: application/gzip


0004-Extract-structure-BTScanState-v11.patch.gz
Description: application/gzip


0005-Add-kNN-support-to-btree-v11.patch.gz
Description: application/gzip


0006-Add-btree-distance-operators-v11.patch.gz
Description: application/gzip


0007-Remove-distance-operators-from-btree_gist-v11.patch.gz
Description: application/gzip


0008-Add-regression-tests-for-kNN-btree-v11.patch.gz
Description: application/gzip


0009-Allow-ammatchorderby-to-return-pathkey-sublists-v11.patch.gz
Description: application/gzip


0010-Add-support-of-array-ops-to-btree-kNN-v11.patch.gz
Description: application/gzip


Re: renaming ExecStoreWhateverTuple

2019-03-25 Thread Alvaro Herrera
I agree about taking out the "Exec" part of the name.

On 2019-Mar-25, Robert Haas wrote:

> I'm not sure exactly what names would be better.  Perhaps just change
> the "Exec" prefix to "Slot", e.g. SlotStoreHeapTuple().  Or maybe put
> InSlot at the end, like StoreHeapTupleInSlot().  Or just take those
> four words - slot, heap, tuple, and store - and put them in any order
> you like.  TupleSlotStoreHeap?

HeapStoreTupleInSlot?

(I wonder why not "Put" instead of "Store".)

Should we keep ExecStoreTuple as a compatibility macro for third party
code?

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Removing \cset from pgbench

2019-03-25 Thread Alvaro Herrera
On 2019-Mar-23, Fabien COELHO wrote:

> > Per your request, here is a patch which removes \cset from pgbench. Sigh.
> 
> Again, only the removal is a little deeper. This lifts the constraint about
> not using empty queries in a compound statement, at the price of some added
> logic to detect the last result.

Thank you very much!  I've pushed this.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: selecting from partitions and constraint exclusion

2019-03-25 Thread Robert Haas
On Wed, Mar 20, 2019 at 12:37 AM Amit Langote
 wrote:
> That's because get_relation_constraints() no longer (as of PG 11) includes
> the partition constraint for SELECT queries.

What commit made that change?

This sounds to me like maybe it should be an open item.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



renaming ExecStoreWhateverTuple

2019-03-25 Thread Robert Haas
Hi,

While reviewing some zheap code last week, I noticed that the naming
of the ExecStore*Tuple() functions seems a bit unfortunate.  One
reason is that we are now using slots in all kinds of places other
than the executor, so that the "Exec" prefix seems a bit out of place.
However, you could argue that this is OK, on the grounds that the
functions are defined in the executor.  What I noticed, though, is
that every new table AM is probably going to need its own type of
slot, and it's natural to define those slot support functions in the
AM code rather than having every AM shove whatever it's got into
execTuples.c.

But if you do that, then you end up with a function with a name like
ExecStoreZHeapTuple() which isn't in src/backend/executor, and which
furthermore will never be called from the executor, because the
executor only knows about a short, hard-coded list of built-in slot
types, and ZHeapTuples are not one of them.  That function will,
rather, only be called from the zheap code.  And having a function
whose name starts with Exec... that is neither defined in the executor
nor used in the executor seems wrong.

So I think we should rename these functions before we get too used to
the new names.  There is a significant advantage in doing that for v12
because people are already going to have to adjust third-party code to
compensate for the fact that we no longer have an ExecStoreTuple()
function any more.

I'm not sure exactly what names would be better.  Perhaps just change
the "Exec" prefix to "Slot", e.g. SlotStoreHeapTuple().  Or maybe put
InSlot at the end, like StoreHeapTupleInSlot().  Or just take those
four words - slot, heap, tuple, and store - and put them in any order
you like.  TupleSlotStoreHeap?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Usage of epoch in txid_current

2019-03-25 Thread Heikki Linnakangas

On 25/03/2019 12:49, Thomas Munro wrote:

On Mon, Mar 25, 2019 at 5:01 PM Thomas Munro  wrote:

New version attached.  I'd like to commit this for PG12.


Here is a follow-up sketch patch that shows FullTransactionId being
used in the transaction stack, so you can call eg
GetCurrentFullTransactionId().  A table access method could use this
to avoid the need to freeze stuff later (eg zheap).

I suppose it's not strictly necessary, since you could use
GetCurrentTransactionId() and infer the epoch by comparing with
ReadNextFullTransactionId() (now that the epoch counting is reliable,
due to patch 0001 which I'm repeating again here just for cfbot).  But
I suppose we want to get away from that sort of thing.  Thoughts?


Looks good.

I started to write a patch to use XID & epoch in dealing with GiST page 
deletions [1], and I really could've used an epoch to go with 
RecentGlobalXmin. I presume that would be pretty straightforward to have 
with this, too.


[1] 
https://www.postgresql.org/message-id/5f7ed675-d1fc-66ef-f316-645080ff9...@iki.fi
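
A small sketch of what the full-width API buys call sites; the function
names come from the patches in this thread, and the usage is illustrative
only:

/* Sketch: with 64-bit xids, callers compare directly - no separate
 * epoch bookkeeping and no wraparound to reason about. */
FullTransactionId next_fxid = ReadNextFullTransactionId();
FullTransactionId my_fxid = GetCurrentFullTransactionId();

Assert(FullTransactionIdPrecedes(my_fxid, next_fxid));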


- Heikki



Re: Removing unneeded self joins

2019-03-25 Thread Alexander Kuzmenkov

I noticed you lost a couple of test cases in v14, so I added them back.

I also added a case where your implementation returns a different number 
of rows compared to vanilla:


select * from sl t1, sl t2 where t1.a = t2.a and t1.b = 1;

Please see the attached v15.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 2b1350df99b5e59f9ad7e516806800d09256beea Mon Sep 17 00:00:00 2001
From: Alexander Kuzmenkov 
Date: Mon, 25 Mar 2019 18:10:52 +0300
Subject: [PATCH v15] Remove self joins.

---
 src/backend/optimizer/plan/analyzejoins.c | 961 --
 src/backend/optimizer/plan/planmain.c |   7 +-
 src/backend/optimizer/util/relnode.c  |  26 +-
 src/include/optimizer/pathnode.h  |   5 +
 src/include/optimizer/planmain.h  |   7 +-
 src/test/regress/expected/equivclass.out  |  32 +
 src/test/regress/expected/join.out| 260 
 src/test/regress/sql/equivclass.sql   |  17 +
 src/test/regress/sql/join.sql | 117 
 9 files changed, 1350 insertions(+), 82 deletions(-)

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index a4efa69..7efcf9f 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,6 +22,7 @@
  */
 #include "postgres.h"
 
+#include "catalog/pg_class.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
@@ -33,7 +34,7 @@
 #include "utils/lsyscache.h"
 
 /* local functions */
-static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
+static bool leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
 static void remove_rel_from_query(PlannerInfo *root, int relid,
 	  Relids joinrelids);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
@@ -47,18 +48,21 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
 	   RelOptInfo *innerrel,
 	   JoinType jointype,
 	   List *restrictlist);
+static void split_selfjoin_quals(PlannerInfo *root, List *joinquals,
+	 List **selfjoinquals,
+	 List **otherjoinquals);
 
 
 /*
- * remove_useless_joins
- *		Check for relations that don't actually need to be joined at all,
- *		and remove them from the query.
+ * remove_useless_left_joins
+ *		Check for left joined relations that don't actually need to be joined
+ *		at all, and remove them from the query.
  *
  * We are passed the current joinlist and return the updated list.  Other
  * data structures that have to be updated are accessible via "root".
  */
 List *
-remove_useless_joins(PlannerInfo *root, List *joinlist)
+remove_useless_left_joins(PlannerInfo *root, List *joinlist)
 {
 	ListCell   *lc;
 
@@ -74,11 +78,11 @@ restart:
 		int			nremoved;
 
 		/* Skip if not removable */
-		if (!join_is_removable(root, sjinfo))
+		if (!leftjoin_is_removable(root, sjinfo))
 			continue;
 
 		/*
-		 * Currently, join_is_removable can only succeed when the sjinfo's
+		 * Currently, leftjoin_is_removable can only succeed when the sjinfo's
 		 * righthand is a single baserel.  Remove that rel from the query and
 		 * joinlist.
 		 */
@@ -146,9 +150,9 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
 }
 
 /*
- * join_is_removable
- *	  Check whether we need not perform this special join at all, because
- *	  it will just duplicate its left input.
+ * leftjoin_is_removable
+ *	  Check whether we need not perform this left join at all, because it will
+ *	  just duplicate its left input.
  *
  * This is true for a left join for which the join condition cannot match
  * more than one inner-side row.  (There are other possibly interesting
@@ -157,12 +161,11 @@ clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
  * above the join.
  */
 static bool
-join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
+leftjoin_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 {
 	int			innerrelid;
 	RelOptInfo *innerrel;
 	Relids		joinrelids;
-	List	   *clause_list = NIL;
 	ListCell   *l;
 	int			attroff;
 
@@ -238,67 +241,23 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	}
 
 	/*
-	 * Search for mergejoinable clauses that constrain the inner rel against
-	 * either the outer rel or a pseudoconstant.  If an operator is
-	 * mergejoinable then it behaves like equality for some btree opclass, so
-	 * it's what we want.  The mergejoinability test also eliminates clauses
-	 * containing volatile functions, which we couldn't depend on.
+	 * Check for pushed-down clauses referencing the inner rel. If there is
+	 * such a clause then join removal has to be disallowed.  We have to
+	 * check this despite the previous attr_needed checks because of the
+	 * possibility of pushed-down clauses referencing the rel.
 	 */
 	foreach(l, innerrel->joininfo)
 	{
 		RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
-
-		/*
-		 * If 

Re: Re: using index or check in ALTER TABLE SET NOT NULL

2019-03-25 Thread Robert Haas
On Mon, Mar 25, 2019 at 3:51 AM David Steele  wrote:
> On 3/17/19 3:40 PM, David Rowley wrote:
> > On Thu, 14 Mar 2019 at 06:24, Robert Haas  wrote:
> >
> > I think we can mark this patch as committed now as I don't think the
> > lack of a way to test it is likely to cause it to be reverted.
>
> Should we mark this as committed now or is there more tweaking to be done?

I have now marked it as Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] CLUSTER command progress monitor

2019-03-25 Thread Robert Haas
On Sun, Mar 24, 2019 at 9:02 PM Tatsuro Yamada
 wrote:
> Please find attached files. :)

Committed.  Thanks!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fix XML handling with DOCTYPE

2019-03-25 Thread Ryan Lambert
Perfect, thank you!  I do remember seeing that message now, but hadn't
understood what it really meant.
I will test later today.  Thanks

*Ryan*

On Sun, Mar 24, 2019 at 9:19 PM Tom Lane  wrote:

> Chapman Flack  writes:
> > On 03/24/19 21:04, Ryan Lambert wrote:
> >> I am unable to get latest patches I found  [1] to apply cleanly to
> current
> >> branches. It's possible I missed the latest patches so if I'm using the
> >> wrong ones please let me know.  I tried against master, 11.2 stable and
> the
> >> 11.2 tag with similar results.
>
> > Tom pushed the content-with-DOCTYPE patch, it's now included in master,
> > REL_11_STABLE, REL_10_STABLE, REL9_6_STABLE, REL9_5_STABLE, and
> > REL9_4_STABLE.
>
> Right.  If you want to test (and please do!) you could grab the relevant
> branch tip from our git repo, or download a nightly snapshot tarball from
>
> https://www.postgresql.org/ftp/snapshot/
>
> regards, tom lane
>


Re: Assert failure when validating foreign keys

2019-03-25 Thread Andres Freund
Hi,

On 2019-03-24 23:54:53 +1300, David Rowley wrote:
> This results in an Assert failure on master and an elog ERROR prior to
> c2fe139c201:
> 
> create role test_role with login;
> create table ref(a int primary key);
> grant references on ref to test_role;
> set role test_role;
> create table t1(a int, b int);
> insert into t1 values(1,1);
> alter table t1 add constraint t1_b_key foreign key (b) references ref(a);
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> 
> Fails in heapam_tuple_satisfies_snapshot() at
> Assert(BufferIsValid(bslot->buffer));
> 
> c2fe139c201~1:
> ERROR:  expected buffer tuple
> 
> The test case is just a variation of the case in [1], but a different
> bug, so reporting it on a different thread.
> 
> I've not looked into the cause or when it started happening.

That's probably my fault somehow, I'll look into it. Got some urgent
errands to run first (and it's still early here).

Thanks for noticing,

Andres Freund



Re: warning to publication created and wal_level is not set to logical

2019-03-25 Thread David Fetter
On Sun, Mar 24, 2019 at 02:06:59PM -0400, Tom Lane wrote:
> David Fetter  writes:
> > On Thu, Mar 21, 2019 at 07:45:59PM -0300, Lucas Viecelli wrote:
> >> I am sending a patch that when an PUBLICATION is created and the
> >> wal_level is different from logical prints a WARNING in console/log:
> 
> > Is a WARNING sufficient?  Maybe I'm misunderstanding something
> > important, but I think the attempt should fail with a HINT to set the
> > wal_level ahead of time.
> 
> That would be a booby-trap for dump/restore and pg_upgrade, so I don't
> think making CREATE PUBLICATION fail outright would be wise.

I haven't yet come up with a situation where it would be appropriate
both for wal_level to be below logical and for a PUBLICATION to exist,
even as some intermediate state during pg_restore.

> > Possibly in a separate patch, setting the wal_level to anything lower
> > than logical when publications exist should also fail.
> 
> I do not believe this is practical either.  GUC manipulation cannot
> look at the catalogs.

In this case, it really has to do something. Is setting GUCs a path so
critical it can't take one branch?

> I agree that it'd be nice to be noisier about the problem, but I'm
> not sure we can do more than bleat in the postmaster log from time
> to time if a publication is active and wal_level is too low.
> (And we'd better be careful about the log-spam aspect of that...)

With utmost respect, we have a lot more responsibility to the users of
this feature than this might imply. If there are circumstances where
there should be both a PUBLICATION and a wal_level less than logical,
by all means, let's document them very clearly in all the relevant
places.

If, as I strongly suspect, no such circumstance exists, it should not
be possible for someone to have both of those at once, however
inconvenient it is for us to arrange it.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Feature: Add Greek language fulltext search

2019-03-25 Thread Tom Lane
Panagiotis Mavrogiorgos  writes:
> Last November snowball added support for Greek language [1]. Following the
> instructions [2], I wrote a patch that adds fulltext search for Greek in
> Postgres. The patch is attached.

Cool!

> I would appreciate any feedback that will help in getting this merged.

We're past the deadline for submitting features for v12, but please
register this patch in the first v13 commitfest so that we remember
about it when the time comes:

https://commitfest.postgresql.org/23/

regards, tom lane



Re: Removing unneeded self joins

2019-03-25 Thread Alexander Kuzmenkov

On 3/25/19 07:07, David Rowley wrote:

You had commented the test with:

-- If index conditions are different for each side, we won't select the same
-- row on both sides, so the join can't be removed.

but I don't quite understand why we can't remove the join in this
situation.



My rationale was that we're not exactly removing the join, but replacing 
it with a scan. So it is not enough to just have a single row on each 
side, it must be the same physical row. In this example, before the 
transformation, t1.b is not equal to t2.b, but they become equal 
afterwards. This looks somewhat wrong. On the other hand, if the 
conditions are different, the resulting condition is going to evaluate 
to constant false and we won't get any rows, so maybe it's OK.
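
For concreteness, a minimal sketch of that situation (a hypothetical
table, not the schema from the regression test):

create table t (a int, b int);
create unique index t_b_idx on t (b);
-- each side is pinned to at most one row, but by a different condition,
-- so the two rows need not be the same physical row
select * from t t1 join t t2 on t1.a = t2.a
 where t1.b = 1 and t2.b = 2;
-- replacing the join with a single scan would merge the conditions into
-- "t.b = 1 AND t.b = 2", which is constant false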


This brings me back to the question of what the conditions for join
removal are. If the formulation with indexes is not general enough, what
do we use instead? I guess it could be something like this:


1. Given the (btree equality) join and restriction clauses, both sides 
are unique on the same set of columns. That is, if we fix the values of 
these columns, both sides have at most one matching row.


    a. For each of these columns, we have either

        i) a join clause that equates some expression referencing the 
outer column to the same expression referencing the same inner column.


        ii) a clause for each relation that equates the same expression 
referencing the outer and inner column to some other arbitrary 
expression, possibly a different one for each side. This expression may 
be a Const or some expression that references a Var of some third relation.


2. All the resulting columns can be calculated using either side of the 
join. For now, just require that both sides are base relations that 
refer to the same physical relation.


Two points are not clear to me here:

1. We don't handle join clauses to third relations, but can they be 
treated the same way we treat Consts?


2. Can we simplify the join when we don't have any join clauses and only 
have Consts? Or should we have at least one join clause that equates the 
same inner and outer column? Why is one join clause enough?
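
As an illustration of question 2 (again a hypothetical table): with a
unique index, Consts alone can pin both sides to the same physical row
with no join clause between them at all:

create table u (a int primary key, b int);
select * from u t1, u t2 where t1.a = 1 and t2.a = 1;
-- both sides return at most the single row with a = 1, so this is
-- logically equivalent to:
select a, b, a, b from u where a = 1;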


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Protect syscache from bloating with negative cache entries

2019-03-25 Thread Robert Haas
On Thu, Mar 7, 2019 at 11:40 PM Ideriha, Takeshi
 wrote:
> Just to be sure, we introduced the LRU list in this thread to find the
> entries older than the threshold time
> without scanning the whole hash table. If the hash table becomes large and
> there is no LRU list, scanning becomes slow.

Hmm.  So, it's a trade-off, right?  One option is to have an LRU list,
which imposes a small overhead on every syscache or catcache operation
to maintain the LRU ordering.  The other option is to have no LRU
list, which imposes a larger overhead every time we clean up the
syscaches.  My bias is toward thinking that the latter is better,
because:

1. Not everybody is going to use this feature, and

2. Syscache cleanup should be something that only happens every so
many minutes, and probably while the backend is otherwise idle,
whereas lookups can happen many times per millisecond.

However, perhaps someone will provide some evidence that casts a
different light on the situation.

I don't see much point in continuing to review this patch at this
point.  There's been no new version of the patch in 3 weeks, and there
is -- in my view at least -- a rather frustrating lack of evidence
that the complexity this patch introduces is actually beneficial.  No
matter how many people +1 the idea of making this more complicated, it
can't be justified unless you can provide a test result showing that
the additional complexity solves a problem that does not get solved
without that complexity.  And even then, who is going to commit a
patch that uses a design which Tom Lane says was tried before and
stunk?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Transaction commits VS Transaction commits (with parallel) VS query mean time

2019-03-25 Thread Haribabu Kommi
On Sat, Mar 23, 2019 at 11:10 PM Amit Kapila 
wrote:

> On Sat, Mar 23, 2019 at 9:50 AM Alvaro Herrera 
> wrote:
> >
> > On 2019-Mar-23, Amit Kapila wrote:
> >
> > > I think some users might also be interested in the write transactions
> > > happened in the system, basically, those have consumed xid.
> >
> > Well, do they really want to *count* these transactions, or is it enough
> > to keep an eye on the "age" of some XID column?  Other than for XID
> > freezing purposes, I don't see such internal transactions as very
> > interesting.
> >
>
> That's what I also had in mind.  I think doing anything more than just
> fixing the count for the parallel cooperating transaction by parallel
> workers doesn't seem intuitive to me. I mean if we want we can commit
> the fix such that all supporting transactions by parallel worker
> shouldn't be counted, but I am not able to convince myself that that
> is the good fix.  Instead, I think rather than fixing that one case we
> should think more broadly about all the supportive transactions
> happening in the various operations.  Also, as that is a kind of
> behavior change, we should discuss that as a separate topic.
>

Thanks to everyone for their opinions and suggestions to improve.

Without parallel workers, there isn't much internal implementation logic
that causes the stats to bloat. Parallel worker stats are not just the
transactions; they update other stats too, e.g. the seqscan stats of a
relation. I also agree that just fixing it for transactions may not be a
proper solution.

Using XID data may not give the proper TPS of the instance, as it
doesn't include information about read-only transactions.

How about adding two additional columns that report all the internal
and background worker transactions separately?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: warning to publication created and wal_level is not set to logical

2019-03-25 Thread Lucas Viecelli
> > Is a WARNING sufficient?  Maybe I'm misunderstanding something
> > important, but I think the attempt should fail with a HINT to set the
> > wal_level ahead of time.

I thought about this possibility, but I was wary of changing the current
behavior too much. It's worth discussing, though.


> I agree that it'd be nice to be noisier about the problem, but I'm
> not sure we can do more than bleat in the postmaster log from time
> to time if a publication is active and wal_level is too low.
> (And we'd better be careful about the log-spam aspect of that...)
>

I agree on being noisier, but I think the main thing is to make the user
aware of the situation, and that is what the patch addresses: it states
that wal_level needs to be adjusted. Initially the WARNING will appear
only at the time the publication is created, precisely to avoid putting
spam in the log.

Would it be better to warn from time to time that wal_level needs to
change because there is some publication that will not work?
-- 

*Lucas Viecelli*





Re: GiST VACUUM

2019-03-25 Thread Heikki Linnakangas

On 24/03/2019 18:50, Andrey Borodin wrote:

I was working on a new version of the gist check in amcheck and realized
one more thing:

/* Can this page be recycled yet? */
bool
gistPageRecyclable(Page page)
{
 return PageIsNew(page) ||
 (GistPageIsDeleted(page) &&
  TransactionIdPrecedes(GistPageGetDeleteXid(page), RecentGlobalXmin));
}

Here RecentGlobalXmin can wrap around, and the page will become
unrecyclable for half of the xid cycle. Can we prevent that by resetting
PageDeleteXid to InvalidTransactionId before doing RecordFreeIndexPage()?
(Seems like same applies to GIN...)


True, and B-tree has the same issue. I thought I saw a comment somewhere 
in the B-tree code about that earlier, but now I can't find it. I 
must've imagined it.


We could reset it, but that would require dirtying the page. That would 
be just extra I/O overhead, if the page gets reused before XID 
wraparound. We could avoid that if we stored the full XID+epoch, not 
just XID. I think we should do that in GiST, at least, where this is 
new. In the B-tree, it would require some extra code to deal with 
backwards-compatibility, but maybe it would be worth it even there.
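
For illustration, a wraparound-proof check could look roughly like this,
assuming the page stored a 64-bit deletion XID (the accessor and the
horizon names below are hypothetical, sketched on top of the
FullTransactionId infrastructure, not existing APIs):

/* Can this page be recycled yet?  Sketch: immune to 32-bit wraparound. */
bool
gistPageRecyclable(Page page)
{
	return PageIsNew(page) ||
		(GistPageIsDeleted(page) &&
		 FullTransactionIdPrecedes(GistPageGetDeleteFullXid(page),
								   RecentGlobalFullXmin));
}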


- Heikki



pg_malloc0() instead of pg_malloc()+memset()

2019-03-25 Thread Daniel Gustafsson
While reading another codepath, I happened to notice a few places where we do
pg_malloc() immediately followed by a memset(.., 0, ..), without there being a
justification (that I can see) for not using pg_malloc0() instead. The attached
patch changes these to pg_malloc0(), and passes make check.
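
The transformation is of this shape (illustrative only, not a specific
hunk from the patch):

	/* before */
	ptr = pg_malloc(size);
	memset(ptr, 0, size);

	/* after */
	ptr = pg_malloc0(size);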

cheers ./daniel



pg_malloc0.patch
Description: Binary data


Re: pg_basebackup ignores the existing data directory permissions

2019-03-25 Thread Robert Haas
On Thu, Mar 21, 2019 at 8:42 PM Michael Paquier  wrote:
> > Why not?
>
> Because we have released v11 so that we respect the permissions set on
> the source instance from which the backup is taken for all the folder's
> content.  If we begin to enforce it we may break some cases.  If a new
> option is introduced, it seems to me that the default should remain
> what has been released with v11, but that it is additionally possible
> to enforce group permissions or non-group permissions at will on the
> backup taken for all the contents in the data folder, including the
> root folder, created manually or not before running the pg_basebackup
> command.

I don't agree with that logic, because setting the permissions of the
content one way and the directory another cannot really be what anyone
wants.

If we're going to go with -g {inherit|none|group} then -g inherit
ought to do what was originally proposed on this thread -- i.e. set
the directory permissions to match the contents.  I don't think that's
a change that can or should be back-patched, but we should make it
consistent as part of this cleanup.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

2019-03-25 Thread Sergei Kornilov
Hi

I did not review the patch; I just want to add links to the related bug
15580 and another -hackers thread:

https://www.postgresql.org/message-id/flat/15580-d1a6de5a3d65da51%40postgresql.org
https://www.postgresql.org/message-id/flat/CAOVr7%2B3C9u_ZApjxpccrorvt0aw%3Dk8Ct5FhHRVZA-YO36V3%3Drg%40mail.gmail.com

regards, Sergei



Re: [HACKERS] WAL logging problem in 9.4.3?

2019-03-25 Thread Kyotaro HORIGUCHI
Hello. This is a revised version.

At Wed, 20 Mar 2019 22:48:35 -0700, Noah Misch  wrote in 
<20190321054835.gb3842...@rfd.leadboat.com>
> On Wed, Mar 20, 2019 at 05:17:54PM +0900, Kyotaro HORIGUCHI wrote:
> > At Sun, 10 Mar 2019 19:27:08 -0700, Noah Misch  wrote in 
> > <20190311022708.ga2189...@rfd.leadboat.com>
> > > On Mon, Mar 04, 2019 at 12:24:48PM +0900, Kyotaro HORIGUCHI wrote:
> > > I'm mildly unhappy that pendingSyncs entries with "pending->sync_above ==
> > > InvalidBlockNumber" are not sync requests at all.  Those just record the 
> > > fact
> > > of a RelationTruncate() happening.  If you can think of a way to improve 
> > > that,
> > > please do so.  If not, it's okay.
> > 
> > After a truncation, required WAL records are emitted for the
> > truncated pages, so no need to sync. Does this make sense for
> > you? (Maybe commit is needed there)
> 
> Yes, the behavior makes sense.  I wasn't saying the quoted code had the wrong
> behavior.  I was saying that the data structure called "pendingSyncs" is
> actually "pending syncs and past truncates".  It's not ideal that the variable
> name differs from the variable purpose in this way.  However, it's okay if you
> don't find a way to improve that.

That is convincing. The current member names "sync_above" and
"truncated_to" are based on the operations that have happened to
the relation. I changed the names to reflect what is to be done to
the relation instead: they are now skip_wal_min_blk and
wal_log_min_blk.
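
For reference, a pendingSyncs entry now looks schematically like this (a
sketch of the shape only; see the attachment for the real definition):

typedef struct PendingRelSync
{
	RelFileNode relnode;			/* hash key: relation file node */
	BlockNumber skip_wal_min_blk;	/* WAL-logging may be skipped for
									 * blocks at/above this one */
	BlockNumber wal_log_min_blk;	/* blocks at/above this one must be
									 * WAL-logged again after a truncation */
} PendingRelSync;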

> > I don't have enough time for now so the new version will be
> > posted early next week.
> 
> I'll wait for that version.

At Wed, 20 Mar 2019 17:17:54 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20190320.171754.171896368.horiguchi.kyot...@lab.ntt.co.jp>
> > Hence, this change causes us to emit WAL for the metapage of a
> > RELPERSISTENCE_UNLOGGED or RELPERSISTENCE_TEMP relation.  We should never do
> > that.  If we do that for RELPERSISTENCE_TEMP, redo will write to a permanent
> > relfilenode.  I've attached a test case for this; it is a patch that applies
> > on top of your v7 patches.  The test checks for orphaned files after redo.
> 
> Oops!  Added RelationNeedsWAL(index) there. (Attached 1st patch on
> top of this patchset)

Done in the attached patch. But the orphan file check in the TAP
diff was wrong: it detected orphaned pg_class entries for temporary
tables, which disappear after the first autovacuum. The revised
TAP test (check_orphan_relfilenodes) no longer fails spuriously and
catches the bug in the previous patch.

> > > +  * If no tuple was inserted, it's possible that we are truncating a
> > > +  * relation. We need to emit WAL for the metapage in the case. However 
> > > it
> > > +  * is not required elsewise,
> > 
> > Did you mean to write more words after that comma?
> 
> Sorry, it is just garbage. The required work is done in
> _bt_blwritepage.

Removed.

> > We'd need a mechanism to un-remove the sync at subtransaction abort.  My
> > attachment includes a test case demonstrating the consequences of that 
> > defect.
> > Please look for other areas that need to know about subtransactions; patch 
> > v7
> > had no code pertaining to subtransactions.

Added. Passed the new tests.

> > > + elog(DEBUG2, "not skipping WAL-logging for rel %u/%u/%u block 
> > > %u, because sync_above is %u",
> > 
> > As you mention upthread, you have many debugging elog()s.  These are too
> > detailed to include in every binary, but I do want them in the code.  See
> > CACHE_elog() for a good example of achieving that.
> 
> Agreed, will do. They were needed to check the behavior precisely,
> but are usually not needed.

I removed all such elog()s.

> > RelationData fields other than "pgstat_info" have "rd_" prefixes; add that
> > prefix to these fields.
> > This is a nonstandard place to clear fields.  Clear them in
> > load_relcache_init_file() only, like we do for rd_statvalid.  (Other paths
> > will then rely on palloc0() for implicit initialization.)

Both are done.

> > > - if (RelationNeedsWAL(relation))
> > > + if (BufferNeedsWAL(relation, buffer) ||
> > > + BufferNeedsWAL(relation, newbuf))
> > 
> > This is fine if both buffers need WAL or neither buffer needs WAL.  It is 
> > not
> > fine when one buffer needs WAL and the other buffer does not.  My attachment
> > includes a test case.  Of the bugs I'm reporting, this one seems most
> > difficult to solve well.

I refactored heap_insert/delete so that the XLOG code can be used
from heap_update, then modified heap_update so that it emits
XLOG_INSERT and XLOG_DELETE in addition to XLOG_UPDATE.
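
Schematically, the WAL decision in heap_update becomes (pseudocode
sketching the idea, not the patch text):

	if (BufferNeedsWAL(relation, buffer) && BufferNeedsWAL(relation, newbuf))
		emit XLOG_UPDATE;	/* the usual combined record */
	else
	{
		if (BufferNeedsWAL(relation, buffer))
			emit XLOG_DELETE;	/* only the old tuple's page needs WAL */
		if (BufferNeedsWAL(relation, newbuf))
			emit XLOG_INSERT;	/* only the new tuple's page needs WAL */
	}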

> > We still use heap_sync() in CLUSTER.  Can we migrate CLUSTER to the newer
> > heap_register_sync()?  Patch v7 makes some commands use the new way (COPY,
> > CREATE TABLE AS, REFRESH MATERIALIZED VIEW, ALTER TABLE) and leaves other
> > commands using the old way (CREATE INDEX USING btree, ALTER TABLE SET
> > TABLESPACE, CLUSTER).  It would make the system simpler to understand if we
> 

ALTER TABLE with ADD COLUMN and ADD PRIMARY KEY USING INDEX throws spurious "column contains null values"

2019-03-25 Thread Zhang, Jie
Hi all,

When I do the following:
postgres=# create table t1 (a int);
postgres=# insert into t1 values(1);
postgres=# create unique index uniq_idx on t1(a);
postgres=# alter table t1 add column b float8 not null default random(), add 
primary key using index uniq_idx;
ERROR: column "b" contains null values

PostgreSQL throws error "column b contains null values".

#
alter table t1 add column b float8 not null default 0, add primary key using 
index uniq_idx;

This ALTER TABLE succeeds.
#

The reasons for the error are as follows.

ATController provides top level control over the phases.
Phase 1: preliminary examination of commands, create work queue 
Phase 2: update system catalogs 
Phase 3: scan/rewrite tables as needed 

In Phase 2, when dealing with "add column b float8 not null default random()",
the table is marked for rewrite.
When dealing with "add primary key using index uniq_idx", 
ATExecAddIndexConstraint calls index_check_primary_key.

The calling order is as follows.
index_check_primary_key()
↓
AlterTableInternal()
↓
ATController()
↓
ATRewriteTables()
↓
ATRewriteTable()

ATRewriteTable checks all NOT NULL constraints; both column a and
column b need the check.
Unfortunately, at this point Phase 3 hasn't run yet:
the table is not rewritten, only marked for rewrite. So it throws the
error "column b contains null values".

In Phase 2, if the table is marked for rewrite, we can skip checking
whether the columns contain NULL values, because Phase 3 will do it.

Here's a patch to fix this bug.
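
Schematically, the change is of this shape (the attached patch has the
real code; the flag name here is only illustrative):

	/* in the Phase 2 verification pass */
	if (tab->rewrite > 0)
		check_notnull = false;	/* Phase 3's rewrite re-checks every row */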

Best Regards!




alter_table.patch
Description: alter_table.patch


Re: libpq compression

2019-03-25 Thread Konstantin Knizhnik



On 25.03.2019 13:48, Dmitry Dolgov wrote:

On Mon, Mar 25, 2019 at 11:39 AM David Steele  wrote:

On 3/25/19 1:04 PM, Konstantin Knizhnik wrote:


On 25.03.2019 11:06, David Steele wrote:

Konstantin,


This patch appears to be failing tests so I have marked it Waiting on
Author.

I have also removed the reviewer since no review had been done. Maybe
somebody else will have a look.

Regards,

Can you please tell me which tests failed?
I ran "make check-world" and there were no failed tests.
Actually, if compression is not enabled (and it is disabled by default
unless explicitly requested by the client), there should be no difference
from vanilla Postgres.
So it would be strange for tests to fail without compression in use.

Check out the cfbot report at http://commitfest.cputube.org

I guess it's red because the last posted patch happened to be my experimental
patch (based indeed on a broken revision v11), not the one posted by Konstantin.

Rebased version of my patch is attached.
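
For context, with this patch a client requests compression through the
new connection parameter, along these lines (the exact accepted values
are defined by the patch; "zlib" here is just for illustration):

	psql "host=localhost dbname=postgres compression=zlib"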

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/configure b/configure
index 8068108..8ebd961 100755
--- a/configure
+++ b/configure
@@ -701,6 +701,7 @@ ELF_SYS
 EGREP
 GREP
 with_zlib
+with_zstd
 with_system_tzdata
 with_libxslt
 with_libxml
@@ -864,6 +865,7 @@ with_libxml
 with_libxslt
 with_system_tzdata
 with_zlib
+with_zstd
 with_gnu_ld
 enable_largefile
 enable_float4_byval
@@ -8307,6 +8309,86 @@ fi
 
 
 #
+# ZStd
+#
+
+
+
+# Check whether --with-zstd was given.
+if test "${with_zstd+set}" = set; then :
+  withval=$with_zstd;
+  case $withval in
+yes)
+  ;;
+no)
+  :
+  ;;
+*)
+  as_fn_error $? "no argument expected for --with-zstd option" "$LINENO" 5
+  ;;
+  esac
+
+else
+  with_zstd=no
+
+fi
+
+
+
+
+if test "$with_zstd" = yes ; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ZSTD_compress in -lzstd" >&5
+$as_echo_n "checking for ZSTD_compress in -lzstd... " >&6; }
+if ${ac_cv_lib_zstd_ZSTD_compress+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_check_lib_save_LIBS=$LIBS
+LIBS="-lzstd  $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char ZSTD_compress ();
+int
+main ()
+{
+return ZSTD_compress ();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_lib_zstd_ZSTD_compress=yes
+else
+  ac_cv_lib_zstd_ZSTD_compress=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_zstd_ZSTD_compress" >&5
+$as_echo "$ac_cv_lib_zstd_ZSTD_compress" >&6; }
+if test "x$ac_cv_lib_zstd_ZSTD_compress" = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_LIBZSTD 1
+_ACEOF
+
+  LIBS="-lzstd $LIBS"
+
+else
+  as_fn_error $? "library 'zstd' is required for ZSTD support" "$LINENO" 5
+fi
+
+fi
+
+
+
+#
 # Elf
 #
 
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b..a77adc3 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1135,6 +1135,20 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
  
 
+ 
+  compression
+  
+  
+Request compression of libpq traffic. The client sends the server the list of compression algorithms supported by the client library.
+If the server supports one of these algorithms, it acknowledges its use, and all subsequent libpq messages sent from client to server and
+vice versa will be compressed. If the server does not support any of the suggested algorithms, it replies with an 'n' (no compression)
+message, and it is up to the client whether to continue without compression or report an error.
+Supported compression algorithms are chosen at configure time. Right now two libraries are supported: zlib (default) and zstd (if Postgres was
+configured with the --with-zstd option). In both cases streaming mode is used.
+  
+  
+ 
+
  
   client_encoding
   
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d66b860..aaca963 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -92,6 +92,15 @@
such as COPY.
   
 
+  
+It is possible to compress protocol data to reduce traffic and speed up client-server interaction.
+Compression is especially useful for importing/exporting data to/from the database using the COPY command
+and for replication (both physical and logical). Compression can also reduce server response time
+for queries returning large amounts of data (for example JSON, BLOBs, text, ...).
+Right now two libraries are supported: zlib (default) and zstd (if 

Re: Speed up transaction completion faster after many relations are accessed in a transaction

2019-03-25 Thread David Rowley
On Mon, 25 Mar 2019 at 23:44, Peter Eisentraut
 wrote:
> I did a bit of performance testing, both a plain pgbench and the
> suggested test case with 4096 partitions.  I can't detect any
> performance improvements.  In fact, within the noise, it tends to be
> just a bit on the slower side.
>
> So I'd like to kick it back to the patch submitter now and ask for more
> justification and performance analysis.
>
> Perhaps "speeding up planning with partitions" needs to be accepted first?

Yeah, I think it likely will require that patch to be able to measure
the gains from this patch.

When planning a SELECT on a partitioned table with a large number of
partitions using PREPAREd statements, attempting the generic plan on
the 6th execution causes the local lock table to expand to fit the
locks for all the partitions. This makes LockReleaseAll() slow, due to
the hash_seq_search having to skip over many empty buckets.  Since
generating a custom plan for a partitioned table with many partitions
is still slow in master, I very much imagine you'll struggle to see the
gains brought by this patch.

I did a quick benchmark too and couldn't measure anything:

create table hp (a int) partition by hash (a);
select 'create table hp'||x|| ' partition of hp for values with
(modulus 4096, remainder ' || x || ');' from generate_series(0,4095)
x;

bench.sql
\set p_a 13315
select * from hp where a = :p_a;

Master:
$ pgbench -M prepared -n -T 60 -f bench.sql postgres
tps = 31.844468 (excluding connections establishing)
tps = 32.950154 (excluding connections establishing)
tps = 31.534761 (excluding connections establishing)

Patched:
$ pgbench -M prepared -n -T 60 -f bench.sql postgres
tps = 30.099133 (excluding connections establishing)
tps = 32.157328 (excluding connections establishing)
tps = 32.329884 (excluding connections establishing)

The situation won't be any better with plan_cache_mode =
force_generic_plan either. In this case, we'll only plan once but
we'll also have to obtain and release a lock for each partition for
each execution of the prepared statement. LockReleaseAll() is going to
be slow in that case because it actually has to release a lot of
locks.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-25 Thread legrand legrand
>> Shouldn't you add this to the commitfest?

> I added it last week, see https://commitfest.postgresql.org/23/2069/

Oops, sorry for the noise.


RE: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-25 Thread legrand legrand

>> Would it make sense to add it in auto_explain?
>> I don't know for explain itself, but maybe ...

> I'd think that people interested in getting the queryid in the logs
> would configure the log_line_prefix to display it consistently rather
> than having it in only a subset of cases, so that's probably not
> really needed.

Ok.
Shouldn't you add this to the commitfest?



Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-03-25 Thread Julien Rouhaud
On Mon, Mar 25, 2019 at 12:36 PM legrand legrand
 wrote:
>
> >> Would it make sense to add it in auto_explain?
> >> I don't know for explain itself, but maybe ...
>
> > I'd think that people interested in getting the queryid in the logs
> > would configure the log_line_prefix to display it consistently rather
> > than having it in only a subset of cases, so that's probably not
> > really needed.
>
> Ok.
> Shouldn't you add this to the commitfest?

I added it last week, see https://commitfest.postgresql.org/23/2069/
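
For illustration, assuming the patch adds a %Q log_line_prefix escape
for the query ID (the escape letter is an assumption here, not a
released feature), the configuration would look like:

	# postgresql.conf
	log_line_prefix = '%m [%p] queryid=%Q '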



Re: psql display of foreign keys

2019-03-25 Thread Peter Eisentraut
On 2019-03-23 03:30, Alvaro Herrera wrote:
>>> Thanks for the updated patch.  I applied it and rebased the
>>> foreign-keys-referencing-partitioned-tables patch on top.  Here's
>>> something I think you may have missed:
>>
>> I missed that indeed!  Thanks for noticing.  Here's an updated and
>> rebased version of this patch.
> 
> I forgot to "git add" the new changes to the expected file.  Here's v8
> with that fixed.

Looks OK in general.

relispartition was added in PG10, so the conditional in
describeOneTableDetails() seems wrong.

In the older branches of that same function, I'd prefer writing

false AS relispartition

for clarity.

Some of the other queries could also use some column aliases, like

conrelid = '%s'::pg_catalog.regclass AS isroot (?)

or

pg_catalog.pg_get_constraintdef(oid, true) AS condef

(as in the other branch).

A test case for the incoming foreign key display would be nice, as that
was the original argument for the patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Usage of epoch in txid_current

2019-03-25 Thread Thomas Munro
On Mon, Mar 25, 2019 at 5:01 PM Thomas Munro  wrote:
> New version attached.  I'd like to commit this for PG12.

Here is a follow-up sketch patch that shows FullTransactionId being
used in the transaction stack, so you can call e.g.
GetCurrentFullTransactionId().  A table access method could use this
to avoid the need to freeze stuff later (eg zheap).

I suppose it's not strictly necessary, since you could use
GetCurrentTransactionId() and infer the epoch by comparing with
ReadNextFullTransactionId() (now that the epoch counting is reliable,
due to patch 0001 which I'm repeating again here just for cfbot).  But
I suppose we want to get away from that sort of thing.  Thoughts?
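
For the record, that inference is simple once the next full xid is
reliable. Schematically, using the helpers from patch 0001:

	FullTransactionId next_fxid = ReadNextFullTransactionId();
	uint32		epoch = EpochFromFullTransactionId(next_fxid);

	/*
	 * A still-visible xid was allocated before next_fxid's xid; if it
	 * compares numerically higher, it belongs to the previous epoch.
	 */
	if (xid > XidFromFullTransactionId(next_fxid))
		epoch--;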


--
Thomas Munro
https://enterprisedb.com
From 1f513f48a603c663753ddb5fa1d8e62abf500db9 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 25 Mar 2019 12:29:23 +1300
Subject: [PATCH 1/2] Track the next transaction ID using 64 bits.

Instead of inferring epoch advancement from xids and checkpoints,
introduce a 64 bit FullTransactionId type and use it to track xid
generation.  This fixes an unlikely bug where epoch information is
corrupted if you somehow manage to make TransactionId wrap around
more than once between checkpoints.

This commit creates some very basic infrastructure allowing for later
patches to adopt 64 bit transaction IDs in more places, potentially
including table access methods that don't need to freeze tables.

Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
---
 src/backend/access/rmgrdesc/xlogdesc.c  |   4 +-
 src/backend/access/transam/clog.c   |   8 +-
 src/backend/access/transam/commit_ts.c  |   4 +-
 src/backend/access/transam/multixact.c  |  20 +
 src/backend/access/transam/subtrans.c   |   8 +-
 src/backend/access/transam/twophase.c   |  40 +++--
 src/backend/access/transam/varsup.c |  76 
 src/backend/access/transam/xact.c   |  35 +---
 src/backend/access/transam/xlog.c   | 113 ++--
 src/backend/replication/walreceiver.c   |   5 +-
 src/backend/replication/walsender.c |   5 +-
 src/backend/storage/ipc/procarray.c |  34 ++-
 src/backend/storage/ipc/standby.c   |   2 +-
 src/backend/storage/lmgr/predicate.c|   2 +-
 src/backend/utils/adt/txid.c|  13 ++-
 src/backend/utils/misc/pg_controldata.c |   5 +-
 src/bin/pg_controldata/pg_controldata.c |   5 +-
 src/bin/pg_resetwal/pg_resetwal.c   |  20 +++--
 src/include/access/transam.h|  52 ++-
 src/include/access/xlog.h   |   1 -
 src/include/catalog/pg_control.h|   6 +-
 src/include/storage/standby.h   |   2 +-
 src/include/storage/standbydefs.h   |   2 +-
 src/tools/pgindent/typedefs.list|   1 +
 24 files changed, 225 insertions(+), 238 deletions(-)

diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index bfad284be08..33060f30429 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/transam.h"
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "catalog/pg_control.h"
@@ -52,7 +53,8 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		 checkpoint->ThisTimeLineID,
 		 checkpoint->PrevTimeLineID,
 		 checkpoint->fullPageWrites ? "true" : "false",
-		 checkpoint->nextXidEpoch, checkpoint->nextXid,
+		 EpochFromFullTransactionId(checkpoint->nextFullXid),
+		 XidFromFullTransactionId(checkpoint->nextFullXid),
 		 checkpoint->nextOid,
 		 checkpoint->nextMulti,
 		 checkpoint->nextMultiOffset,
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index aa089d83fa8..3bd55fbdd33 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -749,12 +749,12 @@ ZeroCLOGPage(int pageno, bool writeXlog)
 
 /*
  * This must be called ONCE during postmaster or standalone-backend startup,
- * after StartupXLOG has initialized ShmemVariableCache->nextXid.
+ * after StartupXLOG has initialized ShmemVariableCache->nextFullXid.
  */
 void
 StartupCLOG(void)
 {
-	TransactionId xid = ShmemVariableCache->nextXid;
+	TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
 	int			pageno = TransactionIdToPage(xid);
 
 	LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -773,7 +773,7 @@ StartupCLOG(void)
 void
 TrimCLOG(void)
 {
-	TransactionId xid = ShmemVariableCache->nextXid;
+	TransactionId xid = XidFromFullTransactionId(ShmemVariableCache->nextFullXid);
 	int			pageno = TransactionIdToPage(xid);
 
 	LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
@@ -792,7 +792,7 @@ TrimCLOG(void)
 	 * but makes no WAL entry).  Let's just be safe. (We need not worry about
 	 * pages beyond the current one, since those will 
