Re: Switching XLog source from archive to streaming when primary available

2022-09-08 Thread Bharath Rupireddy
On Fri, Sep 9, 2022 at 10:57 AM Kyotaro Horiguchi
 wrote:
>
> At Thu, 8 Sep 2022 10:53:56 -0700, Nathan Bossart  
> wrote in
> > On Thu, Sep 08, 2022 at 05:16:53PM +0530, Bharath Rupireddy wrote:
> > > I'm attaching the v3 patch with the review comments addressed, please
> > > review it further.
> >
> > My general point is that we should probably offer some basic preventative
> > measure against flipping back and forth between streaming and archive
> > recovery while making zero progress.  As I noted, maybe that's as simple as
> > having WaitForWALToBecomeAvailable() attempt to restore a file from archive
> > at least once before the new parameter forces us to switch to streaming
> > replication.  There might be other ways to handle this.
>
> +1.

Hm. In that case, I think we can get rid of the timeout-based switching
mechanism and have this behaviour instead: the standby can attempt to
switch from archive to streaming mode, say, after fetching 1, 2 or a
configurable number of WAL files. In fact, this is the original idea
proposed by Satya in this thread.

If that sounds okay, I can code it up. Thoughts?
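A minimal sketch of what that could look like in
WaitForWALToBecomeAvailable() (the GUC and counter names below are
illustrative, not from the posted patch):

	/*
	 * Illustrative sketch only: after each successful restore from
	 * archive, count the file and switch the source once a hypothetical
	 * file-count threshold is reached.
	 */
	static int	archive_files_fetched = 0;

	if (currentSource == XLOG_FROM_ARCHIVE && restoredFromArchive &&
		wal_source_switch_file_threshold > 0 &&
		++archive_files_fetched >= wal_source_switch_file_threshold)
	{
		archive_files_fetched = 0;
		currentSource = XLOG_FROM_STREAM;	/* try the primary next */
	}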

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: why can't a table be part of the same publication as its schema

2022-09-08 Thread Tom Lane
Amit Kapila  writes:
> To avoid these confusions, we have disallowed adding a table if its
> schema is already part of the publication and vice-versa.

Really?

Is there logic in ALTER TABLE SET SCHEMA that rejects the command
dependent on the contents of the publication tables?  If so, are
there locks taken in both ALTER TABLE SET SCHEMA and the
publication-modifying commands that are sufficient to prevent
race conditions in such changes?

This position sounds quite untenable from here, even if I found
your arguments-in-support convincing, which I don't really.
ISTM the rule should be along the lines of "table S.T should
be published either if schema S is published or S.T itself is".
There's no obvious need to interconnect the two conditions.
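For illustration, under that rule a sequence like this would have
unambiguous semantics (hypothetical example; the ADD TABLE step is
exactly what is disallowed today):

CREATE PUBLICATION p1 FOR ALL TABLES IN SCHEMA s1;
ALTER PUBLICATION p1 ADD TABLE s1.t1;
ALTER PUBLICATION p1 DROP ALL TABLES IN SCHEMA s1;
-- s1.t1 remains published, because it is still listed itself.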

regards, tom lane




Re: ATExecAddColumn() doesn't merge inherited properties

2022-09-08 Thread Amit Langote
On Thu, Sep 8, 2022 at 9:03 PM Amit Langote  wrote:
> While testing Alvaro's "cataloguing NOT NULL constraints" patch [1], I
> noticed a behavior of inherited column merging during ALTER TABLE that
> I thought might be a bug (though see at the bottom).
>
> I feel like we may have discussed this before and decided that the
> $subject is left that way intentionally, but wanted to bring this up
> again in the light of Alvaro's patch which touches nearby code.
> Should we try to fix this?

I found a not-so-old email where we indeed called this (allowing a
child table's column to be nullable when it is marked NOT NULL in a
parent) a bug that should be fixed:

https://www.postgresql.org/message-id/CA%2BHiwqEPy72GnNa88jMkHMJaiAYiE7-zgcdPBMwNP-zWi%2Beifw%40mail.gmail.com

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: Minimum bison and flex versions

2022-09-08 Thread John Naylor
On Fri, Sep 9, 2022 at 12:07 AM Andres Freund  wrote:
>
> Hi,
>
> On 2022-09-06 13:32:32 +0700, John Naylor wrote:
> > Here are autoconf-only patches to that effect.
>
> Looks like you did actually include src/tools/msvc as well :)

Ah, in my head I meant "no patches for the Meson branch". :-)

CI fails on MSVC pg_upgrade, but that failure shows up in some existing
CF bot runs as well, and pg_upgrade doesn't have a grammar in any case,
so I have pushed; thanks for looking!

> > Since the retirement of some older buildfarm members, the oldest Bison
> > that gets regular testing is 2.3. MacOS ships that version, and will
> > continue doing so for the forseeable future because of Apple's policy
> > regarding GPLv3. While Mac users could use a package manager to install
> > a newer version, there is no compelling reason to do so at this time.
>
> s/to do so/to force them to do so/?

There are good reasons for a dev to install a newer Bison, like better
diagnostics, so I used this language.

--
John Naylor
EDB: http://www.enterprisedb.com




Re: why can't a table be part of the same publication as its schema

2022-09-08 Thread Amit Kapila
On Thu, Sep 8, 2022 at 5:06 PM Peter Eisentraut
 wrote:
>
> Apparently, you can't add a table to a publication if its schema is
> already part of the publication (and vice versa), e.g.,
>
> =# alter publication p1 add table s1.t1;
> ERROR:  22023: cannot add relation "s1.t1" to publication
> DETAIL:  Table's schema "s1" is already part of the publication or part
> of the specified schema list.
>
> Is there a reason for this?
>

Yes, because otherwise there would be confusion when dropping objects
from the publication. Consider the above case: if we had allowed it and
the user then performs ALTER PUBLICATION p1 DROP ALL TABLES IN SCHEMA
s1, should we (a) remove both schema s1 and the separately added table
(s1.t1) from that schema, or (b) just remove schema s1? One could argue
that, since the user added them separately, we should allow her to drop
them separately as well. OTOH, if we go that way, it is again
questionable why, when the user has asked to drop all the tables in the
schema (via the DROP ALL TABLES IN SCHEMA command), we would keep some
tables.

The other confusion with allowing a publication to contain both a
schema and tables from that schema: say the user has created a
publication with CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA s1,
and then asks to drop one of the tables in the schema with ALTER
PUBLICATION pub1 DROP TABLE s1.t1. That would be tricky to support, and
I don't think it makes sense either.

Similarly, what if the user first creates a publication with "CREATE
PUBLICATION pub1 FOR TABLE s1.t1, s1.t2, ... s1.t99;" and then tries to
drop all those tables in one shot with ALTER PUBLICATION pub1 DROP ALL
TABLES IN SCHEMA s1;?

To avoid these confusions, we have disallowed adding a table if its
schema is already part of the publication, and vice-versa. Allowing it
would bring no additional benefit anyway.
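For concreteness, the DROP TABLE confusion above would look like this
(hypothetical sequence; with today's rule the question never arises):

CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA s1;
ALTER PUBLICATION pub1 DROP TABLE s1.t1;
-- s1.t1 was never individually listed; honouring the DROP would require
-- keeping an exception list for the schema-level entry.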

-- 
With Regards,
Amit Kapila.




Re: Handle infinite recursion in logical replication setup

2022-09-08 Thread Amit Kapila
On Thu, Sep 8, 2022 at 9:32 AM vignesh C  wrote:
>
>
> The attached patch has the changes to handle the same.
>

Pushed. I am not completely sure whether we want the remaining
documentation patch in this thread in its current form or in a modified
one. Jonathan has shown some interest in it. I suggest you start a
separate thread for it to see if there is any interest, and close the
CF entry for this work.

-- 
With Regards,
Amit Kapila.




is_superuser is not documented

2022-09-08 Thread bt22kawamotok

Hi!

The is_superuser parameter reports whether the current user is a
superuser, and is commonly consulted. However, is_superuser is not
documented and is set to UNGROUPED in guc.c. I think is_superuser should
be added to the documentation and moved to the PRESET OPTIONS group.
What are your thoughts on this?
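
For reference, the parameter is already readable through the standard
GUC interfaces even though it is undocumented:

SHOW is_superuser;
SELECT current_setting('is_superuser');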


Regards,
Kotaro Kawamoto




Re: Switching XLog source from archive to streaming when primary available

2022-09-08 Thread Kyotaro Horiguchi
At Thu, 8 Sep 2022 10:53:56 -0700, Nathan Bossart  
wrote in 
> On Thu, Sep 08, 2022 at 05:16:53PM +0530, Bharath Rupireddy wrote:
> > I'm attaching the v3 patch with the review comments addressed, please
> > review it further.
> 
> My general point is that we should probably offer some basic preventative
> measure against flipping back and forth between streaming and archive
> recovery while making zero progress.  As I noted, maybe that's as simple as
> having WaitForWALToBecomeAvailable() attempt to restore a file from archive
> at least once before the new parameter forces us to switch to streaming
> replication.  There might be other ways to handle this.

+1.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: build remaining Flex files standalone

2022-09-08 Thread John Naylor
On Wed, Sep 7, 2022 at 4:27 PM Peter Eisentraut
 wrote:
>
> On 04.09.22 20:17, Andres Freund wrote:
> > I think, as a followup improvement, we should move gramparse.h to
> > src/backend/parser, and stop installing gram.h, gramparse.h. gramparse.h
> > already had this note:
> >
> >   * NOTE: this file is only meant to be included in the core parsing files,
> >   * i.e., parser.c, gram.y, and scan.l.
> >   * Definitions that are needed outside the core parser should be in 
> > parser.h.
> >
> > What do you think?
>
> I found in my notes:
>
> * maybe gram.h and gramparse.h should not be installed
>
> So, yeah. ;-)

It seems gramparse.h isn't installed now? In any case, here's a patch
to move gramparse to the backend dir and stop symlinking/installing
gram.h. Confusingly, MSVC didn't seem to copy gram.h to src/include,
so I'm not yet sure how it still managed to build...

-- 
John Naylor
EDB: http://www.enterprisedb.com
From ff89180d7b0fe4c95a470a657ee465c5384d6ecc Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Fri, 9 Sep 2022 11:57:39 +0700
Subject: [PATCH v1] Move gramparse.h to src/backend/parser

This allows removal of makefile rules that symlink gram.h to
src/include/parser. While at it, stop installing gram.h. This
seems unnecessary at the present time.

Idea from Andres Freund and Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/20220904181759.px6uosll6zbxcum5%40awork3.anarazel.de
---
 src/backend/Makefile| 7 +--
 src/backend/parser/gram.y   | 2 +-
 src/{include => backend}/parser/gramparse.h | 4 ++--
 src/backend/parser/parser.c | 2 +-
 src/backend/parser/scan.l   | 2 +-
 src/include/Makefile| 4 ++--
 src/include/parser/.gitignore   | 1 -
 src/tools/msvc/Install.pm   | 4 
 src/tools/pginclude/cpluspluscheck  | 1 -
 src/tools/pginclude/headerscheck| 1 -
 10 files changed, 8 insertions(+), 20 deletions(-)
 rename src/{include => backend}/parser/gramparse.h (97%)
 delete mode 100644 src/include/parser/.gitignore

diff --git a/src/backend/Makefile b/src/backend/Makefile
index 5c4772298d..42cc3bda1d 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -153,12 +153,7 @@ submake-utils-headers:
 
 .PHONY: generated-headers
 
-generated-headers: $(top_builddir)/src/include/parser/gram.h $(top_builddir)/src/include/storage/lwlocknames.h submake-catalog-headers submake-nodes-headers submake-utils-headers
-
-$(top_builddir)/src/include/parser/gram.h: parser/gram.h
-	prereqdir=`cd '$(dir $<)' >/dev/null && pwd` && \
-	  cd '$(dir $@)' && rm -f $(notdir $@) && \
-	  $(LN_S) "$$prereqdir/$(notdir $<)" .
+generated-headers: $(top_builddir)/src/include/storage/lwlocknames.h submake-catalog-headers submake-nodes-headers submake-utils-headers
 
 $(top_builddir)/src/include/storage/lwlocknames.h: storage/lmgr/lwlocknames.h
 	prereqdir=`cd '$(dir $<)' >/dev/null && pwd` && \
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 0492ff9a66..668ad3fa8e 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -55,9 +55,9 @@
 #include "catalog/pg_trigger.h"
 #include "commands/defrem.h"
 #include "commands/trigger.h"
+#include "gramparse.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
-#include "parser/gramparse.h"
 #include "parser/parser.h"
 #include "storage/lmgr.h"
 #include "utils/date.h"
diff --git a/src/include/parser/gramparse.h b/src/backend/parser/gramparse.h
similarity index 97%
rename from src/include/parser/gramparse.h
rename to src/backend/parser/gramparse.h
index 41b753a96c..c4726c618d 100644
--- a/src/include/parser/gramparse.h
+++ b/src/backend/parser/gramparse.h
@@ -11,7 +11,7 @@
  * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * src/include/parser/gramparse.h
+ * src/backend/parser/gramparse.h
  *
  *-
  */
@@ -26,7 +26,7 @@
  * NB: include gram.h only AFTER including scanner.h, because scanner.h
  * is what #defines YYLTYPE.
  */
-#include "parser/gram.h"
+#include "gram.h"
 
 /*
  * The YY_EXTRA data that a flex scanner allows us to pass around.  Private
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 50227cc098..ef85d3bb68 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -22,7 +22,7 @@
 #include "postgres.h"
 
 #include "mb/pg_wchar.h"
-#include "parser/gramparse.h"
+#include "gramparse.h"
 #include "parser/parser.h"
 #include "parser/scansup.h"
 
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index 882e081aae..db8b0fe8eb 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -36,7 +36,7 @@
 #include 
 
 #include "common/string.h"
-#include "parser/gramparse.h"
+#include "gramparse.h"
 #include "parser/parser.

Re: Switching XLog source from archive to streaming when primary available

2022-09-08 Thread Kyotaro Horiguchi
Being late for the party.

It seems to me that the function is getting too long.  I think we
might want to move the core part of the patch into another function.

I think it might be better if intentionalSourceSwitch doesn't need
lastSourceFailed set. It would look like this:

>	if (lastSourceFailed || switchSource)
>	{
>		if (nonblocking && lastSourceFailed)
>			return XLREAD_WOULDBLOCK;


+	if (first_time)
+		last_switch_time = curr_time;
..
+	if (!first_time &&
+		TimestampDifferenceExceeds(last_switch_time, curr_time,
..
+	/* We're not here for the first time any more */
+	if (first_time)
+		first_time = false;

I don't think the flag first_time is needed.
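
A sketch of how that could look, treating an unset last_switch_time as
"start the clock now" (variable and GUC names are illustrative):

	static TimestampTz last_switch_time = 0;

	if (last_switch_time == 0)
		last_switch_time = curr_time;	/* first call: just start the clock */
	else if (TimestampDifferenceExceeds(last_switch_time, curr_time,
										wal_source_switch_interval))
	{
		switchSource = true;
		last_switch_time = curr_time;
	}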


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




When should we bump the logical replication protocol version?

2022-09-08 Thread houzj.f...@fujitsu.com
Hi,

When implementing the feature to perform streaming of logical
transactions by background workers [1], we plan to extend the
LOGICAL_REP_MSG_STREAM_ABORT message to send the additional "abort_lsn"
and "abort_time", so that we can advance the origin lsn on the
subscriber and restart streaming from the correct position in case of a
crash.

Since the LOGICAL_REP_MSG_STREAM_ABORT message is changed, we planned to
bump the logical replication protocol version. But when reviewing the
code, we felt it could also work without a new protocol version if we
check the new streaming option value ('parallel'): on the publisher, we
can check whether the streaming option is set to the new value
('parallel') and send the extra abort information only in that case.
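
For what it's worth, a rough sketch of that publisher-side alternative
(the write_abort_info flag is an assumption based on the description
above, not committed code):

	pq_sendbyte(out, LOGICAL_REP_MSG_STREAM_ABORT);
	pq_sendint32(out, xid);
	pq_sendint32(out, subxid);
	if (write_abort_info)	/* true only when streaming = 'parallel' */
	{
		pq_sendint64(out, abort_lsn);
		pq_sendint64(out, abort_time);
	}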

I think it's reasonable to bump the protocol version number if we change any
protocol message even if we only add some new fields to the existing message,
and that's what we've always done.

My only personal concern is that I didn't find any documentation that
clearly states the standard for when to bump the logical replication
protocol version, which makes me a little unsure whether this is the
right thing to do.

So, I'd like to confirm: is it OK to modify or add some fields without
bumping the protocol version? Or is it the standard to bump it whenever
we change any protocol message?

Any hints will be appreciated.

[1] 
https://www.postgresql.org/message-id/CAA4eK1%2BwyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw%40mail.gmail.com

Best regards,
Hou zj




Re: pg15b3: recovery fails with wal prefetch enabled

2022-09-08 Thread Thomas Munro
On Wed, Sep 7, 2022 at 1:56 AM Jonathan S. Katz  wrote:
> To close this loop, I added a section for "fixed before RC1" to Open
> Items since this is presumably the next release. We can include it there
> once committed.

Done yesterday.

To tie up a couple of loose ends from this thread:

On Thu, Sep 1, 2022 at 2:48 PM Justin Pryzby  wrote:
> Also, pg_waldump seems to fail early with -w:
> [pryzbyj@template0 ~]$ sudo /usr/pgsql-15/bin/pg_waldump -w -R 1663/16881/2840 -F vm -p /mnt/tmp/15/data/pg_wal 00011201001C
> rmgr: Heap2   len (rec/tot): 64/   122, tx:  0, lsn: 1201/1CAF2658, prev 1201/1CAF2618, desc: VISIBLE cutoff xid 3681024856 flags 0x01, blkref #0: rel 1663/16881/2840 fork vm blk 0 FPW, blkref #1: rel 1663/16881/2840 blk 54
> pg_waldump: error: error in WAL record at 1201/1CD90E48: invalid record length at 1201/1CD91010: wanted 24, got 0

That looks OK to me.  With or without -w, we get as far as
1201/1CD91010 and then hit zeroes.

On Thu, Sep 1, 2022 at 5:35 PM Thomas Munro  wrote:
> So it *looks* like it finished early (and without the expected
> error?).  But it also looks like it replayed that record, according to
> the page LSN.  So which is it?

The reason 1201/1CAF84B0 appeared on a page despite not having been
replayed (due to the bug) is just that vismap pages don't follow the
usual logging rules, and can be read in by heap records that don't
mention the vm page (and therefore carry no FPW for it).  So we can end
up reading a page from disk with a future LSN on it.




Re: pg_walinspect float4/float8 confusion

2022-09-08 Thread Bharath Rupireddy
On Thu, Sep 8, 2022 at 5:23 PM Peter Eisentraut
 wrote:
>
> The pg_walinspect function pg_get_wal_stats() has output arguments
> declared as float4 (count_percentage, record_size_percentage, etc.), but
> the internal computations are all done in type double.  Is there a
> reason why this is then converted to float4 for output?  It probably
> doesn't matter in practice, but it seems unnecessarily confusing.  Or at
> least add a comment so it doesn't look like an accident.  Also compare
> with pgstattuple, which uses float8 in its SQL interface for similar data.

Thanks for finding this. There's no specific reason as such. However,
it's good for what the code does internally to be in sync with what it
exposes to users: pg_walinspect uses the double data type
(double-precision floating point) for internal calculations and then
cuts the result down to a single-precision float4 for users. Attaching
a patch herewith. I'm not sure whether this needs to be backported; if
we were to, IMO it should be, to reduce the code diff.
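
For context, the columns in question are the percentage outputs of
pg_get_wal_stats(), e.g. (illustrative query; the LSN range is made up):

SELECT "resource_manager/record_type", count_percentage,
       record_size_percentage
FROM pg_get_wal_stats('0/1000000', '0/2000000');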

While on it, I found that pgstattuple uses uint64 for one internal
percentile calculation, as opposed to the double data type used for the
others. Attaching a small patch to fix that too.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Use-float8-datatype-for-percentiles-in-pg_walinsp.patch
Description: Binary data


v1-0001-Use-double-datatype-for-percentiles-in-pgstattupl.patch
Description: Binary data


[PATCH]Feature improvement for MERGE tab completion

2022-09-08 Thread bt22kawamotok

Hi!

I created a patch improving MERGE tab completion.
Currently, when "MERGE INTO dst AS d USING src AS s ON d.key = s.key
WHEN " has been typed, "MATCHED" and "NOT MATCHED" are not completed.
There is also a problem that typing "MERGE INTO a AS " completes
"USING".

This patch solves the above problems.
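
With the patch, for example (an illustrative psql session):

=# MERGE INTO dst AS d USING src AS s ON d.key = s.key WHEN <TAB>
MATCHED      NOT MATCHED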

Regards,
Kotaro Kawamoto

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 62a39779b9..ed1f8c5f17 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -4068,7 +4068,7 @@ psql_completion(const char *text, int start, int end)
 	/* with [AS] alias */
 	else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny))
 		COMPLETE_WITH("USING");
-	else if (TailMatches("MERGE", "INTO", MatchAny, MatchAny))
+	else if (TailMatches("MERGE", "INTO", MatchAny, MatchAny) && !TailMatches("AS"))
 		COMPLETE_WITH("USING");
 	else if (TailMatches("MERGE", "INTO", MatchAny, "AS", MatchAny, "USING"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
@@ -4095,6 +4095,12 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("WHEN MATCHED", "WHEN NOT MATCHED");
 	else if (TailMatches("USING", MatchAny, MatchAny, "ON", MatchAny))
 		COMPLETE_WITH("WHEN MATCHED", "WHEN NOT MATCHED");
+	else if (TailMatches("USING", MatchAny, "ON", MatchAny, "WHEN"))
+		COMPLETE_WITH("MATCHED", "NOT MATCHED");
+	else if (TailMatches("USING", MatchAny, "AS", MatchAny, "ON", MatchAny, "WHEN"))
+		COMPLETE_WITH("MATCHED", "NOT MATCHED");
+	else if (TailMatches("USING", MatchAny, MatchAny, "ON", MatchAny, "WHEN"))
+		COMPLETE_WITH("MATCHED", "NOT MATCHED");
 	else if (TailMatches("WHEN", "MATCHED"))
 		COMPLETE_WITH("THEN", "AND");
 	else if (TailMatches("WHEN", "NOT", "MATCHED"))


Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~

2022-09-08 Thread Michael Paquier
On Thu, Sep 08, 2022 at 08:29:20AM -0500, Justin Pryzby wrote:
> It seems silly to do it at runtime if it's possible to do it at link time.

Thanks.  This set of simplifications is too good to let go, and I have
a window to look after the buildfarm today and tomorrow, which should
be enough to take action if need be.  Hence, I have applied the
patch.  Now, let's see what the buildfarm tells us ;)
--
Michael


signature.asc
Description: PGP signature


Re: Different compression methods for FPI

2022-09-08 Thread Michael Paquier
On Wed, Sep 07, 2022 at 07:02:07PM +0900, Michael Paquier wrote:
> Before posting my previous patch, I have considered a few options:
> - Extend errormsg_buf with an error code, but the frontend does not
> care about that.
> - Make RestoreBlockImage() a backend-only routine and provide a better
> error control without filling in errormsg_buf, but I'd like to believe
> that this code is useful for some frontend code even if core does not
> use it yet in a frontend context.
> - Change the signature of RestoreBlockImage() to return an enum with
> at least a tri state instead of a boolean.  For this one I could not
> convince myself that this is worth the complexity, as we are talking
> about inconsistent build options between nodes doing physical
> replication, and the error message is the useful piece to know what's
> happening (frontends are only going to consume the error message
> anyway).

After a second look, I was not feeling enthusiastic about adding more
complications in this code path for this case, so I have finished by
applying my previous patch to address this open item.

I am wondering if there is a use-case for backpatching something like
that to older versions though?  FPW compression is available since 9.5
but the code has never consumed the error message produced by
RestoreBlockImage() when pglz fails to decompress an image, and there
is equally no exact information provided when the status data of the
block in the record is incorrect, even if recovery provides some
context associated to the record being replayed.
--
Michael


signature.asc
Description: PGP signature


Re: Improve description of XLOG_RUNNING_XACTS

2022-09-08 Thread Masahiko Sawada
On Tue, Aug 23, 2022 at 11:53 AM Kyotaro Horiguchi
 wrote:
>
> At Mon, 15 Aug 2022 11:16:56 +0900, Masahiko Sawada  
> wrote in
> > Sorry for the late reply.
>
> No worries. Anyway I was in a long (as a Japanese:) vacation.
>
> > On Thu, Jul 28, 2022 at 4:29 PM Kyotaro Horiguchi
> >  wrote:
> > > record is sound". Is it any trouble with assuming the both *can*
> > > happen at once?  If something's broken, it will be reflected in the
> > > output.
> >
> > Fair point. We may not need to interpret the contents.
>
> Yeah.
>
> > On Thu, Jul 28, 2022 at 3:24 PM Kyotaro Horiguchi
> >  wrote:
> > >
> > > Another point is if the xid/subxid lists get long, I see it annoying
> > > that the "overflowed" messages goes far away to the end of the long
> > > line. Couldn't we rearrange the item order of the line as the follows?
> > >
> > > nextXid %u latestCompletedXid %u oldestRunningXid %u;[ subxid 
> > > overflowed;][ %d xacts: %u %u ...;][ subxacts: %u %u ..]
> > >
> >
> > I'm concerned that we have two information of subxact apart. Given
> > that showing both individual subxacts and "overflow" is a bug, I think
>
> Bug or every kind of breakage of the file. So if "overflow"ed, we
> don't need to show a subxid there but I thought that there's no need
> to change behavior in that case since it scarcely happens.
>
> > we can output like:
> >
> > if (xlrec->subxcnt > 0)
> > {
> > appendStringInfo(buf, "; %d subxacts:", xlrec->subxcnt);
> > for (i = 0; i < xlrec->subxcnt; i++)
> > appendStringInfo(buf, " %u", xlrec->xids[xlrec->xcnt + i]);
> > }
> >
> > if (xlrec->subxid_overflow)
> > appendStringInfoString(buf, "; subxid overflowed");
>
> Yea, it seems like what I proposed upthread. I'm fine with that since
> it is an abnormal situation.
>
> > Or we can output the "subxid overflowed" first.
>
> (I prefer this, as that doesn't change the output in the normal case
> but the abnormality will be easily seen if it happens.)
>

Updated the patch accordingly.

Regards,

-- 
Masahiko Sawada


v3-0001-Improve-description-of-XLOG_RUNNING_XACTS.patch
Description: Binary data


Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf

2022-09-08 Thread Tom Lane
Jacob Champion  writes:
> On Fri, Aug 19, 2022 at 1:13 AM Drouvot, Bertrand  wrote:
>> This is why I think username filtering with regular expressions would
>> provide its own advantages.

> I think your motivation for the feature is solid.

Yeah.  I'm not sure that I buy the argument that this is more useful
than writing a role name and controlling things with GRANT ROLE, but
it's a plausible alternative with properties that might win in some
use-cases.  So I see little reason not to allow it.

I'd actually ask why stop here?  In particular, why not do the same
with the database-name column, especially since that does *not*
have the ability to use roles as a substitute for a wildcard entry?

> I think you're going to have to address backwards compatibility
> concerns. Today, I can create a role named "/a", and I can put that
> into the HBA without quoting it. I'd be unamused if, after an upgrade,
> my rule suddenly matched any role name containing an 'a'.

Meh ... that concern seems overblown to me.  I guess it's possible
that somebody has an HBA entry that looks like that, but it doesn't
seem very plausible.  Note that we made this exact same change in
pg_ident.conf years ago, and AFAIR we got zero complaints.
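
(For reference, the pg_ident.conf precedent: a system-username beginning
with a slash is treated as a regular expression, and \1 in the
PG-username substitutes the first capture group, e.g.

# MAPNAME       SYSTEM-USERNAME           PG-USERNAME
mymap           /^(.*)@example\.com$      \1

so the same quoting question was settled there long ago.)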

> Speaking of partial matches, should this feature allow them? Maybe
> rules should have to match the entire username instead, and sidestep
> the inevitable "I forgot to anchor my regex" problems?

I think the pg_ident.conf precedent is binding on us here.  If we
make this one work differently, nobody's going to thank us for it,
they're just going to wonder "did the left hand not know what the
right hand already did?"

regards, tom lane




Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf

2022-09-08 Thread Jacob Champion
On Fri, Aug 19, 2022 at 1:13 AM Drouvot, Bertrand  wrote:
> This is why I think username filtering with regular expressions would
> provide its own advantages.
>
> Thoughts? Looking forward to your feedback,

I think your motivation for the feature is solid. It is killing me a
bit that this is making it easier to switch authentication methods
based on the role name, when I suspect what someone might really want
is to switch authentication methods based on the ID the user is trying
to authenticate with. But that's not your fault or problem to fix,
because the startup packet doesn't currently have that information.
(It does make me wonder whether I withdrew my PGAUTHUSER proposal [1]
a month too early. And man, do I wish that pg_ident and pg_hba were
one file.)

I think you're going to have to address backwards compatibility
concerns. Today, I can create a role named "/a", and I can put that
into the HBA without quoting it. I'd be unamused if, after an upgrade,
my rule suddenly matched any role name containing an 'a'.

Speaking of partial matches, should this feature allow them? Maybe
rules should have to match the entire username instead, and sidestep
the inevitable "I forgot to anchor my regex" problems?

Thanks,
--Jacob

[1] https://commitfest.postgresql.org/38/3314/




Re: pg_upgrade failing for 200+ million Large Objects

2022-09-08 Thread Nathan Bossart
On Thu, Sep 08, 2022 at 04:29:10PM -0700, Jacob Champion wrote:
> On Thu, Sep 8, 2022 at 4:18 PM Nathan Bossart  
> wrote:
>> IIUC the main benefit of this approach is that it isn't dependent on
>> binary-upgrade mode, which seems to be a goal based on the discussion
>> upthread [0].
> 
> To clarify, I agree that pg_dump should contain the core fix. What I'm
> questioning is the addition of --dump-options to make use of that fix
> from pg_upgrade, since it also lets the user do "exciting" new things
> like --exclude-schema and --include-foreign-data and so on. I don't
> think we should let them do that without a good reason.

Ah, yes, I think that is a fair point.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Reducing the chunk header sizes on all memory context types

2022-09-08 Thread David Rowley
On Fri, 9 Sept 2022 at 10:53, Tom Lane  wrote:
> I hate to give up MemoryContextContains altogether.  The assertions
> that David nuked in b76fb6c2a had some value I think,

Those can be put back if we decide to keep MemoryContextContains.
Those newly added Asserts had to go for now because b76fb6c2a makes
MemoryContextContains temporarily always return false.

> The implementation I suggested upthread would reliably distinguish
> malloc from palloc, and while it is potentially a tad expensive
> I don't think it's too much so for Assert checks.  I don't have an
> objection to trying to get to a place where we only use it in
> Assert, though.

I really think the Assert-only form of MemoryContextContains() is the
best move, and if it's Assert-only, then we can do the
loop-over-the-blocks idea as you described and as I drafted in [1].

If the need comes up where we're certain we always have a pointer to
some allocated chunk, but need to know whether it's in some particular
memory context, then the proper way to express that, I think, is:

if (GetMemoryChunkContext(pointer) == somecontext)

If we're worried about getting that wrong, we can beef up the
MemoryChunk struct with a magic_number field in
MEMORY_CONTEXT_CHECKING builds to ensure we catch any code which
passes invalid pointers.
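
As a concrete (illustrative) shape of that GetMemoryChunkContext()
usage, assuming the caller really does hold a palloc'd chunk and a
hypothetical deep-copy helper:

	/* Ensure 'result' lives in aggcontext; copy only when it is elsewhere. */
	if (GetMemoryChunkContext(result) != aggcontext)
	{
		MemoryContext oldcxt = MemoryContextSwitchTo(aggcontext);

		result = copy_result(result);	/* hypothetical deep-copy helper */
		MemoryContextSwitchTo(oldcxt);
	}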

David

[1] 
https://www.postgresql.org/message-id/CAApHDvoKjOmPQeokicwDuO-_Edh=tKp23-=jskycykfw5qu...@mail.gmail.com




Re: pg_upgrade failing for 200+ million Large Objects

2022-09-08 Thread Jacob Champion
On Thu, Sep 8, 2022 at 4:18 PM Nathan Bossart  wrote:
> IIUC the main benefit of this approach is that it isn't dependent on
> binary-upgrade mode, which seems to be a goal based on the discussion
> upthread [0].

To clarify, I agree that pg_dump should contain the core fix. What I'm
questioning is the addition of --dump-options to make use of that fix
from pg_upgrade, since it also lets the user do "exciting" new things
like --exclude-schema and --include-foreign-data and so on. I don't
think we should let them do that without a good reason.

Thanks,
--Jacob




Re: pg_upgrade failing for 200+ million Large Objects

2022-09-08 Thread Nathan Bossart
On Wed, Sep 07, 2022 at 02:42:05PM -0700, Jacob Champion wrote:
> Just to clarify, was Justin's statement upthread (that the XID problem
> is fixed) correct? And is this patch just trying to improve the
> remaining memory and lock usage problems?

I think "fixed" might not be totally accurate, but that is the gist.

> I took a quick look at the pg_upgrade diffs. I agree with Jan that the
> escaping problem is a pretty bad smell, but even putting that aside for
> a bit, is it safe to expose arbitrary options to pg_dump/restore during
> upgrade? It's super flexible, but I can imagine that some of those flags
> might really mess up the new cluster...
> 
> And yeah, if you choose to do that then you get to keep both pieces, I
> guess, but I like that pg_upgrade tries to be (IMO) fairly bulletproof.

IIUC the main benefit of this approach is that it isn't dependent on
binary-upgrade mode, which seems to be a goal based on the discussion
upthread [0].  I think it'd be easily possible to fix only pg_upgrade by
simply dumping and restoring pg_largeobject_metadata, as Andres suggested
in 2018 [1].  In fact, it seems like it ought to be possible to just copy
pg_largeobject_metadata's files as was done before 12a53c7.  AFAICT this
would only work for clusters upgrading from v12 and newer, and it'd break
if any of the underlying data types change their storage format.  This
seems unlikely for OIDs, but there is ongoing discussion about changing
aclitem.

I still think this is a problem worth fixing, but it's not yet clear how to
proceed.

[0] https://postgr.es/m/227228.1616259220%40sss.pgh.pa.us
[1] https://postgr.es/m/20181122001415.ef5bncxqin2y3esb%40alap3.anarazel.de

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Reducing the chunk header sizes on all memory context types

2022-09-08 Thread Tom Lane
Andres Freund  writes:
> On 2022-09-08 14:10:36 -0400, Tom Lane wrote:
>> No, I don't think we can get away with that.  See int8inc() for a
>> counterexample.

> What I was suggesting a bit below the bit quoted above, was that we'd copy
> whenever there's no finalfunc or if the finalfunc doesn't take an internal
> parameter.

Hmm, OK, I was confusing this with the optimization for transition
functions; but that one is looking for pointer equality rather than
checking MemoryContextContains.  So maybe this'd work.

> This business with interpreting random memory as a palloc'd chunk seems like a
> fundamentally wrong approach worth incurring some pain to fix.

I hate to give up MemoryContextContains altogether.  The assertions
that David nuked in b76fb6c2a had some value I think, and I was hoping
to address your concerns in [1] by adding Assert(MemoryContextContains())
to guc_free.  But I'm not sure how much that'll help to diagnose
you-malloced-instead-of-pallocing if the result is not an assertion failure
but a segfault in a not-obviously-related place.  The failure at guc_free
is already going to be some distance from the scene of the crime.

The implementation I suggested upthread would reliably distinguish
malloc from palloc, and while it is potentially a tad expensive
I don't think it's too much so for Assert checks.  I don't have an
objection to trying to get to a place where we only use it in
Assert, though.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/20220905233233.jhcu5jqsrtosmgh5%40awork3.anarazel.de




Re: Fix gin index cost estimation

2022-09-08 Thread Tom Lane
Ronan Dunklau  writes:
> The problem I'm trying to solve is that, contrary to btree, gist and sp-gist 
> indexes, gin indexes do not charge any cpu-cost for descending the entry tree.

I looked this over briefly.  I think you are correct to charge an
initial-search cost per searchEntries count, but don't we also need to
scale up by arrayScans, similar to the "corrections for cache effects"?

+	 * We model index descent costs similarly to those for btree, but we also
+	 * need an idea of the tree_height.
+	 * We use numEntries / numEntryPages as the fanout factor.

I'm not following that calculation?  It seems like it'd be correct
only for a tree height of 1, although maybe I'm just misunderstanding
this (overly terse, perhaps) comment.

+	 * We charge descentCost once for every entry
+	 */
+	if (numTuples > 1)
+	{
+		descentCost = ceil(log(numTuples) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+	}

I had to read this twice before absorbing the point of the numTuples
test.  Maybe help the reader a bit:

+	if (numTuples > 1)		/* ensure positive log() */

Personally I'd duplicate the comments from nbtreecostestimate rather
than just assuming the reader will go consult them.  For that matter,
why didn't you duplicate nbtree's logic for charging for SA scans?
This bit seems just as relevant for GIN:

 * If there are ScalarArrayOpExprs, charge this once per SA scan.  The
 * ones after the first one are not startup cost so far as the overall
 * plan is concerned, so add them only to "total" cost.

Keep in mind also that pgindent will have its own opinions about how to
format these comments, and it can out-stubborn you.  Either run the
comments into single paragraphs, or if you really want them to be two
paras then leave an empty comment line between.  Another formatting
nitpick is that you seem to have added a number of unnecessary blank
lines.

regards, tom lane




Re: [PATCH] Log details for client certificate failures

2022-09-08 Thread Jacob Champion
On Thu, Jul 28, 2022 at 9:19 AM Jacob Champion  wrote:
> On Thu, Jul 21, 2022 at 4:29 PM Jacob Champion  
> wrote:
> > v4 attempts to fix this by letting the check hooks pass
> > MCXT_ALLOC_NO_OOM to pg_clean_ascii(). (It's ignored in the frontend,
> > which just mallocs.)
>
> Ping -- should I add an open item somewhere so this isn't lost?

Trying again. Peter, is this approach acceptable? Should I try something else?

Thanks,
--Jacob




Re: [RFC] building postgres with meson - v11

2022-09-08 Thread samay sharma
On Tue, Sep 6, 2022 at 9:46 PM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 02.09.22 01:12, samay sharma wrote:
> > So, I was thinking of the following structure:
> > - Supported Platforms
> > - Getting the Source
> > - Building with make and autoconf
> >-- Short version
> >-- Requirements
> >-- Installation Procedure and it's subsections
> > - Building with Meson
> >-- Short version
> >-- Requirements
> >-- Installation Procedure and it's subsections
> > - Post-installation Setup
> > - Platform specific notes
>
> I like that.
>

Attached is a docs-only patch with that structure. We need to update the
platform-specific notes section to add meson-specific nuances. Also, in
terms of supported platforms, if there are platforms that work with make
but not with meson, we have to note that too.

Regards,
Samay

>
> > As a follow up patch, we could also try to fit the Windows part into
> > this model. We could add a Building with visual C++ or Microsoft windows
> > SDK section. It doesn't have a short version but follows the remaining
> > template of requirements and installation procedure subsections
> > (Building, Cleaning and Installing and Running Regression tests) well.
>
> We were thinking about removing the old Windows build system for PG 16.
> Let's see how that goes.  Otherwise, yes, that would be good as well.
>


v1-0001-Add-meson-docs-to-existing-build-from-source.patch
Description: Binary data


Re: Reducing the chunk header sizes on all memory context types

2022-09-08 Thread Andres Freund
Hi,

On 2022-09-08 14:10:36 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > If there is a finalfunc, they're typically going to return data from within
> > the current memory context, but could legitimately also return part of the
> > data from curaggcontext. Perhaps we could forbid that?
> 
> No, I don't think we can get away with that.  See int8inc() for a
> counterexample.

What I was suggesting a bit below the bit quoted above, was that we'd copy
whenever there's no finalfunc or if the finalfunc doesn't take an internal
parameter. And that finalfuncs returning byref with an internal parameter can
be expected to return memory allocated in the right context (which we of
course could check with an assert).  It's not super satisfying - but I don't
think it'd have the problem you describe above.

Alternatively we could add a column to pg_aggregate denoting this. That'd only
be permissible to set for a superuser presumably.


This business with interpreting random memory as a palloc'd chunk seems like a
fundamentally wrong approach worth incurring some pain to fix.

Greetings,

Andres Freund




Re: Fix gin index cost estimation

2022-09-08 Thread Tom Lane
Ronan Dunklau  writes:
> Following the bug report at [1], I sent the attached patch to pgsql-bugs 
> mailing list. I'm starting a thread here to add it to the next commitfest.

That link didn't work easily for me (possibly because it got split across
lines).  Here's another one for anybody having similar issues:

https://www.postgresql.org/message-id/flat/CABs3KGQnOkyQ42-zKQqiE7M0Ks9oWDSee%3D%2BJx3-TGq%3D68xqWYw%40mail.gmail.com

> The problem I'm trying to solve is that, contrary to btree, gist and sp-gist 
> indexes, gin indexes do not charge any cpu-cost for descending the entry tree.

As I said in the bug report thread, I think we really need to take a look
at all of our index AMs not just GIN.  I extended your original reproducer
script to check all the AMs (attached), and suppressed memoize because it
seemed to behave differently for different AMs.  Here's what I see for the
estimated costs of the inner indexscan, and the actual runtime, for each:

btree:
   ->  Index Only Scan using t1_btree_index on t1  (cost=0.28..0.30 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=2)
 Execution Time: 19.763 ms

gin (gin_trgm_ops):
   ->  Bitmap Heap Scan on t1  (cost=0.01..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=2)
         ->  Bitmap Index Scan on t1_gin_index  (cost=0.00..0.01 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=2)
 Execution Time: 75.216 ms

gist:
   ->  Index Only Scan using t1_gist_index on t1  (cost=0.14..0.16 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=2)
 Execution Time: 277.799 ms

spgist:
   ->  Index Only Scan using t1_spgist_index on t1  (cost=0.14..0.16 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=2)
 Execution Time: 51.407 ms

hash:
   ->  Index Scan using t1_hash_index on t1  (cost=0.00..0.02 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=2)
 Execution Time: 13.090 ms

brin:
   ->  Bitmap Heap Scan on t1  (cost=0.03..18.78 rows=1 width=4) (actual time=0.049..0.093 rows=1 loops=2)
         ->  Bitmap Index Scan on t1_brin_index  (cost=0.00..0.03 rows=1500 width=0) (actual time=0.003..0.003 rows=70 loops=2)
 Execution Time: 1890.161 ms

bloom:
   ->  Bitmap Heap Scan on t1  (cost=11.25..11.26 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=2)
         ->  Bitmap Index Scan on t1_bloom_index  (cost=0.00..11.25 rows=1 width=0) (actual time=0.003..0.003 rows=2 loops=2)
 Execution Time: 88.703 ms

(These figures shouldn't be trusted too much because I did nothing
to suppress noise.  They seem at least somewhat reproducible, though.)

So, taking btree as our reference point, gin has clearly got a problem
because it's estimating less than a tenth as much cost despite actually
being nearly 4X slower.  gist and spgist are not as bad off, but
nonetheless they claim to be cheaper than btree when they are not.
The result for hash looks suspicious as well, though at least we'd
make the right index choice for this particular case.  brin and bloom
correctly report being a lot more expensive than btree, so at least
for the moment I'm not worried about them.

BTW, the artificially small random_page_cost doesn't really affect
this much.  If I set it to a perfectly reasonable value like 1.0,
gin produces a saner cost estimate but gist, spgist, and hash do
not change their estimates at all.  btree's estimate doesn't change
either, which seems like it might be OK for index-only scans but
I doubt I believe it for index scans.  In any case, at least one
of gin and hash is doing it wrong.

In short, I think gist and spgist probably need a minor tweak to
estimate more CPU cost than they do now, and hash needs a really
hard look at whether it's sane at all.

That's all orthogonal to the merits of your patch for gin,
so I'll respond separately about that.

regards, tom lane

CREATE EXTENSION btree_gist;
CREATE EXTENSION pg_trgm;
CREATE EXTENSION bloom;

drop table if exists t1,t2;

CREATE TABLE t1 (
  id text
);

CREATE TABLE t2 (
  id text primary key,
  t1_id text
);

insert into t1 (id)
select i::text FROM generate_series(1, 1500) as i;

insert into t2 (id, t1_id)
SELECT i::text, (i % 1500 + 1)::text FROM generate_series(1, 2) i;

ANALYZE t1, t2;

SET random_page_cost = 0.0001;
-- SET random_page_cost = 1.0;
SET enable_hashjoin = off;
SET enable_mergejoin = off;
SET enable_memoize = off;

CREATE INDEX t1_btree_index ON t1 USING btree (id);
explain (analyze, buffers) select * from t1 join t2 on t1.id = t2.t1_id ;
DROP INDEX t1_btree_index;

CREATE INDEX t1_gin_index ON t1 USING gin (id gin_trgm_ops);
explain (analyze, buffers) select * from t1 join t2 on t1.id = t2.t1_id ;
DROP INDEX t1_gin_index;

CREATE INDEX t1_gist_index ON t1 USING gist (id);
explain (analyze, buffers) select * from t1 join t2 on t1.id = t2.t1_id ;
DROP INDEX t1_gist_index;

CREATE INDEX t1_spgist_index ON t1 USING spgist (id);
explain (analyze, buffers) select * from t1 join t2 on t1.id = 

Re: CFM Manager

2022-09-08 Thread Jacob Champion
On Tue, Aug 23, 2022 at 9:27 AM Jacob Champion  wrote:
> I have updated the CFM checklist through the "2 days before CF"
> section. Let me know if you have questions/suggestions.

I've additionally removed references to "shame emails" for
non-reviewers; I don't think CFMs are doing that anymore and I don't
think it'd be particularly productive anyway. I've also partially
rewritten the "every 5 to 7 days" suggestions.

I still have yet to update the section "5 to 7 days before end of CF"
and onward.

--Jacob




Re: Adding CommandID to heap xlog records

2022-09-08 Thread Tom Lane
Matthias van de Meent  writes:
> Please find attached a patch that adds the CommandId of the inserting
> transaction to heap (batch)insert, update and delete records. It is
> based on the changes we made in the fork we maintain for Neon.

This seems like a very significant cost increment with returns
to only a minuscule number of users.  We certainly cannot consider
it unless you provide some evidence that that impression is wrong.

regards, tom lane




Adding CommandID to heap xlog records

2022-09-08 Thread Matthias van de Meent
Hi,

The current WAL records generated by the Heap tableAM do not contain
the command ID of the query that inserted/updated/deleted the records.
The CID is not included in XLog because it is useful only to
visibility checks in an active read/write transaction, which currently
only appear in a primary node.

In Neon [0], we're using XLog to reconstruct page state, as opposed to
writing out dirty pages. This has the benefit of saving write IO for
dirty page writes, but this does mean that we need the CID in heap
insert/update/delete records to correctly mark the tuples, such that
modified pages that are flushed from the buffer pool get reconstructed
correctly. A more detailed write-up why we do this is here [1].

Neon does not need to be the only user of this API, as adding CID to
xlog records also allows the primary to offload (partial) queries to a
remote physical replica that would utilise the same transaction and
snapshot of the primary.
Right now, it's not possible to offload the read-component of RW
queries to a secondary [2]. The attached patch would make multi-node
transactions possible on systems with a single primary node and
multiple read replicas, without the need for prepared commits and
special extra code to achieve snapshot consistency, as a consistent
snapshot could be copied and used by physical replicas (think parallel
workers, but on a different server).

Please find attached a patch that adds the CommandId of the inserting
transaction to heap (batch)insert, update and delete records. It is
based on the changes we made in the fork we maintain for Neon.
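
To give an idea of the shape of the change, the insert record would
gain a CommandId field roughly like this (field placement is an
assumption based on the description above, not necessarily the patch's
layout):

typedef struct xl_heap_insert
{
	OffsetNumber offnum;		/* inserted tuple's offset */
	CommandId	t_cid;			/* CID of the inserting command (new) */
	uint8		flags;

	/* xl_heap_header & TUPLE DATA in backup block 0 */
} xl_heap_insert;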

Kind regards,

Matthias van de Meent

[0] https://github.com/neondatabase/neon/#neon
[1] 
https://github.com/neondatabase/neon/blob/main/docs/core_changes.md#add-t_cid-to-heap-wal-records
[2] At least not without blocking XLog replay of the primary
transaction on the secondary, due to the same issues that Neon
encountered: you need the CommandID to distinguish between this
transactions' updates in the current command and previous commands.


v1-0001-Add-cid-to-heap-xlog-records-that-insert-update-d.patch
Description: Binary data


Re: [BUG] wrong FK constraint name when colliding name on ATTACH

2022-09-08 Thread Tom Lane
Andres Freund  writes:
> Something here doesn't look to be quite right. Starting with this commit CI
> [1] started to fail on freebsd (stack trace [2]), and in the meson branch I've
> also seen the crash on windows (CI run[3], stack trace [4]).

The crash seems 100% reproducible if I remove the early-exit optimization
from GetForeignKeyActionTriggers:

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 53b0f3a9c1..112ca77d97 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -10591,8 +10591,6 @@ GetForeignKeyActionTriggers(Relation trigrel,
Assert(*updateTriggerOid == InvalidOid);
*updateTriggerOid = trgform->oid;
}
-   if (OidIsValid(*deleteTriggerOid) && OidIsValid(*updateTriggerOid))
-   break;
}
 
if (!OidIsValid(*deleteTriggerOid))

With that in place, it's probabilistic whether the Asserts notice anything
wrong, and mostly they don't.  But there are multiple matching triggers:

regression=# select oid, tgconstraint, tgrelid, tgconstrrelid, tgtype, tgname
from pg_trigger where tgconstraint = 104301;
  oid   | tgconstraint | tgrelid | tgconstrrelid | tgtype |            tgname
--------+--------------+---------+---------------+--------+-------------------------------
 104302 |       104301 |  104294 |        104294 |      9 | RI_ConstraintTrigger_a_104302
 104303 |       104301 |  104294 |        104294 |     17 | RI_ConstraintTrigger_a_104303
 104304 |       104301 |  104294 |        104294 |      5 | RI_ConstraintTrigger_c_104304
 104305 |       104301 |  104294 |        104294 |     17 | RI_ConstraintTrigger_c_104305
(4 rows)

I suspect that the filter conditions being applied are inadequate
for the case of a self-referential FK, which this evidently is
given that tgrelid and tgconstrrelid are equal.

I'd counsel dropping the early-exit optimization; it doesn't
save much I expect, and it evidently hides bugs.  Or maybe
make it conditional on !USE_ASSERT_CHECKING.

regards, tom lane




Re: num_sa_scans in genericcostestimate

2022-09-08 Thread Tom Lane
Jeff Janes  writes:
> Why does having the =ANY in the "Index Cond:" rather than the "Filter:"
> inhibit it from understanding that the rows will still be delivered in
> order by "thousand"?

They won't be.  The =ANY in index conditions results in multiple
index scans, that is we effectively do a scan with

   Index Cond: (thousand < 2) AND (tenthous = 1001)

and then another with

   Index Cond: (thousand < 2) AND (tenthous = 3000)

and only by very good luck would the overall result be sorted by
"thousand".  On the other hand, if the ScalarArrayOp is a plain
filter condition, then it doesn't affect the number of index
scans --- it's just a (rather expensive) filter condition.
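
For example (illustrative, using the regression database's tenk1):

EXPLAIN (COSTS OFF)
SELECT * FROM tenk1
WHERE thousand < 2 AND tenthous IN (1001, 3000)
ORDER BY thousand;

With the =ANY as an index qual, the combined output of the per-element
scans is not inherently sorted by "thousand", so a plan relying on that
ordering must either add a Sort or demote the =ANY to a filter.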

indxpath.c's get_index_paths() is explicitly aware of these
considerations, maybe reading the comments there would help.

I don't say there couldn't be a bug here, but you haven't
demonstrated one.  I believe that get_index_paths() will
generate paths both ways, with the ScalarArrayOp as a filter
condition and an indexqual, and it's evidently deciding the
first way is cheaper.

regards, tom lane




Re: num_sa_scans in genericcostestimate

2022-09-08 Thread Tom Lane
Jeff Janes  writes:
> When costing a btree index scan, num_sa_scans gets computed twice, once in
> btcostestimate and once in genericcostestimate.  But the computations are
> different.  It looks like the generic one includes all =ANY quals on any
> column of the index, while the bt one includes only =ANY quals on columns
> for which all the preceding index columns are tested for equality.

I think this is correct.  As per the comments in btcostestimate:

 * For a btree scan, only leading '=' quals plus inequality quals for the
 * immediately next attribute contribute to index selectivity (these are
 * the "boundary quals" that determine the starting and stopping points of
 * the index scan).  Additional quals can suppress visits to the heap, so
 * it's OK to count them in indexSelectivity, but they should not count
 * for estimating numIndexTuples.  So we must examine the given indexquals
 * to find out which ones count as boundary quals.  ...

and further down

/* count number of SA scans induced by indexBoundQuals only */
if (alength > 1)
num_sa_scans *= alength;

This num_sa_scans value computed by btcostestimate is (or should be)
only used in calculations related to numIndexTuples, whereas the one
in genericcostestimate should be used for calculations related to the
overall number of heap tuples returned by the indexscan.  Maybe there
is someplace that is using the wrong one, but it's not a bug that they
are different.

> The context for this is that I was looking at cases where btree indexes
> were not using all the columns they could, but rather shoving some of the
> conditions down into a Filter unnecessarily/unhelpfully.  This change
> doesn't fix that, but it does seem to be moving in the right direction.

If it helps, it's only accidental, because this patch is surely wrong.
We *should* be distinguishing these numbers.

regards, tom lane




Re: [PoC] Let libpq reject unexpected authentication requests

2022-09-08 Thread Jacob Champion
On Thu, Sep 8, 2022 at 6:25 AM Peter Eisentraut
 wrote:
> For example, before long someone is going to try putting "ldap" into
> require_auth.  The fact that the methods in pg_hba.conf are not what
> libpq sees is not something that was really exposed to users until now.
> "none" vs. "trust" takes advantage of that.  But then I think we could
> also make "password" clearer, which surely sounds like any kind of
> password, encrypted or not, and that's also how pg_hba.conf behaves.
> The protocol specification calls that "AuthenticationCleartextPassword";
> maybe we could pick a name based on that.

Sounds fair. "cleartext"? "plaintext"? "plain" (like SASL's PLAIN)?

> And then, what if we add a new method in the future, and someone puts
> that into their connection string.  Old clients will just refuse to
> parse that.  Ok, that effectively gives you the same behavior as
> rejecting the server's authentication offer.  But what about the negated
> version?

I assume the alternative behavior you're thinking of is to ignore
negated "future methods"? I think the danger with that (for a feature
that's supposed to be locking communication down) is that it's not
possible to differentiate between a maybe-future method and a typo. If
I want "!password" because my intention is to disallow a plaintext
exchange, I really don't want "!pasword" to silently allow anything.

> Also, what if we add new SASL methods.  How would we modify
> this code to be able to pick and choose and also have backward and
> forward compatible behavior?

On the SASL front: In the back of my head I'd been considering adding
a "sasl:" prefix to "scram-sha-256", so that we have a namespace for
new SASL methods. That would also give us a jumping-off point in the
future if we decide to add SASL method negotiation to the protocol.
What do you think about that?
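
For example, a connection string under that proposal might look like
(hypothetical syntax):

    host=example.com dbname=db user=alice require_auth=sasl:scram-sha-256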

Backwards compatibility will, I think, be handled trivially by a newer
client. The only way to break backwards compatibility would be to
remove support for a method, which I assume would be independent of
this feature.

Forwards compatibility doesn't seem like something this feature can
add by itself (old clients can't speak new methods). Though we could
backport new method names to allow them to be used in negations, if
maintaining that aspect of compatibility is worth the effort.

> In general, I like this.  We just need to think about the above things a
> bit more.

Thanks!

--Jacob




Re: Reducing the chunk header sizes on all memory context types

2022-09-08 Thread Tom Lane
Andres Freund  writes:
> If there is a finalfunc, they're typically going to return data from within
> the current memory context, but could legitimately also return part of the
> data from curaggcontext. Perhaps we could forbid that?

No, I don't think we can get away with that.  See int8inc() for a
counterexample.

regards, tom lane




Re: Switching XLog source from archive to streaming when primary available

2022-09-08 Thread Nathan Bossart
On Thu, Sep 08, 2022 at 05:16:53PM +0530, Bharath Rupireddy wrote:
> On Wed, Sep 7, 2022 at 3:27 AM Nathan Bossart  
> wrote:
>> It's not clear to me how this is expected to interact with the pg_wal phase
>> of standby recovery.  As the docs note [0], standby servers loop through
>> archive recovery, recovery from pg_wal, and streaming replication.  Does
>> this cause the pg_wal phase to be skipped (i.e., the standby goes straight
>> from archive recovery to streaming replication)?  I wonder if it'd be
>> better for this mechanism to simply move the standby to the pg_wal phase so
>> that the usual ordering isn't changed.
> 
> It doesn't change any behaviour as such for XLOG_FROM_PG_WAL. In
> standby mode when restore_command is specified, the initial value of
> currentSource would be XLOG_FROM_ARCHIVE (see [1]). If the archive is
> exhausted of WAL or the standby fails to fetch from the archive, then
> it switches to XLOG_FROM_STREAM. If the standby fails to receive WAL
> from primary, it switches back to XLOG_FROM_ARCHIVE. This continues
> unless the standby gets promoted. With the patch, we enable the
> standby to try fetching from the primary, instead of waiting for WAL
> in the archive to get exhausted or for an error to occur in the
> standby while receiving from the archive.

Okay.  I see that you are checking for XLOG_FROM_ARCHIVE.

>> Shouldn't the last_switch_time be set when the state machine first enters
>> XLOG_FROM_ARCHIVE?  IIUC this logic is currently counting time spent
>> elsewhere (e.g., XLOG_FROM_STREAM) when determining whether to force a
>> source switch.  This would mean that a standby that has spent a lot of time
>> in streaming replication before failing would flip to XLOG_FROM_ARCHIVE,
>> immediately flip back to XLOG_FROM_STREAM, and then likely flip back to
>> XLOG_FROM_ARCHIVE when it failed again.  Given the standby will wait for
>> wal_retrieve_retry_interval before going back to XLOG_FROM_ARCHIVE, it
>> seems like we could end up rapidly looping between sources.  Perhaps I am
>> misunderstanding how this is meant to work.
> 
> last_switch_time indicates the time when the standby last attempted to
> switch to primary. For instance, a standby:
> 1) for the first time, sets last_switch_time = current_time when in archive 
> mode
> 2) if current_time < last_switch_time + interval, continues to be in
> archive mode
> 3) if current_time >= last_switch_time + interval, attempts to switch
> to primary and sets last_switch_time = current_time
> 3.1) if successfully switches to primary, continues in there and for
> any reason fails to fetch from primary, then enters archive mode and
> loops from step (2)
> 3.2) if fails to switch to primary, then enters archive mode and loops
> from step (2)

Let's say I have this new parameter set to 5 minutes, and I have a standby
that's been at step 3.1 for 5 days before failing and going back to step 2.
Won't the standby immediately jump back to step 3.1?  I think we should
place the limit on how long the server stays in XLOG_FROM_ARCHIVE, not how
long it's been since we last tried XLOG_FROM_STREAM.
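
Roughly, I am thinking of something like this in
WaitForWALToBecomeAvailable() (a sketch; the variable names are mine,
not from the patch):

    /* entered_archive_time would be reset whenever we switch to
     * XLOG_FROM_ARCHIVE */
    if (currentSource == XLOG_FROM_ARCHIVE &&
        TimestampDifferenceExceeds(entered_archive_time,
                                   GetCurrentTimestamp(),
                                   streaming_replication_retry_interval))
    {
        /* been in archive recovery long enough, go try streaming */
        currentSource = XLOG_FROM_STREAM;
    }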

>> I wonder if the lower bound should be higher to avoid switching
>> unnecessarily rapidly between WAL sources.  I see that
>> WaitForWALToBecomeAvailable() ensures that standbys do not switch from
>> XLOG_FROM_STREAM to XLOG_FROM_ARCHIVE more often than once per
>> wal_retrieve_retry_interval.  Perhaps wal_retrieve_retry_interval should be
>> the lower bound for this GUC, too.  Or maybe WaitForWALToBecomeAvailable()
>> should make sure that the standby makes at least once attempt to restore
>> the file from archive before switching to streaming replication.
> 
> No, we need a way to disable the feature, so I'm not changing the
> lower bound. And let's not make this GUC dependent on any other GUC, I
> would like to keep it simple for better usability. However, I've
> increased the default value to 5min and added a note in the docs about
> the lower values.
> 
> I'm attaching the v3 patch with the review comments addressed, please
> review it further.

My general point is that we should probably offer some basic preventative
measure against flipping back and forth between streaming and archive
recovery while making zero progress.  As I noted, maybe that's as simple as
having WaitForWALToBecomeAvailable() attempt to restore a file from archive
at least once before the new parameter forces us to switch to streaming
replication.  There might be other ways to handle this.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: SYSTEM_USER reserved word implementation

2022-09-08 Thread Jacob Champion
On Wed, Sep 7, 2022 at 6:17 PM Michael Paquier  wrote:
> >> +   /* Initialize SystemUser now that MyClientConnectionInfo is 
> >> restored. */
> >> +   InitializeSystemUser(MyClientConnectionInfo.authn_id,
> >> +
> >> hba_authname(MyClientConnectionInfo.auth_method));
> >
> > It makes me a little nervous to call hba_authname(auth_method) without
> > checking to see that auth_method is actually valid (which is only true
> > if authn_id is not NULL).
>
> You have mentioned that a couple of months ago if I recall correctly,
> and we pass down an enum value.

Ah, sorry. Do you remember which thread?

I am probably misinterpreting you, but I don't see why auth_method's
being an enum helps. uaReject (and the "reject" string) is not a sane
value to be using in SYSTEM_USER, and the more call stacks away we get
from MyClientConnectionInfo, the easier it is to forget that that
value is junk. As long as the code doesn't get more complicated, I
suppose there's no real harm being done, but it'd be cleaner not to
access auth_method at all if authn_id is NULL. I won't die on that
hill, though.
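
(For anyone following along, the user-visible behavior at stake is
roughly this -- a sketch, assuming a SCRAM-authenticated session for
user "alice":

    SELECT SYSTEM_USER;
         system_user
    ---------------------
     scram-sha-256:alice

and SYSTEM_USER is NULL when there is no authn_id, which is exactly why
a junk method name like "reject" should never be visible through it.)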

> There is actually a second and much deeper issue
> here, in the shape of a collation problem.

Oh, none of that sounds fun. :/

--Jacob




Re: Instrumented pages/tuples frozen in autovacuum's server log output (and VACUUM VERBOSE)

2022-09-08 Thread Peter Geoghegan
On Mon, Sep 5, 2022 at 12:43 PM Peter Geoghegan  wrote:
> Barring any objections I will commit this patch within the next few days.

Pushed this just now.

Thanks
-- 
Peter Geoghegan




Re: [BUG] wrong FK constraint name when colliding name on ATTACH

2022-09-08 Thread Andres Freund
Hi,

On 2022-09-08 13:25:15 +0200, Alvaro Herrera wrote:
> I think you're right, so pushed, and backpatched to 12.  I added the
> test case to regression also.

Something here doesn't look to be quite right. Starting with this commit CI
[1] started to fail on freebsd (stack trace [2]), and in the meson branch I've
also seen the crash on windows (CI run[3], stack trace [4]).

There does appear to be some probabilistic aspect; I also saw a run succeed.

Greetings,

Andres Freund

[1] https://cirrus-ci.com/github/postgres/postgres/
[2] https://api.cirrus-ci.com/v1/task/6180840047640576/logs/cores.log
[3] https://cirrus-ci.com/task/6629440791773184
[4] 
https://api.cirrus-ci.com/v1/artifact/task/6629440791773184/crashlog/crashlog-postgres.exe_1468_2022-09-08_17-05-24-591.txt




Add a missing comment for PGPROC.pgprocno

2022-09-08 Thread Aleksander Alekseev
Hi hackers,

I tried to understand some implementation details of ProcArray and
discovered that this is a bit challenging to do due to a missing
comment for PGPROC.pgprocno. For example, it's hard to understand why
ProcArrayAdd() keeps procArray->pgprocnos[] sorted by (PGPROC *) when
the sorting is actually done by pgprocno. It took me some time to
figure this out.

Here is the patch that fixes this. Hopefully this will save some time
for the newcomers.
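
Roughly, the idea is a comment along these lines (a sketch; the exact
wording in the attached patch may differ):

    struct PGPROC
    {
        ...
        /*
         * Index of this PGPROC within the ProcGlobal->allProcs array.
         * Since allProcs is one contiguous array, ordering by pgprocno
         * is equivalent to ordering by PGPROC pointer, which is what
         * lets ProcArrayAdd() keep procArray->pgprocnos[] sorted.
         */
        int         pgprocno;
        ...
    };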

-- 
Best regards,
Aleksander Alekseev


v1-0001-Add-a-missing-comment-for-PGPROC.pgprocno.patch
Description: Binary data


Re: Minimum bison and flex versions

2022-09-08 Thread Andres Freund
Hi,

On 2022-09-06 13:32:32 +0700, John Naylor wrote:
> Here are autoconf-only patches to that effect.

Looks like you did actually include src/tools/msvc as well :)


> Subject: [PATCH v2 2/2] Bump minimum version of Flex to 2.5.35

LGTM.


> From b7f35ae5e0fd55f8dceb2a4a546be3b50065a09c Mon Sep 17 00:00:00 2001
> From: John Naylor 
> Date: Tue, 6 Sep 2022 11:41:58 +0700
> Subject: [PATCH v2 1/2] Bump minimum version of Bison to 2.3
> 
> Since the retirement of some older buildfarm members, the oldest Bison
> that gets regular testing is 2.3. MacOS ships that version, and will
> continue doing so for the foreseeable future because of Apple's policy
> regarding GPLv3. While Mac users could use a package manager to install
> a newer version, there is no compelling reason to do so at this time.

s/to do so/to force them to do so/?


> --- a/src/pl/plpgsql/src/pl_gram.y
> +++ b/src/pl/plpgsql/src/pl_gram.y
> @@ -39,10 +39,7 @@
>  /*
>   * Bison doesn't allocate anything that needs to live across parser calls,
>   * so we can easily have it use palloc instead of malloc.  This prevents
> - * memory leaks if we error out during parsing.  Note this only works with
> - * bison >= 2.0.  However, in bison 1.875 the default is to use alloca()
> - * if possible, so there's not really much problem anyhow, at least if
> - * you're building with gcc.
> + * memory leaks if we error out during parsing.
>   */
>  #define YYMALLOC palloc
>  #define YYFREE   pfree

Getting rid of all that copy-pasted stuff alone makes this seem worth doing :)


LGTM.

Greetings,

Andres Freund




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread walther

Robert Haas:

> Fairly obviously, my thinking here is biased by having written the
> patch to allow restricting SET ROLE. If alice can neither inherit
> bob's privileges nor SET ROLE bob, she had better not be able to
> create objects owned by bob, because otherwise she can make a table,
> add an expression index that calls a user-defined function, do stuff
> until it needs to be autovacuumed, and then give it to bob, and boom,
> exploit. But that doesn't mean that the is_member_of_role() tests here
> have to be changed to has_privs_of_role(). They could be changed to
> has_privs_of_role() || member_can_set_role(). And if the consensus is
> to do it that way, I'm OK with that.


A different line of thought (compared to the "USAGE" privilege I 
discussed earlier) would be:

To transfer ownership of an object, you need two sets of privileges:
- You need to have the privilege to initiate a request to transfer 
ownership.

- You need to have the privilege to accept a request to transfer ownership.

Let's imagine such a request were created temporarily: when I start the
process of changing ownership, I would have to switch to the other role
and then accept that request.


In theory, I could also inherit that privilege, but that's not how the 
system works today. By using is_member_of_role, the decision was already 
made that this should not depend on inheritance. What is left is the
ability to do it via SET ROLE only.


So it should not be has_privs_of_role() nor has_privs_of_role() || 
member_can_set_role(), as you suggested above, but rather just 
member_can_set_role() only. Of course, only in the context of the SET 
ROLE patch.


Basically, with that patch is_member_of_role() has to become 
member_can_set_role().



> I'm just a little unconvinced that it's actually the best route. I
> think that logic of the form "well Alice could just SET ROLE and do it
> anyway" is weak -- and not only because of the patch to allow
> restricting SET ROLE, but because AFAICT there is no point to the
> INHERIT option in the first place unless it is to force you to issue
> SET ROLE. That is literally the only thing it does. If we're going to
> have weird exceptions where you don't have to SET ROLE after all, why
> even have INHERIT in the first place?


As stated above, I don't think this is about INHERIT. INHERIT works fine 
both without the SET ROLE patch (and keeping is_member_of_role) and with 
the SET ROLE patch (and changing to member_can_set_role).


The exception is made, because there is no formal two-step process for 
requesting and accepting a transfer of ownership. Or alternatively: 
There is no exception, it's just that during the command to transfer 
ownership, the current role has to be changed temporarily to the 
accepting role. And that's the same as checking is_member_of_role or 
member_can_set_role, respectively.


Best

Wolfgang




Re: [RFC] building postgres with meson - v12

2022-09-08 Thread Andres Freund
On 2022-09-08 14:10:45 +0700, John Naylor wrote:
> Okay, done that way.

Thanks!




Re: Reducing the chunk header sizes on all memory context types

2022-09-08 Thread Andres Freund
Hi,

On 2022-09-07 11:08:27 -0400, Tom Lane wrote:
> > I'll need to think about how best to fix this. In the meantime, I
> > think the other 32-bit animals are probably not going to like this
> > either :-(
>
> Yeah.  The basic problem here is that we've greatly reduced the amount
> of redundancy in chunk headers.

Even with the prior amount of redundancy it's quite scary. It's one thing if
the only consequence is a bit of added overhead - but if we *don't* copy the
tuple due to a false positive we're in trouble. Afaict the user would have
some control over the memory contents and so an attacker could work on
triggering this issue. MemoryContextContains() may be ok for an assertion or
such, but relying on it for correctness seems a bad idea.

I wonder if we can get away from needing these checks, without unnecessarily
copying datums every time:

If there is no finalfunc, we know that the tuple ought to be in curaggcontext
or such, and we need to copy.

If there is a finalfunc, they're typically going to return data from within
the current memory context, but could legitimately also return part of the
data from curaggcontext. Perhaps we could forbid that? Our docs already say
the following for serialization funcs:

   The result of the deserialization function should simply be allocated in the
   current memory context, as unlike the combine function's result, it is not
   long-lived.

Perhaps we could institute a similar rule for finalfuncs? The argument against
that is that we can use arbitrary functions as finalfuncs currently. Perhaps
we could treat taking an internal argument as opting into the requirement and
default to copying otherwise?

Greetings,

Andres Freund




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Wolfgang Walther

Robert Haas:

>> I think to change the owner of an object from role A to role B, you just
>> need a different "privilege" on that role B to "use" the role that way,
>> which is distinct from INHERIT or SET ROLE "privileges".
>
> It's not distinct, though, because if you can transfer ownership of a
> table to another user, you can use that ability to gain the privileges
> of that user.


Right, but the inverse is not necessarily true, so you could have SET 
ROLE privileges, but not "USAGE" - and then couldn't change the owner of 
an object to this role.


USAGE is not a good term, because it implies "least amount of 
privileges", but in this case it's quite the opposite.


In any case, adding a grant option for SET ROLE, while keeping the 
required privileges for a transfer of ownership at the minimum 
(membership only), doesn't really make sense. I guess both threads 
should be discussed together?


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Robert Haas
On Thu, Sep 8, 2022 at 11:45 AM Wolfgang Walther
 wrote:
> I think to change the owner of an object from role A to role B, you just
> need a different "privilege" on that role B to "use" the role that way,
> which is distinct from INHERIT or SET ROLE "privileges".

It's not distinct, though, because if you can transfer ownership of a
table to another user, you can use that ability to gain the privileges
of that user.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: list of acknowledgments for PG15

2022-09-08 Thread Justin Pryzby
On Thu, Sep 08, 2022 at 11:39:43PM +0800, Japin Li wrote:
> 
> On Thu, 08 Sep 2022 at 20:13, Peter Eisentraut 
>  wrote:
> > Attached is the plain-text list of acknowledgments for the PG15 release 
> > notes, current through REL_15_BETA4.  Please check for problems such as 
> > wrong sorting, duplicate names in different variants, or names in the 
> > wrong order etc.  (Note that the current standard is given name followed 
> > by surname, independent of cultural origin.)
> 
> Hi, Peter
> 
> Li Japin is an alias of Japin Li, it is unnecessary to list both of them.

Thanks.  This script finds another name which seems to be duplicated:

awk '{print $1,$2; print $2,$1}' |sort |uniq -c |sort -nr |awk '$1>1'
  2 Tang Haiying
  2 Li Japin
  2 Japin Li
  2 Haiying Tang

Alternately: awk 'a[$2$1]{print} {a[$1$2]=1}'

-- 
Justin




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Wolfgang Walther

Robert Haas:

> Fairly obviously, my thinking here is biased by having written the
> patch to allow restricting SET ROLE. If alice can neither inherit
> bob's privileges nor SET ROLE bob, she had better not be able to
> create objects owned by bob, because otherwise she can make a table,
> add an expression index that calls a user-defined function, do stuff
> until it needs to be autovacuumed, and then give it to bob, and boom,
> exploit. But that doesn't mean that the is_member_of_role() tests here
> have to be changed to has_privs_of_role(). They could be changed to
> has_privs_of_role() || member_can_set_role(). And if the consensus is
> to do it that way, I'm OK with that.
>
> I'm just a little unconvinced that it's actually the best route. I
> think that logic of the form "well Alice could just SET ROLE and do it
> anyway" is weak -- and not only because of the patch to allow
> restricting SET ROLE, but because AFAICT there is no point to the
> INHERIT option in the first place unless it is to force you to issue
> SET ROLE. That is literally the only thing it does. If we're going to
> have weird exceptions where you don't have to SET ROLE after all, why
> even have INHERIT in the first place?


I think to change the owner of an object from role A to role B, you just 
need a different "privilege" on that role B to "use" the role that way, 
which is distinct from INHERIT or SET ROLE "privileges".


When you are allowed to INHERIT a role, you are allowed to use the 
GRANTs that have been given to this role. When you are allowed to SET 
ROLE, then you are allowed to switch into this role. You could think of 
another "privilege", USAGE on a role, which would allow you to "use" 
this role as a target in a statement to change the owner of an object.


To change the owner for an object from role A to role B, you need:
- the privilege to ALTER the object, which is implied when you are A
- the privilege to "use" role B as a target

So basically the privilege to use role B as the new owner is a 
privilege you have **on** the role object B, while the privilege to 
change the owner of an object is something you have **through** your 
membership in role A.


Up to v15, there were no separate privileges for this. You were either a 
member of a role or you were not. Now with INHERIT and maybe SET ROLE 
privileges/grant options, we can do two things:
- Keep the ability to use a role as a target in those statements as the 
most basic privilege on a role, that is implied by membership in that 
role and can't be taken away (currently the case), or

- invent a new privilege or grant option to allow changing that.

But mixing this with either INHERIT or SET ROLE doesn't make sense, imho.

Best

Wolfgang




Re: list of acknowledgments for PG15

2022-09-08 Thread Japin Li


On Thu, 08 Sep 2022 at 20:13, Peter Eisentraut 
 wrote:
> Attached is the plain-text list of acknowledgments for the PG15 release 
> notes, current through REL_15_BETA4.  Please check for problems such as 
> wrong sorting, duplicate names in different variants, or names in the 
> wrong order etc.  (Note that the current standard is given name followed 
> by surname, independent of cultural origin.)

Hi, Peter

Li Japin is an alias of Japin Li, it is unnecessary to list both of them.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: proposal: possibility to read dumped table's name from file

2022-09-08 Thread Pavel Stehule
Hi


On Wed, Sep 7, 2022 at 21:46, Daniel Gustafsson  wrote:

> As noted upthread at some point, I'm not overly excited about the parser in
> filter.c, for maintainability and readability reasons.  So, I've
> reimplemented
> the parser in Flex/Bison in the attached patch, which IMHO provides a
> clear(er)
> picture of the grammar and is more per project standards.  This version of
> the
> patch is your latest version with just the parser replaced (at a reduction
> in
> size as a side benefit).
>
> All features supported in your latest patch version are present, and it
> passes
> all the tests added by this patch.  It's been an undisclosed amount of
> years
> since I wrote a Bison parser (well, yacc really) from scratch so I don't
> rule
> out having made silly mistakes.  I would very much appreciate review from
> those
> more well versed in this area.
>
> One thing this patchversion currently lacks is refined error messaging,
> but if
> we feel that this approach is a viable path then that can be tweaked.  The
> function which starts the parser can also be refactored to be shared across
> pg_dump, pg_dumpall and pg_restore but I've kept it simple for now.
>
> Thoughts?  It would be nice to get this patch across the finishline during
> this
> commitfest.
>
>
I have no objections to this, and thank you for trying to move this patch
forward.

Regards

Pavel

--
> Daniel Gustafsson   https://vmware.com/
>
>


Re: small windows psqlrc re-wording

2022-09-08 Thread Tom Lane
Julien Rouhaud  writes:
> On Wed, Sep 07, 2022 at 01:10:11PM -0400, Tom Lane wrote:
> - for example ~/.psqlrc-9.2 or
> - ~/.psqlrc-9.2.5.  The most specific
> + for example ~/.psqlrc-15 or
> + ~/.psqlrc-15.2.  The most specific

> This bit is a bit saddening.  It's probably good to switch to the new 2 digits
> versioning but not trying to maintain it any further right?

It occurred to me later to substitute &majorversion; and &version;
like this:

+ for example ~/.psqlrc-&majorversion; or
+ ~/.psqlrc-&version;.  The most specific

On testing that in HEAD, I read

Both the system-wide startup file and the user's personal startup file
can be made psql-version-specific by appending a dash and the
PostgreSQL major or minor release number to the file name, for example
~/.psqlrc-16 or ~/.psqlrc-16devel.

That's a little confusing but it's actually accurate, because what
process_psqlrc_file appends is the string PG_VERSION, so in a devel
branch or beta release there's a non-numeric "minor release".
I'm inclined to go ahead and do it like that.

regards, tom lane




Re: confirmed_flush_lsn shows LSN of the data that has not yet been received by the logical subscriber.

2022-09-08 Thread Ashutosh Sharma
On Thu, Sep 8, 2022 at 6:23 PM Ashutosh Bapat
 wrote:
>
> On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma  wrote:
> >
> > Hi All,
> >
> > The logically decoded data are sent to the logical subscriber at the time 
> > of transaction commit, assuming that the data is small. However, before the 
> > transaction commit is performed, the LSN representing the data that is yet 
> > to be received by the logical subscriber appears in the confirmed_flush_lsn 
> > column of pg_replication_slots catalog. Isn't the information seen in the 
> > confirmed_flush_lsn column while the transaction is in progress incorrect ? 
> > esp considering the description given in the pg doc for this column.
> >
> > Actually, while the transaction is running, the publisher keeps on sending 
> > keepalive messages containing LSN of the last decoded data saved in reorder 
> > buffer and the subscriber responds with the same LSN as the last received 
> > LSN which is then updated as confirmed_flush_lsn by the publisher. I think 
> > the LSN that we are sending with the keepalive message should be the one 
> > representing the transaction begin message, not the LSN of the last decoded 
> > data which is yet to be sent. Please let me know if I am missing something 
> > here.
>
> Transactions with commit LSN < confirmed_flush_lsn are confirmed to
> have been received (and applied) by the subscriber. Setting the LSN
> corresponding to a WAL record within a transaction in progress as
> confirmed_flush should be ok. Since transactions are interleaved in
> the WAL stream, it's quite possible that the LSNs of some WAL records
> of an in-flight transaction are less than the commit LSN of some other
> transaction. So setting the commit LSN of another transaction is
> effectively the same as setting it to the LSN of any previous WAL
> record, irrespective of the transaction that it belongs to.

Thank you Ashutosh for the explanation. I still feel that the
documentation on confirmed_flush_lsn needs some improvement. It
claims that all the data before confirmed_flush_lsn has been received
by the logical subscriber, but that's not the case. What it actually
means is that all the data belonging to transactions with commit LSN <
confirmed_flush_lsn has been received and applied by the subscriber.
So setting confirmed_flush_lsn to the LSN of WAL records generated by
a running transaction might make people think that the WAL records
belonging to earlier data of the same running transaction have already
been received and applied by the subscriber node, but that's not true.
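
For anyone who wants to watch this happen, the column in question can
be observed while a large transaction is still open on the publisher:

    SELECT slot_name, confirmed_flush_lsn FROM pg_replication_slots;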

--
With Regards,
Ashutosh Sharma.




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2022-09-08 Thread Dmitry Koval

Thanks for your advice, Justin and Alvaro!

I'll try to reduce the size of this patch and split it into separate 
parts (for MERGE and SPLIT).


--
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2022-09-08 Thread Alvaro Herrera
On 2022-Sep-08, Justin Pryzby wrote:

> If the patch were split into separate parts for MERGE and SPLIT, would
> the first patch be significantly smaller than the existing patch
> (hopefully half as big) ?  That would help to review it, even if both
> halves were ultimately squished together.  (An easy way to do this is to
> open up all the files in separate editor instances, trim out the parts
> that aren't needed for the first patch, save the files but don't quit
> the editors, test compilation and regression tests, then git commit
> --amend -a.  Then in each editor, "undo" all the trimmed changes, save,
> and git commit -a).

An easier (IMO) way to do that is to use "git gui" or even "git add -p",
which allow you to selectively add changed lines/hunks to the index.
You add a few, commit, then add the rest, commit again.  With "git add
-p" you can even edit individual hunks in an editor in case you have a
mix of both wanted and unwanted in a single hunk (after "s"plitting, of
course), which turns out to be easier than it sounds.
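
For example, to carve the MERGE half out of a combined working tree
(commit messages invented for illustration):

    git add -p                 # stage only the MERGE-related hunks;
                               # "s"plit or "e"dit mixed hunks as needed
    git commit -m "Add MERGE PARTITIONS command"

    git add -A                 # everything left over is the SPLIT half
    git commit -m "Add SPLIT PARTITION command"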

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El sudor es la mejor cura para un pensamiento enfermo" (Bardia)




Re: predefined role(s) for VACUUM and ANALYZE

2022-09-08 Thread Robert Haas
On Wed, Sep 7, 2022 at 7:09 PM Stephen Frost  wrote:
> Calling this a redesign is over-stating things, imv … and I’d much rather 
> have the per-relation granularity than predefined roles for this, so there is 
> that to consider too, perhaps.

I also prefer the finer granularity.

On the question of whether freeing up more privilege bits is a
prerequisite for this patch, I'm a bit on the fence about that. If I
look at the amount of extra work that your review comments have caused
me to do over, let's say, the last three years, and I compare that to
the amount of extra work that the review comments of other people have
caused me to do in the same period of time, you win. In fact, you win
against all of them added together and doubled. I think that as a
general matter you are far too willing to argue vigorously for people
to do work that isn't closely related to their original goals, and
which is at times even opposed to their original goals, and I think
the project would be better off if you tempered that urge.

Now on the other hand, I also do think we need more privilege bits.
You're not alone in making the case that this is a problem which needs
to be solved, and the set of other people who are also making that
argument includes me. At the same time, there is certainly a double
standard here. When Andrew and Tom committed
d11e84ea466b4e3855d7bd5142fb68f51c273567 and
a0ffa885e478f5eeacc4e250e35ce25a4740c487 respectively, we used up 2 of
the remaining 4 bits, bits which other people would have liked to have
used up years ago and they were told "no you can't." I don't believe I
would have been willing to commit those patches without doing
something to solve this problem, because I would have been worried
about getting yelled at by Tom. But now here we are with only 2 bits
left instead of 4, and we're telling the next patch author - who is
not Tom - that he's on the hook to solve the problem.

Well, we do need to solve the problem. But we're not necessarily being
fair about how the work involved gets distributed. It's a heck of a
lot easier for a committer to get something committed to address this
issue than a non-committer, and it's a heck of a lot easier for a
committer to ignore the fact that the problem hasn't been solved and
press ahead anyway, and yet somehow we're trying to dump a problem
that's a decade in the making on Nathan. I'm not exactly sure what to
propose as an alternative, but that doesn't seem quite fair.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2022-09-08 Thread Karina Litskevich
Hi,

I also would like to suggest a cosmetic change.
In v15 the new checkunique field is added after heapallindexed and before
no_btree_expansion in the struct definition, but in the initialisation it is
added after no_btree_expansion:

--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -102,6 +102,7 @@ typedef struct AmcheckOptions
  bool parent_check;
  bool rootdescend;
  bool heapallindexed;
+ bool checkunique;

  /* heap and btree hybrid option */
  bool no_btree_expansion;
@@ -132,7 +133,8 @@ static AmcheckOptions opts = {
  .parent_check = false,
  .rootdescend = false,
  .heapallindexed = false,
- .no_btree_expansion = false
+ .no_btree_expansion = false,
+ .checkunique = false
 };

I suggest adding the checkunique field between the heapallindexed and
no_btree_expansion fields in the initialisation, to match the definition:

@@ -132,6 +133,7 @@ static AmcheckOptions opts = {
  .parent_check = false,
  .rootdescend = false,
  .heapallindexed = false,
+ .checkunique = false,
  .no_btree_expansion = false
 };
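
For reference, with the patch applied the new checks are invoked like
this (index name invented; the trailing argument is the new checkunique
flag):

    SELECT bt_index_check('some_unique_idx'::regclass, true, true);
    SELECT bt_index_parent_check('some_unique_idx'::regclass,
                                 true, true, true);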

--
Best regards,
Litskevich Karina
Postgres Professional: http://postgrespro.com/
From 56fe2b608b46c6c97900bbb63b2169e2997bc8cc Mon Sep 17 00:00:00 2001
From: Pavel Borisov 
Date: Wed, 11 May 2022 15:54:13 +0400
Subject: [PATCH v16] Add option for amcheck and pg_amcheck to check unique
 constraint for btree indexes.

Add 'checkunique' argument to bt_index_check() and bt_index_parent_check().
When the flag is specified the procedures will check the unique constraint
violation for unique indexes. Only one heap entry for all equal keys in
the index should be visible (including posting list entries). Report an error
otherwise.

pg_amcheck called with --checkunique option will do the same check for all
the indexes it checks.

Author: Anastasia Lubennikova 
Author: Pavel Borisov 
Author: Maxim Orlov 
Reviewed-by: Mark Dilger 
Reviewed-by: Zhihong Yu 
Reviewed-by: Peter Geoghegan 
Reviewed-by: Aleksander Alekseev 
Discussion: https://postgr.es/m/CALT9ZEHRn5xAM5boga0qnrCmPV52bScEK2QnQ1HmUZDD301JEg%40mail.gmail.com
---
 contrib/amcheck/Makefile  |   2 +-
 contrib/amcheck/amcheck--1.3--1.4.sql |  29 ++
 contrib/amcheck/amcheck.control   |   2 +-
 contrib/amcheck/expected/check_btree.out  |  42 +++
 contrib/amcheck/sql/check_btree.sql   |  14 +
 contrib/amcheck/t/004_verify_nbtree_unique.pl | 235 +
 contrib/amcheck/verify_nbtree.c   | 329 +-
 doc/src/sgml/amcheck.sgml |  14 +-
 doc/src/sgml/ref/pg_amcheck.sgml  |  11 +
 src/bin/pg_amcheck/pg_amcheck.c   |  48 ++-
 src/bin/pg_amcheck/t/003_check.pl |  45 +++
 src/bin/pg_amcheck/t/005_opclass_damage.pl|  65 
 12 files changed, 813 insertions(+), 23 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/t/004_verify_nbtree_unique.pl

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..88271687a3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -7,7 +7,7 @@ OBJS = \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 00..75574eaa64
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,29 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+-- In order to avoid issues with dependencies when updating amcheck to 1.4,
+-- create new, overloaded versions of the 1.2 bt_index_parent_check signature,
+-- and 1.1 bt_index_check signature.
+
+--
+-- bt_index_parent_check()
+--
+CREATE FUNCTION bt_index_parent_check(index regclass,
+heapallindexed boolean, rootdescend boolean, checkunique boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'bt_index_parent_check'
+LANGUAGE C STRICT PARALLEL RESTRICTED;
+--
+-- bt_index_check()
+--
+CREATE FUNCTION bt_index_check(index regclass,
+heapallindexed boolean, checkunique boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'bt_index_check'
+LANGUAGE C STRICT PARALLEL RESTRICTED;
+
+-- We don't want this to be available to public
+REVOKE ALL ON FUNCTION bt_index_parent_check(regclass, boolean, boolean, boolean) FROM PUBLIC;
+REVOKE ALL ON FUNCTION bt_index_check(regclass, boolean, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control

Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~

2022-09-08 Thread Justin Pryzby
On Thu, Sep 08, 2022 at 05:55:40PM +0900, Michael Paquier wrote:
> On Tue, Aug 30, 2022 at 01:29:24PM +1200, Thomas Munro wrote:
> > This reminds me of 24c3ce8f1, which replaced a dlopen()-ish thing with
> > a direct function call.  Can you just call all these functions
> > directly these days?
> 
> Hmm.  Some tests in the CI show that attempting to call directly
> MiniDumpWriteDump() causes a linking failure at compilation.  Anyway,
> in the same fashion, I can get some simplifications done right for
> pg_ctl.c, auth.c and restricted_token.c.  And I am seeing all these
> functions listed in the headers of MinGW, meaning that all these
> should work out of the box in this case, no?
> 
> The others are library-dependent, and I'm not really confident about
> ldap_start_tls_sA().  So, at the end, I am finishing with the
> attached, what do you think?  This cuts some code, which is nice:
>  3 files changed, 48 insertions(+), 159 deletions(-)

+1

It seems silly to do it at runtime if it's possible to do it at link time.

-- 
Justin




Re: [PoC] Let libpq reject unexpected authentication requests

2022-09-08 Thread Peter Eisentraut
I'm wondering about making the list of things you can specify in 
require_auth less confusing and future-proof.


For example, before long someone is going to try putting "ldap" into 
require_auth.  The fact that the methods in pg_hba.conf are not what 
libpq sees is not something that was really exposed to users until now. 
"none" vs. "trust" takes advantage of that.  But then I think we could 
also make "password" clearer, which surely sounds like any kind of 
password, encrypted or not, and that's also how pg_hba.conf behaves. 
The protocol specification calls that "AuthenticationCleartextPassword"; 
maybe we could pick a name based on that.


And then, what if we add a new method in the future, and someone puts 
that into their connection string.  Old clients will just refuse to 
parse that.  Ok, that effectively gives you the same behavior as 
rejecting the server's authentication offer.  But what about the negated 
version?  Also, what if we add new SASL methods.  How would we modify 
this code to be able to pick and choose and also have backward and 
forward compatible behavior?


In general, I like this.  We just need to think about the above things a 
bit more.





Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-08 Thread Robert Haas
On Wed, Sep 7, 2022 at 5:51 PM Stephen Frost  wrote:
> > To be more precise, I propose that in order for alice to create
> > objects owned by bob or to change one of her objects to be owned by
> > bob, she must not only be a member of role bob, but also inherit bob's
> > privileges. If she has the ability to SET ROLE bob but not does not
> > inherit his privileges, she can create new objects owned by bob only
> > if she first does SET ROLE bob, and she cannot reassign ownership of
> > her objects to bob at all.
>
> ... which means that to get a table owned by bob which is currently
> owned by alice, alice has to:
>
> set role bob;
> create table;
> grant insert on table to alice;
> reset role;
> insert into table select * from table;
>
> That's pretty sucky and is the case that had been contemplated at the
> time that was written to allow it (at least, if memory serves).  iirc,
> that's also why we check the *bob* has CREATE rights in the place where
> this is happening (as otherwise the above process wouldn't work either).

Sure. I think it comes down to what you think that the system
administrator intended to block by not allowing alice to inherit bob's
permissions. In existing releases, there's no facility to prevent
alice from doing SET ROLE bob, so the system administrator can't have
intended this as a security measure. But the system administrator
might have intended that alice shouldn't do anything that relies on
bob's permissions by accident, else she should have SET ROLE. And in
that case the intention is defeated by allowing the operation. Now,
you may well have in mind some other intention that the system
administrator could have had where allowing alice to perform this
operation without needing to inherit bob's permissions is sensible;
I'm not trying to say there is no such case. I don't know what it is,
though.

My first reaction was in the same ballpark as yours: what's the big
deal? But as I think about it more, I struggle to reconcile that
instinct with any specific use case.

Fairly obviously, my thinking here is biased by having written the
patch to allow restricting SET ROLE. If alice can neither inherit
bob's privileges nor SET ROLE bob, she had better not be able to
create objects owned by bob, because otherwise she can make a table,
add an expression index that calls a user-defined function, do stuff
until it needs to be autovacuumed, and then give it to bob, and boom,
exploit. But that doesn't mean that the is_member_of_role() tests here
have to be changed to has_privs_of_role(). They could be changed to
has_privs_of_role() || member_can_set_role(). And if the consensus is
to do it that way, I'm OK with that.

I'm just a little unconvinced that it's actually the best route. I
think that logic of the form "well Alice could just SET ROLE and do it
anyway" is weak -- and not only because of the patch to allow
restricting SET ROLE, but because AFAICT there is no point to the
INHERIT option in the first place unless it is to force you to issue
SET ROLE. That is literally the only thing it does. If we're going to
have weird exceptions where you don't have to SET ROLE after all, why
even have INHERIT in the first place?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: confirmed_flush_lsn shows LSN of the data that has not yet been received by the logical subscriber.

2022-09-08 Thread Ashutosh Bapat
On Thu, Sep 8, 2022 at 4:14 PM Ashutosh Sharma  wrote:
>
> Hi All,
>
> The logically decoded data are sent to the logical subscriber at the time of 
> transaction commit, assuming that the data is small. However, before the 
> transaction commit is performed, the LSN representing the data that is yet to 
> be received by the logical subscriber appears in the confirmed_flush_lsn 
> column of pg_replication_slots catalog. Isn't the information seen in the 
> confirmed_flush_lsn column while the transaction is in progress incorrect ? 
> esp considering the description given in the pg doc for this column.
>
> Actually, while the transaction is running, the publisher keeps on sending 
> keepalive messages containing LSN of the last decoded data saved in reorder 
> buffer and the subscriber responds with the same LSN as the last received LSN 
> which is then updated as confirmed_flush_lsn by the publisher. I think the 
> LSN that we are sending with the keepalive message should be the one 
> representing the transaction begin message, not the LSN of the last decoded 
> data which is yet to be sent. Please let me know if I am missing something 
> here.

Transactions with commit LSN < confirmed_flush_lsn are confirmed to
have been received (and applied) by the subscriber. Setting the LSN
corresponding to a WAL record within a transaction in progress as
confirmed_flush should be ok. Since transactions are interleaved in
the WAL stream, it's quite possible that the LSNs of some WAL records
of an in-flight transaction are less than the commit LSN of some other
transaction. So setting the commit LSN of another transaction is
effectively the same as setting it to the LSN of any previous WAL
record, irrespective of the transaction that it belongs to.

In case the WAL sender restarts with confirmed_flush_lsn set to the
LSN of a WAL record of an in-flight transaction, the whole in-flight
transaction will be sent again, since its commit LSN is higher than
confirmed_flush_lsn.

I think logical replication has inherited this from physical
replication. A useful effect of this is to reduce WAL retention by
moving restart_lsn based on the latest confirmed_flush_lsn.

-- 
Best Wishes,
Ashutosh Bapat




Re: vacuumlo: add test to vacuumlo for test coverage

2022-09-08 Thread Daniel Gustafsson
> On 3 Sep 2022, at 10:27, Dong Wook Lee  wrote:

> I write a tiny patch about vacuumlo to improve test coverage.

If we are paying for setting up a cluster we might as well test more scenarios
than just the one.  Perhaps some other low-hanging fruit like calling vacuumlo
on a non-existent database, or on one where no large objects have been created,
etc.?

One thing about the patch:

+IPC::Run::run [ 'vacuumlo', '-v', '-n', '-p', $port, 'postgres' ], '>', 
\$stdout;

This should use run_command() which provides facilities for running commands
and capturing STDOUT.  With this the test can be rewritten something like:

my ($out, $err) = run_command(['vacuumlo', .. ]);
like($out, ..);

run_command() is defined in PostgreSQL::Test::Utils.
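
A slightly fuller sketch (the expected message text is an assumption on
my part, so double-check it against vacuumlo's actual -n output):

    use PostgreSQL::Test::Cluster;
    use PostgreSQL::Test::Utils;
    use Test::More;

    my $node = PostgreSQL::Test::Cluster->new('main');
    $node->init;
    $node->start;

    # dry-run vacuumlo against the default database
    my ($out, $err) = run_command(
        [ 'vacuumlo', '-v', '-n', '-h', $node->host, '-p', $node->port,
          'postgres' ]);
    like($out, qr/Would remove \d+ large objects/,
        'vacuumlo dry run reports what it would remove');

    $node->stop;
    done_testing();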

--
Daniel Gustafsson   https://vmware.com/





Re: proposal: possibility to read dumped table's name from file

2022-09-08 Thread Daniel Gustafsson
> On 8 Sep 2022, at 13:44, Julien Rouhaud  wrote:
> On Thu, Sep 08, 2022 at 01:38:42PM +0200, Daniel Gustafsson wrote:
>>> On 8 Sep 2022, at 12:00, Erik Rijkers  wrote:

>>> I did notice one peculiarity (in your patch) where for each table a few 
>>> spaces are omitted by pg_dump.
>> 
>> Right, I had that on my TODO to fix before submitting but clearly forgot.  It
>> boils down to consuming the space between commands and object types and 
>> object
>> patterns. The attached v2 fixes that.
> 
> I only had a quick look at the parser,

Thanks for looking!

> .. and one thing that strikes me is:
> 
> +Patterns:
> + /* EMPTY */
> + | Patterns Pattern
> + | Pattern
> + ;
> +
> +Pattern:
> + C_INCLUDE include_object pattern { include_item(priv, $2, $3); }
> 
> It seems confusing to mix Pattern(s) (the rules) and pattern (the token).
> Maybe instead using Include(s) or Item(s) on the bison side, and/or
> name_pattern on the lexer side?

That makes a lot of sense, I renamed the rules in the parser but kept them in
the lexer since that seemed like the clearest scheme.

Also in the attached is a small refactoring to share parser init between
pg_dump and pg_restore (pg_dumpall shares little with these so not there for
now), buffer resize overflow calculation and some error message tweaking.

--
Daniel Gustafsson   https://vmware.com/



v3-0001-Add-include-exclude-filtering-via-file-in-pg_dump.patch
Description: Binary data




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2022-09-08 Thread Justin Pryzby
On Thu, Sep 08, 2022 at 02:35:24PM +0300, Dmitry Koval wrote:
> Thanks a lot Justin!
> 
> After compilation PostgreSQL+patch with macros
> RELCACHE_FORCE_RELEASE,
> RANDOMIZE_ALLOCATED_MEMORY,
> I saw a problem on Windows 10, MSVC2019.

Yes, it passes tests on my CI improvements branch.
https://github.com/justinpryzby/postgres/runs/8248668269
Thanks to Alexander Pyhalov for reminding me about
RELCACHE_FORCE_RELEASE last year ;)

On Tue, May 31, 2022 at 12:32:43PM +0300, Dmitry Koval wrote:
> This can be useful for example cases like: needing to merge all
> one-day partitions into a month partition.

+1, we would use this (at least the MERGE half).

I wonder if it's possible to reduce the size of this patch (I'm starting
to try to absorb it).  Is there a way to refactor/reuse existing code to
reduce its footprint ?

partbounds.c is adding 500+ LOC about checking if proposed partitions
meet the requirements (don't overlap, etc).  But a lot of those checks
must already happen, no?  Can you re-use/refactor the existing checks ?

An UPDATE on a partitioned table will move tuples from one partition to
another.  Is there a way to re-use that ?  Also, postgres already
supports concurrent DDL (CREATE+ATTACH and DETACH CONCURRENTLY).  Is it 
possible to leverage that ?  (Mostly to reduce the patch size, but also
because maybe some cases could be concurrent?).

If the patch were split into separate parts for MERGE and SPLIT, would
the first patch be significantly smaller than the existing patch
(hopefully half as big) ?  That would help to review it, even if both
halves were ultimately squished together.  (An easy way to do this is to
open up all the files in separate editor instances, trim out the parts
that aren't needed for the first patch, save the files but don't quit
the editors, test compilation and regression tests, then git commit
--amend -a.  Then in each editor, "undo" all the trimmed changes, save,
and git commit -a).

Would it save much code if "default" partitions weren't handled in the
first patch ?

-- 
Justin




list of acknowledgments for PG15

2022-09-08 Thread Peter Eisentraut


Attached is the plain-text list of acknowledgments for the PG15 release 
notes, current through REL_15_BETA4.  Please check for problems such as 
wrong sorting, duplicate names in different variants, or names in the 
wrong order etc.  (Note that the current standard is given name followed 
by surname, independent of cultural origin.)

Abhijit Menon-Sen
Adam Brusselback
Adam Mackler
Adrian Ho
Ahsan Hadi
Ajin Cherian
Alastair McKinley
Aleksander Alekseev
Ales Zeleny
Alex Kingsborough
Alex Kozhemyakin
Alexander Korotkov
Alexander Kukushkin
Alexander Lakhin
Alexander Nawratil
Alexander Pyhalov
Alexey Borzov
Alexey Ermakov
Aliaksandr Kalenik
Álvaro Herrera
Amit Kapila
Amit Khandekar
Amit Langote
Amul Sul
Anastasia Lubennikova
Anders Kaseorg
Andreas Dijkman
Andreas Grob
Andreas Seltenreich
Andrei Zubkov
Andres Freund
Andrew Alsup
Andrew Bille
Andrew Dunstan
Andrew Gierth
Andrew Kesper
Andrey Borodin
Andrey Lepikhov
Andy Fan
Anton Melnikov
Anton Voloshin
Antonin Houska
Arjan van de Ven
Arne Roland
Arthur Zakirov
Ashutosh Bapat
Ashutosh Sharma
Ashwin Agrawal
Asif Rehman
Asim Praveen
Atsushi Torikoshi
Aya Iwata
Bauyrzhan Sakhariyev
Bernd Dorn
Bertrand Drouvot
Bharath Rupireddy
Björn Harrtell
Boris Kolpackov
Boris Korzun
Brad Nicholson
Brar Piening
Bruce Momjian
Bruno da Silva
Bryn Llewellyn
Carl Sopchak
Cary Huang
Chapman Flack
Chen Jiaoqian
Chris Bandy
Chris Lowder
Christian Quest
Christoph Berg
Christoph Heiss
Christophe Pettus
Christopher Painter-Wakefield
Claudio Freire
Clemens Zeidler
Corey Huinker
Dag Lem
Dagfinn Ilmari Mannsåker
Dan Kubb
Daniel Cherniy
Daniel Gustafsson
Daniel Polski
Daniel Vérité
Daniel Westermann
Daniele Varrazzo
Daniil Anisimov
Danny Shemesh
Darafei Praliaskouski
Daria Lepikhova
Dave Cramer
Dave Page
David Christensen
David Fetter
David G. Johnston
David Rowley
David Steele
David Zhang
Dean Rasheed
Dian Fay
Dilip Kumar
Dipesh Pandit
Dmitry Dolgov
Dmitry Koval
Dmitry Marakasov
Dominique Devienne
Dong Wook
Drew DeVault
Eduard Català
Egor Chindyaskin
Egor Rogov
Ekaterina Kiryanova
Elena Indrupskaya
Elvis Pranskevichus
Emmanuel Quincerot
Emre Hasegeli
Eric Mutta
Erica Zhang
Erik Rijkers
Erki Eessaar
Etsuro Fujita
Euler Taveira
Fabien Coelho
Fabrice Chapuis
Fabrice Fontaine
Fabrízio de Royes Mello
Feike Steenbergen
Filip Gospodinov
Florin Irion
Floris Van Nee
Frédéric Yhuel
Gabriela Serventi
Gaurab Dey
Geoff Winkless
Geoffrey Blake
Georgios Kokolatos
Gilles Darold
Greg Nancarrow
Greg Rychlewski
Greg Sabino Mullane
Greg Stark
Gregory Smith
Guillaume Lelarge
Gunnar Bluth
Gurjeet Singh
Haiyang Wang
Haiying Tang
Hannu Krosing
Hans Buschmann
Hayato Kuroda
Heikki Linnakangas
Herwig Goemans
Himanshu Upadhyaya
Holly Roberts
Hou Zhijie
Hubert Lubaczewski
Ian Barwick
Ian Campbell
Ibrar Ahmed
Ildus Kurbangaliev
Ilya Anfimov
Itamar Gafni
Jacob Champion
Jaime Casanova
Jakub Wartak
James Coleman
James Hilliard
James Inform
Jan Piotrowski
Japin Li
Jason Harvey
Jason Kim
Jean-Christophe Arnu
Jeevan Ladhe
Jeff Davis
Jeff Janes
Jelte Fennema
Jeremy Evans
Jeremy Schneider
Jian Guo
Jian He
Jimmy Yih
Jiří Fejfar
Jitka Plesníková
Joe Conway
Joe Wildish
Joel Jacobson
Joey Bodoia
John Naylor
Jonathan Katz
Josef Simanek
Joseph Koshakow
Josh Soref
Joshua Brindle
Juan José Santamaría Flecha
Julien Rouhaud
Julien Roze
Junwang Zhao
Jürgen Purtz
Justin Pryzby
Kamigishi Rei
Kawamoto Masaya
Ken Kato
Kevin Burke
Kevin Grittner
Kevin Humphreys
Kevin McKibbin
Kevin Sweet
Kevin Zheng
Klaudie Willis
Konstantin Knizhnik
Konstantina Skovola
Kosei Masumura
Koyu Tanigawa
Kuntal Ghosh
Kyotaro Horiguchi
Lars Kanis
Lauren Fliksteen
Laurent Hasson
Laurenz Albe
Leslie Lemaire
Li Japin
Liam Bowen
Lingjie Qiang
Liu Huailing
Louis Jachiet
Lukas Fittl
Ma Liangzhu
Maciek Sakrejda
Magnus Hagander
Mahendra Singh Thalor
Maksim Milyutin
Marc Bachmann
Marcin Krupowicz
Marcus Gartner
Marek Szuba
Marina Polyakova
Mario Emmenlauer
Mark Dilger
Mark Murawski
Mark Wong
Markus Wanner
Markus Winand
Martijn van Oosterhout
Martin Kalcher
Martín Marqués
Masahiko Sawada
Masahiro Ikeda
Masao Fujii
Masayuki Hirose
Matthias van de Meent
Matthijs van der Vleuten
Maxim Orlov
Maxim Yablokov
Melanie Plageman
Michael Banck
Michael Harris
Michael J. Sullivan
Michael Meskes
Michael Mühlbeyer
Michael Paquier
Michael Powers
Mike Fiedler
Mike Oh
Mikhail Kulagin
Miles Delahunty
Nathan Bossart
Nathan Long
Nazir Bilal Yavuz
Neha Sharma
Neil Chen
Nicola Contu
Nicolas Lutic
Nikhil Benesch
Nikhil Shetty
Nikhil Sontakke
Nikita Glukhov
Nikolai Berkoff
Nikolay Samokhvalov
Nikolay Shaplov
Nitin Jadhav
Noah Misch
Noboru Saito
Noriyoshi Shinoda
Okano Naoki
Olaf Bohlen
Olly Betts
Onder Kalaci
Oskar Stenberg
Otto Kekalainen
Paul Guo
Paul Jungwirth
Paul Martinez
Pavan Deolasee
Pavel Borisov
Pavel Luzanov
Pavel Stehule
Peter Eisentraut
Peter Geoghegan
Peter Slavov
Peter Smith
Petr Jelínek
Phil Florent
Phil Krylov
Pierre-Aurélien Georges
Prabhat Sahu
Quan Zongliang
Rachel Heaton
Rahila Syed
Rajakavitha Kodhandapani
Rajkumar Raghuwanshi
Ranier Vilela
Reid Thompson
Rémi Lapeyre
Renan Soares 

Re: [BUG] wrong FK constraint name when colliding name on ATTACH

2022-09-08 Thread Jehan-Guillaume de Rorthais
On Thu, 8 Sep 2022 13:25:15 +0200
Alvaro Herrera  wrote:

> On 2022-Sep-08, Jehan-Guillaume de Rorthais wrote:
> 
> > Hi there,
> > 
> > I believe this very small bug and its fix are really trivial and could be
> > pushed out of the way quite quickly. It's just about a bad constraint name
> > fixed by moving one assignation after the next one. This could easily be
> > fixed for next round of releases.
> > 
> > Well, I hope I'm not wrong :)  
> 
> I think you're right, so pushed, and backpatched to 12.  I added the
> test case to regression also.

Great, thank you for the additional work on the regression test and the commit!

> For 11, I adjusted the test case so that it didn't depend on an FK
> pointing to a partitioned table (which is not supported there); it turns
> out that the old code is not smart enough to get into the problem in the
> first place.  [...]
> It seems fair to say that this case, with pg11, is unsupported and
> people should upgrade if they want better behavior.

That works for me.

Thanks!




ATExecAddColumn() doesn't merge inherited properties

2022-09-08 Thread Amit Langote
While testing Alvaro's "cataloguing NOT NULL constraints" patch [1], I
noticed a behavior of inherited column merging during ALTER TABLE that
I thought might be a bug (though see at the bottom).

Note how CREATE TABLE merges the inherited NOT NULL of parent foo's
column a into child bar's own column a:

create table foo (a int not null);

create table bar (a int, b int) inherits (foo);
NOTICE:  merging column "a" with inherited definition
CREATE TABLE

\d bar
Table "public.bar"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   | not null |
 b  | integer |   |  |
Inherits: foo

However, ALTER TABLE apparently doesn't pass down the NOT NULL flag
when merging the parent's new column b into a child table's existing
column b:

alter table foo add b int not null;
NOTICE:  merging definition of column "b" for child "bar"
\d bar
Table "public.bar"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   | not null |
 b  | integer |   |  |
Inherits: foo

ATExecAddColumn()'s following block of code that handles the merging
seems inadequate compared to the similar block of MergeAttributes()
called during CREATE TABLE:

/*
 * Are we adding the column to a recursion child?  If so, check whether to
 * merge with an existing definition for the column.  If we do merge, we
 * must not recurse.  Children will already have the column, and recursing
 * into them would mess up attinhcount.
 */
if (colDef->inhcount > 0)
{
...
 /* Bump the existing child att's inhcount */
childatt->attinhcount++;
CatalogTupleUpdate(attrdesc, &tuple->t_self, tuple);

heap_freetuple(tuple);

/* Inform the user about the merge */
ereport(NOTICE,
        (errmsg("merging definition of column \"%s\" for child \"%s\"",
                colDef->colname, RelationGetRelationName(rel))));

table_close(attrdesc, RowExclusiveLock);
return InvalidObjectAddress;
}

This only increments attinhcount of the child's existing column,
unlike MergeAttributes()'s code, which will not only merge the NOT
NULL flag but also check for generated conflicts, so one gets the
following behavior:

create table foo (a int generated always as (1) stored);

create table bar (a int generated always as identity) inherits (foo);
NOTICE:  merging column "a" with inherited definition
ERROR:  column "a" inherits from generated column but specifies identity

create table bar (a int generated always as (2) stored) inherits (foo);
NOTICE:  merging column "a" with inherited definition
ERROR:  child column "a" specifies generation expression
HINT:  Omit the generation expression in the definition of the child
table column to inherit the generation expression from the parent
table.

create table bar (a int, b int generated always as identity) inherits (foo);
NOTICE:  merging column "a" with inherited definition
\d bar
Table "public.bar"
 Column |  Type   | Collation | Nullable |Default
+-+---+--+
 a  | integer |   |  | generated always as (1) stored
 b  | integer |   | not null | generated always as identity
Inherits: foo

alter table foo add b int generated always as (1) stored;
NOTICE:  merging definition of column "b" for child "bar"
\d bar
Table "public.bar"
 Column |  Type   | Collation | Nullable |Default
+-+---+--+
 a  | integer |   |  | generated always as (1) stored
 b  | integer |   | not null | generated always as identity
Inherits: foo

So, adding a column to the parent after-the-fact will allow its
generated definition to differ from that of the child's existing
column, which is not allowed when creating the child table with its
own different generated definition for its column.

I feel like we may have discussed this before and decided that the
$subject is left that way intentionally, but wanted to bring this up
again in the light of Alvaro's patch which touches nearby code.
Should we try to fix this?

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://commitfest.postgresql.org/39/3869/




pg_walinspect float4/float8 confusion

2022-09-08 Thread Peter Eisentraut
The pg_walinspect function pg_get_wal_stats() has output arguments 
declared as float4 (count_percentage, record_size_percentage, etc.), but 
the internal computations are all done in type double.  Is there a 
reason why this is then converted to float4 for output?  It probably 
doesn't matter in practice, but it seems unnecessarily confusing.  Or at 
least add a comment so it doesn't look like an accident.  Also compare 
with pgstattuple, which uses float8 in its SQL interface for similar data.
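
For illustration, a minimal C sketch of the pattern in question
(hypothetical variable names, not the actual pg_walinspect code):

    /* the ratio is computed in double precision ... */
    double      pct = total_count > 0 ? 100.0 * rec_count / total_count : 0.0;

    /* ... but exposed as float4, silently narrowing the result */
    values[i++] = Float4GetDatum((float4) pct);

    /* a float8 output column would avoid the narrowing: */
    /* values[i++] = Float8GetDatum(pct); */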





Re: Switching XLog source from archive to streaming when primary available

2022-09-08 Thread Bharath Rupireddy
On Wed, Sep 7, 2022 at 3:27 AM Nathan Bossart  wrote:
>
> +  
> +   wal_source_switch_interval configuration 
> parameter
> +  
>
> I don't want to bikeshed on the name too much, but I do think we need
> something more descriptive.  I'm thinking of something like
> streaming_replication_attempt_interval or
> streaming_replication_retry_interval.

I could only come up with wal_source_switch_interval after a lot of
bikeshedding myself :). However, streaming_replication_retry_interval
looks much better; I've used it in the latest patch. Thanks.

> +Specifies how long the standby server should wait before switching 
> WAL
> +source from WAL archive to primary (streaming replication). This can
> +happen either during the standby initial recovery or after a previous
> +failed attempt to stream WAL from the primary.
>
> I'm not sure what the second sentence means.  In general, I think the
> explanation in your commit message is much clearer:

I polished the comments, docs and commit message a bit, please check now.

> 5 seconds seems low.  I would expect the default to be 1-5 minutes.  I
> think it's important to strike a balance between interrupting archive
> recovery to attempt streaming replication and letting archive recovery make
> progress.

+1 for a default value of 5 minutes to avoid frequent interruptions
to archive mode when the primary is really down for a long time. I've
also added a cautionary note in the docs about lower values.

> +* Try reading WAL from primary after every wal_source_switch_interval
> +* milliseconds, when state machine is in XLOG_FROM_ARCHIVE state. If
> +* successful, the state machine moves to XLOG_FROM_STREAM state, 
> otherwise
> +* it falls back to XLOG_FROM_ARCHIVE state.
>
> It's not clear to me how this is expected to interact with the pg_wal phase
> of standby recovery.  As the docs note [0], standby servers loop through
> archive recovery, recovery from pg_wal, and streaming replication.  Does
> this cause the pg_wal phase to be skipped (i.e., the standby goes straight
> from archive recovery to streaming replication)?  I wonder if it'd be
> better for this mechanism to simply move the standby to the pg_wal phase so
> that the usual ordering isn't changed.
>
> [0] 
> https://www.postgresql.org/docs/current/warm-standby.html#STANDBY-SERVER-OPERATION

It doesn't change any behaviour as such for XLOG_FROM_PG_WAL. In
standby mode, when restore_command is specified, the initial value of
currentSource would be XLOG_FROM_ARCHIVE (see [1]). If the archive runs
out of WAL or the standby fails to fetch from the archive, then
it switches to XLOG_FROM_STREAM. If the standby fails to receive WAL
from the primary, it switches back to XLOG_FROM_ARCHIVE. This continues
unless the standby gets promoted. With the patch, we enable the
standby to try fetching from the primary instead of waiting for the
archive to run out of WAL or for an error to occur in the
standby while receiving from the archive.

> +   if (!first_time &&
> +       TimestampDifferenceExceeds(last_switch_time, curr_time,
> +                                  wal_source_switch_interval))
>
> Shouldn't this also check that wal_source_switch_interval is not set to 0?

Corrected.

> +   elog(DEBUG2,
> +        "trying to switch WAL source to %s after fetching WAL from %s for %d milliseconds",
> +        xlogSourceNames[XLOG_FROM_STREAM],
> +        xlogSourceNames[currentSource],
> +        wal_source_switch_interval);
> +
> +   last_switch_time = curr_time;
>
> Shouldn't the last_switch_time be set when the state machine first enters
> XLOG_FROM_ARCHIVE?  IIUC this logic is currently counting time spent
> elsewhere (e.g., XLOG_FROM_STREAM) when determining whether to force a
> source switch.  This would mean that a standby that has spent a lot of time
> in streaming replication before failing would flip to XLOG_FROM_ARCHIVE,
> immediately flip back to XLOG_FROM_STREAM, and then likely flip back to
> XLOG_FROM_ARCHIVE when it failed again.  Given the standby will wait for
> wal_retrieve_retry_interval before going back to XLOG_FROM_ARCHIVE, it
> seems like we could end up rapidly looping between sources.  Perhaps I am
> misunderstanding how this is meant to work.

last_switch_time indicates the time when the standby last attempted to
switch to the primary. For instance, a standby:
1) for the first time, sets last_switch_time = current_time when in archive mode;
2) if current_time < last_switch_time + interval, continues to fetch
WAL from the archive.
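
A hedged sketch of the check being discussed (simplified, not the patch
as posted):

    if (streaming_replication_retry_interval > 0 &&
        currentSource == XLOG_FROM_ARCHIVE &&
        TimestampDifferenceExceeds(last_switch_time, curr_time,
                                   streaming_replication_retry_interval))
    {
        currentSource = XLOG_FROM_STREAM;   /* try the primary now */
        last_switch_time = curr_time;       /* restart the interval */
    }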

Re: proposal: possibility to read dumped table's name from file

2022-09-08 Thread Julien Rouhaud
Hi,

On Thu, Sep 08, 2022 at 01:38:42PM +0200, Daniel Gustafsson wrote:
> > On 8 Sep 2022, at 12:00, Erik Rijkers  wrote:
> > 
> > Op 07-09-2022 om 21:45 schreef Daniel Gustafsson:
> >> One thing this patchversion currently lacks is refined error messaging, 
> >> but if
> >> we feel that this approach is a viable path then that can be tweaked.  The
> >> function which starts the parser can also be refactored to be shared across
> >> pg_dump, pg_dumpall and pg_restore but I've kept it simple for now.
> >> Thoughts?  It would be nice to get this patch across the finishline during 
> >> this
> >> commitfest.
> > 
> > > [0001-Add-include-exclude-filtering-via-file-in-pg_dump.patch]
> > 
> > This seems to dump & restore well (as Pavels patch does).
> 
> Thanks for looking!
> 
> > I did notice one peculiarity (in your patch) where for each table a few 
> > spaces are omitted by pg_dump.
> 
> Right, I had that on my TODO to fix before submitting but clearly forgot.  It
> boils down to consuming the space between commands and object types and object
> patterns. The attached v2 fixes that.

I only had a quick look at the parser, and one thing that strikes me is:

+Patterns:
+   /* EMPTY */
+   | Patterns Pattern
+   | Pattern
+   ;
+
+Pattern:
+   C_INCLUDE include_object pattern { include_item(priv, $2, $3); }

It seems confusing to mix Pattern(s) (the rules) and pattern (the token).
Maybe instead using Include(s) or Item(s) on the bison side, and/or
name_pattern on the lexer side?
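
In other words, a hedged sketch of the suggested renaming (rule and
token names are illustrative only):

Items:
    /* EMPTY */
    | Items Item
    ;

Item:
    C_INCLUDE include_object name_pattern { include_item(priv, $2, $3); }
    ;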




Re: proposal: possibility to read dumped table's name from file

2022-09-08 Thread Daniel Gustafsson
> On 8 Sep 2022, at 12:00, Erik Rijkers  wrote:
> 
> Op 07-09-2022 om 21:45 schreef Daniel Gustafsson:
>> One thing this patchversion currently lacks is refined error messaging, but 
>> if
>> we feel that this approach is a viable path then that can be tweaked.  The
>> function which starts the parser can also be refactored to be shared across
>> pg_dump, pg_dumpall and pg_restore but I've kept it simple for now.
>> Thoughts?  It would be nice to get this patch across the finishline during 
>> this
>> commitfest.
> 
> > [0001-Add-include-exclude-filtering-via-file-in-pg_dump.patch]
> 
> This seems to dump & restore well (as Pavels patch does).

Thanks for looking!

> I did notice one peculiarity (in your patch) where for each table a few 
> spaces are omitted by pg_dump.

Right, I had that on my TODO to fix before submitting but clearly forgot.  It
boils down to consuming the space between commands and object types and object
patterns. The attached v2 fixes that.

--
Daniel Gustafsson   https://vmware.com/



v2-0001-Add-include-exclude-filtering-via-file-in-pg_dump.patch
Description: Binary data


why can't a table be part of the same publication as its schema

2022-09-08 Thread Peter Eisentraut
Apparently, you can't add a table to a publication if its schema is 
already part of the publication (and vice versa), e.g.,


=# alter publication p1 add table s1.t1;
ERROR:  22023: cannot add relation "s1.t1" to publication
DETAIL:  Table's schema "s1" is already part of the publication or part 
of the specified schema list.


Is there a reason for this?  It looks a bit like a misfeature to me.  It 
constrains how you can move your tables around between schemas, based on 
how a publication has been constructed somewhere else.


It seems to me that it shouldn't be difficult to handle the case that a 
table is part of the publication via two different routes.  (We must 
already handle that since a subscription can use more than one publication.)





Re: [PATCH] Query Jumbling for CALL and SET utility statements

2022-09-08 Thread Julien Rouhaud
Hi,

On Thu, Sep 08, 2022 at 11:06:51AM +0200, Drouvot, Bertrand wrote:
> Hi,
> 
> On 9/8/22 8:50 AM, Julien Rouhaud wrote:
> 
> Thanks for looking at it!
> 
> > On Thu, Sep 08, 2022 at 02:23:19PM +0900, Michael Paquier wrote:
> > > On Wed, Sep 07, 2022 at 06:19:42PM -0700, Jeremy Schneider wrote:
> > > > I didn't fully debug yet, but here's the backtrace on my 14.4 build with
> > > > the patch
> > > What happens on HEAD?  That would be the target branch for a new
> > > feature.
> > It would be the same AFAICS.  From v3:
> > 
> > +   case T_VariableSetStmt:
> > +   {
> > +   VariableSetStmt *stmt = (VariableSetStmt *) node;
> > +
> > +   APP_JUMB_STRING(stmt->name);
> > +   JumbleExpr(jstate, (Node *) stmt->args);
> > +   }
> > 
> > For a RESET ALL command stmt->name is NULL.
> 
> Right, please find attached v4 addressing the issue and also Sami's comments
> [1].

(Sorry I've not been following this thread until now)

IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email.  What do you think about
normalizing those too while working on the subject?
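
(For reference, a hedged sketch of a NULL-safe variant of the jumbling
case quoted above; illustrative only, not the v4 patch:

    case T_VariableSetStmt:
        {
            VariableSetStmt *stmt = (VariableSetStmt *) node;

            /* stmt->name is NULL for RESET ALL */
            if (stmt->name)
                APP_JUMB_STRING(stmt->name);
            JumbleExpr(jstate, (Node *) stmt->args);
        }
        break;
)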




Re: [BUG] wrong FK constraint name when colliding name on ATTACH

2022-09-08 Thread Alvaro Herrera
On 2022-Sep-08, Jehan-Guillaume de Rorthais wrote:

> Hi there,
> 
> I believe this very small bug and its fix are really trivial and could be push
> out of the way quite quickly. It's just about a bad constraint name fixed by
> moving one assignation after the next one. This could easily be fixed for next
> round of releases.
> 
> Well, I hope I'm not wrong :)

I think you're right, so pushed, and backpatched to 12.  I added the
test case to regression also.

For 11, I adjusted the test case so that it didn't depend on an FK
pointing to a partitioned table (which is not supported there); it turns
out that the old code is not smart enough to get into the problem in the
first place.  Setup is

CREATE TABLE parted_fk_naming_pk (id bigint primary key);
CREATE TABLE parted_fk_naming (
id_abc bigint,
CONSTRAINT dummy_constr FOREIGN KEY (id_abc)
REFERENCES parted_fk_naming_pk (id)
)
PARTITION BY LIST (id_abc);
CREATE TABLE parted_fk_naming_1 (
id_abc bigint,
CONSTRAINT dummy_constr CHECK (true)
);

and then
ALTER TABLE parted_fk_naming ATTACH PARTITION parted_fk_naming_1 FOR VALUES IN 
('1');
throws this error:

ERROR:  duplicate key value violates unique constraint 
"pg_constraint_conrelid_contypid_conname_index"
DETALLE:  Key (conrelid, contypid, conname)=(686125, 0, dummy_constr) already 
exists.

It seems fair to say that this case, with pg11, is unsupported and
people should upgrade if they want better behavior.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793&cid=4647152)




Re: Perform streaming logical transactions by background workers and parallel apply

2022-09-08 Thread Amit Kapila
On Thu, Sep 8, 2022 at 12:21 PM Amit Kapila  wrote:
>
> On Mon, Sep 5, 2022 at 6:34 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > Attach the correct patch set this time.
> >
>
> Few comments on v28-0001*:
> ===
>

Some suggestions for comments in v28-0001*
1.
+/*
+ * Entry for a hash table we use to map from xid to the parallel apply worker
+ * state.
+ */
+typedef struct ParallelApplyWorkerEntry

Let's change this comment to: "Hash table entry to map xid to the
parallel apply worker state."

2.
+/*
+ * List that stores the information of parallel apply workers that were
+ * started. Newly added worker information will be removed from the list at the
+ * end of the transaction when there are enough workers in the pool. Besides,
+ * exited workers will be removed from the list after being detected.
+ */
+static List *ParallelApplyWorkersList = NIL;

Can we change this to: "A list to maintain the active parallel apply
workers. The information for the new worker is added to the list after
successfully launching it. The list entry is removed at the end of the
transaction if there are already enough workers in the worker pool.
For more information about the worker pool, see comments atop
worker.c. We also remove the entry from the list if the worker
exits due to an error."

Apart from this, I have added/changed a few other comments in
v28-0001*. Kindly check the attached, if you are fine with it then
please include it in the next version.

-- 
With Regards,
Amit Kapila.


change_parallel_apply_comments_amit_1.patch
Description: Binary data


Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering

2022-09-08 Thread Matthias van de Meent
Hi,

In Neon, we've had to modify the GIN fast insertion path as attached,
due to an unexpected XLOG insertion and buffer locking ordering issue.

The xlog readme [0] mentions that the common order of operations is 1)
pin and lock any buffers you need for the log record, then 2) start a
critical section, then 3) call XLogBeginInsert.
In Neon, we rely on this documented order of operations to expect to
be able to WAL-log hint pages (freespace map, visibility map) when
they are written to disk (e.g. cache eviction, checkpointer). In
general, this works fine, except that in ginHeapTupleFastInsert we
call XLogBeginInsert() before the last of the buffers for the eventual
record was read, thus creating a path where eviction is possible in a
`begininsert_called = true` context. That breaks our current code by
being unable to evict (WAL-log) the dirtied hint pages.

PFA a patch that rectifies this issue, by moving the XLogBeginInsert()
down to where 1.) we have all relevant buffers pinned and locked, and
2.) we're in a critical section, making that part of the code
consistent with the general scheme for XLog insertion.
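
For reference, a hedged sketch of that documented ordering (simplified,
not the actual ginfast.c code):

    /* 1. Pin and lock every buffer the record will need */
    metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);
    LockBuffer(metabuffer, GIN_EXCLUSIVE);

    /* 2. Enter the critical section */
    START_CRIT_SECTION();

    /* 3. Only now begin assembling the WAL record */
    XLogBeginInsert();
    XLogRegisterBuffer(0, metabuffer, REGBUF_STANDARD);
    /* ... modify pages, XLogInsert(), END_CRIT_SECTION() ... */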

Kind regards,

Matthias van de Meent

[0] access/transam/README, section "Write-Ahead Log Coding", line 436-470


v1-0001-Fix-GIN-fast-path-XLogBeginInsert-and-Read-LockBu.patch
Description: Binary data


confirmed_flush_lsn shows LSN of the data that has not yet been received by the logical subscriber.

2022-09-08 Thread Ashutosh Sharma
Hi All,

The logically decoded data are sent to the logical subscriber at the time
of transaction commit, assuming that the data is small. However, before the
transaction commit is performed, the LSN representing the data that is yet
to be received by the logical subscriber appears in the confirmed_flush_lsn
column of the pg_replication_slots catalog. Isn't the information seen in the
confirmed_flush_lsn column while the transaction is in progress incorrect,
especially considering the description given in the PG docs for this column?

Actually, while the transaction is running, the publisher keeps sending
keepalive messages containing the LSN of the last decoded data saved in the
reorder buffer, and the subscriber responds with the same LSN as the last
received LSN, which the publisher then records as confirmed_flush_lsn. I
think the LSN we send with the keepalive message should be the one
representing the transaction begin message, not the LSN of the last decoded
data that is yet to be sent. Please let me know if I am missing something
here.

--
With Regards,
Ashutosh Sharma.


Re: making relfilenodes 56 bits

2022-09-08 Thread Dilip Kumar
On Tue, Sep 6, 2022 at 11:07 PM Robert Haas  wrote:

> On Tue, Sep 6, 2022 at 4:40 AM Dilip Kumar  wrote:
> > I have explored this area more and also tried to come up with a
> > working prototype, so while working on this I realized that we would
> > have almost to execute all the code which is getting generated as part
> > of the dumpDatabase() and dumpACL() which is basically,
> >
> > 1. UPDATE pg_catalog.pg_database SET datistemplate = false
> > 2. DROP DATABASE
> > 3. CREATE DATABASE with all the database properties like ENCODING,
> > LOCALE_PROVIDER, LOCALE, LC_COLLATE, LC_CTYPE, ICU_LOCALE,
> > COLLATION_VERSION, TABLESPACE
> > 4. COMMENT ON DATABASE
> > 5. Logic inside dumpACL()
> >
> > I feel duplicating logic like this is really error-prone, but I do not
> > find any clear way to reuse the code as dumpDatabase() has a high
> > dependency on the Archive handle and generating the dump file.
>
> Yeah, I don't think this is the way to go at all. The duplicated logic
> is likely to get broken, and is also likely to annoy the next person
> who has to maintain it.
>

Right


> I suggest that for now we fall back on making the initial
> RelFileNumber for a system table equal to pg_class.oid. I don't really
> love that system and I think maybe we should change it at some point
> in the future, but all the alternatives seem too complicated to cram
> them into the current patch.
>

That makes sense.

On a separate note, while reviewing the latest patch I see there is some
risk of using an unflushed relfilenumber in the GetNewRelFileNumber()
function.  Basically, in the current code, the flushing logic is tightly
coupled with the logic for logging the new relfilenumber, and that might
not work for all values of VAR_RELNUMBER_NEW_XLOG_THRESHOLD.  So the idea
is to keep the flushing logic separate from the logging; I am working on
that and will post a patch soon.
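
To sketch the idea (all variable and function names here are
hypothetical, not the actual patch):

    /* Log a new range well before the current one is used up ... */
    if (nextRelFileNumber == loggedUpto)
    {
        loggedLsn = LogNextRelFileNumber(loggedUpto + VAR_RELNUMBER_NEW_XLOG_THRESHOLD);
        loggedUpto += VAR_RELNUMBER_NEW_XLOG_THRESHOLD;
    }

    /* ... and, independently, never hand out a number whose WAL record
     * might not have reached disk yet */
    if (nextRelFileNumber >= flushedUpto)
    {
        XLogFlush(loggedLsn);
        flushedUpto = loggedUpto;
    }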

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: RFC: Logging plan of the running query

2022-09-08 Thread torikoshia

On 2022-05-16 17:02, torikoshia wrote:

On 2022-03-09 19:04, torikoshia wrote:

On 2022-02-08 01:13, Fujii Masao wrote:

AbortSubTransaction() should reset ActiveQueryDesc to
save_ActiveQueryDesc that ExecutorRun() set, instead of NULL?
Otherwise ActiveQueryDesc of top-level statement will be unavailable
after subtransaction is aborted in the nested statements.


I once agreed with the above suggestion and made a v20 patch making
save_ActiveQueryDesc a global variable, but it caused a segfault when
calling pg_log_query_plan() after FreeQueryDesc().

OTOH, doing some kind of reset of ActiveQueryDesc seems necessary,
since it also caused a segfault when running pg_log_query_plan() during
installcheck.

There may be a better way, but resetting ActiveQueryDesc to NULL seems
safe and simple.
Of course it makes pg_log_query_plan() useless after a subtransaction
is aborted.
However, if people rarely need to see the plan of a running query whose
subtransaction was aborted, resetting ActiveQueryDesc to NULL would be
acceptable.

Attached is a patch that sets ActiveQueryDesc to NULL when a
subtransaction is aborted.

What do you think?

Attached is a new patch to fix patch apply failures again.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATIONFrom 5be784278e8e7aeeeadf60a772afccda7b59e6e4 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Mon, 5 Sep 2022 12:02:16 +0900
Subject: [PATCH v23] Add function to log the plan of the query currently
 running on the backend.

Currently, we have to wait for the query execution to finish
to check its plan. This is not so convenient when
investigating long-running queries on production environments
where we cannot use debuggers.
To improve this situation, this patch adds
pg_log_query_plan() function that requests to log the
plan of the specified backend process.

By default, only superusers are allowed to request to log the
plans because allowing any users to issue this request at an
unbounded rate would cause lots of log messages and which can
lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its plan at LOG_SERVER_ONLY level, so
that these plans will appear in the server log but not be sent
to the client.

Reviewed-by: Bharath Rupireddy, Fujii Masao, Dilip Kumar,
Masahiro Ikeda, Ekaterina Sokolova, Justin Pryzby, Kyotaro
Horiguchi, Robert Treat

---
 doc/src/sgml/func.sgml   |  49 +++
 src/backend/access/transam/xact.c|  13 ++
 src/backend/catalog/system_functions.sql |   2 +
 src/backend/commands/explain.c   | 140 ++-
 src/backend/executor/execMain.c  |  10 ++
 src/backend/storage/ipc/procsignal.c |   4 +
 src/backend/storage/lmgr/lock.c  |   9 +-
 src/backend/tcop/postgres.c  |   4 +
 src/backend/utils/init/globals.c |   2 +
 src/include/catalog/pg_proc.dat  |   6 +
 src/include/commands/explain.h   |   3 +
 src/include/miscadmin.h  |   1 +
 src/include/storage/lock.h   |   2 -
 src/include/storage/procsignal.h |   1 +
 src/include/tcop/pquery.h|   2 +
 src/test/regress/expected/misc_functions.out |  54 +--
 src/test/regress/sql/misc_functions.sql  |  41 --
 17 files changed, 315 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 67eb380632..0e12f73b9c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25535,6 +25535,25 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_query_plan
+
+pg_log_query_plan ( pid integer )
+boolean
+   
+   
+Requests to log the plan of the query currently running on the
+backend with specified process ID.
+It will be logged at LOG message level and
+will appear in the server log based on the log
+configuration set (See 
+for more information), but will not be sent to the client
+regardless of .
+
+  
+
   

 
@@ -25649,6 +25668,36 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_query_plan can be used
+to log the plan of a backend process. For example:
+
+postgres=# SELECT pg_log_query_plan(201116);
+ pg_log_query_plan
+---
+ t
+(1 row)
+
+The format of the query plan is the same as when VERBOSE,
+COSTS, SETTINGS and
+FORMAT TEXT are used in the EXPLAIN
+command. For example:
+
+LOG:  query plan running on backend with PID 201116 is:
+Query Text: SELECT * FROM pgbench_accounts;
+Seq Scan on public.pgbench_accounts  (cost=0.00..52787.00 rows=200 width=97)
+  Output: aid, bid, abalance, filler
+Settings: work_mem = '1MB'
+
+ 

Re: proposal: possibility to read dumped table's name from file

2022-09-08 Thread Erik Rijkers

Op 07-09-2022 om 21:45 schreef Daniel Gustafsson:


One thing this patchversion currently lacks is refined error messaging, but if
we feel that this approach is a viable path then that can be tweaked.  The
function which starts the parser can also be refactored to be shared across
pg_dump, pg_dumpall and pg_restore but I've kept it simple for now.

Thoughts?  It would be nice to get this patch across the finishline during this
commitfest.


> [0001-Add-include-exclude-filtering-via-file-in-pg_dump.patch]

This seems to dump & restore well (as Pavels patch does).

I did notice one peculiarity (in your patch) where for each table a few 
spaces are omitted by pg_dump.


-
#! /bin/bash

psql -qXc "drop database if exists testdb2"
psql -qXc "create database testdb2"

echo "
create schema if not exists test;
create table table0 (id integer);
create table table1 (id integer);
insert into table0 select n from generate_series(1,2) as f(n);
insert into table1 select n from generate_series(1,2) as f(n);
" | psql -qXad testdb2

echo "include table table0" > inputfile1.txt

echo "include table table0
include table table1" > inputfile2.txt

# 1 table, emits 2 spaces
echo -ne ">"
pg_dump -F p -f plainfile1 --filter=inputfile1.txt -d testdb2
echo "<"

# 2 tables, emits 4 spaces
echo -ne ">"
pg_dump -F p -f plainfile2 --filter=inputfile2.txt -d testdb2
echo "<"

# dump without filter emits no spaces
echo -ne ">"
pg_dump -F c -f plainfile3 -t table0 -t table1 -d testdb2
echo "<"
-

It's probably a small thing -- but I didn't find it.

thanks,

Erik Rijkers








Re: Convert *GetDatum() and DatumGet*() macros to inline functions

2022-09-08 Thread Aleksander Alekseev
Hi Peter,

> > 3. Go with your patch and just fix up the warnings about uninitialized
> > variables.  But that seems the least principled to me.
>
> IMO the 3rd option is the lesser evil. Initializing four bools/ints in
> order to make Clang 11 happy doesn't strike me as such a big deal. At
> least until somebody reports a bottleneck for this particular reason.
> We can optimize the code when and if this will happen.

Since the first patch was applied, cfbot now complains that it can't
apply the patchset. Here is the rebased version.
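
For context, a hedged sketch of the conversion pattern (illustrative,
not the committed text):

    /* before: a macro, with no type checking of its argument */
    #define BoolGetDatum(X) ((Datum) ((X) ? 1 : 0))

    /* after: an inline function with a typed argument */
    static inline Datum
    BoolGetDatum(bool X)
    {
        return (Datum) (X ? 1 : 0);
    }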

-- 
Best regards,
Aleksander Alekseev


v5-0001-Convert-GetDatum-and-DatumGet-macros-to-inline-fu.patch
Description: Binary data


Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~

2022-09-08 Thread Michael Paquier
On Tue, Aug 30, 2022 at 01:29:24PM +1200, Thomas Munro wrote:
> This reminds me of 24c3ce8f1, which replaced a dlopen()-ish thing with
> a direct function call.  Can you just call all these functions
> directly these days?

Hmm.  Some tests in the CI show that attempting to call
MiniDumpWriteDump() directly causes a linking failure at compilation.  Anyway,
in the same fashion, I can get some simplifications done right for
pg_ctl.c, auth.c and restricted_token.c.  And I am seeing all these
functions listed in the headers of MinGW, meaning that all these
should work out of the box in this case, no?

The others are library-dependent, and I am not really confident about
ldap_start_tls_sA().  So, in the end, I am finishing with the
attached, what do you think?  This cuts some code, which is nice:
 3 files changed, 48 insertions(+), 159 deletions(-)
--
Michael
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index b2b0b83a97..b3e51698dc 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -1201,11 +1201,8 @@ pg_SSPI_recvauth(Port *port)
 	DWORD		accountnamesize = sizeof(accountname);
 	DWORD		domainnamesize = sizeof(domainname);
 	SID_NAME_USE accountnameuse;
-	HMODULE		secur32;
 	char	   *authn_id;
 
-	QUERY_SECURITY_CONTEXT_TOKEN_FN _QuerySecurityContextToken;
-
 	/*
 	 * Acquire a handle to the server credentials.
 	 */
@@ -1358,36 +1355,12 @@ pg_SSPI_recvauth(Port *port)
 	 *
 	 * Get the name of the user that authenticated, and compare it to the pg
 	 * username that was specified for the connection.
-	 *
-	 * MingW is missing the export for QuerySecurityContextToken in the
-	 * secur32 library, so we have to load it dynamically.
 	 */
 
-	secur32 = LoadLibrary("SECUR32.DLL");
-	if (secur32 == NULL)
-		ereport(ERROR,
-(errmsg("could not load library \"%s\": error code %lu",
-		"SECUR32.DLL", GetLastError(;
-
-	_QuerySecurityContextToken = (QUERY_SECURITY_CONTEXT_TOKEN_FN) (pg_funcptr_t)
-		GetProcAddress(secur32, "QuerySecurityContextToken");
-	if (_QuerySecurityContextToken == NULL)
-	{
-		FreeLibrary(secur32);
-		ereport(ERROR,
-(errmsg_internal("could not locate QuerySecurityContextToken in secur32.dll: error code %lu",
- GetLastError(;
-	}
-
-	r = (_QuerySecurityContextToken) (sspictx, &token);
+	r = QuerySecurityContextToken(sspictx, &token);
 	if (r != SEC_E_OK)
-	{
-		FreeLibrary(secur32);
 		pg_SSPI_error(ERROR,
 	  _("could not get token from SSPI security context"), r);
-	}
-
-	FreeLibrary(secur32);
 
 	/*
 	 * No longer need the security context, everything from here on uses the
diff --git a/src/common/restricted_token.c b/src/common/restricted_token.c
index 82b74b565e..8f4b76b329 100644
--- a/src/common/restricted_token.c
+++ b/src/common/restricted_token.c
@@ -28,8 +28,6 @@
 /* internal vars */
 char	   *restrict_env;
 
-typedef BOOL (WINAPI * __CreateRestrictedToken) (HANDLE, DWORD, DWORD, PSID_AND_ATTRIBUTES, DWORD, PLUID_AND_ATTRIBUTES, DWORD, PSID_AND_ATTRIBUTES, PHANDLE);
-
 /* Windows API define missing from some versions of MingW headers */
 #ifndef  DISABLE_MAX_PRIVILEGE
 #define DISABLE_MAX_PRIVILEGE	0x1
@@ -52,36 +50,15 @@ CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo)
 	HANDLE		restrictedToken;
 	SID_IDENTIFIER_AUTHORITY NtAuthority = {SECURITY_NT_AUTHORITY};
 	SID_AND_ATTRIBUTES dropSids[2];
-	__CreateRestrictedToken _CreateRestrictedToken;
-	HANDLE		Advapi32Handle;
 
 	ZeroMemory(&si, sizeof(si));
 	si.cb = sizeof(si);
 
-	Advapi32Handle = LoadLibrary("ADVAPI32.DLL");
-	if (Advapi32Handle == NULL)
-	{
-		pg_log_error("could not load library \"%s\": error code %lu",
-	 "ADVAPI32.DLL", GetLastError());
-		return 0;
-	}
-
-	_CreateRestrictedToken = (__CreateRestrictedToken) (pg_funcptr_t) GetProcAddress(Advapi32Handle, "CreateRestrictedToken");
-
-	if (_CreateRestrictedToken == NULL)
-	{
-		pg_log_error("cannot create restricted tokens on this platform: error code %lu",
-	 GetLastError());
-		FreeLibrary(Advapi32Handle);
-		return 0;
-	}
-
 	/* Open the current token to use as a base for the restricted one */
 	if (!OpenProcessToken(GetCurrentProcess(), TOKEN_ALL_ACCESS, &origToken))
 	{
 		pg_log_error("could not open process token: error code %lu",
 	 GetLastError());
-		FreeLibrary(Advapi32Handle);
 		return 0;
 	}
 
@@ -97,22 +74,20 @@ CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo)
 		pg_log_error("could not allocate SIDs: error code %lu",
 	 GetLastError());
 		CloseHandle(origToken);
-		FreeLibrary(Advapi32Handle);
 		return 0;
 	}
 
-	b = _CreateRestrictedToken(origToken,
-			   DISABLE_MAX_PRIVILEGE,
-			   sizeof(dropSids) / sizeof(dropSids[0]),
-			   dropSids,
-			   0, NULL,
-			   0, NULL,
-			   &restrictedToken);
+	b = CreateRestrictedToken(origToken,
+			  DISABLE_MAX_PRIVILEGE,
+			  sizeof(dropSids) / sizeof(dropSids[0]),
+			  dropSids,
+			  0, NULL,
+			  0, NULL,
+			  &restrictedToken);
 
 	Fr

Re: [BUG] Logical replica crash if there was an error in a function.

2022-09-08 Thread Anton A. Melnikov

Hello!

Added a TAP test for this case.

On 30.08.2022 10:09, Anton A. Melnikov wrote:

Hello!

The patch was rebased on current master.
And here is a simplified crash reproduction:
1) On primary with 'wal_level = logical' execute:
  CREATE TABLE public.test (id int NOT NULL, val integer);
  CREATE PUBLICATION test_pub FOR TABLE test;

2) On replica replace  in the repcmd.sql  attached with primary port and 
execute it:
psql -f repcmd.sql

3) On master execute command:
INSERT INTO test VALUES ('1');

 
With best regards,


--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit d539e1e36ef7af33e1a89eaee00183739c716797
Author: Anton A. Melnikov 
Date:   Sun Aug 21 18:27:44 2022 +0300

Fix logical replica crash if there was an error in a function.

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index a9fe45e347..1381fae575 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -1007,8 +1007,9 @@ sql_function_parse_error_callback(void *arg)
  * anonymous-block handler, not only for SQL-language functions.
  * It is assumed that the syntax error position is initially relative to the
  * function body string (as passed in).  If possible, we adjust the position
- * to reference the original command text; if we can't manage that, we set
- * up an "internal query" syntax error instead.
+ * to reference the original command text; if we can't manage that or
+ * can't get the original command text when ActivePortal is not defined,
+ * we set up an "internal query" syntax error instead.
  *
  * Returns true if a syntax error was processed, false if not.
  */
@@ -1016,7 +1017,7 @@ bool
 function_parse_error_transpose(const char *prosrc)
 {
 	int			origerrposition;
-	int			newerrposition;
+	int			newerrposition = 0;
 	const char *queryText;
 
 	/*
@@ -1034,12 +1035,15 @@ function_parse_error_transpose(const char *prosrc)
 			return false;
 	}
 
-	/* We can get the original query text from the active portal (hack...) */
-	Assert(ActivePortal && ActivePortal->status == PORTAL_ACTIVE);
-	queryText = ActivePortal->sourceText;
+	if (ActivePortal)
+	{
+		/* We can get the original query text from the active portal (hack...) */
+		Assert(ActivePortal->status == PORTAL_ACTIVE);
+		queryText = ActivePortal->sourceText;
 
-	/* Try to locate the prosrc in the original text */
-	newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+		/* Try to locate the prosrc in the original text */
+		newerrposition = match_prosrc_to_query(prosrc, queryText, origerrposition);
+	}
 
 	if (newerrposition > 0)
 	{
@@ -1052,7 +1056,8 @@ function_parse_error_transpose(const char *prosrc)
 	else
 	{
 		/*
-		 * If unsuccessful, convert the position to an internal position
+		 * If unsuccessful or ActivePortal not defined and original command
+		 * text is unreachable, convert the position to an internal position
 		 * marker and give the function text as the internal query.
 		 */
 		errposition(0);
diff --git a/src/test/recovery/t/034_logical_replica_on_error.pl b/src/test/recovery/t/034_logical_replica_on_error.pl
new file mode 100644
index 00..380ad74590
--- /dev/null
+++ b/src/test/recovery/t/034_logical_replica_on_error.pl
@@ -0,0 +1,105 @@
+# Copyright (c) 2022, Postgres Professional
+
+# There was an assertion if function text contains an error. See PGPRO-7025
+# This file has a prefix (120_) to prevent prefix collision with
+# upstream test files.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 2;
+
+# Initialize primary node
+my $node_primary = PostgreSQL::Test::Cluster->new('primary');
+$node_primary->init(allows_streaming => 1);
+$node_primary->append_conf(
+	'postgresql.conf', qq(
+wal_level = logical
+));
+$node_primary->start;
+
+$node_primary->safe_psql('postgres',
+	'CREATE TABLE public.test (id int NOT NULL, val integer);');
+
+$node_primary->safe_psql('postgres',
+	'CREATE PUBLICATION test_pub FOR TABLE test;');
+
+# Initialize logical replica node
+my $node_replica = PostgreSQL::Test::Cluster->new('replica');
+$node_replica->init(has_streaming => 1,
+	has_restoring => 1);
+$node_replica->start;
+
+$node_replica->safe_psql('postgres',
+	'CREATE TABLE test (id int NOT NULL, val integer);');
+
+$node_replica->safe_psql('postgres', q{
+	create or replace procedure rebuild_test(
+	) as
+	$body$
+	declare
+		l_code  text:= E'create or replace function public.test_selector(
+	) returns setof public.test as
+	\$body\$
+		select * from  error_name
+	\$body\$ language sql;';
+	begin
+		execute l_code;
+		perform public.test_selector();
+	end
+	$body$ language plpgsql;
+});
+
+$node_replica->safe_psql('postgres', '
+	create or replace function test_trg()
+	returns trigger as
+	$body$
+		declare
+		begin
+			call public.rebuild_test();
+			return NULL;
+		end
+	$body$ language plpgsql;
+');
+
+$node_replica->safe_psql('postgres',
+	'c

Re: Doc fix and adjustment for MERGE command

2022-09-08 Thread Vik Fearing

On 9/7/22 22:51, Vitaly Burovoy wrote:

Hello hackers!

Reading the docs for the MERGE statement I've found a little error: a 
semicolon in the middle of a statement and the absence of a semicolon at 
the end of it.


Key words in subqueries are written in uppercase everywhere in the docs 
but not in an example for MERGE. I think it should be adjusted too.



Also aliases, table and column names are written in lowercase 
(snake_case) almost all over the docs. I did not dare to fix the examples in 
the same patch (maybe that style was intentional), but I guess that the style 
of the first two examples should not differ from the third one and from 
other examples in the docs.



I agree with both of these patches (especially the semicolon part which 
is not subjective).

--
Vik Fearing




Re: small windows psqlrc re-wording

2022-09-08 Thread Julien Rouhaud
On Wed, Sep 07, 2022 at 01:10:11PM -0400, Tom Lane wrote:
> After looking at the text more carefully, I thought it could use
> a deal more help than Robert has given it.  I propose the attached.

It looks good to me.

- for example ~/.psqlrc-9.2 or
- ~/.psqlrc-9.2.5.  The most specific
+ for example ~/.psqlrc-15 or
+ ~/.psqlrc-15.2.  The most specific

This bit is a bit saddening.  It's probably good to switch to the new two-digit
versioning, but we're not trying to maintain it any further, right?

That being said, should the patch mention versions that at least currently
exist, like -14 and -14.5?




Re: [RFC] building postgres with meson - v12

2022-09-08 Thread Alvaro Herrera
On 2022-Sep-07, Peter Eisentraut wrote:

> A possible drawback is that the intermediate postgres-full.xml file is
> >10MB, but I guess we're past the point where we are worrying about that
> kind of thing.

I think we are, but maybe mark it .PRECIOUS?  IIUC that would prevent it
from being removed if there's a problem in the other recipes.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"¿Cómo puedes confiar en algo que pagas y que no ves,
y no confiar en algo que te dan y te lo muestran?" (Germán Poo)




Re: New docs chapter on Transaction Management and related changes

2022-09-08 Thread Alvaro Herrera
On 2022-Sep-06, Simon Riggs wrote:

> On Tue, 6 Sept 2022 at 17:19, Erik Rijkers  wrote:
> >
> > Op 06-09-2022 om 17:16 schreef Simon Riggs:
> > > New chapter on transaction management, plus a few related changes.
> > >
> > > Markup and links are not polished yet, so please comment initially on
> > > the topics, descriptions and wording.

I think the concept of XID epoch should be described also, in or near
the paragraph that talks about wrapping around and switching between
int32 and int64.  Right now it's a bit unclear why/how that works.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"




Re: [PATCH] Tab completion for SET COMPRESSION

2022-09-08 Thread Shinya Kato

On 2022-09-06 20:57, Aleksander Alekseev wrote:

Hi Shinya,


A minor modification has been made so that the composite type is also
completed after "ALTER TABLE <name> OF".


LGTM. Here is v3 created with `git format-patch`. Unlike v2 it also
includes the commit message.


Thanks! I marked it as ready for committer.

--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [BUG] wrong FK constraint name when colliding name on ATTACH

2022-09-08 Thread Jehan-Guillaume de Rorthais
Hi there,

I believe this very small bug and its fix are really trivial and could be pushed
out of the way quite quickly. It's just about a bad constraint name, fixed by
moving one assignment after the next one. This could easily be fixed for the next
round of releases.

Well, I hope I'm not wrong :)

Regards,

On Thu, 1 Sep 2022 18:41:56 +0200
Jehan-Guillaume de Rorthais  wrote:

> While studying and hacking on the parenting constraint issue, I found an
> incoherent piece of code leading to badly chosen fk name. If a constraint
> name collision is detected, while choosing a new name for the constraint,
> the code uses fkconstraint->fk_attrs which is not yet populated:
> 
>   /* No dice.  Set up to create our own constraint */
>   fkconstraint = makeNode(Constraint);
>   if (ConstraintNameIsUsed(CONSTRAINT_RELATION,
>RelationGetRelid(partRel),
>NameStr(constrForm->conname)))
>   fkconstraint->conname =
>   ChooseConstraintName(RelationGetRelationName(partRel),
>ChooseForeignKeyConstraintNameAddition(
>   fkconstraint->fk_attrs),  // <= WOO000OPS
>"fkey",
>RelationGetNamespace(partRel), NIL);
>   else
>   fkconstraint->conname = pstrdup(NameStr(constrForm->conname));
>   fkconstraint->fk_upd_action = constrForm->confupdtype;
>   fkconstraint->fk_del_action = constrForm->confdeltype;
>   fkconstraint->deferrable = constrForm->condeferrable;
>   fkconstraint->initdeferred = constrForm->condeferred;
>   fkconstraint->fk_matchtype = constrForm->confmatchtype;
>   for (int i = 0; i < numfks; i++)
>   {
>   Form_pg_attribute att;
>   
>   att = TupleDescAttr(RelationGetDescr(partRel),
>   mapped_conkey[i] - 1);
>   fkconstraint->fk_attrs = lappend(fkconstraint->fk_attrs,  // <= POPULATING
>makeString(NameStr(att->attname)));
>   }
> 
> The following SQL script showcase the bad constraint name:
> 
>   DROP TABLE IF EXISTS parent, child1;
>   
>   CREATE TABLE parent (
>   id bigint NOT NULL default 1,
>   no_part smallint NOT NULL,
>   id_abc bigint,
>   CONSTRAINT dummy_constr FOREIGN KEY (id_abc, no_part)
>   REFERENCES parent(id, no_part) ON UPDATE RESTRICT ON DELETE
> RESTRICT, PRIMARY KEY (id, no_part)
>   )
>   PARTITION BY LIST (no_part);
>   
>   CREATE TABLE child1 (
>   id bigint NOT NULL default 1,
>   no_part smallint NOT NULL,
>   id_abc bigint,
>   PRIMARY KEY (id, no_part),
>   CONSTRAINT dummy_constr CHECK ((no_part = 1))
>   );
> 
>   ALTER TABLE parent ATTACH PARTITION child1 FOR VALUES IN ('1');
> 
>   SELECT conname
>   FROM pg_constraint
>   WHERE conrelid = 'child1'::regclass
> AND contype = 'f';
> 
>   DROP TABLE
>   CREATE TABLE
>   CREATE TABLE
>   ALTER TABLE
>   
>  conname
>   --
>child1__fkey
>   (1 row)
> 
> The resulting constraint name "child1__fkey" is missing the attributes name
> the original code wanted to add. The expected name is
> "child1_id_abc_no_part_fkey".
> 
> Find in attachment a simple fix, moving the name assignation after the
> FK attributes are populated.
> 
> Regards,





Re: [RFC] building postgres with meson - v11

2022-09-08 Thread samay sharma
Hi,

On Tue, Sep 6, 2022 at 9:48 PM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 02.09.22 19:16, samay sharma wrote:
> > Another thing I'd like input on. A common question I've heard from
> > people who've tried out the docs is "How do we do the equivalent of X in
> > make with meson?". As meson will be new for a bunch of people who are
> > fluent with make, I won't be surprised if this is a common ask. To
> > address that, I was planning to add a page to specify the key things one
> > needs to keep in mind while "migrating" from make to meson and having a
> > translation table of commonly used commands.
> >
> > I was planning to add it in the meson section, but if we go ahead with
> > the structure proposed above, it doesn't fit it into one as cleanly.
> > Maybe, it still goes in the meson section? Thoughts?
>
> This could go into the wiki.
>
> For example, we have
> , which was added
> during the CVS->Git transition.
>

That's a good idea. I'll add a page to the wiki about this topic and share
it on the list for review.


>
> This avoids that we make the PostgreSQL documentation a substitute
> manual for a third-party product.
>

Regards,
Samay


Re: Schema variables - new implementation for Postgres 15

2022-09-08 Thread Julien Rouhaud
Hi,

On Tue, Sep 06, 2022 at 06:23:12PM +0800, Julien Rouhaud wrote:
> On Tue, Sep 06, 2022 at 08:43:59AM +0200, Pavel Stehule wrote:
> > Hi
> >
> > After talking with Julien I removed the "debug" fields name and nsname from
> > the SVariable structure. When possible, it is better to read these fields
> > from the catalog, without the risk of them becoming stale or the need to
> > refresh them. In other cases we display only the oid of the variable instead
> > of the name and nsname (it is used just for debug purposes).
>
> Thanks!  I'm just adding back the forgotten Cc list.

About the last change:

+static void
+pg_variable_cache_callback(Datum arg, int cacheid, uint32 hashvalue)
+{
[...]
+   elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) should be rechecked (forced by sinval)",
+        get_namespace_name(get_session_variable_namespace(svar->varid)),
+        get_session_variable_name(svar->varid),
+        svar->varid);

There's no guarantee that the variable still exists in cache (for variables
dropped in the current transaction), or even that the callback is called while
in a transaction state, so we should only display the oid here.

FTR just to be sure I ran all the new regression tests (with this fix) with CCA
and log_min_messages = DEBUG1 and it didn't hit any problem, so it doesn't seem
that there's any other issue hidden somewhere.


Other than that I don't see any remaining problems in session_variable.c.  I
still have a few nitpicking comments though:

+static SVariable
+prepare_variable_for_reading(Oid varid)
+{
[...]
+   /* Store result before releasing Executor memory */
+   set_session_variable(svar, value, isnull, true);
+
+   MemoryContextSwitchTo(oldcxt);
+
+   FreeExecutorState(estate);

The comment and code are a bit misleading, as it's not immediately obvious that
set_session_variable() doesn't rely on the current memory context for
allocations.  Simply moving the MemoryContextSwitchTo() before the
set_session_variable() would be better.
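
That is, a hedged sketch of the suggested ordering (simplified):

    /* Leave the executor's memory context first ... */
    MemoryContextSwitchTo(oldcxt);

    /* ... so it is obvious that set_session_variable() does not allocate
     * the stored value in executor memory */
    set_session_variable(svar, value, isnull, true);

    FreeExecutorState(estate);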

+typedef struct SVariableData
+{
[...]
+   boolis_domain;
+   Oid basetypeid;
+   void   *domain_check_extra;
+   LocalTransactionId domain_check_extra_lxid;

AFAICS basetypeid isn't needed anymore.


+ /* Both lists hold fields of SVariableXActActionItem type */
+ static List *xact_on_commit_drop_actions = NIL;
+ static List *xact_on_commit_reset_actions = NIL;

Is it possible to merge both in a single list?  I don't think that there's much
to gain trying to separate those.  They shouldn't contain a lot of entries, and
they're usually scanned at the same time anyway.

This is especially important as one of the tricky parts of this patchset is
maintaining those lists across subtransactions, and since both have the same
heuristics all the related code is duplicated.
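
For instance, a hedged sketch of the merged form (field and type names
illustrative):

    typedef enum SVariableXActAction
    {
        SVAR_ON_COMMIT_DROP,
        SVAR_ON_COMMIT_RESET
    } SVariableXActAction;

    typedef struct SVariableXActActionItem
    {
        Oid                 varid;
        SVariableXActAction action;     /* which cleanup to perform */
        /* ... subtransaction bookkeeping ... */
    } SVariableXActActionItem;

    /* One list instead of two; entries carry their action kind */
    static List *xact_on_commit_actions = NIL;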

I see that in AtPreEOXact_SessionVariable_on_xact_actions() both lists are
handled interleaved with the xact_recheck_varids, but I don't see any reason
why we couldn't process both action lists first and then process the rechecks.
I did a quick test and don't see any failure in the regression tests.


+void
+RemoveSessionVariable(Oid varid)
+{

It looks like a layering violation to have (part of) the code for CREATE
VARIABLE in pg_variable.[ch] and the code for DROP VARIABLE in
session_variable.[ch].

I think it was done mostly because the initial sync_sessionvars_all() was
responsible for avoiding memory cleanup for variables dropped in the
current transaction, but that's not a requirement anymore.  So I don't see
anything preventing us from moving RemoveSessionVariable() into pg_variable, and
exporting some function in session_variable to do the additional work of properly
maintaining the hash table if needed (with that knowledge held in
session_variable, not in pg_variable).  You should only need to pass the oid of
the variable and the eoxaction.

Similarly, why not move DefineSessionVariable() into pg_variable and expose some
API in session_variable to register the needed SVAR_ON_COMMIT_DROP action?

Also, while not a problem I don't think that the CommandCounterIncrement() is
necessary in DefineSessionVariable().  CREATE VARIABLE is a single operation
and you can't have anything else running in the same ProcessUtility() call.
And since cd3e27464cc you have the guarantee that a CommandCounterIncrement()
will happen at the end of the utility command processing.

While at it, maybe it would be good to add some extra tests in
src/test/modules/test_extensions.  I'm thinking of a version 1.0 that creates a
variable and initializes the value (and an extra step after creating the
extension to make sure that the value is really set), and an upgrade to 2.0
that creates a temp variable with ON COMMIT DROP, which has to fail due to the
dependency on the extension.




Re: [RFC] building postgres with meson - v12

2022-09-08 Thread John Naylor
On Wed, Sep 7, 2022 at 3:36 PM Alvaro Herrera  wrote:
>
> On 2022-Sep-06, John Naylor wrote:
>
> > Note that the indentation hasn't changed. My thought there: perltidy
> > will be run again next year, at which time it will be part of a listed
> > whitespace-only commit. Any objections, since that could confuse
> > someone before then?
>
> I think a good plan is to commit the fix without tidy, then commit the
> tidy separately, then add the latter commit to .git-blame-ignore-revs.
> That avoids leaving the code untidy for a year.

Okay, done that way. I also made sure we got the same info for error
reporting. It's not identical, but arguably better, going from:

Bareword found where operator expected at (eval 4480) line 3, near "'btree' xxx"
(Missing operator before xxx?)
../../../src/include/catalog/pg_amop.dat: error parsing line 20:

to:

Bareword found where operator expected at (eval 12) line 20, near "'btree' xxx"
(Missing operator before xxx?)
error parsing ../../../src/include/catalog/pg_amop.dat

-- 
John Naylor
EDB: http://www.enterprisedb.com