A doubt about a newly added errdetail

2022-09-26 Thread Kyotaro Horiguchi
I saw the following message recently added to publicationcmds.c.

(ERROR: cannot use publication column list for relation "%s.%s")
> DETAIL: Column list cannot be specified if any schema is part of the 
> publication or specified in the list.

As I read it, "the list" at the end syntactically refers back to "Column
list", but that is surely not the intent; it could instead be read as "the
list following TABLES IN", but that doesn't seem reasonable either.

If I am right, it should perhaps be something like the following:

+ Column list cannot be specified if any schema is part of the publication or 
specified in the command.
+ Column list cannot be specified if any schema is part of the publication or 
specified together.

What do you think about this?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_rewind WAL segments deletion pitfall

2022-09-26 Thread Alexander Kukushkin
Hello Kyotaro,

any further thoughts on it?

Regards,
--
Alexander Kukushkin


Re: Make mesage at end-of-recovery less scary.

2022-09-26 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 23:21:50 -0500, Justin Pryzby  wrote 
in 
> @cfbot: rebased over adb466150, which did the same thing as one of the
> hunks in xlogreader.c.

Oops. Thanks!  This then ran into a further conflict (parameter-name
harmonization), so I rebased again and removed the extra blank line you
pointed out.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From b70a33bd941e9845106bea502db30d32e0138251 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Jul 2022 11:51:45 +0900
Subject: [PATCH v21] Make End-Of-Recovery error less scary

When any type of recovery ends, we currently see a somewhat scary
message like "invalid record length" that suggests something serious
happened.  Actually, when recovery meets a record with length = 0, that
usually just means it has finished applying all available WAL records.

Make this message less scary: "reached end of WAL".  In exchange, raise
the error level of other kinds of WAL-read failure to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 136 +-
 src/backend/access/transam/xlogrecovery.c |  94 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 298 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 4d6c34e0fc..b03eeb1487 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1105,6 
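
Based on the hunk above that replaces the inline length check, the new
ValidXLogRecordLength() helper plausibly looks something like the
following sketch, reconstructed from the removed code rather than taken
from the patch itself (the actual body may differ):

    static bool
    ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
                          XLogRecord *record)
    {
        /* Reject records shorter than a bare record header. */
        if (record->xl_tot_len < SizeOfXLogRecord)
        {
            report_invalid_record(state,
                                  "invalid record length at %X/%X: wanted %u, got %u",
                                  LSN_FORMAT_ARGS(RecPtr),
                                  (uint32) SizeOfXLogRecord,
                                  record->xl_tot_len);
            return false;
        }
        return true;
    }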

RE: [RFC] building postgres with meson - v13

2022-09-26 Thread wangw.f...@fujitsu.com
On Mon, Sep 26, 2022 at 14:47 PM Andres Freund  wrote:
> Hi,
> 
> On 2022-09-26 06:24:42 +, wangw.f...@fujitsu.com wrote:
> > I tried to use meson and ninja and they are really efficient.
> > But when I tried to specify "c_args", it did not take effect.
> 
> They should take effect, but won't be shown in the summary section
> currently. That currently only shows the flags chosen by the configure step,
> rather than user specified ones.
> 
> 
> > After I made the below modifications, the specified "c_args" took effect.
> > ```
> > @@ -2439,6 +2439,10 @@ endif
> >
> >  # Set up compiler / linker arguments to be used everywhere, individual
> targets
> >  # can add further args directly, or indirectly via dependencies
> > +
> > +tmp_c_args = get_option('c_args')
> > +cflags += tmp_c_args
> > +
> >  add_project_arguments(cflags, language: ['c'])
> >  add_project_arguments(cppflags, language: ['c'])
> >  add_project_arguments(cflags_warn, language: ['c'])
> > ```
> 
> That'll likely end up with the same cflags added multiple times. You should
> see them when building with ninja -v.

Thanks for sharing the information.
I saw the user-specified CFLAGS when building with `ninja -v`.

But after installing PG with the command `ninja -v install`, pg_config does
not show the user-specified CFLAGS. Should we print this information there?

> How about adding c_args to the summary, in a separate line? I think that'd
> clarify what's happening?

Yes, I think it might be better.

Regards,
Wang wei




Re: A doubt about a newly added errdetail

2022-09-26 Thread Alvaro Herrera
On 2022-Sep-26, Kyotaro Horiguchi wrote:

> I saw the following message recently added to publicationcmds.c.
> 
> (ERROR: cannot use publication column list for relation "%s.%s")
> > DETAIL: Column list cannot be specified if any schema is part of the 
> > publication or specified in the list.
> 
> As I read it, "the list" at the end syntactically refers back to "Column
> list", but that is surely not the intent; it could instead be read as "the
> list following TABLES IN", but that doesn't seem reasonable either.
> 
> If I am right, it should perhaps be something like the following:
> 
> + Column list cannot be specified if any schema is part of the publication or 
> specified in the command.
> + Column list cannot be specified if any schema is part of the publication or 
> specified together.

I propose

ERROR:  cannot use column list for relation "%s.%s" in publication "%s"
DETAIL:  Column lists cannot be specified in publications containing FOR TABLES 
IN SCHEMA elements.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




DROP OWNED BY is broken on master branch.

2022-09-26 Thread Rushabh Lathia
Hi All,

Consider the below test:

postgres@53130=#create role test WITH login createdb;
CREATE ROLE
postgres@53130=#\c - test
You are now connected to database "postgres" as user "test".
postgres@53150=#create database test;
CREATE DATABASE
postgres@53150=#\c - rushabh
You are now connected to database "postgres" as user "rushabh".
postgres@53162=#
postgres@53162=#
-- This was working before the below mentioned commit.
postgres@53162=#drop owned by test;
ERROR:  global objects cannot be deleted by doDeletion

Commit 6566133c5f52771198aca07ed18f84519fac1be7 ensures that
pg_auth_members.grantor is always valid.  That commit changed the
shdepDropOwned() function and combined the SHARED_DEPENDENCY_ACL
and SHARED_DEPENDENCY_OWNER cases.  In the process, it removed the
local-object condition from the owner-dependency case.

case SHARED_DEPENDENCY_OWNER:
-   /* If a local object, save it for deletion below */
-   if (sdepForm->dbid == MyDatabaseId)
+   /* Save it for deletion below */

The case ends up with the above error because of that removed condition.

Please find the attached patch which fixes the case.

Thanks,
Rushabh Lathia
www.EnterpriseDB.com
diff --git a/src/backend/catalog/pg_shdepend.c b/src/backend/catalog/pg_shdepend.c
index f2f227f..6134fe3 100644
--- a/src/backend/catalog/pg_shdepend.c
+++ b/src/backend/catalog/pg_shdepend.c
@@ -1411,8 +1411,6 @@ shdepDropOwned(List *roleids, DropBehavior behavior)
 sdepForm->objid);
 		break;
 	}
-	/* FALLTHROUGH */
-case SHARED_DEPENDENCY_OWNER:
 	/* Save it for deletion below */
 	obj.classId = sdepForm->classid;
 	obj.objectId = sdepForm->objid;
@@ -1426,6 +1424,24 @@ shdepDropOwned(List *roleids, DropBehavior behavior)
 	}
 	add_exact_object_address(&obj, deleteobjs);
 	break;
+case SHARED_DEPENDENCY_OWNER:
+	/* If a local object, save it for deletion below */
+	if (sdepForm->dbid == MyDatabaseId)
+	{
+		/* Save it for deletion below */
+		obj.classId = sdepForm->classid;
+		obj.objectId = sdepForm->objid;
+		obj.objectSubId = sdepForm->objsubid;
+		/* as above */
+		AcquireDeletionLock(&obj, 0);
+		if (!systable_recheck_tuple(scan, tuple))
+		{
+			ReleaseDeletionLock(&obj);
+			break;
+		}
+		add_exact_object_address(&obj, deleteobjs);
+	}
+	break;
 			}
 		}
 


Re: [small patch] Change datatype of ParallelMessagePending from "volatile bool" to "volatile sig_atomic_t"

2022-09-26 Thread Michael Paquier
On Mon, Sep 26, 2022 at 06:57:28AM +, kuroda.hay...@fujitsu.com wrote:
> While reviewing [1], Amit and I noticed that the flag ParallelMessagePending
> is defined as "volatile bool", while other flags set by signal handlers are
> defined as "volatile sig_atomic_t".
> 
> What do you think?

You are right.  bool is not usually a problem in a signal handler, but
sig_atomic_t is the type we ought to use.  I'll go adjust that.
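
For reference, plain C guarantees only that an object of type volatile
sig_atomic_t can be written safely from a signal handler.  A minimal,
generic sketch of the pattern (names are illustrative, not PostgreSQL's
actual code):

    #include <signal.h>

    /* The one object type the C standard guarantees can be written
     * safely from an asynchronous signal handler. */
    static volatile sig_atomic_t message_pending = 0;

    static void
    handle_signal(int signo)
    {
        message_pending = 1;    /* async-signal-safe store */
    }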
--
Michael




Re: DROP OWNED BY is broken on master branch.

2022-09-26 Thread Michael Paquier
On Mon, Sep 26, 2022 at 01:13:53PM +0530, Rushabh Lathia wrote:
> Please find the attached patch which fixes the case.

Could it be possible to stress this stuff in the regression tests?
There is a gap here.  (I have not looked at what you are proposing.)
--
Michael




Re: A doubt about a newly added errdetail

2022-09-26 Thread Amit Kapila
On Mon, Sep 26, 2022 at 1:10 PM Alvaro Herrera  wrote:
>
> On 2022-Sep-26, Kyotaro Horiguchi wrote:
>
> > I saw the following message recently added to publicationcmds.c.
> >
> > (ERROR: cannot use publication column list for relation "%s.%s")
> > > DETAIL: Column list cannot be specified if any schema is part of the 
> > > publication or specified in the list.
> >
> > As I read it, "the list" at the end syntactically refers back to "Column
> > list", but that is surely not the intent; it could instead be read as "the
> > list following TABLES IN", but that doesn't seem reasonable either.
> >
> > If I am right, it should perhaps be something like the following:
> >
> > + Column list cannot be specified if any schema is part of the publication 
> > or specified in the command.
> > + Column list cannot be specified if any schema is part of the publication 
> > or specified together.
>
> I propose
>
> ERROR:  cannot use column list for relation "%s.%s" in publication "%s"
> DETAIL:  Column lists cannot be specified in publications containing FOR 
> TABLES IN SCHEMA elements.
>

This looks mostly good to me. BTW, is it a good idea to add ".. in
publication "%s"" to the error message, given that this can happen even
during CREATE PUBLICATION? If so, I think we can change the nearby
message below to include the same:

if (!pubviaroot &&
pri->relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("cannot use publication column list for relation \"%s\"",

I think even if we don't include the publication name, there won't be
any confusion because there won't be multiple publications in the
command.

-- 
With Regards,
Amit Kapila.




Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

2022-09-26 Thread Kyotaro Horiguchi
At Mon, 26 Sep 2022 11:57:58 +0530, Bharath Rupireddy 
 wrote in 
> On Mon, Sep 26, 2022 at 8:13 AM Michael Paquier  wrote:
> >
> > What I had at hand seemed fine on a second look, so applied after
> > tweaking a couple of comments.  One thing that I have been wondering
> > after-the-fact is whether it would be cleaner to return the contents
> > of the backup history file or backup_label as a char rather than a
> > StringInfo?  This simplifies a bit what the callers of
> > build_backup_content() need to do.
> 
> +1 because callers don't use returned StringInfo structure outside of
> build_backup_content(). The patch looks good to me. I think it will be
> good to add a note about the caller freeing up the retired string's
> memory [1], just in case.

Doesn't the following (from you :) work?

+ * Returns the result generated as a palloc'd string.

This suggests that no pfree is needed if the caller properly destroys
the context, and that a pfree is needed otherwise. In this case, the
active memory contexts are "ExprContext" and "Replication command
context", so I think we do not actually need to pfree it; but I don't
mean we shouldn't do that in this patch (those contexts are somewhat
remote from what the function does, and the pfree hardly matters here).

> [1]
>  * Returns the result generated as a palloc'd string. It is the caller's
>  * responsibility to free the returned string's memory.
>  */
> char *
> build_backup_content(BackupState *state, bool ishistoryfile)
> {
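
To make the contract concrete, here is a hypothetical caller illustrating
the palloc'd-result question (illustration only, not code from the patch):

    char *content = build_backup_content(state, false);

    /* ... write the contents to the backup_label or history file ... */

    /*
     * An explicit pfree() is strictly necessary only if the current
     * memory context outlives this use; otherwise resetting or deleting
     * the context reclaims the string anyway.
     */
    pfree(content);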

+1.  A nitpick.

-   if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
+   if (state->started_in_recovery == true &&
+   backup_stopped_in_recovery == false)

Using == with Boolean constants may not be great style?  Plain
"state->started_in_recovery && !backup_stopped_in_recovery" would read
better.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread John Naylor
On Wed, Sep 21, 2022 at 7:11 AM Andres Freund  wrote:
>
> I added installation instructions for meson for a bunch of platforms, but

A couple more things for the wiki:

1) /opt/homebrew/ seems to be an "Apple silicon" path? Either way it
doesn't exist on this machine. I was able to get a working build with

/usr/local/Homebrew/Library/Homebrew/os/mac/pkgconfig

(My homebrew install doesn't seem to have anything relevant for
extra_include_dirs or extra_lib_dirs.)

2) Also, "ninja -v install" has the same line count as "ninja install" --
are there versions that do something different?

--
John Naylor
EDB: http://www.enterprisedb.com


Re: pg_stat_have_stats() returns true for dropped indexes (or for index creation transaction rolled back)

2022-09-26 Thread Drouvot, Bertrand

Hi,

On 9/23/22 10:45 PM, Andres Freund wrote:
> Hi,
>
> On 2022-09-01 08:40:54 +0200, Drouvot, Bertrand wrote:
>> Thanks for the suggestion, I'm coming up with this proposal in v4 attached:
>
> I pushed the bugfix / related test portion to 15, master. Thanks!

Thanks!

> I left the replication stuff out - it seemed somewhat independent.

Yeah.

> Probably will just push that to master, unless somebody thinks it should be
> in both?

Sounds good to me, as this is not a bug and it seems unlikely that an
issue in this area would be introduced later on 15 without also being
introduced on master.


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




RE: Perform streaming logical transactions by background workers and parallel apply

2022-09-26 Thread kuroda.hay...@fujitsu.com
Dear Wang, 

Thanks for updating the patch! ...but cfbot says that it cannot be accepted [1].
I thought the header  should be included, like miscadmin.h.

[1]: https://cirrus-ci.com/task/5909508684775424

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: fix stats_fetch_consistency value in postgresql.conf.sample

2022-09-26 Thread Kyotaro Horiguchi
At Sat, 17 Sep 2022 23:53:07 -0500, Justin Pryzby  wrote 
in 
> This is an alternative implementation, which still relies on adding the
> GUC_DYNAMIC, flag but doesn't require adding a new, sql-accessible
> function to convert the GUC to a pretty/human display value.

Thanks!

I'm not sure that shared_buffers is GUC_DYNAMIC_DEFAULT, nor that we need
to read postgresql.conf.sample using SQL, but +1 for the direction.

+   AND NOT (sc.sample_value ~ '^0' AND current_setting(name) ~ '^0') -- 
zeros may be written differently
+   AND NOT (sc.sample_value='60s' AND current_setting(name) = '1min') -- 
two ways to write 1min

Mmm. Couldn't we get away from those explicit exceptions?


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: A doubt about a newly added errdetail

2022-09-26 Thread Alvaro Herrera
On 2022-Sep-26, Amit Kapila wrote:

> On Mon, Sep 26, 2022 at 1:10 PM Alvaro Herrera  
> wrote:

> > ERROR:  cannot use column list for relation "%s.%s" in publication "%s"
> > DETAIL:  Column lists cannot be specified in publications containing FOR 
> > TABLES IN SCHEMA elements.
> 
> This looks mostly good to me. BTW, is it a good idea to add ".. in
> publication "%s"" to the error message as this can happen even during
> create publication?

Hmm, I don't see why not.  The publication is new, sure, but it would
already have a name, so there's no possible confusion as to what this
means.

(My main change was to take the word "publication" out of the phrase
"publication column list", because that seemed a bit strange; it isn't
the publication that has a column list, but the relation.)


> If so, I think we can change the nearby message as below to include
> the same:
> 
> if (!pubviaroot &&
> pri->relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("cannot use publication column list for relation \"%s\"",

WFM.

> I think even if we don't include the publication name, there won't be
> any confusion because there won't be multiple publications in the
> command.

True, and the whole error report is likely to contain a STATEMENT line.

However, you could argue that specifying the publication in errmsg is
redundant.  But then, the "for relation %s.%s" is also redundant (since
that is *also* in the STATEMENT line), and could even be misleading: if
you have a command that specifies *two* relations with column lists, why
specify only the first one you find?  The user could be angry that they
remove the column list from that relation and retry, and then receive
the exact same error for the next relation with a list that they didn't
edit.  But I think people don't work that way.  So if you wanted to be
super precise you would also omit the relation name unless you scanned
the whole list and verified that only this relation is specifying a
column list; but whom does that help?

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Alvaro Herrera
On 2022-Sep-25, Andres Freund wrote:

> From 3eb0ca196084da314d94d1e51c7b775012a4773c Mon Sep 17 00:00:00 2001
> From: Andres Freund 
> Date: Wed, 21 Sep 2022 11:03:07 -0700
> Subject: [PATCH v16 04/16] meson: Add windows resource files

> diff --git a/src/backend/jit/llvm/meson.build 
> b/src/backend/jit/llvm/meson.build
> index de2e624ab58..5fb63768358 100644
> --- a/src/backend/jit/llvm/meson.build
> +++ b/src/backend/jit/llvm/meson.build
> @@ -20,6 +20,12 @@ llvmjit_sources += files(
>'llvmjit_expr.c',
>  )
>  
> +if host_system == 'windows'
> +  llvmjit_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
> +'--NAME', 'llvmjit',
> +'--FILEDESC', 'llvmjit - JIT using LLVM',])
> +endif

This is tediously imperative.  Isn't there a more declarative way to
have it?

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)




Re: A doubt about a newly added errdetail

2022-09-26 Thread Amit Kapila
On Mon, Sep 26, 2022 at 2:03 PM Alvaro Herrera  wrote:
>
> On 2022-Sep-26, Amit Kapila wrote:
>
> > On Mon, Sep 26, 2022 at 1:10 PM Alvaro Herrera  
> > wrote:
>
> > > ERROR:  cannot use column list for relation "%s.%s" in publication "%s"
> > > DETAIL:  Column lists cannot be specified in publications containing FOR 
> > > TABLES IN SCHEMA elements.
> >
> > This looks mostly good to me. BTW, is it a good idea to add ".. in
> > publication "%s"" to the error message as this can happen even during
> > create publication?
>
> Hmm, I don't see why not.  The publication is new, sure, but it would
> already have a name, so there's no possible confusion as to what this
> means.
>
> (My main change was to take the word "publication" out of the phrase
> "publication column list", because that seemed a bit strange; it isn't
> the publication that has a column list, but the relation.)
>

Okay, that makes sense.

>
> > If so, I think we can change the nearby message as below to include
> > the same:
> >
> > if (!pubviaroot &&
> > pri->relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> > ereport(ERROR,
> > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > errmsg("cannot use publication column list for relation \"%s\"",
>
> WFM.
>

Okay.

-- 
With Regards,
Amit Kapila.




RE: A doubt about a newly added errdetail

2022-09-26 Thread houzj.f...@fujitsu.com
On Monday, September 26, 2022 4:57 PM Amit Kapila 
> 
> On Mon, Sep 26, 2022 at 2:03 PM Alvaro Herrera 
> wrote:
> >
> > On 2022-Sep-26, Amit Kapila wrote:
> >
> > > On Mon, Sep 26, 2022 at 1:10 PM Alvaro Herrera
>  wrote:
> >
> > > > ERROR:  cannot use column list for relation "%s.%s" in publication "%s"
> > > > DETAIL:  Column lists cannot be specified in publications containing FOR
> TABLES IN SCHEMA elements.
> > >
> > > This looks mostly good to me. BTW, is it a good idea to add ".. in
> > > publication "%s"" to the error message as this can happen even
> > > during create publication?
> >
> > Hmm, I don't see why not.  The publication is new, sure, but it would
> > already have a name, so there's no possible confusion as to what this
> > means.
> >
> > (My main change was to take the word "publication" out of the phrase
> > "publication column list", because that seemed a bit strange; it isn't
> > the publication that has a column list, but the relation.)
> >
> 
> Okay, that makes sense.

+1

> >
> > > If so, I think we can change the nearby message as below to include
> > > the same:
> > >
> > > if (!pubviaroot &&
> > > pri->relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> > > ereport(ERROR,
> > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > errmsg("cannot use publication column list for relation \"%s\"",
> >
> > WFM.
> >
> 
> Okay.

While reviewing this part, I noticed an unused parameter (queryString) of the
function CheckPubRelationColumnList. I feel we can remove that as well while
we are at it. I plan to post a patch to fix the error message and the
parameter soon.

Best regards,
Hou zj


Re: Differentiate MERGE queries with different structures

2022-09-26 Thread Julien Rouhaud
Hi,

On Mon, Sep 26, 2022 at 03:12:46PM +0900, bt22nakamorit wrote:
>
> The pg_stat_statements module distinguishes queries with different
> structures, but some visibly different MERGE queries were combined into one
> pg_stat_statements entry.
> For example,
> MERGE INTO test1 USING test2 ON test1.id = test2.id WHEN MATCHED THEN UPDATE
> SET var = 1;
> MERGE INTO test1 USING test2 ON test1.id = test2.id WHEN MATCHED THEN
> DELETE;
> These two queries have different command types after WHEN (UPDATE and
> DELETE), but they were regarded as one entry in pg_stat_statements.
> I think that they should be tracked as distinct queries.

Agreed.

> I attached a patch that adds information about MERGE queries to the
> pg_stat_statements documentation, plus code that includes MERGE-specific
> fields in the queryid hash calculation so that such queries are
> differentiated.
> Any kind of feedback is appreciated.

I didn't test the patch (and never looked at MERGE implementation either), but
I'm wondering if MergeAction->matched and MergeAction->override should be
jumbled too?

Also, the patch should contain some extra tests to fully cover MERGE jumbling.




Re: Add ON CONFLICT DO RETURN clause

2022-09-26 Thread Dagfinn Ilmari Mannsåker
Wolfgang Walther  writes:

> Peter Geoghegan:
>> On Sun, Sep 25, 2022 at 8:55 AM Wolfgang Walther
>>  wrote:
>>> The attached patch adds a DO RETURN clause to be able to do this:
>>>
>>> INSERT INTO x (id) VALUES (1)
>>> ON CONFLICT DO RETURN
>>> RETURNING created_at;
>>>
>>> Much simpler. This will either insert or do nothing - but in both cases
>>> return a row.
>> How can you tell which it was, though?
>
> I guess I can't reliably. But isn't that the same in the ON UPDATE case?
>
> In the use cases I had so far, I didn't need to know.
>
>> I don't see why this statement should ever perform steps for any row
>> that are equivalent to DO NOTHING processing -- it should at least
>> lock each and every affected row, if only to conclusively determine
>> that there really must be a conflict.
>> In general ON CONFLICT DO UPDATE allows the user to add a WHERE clause
>> to back out of updating a row based on an arbitrary predicate. DO
>> NOTHING has no such WHERE clause. So DO NOTHING quite literally does
>> nothing for any rows that had conflicts, unlike DO UPDATE, which will
>> at the very least lock the row (with or without an explicit WHERE
>> clause).
>> The READ COMMITTED behavior for DO NOTHING is a bit iffy, even
>> compared to DO UPDATE, but the advantages in bulk loading scenarios
>> can be decisive. Or at least they were before we had MERGE.
>
> Agreed - it needs to lock the row. I don't think I fully understood what
> "nothing" in DO NOTHING extended to.
>
> I guess I want DO RETURN to behave more like a DO SELECT, so with the
> same semantics as selecting the row?

There was a patch for ON CONFLICT DO SELECT submitted a while back, but
the author abandoned it. I haven't read either that patch or yours,
so I don't know how they compare, but you might want to have a look at
it:

https://commitfest.postgresql.org/16/1241/

> Best
>
> Wolfgang

- ilmari




Re: CFM Manager

2022-09-26 Thread Ibrar Ahmed
On Tue, Sep 20, 2022 at 10:45 PM Jacob Champion 
wrote:

> On Thu, Sep 8, 2022 at 2:34 PM Jacob Champion 
> wrote:
> > I still have yet to update the section "5 to 7 days before end of CF"
> > and onward.
>
> Well, I've saved the hardest part for last...
>
> Ibrar, Hamid, have the checklist rewrites been helpful so far? Are you
> planning on doing an (optional!) triage, and if so, are there any
> pieces in particular you'd like me to document?
>
> --Jacob
>

I think we are good now, thanks Jacob, for the effort.

-- 
Ibrar Ahmed


Re: Query Jumbling for CALL and SET utility statements

2022-09-26 Thread Drouvot, Bertrand


Hi,

On 9/21/22 6:07 PM, Fujii Masao wrote:
> On 2022/09/19 15:29, Drouvot, Bertrand wrote:
>> Please find attached v6 taking care of the remarks mentioned above.
>
> Thanks for updating the patch!
>
> +SET pg_stat_statements.track_utility = TRUE;
> +
> +-- PL/pgSQL procedure and pg_stat_statements.track = all
> +-- we drop and recreate the procedures to avoid any caching funnies
> +SET pg_stat_statements.track_utility = FALSE;
>
> Could you tell me why track_utility is enabled just before it's disabled?

Thanks for looking at the new version!

No real reason; I removed those useless SETs in the new V7 attached.

> Could you tell me what actually happens if we don't drop and
> recreate the procedures? I'd like to know what "any caching funnies" are.

Without the drop/recreate, the procedure body does not appear normalized
(while the CALL itself is) when switching from track = top to track = all.

I copy-pasted this comment from the already existing "function" section
in the pg_stat_statements.sql file. The comment was introduced for the
function section in commit 83f2061dd0. Note that the behavior would be
the same for the function case (the function body does not appear
normalized without the drop/recreate).

> +SELECT pg_stat_statements_reset();
> +CALL MINUS_TWO(3);
> +CALL MINUS_TWO(7);
> +CALL SUM_TWO(3, 8);
> +CALL SUM_TWO(7, 5);
> +
> +SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
>
> This test set for the procedures is executed with the following
> four conditions, respectively. Do we really need all of these tests?
>
> track = top, track_utility = true
> track = top, track_utility = false
> track = all, track_utility = true
> track = all, track_utility = false

Oh right, the track_utility = false cases were added when we decided
up-thread to force-track CALL.

But now it's probably not needed to test with track_utility = true, so
I'm keeping only track_utility = off with track = top or all in the new
V7 attached (as is the case for the "function" section).

> +begin;
> +prepare transaction 'tx1';
> +insert into test_tx values (1);
> +commit prepared 'tx1';
>
> The test set of 2PC commands is also executed with track_utility = on
> and off, respectively. But why do we need to run that test when
> track_utility = off?

That's useless, thanks for pointing it out. Removed in V7 attached.

> -    if (query->utilityStmt)
> +    if (query->utilityStmt && !jstate)
>   {
>   if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
>
> "pgss_track_utility" should be
> "pgss_track_utility || FORCE_TRACK_UTILITY(parsetree)" theoretically?

Good catch! That's not needed (in practice) with the current code, but
it is "theoretically" needed indeed; let's add it in V7 attached
(that's safer should the code change later on).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..f5fc2f1f38 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -272,7 +272,7 @@ FROM pg_stat_statements ORDER BY query COLLATE "C";
  wal_records > $2 as wal_records_generated,   +|   |  |
 |   | 
  wal_records >= rows as wal_records_ge_rows   +|   |  |
 |   | 
  FROM pg_stat_statements ORDER BY query COLLATE "C"|   |  |
 |   | 
- SET pg_stat_statements.track_utility = FALSE  | 1 |0 | f  
 | f | t
+ SET pg_stat_statements.track_utility = $1 | 1 |0 | f  
 | f | t
  UPDATE pgss_test SET b = $1 WHERE a > $2  | 1 |3 | t  
 | t | t
 (7 rows)
 
@@ -423,6 +423,111 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER 
BY query COLLATE "C";
  SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C" 
| 0 |0
 (6 rows)
 
+-- PL/pgSQL procedure and pg_stat_statements.track = top
+CREATE PROCEDURE MINUS_TWO(i INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (i - 1 - 1.0)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+CREATE PROCEDURE SUM_TWO(i INTEGER, j INTEGER) AS $$
+DECLARE
+  r INTEGER;
+BEGIN
+  SELECT (j + j)::INTEGER INTO r;
+END; $$ LANGUAGE plpgsql;
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--
+ 
+(1 row)
+
+CALL MINUS_TWO(3);
+CALL MINUS_TWO(7);
+CALL SUM_TWO(3, 8);
+CALL SUM_TWO(7, 5);
+SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+   

kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Bilal Yavuz
Hi hackers,

I wanted to add an ARM CPU darwin task to the CI, but it seems that
kerberos/001_auth fails on ARM CPU darwin.

OS:
Darwin admins-Virtual-Machine.local 21.6.0 Darwin Kernel Version 21.6.0:
Wed Aug 10 14:26:07 PDT 2022; root:xnu-8020.141.5~2/RELEASE_ARM64_VMAPPLE
arm64

Error message:
Can't exec "kdb5_util": No such file or directory at
/Users/admin/pgsql/src/test/perl/PostgreSQL/Test/Utils.pm line 338.
[02:53:37.177](0.043s) Bail out!  failed to execute command "kdb5_util
create -s -P secret0": No such file or directory

It seems that kerberos is installed at the '/opt/homebrew/opt/krb5' path on
ARM CPU darwin instances instead of the '/usr/local/opt/krb5' path.

I attached two patches:
0001-ci-Add-arm-CPU-for-darwin.patch is about adding ARM CPU darwin to the
CI.
0002-fix-darwin-ARM-CPU-darwin-krb5-path-fix.patch is about fixing the
error.

CI run after ARM CPU darwin is added:
https://cirrus-ci.com/build/5772792711872512

CI run after fix applied:
https://cirrus-ci.com/build/5686842363215872

Regards,
Nazir Bilal Yavuz


0001-ci-Add-arm-CPU-for-darwin.patch
Description: Binary data


0002-fix-darwin-ARM-CPU-darwin-krb5-path-fix.patch
Description: Binary data


Re: Perform streaming logical transactions by background workers and parallel apply

2022-09-26 Thread Amit Kapila
On Mon, Sep 26, 2022 at 8:41 AM wangw.f...@fujitsu.com
 wrote:
>
> On Thur, Sep 22, 2022 at 18:12 PM Amit Kapila  wrote:
>
> > 3.
> > ApplyWorkerMain()
> > {
> > ...
> > ...
> > +
> > + if (server_version >= 16 &&
> > + MySubscription->stream == SUBSTREAM_PARALLEL)
> > + options.proto.logical.streaming = pstrdup("parallel");
> >
> > After deciding here whether the parallel streaming mode is enabled or
> > not, we recheck the same thing in apply_handle_stream_abort() and
> > parallel_apply_can_start(). In parallel_apply_can_start(), we do it
> > via two different checks. How about storing this information say in
> > structure MyLogicalRepWorker in ApplyWorkerMain() and then use it at
> > other places?
>
> Improved as suggested.
> Added a new flag "in_parallel_apply" to structure MyLogicalRepWorker.
>

Can we name the variable in_parallel_apply as parallel_apply and set
it in logicalrep_worker_launch() instead of in
ParallelApplyWorkerMain()?

Few other comments:
==
1.
+ if (is_subworker &&
+ nparallelapplyworkers >= max_parallel_apply_workers_per_subscription)
+ {
+ LWLockRelease(LogicalRepWorkerLock);
+
+ ereport(DEBUG1,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("out of parallel apply workers"),
+ errhint("You might need to increase
max_parallel_apply_workers_per_subscription.")));

I think it is better to keep the level of this as LOG. Similar
messages at other places use WARNING or LOG. Here, I prefer LOG
because the system can still proceed without blocking anything.

2.
+/* Reset replication origin tracking. */
+void
+parallel_apply_replorigin_reset(void)
+{
+ bool started_tx = false;
+
+ /* This function might be called inside or outside of transaction. */
+ if (!IsTransactionState())
+ {
+ StartTransactionCommand();
+ started_tx = true;
+ }

Why do we need a transaction in this function?

3. Few suggestions to improve in the patch:
diff --git a/src/backend/replication/logical/worker.c
b/src/backend/replication/logical/worker.c
index 1623c9e2fa..d9c519dfab 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1264,6 +1264,10 @@ apply_handle_stream_prepare(StringInfo s)
  case TRANS_LEADER_SEND_TO_PARALLEL:
  Assert(winfo);

+ /*
+ * The origin can be active only in one process. See
+ * apply_handle_stream_commit.
+ */
  parallel_apply_replorigin_reset();

  /* Send STREAM PREPARE message to the parallel apply worker. */
@@ -1623,12 +1627,7 @@ apply_handle_stream_abort(StringInfo s)
  (errcode(ERRCODE_PROTOCOL_VIOLATION),
  errmsg_internal("STREAM ABORT message without STREAM STOP")));

- /*
- * Check whether the publisher sends abort_lsn and abort_time.
- *
- * Note that the parallel apply worker is only started when the publisher
- * sends abort_lsn and abort_time.
- */
+ /* We receive abort information only when we can apply in parallel. */
  if (MyLogicalRepWorker->in_parallel_apply)
  read_abort_info = true;

@@ -1656,7 +1655,13 @@ apply_handle_stream_abort(StringInfo s)
  Assert(winfo);

  if (subxid == xid)
+ {
+ /*
+ * The origin can be active only in one process. See
+ * apply_handle_stream_commit.
+ */
  parallel_apply_replorigin_reset();
+ }

  /* Send STREAM ABORT message to the parallel apply worker. */
  parallel_apply_send_data(winfo, s->len, s->data);
@@ -1858,6 +1863,12 @@ apply_handle_stream_commit(StringInfo s)
  case TRANS_LEADER_SEND_TO_PARALLEL:
  Assert(winfo);

+ /*
+ * We need to reset the replication origin before sending the commit
+ * message and set it up again after confirming that parallel worker
+ * has processed the message. This is required because origin can be
+ * active only in one process at-a-time.
+ */
  parallel_apply_replorigin_reset();

  /* Send STREAM COMMIT message to the parallel apply worker. */
diff --git a/src/include/replication/worker_internal.h
b/src/include/replication/worker_internal.h
index 4cbfb43492..2bd9664f86 100644
--- a/src/include/replication/worker_internal.h
+++ b/src/include/replication/worker_internal.h
@@ -70,11 +70,7 @@ typedef struct LogicalRepWorker
  */
  pid_t apply_leader_pid;

- /*
- * Indicates whether to use parallel apply workers.
- *
- * Determined based on streaming parameter and publisher version.
- */
+ /* Indicates whether apply can be performed parallelly. */
  bool in_parallel_apply;


-- 
With Regards,
Amit Kapila.




Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Tom Lane
Bilal Yavuz  writes:
> It seems that kerberos is installed at the '/opt/homebrew/opt/krb5' path on
> ARM CPU darwin instances instead of the '/usr/local/opt/krb5' path.

I think this also needs to account for MacPorts, which would likely
put it under /opt/local/sbin.  (I wonder where /usr/local/opt/krb5
came from at all -- that sounds like somebody's manual installation
rather than a packaged one.)  Maybe we should first try
"krb5-config --prefix" to see if that gives an answer.

regards, tom lane




RE: A doubt about a newly added errdetail

2022-09-26 Thread houzj.f...@fujitsu.com
On Monday, September 26, 2022 5:03 PM houzj.f...@fujitsu.com wrote:
> 
> On Monday, September 26, 2022 4:57 PM Amit Kapila
> 
> >
> > On Mon, Sep 26, 2022 at 2:03 PM Alvaro Herrera
> > 
> > wrote:
> > >
> > > On 2022-Sep-26, Amit Kapila wrote:
> > >
> > > > On Mon, Sep 26, 2022 at 1:10 PM Alvaro Herrera
> >  wrote:
> > >
> > > > > ERROR:  cannot use column list for relation "%s.%s" in publication 
> > > > > "%s"
> > > > > DETAIL:  Column lists cannot be specified in publications
> > > > > containing FOR
> > TABLES IN SCHEMA elements.
> > > >
> > > > This looks mostly good to me. BTW, is it a good idea to add ".. in
> > > > publication "%s"" to the error message as this can happen even
> > > > during create publication?
> > >
> > > Hmm, I don't see why not.  The publication is new, sure, but it
> > > would already have a name, so there's no possible confusion as to
> > > what this means.
> > >
> > > (My main change was to take the word "publication" out of the phrase
> > > "publication column list", because that seemed a bit strange; it
> > > isn't the publication that has a column list, but the relation.)
> > >
> >
> > Okay, that makes sense.
> 
> +1
> 
> > >
> > > > If so, I think we can change the nearby message as below to
> > > > include the same:
> > > >
> > > > if (!pubviaroot &&  
> > > > pri->relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> > > > ereport(ERROR,
> > > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > > errmsg("cannot use publication column list for relation \"%s\"",
> > >
> > > WFM.
> > >
> >
> > Okay.
> 
> While reviewing this part, I noticed an unused parameter (queryString) of the
> function CheckPubRelationColumnList. I feel we can remove that as well while
> we are at it. I plan to post a patch to fix the error message and the
> parameter soon.
> 
> 

Attached is the patch. (It applies to both HEAD and PG15.)

Best regards,
Hou zj



0001-Improve-some-error-messages.patch
Description: 0001-Improve-some-error-messages.patch


Re: Differentiate MERGE queries with different structures

2022-09-26 Thread Alvaro Herrera
On 2022-Sep-26, Julien Rouhaud wrote:

> On Mon, Sep 26, 2022 at 03:12:46PM +0900, bt22nakamorit wrote:

> > I attached a patch file that adds information about MERGE queries on the
> > documentation of pg_stat_statements, and lines of code that helps with the
> > calculation of queryid hash value to differentiate MERGE queries.
> > Any kind of feedback is appreciated.
> 
> I didn't test the patch (and never looked at MERGE implementation either), but
> I'm wondering if MergeAction->matched and MergeAction->override should be
> jumbled too?

->matched distinguishes these two queries:

MERGE INTO foo USING bar ON (something)
  WHEN MATCHED THEN DO NOTHING;
MERGE INTO foo USING bar ON (something)
  WHEN NOT MATCHED THEN DO NOTHING;

because only DO NOTHING can be used with both MATCHED and NOT MATCHED,
though on the whole the distinction seems pointless.  However, I think
that if you sent both these queries and got a single pgss entry with the
text of one of them and not the other, you would be confused about where
the other went.  So +1 for jumbling it too.

->overriding is used in OVERRIDING SYSTEM VALUES (used for GENERATED
columns).  I don't think there's any case where it's interesting
currently: if you specify the column it's going to be in the column list
(which does affect the query ID).
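
In queryjumble.c terms, the fix presumably amounts to something like the
following inside JumbleExpr()'s switch (a sketch based on this discussion,
not the attached patch's exact hunk):

    case T_MergeAction:
        {
            MergeAction *action = (MergeAction *) node;

            /* Distinguish WHEN MATCHED from WHEN NOT MATCHED, plus the
             * action's command type, in the query ID. */
            APP_JUMB(action->matched);
            APP_JUMB(action->commandType);
            JumbleExpr(jstate, action->qual);
            JumbleExpr(jstate, (Node *) action->targetList);
        }
        break;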

> Also, the patch should contain some extra tests to fully cover MERGE
> jumbling.

Agreed.  I struggle to find a balance between not having anything and
going overboard, but I decided to add different test queries for the
different things that should be jumbled, so that they would all appear
in the view.


I moved the code around; instead of adding it at the end of the switch,
I did what the comment says, which is to mirror expression_tree_walker.
This made me realize that the latter is using the wrong order for fields
according to the struct definition, so I flipped that also.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"
From b1aff66110ec54e4b58517289a18451906876ed0 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 26 Sep 2022 13:27:24 +0200
Subject: [PATCH v2] Fix pg_stat_statements for MERGE

Author: Tatsu 
Reviewed-by: Julien Rouhaud 
Discussion: https://postgr.es/m/d87e391694db75a038abc3b259782...@oss.nttdata.com
---
 .../expected/pg_stat_statements.out   | 41 ++-
 .../sql/pg_stat_statements.sql| 22 ++
 doc/src/sgml/pgstatstatements.sgml|  4 +-
 src/backend/nodes/nodeFuncs.c |  4 +-
 src/backend/utils/misc/queryjumble.c  | 11 +
 5 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index ff0166fb9d..9ac5c87c3a 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -222,12 +222,51 @@ SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
  3 | c   
 (8 rows)
 
+-- MERGE
+MERGE INTO test USING test st ON (st.a = test.a AND st.a >= 4)
+ WHEN MATCHED THEN UPDATE SET b = st.b || st.a::text;
+MERGE INTO test USING test st ON (st.a = test.a AND st.a >= 4)
+ WHEN MATCHED THEN UPDATE SET b = test.b || st.a::text;
+MERGE INTO test USING test st ON (st.a = test.a AND st.a >= 4)
+ WHEN MATCHED AND length(st.b) > 1 THEN UPDATE SET b = test.b || st.a::text;
+MERGE INTO test USING test st ON (st.a = test.a)
+ WHEN NOT MATCHED THEN INSERT (a, b) VALUES (0, NULL);
+MERGE INTO test USING test st ON (st.a = test.a)
+ WHEN NOT MATCHED THEN INSERT VALUES (0, NULL);	-- same as above
+MERGE INTO test USING test st ON (st.a = test.a)
+ WHEN NOT MATCHED THEN INSERT (b, a) VALUES (NULL, 0);
+MERGE INTO test USING test st ON (st.a = test.a)
+ WHEN NOT MATCHED THEN INSERT (a) VALUES (0);
+MERGE INTO test USING test st ON (st.a = test.a AND st.a >= 4)
+ WHEN MATCHED THEN DELETE;
+MERGE INTO test USING test st ON (st.a = test.a AND st.a >= 4)
+ WHEN MATCHED THEN DO NOTHING;
+MERGE INTO test USING test st ON (st.a = test.a AND st.a >= 4)
+ WHEN NOT MATCHED THEN DO NOTHING;
 SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
 query | calls | rows 
 --+---+--
  DELETE FROM test WHERE a > $1| 1 |1
  INSERT INTO test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6)  | 1 |3
  INSERT INTO test VALUES(generate_series($1, $2), $3) | 1 |   10
+ MERGE INTO test USING test st ON (st.a = test.a AND st.a >= $1) +| 1 |6
+  WHEN MATCHED AND length(st.b) > $2 THEN UPDATE SET b = test.b || st.a::text |   | 
+ MERGE INTO test USI

Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-09-26 Thread Amit Kapila
On Thu, Sep 22, 2022 at 9:44 PM Önder Kalacı  wrote:
>
> Also, do you think this is a good time for me to mark the patch "Ready for
> committer" in the commit fest? Not sure when and by whom the state should
> be changed, but it seems I can change it. I couldn't find any documentation
> on how that process should work.
>

Normally, the reviewers mark it as "Ready for committer".

-- 
With Regards,
Amit Kapila.




Re: A doubt about a newly added errdetail

2022-09-26 Thread Amit Kapila
On Mon, Sep 26, 2022 at 4:45 PM houzj.f...@fujitsu.com
 wrote:
>
>
> Attach the patch. (The patch can apply on both HEAD and PG15)
>

The patch looks good to me.

*
- errmsg("cannot add schema to the publication"),
+ errmsg("cannot add schema to publication \"%s\"",
+stmt->pubname),

I see that you have improved an additional message in the patch which
appears okay to me.

-- 
With Regards,
Amit Kapila.




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread James Coleman
On Mon, Sep 26, 2022 at 2:28 AM Wolfgang Walther
 wrote:
>
> James Coleman:
> > If we have a declared constraint on x,y where x is unique based on an
> > index including on x I do not think we should have that fk constraint
> > work differently than a constraint on x,y where there is a unique
> > index on x,y. That would seem to be incredibly confusing behavior
> > (even if it would be useful for some specific use case).
>
> I don't think it's behaving differently from how it does now. See below.
> But I can see how that could be confusing. Maybe it's just about
> describing the feature in a better way than I did so far. Or maybe it
> needs a different syntax.
>
> Anyway, I don't think it's just a specific use case. In every use case I
> had for $subject so far, the immediate next step was to write some
> triggers to fetch those derived values from the referenced table.
>
> Ultimately it's a question of efficiency: We can achieve the same thing
> in two ways today:
> - We can either **not** add the additional column (members.tenant,
> bar.ftype in my examples) to the referencing table at all, and add
> constraint triggers that do all those checks instead. This adds
> complexity to write the triggers and more complicated RLS policies etc,
> and also is potentially slower when executing those more complicated
> queries.
> - Or we can add the additional column, but also add an additional unique
> index on the referenced table, and then make it part of the FK. This
> removes some of the constraint triggers and makes RLS policies simpler
> and likely faster to execute queries. It comes at the cost of additional
> storage, though - and this is something that $subject tries to
> address.
>
> Still, even when $subject is allowed, in practice we need some of the
> triggers to fetch those dependent values. Considering that the current
> FK triggers already do the same kind of queries at the same times, it'd
> be more efficient to have those FK queries fetch those dependent values.
>
> >> But this could also be a CHECK constraint to allow FKs only to a subset
> >> of rows in the target table:
> >
> > Are you suggesting a check constraint that queries another table?
>
> No. I was talking about the CHECK constraint in my example in the next
> paragraph of that mail. The CHECK constraint on bar.ftype is a regular
> CHECK constraint, but because of how ftype is updated automatically, it
> effectively behaves like some kind of additional constraint on the FK
> itself.

Ah, OK.

> > This "derive the value automatically" is not what foreign key
> > constraints do right now at all, right? And if fact it's contradictory
> > to existing behavior, no?
>
> I don't think it's contradicting. Maybe a better way to put my idea is this:
>
> For a foreign key to a superset of unique columns, the already-unique
> columns should behave according to the specified ON UPDATE clause.
> However, the extra columns should always behave as if they were ON UPDATE
> CASCADE. And additionally, they should behave similarly to something like
> ON INSERT CASCADE. Although that INSERT is about the referencing table,
> not the referenced table, so the analogy isn't 100%.
>
> I guess this would also be a more direct answer to Tom's earlier
> question about what to expect in the ON UPDATE scenario.

So the broader point I'm trying to make is that, as I understand it,
the index backing a foreign key constraint is an implementation detail.
The SQL standard details the behavior of foreign key constraints
regardless of implementation details like a backing index. That means
that the behavior of two-column foreign key constraints is defined in
a single way, whether or not there's a backing index at all and whether
such a backing index, if present, contains one or two columns.

I understand that for the use case you're describing this isn't the
absolute most efficient way to implement the desired data semantics.
But it would be incredibly confusing (and, I think, a violation of the
SQL standard) to have one foreign key constraint work in a different
way from another such constraint when both are indistinguishable at
the constraint level (the backing index isn't an attribute of the
constraint; it's merely an implementation detail).

James Coleman




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Peter Eisentraut

On 24.09.22 20:09, Andres Freund wrote:
> On 2022-09-24 13:52:29 -0400, Tom Lane wrote:
>> ... btw, shouldn't the CF entry [1] get closed now?
>
> Unfortunately not - there's quite a few followup patches that haven't been
> [fully] reviewed and thus not applied yet.


Here is some review of the remaining ones (might not match exactly what 
you attached, I was working off your branch):



9f789350a7a7 meson: ci: wip: move compilerwarnings task to meson

This sounds reasonable to me in principle, but I haven't followed the
cirrus stuff too closely, and it doesn't say why it's "wip".  Perhaps
others could take a closer look.


ccf20a68f874 meson: ci: Add additional CI coverage

IIUC, this is just for testing your branch, not meant for master?


02d84c21b227 meson: prereq: win: remove date from version number in 
win32ver.rc


do it


5c42b3e7812e meson: WIP: Add some of the windows resource files

What is the thinking on this now?  What does this change over the
current state?


9bc60bccfd10 meson: Add support for relative rpaths, fixing tests on 
MacOS w/ SIP


I suggest a separate thread and/or CF entry for this.  There have been
various attempts to deal with SIP before, with varying results.  This
is not part of the meson transition as such.


9f5be26c1215 meson: Add docs for building with meson

I do like the overall layout of this.

The "Supported Platforms" section should be moved back to near the end
of the chapter.  I don't see a reason to move it forward, at least
none that is related to the meson issue.

The changes to the "Getting the Source" section are also not
appropriate for this patch.

In the section "Building and Installation with meson":

- Remove the "git clone" stuff.

- The "Running tests" section should be moved to Chapter 33. Regression 
Tests.


Some copy-editing will probably be suitable, but I haven't focused on
that yet.


9c00d355d0e9 meson: Add PGXS compatibility

This looks like a reasonable direction to me.  How complete is it?  It
says it works for some extensions but not others.  How do we define
the target line here?


3fd5e13dcad3 meson: Add postgresql-extension.pc for building extension 
libraries


Separate thread for this as well.  This is good and important, but we
must also add it to the make build.


4b5bfa1c19aa meson: Add LLVM bitcode emission

still in progress


eb40f6e53104 meson: Add support for building with precompiled headers

Any reason not to enable this by default?  The benefits on non-Windows
appear to be less dramatic, but they are not zero.  It might be better to
enable it consistently so that, for example, any breakage is caught more
easily.


377bfdea6042 meson: Add xmllint/xsltproc wrapper script to handle 
dependencies automatically


Is this part of the initial transition, required for correctness, or
is it an optional optimization?  Could use more explanation.  Maybe
move to separate thread also?





Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2022-09-26 Thread torikoshia

On 2022-09-21 21:11, Damir Belyalov wrote:

Thanks for updating the patch.

> In the previous patch there was an error when processing constraints.
> The patch was fixed, but the code grew and became more complicated
> (0005-COPY_IGNORE_ERRORS). I also simplified the logic of
> safeNextCopyFrom().
> You asked why we need subtransactions, so the answer is in this patch:
> when processing a row that does not satisfy constraints or INSTEAD OF
> triggers, it is necessary to roll back the subtransaction and return
> the table to its original state.
> Because of this complexity, I had to abandon the constraint and trigger
> processing and handle only errors that occur while reading the file.
> Attaching a simplified patch (0006-COPY_IGNORE_ERRORS).


Do you mean that you stopped dealing with errors related to constraints
and triggers, and that we should review 0006-COPY_IGNORE_ERRORS?



> Tried to reproduce your error and could not. The result was the same
> as with plain COPY FROM.

Hmm, I applied the v6 patch, and when I canceled COPY by sending SIGINT
(ctrl + C), I faced the situation below.

I tested it on CentOS 8.4.

  =# COPY test FROM '/home/tori/pgsql/master/1000.data' WITH 
(IGNORE_ERRORS);

  ^CCancel request sent
  ERROR:  canceling statement due to user request
  CONTEXT:  COPY test, line 628000: "628000   xxx"

  =# SELECT 1;
  ERROR:  current transaction is aborted, commands ignored until end of 
transaction block


 =# ABORT;
  FATAL:  UserAbortTransactionBlock: unexpected state STARTED
  server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.
  The connection to the server was lost. Attempting reset: Succeeded.

I followed the same procedure with COPY FROM without IGNORE_ERRORS and
didn't hit this situation.


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




Re: SYSTEM_USER reserved word implementation

2022-09-26 Thread Drouvot, Bertrand

Hi,

On 9/7/22 5:48 PM, Jacob Champion wrote:

On 9/7/22 07:46, Drouvot, Bertrand wrote:

Except for the nit above, that all looks good to me.


A few additional comments:


+assigned a database role. It is represented as
+auth_method:identity or
+NULL if the user has not been authenticated (for
+example if <xref linkend="auth-trust"/> has been used).
+   


This is rendered as

 ... (for example if Section 21.4 has been used).

which IMO isn't too helpful. Maybe a <link> would read better than an
<xref>?


Thanks for looking at it!
Good catch, V4 coming soon will make use of <link> instead.



Also, this function's placement in the docs (with the System Catalog
Information Functions) seems wrong to me. I think it should go up above
in the Session Information Functions, with current_user et al.


Agree, will move it to the Session Information Functions in V4.




+   /* Build system user as auth_method:authn_id */
+   char   *system_user;
+   Sizeauthname_len = strlen(auth_method);
+   Sizeauthn_id_len = strlen(authn_id);
+
+   system_user = palloc0(authname_len + authn_id_len + 2);
+   strcat(system_user, auth_method);
+   strcat(system_user, ":");
+   strcat(system_user, authn_id);


If we're palloc'ing anyway, can this be replaced with a single psprintf()?


Fair point, V4 will make use of psprintf().
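
For reference, the single-call form would be roughly (a sketch, not the
actual V4 code):

    system_user = psprintf("%s:%s", auth_method, authn_id);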




+   /* Initialize SystemUser now that MyClientConnectionInfo is restored. */
+   InitializeSystemUser(MyClientConnectionInfo.authn_id,
+                        hba_authname(MyClientConnectionInfo.auth_method));


It makes me a little nervous to call hba_authname(auth_method) without
checking to see that auth_method is actually valid (which is only true
if authn_id is not NULL).


Will add additional check for safety in V4.




We could pass the bare auth_method index, or update the documentation
for auth_method to state that it's guaranteed to be zero if authn_id is
NULL (and then enforce that).


 case SVFOP_CURRENT_USER:
 case SVFOP_USER:
 case SVFOP_SESSION_USER:
+   case SVFOP_SYSTEM_USER:
 case SVFOP_CURRENT_CATALOG:
 case SVFOP_CURRENT_SCHEMA:
 svf->type = NAMEOID;


Should this be moved to use TEXTOID instead?


Good catch, will do in V4.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: pg_stat_have_stats() returns true for dropped indexes (or for index creation transaction rolled back)

2022-09-26 Thread Drouvot, Bertrand

Hi,

On 9/26/22 10:23 AM, Drouvot, Bertrand wrote:

Hi,

On 9/23/22 10:45 PM, Andres Freund wrote:





Hi,

On 2022-09-01 08:40:54 +0200, Drouvot, Bertrand wrote:
Thanks for the suggestion, I'm coming up with this proposal in v4 
attached:


I pushed the bugfix / related test portion to 15, master. Thanks!


Thanks!



I forgot to say that, with that fixed, I'll come back with a patch 
proposal for the tables/indexes stats split (I discovered the issue fixed 
in this thread while working on the split patch).


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: SYSTEM_USER reserved word implementation

2022-09-26 Thread Drouvot, Bertrand

Hi,

On 9/8/22 3:17 AM, Michael Paquier wrote:

On Wed, Sep 07, 2022 at 08:48:43AM -0700, Jacob Champion wrote:


We could pass the bare auth_method index, or update the documentation
for auth_method to state that it's guaranteed to be zero if authn_id is
NULL (and then enforce that).


 case SVFOP_CURRENT_USER:
 case SVFOP_USER:
 case SVFOP_SESSION_USER:
+   case SVFOP_SYSTEM_USER:
 case SVFOP_CURRENT_CATALOG:
 case SVFOP_CURRENT_SCHEMA:
 svf->type = NAMEOID;


Should this be moved to use TEXTOID instead?


Yeah, it should.  There is actually a second and much deeper issue
here, in the shape of a collation problem.  See the assertion failure
in exprSetCollation(), because we expect SQLValueFunction nodes to
return a name or a non-collatable type.  However, for this case, we'd
require a text to get rid of the 63-character limit, and that's
a collatable type.  This reminds me of the recent thread to work on
getting rid of this limit for the name type...


Please find attached V4 taking care of Jacob's previous comments.

As for the assertion failure mentioned by Michael when moving 
SVFOP_SYSTEM_USER from NAMEOID to TEXTOID: V4 assumes that it is 
safe to force the collation to C_COLLATION_OID for a SQLValueFunction 
having a TEXT type, but I would be happy to hear your thoughts 
about it as well.


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.comdiff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e1fe4fec57..fe99e65dd5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22623,6 +22623,25 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS 
t(ls,n);

   
 
+  
+   
+
+ system_user
+
+system_user
+name
+   
+   
+Returns the authentication method and the identity (if any) that the
+user presented during the authentication cycle before they were
+assigned a database role. It is represented as
+auth_method:identity or
+NULL if the user has not been authenticated (for
+example if Trust authentication has
+been used).
+   
+  
+
   

 
diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index bc93101ff7..c2a08e9414 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1496,6 +1496,14 @@ ParallelWorkerMain(Datum main_arg)

 false);
RestoreClientConnectionInfo(clientconninfospace);
 
+   /*
+    * Initialize SystemUser now that MyClientConnectionInfo is restored.
+    * Also ensure that auth_method is actually valid, aka authn_id is not
+    * NULL.
+    */
+   if (MyClientConnectionInfo.authn_id)
+       InitializeSystemUser(MyClientConnectionInfo.authn_id,
+                            hba_authname(MyClientConnectionInfo.auth_method));
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
 
diff --git a/src/backend/executor/execExprInterp.c 
b/src/backend/executor/execExprInterp.c
index 9b9bbf00a9..c51578c0b9 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -2537,6 +2537,11 @@ ExecEvalSQLValueFunction(ExprState *state, ExprEvalStep 
*op)
*op->resvalue = session_user(fcinfo);
*op->resnull = fcinfo->isnull;
break;
+   case SVFOP_SYSTEM_USER:
+   InitFunctionCallInfoData(*fcinfo, NULL, 0, InvalidOid, NULL, NULL);
+   *op->resvalue = system_user(fcinfo);
+   *op->resnull = fcinfo->isnull;
+   break;
case SVFOP_CURRENT_CATALOG:
InitFunctionCallInfoData(*fcinfo, NULL, 0, InvalidOid, NULL, NULL);
*op->resvalue = current_database(fcinfo);
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 3bac350bf5..453ba84494 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1142,7 +1142,8 @@ exprSetCollation(Node *expr, Oid collation)
((MinMaxExpr *) expr)->minmaxcollid = collation;
break;
case T_SQLValueFunction:
-   Assert((((SQLValueFunction *) expr)->type == NAMEOID) ?
+   Assert((((SQLValueFunction *) expr)->type == NAMEOID ||
+           ((SQLValueFunction *) expr)->type == TEXTOID) ?
           (collation == C_COLLATION_OID) :
           (collation == InvalidOid));

Avoid memory leaks during base backups

2022-09-26 Thread Bharath Rupireddy
Hi,

Postgres currently can leak memory if a failure occurs during base
backup in do_pg_backup_start() or do_pg_backup_stop() or
perform_base_backup(). The palloc'd memory such as backup_state or
tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo or the
memory that gets allocated by bbsink_begin_backup() in
perform_base_backup() or any other, is left behind, which may cause memory
bloat on the server eventually. To reproduce this issue, run
pg_basebackup with a --label name longer than 1024 characters and
observe memory with the watch command; the memory usage goes up.

It looks like the memory leak issue has been there for quite some
time, discussed in [1].

I'm proposing a patch that leverages the error callback mechanism and
memory context. The design of the patch is as follows:
1) pg_backup_start() and pg_backup_stop() - the error callback frees
up the backup_state and tablespace_map variables allocated in
TopMemoryContext. We don't need a separate memory context here because
do_pg_backup_start() and do_pg_backup_stop() don't return any
dynamically created memory for now. We can choose to create a separate
memory context for the future changes that may come, but now it is not
required.
2) perform_base_backup() - a new memory context has been created that
gets deleted by the callback upon error.

The error callbacks are typically called for all the elevels, but we
need to free up the memory only when elevel is >= ERROR or ==
COMMERROR. The COMMERROR is a common scenario because the server can
close the connection to the client or vice versa in which case the
base backup fails. For all other elevels like WARNING, NOTICE, DEBUGX,
INFO etc. we don't free up the memory.

I'm attaching v1 patch herewith.

Thoughts?

[1] https://www.postgresql.org/message-id/Yyq15ekNzjZecwMW%40paquier.xyz

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Avoid-memory-leaks-during-base-backups.patch
Description: Binary data


Re: SUBTRANS: Minimizing calls to SubTransSetParent()

2022-09-26 Thread Simon Riggs
On Fri, 16 Sept 2022 at 13:20, Simon Riggs  wrote:
>
> Thanks for the review.
>
> v10 attached

v11 attached, corrected for recent commit
14ff44f80c09718d43d853363941457f5468cc03.

-- 
Simon Riggs                http://www.EnterpriseDB.com/


002_minimize_calls_to_SubTransSetParent.v11.patch
Description: Binary data


Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread Wolfgang Walther

James Coleman:

So the broader point I'm trying to make is that, as I understand it,
indexes backing foreign key constraints is an implementation detail.
The SQL standard details the behavior of foreign key constraints
regardless of implementation details like a backing index. That means
that the behavior of two column foreign key constraints is defined in
a single way whether or not there's a backing index at all or whether
such a backing index, if present, contains one or two columns.

I understand that for the use case you're describing this isn't the
absolute most efficient way to implement the desired data semantics.
But it would be incredibly confusing (and, I think, a violation of the
SQL standard) to have one foreign key constraint work in a different
way from another such constraint when both are indistinguishable at
the constraint level (the backing index isn't an attribute of the
constraint; it's merely an implementation detail).


Ah, thanks, I understand better now.

The two would only be indistinguishable at the constraint level, if 
$subject was implemented by allowing to create unique constraints on a 
superset of unique columns, backed by a different index (the suggestion 
we both made independently). But if it was possible to reference a 
superset of unique columns, where there was only a unique constraint put 
on a subset of the referenced columns (the idea originally introduced in 
this thread), then there would be a difference, right?


That's if it was only the backing index that is not part of the SQL 
standard, and not also the fact that a foreign key should reference a 
primary key or unique constraint?


Anyway, I can see very well how that would be quite confusing overall. 
It would probably be wiser to allow something roughly like this (if at 
all, of course):


CREATE TABLE bar (
  b INT PRIMARY KEY,
  f INT,
  ftype foo_type GENERATED ALWAYS AS REFERENCE TO foo.type,
  FOREIGN KEY (f, ftype) REFERENCES foo (f, type)
);

It likely wouldn't work exactly like that, but given a foreign key to 
foo, the GENERATED clause could be used to fetch the value through the 
same triggers that form that FK for efficiency. My main point for now 
is: With a much more explicit syntax anything near that, this would 
certainly be an entirely different feature than $subject **and** it 
would be possible to implement on top of $subject. If at all.


So no need for me to distract this thread from $subject anymore. I think 
the idea of allowing to create unique constraints on a superset of the 
columns of an already existing unique index is a good one, so let's 
discuss this further.


Best

Wolfgang




Re: Avoid memory leaks during base backups

2022-09-26 Thread Tom Lane
Bharath Rupireddy  writes:
> Postgres currently can leak memory if a failure occurs during base
> backup in do_pg_backup_start() or do_pg_backup_stop() or
> perform_base_backup(). The palloc'd memory such as backup_state or
> tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo or the
> memory that gets allocated by bbsink_begin_backup() in
> perform_base_backup() or any other, is left behind, which may cause memory
> bloat on the server eventually. To reproduce this issue, run
> pg_basebackup with a --label name longer than 1024 characters and
> observe memory with the watch command; the memory usage goes up.

> It looks like the memory leak issue has been there for quite some
> time, discussed in [1].

> I'm proposing a patch that leverages the error callback mechanism and
> memory context.

This ... seems like inventing your own shape of wheel.  The
normal mechanism for preventing this type of leak is to put the
allocations in a memory context that can be reset or deallocated
in mainline code at the end of the operation.  I do not think that
having an errcontext callback with side-effects like deallocating
memory is even remotely safe, and it's certainly a first-order
abuse of that mechanism.
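
For illustration, that standard pattern looks roughly like this (names
made up; a sketch, not the actual basebackup code):

    MemoryContext backup_ctx;
    MemoryContext oldctx;

    backup_ctx = AllocSetContextCreate(CurrentMemoryContext,
                                       "backup temporary context",
                                       ALLOCSET_DEFAULT_SIZES);
    oldctx = MemoryContextSwitchTo(backup_ctx);

    /* ... palloc() backup_state, tablespace_map, etc. here ... */

    MemoryContextSwitchTo(oldctx);

    /* mainline cleanup at the end of the operation frees it all at once */
    MemoryContextDelete(backup_ctx);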

regards, tom lane




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread Wolfgang Walther

James Coleman:

As I was reading through the email chain I had this thought: could you
get the same benefit (or 90% of it anyway) by instead allowing the
creation of a uniqueness constraint that contains more columns than
the index backing it? So long as the index backing it still guaranteed
the uniqueness on a subset of columns that would seem to be safe.

Tom notes the additional columns being nullable is something to think
about. But if we go the route of allowing unique constraints with
backing indexes having a subset of columns from the constraint I don't
think the nullability is an issue since it's already the case that a
unique constraint can be declared on columns that are nullable. Indeed
it's also the case that we already support a foreign key constraint
backed by a unique constraint including nullable columns.

Because such an approach would, I believe, avoid changing the foreign
key code (or semantics) at all, I believe that would address Tom's
concerns about information_schema and fuzziness of semantics.



Could we create this additional unique constraint implicitly, when using 
FOREIGN KEY ... REFERENCES on a superset of unique columns? This would 
make it easier to use, but still give proper information_schema output.
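
A sketch of the idea (the implicit-constraint behavior is hypothetical, 
nothing below is implemented):

    -- foo has a unique constraint on (f) only
    CREATE TABLE foo (f INT UNIQUE, type TEXT);

    -- the FK references a superset of the unique columns; the suggestion
    -- is that this would implicitly create a unique constraint on
    -- foo (f, type), backed by the existing unique index on (f)
    CREATE TABLE bar (
        b INT PRIMARY KEY,
        f INT,
        ftype TEXT,
        FOREIGN KEY (f, ftype) REFERENCES foo (f, type)
    );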


Best

Wolfgang




Re: Consider parallel for lateral subqueries with limit

2022-09-26 Thread Robert Haas
On Thu, Sep 22, 2022 at 5:19 PM James Coleman  wrote:
> > Your sample query gets a plan like this:
> >
> >  Nested Loop Left Join  (cost=0.00..1700245.00 rows=1 width=8)
> >->  Seq Scan on foo  (cost=0.00..145.00 rows=1 width=4)
> >->  Limit  (cost=0.00..170.00 rows=1 width=4)
> >  ->  Seq Scan on bar  (cost=0.00..170.00 rows=1 width=4)
> >Filter: (foo.a = a)
> >
> > If this were to occur inside a larger plan tree someplace, it would be
> > OK to insert a Gather node above the Nested Loop node without doing
> > anything further, because then the parameter that stores foo.a would
> > be both set and used in the worker. If you wanted to insert a Gather
> > at any other place in this plan, things get more complicated. But just
> > because you have LATERAL doesn't mean that you have this problem,
> > because if you delete the "limit 1" then the subqueries get flattened
> > together and the parameter disappears,
>
> For future reference in this email thread when you remove the "limit
> 1" this is the plan you get:
>
>  Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
>Merge Cond: (bar.a = foo.a)
>->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>  Sort Key: bar.a
>  ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
>->  Sort  (cost=179.78..186.16 rows=2550 width=4)
>  Sort Key: foo.a
>  ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
>
> Just to make sure I'm following: by "doesn't mean that you have this
> problem" you mean "doesn't mean you have this limitation on parallel
> query"?

I'm talking specifically about whether there's a parameter. The
problem here isn't created by LATERAL, but by parameters. In the
nested loop plan, there's a parameter that's storing foo.a, and the
storage location that holds that parameter value is in backend-private
memory, so you can't set the value in the leader and then use it in a
worker, and that restricts where you can insert a Gather node. But in
the Merge Join plan (or if you got a Hash Join plan) there is no
parameter. So the fact that parameter storage isn't shared between
leaders and workers does not matter.

> Yes, that's a good point too. I need to play with these examples and
> confirm whether lateral_relids gets set in that case. IIRC when that's
> set isn't exactly the same as whether or not the LATERAL keyword is
> used, and I should clarify that my claims here are meant to be about
> when we execute it that way regardless of the keyword usage. The
> keyword usage I'd assumed just made it easier to talk about, but maybe
> you're implying that it's actually generating confusion.

Yes, I think so.

Stepping back a bit, commit 75f9c4ca5a8047d7a9cfbc7d51a610933d04dc7f
introduced the code that is at issue here, and it took the position
that limit/offset should be marked parallel-restricted because they're
nondeterministic. I'm not sure that's really quite the right thing.
The original idea behind having things be parallel-restricted was that
there might be stuff which depends on state in the leader that is not
replicated to or shared with the workers, and thus it must be executed
in the leader to get the right answer. Here, however, there is no such
problem. Something like LIMIT/OFFSET that is nondeterministic is
perfectly safe to execute in a worker, or in the leader. It's just not
safe to execute the same computation more than once and assume that we
will get the same answer each time. But that's really a different
problem.

And the problem that you've run into here, I think, is that if a Limit
node is getting fed a parameter from higher up in the query plan, then
it's not really the same computation being performed every time, and
thus the non-determinism doesn't matter, and thus the
parallel-restriction goes away, but the code doesn't know that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Peter Eisentraut

On 26.09.22 13:14, Tom Lane wrote:

Bilal Yavuz  writes:

It seems that kerberos is installed at the '/opt/homebrew/opt/krb5' path on
ARM CPU darwin instances instead of the '/usr/local/opt/krb5' path.

I think this also needs to account for MacPorts, which would likely
put it under /opt/local/sbin.  (I wonder where /usr/local/opt/krb5
came from at all -- that sounds like somebody's manual installation
rather than a packaged one.)


/usr/local/opt/ is used by Homebrew on Intel macOS.




Re: Extending outfuncs support to utility statements

2022-09-26 Thread Peter Eisentraut

On 22.09.22 23:21, Tom Lane wrote:

Anyway, this is a bit far afield from the stated topic of this
thread.  I think we should commit something approximately like
what I posted and then start a new thread specifically about
what we'd like to do about utility commands in new-style SQL
functions.


Right, I have committed everything and will close the CF entry.  I don't 
have a specific idea about how to move forward right now.







Re: Convert *GetDatum() and DatumGet*() macros to inline functions

2022-09-26 Thread Peter Eisentraut

On 12.09.22 19:59, Peter Eisentraut wrote:

On 12.09.22 19:03, Julien Rouhaud wrote:

On Mon, Sep 12, 2022 at 05:59:09PM +0200, Peter Eisentraut wrote:


committed, thanks


FTR lapwing is complaining about this commit:
https://brekka.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2022-09-12%2016%3A40%3A18.

Snapper is also failing with similar problems:
https://brekka.postgresql.org/cgi-bin/show_log.pl?nm=snapper&dt=2022-09-12%2016%3A42%3A10


Ok, it has problems with 32-bit platforms.  I can reproduce it locally. 
I'll need to take another look at this.  I have reverted the patch for now.


I have tried to analyze these issues, but I'm quite stuck.  If anyone 
else has any ideas, it would be helpful.






Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-09-26 Thread Bharath Rupireddy
On Sat, Sep 24, 2022 at 9:44 AM Thomas Munro  wrote:
>
> Although WriteFile() with a synchronous file handle and an explicit
> offset doesn't use the current file position, it appears that it still
> changes it.  :-(
>
> You'd think from the documentation[1] that that isn't the case, because it 
> says:
>
> "Considerations for working with synchronous file handles:
>
>  * If lpOverlapped is NULL, the write operation starts at the current
> file position and WriteFile does not return until the operation is
> complete, and the system updates the file pointer before WriteFile
> returns.
>
>  * If lpOverlapped is not NULL, the write operation starts at the
> offset that is specified in the OVERLAPPED structure and WriteFile
> does not return until the write operation is complete. The system
> updates the OVERLAPPED Internal and InternalHigh fields before
> WriteFile returns."
>
> So it's explicitly saying the file pointer is updated in the first
> bullet point and not the second, but src/port/win32pwrite.c is doing
> the second thing.

The WriteFile()-based pwrite() implementation in win32pwrite.c changes
the file pointer on Windows; see the following and [4] for more
details.

# Running: pg_basebackup --no-sync -cfast -D
C:\cirrus\build\testrun\pg_basebackup\010_pg_basebackup\data\tmp_test_sV4r/backup2
--no-manifest --waldir
C:\cirrus\build\testrun\pg_basebackup\010_pg_basebackup\data\tmp_test_sV4r/xlog2
pg_basebackup: offset_before 0 and offset_after 16777216 aren't the same
Assertion failed: offset_before == offset_after, file
../src/bin/pg_basebackup/walmethods.c, line 254

Irrespective of what Windows does with file pointers in WriteFile(),
should we add lseek(SEEK_SET) in our own pwrite()'s implementation,
something like [5]? This is rather hackish without fully knowing what
Windows does internally in WriteFile(), but this does fix inherent
issues that our pwrite() callers (there are quite a number of places
that use pwrite() and presume the file pointer doesn't change) may
have on Windows. See the regression tests passing [6] with the fix
[5].
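
For reference, the POSIX semantics those callers rely on can be checked
standalone (a minimal sketch, independent of the patch):

    #include <assert.h>
    #include <fcntl.h>
    #include <unistd.h>

    int
    main(void)
    {
        int     fd = open("demo.tmp", O_CREAT | O_RDWR, 0600);
        off_t   before = lseek(fd, 0, SEEK_CUR);

        pwrite(fd, "x", 1, 100);    /* write at offset 100 */
        /* per POSIX, the file offset must be unchanged */
        assert(lseek(fd, 0, SEEK_CUR) == before);
        close(fd);
        return 0;
    }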

> Yet the following assertion added to Bharath's code
> fails[2]:

That was a very quick patch though, here's another version adding
offset checks and an assertion [3], see the assertion failures here
[4].

I also think that we need to discuss this issue separately.

Thoughts?

> [1] 
> https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile#synchronization-and-file-position
> [2] 
> https://api.cirrus-ci.com/v1/artifact/task/6201051266154496/testrun/build/testrun/pg_basebackup/010_pg_basebackup/log/regress_log_010_pg_basebackup
[3] - 
https://github.com/BRupireddy/postgres/tree/add_pwrite_and_offset_checks_in_walmethods_v2
[4] - 
https://api.cirrus-ci.com/v1/artifact/task/5294264635621376/testrun/build/testrun/pg_basebackup/010_pg_basebackup/log/regress_log_010_pg_basebackup
[5]
diff --git a/src/port/win32pwrite.c b/src/port/win32pwrite.c
index 7f2e62e8a7..542b548279 100644
--- a/src/port/win32pwrite.c
+++ b/src/port/win32pwrite.c
@@ -37,5 +37,16 @@ pwrite(int fd, const void *buf, size_t size, off_t offset)
return -1;
}

+   /*
+    * On Windows, it is found that WriteFile() changes the file pointer and
+    * we want pwrite() to not change it. Hence, we explicitly reset the file
+    * pointer to the beginning of the file.
+    */
+   if (lseek(fd, 0, SEEK_SET) != 0)
+   {
+   _dosmaperr(GetLastError());
+   return -1;
+   }
+
return result;
 }
[6] - 
https://github.com/BRupireddy/postgres/tree/add_pwrite_and_offset_checks_in_walmethods_v3

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




PostgreSQL 15 GA release date

2022-09-26 Thread Jonathan S. Katz

Hi,

The PostgreSQL 15 GA release (15.0) is now scheduled for October 13, 
2022. The release team changed this from the planned date of October 6 
to allow for additional testing of recent changes.


Please let us know if you have any questions. We're excited that we are 
very close to officially releasing PostgreSQL 15.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: Add support for DEFAULT specification in COPY FROM

2022-09-26 Thread Israel Barth Rubio
Hello Andrew,

> . There needs to be a check that this is being used with COPY FROM, and
> the restriction needs to be stated in the docs and tested for. c.f.
> FORCE NULL.
>
> . There needs to be support for this in psql's tab_complete.c, and
> appropriate tests added
>
> . There needs to be support for it in contrib/file_fdw/file_fdw.c, and a
> test added
>
> . The tests should include psql's \copy as well as sql COPY
>
> . I'm not sure we need a separate regression test file for this.
> Probably these tests can go at the end of src/test/regress/sql/copy2.sql.

Thanks for your review! I have applied the suggested changes, and I'm
submitting the new patch version.
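
For readers following along, usage would look roughly like this (a sketch;
details may differ from the patch):

    CREATE TABLE t (id INT, val TEXT DEFAULT 'unset');

    -- input fields matching the marker string get the column default,
    -- so row 1 ends up with val = 'unset'
    COPY t FROM STDIN WITH (DEFAULT '\D');
    1	\D
    2	explicit
    \.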

Kind regards,
Israel.


v3-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Consider parallel for lateral subqueries with limit

2022-09-26 Thread James Coleman
On Mon, Sep 26, 2022 at 10:37 AM Robert Haas  wrote:
>
> On Thu, Sep 22, 2022 at 5:19 PM James Coleman  wrote:
> > > Your sample query gets a plan like this:
> > >
> > >  Nested Loop Left Join  (cost=0.00..1700245.00 rows=1 width=8)
> > >->  Seq Scan on foo  (cost=0.00..145.00 rows=1 width=4)
> > >->  Limit  (cost=0.00..170.00 rows=1 width=4)
> > >  ->  Seq Scan on bar  (cost=0.00..170.00 rows=1 width=4)
> > >Filter: (foo.a = a)
> > >
> > > If this were to occur inside a larger plan tree someplace, it would be
> > > OK to insert a Gather node above the Nested Loop node without doing
> > > anything further, because then the parameter that stores foo.a would
> > > be both set and used in the worker. If you wanted to insert a Gather
> > > at any other place in this plan, things get more complicated. But just
> > > because you have LATERAL doesn't mean that you have this problem,
> > > because if you delete the "limit 1" then the subqueries get flattened
> > > together and the parameter disappears,
> >
> > For future reference in this email thread when you remove the "limit
> > 1" this is the plan you get:
> >
> >  Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
> >Merge Cond: (bar.a = foo.a)
> >->  Sort  (cost=158.51..164.16 rows=2260 width=8)
> >  Sort Key: bar.a
> >  ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
> >->  Sort  (cost=179.78..186.16 rows=2550 width=4)
> >  Sort Key: foo.a
> >  ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
> >
> > Just to make sure I'm following: by "doesn't mean that you have this
> > problem" you mean "doesn't mean you have this limitation on parallel
> > query"?
>
> I'm talking specifically about whether there's a parameter. The
> problem here isn't created by LATERAL, but by parameters. In the
> nested loop plan, there's a parameter that's storing foo.a, and the
> storage location that holds that parameter value is in backend-private
> memory, so you can't set the value in the leader and then use it in a
> worker, and that restricts where you can insert a Gather node. But in
> the Merge Join plan (or if you got a Hash Join plan) there is no
> parameter. So the fact that parameter storage isn't shared between
> leaders and workers does not matter.

Ah, yes, right.

> > Yes, that's a good point too. I need to play with these examples and
> > confirm whether lateral_relids gets set in that case. IIRC when that's
> > set isn't exactly the same as whether or not the LATERAL keyword is
> > used, and I should clarify that my claims here are meant to be about
> > when we execute it that way regardless of the keyword usage. The
> > keyword usage I'd assumed just made it easier to talk about, but maybe
> > you're implying that it's actually generating confusion.
>
> Yes, I think so.
>
> Stepping back a bit, commit 75f9c4ca5a8047d7a9cfbc7d51a610933d04dc7f
> introduced the code that is at issue here, and it took the position
> that limit/offset should be marked parallel-restricted because they're
> nondeterministic. I'm not sure that's really quite the right thing.
> The original idea behind having things be parallel-restricted was that
> there might be stuff which depends on state in the leader that is not
> replicated to or shared with the workers, and thus it must be executed
> in the leader to get the right answer. Here, however, there is no such
> problem. Something like LIMIT/OFFSET that is nondeterministic is
> perfectly safe to execute in a worker, or in the leader. It's just not
> safe to execute the same computation more than once and assume that we
> will get the same answer each time. But that's really a different
> problem.
>
> And the problem that you've run into here, I think, is that if a Limit
> node is getting fed a parameter from higher up in the query plan, then
> it's not really the same computation being performed every time, and
> thus the non-determinism doesn't matter, and thus the
> parallel-restriction goes away, but the code doesn't know that.

Correct. I think the simplest description of the proper limitation is
that we can't parallelize when either:
1. The param comes from outside the worker, or
2. The "same" param value from the same tuple is computed in multiple
workers (as distinct from the same value from different outer tuples).
Or, to put it positively, we can parallelize when:
1. There are no params, or
2. The param is uniquely generated and used inside a single worker.

I believe the followup email I sent (with an updated patch) details
the correctness of that approach; I'd be interested in your feedback
on that (I recognize it's quite a long email, though...) if you get a
chance.

Thanks for your review on this so far,
James Coleman




Re: Add support for DEFAULT specification in COPY FROM

2022-09-26 Thread Zhihong Yu
On Mon, Sep 26, 2022 at 8:12 AM Israel Barth Rubio 
wrote:

> Hello Andrew,
>
> > . There needs to be a check that this is being used with COPY FROM, and
> > the restriction needs to be stated in the docs and tested for. c.f.
> > FORCE NULL.
> >
> > . There needs to be support for this in psql's tab_complete.c, and
> > appropriate tests added
> >
> > . There needs to be support for it in contrib/file_fdw/file_fdw.c, and a
> > test added
> >
> > . The tests should include psql's \copy as well as sql COPY
> >
> > . I'm not sure we need a separate regression test file for this.
> > Probably these tests can go at the end of src/test/regress/sql/copy2.sql.
>
> Thanks for your review! I have applied the suggested changes, and I'm
> submitting the new patch version.
>
> Kind regards,
> Israel.
>

Hi,

+   /* attribute is NOT to be copied from input */

I think saying `is NOT copied from input` should suffice.

+   defaults = (bool *) palloc0(num_phys_attrs * sizeof(bool));
+   MemSet(defaults, false, num_phys_attrs * sizeof(bool));

Is the MemSet() call necessary ?
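
(For reference: palloc0() already zero-fills the allocation, and false is
all-zero bytes, so the MemSet() appears redundant.)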

+   /* fieldno is 0-index and attnum is 1-index */

0-index -> 0-indexed

Cheers


Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Robert Haas
On Sun, Sep 25, 2022 at 5:08 AM Wolfgang Walther
 wrote:
> Robert Haas:
> > Well, maybe. Suppose that role A has been granted pg_read_all_settings
> > WITH INHERIT TRUE, SET TRUE and role B has been granted
> > pg_read_all_settings WITH INHERIT TRUE, SET FALSE. A can create a
> > table owned by pg_read_all_settings. If A does that, then B can now
> > create a trigger on that table and usurp the privileges of
> > pg_read_all_settings, after which B can now create any number of
> > objects owned by pg_read_all_settings.
>
> I'm not seeing how this is possible. A trigger function would run with
> the invoking user's privileges by default, right? So B would have to
> create a trigger with a SECURITY DEFINER function, which is owned by
> pg_read_all_settings to actually usurp the privileges of that role. But
> creating objects with that owner is exactly the thing B can't do.

Yeah, my statement before wasn't correct. It appears that alice can't
just usurp the privileges of pg_read_all_settings trivially, but she
can create a trigger on any preexisting table owned by
pg_read_all_settings and then anyone who performs an operation that
causes that trigger to fire is at risk:

rhaas=# create role alice;
CREATE ROLE
rhaas=# create table foo (a int, b text);
CREATE TABLE
rhaas=# alter table foo owner to pg_read_all_settings;
ALTER TABLE
rhaas=# grant pg_read_all_settings to alice;
GRANT ROLE
rhaas=# grant create on schema public to alice;
GRANT
rhaas=# set session authorization alice;
SET
rhaas=> create or replace function alice_function () returns trigger
as $$begin raise notice 'this trigger is running as %', current_user;
return null; end$$ language plpgsql;
CREATE FUNCTION
rhaas=> create trigger t1 before insert or update or delete on foo for
each row execute function alice_function();
CREATE TRIGGER
rhaas=> begin;
BEGIN
rhaas=*> insert into foo values (1, 'stuff');
NOTICE:  this trigger is running as alice
INSERT 0 0
rhaas=*> rollback;
ROLLBACK
rhaas=> reset session authorization;
RESET
rhaas=# begin;
BEGIN
rhaas=*# insert into foo values (1, 'stuff');
NOTICE:  this trigger is running as rhaas
INSERT 0 0
rhaas=*# rollback;
ROLLBACK

This shows that if rhaas (or whoever) performs DML on a table owned by
pg_read_all_settings, he might trigger arbitrary code written by alice
to run under his own user ID. Now, that hazard would exist anyway for
tables owned by alice, but now it also exists for any tables owned by
pg_read_all_settings. I'm not really sure how significant that is. If
you can create triggers as some other user and that user ever does
stuff as themselves, you can probably steal their privileges, because
they will probably eventually do DML on one of their own tables and
thereby execute your Trojan trigger. However, in the particular case
of pg_read_all_settings, the intent is probably that nobody would ever
run as that user, and there is probably also no reason to create
tables or other objects owned by that user. So maybe we really can say
that just blocking SET ROLE is enough.

I'm slightly skeptical of that conclusion because the whole thing just
feels a bit flimsy. Like, the whole idea that you can compromise your
account by inserting a row into somebody else's table feels a little
nuts to me. Triggers and row-level security policies make it easy to
do things that look safe and are actually very dangerous. I think
anyone would reasonably expect that calling a function owned by some
other user might be risky, because who knows what that function might
do, but it seems less obvious that accessing a table could execute
arbitrary code, yet it can. And it is even less obvious that creating
a table owned by one role might give some other role who inherits that
user's privileges to booby-trap that table in a way that might fool a
third user into doing something unsafe. But I have no idea what we
could reasonably do to improve the situation.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Andres Freund
Hi,

On 2022-09-26 10:41:01 +0200, Alvaro Herrera wrote:
> On 2022-Sep-25, Andres Freund wrote:
> 
> > From 3eb0ca196084da314d94d1e51c7b775012a4773c Mon Sep 17 00:00:00 2001
> > From: Andres Freund 
> > Date: Wed, 21 Sep 2022 11:03:07 -0700
> > Subject: [PATCH v16 04/16] meson: Add windows resource files
> 
> > diff --git a/src/backend/jit/llvm/meson.build 
> > b/src/backend/jit/llvm/meson.build
> > index de2e624ab58..5fb63768358 100644
> > --- a/src/backend/jit/llvm/meson.build
> > +++ b/src/backend/jit/llvm/meson.build
> > @@ -20,6 +20,12 @@ llvmjit_sources += files(
> >'llvmjit_expr.c',
> >  )
> >  
> > +if host_system == 'windows'
> > +  llvmjit_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
> > +'--NAME', 'llvmjit',
> > +'--FILEDESC', 'llvmjit - JIT using LLVM',])
> > +endif
> 
> This is tediously imperative.  Isn't there a more declarative way to
> have it?

I tried to come up with something better, without success. I think it's
acceptable, even if not great.

Greetings,

Andres Freund




Add more docs for pg_surgery?

2022-09-26 Thread Zhang Mingli
Hi, hackers

heap_force_kill/heap_force_freeze don't consider other transactions that are 
using the same tuples, even with tuple locks held.
The functions may break transaction semantics, for example:

session1
```
create table htab(id int);
insert into htab values (100), (200), (300), (400), (500);
```

session2
```
begin isolation level repeatable read;
select * from htab for share;
 id
-
 100
 200
 300
 400
 500
(5 rows)
```

session1
```
select heap_force_kill('htab'::regclass, ARRAY['(0, 1)']::tid[]);
 heap_force_kill
-

(1 row)
```

session2
```
select * from htab for share;
 id
-
 200
 300
 400
 500
(4 rows)
```

session2 should get the same results since it runs at the repeatable read isolation level.

By reading the doc:
```
The pg_surgery module provides various functions to perform surgery on a 
damaged relation. These functions are unsafe by design and using them may 
corrupt (or further corrupt) your database. For example, these functions can 
easily be used to make a table inconsistent with its own indexes, to cause 
UNIQUE or FOREIGN KEY constraint violations, or even to make tuples visible 
which, when read, will cause a database server crash. They should be used with 
great caution and only as a last resort.

```
I know they are powerful tools, but I was also a little surprised by the 
above example.

Should we add more docs to tell users that the tool will change the tuples 
anyway, even if there are tuple locks on them?


Regards,
Zhang Mingli


Re: identifying the backend that owns a temporary schema

2022-09-26 Thread Nathan Bossart
On Sat, Sep 24, 2022 at 01:41:38PM -0400, Tom Lane wrote:
> One thing I don't like about it documentation-wise is that it leaves
> the concept of backend ID pretty much completely undefined.

How specific do you think this definition ought to be?  All I've come up
with so far is "internal identifier for the backend that is independent
from its PID," which is what I used in the attached patch.  Do we want to
mention its uses in more detail (e.g., temp schema name), or should we keep
it vague?
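
As a concrete sketch (assuming the patched behavior, where the number
embedded in a temp schema name is the PGPROC backend ID):

    -- who owns temp schema pg_temp_3?
    SELECT pg_stat_get_backend_pid(3) AS pid;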

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 4fd862d40ba3c703e00973d9743a0d4897fe4197 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 12 Aug 2022 23:07:37 -0700
Subject: [PATCH v3 1/1] Adjust pg_stat_get_backend_*() to use backends' PGPROC
 backend IDs.

Presently, these functions use the index of the backend's entry in
localBackendStatusTable as the backend ID.  While this might bear
some resemblance to the backend ID of the backend's PGPROC struct,
it can quickly diverge as sessions connect and disconnect.  This
change modifies the pg_stat_get_backend_*() suite of functions to
use the PGPROC backend IDs instead.  While we could instead
introduce a new function for retrieving PGPROC backend IDs,
presenting two sets of backend IDs to users seems likely to cause
even more confusion than what already exists.

This is primarily useful for discovering the session that owns a
resource named with its PGPROC backend ID.  For example, temporary
schema names include the PGPROC backend ID, and it might be
necessary to identify the session that owns a temporary table that
is putting the cluster in danger of transaction ID wraparound.

Author: Nathan Bossart
---
 doc/src/sgml/monitoring.sgml| 10 +++---
 src/backend/utils/activity/backend_status.c | 40 +++--
 src/backend/utils/adt/pgstatfuncs.c | 11 +++---
 src/include/utils/backend_status.h  |  8 +
 4 files changed, 57 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..47f2883576 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5488,10 +5488,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
information.  In such cases, an older set of per-backend statistics
access functions can be used; these are shown in .
-   These access functions use a backend ID number, which ranges from one
-   to the number of currently active backends.
+   These access functions use the process's backend ID number, which is an
+   internal identifier for the backend that is independent from its
+   PID.
The function pg_stat_get_backend_idset provides a
-   convenient way to generate one row for each active backend for
+   convenient way to list all the active backends' ID numbers for
invoking these functions.  For example, to show the PIDs and
current queries of all backends:
 
@@ -5526,8 +5527,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 setof integer


-Returns the set of currently active backend ID numbers (from 1 to the
-number of active backends).
+Returns the set of currently active backend ID numbers.

   
 
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index c7ed1e6d7a..159d022070 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -849,6 +849,7 @@ pgstat_read_current_status(void)
 			BackendIdGetTransactionIds(i,
 	   &localentry->backend_xid,
 	   &localentry->backend_xmin);
+			localentry->backend_id = i;
 
 			localentry++;
 			localappname += NAMEDATALEN;
@@ -1045,6 +1046,22 @@ pgstat_get_my_query_id(void)
 	return MyBEEntry->st_query_id;
 }
 
+/* --
+ * cmp_lbestatus
+ *
+ *	Comparison function for bsearch() on an array of LocalPgBackendStatus.  The
+ *	backend_id field is used to compare the arguments.
+ *
+ * --
+ */
+static int
+cmp_lbestatus(const void *a, const void *b)
+{
+	LocalPgBackendStatus *lbestatus1 = (LocalPgBackendStatus *) a;
+	LocalPgBackendStatus *lbestatus2 = (LocalPgBackendStatus *) b;
+
+	return lbestatus1->backend_id - lbestatus2->backend_id;
+}
 
 /* --
  * pgstat_fetch_stat_beentry() -
@@ -1052,6 +1069,10 @@ pgstat_get_my_query_id(void)
  *	Support function for the SQL-callable pgstat* functions. Returns
  *	our local copy of the current-activity entry for one backend.
  *
+ *	Unlike pgstat_fetch_stat_local_beentry(), the beid argument refers to the
+ *	backend ID stored in the backend's PGPROC struct instead of its index in
+ *	the localBackendStatusTable.
+ *
  *	NB: caller is responsible for a check if the user is permitted to see
  *	this info (especially the querystring).
  * --
@@ -1059,12 +1080,23 @@ pgstat_get_my_query_id(void)
 PgBackendStatus *
 pgstat_fetch_stat_beentry(int beid)
 {

Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

This shows that if rhaas (or whoever) performs DML on a table owned by
pg_read_all_settings, he might trigger arbitrary code written by alice
to run under his own user ID. Now, that hazard would exist anyway for
tables owned by alice, but now it also exists for any tables owned by
pg_read_all_settings.


This hazard exists for all tables that alice has been granted the 
TRIGGER privilege on. While we prevent alice from creating tables that 
are owned by pg_read_all_settings, we do not prevent inheriting the 
TRIGGER privilege.



I'm slightly skeptical of that conclusion because the whole thing just
feels a bit flimsy. Like, the whole idea that you can compromise your
account by inserting a row into somebody else's table feels a little
nuts to me. Triggers and row-level security policies make it easy to
do things that look safe and are actually very dangerous. I think
anyone would reasonably expect that calling a function owned by some
other user might be risky, because who knows what that function might
do, but it seems less obvious that accessing a table could execute
arbitrary code, yet it can. And it is even less obvious that creating
a table owned by one role might give some other role who inherits that
user's privileges to booby-trap that table in a way that might fool a
third user into doing something unsafe. But I have no idea what we
could reasonably do to improve the situation.


Right. This will always be the case when giving out the TRIGGER 
privilege on one of your tables to somebody else.


There are two kinds of TRIGGER privileges: an explicitly GRANTed privilege 
and an implicit privilege that is given to the table owner.


I think, when WITH INHERIT TRUE, SET FALSE is set, we should:
- Inherit all explicitly granted privileges
- Not inherit any DDL privileges implicitly given through ownership: 
CREATE, REFERENCES, TRIGGER.
- Inherit all other privileges implicitly given through ownership (DML + 
others)


Those implicit DDL privileges should be considered part of WITH SET 
TRUE. When you can't do SET ROLE x, then you can't act as the owner of 
any object owned by x.


Or to put it the other way around: Only allow implicit ownership 
privileges to be executed when the CURRENT_USER is the owner. But 
provide a shortcut, when you have the WITH SET TRUE option on a role, so 
that you don't need to do SET ROLE + CREATE TRIGGER, but can just do 
CREATE TRIGGER instead. This is similar to the mental model of 
"requesting and accepting a transfer of ownership" with an implicit SET 
ROLE built-in, that I used before.


Best

Wolfgang




Re: making relfilenodes 56 bits

2022-09-26 Thread Robert Haas
On Wed, Sep 21, 2022 at 6:09 AM Dilip Kumar  wrote:
> Yeah you are right we can make it uint64.  With respect to this, we
> can not directly use uint64 because that is declared in c.h and that
> can not be used in
> postgres_ext.h IIUC.

Ugh.

> Can we move the existing definitions from
> c.h file to some common file (common for client and server)?

Yeah, I think that would be a good idea. Here's a quick patch that
moves them to common/relpath.h, which seems like a possibly-reasonable
choice, though perhaps you or someone else will have a better idea.

> Based on the discussion [1], it seems we cannot use
> INT64_FORMAT/UINT64_FORMAT while using ereport().  But in all other places
> I am using INT64_FORMAT/UINT64_FORMAT.  Does this make sense?
>
> [1] 
> https://www.postgresql.org/message-id/20220730113922.qd7qmenwcmzyacje%40alvherre.pgsql

Oh, hmm. So you're saying if the string is not translated then use
(U)INT64_FORMAT but if it is translated then cast? I guess that makes
sense. It feels a bit strange to have the style dependent on the
context like that, but maybe it's fine. I'll reread with that idea in
mind.

> If we're going to bank on that, we could adapt this more
> > heavily, e.g. RelidByRelfilenumber() could lose the reltablespace
> > parameter.
>
> Yeah we might, although we need a bool to identify whether it is
> shared relation or not.

Why?

-- 
Robert Haas
EDB: http://www.enterprisedb.com


move-relfilenumber-decls-v1.patch
Description: Binary data


Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Andres Freund
Hi,

On 2022-09-26 15:01:56 +0200, Peter Eisentraut wrote:
> Here is some review of the remaining ones (might not match exactly what you
> attached, I was working off your branch):

Thanks, and makes sense.


> 9f789350a7a7 meson: ci: wip: move compilerwarnings task to meson
>
> This sounds reasonable to me in principle, but I haven't followed the
> cirrus stuff too closely, and it doesn't say why it's "wip".  Perhaps
> others could take a closer look.

It's mostly WIP because it doesn't yet convert all the checks, specifically
headerscheck/cpluspluscheck isn't converted yet.


> ccf20a68f874 meson: ci: Add additional CI coverage
>
> IIUC, this is just for testing your branch, not meant for master?

Yes. I think we might want to add openbsd / netbsd at some point, but that'll
be a separate thread. Until then it just catches a bunch of mistakes more
easily.


> 02d84c21b227 meson: prereq: win: remove date from version number in
> win32ver.rc
>
> do it

The newest version has evolved a bit, changing Project.pm as well.


> 5c42b3e7812e meson: WIP: Add some of the windows resource files
>
> What is the thinking on this now?  What does this change over the
> current state?

The newest commit has a lot more rc files added and has this summary:

meson: Add windows resource files

The generated resource files aren't exactly the same ones as the old
buildsystems generate. Previously "InternalName" and "OriginalFileName" were
mostly wrong / not set (despite being required), but that was hard to fix in
at least the make build. Additionally, the meson build falls back to a
"auto-generated" description when not set, and doesn't set it in a few 
cases -
unlikely that anybody looks at these descriptions in detail.

The only things still missing rc files are the various ecpg libraries. The issue is
that we shouldn't add resource files to static libraries, so we need to split
the definitions. I'll go and do that next.


> 9bc60bccfd10 meson: Add support for relative rpaths, fixing tests on MacOS
> w/ SIP
>
> I suggest a separate thread and/or CF entry for this.  There have been
> various attempts to deal with SIP before, with varying results.  This
> is not part of the meson transition as such.

I think I might need to split this one more time. We don't add all the rpaths
we add with autoconf before this commit, even not on macOS, which is not
great... Nor do we have a --disable-rpath equivalent yet - I suspect we'll
need that.

https://postgr.es/m/2022093729.GA721620%40nathanxps13


> 9f5be26c1215 meson: Add docs for building with meson
>
> I do like the overall layout of this.
>
> The "Supported Platforms" section should be moved back to near the end
> of the chapter.  I don't see a reason to move it forward, at least
> none that is related to the meson issue.
>
> The changes to the "Getting the Source" section are also not
> appropriate for this patch.

We don't really support building from a tarball with meson yet (you'd need to
run configure, then maintainer-clean, then configure with meson), so it does make some sense...


> 9c00d355d0e9 meson: Add PGXS compatibility
>
> This looks like a reasonable direction to me.  How complete is it?  It
> says it works for some extensions but not others.  How do we define
> the target line here?

Yea, those are good questions.


> How complete is it?

It's a bit hard to know. I think the most important stuff is there. But
there's no clear "API" around pgxs. E.g. we don't (yet?) have an exactly
equivalent definition of 'host', because that's very config.guess specific.

There are lots of shortcuts - e.g. with meson we don't need an equivalent to
PGAC_CHECK_STRIP, so we need to make up something for Makefile.global.

Noah suggested using $(error something), but that only works if $variable is
only used in recursively expanded variables - the errors end up confusing.


> It says it works for some extensions but not others.

I think that's slightly outdated - IIRC it was about pgbouncer, but after a
fix the remaining failure is shared between autoconf and meson builds.


> 3fd5e13dcad3 meson: Add postgresql-extension.pc for building extension
> libraries
>
> Separate thread for this as well.  This is good and important, but we
> must also add it to the make build.

Makes sense.


> eb40f6e53104 meson: Add support for building with precompiled headers
>
> Any reason not to enable this by default?  The benefits on non-Windows
> appear to be less dramatic, but they are not zero.  It might be better to
> enable it consistently so that, for example, any breakage is caught
> more easily.

There's no real reason not to - the wins are small on Linux, so introducing
PCH didn't seem necessary. I'm also not sure how well PCH works across random
compiler versions - it's so crucial on Windows that it seems like a more
well-worn path there.
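
For reference, this is meson's built-in b_pch toggle; an existing build
directory can be switched with:

    meson configure build -Db_pch=true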

linux, gcc 12:

b_pch=false:
real0m16.233s
user6m40.375s
sys 0m48.953s

b_pch=true:
real0m15.983s
user6m20.357s
sys 0m49.967s


freebsd VM, cl

Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Nazir Bilal Yavuz

Hi,


On 9/26/2022 2:14 PM, Tom Lane wrote:


Maybe we should first try "krb5-config --prefix" to see if that gives an answer.



I tested that command on multiple OSes and it was correct on FreeBSD, 
Debian, and openSUSE.


I don't have macOS, so I used CI to run macOS VMs (both ARM and Intel CPU):
When the "krb5-config" binary from the Homebrew or MacPorts installation 
path is used, it gives the correct path. But there is another "krb5-config" 
binary at "/usr/bin/krb5-config" on the macOS VMs, and when that binary is 
used, "krb5-config --prefix" gives "/" as output. This issue could be 
specific to the CI VMs, but I couldn't check that.
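
A sketch of a probe order covering the paths mentioned in this thread
(checking for krb5kdc under sbin is an assumption):

    for p in "$(krb5-config --prefix 2>/dev/null)" \
             /opt/homebrew/opt/krb5 /usr/local/opt/krb5 /opt/local; do
        if [ -n "$p" ] && [ -x "$p/sbin/krb5kdc" ]; then
            KRB5_PREFIX=$p
            break
        fi
    done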


Regards,
Nazir Bilal Yavuz





Re: Add more docs for pg_surgery?

2022-09-26 Thread Ashutosh Sharma
On Mon, Sep 26, 2022 at 9:29 PM Zhang Mingli  wrote:
>
> Hi, hackers
>
> heap_force_kill/heap_force_freeze don't consider other transactions that
> are using the same tuples, even with tuple locks held.
> The functions may break transaction semantics, for example:
>
> session1
> ```
> create table htab(id int);
> insert into htab values (100), (200), (300), (400), (500);
> ```
>
> session2
> ```
> begin isolation level repeatable read;
> select * from htab for share;
>  id
> -
>  100
>  200
>  300
>  400
>  500
> (5 rows)
> ```
>
> session1
> ```
> select heap_force_kill('htab'::regclass, ARRAY['(0, 1)']::tid[]);
>  heap_force_kill
> -
>
> (1 row)
> ```
>
> session2
> ```
> select * from htab for share;
>  id
> -
>  200
>  300
>  400
>  500
> (4 rows)
> ```
>
> session2 should get the same results since it runs at the repeatable read isolation level.
>
> By reading the doc:
> ```
> The pg_surgery module provides various functions to perform surgery on a 
> damaged relation. These functions are unsafe by design and using them may 
> corrupt (or further corrupt) your database. For example, these functions can 
> easily be used to make a table inconsistent with its own indexes, to cause 
> UNIQUE or FOREIGN KEY constraint violations, or even to make tuples visible 
> which, when read, will cause a database server crash. They should be used 
> with great caution and only as a last resort.
>
> ```
> I know they are powerful tools, but I was also a little surprised by the
> above example.
>
> Should we add more docs to tell users that the tool will change the
> tuples anyway, even if there are tuple locks on them?
>

As the name suggests and as documented, heap_force_kill will "force
kill" the tuple, regardless of whether it is visible to another
transaction or not. Furthermore, it looks like you are experimenting
on an undamaged relation, which is not recommended, as documented. If
the relation had been damaged, you probably would not be able to
access it.

--
With Regards,
Ashutosh Sharma.




Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Larry Rosenman

On 09/26/2022 11:39 am, Nazir Bilal Yavuz wrote:

Hi,


On 9/26/2022 2:14 PM, Tom Lane wrote:


Maybe we should first try "krb5-config --prefix" to see if that gives 
an answer.



I tested that command on multiple OSes and it was correct on FreeBSD, 
Debian, and openSUSE.


I don't have macOS, so I used CI to run macOS VMs (both ARM and Intel CPU):
When the "krb5-config" binary from the Homebrew or MacPorts installation 
path is used, it gives the correct path. But there is another "krb5-config" 
binary at "/usr/bin/krb5-config" on the macOS VMs, and when that binary is 
used, "krb5-config --prefix" gives "/" as output. This issue could be 
specific to the CI VMs, but I couldn't check that.


Regards,
Nazir Bilal Yavuz


on macOS monterey 12.6:
~ via 💎 v3.1.2 on ☁️  (us-east-1) on ﴃ WhereTo - Prod
❯ krb5-config --prefix
/

~ via 💎 v3.1.2 on ☁️  (us-east-1) on ﴃ WhereTo - Prod
❯
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread James Coleman
On Mon, Sep 26, 2022 at 9:59 AM Wolfgang Walther
 wrote:
>
> James Coleman:
> > So the broader point I'm trying to make is that, as I understand it,
> > indexes backing foreign key constraints is an implementation detail.
> > The SQL standard details the behavior of foreign key constraints
> > regardless of implementation details like a backing index. That means
> > that the behavior of two column foreign key constraints is defined in
> > a single way whether or not there's a backing index at all or whether
> > such a backing index, if present, contains one or two columns.
> >
> > I understand that for the use case you're describing this isn't the
> > absolute most efficient way to implement the desired data semantics.
> > But it would be incredibly confusing (and, I think, a violation of the
> > SQL standard) to have one foreign key constraint work in a different
> > way from another such constraint when both are indistinguishable at
> > the constraint level (the backing index isn't an attribute of the
> > constraint; it's merely an implementation detail).
>
> Ah, thanks, I understand better now.
>
> The two would only be indistinguishable at the constraint level, if
> $subject was implemented by allowing the creation of unique constraints on a
> superset of unique columns, backed by a different index (the suggestion
> we both made independently). But if it was possible to reference a
> superset of unique columns, where there was only a unique constraint put
> on a subset of the referenced columns (the idea originally introduced in
> this thread), then there would be a difference, right?
>
> That's if it was only the backing index that is not part of the SQL
> standard, and not also the fact that a foreign key should reference a
> primary key or unique constraint?

I think that's not true: the SQL standard doesn't have the option of
"this foreign key is backed by this unique constraint", does it? So in
either case I believe we would be at minimum implementing an extension
to the standard (and as I argued already I think it would actually be
contradictory to the standard).

> Anyway, I can see very well how that would be quite confusing overall.
> It would probably be wiser to allow something roughly like this (if at
> all, of course):
>
> CREATE TABLE bar (
>b INT PRIMARY KEY,
>f INT,
>ftype foo_type GENERATED ALWAYS AS REFERENCE TO foo.type,
>FOREIGN KEY (f, ftype) REFERENCES foo (f, type)
> );
>
> It likely wouldn't work exactly like that, but given a foreign key to
> foo, the GENERATED clause could be used to fetch the value through the
> same triggers that form that FK for efficiency. My main point for now
> is: With a much more explicit syntax anything near that, this would
> certainly be an entirely different feature than $subject **and** it
> would be possible to implement on top of $subject. If at all.

Yeah, I think that would make more sense if one were proposing an
addition to the SQL standard (or an explicit extension to it that
Postgres would support independently of the standard).

> So no need for me to distract this thread from $subject anymore. I think
> the idea of allowing the creation of unique constraints on a superset of the
> columns of an already existing unique index is a good one, so let's
> discuss this further.

Sounds good to me!

James Coleman




Re: Allow foreign keys to reference a superset of unique columns

2022-09-26 Thread James Coleman
On Mon, Sep 26, 2022 at 10:04 AM Wolfgang Walther
 wrote:
>
> James Coleman:
> > As I was reading through the email chain I had this thought: could you
> > get the same benefit (or 90% of it anyway) by instead allowing the
> > creation of a uniqueness constraint that contains more columns than
> > the index backing it? So long as the index backing it still guaranteed
> > the uniqueness on a subset of columns that would seem to be safe.
> >
> > Tom notes the additional columns being nullable is something to think
> > about. But if we go the route of allowing unique constraints with
> > backing indexes having a subset of columns from the constraint I don't
> > think the nullability is an issue since it's already the case that a
> > unique constraint can be declared on columns that are nullable. Indeed
> > it's also the case that we already support a foreign key constraint
> > backed by a unique constraint including nullable columns.
> >
> > Because such an approach would, I believe, avoid changing the foreign
> > key code (or semantics) at all, I believe that would address Tom's
> > concerns about information_schema and fuzziness of semantics.
>
>
> Could we create this additional unique constraint implicitly, when using
> FOREIGN KEY ... REFERENCES on a superset of unique columns? This would
> make it easier to use, but still give proper information_schema output.

Possibly. It'd be my preference to discuss that as a second patch
(could be in the same series); my intuition is that it'd be easier to
get agreement on the first part first, but of course I could be wrong
(if some committer, for example, thought the feature only made sense
as an implicit creation of such a constraint to back the use case
Kaiting opened with).

James Coleman




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Robert Haas
On Mon, Sep 26, 2022 at 12:16 PM Wolfgang Walther
 wrote:
> I think, when WITH INHERIT TRUE, SET FALSE is set, we should:
> - Inherit all explicitly granted privileges
> - Not inherit any DDL privileges implicitly given through ownership:
> CREATE, REFERENCES, TRIGGER.
> - Inherit all other privileges implicitly given through ownership (DML +
> others)

I don't think we're going to be very happy if we redefine inheriting
the privileges of another role to mean inheriting only some of them.
That seems pretty counterintuitive to me. I also think that this
particular definition is pretty fuzzy.

Your previous proposal was to make the SET attribute of a GRANT
control not only the ability to SET ROLE to the target role but also
the ability to create objects owned by that role and/or transfer
objects to that role. I think some people might find that behavior a
little bit surprising - certainly, it goes beyond what the name SET
implies - but it is at least simple enough to explain in one sentence,
and the consequences don't seem too difficult to reason about.

Here, though, it doesn't really seem simple enough to explain in one
sentence, nor does it seem easy to reason about. I'm not sure that
there's any firm distinction between DML privileges and DDL
privileges. I'm not sure that the privileges that you mention are all
and only those that are security-relevant, nor that it would
necessarily remain true in the future even if it's true today.

There are many operations which are permitted or declined just using
an owner-check. One example is commenting on an object. That sure
sounds like it would fit within your proposed "DDL privileges
implicitly given through ownership" category, but it doesn't really
present any security hazard, so I do not think there is a good reason
to restrict that from a user who has INHERIT TRUE, SET FALSE. Another
is renaming an object, which is a little more murky. You can't
directly usurp someone's privileges by renaming an object that they
own, but you could potentially rename an object out of the way and
replace it with one that you own and thus induce a user to do
something dangerous. I don't really want to make even small exceptions
to the idea that inheriting a role's privileges means inheriting all
of them, and I especially don't want to make large exceptions, or
exceptions that involve judgement calls about the relative degree of
risk of each possible operation.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Convert *GetDatum() and DatumGet*() macros to inline functions

2022-09-26 Thread Tom Lane
Peter Eisentraut  writes:
>> Ok, it has problems with 32-bit platforms.  I can reproduce it locally. 
>> I'll need to take another look at this.  I have reverted the patch for now.

> I have tried to analyze these issues, but I'm quite stuck.  If anyone 
> else has any ideas, it would be helpful.

It looks to me like the problem is with the rewrite of Int64GetDatumFast
and Float8GetDatumFast:

+static inline Datum
+Int64GetDatumFast(int64 X)
+{
+#ifdef USE_FLOAT8_BYVAL
+   return Int64GetDatum(X);
+#else
+   return PointerGetDatum(&X);
+#endif
+}

In the by-ref code path, this is going to return the address of the
parameter local variable, which of course is broken as soon as the
function exits.  To test, I reverted the mods to those two macros,
and I got through check-world OK in a 32-bit VM.

I think we can do this while still having reasonable type-safety
by adding AssertVariableIsOfTypeMacro() checks to the macros.
An advantage of that solution is that we verify that the code
will be safe for a 32-bit build even in 64-bit builds.  (Of
course, it's just checking the variable's type not its lifespan,
but this is still a step forward.)

0001 attached is what you committed, 0002 is a proposed delta
to fix the Fast macros.

regards, tom lane

commit 595836e99bf1ee6d43405b885fb69bb8c6d3ee23
Author: Peter Eisentraut 
Date:   Mon Sep 12 17:35:55 2022 +0200

Convert *GetDatum() and DatumGet*() macros to inline functions

The previous macro implementations just cast the argument to a target
type but did not check whether the input type was appropriate.  The
function implementation can do better type checking of the input type.

Reviewed-by: Aleksander Alekseev 
Discussion: https://www.postgresql.org/message-id/flat/8528fb7e-0aa2-6b54-85fb-0c0886dbd6ed%40enterprisedb.com

diff --git a/contrib/intarray/_int_gist.c b/contrib/intarray/_int_gist.c
index f1817a6cce..6378aa74b0 100644
--- a/contrib/intarray/_int_gist.c
+++ b/contrib/intarray/_int_gist.c
@@ -51,7 +51,7 @@ g_int_consistent(PG_FUNCTION_ARGS)
 
 	/* Oid		subtype = PG_GETARG_OID(3); */
 	bool	   *recheck = (bool *) PG_GETARG_POINTER(4);
-	bool		retval;
+	bool		retval = false; /* silence compiler warning */
 
 	/* this is exact except for RTSameStrategyNumber */
 	*recheck = (strategy == RTSameStrategyNumber);
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index b8cefb9c2c..cf5810b3c1 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -2763,7 +2763,7 @@ c_overpaid(PG_FUNCTION_ARGS)
  is  null.   GetAttributeByName returns a Datum
  value that you can convert to the proper data type by using the
  appropriate DatumGetXXX()
- macro.  Note that the return value is meaningless if the null flag is
+ function.  Note that the return value is meaningless if the null flag is
  set; always check the null flag before trying to do anything with the
  result.
 
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index d4bf0c7563..1532462317 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -280,7 +280,7 @@ gistMakeUnionKey(GISTSTATE *giststate, int attno,
 bool
 gistKeyIsEQ(GISTSTATE *giststate, int attno, Datum a, Datum b)
 {
-	bool		result;
+	bool		result = false; /* silence compiler warning */
 
 	FunctionCall3Coll(&giststate->equalFn[attno],
 	  giststate->supportCollation[attno],
diff --git a/src/backend/tsearch/ts_parse.c b/src/backend/tsearch/ts_parse.c
index 27b2cca2df..92de1f7141 100644
--- a/src/backend/tsearch/ts_parse.c
+++ b/src/backend/tsearch/ts_parse.c
@@ -354,7 +354,7 @@ void
 parsetext(Oid cfgId, ParsedText *prs, char *buf, int buflen)
 {
 	int			type,
-lenlemm;
+lenlemm = 0;	/* silence compiler warning */
 	char	   *lemm = NULL;
 	LexizeData	ldata;
 	TSLexeme   *norms;
@@ -529,7 +529,7 @@ void
 hlparsetext(Oid cfgId, HeadlineParsedText *prs, TSQuery query, char *buf, int buflen)
 {
 	int			type,
-lenlemm;
+lenlemm = 0;	/* silence compiler warning */
 	char	   *lemm = NULL;
 	LexizeData	ldata;
 	TSLexeme   *norms;
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 0543c574c6..474ab476f5 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -409,8 +409,8 @@ pg_do_encoding_conversion(unsigned char *src, int len,
 	(void) OidFunctionCall6(proc,
 			Int32GetDatum(src_encoding),
 			Int32GetDatum(dest_encoding),
-			CStringGetDatum(src),
-			CStringGetDatum(result),
+			CStringGetDatum((char *) src),
+			CStringGetDatum((char *) result),
 			Int32GetDatum(len),
 			BoolGetDatum(false));
 
@@ -485,8 +485,8 @@ pg_do_encoding_conversion_buf(Oid proc,
 	result = OidFunctionCall6(proc,
 			  Int32GetDatum(src_encoding),
 			  Int32GetDatum(dest_encoding),
-			  CStringGetDatum(src),
-			  CStringGetDatum(dest),
+			  CStringGetDa

Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Tom Lane
Larry Rosenman  writes:
> On 09/26/2022 11:39 am, Nazir Bilal Yavuz wrote:
>> When "krb5-config" binary is used from brew or MacPorts installations' 
>> path it gives the correct path but there is another "krb5-config" 
>> binary at "/usr/bin/krb5-config" path on the macOS VMs, when this 
>> binary is used while running "krb5-config --prefix" command run it 
>> gives "/" as output. This issue can be related about the CI VMs but I 
>> couldn't check it.

> [ yup, it gives "/" ]

Yeah, I see the same on my laptop.  So we can't trust krb5-config
unconditionally.  But we could do something like checking
"-x $config_prefix . '/bin/kinit'" before believing it's good,
and maybe also check sbin/krb5kdc.  We'd want to use similar
probes to decide which of the fallback directories to use, anyway.

regards, tom lane




Re: DROP OWNED BY is broken on master branch.

2022-09-26 Thread Robert Haas
On Mon, Sep 26, 2022 at 3:44 AM Rushabh Lathia  wrote:
> Commit 6566133c5f52771198aca07ed18f84519fac1be7 ensures that
> pg_auth_members.grantor is always valid.  This commit made changes to
> the shdepDropOwned() function and combined the SHARED_DEPENDENCY_ACL
> and SHARED_DEPENDENCY_OWNER cases.  In the process, it removed the
> condition for local objects in the owner dependency case.
>
> case SHARED_DEPENDENCY_OWNER:
> -   /* If a local object, save it for deletion below */
> -   if (sdepForm->dbid == MyDatabaseId)
> +   /* Save it for deletion below */
>
> The case ends up with the above error because of the removed condition.
>
> Please find the attached patch which fixes the case.

Thanks for the report. I think it would be preferable not to duplicate
the logic as your version does, though, so here's a slightly different
version that avoids that.
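
To give a rough idea of its shape (the attached patch is authoritative;
the helper below is only an illustrative sketch, not code from the
patch):

#include "postgres.h"

#include "catalog/pg_auth_members.h"	/* AuthMemRelationId */
#include "catalog/pg_shdepend.h"		/* Form_pg_shdepend */
#include "miscadmin.h"					/* MyDatabaseId */

/*
 * For SHARED_DEPENDENCY_OWNER entries, only local objects and role
 * grants should be queued for deletion; entries for objects in other
 * databases must be skipped, which is the check the earlier commit
 * lost.
 */
static bool
owner_dep_is_droppable_here(Form_pg_shdepend sdepForm)
{
	return sdepForm->dbid == MyDatabaseId ||
		sdepForm->classid == AuthMemRelationId;
}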

Per Michael's suggestion, I have also written a test case and included
it in this version.

Comments?

-- 
Robert Haas
EDB: http://www.enterprisedb.com


0001-Fix-bug-in-DROP-OWNED-BY.patch
Description: Binary data


Re: First draft of the PG 15 release notes

2022-09-26 Thread Tom Lane
Justin Pryzby  writes:
> On Fri, Sep 23, 2022 at 01:33:07PM -0400, Tom Lane wrote:
> +Previously such files were left in the current directory,
> +requiring manual cleanup.  It's still necessary to remove them
> +manually afterwards, but now one can just remove that whole
> +subdirectory.

> If pg_upgrade succeeds, then it removes the dir itself (so it's not
> "necessary").

Ah, I'd only ever paid attention to failure cases, so I didn't
realize that :-(.  Text adjusted:

Previously such files were left in the current directory,
requiring manual cleanup.  Now they are automatically removed on
successful completion of pg_upgrade.

I took most of your other suggestions, too.  Thanks!

regards, tom lane




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Andres Freund
Hi,

On 2022-09-26 15:18:29 +0700, John Naylor wrote:
> On Wed, Sep 21, 2022 at 7:11 AM Andres Freund  wrote:
> >
> > I added installation instructions for meson for a bunch of platforms, but
> 
> A couple more things for the wiki:
> 
> 1) /opt/homebrew/ seems to be an "Apple silicon" path?

Yea, it's /usr/local on x86-64, based on what was required to make macOS CI
work. I updated the wiki page, half-blindly - it'd be nice if you could
confirm that that works?


I needed something like the below to get (nearly?) all dependencies working:

brewpath="/usr/local"
PKG_CONFIG_PATH="${brewpath}/lib/pkgconfig:${PKG_CONFIG_PATH}"

for pkg in icu4c krb5 openldap openssl zstd ; do
  pkgpath="${brewpath}/opt/${pkg}"
  PKG_CONFIG_PATH="${pkgpath}/lib/pkgconfig:${PKG_CONFIG_PATH}"
  PATH="${pkgpath}/bin:${pkgpath}/sbin:$PATH"
done

export PKG_CONFIG_PATH PATH

meson setup \
  --buildtype=debug \
  -Dextra_include_dirs=${brewpath}/include \
  -Dextra_lib_dirs=${brewpath}/lib \
  -Dcassert=true \
  -Dssl=openssl -Duuid=e2fs -Ddtrace=auto \
  -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
  build

the per-package stuff is needed because some libraries aren't installed into
/usr/local (or /opt/homebrew), but only in a subdirectory within that.


> Either way it doesn't exist on this machine. I was able to get a working
> build with
> 
> /usr/local/Homebrew/Library/Homebrew/os/mac/pkgconfig

Hm - what did you need this path for - I don't think that should be needed.



> (My homebrew install doesn't seem to have anything relevant for
> extra_include_dirs or extra_lib_dirs.)

I think libintl.h / libintl.dylib are only in there. With meson that's the
only need for extra_include_dirs / extra_lib_dirs I found on arm apple.


> 2) Also, "ninja -v install" has the same line count as "ninja install" --
> are there versions that do something different?

Yea, that looks like a copy-and-pasto (not even from me :)). Think I fixed it.

Greetings,

Andres Freund




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Wolfgang Walther

Robert Haas:

I don't think we're going to be very happy if we redefine inheriting
the privileges of another role to mean inheriting only some of them.
That seems pretty counterintuitive to me. I also think that this
particular definition is pretty fuzzy.


Scratch my previous suggestion. A new, less fuzzy definition would be:
Ownership is not a privilege itself and as such not inheritable.


When role A is granted to role B, two things happen:
1. Role B now has the right to use the GRANTed privileges of role A.
2. Role B now has the right to become role A via SET ROLE.

WITH SET controls whether point 2 is the case or not.

WITH INHERIT controls whether role B actually exercises their right to
use those privileges ("inheritance") **and** whether the SET ROLE is
done implicitly for anything that requires ownership - but of course
only WITH SET TRUE.


This is the same way that the role attributes INHERIT / NOINHERIT behave.


Your previous proposal was to make the SET attribute of a GRANT
control not only the ability to SET ROLE to the target role but also
the ability to create objects owned by that role and/or transfer
objects to that role. I think some people might find that behavior a
little bit surprising - certainly, it goes beyond what the name SET
implies - but it is at least simple enough to explain in one sentence,
and the consequences don't seem too difficult to reason about.


This would be included in the above.


Here, though, it doesn't really seem simple enough to explain in one
sentence, nor does it seem easy to reason about.


I think the "ownership is not inheritable" idea is easy to explain.


There are many operations which are permitted or declined just using
an owner-check. One example is commenting on an object. That sure
sounds like it would fit within your proposed "DDL privileges
implicitly given through ownership" category, but it doesn't really
present any security hazard, so I do not think there is a good reason
to restrict that from a user who has INHERIT TRUE, SET FALSE. Another
is renaming an object, which is a little more murky. You can't
directly usurp someone's privileges by renaming an object that they
own, but you could potentially rename an object out of the way and
replace it with one that you own and thus induce a user to do
something dangerous. I don't really want to make even small exceptions
to the idea that inheriting a role's privileges means inheriting all
of them, and I especially don't want to make large exceptions, or
exceptions that involve judgement calls about the relative degree of
risk of each possible operation.


I would not make this about security-risks only. We didn't distinguish 
between privileges and ownership that much before, because we didn't 
have WITH INHERIT or WITH SET. Now that we have both, we could do so.


The ideas of "inherited GRANTs" and "a shortcut to avoid SET ROLE to do 
owner-things" should be better to explain.


No judgement required.

All of this is to find a way to make WITH INHERIT TRUE, SET FALSE a 
"real", risk-free thing - and not just some syntactic sugar. And if that 
comes with the inability to COMMENT ON TABLE 
owned_by_pg_read_all_settings... fine. No need for that at all.


However, it would come with the inability to do SELECT * FROM 
owned_by_pg_read_all_settings, **unless** explicitly GRANTed to the 
owner, too. This might feel strange at first, but should not be a 
problem either. WITH INHERIT TRUE, SET FALSE is designed for built-in 
roles or other container roles that group a set of privileges. Those 
roles should not have objects they own anyway. And if they still do, 
denying access to those objects unless explicitly granted is the safe way.


Best

Wolfgang




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Sep 8, 2022 at 1:06 PM  wrote:
> > In theory, I could also inherit that privilege, but that's not how the
> > system works today. By using is_member_of_role, the decision was already
> > made that this should not depend on inheritance. What is left, is the
> > ability to do it via SET ROLE only.
> 
> I do not accept the argument that we've already made the decision that
> this should not depend on inheritance. It's pretty clear that we
> haven't thought carefully enough about which checks should depend only
> on membership, and which ones should depend on inheritance. The patch
> I committed just now to fix ALTER DEFAULT PRIVILEGES is one clear
> example of where we've gotten that wrong. We also changed the way
> predefined roles worked with inheritance not too long ago, so that
> they started using has_privs_of_role() rather than
> is_member_of_role(). Our past thinking on this topic has been fuzzy
> enough that we can't really conclude that because something uses
> is_member_of_role() now that's what it should continue to do in the
> future. We are working to get from a messy situation where the rules
> aren't consistent or understandable to one where they are, and that
> may mean changing some things.

Agreed that we haven't been good about the distinction between these,
but that the recent work by Joshua and yourself has been moving us in
the right direction.

> One could take the view that the issue here is that
> pg_read_all_settings shouldn't have the right to create objects in the
> first place, and that this INHERIT vs. SET ROLE distinction is just a
> distraction. However, that would require accepting the idea that it's
> possible for a role to lack privileges granted to PUBLIC, which also
> sounds pretty unsatisfying. On the whole, I'm inclined to think it's
> reasonable to suppose that if you want to grant a role to someone
> without letting them create objects owned by that role, it should be a
> role that doesn't own any existing objects either. Essentially, that's
> legislating that predefined roles should be minimally privileged: they
> should hold the ability to do whatever it is that they are there to do
> (like read all settings) but not have any other privileges (like the
> ability to do stuff to objects they own).

Predefined roles are special in that they should GRANT just the
privileges that the role is described as granting; users really
shouldn't be able to SET ROLE to them, nor should they be allowed to
own objects - or at least that's my general feeling on them.

If an administrator doesn't wish for a user to have the privileges
provided by the predefined role by default, they should be able to set
that up by creating another role who has that privilege which the user
is able to SET ROLE to.  That is:

CREATE ROLE admin WITH INHERIT FALSE;
GRANT pg_read_all_settings TO admin;
GRANT admin TO alice;

Would allow 'alice' to log in without the privileges associated with
pg_read_all_settings but 'alice' is able to SET ROLE admin; and gain
those privileges.  It wasn't intended that 'alice' be able to SET ROLE
to pg_read_all_settings itself though.

* Robert Haas (robertmh...@gmail.com) wrote:
> Yeah, my statement before wasn't correct. It appears that alice can't
> just usurp the privileges of pg_read_all_settings trivially, but she
> can create a trigger on any preexisting table owned by
> pg_read_all_settings and then anyone who performs an operation that
> causes that trigger to fire is at risk:

Triggers aren't the only thing to be worried about in this area -
functions defined inside of views are also run with the privileges of
the user running the SELECT and not as the owner of the view.  The same
is true of running SELECT against tables with RLS too, of course.
Generally speaking, it's always been very risky to access the objects of
users who you don't trust in any way and we don't currently provide any
particularly easy way to make that kind of access safe.  RLS at least
provides an escape by allowing a user to turn it off, but the same isn't
available for setting a search_path and then running queries or
accessing views or running DML against tables with triggers.

> I'm slightly skeptical of that conclusion because the whole thing just
> feels a bit flimsy. Like, the whole idea that you can compromise your
> account by inserting a row into somebody else's table feels a little
> nuts to me. Triggers and row-level security policies make it easy to
> do things that look safe and are actually very dangerous. I think
> anyone would reasonably expect that calling a function owned by some
> other user might be risky, because who knows what that function might
> do, but it seems less obvious that accessing a table could execute
> arbitrary code, yet it can. And it is even less obvious that creating
> a table owned by one role might give some other role who inherits that
> user's privileges to booby-trap that table i

Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Stephen Frost
Greetings,

* Wolfgang Walther (walt...@technowledgy.de) wrote:
> Robert Haas:
> > I don't think we're going to be very happy if we redefine inheriting
> > the privileges of another role to mean inheriting only some of them.
> > That seems pretty counterintuitive to me. I also think that this
> > particular definition is pretty fuzzy.
> 
> Scratch my previous suggestion. A new, less fuzzy definition would be:
> Ownership is not a privilege itself and as such not inheritable.

One of the reasons the role system was brought into being was explicitly
to allow other roles to have ownership-level rights on objects that they
didn't directly own.

I don't see us changing that.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Andres Freund
Hi,

On 2022-09-26 09:35:16 -0700, Andres Freund wrote:
> > 9c00d355d0e9 meson: Add PGXS compatibility
> >
> > This looks like a reasonable direction to me.  How complete is it?  It
> > says it works for some extensions but not others.  How do we define
> > the target line here?
>
> Yea, those are good questions.
>
>
> > How complete is it?
>
> It's a bit hard to know. I think the most important stuff is there. But
> there's no clear "API" around pgxs. E.g. we don't (yet?) have an exactly
> equivalent definition of 'host', because that's very config.guess specific.
>
> There's lots of shortcuts - e.g. with meson we don't need an equivalent to
> PGAC_CHECK_STRIP, so we need to make up something for Makefile.global.
>
> Noah suggested using $(error something), but that only works if $variable is
> only used in recursively expanded variables - the errors end up confusing.

Looking through a few of the not-nicely-replaced things, I think we can
simplify at least some away:

- RANLIB: most platforms use AROPT = crs, making ranlib unnecessary. {free,
  net, open}bsd don't currently, but all support it from what I know

- with_gnu_ld: this is only used on solaris, to set export_dynamic = -Wl,-E
  when using a gnu ld. How about moving this to configure instead, and just
  checking if -Wl,-E links?

- FLEXFLAGS: As a configure input this is afaict unused and undocumented - and
  it's not clear why it'd be useful? Not that an empty replacement is a
  meaningful effort


I'm not sure what to do about:
- autodepend - I'm inclined to set it to true when using a gcc-like
  compiler. I think extension authors won't be happy if suddenly their
  extensions don't rebuild reliably anymore. An --enable-depend like
  setting doesn't make sense for meson, so we don't have anything to source it
  from.
- {LDAP,UUID,ICU}_{LIBS,CFLAGS} - might some extension need them?


For some others I think it's ok to not have replacement. Would be good for
somebody to check my thinking though:

- LIBOBJS, PG_CRC32C_OBJS, TAS: Not needed because we don't build
  the server / PLs with the generated makefile
- ZIC: only needed to build tzdata as part of server build
- MSGFMT et al: translation doesn't appear to be supported by pgxs, correct?
- XMLLINT et al: docs don't seem to be supported by pgxs
- GENHTML et al: supporting coverage for pgxs-in-meson build doesn't seem worth 
it
- WINDRES: I don't think extensions are bothering to generate rc files on 
windows


I'll include an updated pgxs-compat patch in the next post of the series (in a
few hours).

Greetings,

Andres Freund




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Peter Geoghegan
On Sun, Sep 25, 2022 at 5:38 PM Andres Freund  wrote:
> # run just the main pg_regress tests against existing server
> meson test --setup running main/regress-running

> Peter, would this address your use case?

I tried out your v16 patchset, which seems to mostly work as I'd hoped
it would. Some feedback:

* I gather that "running" as it appears in commands like "meson test
--setup running" refers to a particular setup named "running", that
you invented as part of creating a meson-ish substitute for
installcheck. Can "running" be renamed to something that makes it
obvious that it's a Postgres thing, and not a generic meson thing?

Maybe some kind of consistent naming convention would work best here.
This setup could be "pg_against_running_server", or something along
those lines.

* It would be nice if failed tests told me exactly which "diffs" file
I needed to look at, without my having to look for the message through
the meson log (or running with -v). Is this possible?

To be fair I should probably just be running -v when I run tests
against an existing running server, anyway -- so maybe I'm asking for
the wrong thing. It would at least be slightly better if I always got
to see a path to a .diffs file for failed tests, even without -v. But
it's just a "nice to have" thing -- it's not worth going out of your
way to make it work like that.

* Just FYI, there are considerations about the libpq that we link to
here (actually this isn't particularly related to the new installcheck
work, but thought I'd mention it in passing).

I'm using Debian Unstable here. Like Nathan, I found that I needed a
custom -Dextra_lib_dirs just so that binaries would link against the
installation's own libpq, rather than the system libpq. This is
important-ish because linking to the wrong libpq means that I get an
error about not being able to connect via trust authentication to a
unix socket from the directory /var/run/postgresql -- confusion over
where to look for sockets visibly breaks many things.

The workaround that I have is fine, but this still seems like
something that should "just work". I believe that there is a pending
patch for this already, so enough said here.

> I think it'd make sense to add a few toplevel targets to run tests in certain
> ways, but I've not done that here.

I usually run "installcheck-world" (not just installcheck) when I want
to do a generic smoke test with Valgrind. Sometimes that will fail
relatively early for very silly reasons, for example because I don't
have exactly the expected plan in some harmless way (I try to account
for this by running Valgrind in a shellscript that tries to match
"make check", but that doesn't always work). It is nice that I won't
have to worry about such minor issues derailing everything for a long
running and unsupervised Valgrind test. (Maybe I could have worked
around this before now, but I guess I never tried.)

More generally, I think that we should be encouraging users to think
of the tests as something that you can run in any order. People should
be encouraged to think in terms of the meson abstractions, such as
test setups.

I found that "meson test --setup running --list" will show me what
tests I'll be running if I want to do "installcheck" style testing,
without having to run any tests at all -- another small but important
improvement. This seems worth drawing attention to on the meson Wiki
page as a non-obvious improvement over "installcheck". I might even
find it useful to hard-code some of these tests in a shellscript, that
runs only a subset of "--setup running" tests that happen to be
interesting for Valgrind testing right now.

BTW the meson wiki page encourages users to think of "meson setup"
and "meson configure" as equivalent to autoconf configure. I get why
you explained it like that, but that confused me at first. What I
since figured out (which will be absurdly obvious to you) is that you
really need to decouple the generic from the specific -- very much
unlike autoconf. I found it useful to separate stuff that I know will
never change for a given build directory (such as the prefix install
path) from other things that are variable configuration settings
(things like the optimization level used by GCC). I now have a
scripted way of running "meson setup" for the former stuff (which is
generic), and a scripted way of running "meson configure" for the
latter set of stuff (with variations for "standard" release and debug
builds, building Valgrind, etc).

-- 
Peter Geoghegan




Re: identifying the backend that owns a temporary schema

2022-09-26 Thread Tom Lane
Nathan Bossart  writes:
> On Sat, Sep 24, 2022 at 01:41:38PM -0400, Tom Lane wrote:
>> One thing I don't like about it documentation-wise is that it leaves
>> the concept of backend ID pretty much completely undefined.

> How specific do you think this definition ought to be?

Fairly specific, I think, so that people can reason about how it behaves.
Notably, it seems absolutely critical to be clear that the IDs recycle
over short time frames.  Maybe like

These access functions use the session's backend ID number, which is
a small integer that is distinct from the backend ID of any concurrent
session, although an ID can be recycled as soon as the session exits.
The backend ID is used, among other things, to identify the session's
temporary schema if it has one.

I'd prefer to use the terminology "session" than "backend" in the
definition.  I suppose we can't get away with actually calling it
a "session ID" given that "backend ID" is used in so many places;
but I think people have a clearer handle on what a session is.
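
For concreteness, the temporary-schema tie-in is just the backend ID
interpolated into the schema name - roughly what namespace.c does:

	char		namespaceName[NAMEDATALEN];

	snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d",
			 MyBackendId);

so a recycled ID also means a recycled pg_temp_N schema.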

regards, tom lane




Re: HOT chain validation in verify_heapam()

2022-09-26 Thread Robert Haas
On Sat, Sep 24, 2022 at 8:45 AM Himanshu Upadhyaya
 wrote:
> Here our objective is to validate that, if both the predecessor's xmin
> and the current tuple's xmin are the same, then the predecessor's cmin
> must be less than the current tuple's cmin. When both tuples' xmins are
> the same, I think the predecessor's t_cid will always hold a combo CID.
> Then either one or both tuples will always have a combo CID, and
> skipping this check when "either tuple has a combo CID" would make this
> if condition always evaluate to false.

Fair point. I think maybe we should just remove the CID validation
altogether. I thought initially that it would be possible to have a
newer update with a numerically lower combo CID, but after some
experimentation I don't see a way to do it. However, it also doesn't
seem very useful to me to check that the combo CIDs are in ascending
order. I mean, even if that's not supposed to happen and does anyway,
there aren't really any enduring consequences, because command IDs are
ignored completely outside of the transaction that performed the
operation originally. So even if the combo CIDs were set to completely
random values, is that really corruption? At most it messes things up
for the duration of one transaction. And if we just have plain CIDs
rather than combo CIDs, the same thing is true: they could be totally
messed up and it wouldn't really matter beyond the lifetime of that
one transaction.

Also, it would be a lot more tempting to check this if we could check
it in all cases, but we can't. If a tuple is inserted in transaction
T1 and then updated twice in transaction T2, we'll have only one combo
CID and nothing to compare it against, nor any way to decode what CMIN
and CMAX it originally represented. And this is probably a pretty
common type of case.

>> +if (TransactionIdEquals(pred_xmin, curr_xmin) &&
>> +(TransactionIdEquals(curr_xmin, curr_xmax) ||
>> + !TransactionIdIsValid(curr_xmax)) && pred_cmin >= 
>> curr_cmin)
>>
>> I don't understand the reason for the middle part of this condition --
>> TransactionIdEquals(curr_xmin, curr_xmax) ||
>> !TransactionIdIsValid(curr_xmax). I suppose the comment is meant to
>> explain this, but I still don't get it. If a tuple with XMIN 12345
>> CMIN 2 is updated to produce a tuple with XMIN 12345 CMIN 1, that's
>> corruption, regardless of what the XMAX of the second tuple may happen
>> to be.
>
> tuple | t_xmin | t_xmax | t_cid | t_ctid | tuple_data_split | heap_tuple_infomask_flags
> ------+--------+--------+-------+--------+------------------+---------------------------
>     1 |    971 |    971 |     0 | (0,3)  | {"\\x1774657374312020202020","\\x0100"} | ("{HEAP_HASVARWIDTH,HEAP_COMBOCID,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_HOT_UPDATED}",{})
>     2 |    971 |      0 |     1 | (0,2)  | {"\\x1774657374322020202020","\\x0200"} | ("{HEAP_HASVARWIDTH,HEAP_XMAX_INVALID}",{})
>     3 |    971 |    971 |     1 | (0,4)  | {"\\x1774657374322020202020","\\x0100"} | ("{HEAP_HASVARWIDTH,HEAP_COMBOCID,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE}",{})
>     4 |    971 |    972 |     0 | (0,5)  | {"\\x1774657374332020202020","\\x0100"} | ("{HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE}",{})
>     5 |    972 |      0 |     0 | (0,5)  | {"\\x1774657374342020202020","\\x0100"} | ("{HEAP_HASVARWIDTH,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE}",{})
>
> In the above case, tuples 1->3->4 are inserted and updated by xid 971,
> and tuple 4 is next updated by xid 972. Here the t_cid of tuple 4 is 0
> whereas its predecessor's t_cid is 1, because tuple 4's t_cid holds the
> command ID of the deleting transaction (cmax); that is why we need to
> check the tuple's xmax.
>
> Please correct me if I am missing anything here.

Hmm, I see, so basically you're trying to check whether the CID field
contains a CMIN as opposed to a CMAX. But I'm not sure this test is
entirely reliable, because heap_prepare/execute_freeze_tuple() can set
a tuple's xmax to InvalidTransactionId even after it's had some other
value, and that won't do anything to the contents of the CID field.
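
To spell the hazard out in code (a sketch of the reasoning with names
from htup_details.h, not proposed patch code):

#include "postgres.h"

#include "access/htup_details.h"
#include "access/transam.h"

/*
 * The inference under discussion: treat the stored CID as a cmin only
 * when xmax is invalid and no combo CID is in use.  Freezing can reset
 * xmax to InvalidTransactionId without touching t_cid, so t_cid may
 * still hold a stale cmax and this inference can be fooled.
 */
static bool
cid_field_looks_like_cmin(HeapTupleHeader htup)
{
	return !TransactionIdIsValid(HeapTupleHeaderGetRawXmax(htup)) &&
		(htup->t_infomask & HEAP_COMBOCID) == 0;
}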

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: identifying the backend that owns a temporary schema

2022-09-26 Thread Nathan Bossart
On Mon, Sep 26, 2022 at 03:50:09PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> On Sat, Sep 24, 2022 at 01:41:38PM -0400, Tom Lane wrote:
>>> One thing I don't like about it documentation-wise is that it leaves
>>> the concept of backend ID pretty much completely undefined.
> 
>> How specific do you think this definition ought to be?
> 
> Fairly specific, I think, so that people can reason about how it behaves.
> Notably, it seems absolutely critical to be clear that the IDs recycle
> over short time frames.  Maybe like
> 
> These access functions use the session's backend ID number, which is
> a small integer that is distinct from the backend ID of any concurrent
> session, although an ID can be recycled as soon as the session exits.
> The backend ID is used, among other things, to identify the session's
> temporary schema if it has one.
> 
> I'd prefer to use the terminology "session" than "backend" in the
> definition.  I suppose we can't get away with actually calling it
> a "session ID" given that "backend ID" is used in so many places;
> but I think people have a clearer handle on what a session is.

Thanks for the suggestion.  I used it in v4 of the patch.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From d8d32b20e79654732b3d5c99d39a2c463271207c Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 12 Aug 2022 23:07:37 -0700
Subject: [PATCH v4 1/1] Adjust pg_stat_get_backend_*() to use backends' PGPROC
 backend IDs.

Presently, these functions use the index of the backend's entry in
localBackendStatusTable as the backend ID.  While this might bear
some resemblance to the backend ID of the backend's PGPROC struct,
it can quickly diverge as sessions connect and disconnect.  This
change modifies the pg_stat_get_backend_*() suite of functions to
use the PGPROC backend IDs instead.  While we could instead
introduce a new function for retrieving PGPROC backend IDs,
presenting two sets of backend IDs to users seems likely to cause
even more confusion than what already exists.

This is primarily useful for discovering the session that owns a
resource named with its PGPROC backend ID.  For example, temporary
schema names include the PGPROC backend ID, and it might be
necessary to identify the session that owns a temporary table that
is putting the cluster in danger of transaction ID wraparound.
---
 doc/src/sgml/monitoring.sgml| 12 ---
 src/backend/utils/activity/backend_status.c | 40 +++--
 src/backend/utils/adt/pgstatfuncs.c | 11 +++---
 src/include/utils/backend_status.h  |  8 +
 4 files changed, 59 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 1d9509a2f6..4ca17e9f6f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5488,10 +5488,13 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
information.  In such cases, an older set of per-backend statistics
access functions can be used; these are shown in .
-   These access functions use a backend ID number, which ranges from one
-   to the number of currently active backends.
+   These access functions use the session's backend ID number, which is a small
+   integer that is distinct from the backend ID of any concurrent session,
+   although an ID can be recycled as soon as the session exits.  The backend ID
+   is used, among other things, to identify the session's temporary schema if
+   it has one.
The function pg_stat_get_backend_idset provides a
-   convenient way to generate one row for each active backend for
+   convenient way to list all the active backends' ID numbers for
invoking these functions.  For example, to show the PIDs and
current queries of all backends:
 
@@ -5526,8 +5529,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
 setof integer


-Returns the set of currently active backend ID numbers (from 1 to the
-number of active backends).
+Returns the set of currently active backend ID numbers.

   
 
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index c7ed1e6d7a..159d022070 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -849,6 +849,7 @@ pgstat_read_current_status(void)
 			BackendIdGetTransactionIds(i,
 	   &localentry->backend_xid,
 	   &localentry->backend_xmin);
+			localentry->backend_id = i;
 
 			localentry++;
 			localappname += NAMEDATALEN;
@@ -1045,6 +1046,22 @@ pgstat_get_my_query_id(void)
 	return MyBEEntry->st_query_id;
 }
 
+/* --
+ * cmp_lbestatus
+ *
+ *	Comparison function for bsearch() on an array of LocalPgBackendStatus.  The
+ *	backend_id field is used to compare the arguments.
+ *
+ * --
+ */
+static int
+cmp_lbestatus(const void *a, const void *b)
+{
+	Lo

Re: [PATCH] Introduce array_shuffle() and array_sample()

2022-09-26 Thread Tom Lane
Martin Kalcher  writes:
> [ v4-0001-Introduce-array_shuffle-and-array_sample.patch ]

I think this idea of exporting drandom()'s PRNG for all and sundry
to use is completely misguided.  If we go down that path we'll
be right back in the swamp that we were in when we used random(3),
namely that (a) it's not clear what affects what, and (b) to the
extent that there are such interferences, it could be bad, maybe
even a security problem.  We very intentionally decoupled drandom()
from the rest of the world at commit 6645ad6bd, and I'm not ready
to unlearn that lesson.

With our current PRNG infrastructure it doesn't cost much to have
a separate PRNG for every purpose.  I don't object to having
array_shuffle() and array_sample() share one PRNG, but I don't
think it should go much further than that.
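
As a sketch of what such a shared generator could look like (using the
pg_prng.h API; the names below are illustrative, not from the patch):

#include "postgres.h"

#include <time.h>

#include "common/pg_prng.h"

/* one private stream for array_shuffle() and array_sample() only */
static pg_prng_state array_prng_state;
static bool array_prng_seeded = false;

static void
array_prng_ensure_seeded(void)
{
	if (!array_prng_seeded)
	{
		/* prefer a strong seed; fall back to the clock if unavailable */
		if (!pg_prng_strong_seed(&array_prng_state))
			pg_prng_seed(&array_prng_state, (uint64) time(NULL));
		array_prng_seeded = true;
	}
}

A Fisher-Yates shuffle would then draw each index with
pg_prng_uint64_range(&array_prng_state, i, n - 1), never touching
drandom()'s state.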

regards, tom lane




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Robert Haas
On Mon, Sep 26, 2022 at 3:16 PM Wolfgang Walther
 wrote:
> Robert Haas:
> > I don't think we're going to be very happy if we redefine inheriting
> > the privileges of another role to mean inheriting only some of them.
> > That seems pretty counterintuitive to me. I also think that this
> > particular definition is pretty fuzzy.
>
> Scratch my previous suggestion. A new, less fuzzy definition would be:
> Ownership is not a privilege itself and as such not inheritable.
>
> When role A is granted to role B, two things happen:
> 1. Role B now has the right to use the GRANTed privileges of role A.
> 2. Role B now has the right to become role A via SET ROLE.
>
> WITH SET controls whether point 2 is the case or not.
>
> WITH INHERIT controls whether role B actually executes their right to
> use those privileges ("inheritance") **and** whether the set role is
> done implicitly for anything that requires ownership, but of course only
> WITH SET TRUE.

If I'm understanding correctly, this would amount to a major
redefinition of what it means to inherit privileges, and I think the
chances of such a change being accepted are approximately zero.
Inheriting privileges needs to keep meaning what it means now, namely,
you inherit all the rights of the granted role.

> > Here, though, it doesn't really seem simple enough to explain in one
> > sentence, nor does it seem easy to reason about.
>
> I think the "ownership is not inheritable" idea is easy to explain.

I don't. And even if I did think it were easy to explain, I don't
think it would be a good idea. One of my first patches to PostgreSQL
added a grantable TRUNCATE privilege to tables. I think that, under
your proposed definitions, the addition of this privilege would have
had the result that a role grant would cease to allow the recipient to
truncate tables owned by the granted role. There is currently a
proposal on the table to make VACUUM and ANALYZE grantable permissions
on tables, which would have the same issue. I think that if I made it
so that adding such privileges resulted in role inheritance not
working for those operations any more, people would come after me with
pitchforks. And I wouldn't blame them: that sounds terrible.

I think the only thing we should be discussing here is how to tighten
up the tests for operations in categories (1) and (2) in my original
email. The options so far proposed are: (a) do nothing, which makes
the proposed SET option on grants a lot less useful; (b) restrict
those operations by has_privs_of_role(), basically making them
dependent on the INHERIT option, (c) restrict them by
has_privs_of_role() || member_can_set_role(), requiring either the
INHERIT option or the SET option, or (d) restrict them by
member_can_set_role() only, i.e. making them depend on the SET option
alone. A broader reworking of what the INHERIT option means is not on
the table: I don't want to write a patch for it, I don't think it's a
good idea, and I don't think the community would accept it even if I
did want to write a patch for it and even if I did think it was a good
idea.

I would like to hear more opinions on that topic. I understand your
vote from among those four options to be (d). I do not know what
anyone else thinks.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Andres Freund
Hi,

On 2022-09-26 12:47:14 -0700, Peter Geoghegan wrote:
> On Sun, Sep 25, 2022 at 5:38 PM Andres Freund  wrote:
> > # run just the main pg_regress tests against existing server
> > meson test --setup running main/regress-running
> 
> > Peter, would this address your use case?
> 
> I tried out your v16 patchset, which seems to mostly work as I'd hoped
> it would.

Thanks & cool.


> Some feedback:
> * I gather that "running" as it appears in commands like "meson test
> --setup running" refers to a particular setup named "running", that
> you invented as part of creating a meson-ish substitute for
> installcheck. Can "running" be renamed to something that makes it
> obvious that it's a Postgres thing, and not a generic meson thing?

Yes. The only caveat is that it makes lines longer, because it's included in
the printed test line (there's no real requirement to have the test suite and
the setup named the same, but it seems confusing not to)


> Maybe some kind of consistent naming convention would work best here.
> This setup could be "pg_against_running_server", or something along
> those lines.



> * It would be nice if failed tests told me exactly which "diffs" file
> I needed to look at, without my having to look for the message through
> the meson log (or running with -v). Is this possible?

You can use --print-errorlogs to print the log output iff a test fails. It's a
bit painful that some tests have very verbose output :(.  I don't really see a
way that meson can help us here, this is pretty much on "our" end.


> BTW the meson wiki page iencourages users to think of "meson setup"
> and "meson configure" as equivalent to autoconf configure. I get why
> you explained it like that, but that confused me at first. What I
> since figured out (which will be absurdly obvious to you) is that you
> really need to decouple the generic from the specific -- very much
> unlike autoconf. I found it useful to separate stuff that I know will
> never change for a given build directory (such as the prefix install
> path) from other things that are variable configuration settings
> (things like the optimization level used by GCC). I now have a
> scripted way of running "meson setup" for the former stuff (which is
> generic), and a scripted way of running "meson configure" for the
> latter set of stuff (with variations for "standard" release and debug
> builds, building Valgrind, etc).

Hm. I'm not entirely sure what you mean here. The only thing that you can't
change in an existing build-dir with meson configure is the compiler.

I personally have different types of build dirs set up in parallel
(e.g. assert, optimize, assert-32, assert-w64). I'll occasionally
enable/disable

Greetings,

Andres Freund




Re: [RFC] building postgres with meson - v13

2022-09-26 Thread Peter Geoghegan
On Mon, Sep 26, 2022 at 1:27 PM Andres Freund  wrote:
> > Some feedback:
> > * I gather that "running" as it appears in commands like "meson test
> > --setup running" refers to a particular setup named "running", that
> > you invented as part of creating a meson-ish substitute for
> > installcheck. Can "running" be renamed to something that makes it
> > obvious that it's a Postgres thing, and not a generic meson thing?
>
> Yes. The only caveat is that it makes lines longer, because it's included in
> the printed test line (there's no real requirement to have the test suite and
> the setup named the same, but it seems confusing not to)

Probably doesn't have to be too long. And I'm not sure of the details.
Just a small thing from my point of view.

> > * It would be nice if failed tests told me exactly which "diffs" file
> > I needed to look at, without my having to look for the message through
> > the meson log (or running with -v). Is this possible?
>
> You can use --print-errorlogs to print the log output iff a test fails. It's a
> bit painful that some tests have very verbose output :(.  I don't really see a
> way that meson can help us here, this is pretty much on "our" end.

Makes sense. Thanks.

> Hm. I'm not entirely sure what you mean here. The only thing that you can't
> change in an existing build-dir with meson configure is the compiler.

I do understand that it doesn't particularly matter to meson itself.
The point I was making was about how I personally find it convenient
to set those things that I know will never change in practice during
"meson setup", while doing everything else using "meson configure".
using shell scripts. I will very occasionally have to run "meson
setup" via a zsh function anyway, so why not couple that process with
the process of setting "immutable for this build directory" settings?

With autoconf, I will run one of various zsh functions that run
configure in some specific way -- there are various "build types",
such as debug, release, and Valgrind. But with meson it makes sense to
split it in two -- have a generic zsh function for generic once-off
build directory setup (including even the mkdir), that also sets
generic, "immutable" settings, and a specialized zsh function that
changes things in a way that is particular to that kind of build (like
whether asserts are enabled, optimization level, and so on).

> I personally have different types of build dirs set up in parallel
> (e.g. assert, optimize, assert-32, assert-w64). I'll occasionally
> enable/disable

I know that other experienced hackers do it that way. I have found
that ccache works well enough that I don't feel the need for multiple
build directories per branch.

Perhaps I've assumed more than I should about my approach being
broadly representative. It might ultimately be easier to just have
multiple build directories per branch/source directory -- one per
"build type" per branch. That has the advantage of not requiring each
"build type" zsh function to remember to reset anything that might
have been set by one of its sibling zsh functions for some other build
type (there is no need to "reset the setting to its default"). That
approach is more like scripting autoconf/configure would be, in that
you basically never change any settings for a given build directory in
practice (you maybe invent a new kind of build type instead, or you
update the definition of an existing standard build type based on a
new requirement for that build type).

It's really handy that meson lets you quickly change one setting
against an existing build directory. I'm slightly worried that that
will allow me to shoot myself in the foot, though. Perhaps I'll change
some exotic setting in an ad-hoc way, and then forget to unset it
afterwards, leading to (say) a mysterious performance degradation for
what is supposed to be one of my known standard build types. There is
no risk of that with my autoconf/configure workflow, because I'll
rerun the relevant configure zsh function before long anyway, making
it impossible for me to accidentally keep something that I never meant
to keep.

I like being able to throw everything away and quickly rebuild "from
scratch" (in reality rebuild using ccache and a cache for configure)
due to superstition/defensive paranoia/learned helplessness. This has
always worked well enough because ccache works fairly well. I'm not
sure how useful that kind of mindset will be with meson just yet, or
if I'm just thinking about it in the wrong way, so forgive me for
rambling like this.

-- 
Peter Geoghegan




Re: Pluggable toaster

2022-09-26 Thread Nikita Malakhov
Hi,
The meson build for the patchset failed, so meson build files are
attached. The README/Doc package is reworked with a more detailed
explanation of the virtual function table, along with other
corrections.

On Sun, Sep 25, 2022 at 1:41 AM Nikita Malakhov  wrote:

> Hi hackers!
> The last patchset has an invalid patch file - v16-0003-toaster-docs.patch.
> Here's a corrected patchset,
> sorry for the noise.
> sorry for the noise.
>
> On Sat, Sep 24, 2022 at 3:50 PM Nikita Malakhov  wrote:
>
>> Hi hackers!
>>
>> Cfbot is still not happy with the patchset, so I'm attaching a rebased
>> one, rebased onto the current
>> master (from today). The third patch contains documentation package, and
>> the second one contains large
>> README.toastapi file providing additional in-depth docs for developers.
>>
>> Comments would be greatly appreciated.
>>
>> Again, after checking patch sources I have a strong opinion that it needs
>> some refactoring -
>> move all files related to TOAST implementation into new folder
>> /backend/access/toast where
>> Generic (default) Toaster resides.
>>
>> Patchset consists of:
>> v16-0001-toaster-interface.patch - Pluggable TOAST API interface along
>> with reference TOAST mechanics;
>> v16-0002-toaster-default.patch - Default TOAST re-implemented using
>> Toaster API;
>> v16-0003-toaster-docs.patch - Pluggable TOAST API documentation package
>>
>> Actual GitHub branch resides at
>> https://github.com/postgrespro/postgres/tree/toasterapi_clean
>>
>> On Fri, Sep 23, 2022 at 10:54 PM Nikita Malakhov 
>> wrote:
>>
>>> Hi hackers!
>>>
>>> Cfbot is not happy with previous patchset, so I'm attaching new one,
>>> rebased onto current master
>>> (15b4). Also providing patch with documentation package, and the second
>>> one contains large
>>> README.toastapi file providing additional in-depth docs for developers.
>>>
>>> Comments would be greatly appreciated.
>>>
>>> Also, after checking patch sources I have a strong opinion that it needs
>>> some refactoring -
>>> move all files related to TOAST implementation into new folder
>>> /backend/access/toast where
>>> Generic (default) Toaster resides.
>>>
>>> Patchset consists of:
>>> v15-0001-toaster-interface.patch - Pluggable TOAST API interface along
>>> with reference TOAST mechanics;
>>> v15-0002-toaster-default.patch - Default TOAST re-implemented using
>>> Toaster API;
>>> v15-0003-toaster-docs.patch - Pluggable TOAST API documentation package
>>>
>>> On Tue, Sep 13, 2022 at 7:50 PM Jacob Champion 
>>> wrote:
>>>
 On Mon, Sep 12, 2022 at 11:45 PM Nikita Malakhov 
 wrote:
 > It would be more clear for complex data types like JSONB, where
 developers could
 > need some additional functionality to work with internal
 representation of data type,
 > and its full potential is revealed in our JSONB toaster extension.
 The JSONB toaster
 > is still in development but we plan to make it available soon.

 Okay. It'll be good to have that, because as it is now it's hard to
 see the whole picture.

 > On installing dummy_toaster contrib: I've just checked it by making a
 patch from commit
 > and applying onto my clone of master and 2 patches provided in
 previous email without
 > any errors and all checks passed - applying with git am, configure
 with debug, cassert,
 > depend and enable-tap-tests flags and run checks.
 > Please advise what would cause such behavior?

 I don't think the default pg_upgrade tests will upgrade contrib
 objects (there are instructions in src/bin/pg_upgrade/TESTING that
 cover manual dumps, if you prefer that method). My manual steps were
 roughly

 =# CREATE EXTENSION dummy_toaster;
 =# CREATE TABLE test (t TEXT
 STORAGE external
 TOASTER dummy_toaster_handler);
 =# \q
 $ initdb -D newdb
 $ pg_ctl -D olddb stop
 $ pg_upgrade -b <old-install>/bin -B <new-install>/bin -d
 ./olddb -D ./newdb

 (where <new-install>/bin is on the PATH, so we're using the right
 binaries).

 Thanks,
 --Jacob

>>>
>>>
>>> --
>>> Regards,
>>> Nikita Malakhov
>>> Postgres Professional
>>> https://postgrespro.ru/
>>>
>>
>>
>> --
>> Regards,
>> Nikita Malakhov
>> Postgres Professional
>> https://postgrespro.ru/
>>
>
>
> --
> Regards,
> Nikita Malakhov
> Postgres Professional
> https://postgrespro.ru/
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


v18-0002-toaster-default.patch.gz
Description: GNU Zip compressed data


v18-0003-toaster-docs.patch.gz
Description: GNU Zip compressed data


v18-0001-toaster-interface.patch.gz
Description: GNU Zip compressed data


Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-09-26 Thread Nathan Bossart
On Mon, Sep 26, 2022 at 08:33:53PM +0530, Bharath Rupireddy wrote:
> Irrespective of what Windows does with file pointers in WriteFile(),
> should we add lseek(SEEK_SET) in our own pwrite()'s implementation,
> something like [5]? This is rather hackish without fully knowing what
> Windows does internally in WriteFile(), but this does fix inherent
> issues that our pwrite() callers (there are quite a number of places
> that use pwrite() and presume the file pointer doesn't change on Windows)
> may have on Windows. See the regression tests passing [6] with the fix
> [5].

I think so.  I don't see why we would rather have each caller ensure
pwrite() behaves as documented.

> +	/*
> +	 * On Windows, WriteFile() is found to change the file pointer,
> +	 * and we want pwrite() not to.  Hence, we explicitly reset the
> +	 * file pointer to the beginning of the file.
> +	 */
> +	if (lseek(fd, 0, SEEK_SET) != 0)
> +	{
> +		_dosmaperr(GetLastError());
> +		return -1;
> +	}
> +
> 	return result;
>  }

Why reset to the beginning of the file?  Shouldn't we reset it to what it
was before the call to pwrite()?
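
For illustration, restoring the prior position might look like this
(untested sketch; the wrapper name is made up, and on POSIX systems
pwrite() already leaves the position alone, so this only shows the
save/restore idea):

#include <unistd.h>

static ssize_t
pwrite_keep_position(int fd, const void *buf, size_t nbytes, off_t offset)
{
    off_t       save;
    ssize_t     result;

    /* remember the caller's current file position */
    save = lseek(fd, 0, SEEK_CUR);
    if (save < 0)
        return -1;

    result = pwrite(fd, buf, nbytes, offset);

    /* restore the position, even if the write succeeded */
    if (lseek(fd, save, SEEK_SET) < 0)
        return -1;

    return result;
}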

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: First draft of the PG 15 release notes

2022-09-26 Thread David Rowley
On Tue, 13 Sept 2022 at 09:31, Jonathan S. Katz  wrote:
> Separately, per[1], including dense_rank() in the list of window
> functions with optimizations (dense-rank.diff).

This one might have been forgotten... ? I can push it shortly if nobody objects.

David

> [1]
> https://www.postgresql.org/message-id/CAApHDvpr6N7egNfSttGdQMfL%2BKYBjUb_Zf%2BrHULb7_2k4V%3DGGg%40mail.gmail.com




Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-09-26 Thread Thomas Munro
On Tue, Sep 27, 2022 at 10:27 AM Nathan Bossart
 wrote:
> On Mon, Sep 26, 2022 at 08:33:53PM +0530, Bharath Rupireddy wrote:
> > Irrespective of what Windows does with file pointers in WriteFile(),
> > should we add lseek(SEEK_SET) in our own pwrite()'s implementation,
> > something like [5]? This is rather hackish without fully knowing what
> > Windows does internally in WriteFile(), but this does fix inherent
> > issues that our pwrite() callers (there are quite a number of places
> > that use pwrite() and presume the file pointer doesn't change on Windows)
> > may have on Windows. See the regression tests passing [6] with the fix
> > [5].
>
> I think so.  I don't see why we would rather have each caller ensure
> pwrite() behaves as documented.

I don't think so, that's an extra kernel call.  I think I'll just have
to revert part of my recent change that removed the pg_ prefix from
those function names in our code, and restore the comment that warns
you about the portability hazard (I thought it went away with HP-UX
10, where we were literally calling lseek() before every write()).
The majority of users of these functions don't intermix them with
calls to read()/write(), so they don't care about the file position,
so I think it's just something we'll have to continue to be mindful of
in the places that do.

Unless, that is, I can find a way to stop it from doing that...  I've
added this to my Windows to-do list.  I am going to have a round of
Windows hacking quite soon.




Re: First draft of the PG 15 release notes

2022-09-26 Thread Tom Lane
David Rowley  writes:
> On Tue, 13 Sept 2022 at 09:31, Jonathan S. Katz  wrote:
>> Separately, per[1], including dense_rank() in the list of window
>> functions with optimizations (dense-rank.diff).

> This one might have been forgotten... ? I can push it shortly if nobody 
> objects.

Yeah, I missed that one.  We're theoretically in the wrap freeze for
15rc1, but I don't have a problem with release-note changes.

regards, tom lane




Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-09-26 Thread Nathan Bossart
On Tue, Sep 27, 2022 at 10:37:38AM +1300, Thomas Munro wrote:
> I don't think so, that's an extra kernel call.  I think I'll just have
> to revert part of my recent change that removed the pg_ prefix from
> those function names in our code, and restore the comment that warns
> you about the portability hazard (I thought it went away with HP-UX
> 10, where we were literally calling lseek() before every write()).
> The majority of users of these functions don't intermix them with
> calls to read()/write(), so they don't care about the file position,
> so I think it's just something we'll have to continue to be mindful of
> in the places that do.

Ah, you're right, it's probably best to avoid the extra system call for the
majority of callers that don't care about the file position.  I retract my
previous message.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: First draft of the PG 15 release notes

2022-09-26 Thread David Rowley
On Tue, 27 Sept 2022 at 10:45, Tom Lane  wrote:
>
> David Rowley  writes:
> > On Tue, 13 Sept 2022 at 09:31, Jonathan S. Katz  
> > wrote:
> >> Separately, per[1], including dense_rank() in the list of window
> >> functions with optimizations (dense-rank.diff).
>
> > This one might have been forgotten... ? I can push it shortly if nobody 
> > objects.
>
> Yeah, I missed that one.  We're theoretically in the wrap freeze for
> 15rc1, but I don't have a problem with release-note changes.

Thanks. I've just pushed it.

David




Re: Enables to call Unregister*XactCallback() in Call*XactCallback()

2022-09-26 Thread Tom Lane
Andres Freund  writes:
> On 2022-03-29 14:48:54 +0800, Hao Wu wrote:
>> It's a natural requirement to unregister the callback for transaction or
>> subtransaction when the callback is invoked, so we don't have to
>> unregister the callback somewhere else.

> You normally shouldn'd need to do this frequently - what's your use case?
> UnregisterXactCallback() is O(N), so workloads registering / unregistering a
> lot of callbacks would be problematic.

It'd only be slow if you had a lot of distinct callbacks registered
at the same time, which doesn't sound like a common situation.

>> Luckily, we just need a few lines of code to support this feature,
>> by saving the next pointer before calling the callback.

> That seems reasonable...

Yeah.  Whether it's efficient or not, seems like it should *work*.
I'm a bit inclined to call this a bug-fix and backpatch it.
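
For the record, the usage pattern this would promise to support is a
one-shot callback along these lines (hypothetical code, not part of
the patch):

static void
one_shot_xact_callback(XactEvent event, void *arg)
{
    if (event == XACT_EVENT_COMMIT || event == XACT_EVENT_ABORT)
    {
        /* ... perform the module's end-of-transaction cleanup ... */

        /* safe once CallXactCallbacks() saves item->next first */
        UnregisterXactCallback(one_shot_xact_callback, arg);
    }
}

having been registered earlier with
RegisterXactCallback(one_shot_xact_callback, arg).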

I went looking for other occurrences of this code in places that have
an unregister function, and found one in ResourceOwnerReleaseInternal,
so I think we should fix that too.  Also, a comment seems advisable;
that leads me to the attached v2.

regards, tom lane

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 2bb975943c..c1ffbd89b8 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -3656,9 +3656,14 @@ static void
 CallXactCallbacks(XactEvent event)
 {
 	XactCallbackItem *item;
+	XactCallbackItem *next;
 
-	for (item = Xact_callbacks; item; item = item->next)
+	for (item = Xact_callbacks; item; item = next)
+	{
+		/* allow callbacks to unregister themselves when called */
+		next = item->next;
 		item->callback(event, item->arg);
+	}
 }
 
 
@@ -3713,9 +3718,14 @@ CallSubXactCallbacks(SubXactEvent event,
 	 SubTransactionId parentSubid)
 {
 	SubXactCallbackItem *item;
+	SubXactCallbackItem *next;
 
-	for (item = SubXact_callbacks; item; item = item->next)
+	for (item = SubXact_callbacks; item; item = next)
+	{
+		/* allow callbacks to unregister themselves when called */
+		next = item->next;
 		item->callback(event, mySubid, parentSubid, item->arg);
+	}
 }
 
 
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index ece5d98261..37b43ee1f8 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -501,6 +501,7 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 	ResourceOwner child;
 	ResourceOwner save;
 	ResourceReleaseCallbackItem *item;
+	ResourceReleaseCallbackItem *next;
 	Datum		foundres;
 
 	/* Recurse to handle descendants */
@@ -701,8 +702,12 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 	}
 
 	/* Let add-on modules get a chance too */
-	for (item = ResourceRelease_callbacks; item; item = item->next)
+	for (item = ResourceRelease_callbacks; item; item = next)
+	{
+		/* allow callbacks to unregister themselves when called */
+		next = item->next;
 		item->callback(phase, isCommit, isTopLevel, item->arg);
+	}
 
 	CurrentResourceOwner = save;
 }


Re: Enables to call Unregister*XactCallback() in Call*XactCallback()

2022-09-26 Thread Nathan Bossart
On Mon, Sep 26, 2022 at 06:05:34PM -0400, Tom Lane wrote:
> Yeah.  Whether it's efficient or not, seems like it should *work*.
> I'm a bit inclined to call this a bug-fix and backpatch it.
> 
> I went looking for other occurrences of this code in places that have
> an unregister function, and found one in ResourceOwnerReleaseInternal,
> so I think we should fix that too.  Also, a comment seems advisable;
> that leads me to the attached v2.

LGTM.  I have no opinion on back-patching.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Reducing the chunk header sizes on all memory context types

2022-09-26 Thread David Rowley
On Tue, 20 Sept 2022 at 13:23, Tom Lane  wrote:
>
> David Rowley  writes:
> > Aside from that, I don't have any ideas on how to get rid of the
> > possible additional datumCopy() from non-Var arguments to these window
> > functions.  Should we just suffer it? It's quite likely that most
> > arguments to these functions are plain Vars anyway.
>
> No, we shouldn't.  I'm pretty sure that we have various window
> functions that are deliberately designed to take advantage of the
> no-copy behavior, and that they have taken a significant speed
> hit from your having disabled that optimization.  I don't say
> that this is enough to justify reverting the chunk header changes
> altogether ... but I'm completely not satisfied with the current
> situation in HEAD.

Looking more closely at window_gettupleslot(), it always allocates the
tuple in ecxt_per_query_memory, so any column we fetch out of that
tuple will be in that memory context.  window_gettupleslot() is used
in lead(), lag(), first_value(), last_value() and nth_value() to fetch
the Nth tuple out of the partition window. The other window functions
all return BIGINT, FLOAT8 or INT, which are byval on 64-bit, and on
32-bit these functions return a freshly palloc'd Datum in the
CurrentMemoryContext.

Maybe we could remove the datumCopy() from eval_windowfunction() and
also document that a window function, when returning a non-byval
type, must allocate the Datum in either ps_ExprContext's
ecxt_per_tuple_memory or ecxt_per_query_memory. We could ensure any
extension which has its own window functions gets the memo about the
API change by adding an Assert to ensure that the return value (for
byref types) is in the current context, by calling the
loop-over-the-blocks version of MemoryContextContains().
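
In eval_windowfunction() that might look roughly like this (a sketch
of the tail of that function only, assuming the loop-over-the-blocks
MemoryContextContains() is kept for assert builds):

*result = FunctionCallInvoke(fcinfo);
*isnull = fcinfo->isnull;

/*
 * No unconditional datumCopy() anymore: a byref result must already
 * be in a context that lives long enough, so just verify that in
 * assert-enabled builds.
 */
Assert(fcinfo->isnull || perfuncstate->resulttypeByVal ||
       MemoryContextContains(CurrentMemoryContext,
                             DatumGetPointer(*result)));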

This would mean that wfuncs like lead(column_name) would no longer do
that extra datumCopy and the likes of lead(col || 'some OpExpr') would
save a little as we'd no longer call MemoryContextContains on
non-Assert builds.

David




[Commitfest 2022-09] Last days

2022-09-26 Thread Ibrar Ahmed
Just a reminder that only a few days are left of the "September 2022
commitfest". As of now, there are 295 patches in total. Out of these,
18 patches require committer attention and 167 need review.

Total: 295.
Needs review: 167.
Waiting on Author: 44.
Ready for Committer: 18.
Committed: 50.
Moved to next CF: 3.
Returned with Feedback: 3.
Rejected: 2.
Withdrawn: 8.


In the last days of the commitfest, I will perform these activities:

- For patches marked "Waiting on Author" that have at least one review,
  set them to "Returned with Feedback" and send the appropriate email
- For patches marked "Needs review":
  * If the patch received at least one good review, move it to the next CF
    (removing the current reviewer reservation)
  * Otherwise, leave it pending

-- 
Ibrar Ahmed


GUC tables - use designated initializers

2022-09-26 Thread Peter Smith
Hi hackers,

A number of the GUC tables are indexed by enums. This all relies on
the elements being carefully arranged in the same order as those
enums, and there are comments to say which enum index belongs to each
table element.

But why not use designated initializers to enforce what the comments
are hoping for?
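
As a minimal illustration with a made-up enum, a designated
initializer makes the compiler, rather than a comment, responsible for
keeping the table aligned:

typedef enum DemoOption
{
    DEMO_FIRST,
    DEMO_SECOND,
    DEMO_THIRD
} DemoOption;

/*
 * Each entry is pinned to its enum value, so reordering the array
 * elements (or adding new enum members) can no longer silently
 * misalign the table.
 */
static const char *const demo_option_names[] = {
    [DEMO_FIRST] = "first",
    [DEMO_SECOND] = "second",
    [DEMO_THIRD] = "third",
};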

~~

PSA a patch for the same.

Doing this also exposed a minor typo in the comments.
"ERROR_HANDLING" -> "ERROR_HANDLING_OPTIONS"

Furthermore, with this change the GUC table elements can be rearranged
into any order - e.g. alphabetical - if that would be useful (my patch
does not do this).

~~

In passing, I also made a 0002 patch to remove some inconsistent
whitespace noticed in those config tables.

Thoughts?

--
Kind Regards,
Peter Smith.
Fujitsu Australia.


v1-0001-GUC-tables-used-designated-initializers.patch
Description: Binary data


v1-0002-GUC-tables-remove-unnecessary-whitespace.patch
Description: Binary data


GUC values - recommended way to declare the C variables?

2022-09-26 Thread Peter Smith
Hi hackers.

I have a question about the recommended way to declare the C variables
used for the GUC values.

Here are some examples from the code:

~

The GUC boot values are defined in src/backend/utils/misc/guc_tables.c.

e.g. see the 4 and 2 below:

{
	{"max_logical_replication_workers",
		PGC_POSTMASTER,
		REPLICATION_SUBSCRIBERS,
		gettext_noop("Maximum number of logical replication worker processes."),
		NULL,
	},
	&max_logical_replication_workers,
	4, 0, MAX_BACKENDS,
	NULL, NULL, NULL
},

{
	{"max_sync_workers_per_subscription",
		PGC_SIGHUP,
		REPLICATION_SUBSCRIBERS,
		gettext_noop("Maximum number of table synchronization workers per subscription."),
		NULL,
	},
	&max_sync_workers_per_subscription,
	2, 0, MAX_BACKENDS,
	NULL, NULL, NULL
},

~~

Meanwhile, the associated C variables are declared in their respective modules.

e.g. src/backend/replication/launcher.c

int max_logical_replication_workers = 4;
int max_sync_workers_per_subscription = 2;

~~

It seems confusing to me that for the above code the initial value is
"hardwired" in multiple places. Specifically, it looks tempting to
just change the variable declaration value, but IIUC that's going to
achieve nothing because it will just be overwritten by the
"boot-value" during the GUC mechanism start-up.

Furthermore, there seems to be no consistency in how these C variables
are initialized:

a) Sometimes the static variable is assigned some (dummy?) value that
is not the same as the boot value
- See src/backend/utils/misc/guc_tables.c, max_replication_slots boot
value is 10
- See src/backend/replication/slot.c, int max_replication_slots = 0;

b) Sometimes the static value is assigned the same hardwired value as
the GUC boot value
- See src/backend/utils/misc/guc_tables.c,
max_logical_replication_workers boot value is 4
- See src/backend/replication/launcher.c, int
max_logical_replication_workers = 4;

c) Sometimes the GUC C variables don't even have a comment saying that
they are GUC variables, so it is not at all obvious their initial
values are going to get overwritten by some external mechanism.
- See src/backend/replication/launcher.c, int
max_logical_replication_workers = 4;

~

I would like to know what is the recommended way/convention to write
the C variable declarations for the GUC values.

IMO the launcher.c code shown above would be greatly improved simply
by starting with 0 values and including an explanatory comment.

e.g.

CURRENT
int max_logical_replication_workers = 4;
int max_sync_workers_per_subscription = 2;

SUGGESTION
/*
 * GUC variables. Initial values are assigned at startup via
InitializeGUCOptions.
 */
int max_logical_replication_workers = 0;
int max_sync_workers_per_subscription = 0;


Thoughts?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

2022-09-26 Thread Michael Paquier
On Mon, Sep 26, 2022 at 05:10:16PM +0900, Kyotaro Horiguchi wrote:
> - if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
> + if (state->started_in_recovery == true &&
> + backup_stopped_in_recovery == false)
> 
> Using == for Booleans may not be great?

Yes.  That's why 7d70809 does not use the way proposed by the previous
patch.
--
Michael




Re: GUC values - recommended way to declare the C variables?

2022-09-26 Thread Tom Lane
Peter Smith  writes:
> It seems confusing to me that for the above code the initial value is
> "hardwired" in multiple places. Specifically, it looks tempting to
> just change the variable declaration value, but IIUC that's going to
> achieve nothing because it will just be overwritten by the
> "boot-value" during the GUC mechanism start-up.

Well, if you try that you'll soon discover it doesn't work ;-)

IIRC, the primary argument for hand-initializing GUC variables is to
ensure that they have a sane value even before InitializeGUCOptions runs.
Obviously, that only matters for some subset of the GUCs that could be
consulted very early in startup ... but it's not perfectly clear just
which ones it matters for.

> a) Sometimes the static variable is assigned some (dummy?) value that
> is not the same as the boot value
> - See src/backend/utils/misc/guc_tables.c, max_replication_slots boot
> value is 10
> - See src/backend/replication/slot.c, int max_replication_slots = 0;

That seems pretty bogus.  I think if we're not initializing a GUC to
the "correct" value then we should just leave it as not explicitly
initialized.

> c) Sometimes the GUC C variables don't even have a comment saying that
> they are GUC variables, so it is not at all obvious their initial
> values are going to get overwritten by some external mechanism.

That's flat out sloppy commenting.  There are a lot of people around
here who seem to think comments are optional :-(

> SUGGESTION
> /*
>  * GUC variables. Initial values are assigned at startup via
> InitializeGUCOptions.
>  */
> int max_logical_replication_workers = 0;
> int max_sync_workers_per_subscription = 0;

1. Comment far wordier than necessary.  In most places we just
annotate these as "GUC variables", and I think that's sufficient.
You're going to have a hard time getting people to write more
than that anyway.

2. I don't agree with explicitly initializing to a wrong value.
It'd be sufficient to do

int max_logical_replication_workers;
int max_sync_workers_per_subscription;

which would also make it clearer that initialization happens
through some other mechanism.

regards, tom lane




Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

2022-09-26 Thread Michael Paquier
On Mon, Sep 26, 2022 at 11:57:58AM +0530, Bharath Rupireddy wrote:
> +1 because callers don't use returned StringInfo structure outside of
> build_backup_content(). The patch looks good to me.

Thanks for looking.

> I think it will be
> good to add a note about the caller freeing up the returned string's
> memory [1], just in case.

Not sure that this is worth it.  It is fine to use palloc() in a
dedicated memory context, for one.
--
Michael




Re: [small patch] Change datatype of ParallelMessagePending from "volatile bool" to "volatile sig_atomic_t"

2022-09-26 Thread Michael Paquier
On Mon, Sep 26, 2022 at 04:50:36PM +0900, Michael Paquier wrote:
> You are right.  bool is not usually a problem in a signal handler, but
> sig_atomic_t is the type we ought to use.  I'll go adjust that.

Done this one.  I have scanned the code, but did not notice a similar
mistake.  It is worth noting that we have only one remaining "volatile
bool" in the headers now.
--
Michael




Re: GUC values - recommended way to declare the C variables?

2022-09-26 Thread Peter Smith
On Tue, Sep 27, 2022 at 10:08 AM Tom Lane  wrote:
>
> Peter Smith  writes:
> > It seems confusing to me that for the above code the initial value is
> > "hardwired" in multiple places. Specifically, it looks tempting to
> > just change the variable declaration value, but IIUC that's going to
> > achieve nothing because it will just be overwritten by the
> > "boot-value" during the GUC mechanism start-up.
>
> Well, if you try that you'll soon discover it doesn't work ;-)
>
> IIRC, the primary argument for hand-initializing GUC variables is to
> ensure that they have a sane value even before InitializeGUCOptions runs.
> Obviously, that only matters for some subset of the GUCs that could be
> consulted very early in startup ... but it's not perfectly clear just
> which ones it matters for.
>
> > a) Sometimes the static variable is assigned some (dummy?) value that
> > is not the same as the boot value
> > - See src/backend/utils/misc/guc_tables.c, max_replication_slots boot
> > value is 10
> > - See src/backend/replication/slot.c, int max_replication_slots = 0;
>
> That seems pretty bogus.  I think if we're not initializing a GUC to
> the "correct" value then we should just leave it as not explicitly
> initialized.
>
> > c) Sometimes the GUC C variables don't even have a comment saying that
> > they are GUC variables, so it is not at all obvious their initial
> > values are going to get overwritten by some external mechanism.
>
> That's flat out sloppy commenting.  There are a lot of people around
> here who seem to think comments are optional :-(
>
> > SUGGESTION
> > /*
> >  * GUC variables. Initial values are assigned at startup via
> > InitializeGUCOptions.
> >  */
> > int max_logical_replication_workers = 0;
> > int max_sync_workers_per_subscription = 0;
>
> 1. Comment far wordier than necessary.  In most places we just
> annotate these as "GUC variables", and I think that's sufficient.
> You're going to have a hard time getting people to write more
> than that anyway.
>
> 2. I don't agree with explicitly initializing to a wrong value.
> It'd be sufficient to do
>
> int max_logical_replication_workers;
> int max_sync_workers_per_subscription;
>
> which would also make it clearer that initialization happens
> through some other mechanism.
>

Thanks for your advice.

I will try to post a patch in the next few days to address (per your
suggestions) some of the variables that I am more familiar with.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: GUC tables - use designated initializers

2022-09-26 Thread Michael Paquier
On Tue, Sep 27, 2022 at 09:27:48AM +1000, Peter Smith wrote:
> But why not use designated initializers to enforce what the comments
> are hoping for?

This is a C99 thing as far as I understand, adding one safety net.
Why not for these cases..

> Doing this also exposed a minor typo in the comments.
> "ERROR_HANDLING" -> "ERROR_HANDLING_OPTIONS"

Right.
--
Michael




Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Michael Paquier
On Mon, Sep 26, 2022 at 04:39:36PM +0200, Peter Eisentraut wrote:
> On 26.09.22 13:14, Tom Lane wrote:
>> Bilal Yavuz  writes:
>> > It seems that kerberos is installed at the '/opt/homebrew/opt/krb5' path on
>> > ARM CPU darwin instances instead of the '/usr/local/opt/krb5' path.
>> I think this also needs to account for MacPorts, which would likely
>> put it under /opt/local/sbin.  (I wonder where /usr/local/opt/krb5
>> came from at all -- that sounds like somebody's manual installation
>> rather than a packaged one.)
> 
> /usr/local/opt/ is used by Homebrew on Intel macOS.

Hmm.  Is that the case with new setups under x86_64?  I have an M1
where everything goes through /opt/homebrew/, though it was set up
very recently.
--
Michael




Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-26 Thread Larry Rosenman

On 09/26/2022 8:25 pm, Michael Paquier wrote:
> On Mon, Sep 26, 2022 at 04:39:36PM +0200, Peter Eisentraut wrote:
>> On 26.09.22 13:14, Tom Lane wrote:
>>> Bilal Yavuz  writes:
>>>> It seems that kerberos is installed at the '/opt/homebrew/opt/krb5' path on
>>>> ARM CPU darwin instances instead of the '/usr/local/opt/krb5' path.
>>> I think this also needs to account for MacPorts, which would likely
>>> put it under /opt/local/sbin.  (I wonder where /usr/local/opt/krb5
>>> came from at all -- that sounds like somebody's manual installation
>>> rather than a packaged one.)
>>
>> /usr/local/opt/ is used by Homebrew on Intel macOS.
>
> Hmm.  Is that the case with new setups under x86_64?  I have an M1
> where everything goes through /opt/homebrew/, though it was set up
> very recently.
> --
> Michael


Intel:

❯ /usr/local/opt/krb5/bin/krb5-config --prefix
/usr/local/Cellar/krb5/1.20

Same on my M1 iMac (migrated from an Intel iMac however)

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106



