Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-13 Thread Michael Paquier
On Sun, Feb 13, 2022 at 02:55:26PM -0800, Andres Freund wrote:
> On 2022-02-14 11:23:18 +1300, Thomas Munro wrote:
>> Ahh, right.  I assume it still needs perl2host() treatment for MSYS2
>> systems, because jacana's log shows TESTDIR is set to a Unixoid path
>> that I assume pg_regress's runtime can't use.  That leads to the
>> attached.
> 
> Looks sane to me.

This looks like a nice cleanup, indeed.  Nice catch.
--
Michael


signature.asc
Description: PGP signature


Re: pg_receivewal.exe unhandled exception in zlib1.dll

2022-02-13 Thread Michael Paquier
On Fri, Feb 11, 2022 at 07:50:44AM -0800, Andres Freund wrote:
> On 2022-02-11 16:33:11 +0100, Juan José Santamaría Flecha wrote:
>> The problem comes from the file descriptor passed to gzdopen() in
>> 'src/bin/pg_basebackup/walmethods.c'. Using gzopen() instead solves the
>> issue without ifdefing for WIN32. Please find attached a patch doing so.

This looks wrong to me, as gzclose() would close() the file descriptor
we pass in via gzdopen(), and some code paths of walmethods.c rely on
that assumption, so this change would leak fds on repeated errors.
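
For reference, zlib's documentation says gzclose() also closes the
descriptor that was associated via gzdopen(), and that dup()'ing is the
way to keep a descriptor alive across gzclose().  A minimal standalone
sketch of that ownership rule (illustrative only, not walmethods.c code):

#include <fcntl.h>
#include <unistd.h>
#include <zlib.h>

/*
 * gzclose() closes the fd handed to gzdopen(), so a caller that wants
 * to keep managing the fd itself must hand over a dup() of it.  Without
 * the dup(), the close(fd) below would be a double close; with gzopen()
 * instead, the fd opened here would never be closed at all.
 */
static int
write_compressed(const char *path)
{
	int			fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	gzFile		gz;

	if (fd < 0)
		return -1;

	gz = gzdopen(dup(fd), "wb");	/* dup() keeps ownership of fd here */
	if (gz == NULL)
	{
		close(fd);
		return -1;
	}
	gzwrite(gz, "data", 4);
	gzclose(gz);				/* closes only the dup'd descriptor */
	close(fd);					/* fd is still ours to close */
	return 0;
}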

> I hit this as well. The problem is really caused by using a debug build of
> postgres vs a production build of zlib. They use different C runtimes, with
> different file descriptors. A lot of resources in the Windows world are
> unfortunately tied to the C runtime, and there can be multiple C runtimes in
> a single process.

It may be worth warning about that at build time, or perhaps just
documenting the matter?  I don't recall seeing any MSIs related to zlib
that were compiled with the debug options.
--
Michael


signature.asc
Description: PGP signature


Re: How did queensnake corrupt zic.o?

2022-02-13 Thread Tom Lane
Thomas Munro  writes:
> We see a successful compile and then a failure to read the file while
> linking.  We see that the animal got into that state recently and then
> fixed itself, and now it's back in that state.  I don't know if it's
> significant, but it happened to fix itself when a configure change
> came along, which might be explained by ccache invalidation; that is,
> the failure mode doesn't depend on the input files, but once it's
> borked you need a change to kick ccache.  My memory may be playing
> tricks on me but I vaguely recall seeing another animal do this, a
> while back.

queensnake's seen repeated cycles of unexplainable build failures.
I wonder if it's using a bogus ccache version, or if the machine
itself is flaky.

regards, tom lane




sockaddr_un.sun_len vs. reality

2022-02-13 Thread Tom Lane
For a very long time, the AIX 7.1 buildfarm members (sungazer and tern)
have complained about

ip.c:236:17: warning: conversion from 'long unsigned int' to 'uchar_t' {aka 'unsigned char'} changes value from '1025' to '1' [-Woverflow]

I'd ignored this as not being very interesting, but I got motivated to
recheck it today.  The warning is coming from

#ifdef HAVE_STRUCT_SOCKADDR_STORAGE_SS_LEN
unp->sun_len = sizeof(struct sockaddr_un);
#endif

so we can infer that the sun_len field is unsigned char and the
size of struct sockaddr_un doesn't fit.

Poking around a bit, I find that sun_len exists on most BSD-ish
platforms and it seems to be unsigned char (almost?) everywhere.
But on other platforms sizeof(struct sockaddr_un) is only a bit
over 100, so there's not an overflow problem.

It's not real clear that there's any problem on AIX either;
given that the regression tests pass, I strongly suspect that
nothing is paying attention to the sun_len field.  But if we're
going to fill it in at all, we should probably try to fill it
in with something less misleading than "1".

Googling finds that this seems to be a sore spot for various
people, eg [1] mentions the existence of the SUN_LEN() macro
on some platforms and then says why you shouldn't use it.

I'm leaning to adding a compile-time clamp on the value,
that is

unp->sun_len = Min(sizeof(struct sockaddr_un),
   (1 << (sizeof(unp->sun_len) * 8)) - 1);

(This might need a little bit of refinement if sun_len
could be as wide as int, but in any case it should still
reduce to a compile-time constant.)
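
As one possible shape for that refinement, here is a sketch that
computes the field's maximum without ever shifting by the type's full
width (1 << 32 would be undefined behavior if sun_len were as wide as
int).  FIELD_MAX is a hypothetical helper, not something we have today:

/* hypothetical helper: maximum value storable in an unsigned field */
#define FIELD_MAX(field) \
	((((1ULL << (sizeof(field) * 8 - 1)) - 1) << 1) + 1)

unp->sun_len = Min(sizeof(struct sockaddr_un), FIELD_MAX(unp->sun_len));

This still reduces to a compile-time constant in every case.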

Or we could use something involving strlen(unp->sun_path), perhaps
trying to use SUN_LEN() if it exists.  But that implies expending
extra run-time cycles for strlen(), and I've seen no indication
that there's any value here except suppressing a compiler warning.

Thoughts?

regards, tom lane

[1] http://mail-index.netbsd.org/tech-net/2006/10/11/0008.html




Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2022-02-13 Thread Kyotaro Horiguchi
At Mon, 14 Feb 2022 14:42:18 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> I'll resend a new version soon to avoid the confusion.

In this version, 0001 gets one fix and two comment updates.

-	 * Archive recovery have ended. Crash recovery ever after should always
+	 * Archive recovery has ended. Crash recovery ever after should always

-   /* the second term is just in case */
-   if (PriorRedoPtr != InvalidXLogRecPtr || RedoRecPtr > PriorRedoPtr)
+   /*
+* Update the average distance between checkpoints/restartpoints if the
+* prior checkpoint exists. The second term is just in case.
+*/
+   if (PriorRedoPtr != InvalidXLogRecPtr && RedoRecPtr > PriorRedoPtr)
UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr);

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 4f96baaece1cff35f89893d8e7aa1fdf57435f53 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 8 Feb 2022 16:42:53 +0900
Subject: [PATCH v10 1/5] Get rid of unused path to handle concurrent
 checkpoints

CreateRestartPoint considered the case where a concurrent checkpoint is
running. But 7ff23c6d27 eliminated the possibility that multiple
checkpoints run simultaneously.  That code path, if it were taken,
might leave the database unrecoverable by removing WAL segments that
are still required by the last established restartpoint.

In passing, the code in the function is tightened up and tidied up in
some places so that it is easier to read.
---
 src/backend/access/transam/xlog.c | 67 ---
 src/backend/postmaster/checkpointer.c |  1 -
 2 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 958220c495..8c2882b49f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9659,6 +9659,9 @@ CreateRestartPoint(int flags)
 	XLogSegNo	_logSegNo;
 	TimestampTz xtime;
 
+	/* we don't expect a concurrent checkpoint/restartpoint to run */
+	Assert(!IsUnderPostmaster || MyBackendType == B_CHECKPOINTER);
+
 	/* Get a local copy of the last safe checkpoint record. */
 	SpinLockAcquire(&XLogCtl->info_lck);
 	lastCheckPointRecPtr = XLogCtl->lastCheckPointRecPtr;
@@ -9724,7 +9727,7 @@ CreateRestartPoint(int flags)
 
 	/* Also update the info_lck-protected copy */
 	SpinLockAcquire(&XLogCtl->info_lck);
-	XLogCtl->RedoRecPtr = lastCheckPoint.redo;
+	XLogCtl->RedoRecPtr = RedoRecPtr;
 	SpinLockRelease(&XLogCtl->info_lck);
 
 	/*
@@ -9743,7 +9746,12 @@ CreateRestartPoint(int flags)
 	/* Update the process title */
 	update_checkpoint_display(flags, true, false);
 
-	CheckPointGuts(lastCheckPoint.redo, flags);
+	CheckPointGuts(RedoRecPtr, flags);
+
+	/*
+	 * Update pg_control, using current time.
+	 */
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 
 	/*
 	 * Remember the prior checkpoint's redo ptr for
@@ -9751,30 +9759,23 @@ CreateRestartPoint(int flags)
 	 */
 	PriorRedoPtr = ControlFile->checkPointCopy.redo;
 
+	ControlFile->checkPoint = lastCheckPointRecPtr;
+	ControlFile->checkPointCopy = lastCheckPoint;
+
 	/*
-	 * Update pg_control, using current time.  Check that it still shows
-	 * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing;
-	 * this is a quick hack to make sure nothing really bad happens if somehow
-	 * we get here after the end-of-recovery checkpoint.
+	 * Ensure minRecoveryPoint is past the checkpoint record while archive
+	 * recovery is still ongoing.  Normally, this will have happened already
+	 * while writing out dirty buffers, but not necessarily - e.g. because no
+	 * buffers were dirtied.  We do this because a non-exclusive base backup
+	 * uses minRecoveryPoint to determine which WAL files must be included in
+	 * the backup, and the file (or files) containing the checkpoint record
+	 * must be included, at a minimum. Note that for an ordinary restart of
+	 * recovery there's no value in having the minimum recovery point any
+	 * earlier than this anyway, because redo will begin just after the
+	 * checkpoint record.
 	 */
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&
-		ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
+	if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY)
 	{
-		ControlFile->checkPoint = lastCheckPointRecPtr;
-		ControlFile->checkPointCopy = lastCheckPoint;
-
-		/*
-		 * Ensure minRecoveryPoint is past the checkpoint record.  Normally,
-		 * this will have happened already while writing out dirty buffers,
-		 * but not necessarily - e.g. because no buffers were dirtied.  We do
-		 * this because a non-exclusive base backup uses minRecoveryPoint to
-		 * determine which WAL files must be included in the backup, and the
-		 * file (or files) containing the checkpoint record must be included,
-		 * at a minimum. Note that for an ordinary restart of recovery there's
-		 * no value in having the minimum recovery point any 

Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2022-02-13 Thread Kyotaro Horiguchi
At Mon, 14 Feb 2022 14:40:22 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> It doesn't apply even on pg13 (due to LSN_FORMAT_ARGS).  I will make
> the per-version patches if you are fine with this.

Oops!  I forgot to rename the patch to avoid confusion on CF-bots.
I'll resend a new version soon to avoid the confusion.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2022-02-13 Thread Kyotaro Horiguchi
At Fri, 11 Feb 2022 01:00:03 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2022/02/09 11:52, Kyotaro Horiguchi wrote:
> > 0001: The fix of CreateRestartPoint
> 
> This patch might be ok for the master branch. But since concurrent
> checkpoint and restartpoint can happen in v14 or before, we would need
> another patch based on that assumption, for the backport. How about
> fixing the bug all the branches at first, then apply this patch in the
> master to improve the code?

For the backbranches, the attached patch for pg14 does part of the full
patch.  Of the following, I think we should do (a) and (b) to make
future backpatching easier.

a) Use RedoRecPtr and PriorRedoPtr after they are assigned.

b) Move assignment to PriorRedoPtr into the ControlFileLock section.

c) Skip update of minRecoveryPoint only when the checkpoint gets old.

d) Skip call to UpdateCheckPointDistanceEstimate() when RedoRecPtr <=
  PriorRedoPtr.

# Mmm. The v9-0001 patch contains a silly mistake here.

Still, I'm not sure whether that case really happens and how checkpoint
behaves *after* that happens, but at least it protects the database from
the possible unrecoverable state due to the known issue here.

It doesn't apply even on pg13 (due to LSN_FORMAT_ARGS).  I will make
the per-version patches if you are fine with this.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From df516b487617c570332ea076b91773b810629a11 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 14 Feb 2022 13:04:33 +0900
Subject: [PATCH v2] Correctly update control file at the end of archive
 recovery

CreateRestartPoint runs WAL file cleanup based on the checkpoint that
has just finished in the function.  If the database has exited the
DB_IN_ARCHIVE_RECOVERY state by the time the function is going to
update the control file, the function refrains from updating the file
at all, then proceeds to WAL cleanup using the latest REDO LSN, which
is now inconsistent with the control file.  As a result, the subsequent
cleanup procedure removes more WAL files than the control file allows
and leaves the database unrecoverable until the next checkpoint
finishes.

Since all buffers have been flushed out, we may safely regard the
restart point as established regardless of the recovery state.  Thus we
may and should update the checkpoint LSNs in the control file in that
case.  Still, we update minRecoveryPoint only during archive recovery,
and explicitly clear it if we have exited recovery.

In addition to that fix, this commit makes some cosmetic changes
consistent with the changes we are going to make on the master branch.
---
 src/backend/access/transam/xlog.c | 81 ---
 1 file changed, 53 insertions(+), 28 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6208e123e5..28c3c4b7cf 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9653,7 +9653,7 @@ CreateRestartPoint(int flags)
 
 	/* Also update the info_lck-protected copy */
 	SpinLockAcquire(&XLogCtl->info_lck);
-	XLogCtl->RedoRecPtr = lastCheckPoint.redo;
+	XLogCtl->RedoRecPtr = RedoRecPtr;
 	SpinLockRelease(&XLogCtl->info_lck);
 
 	/*
@@ -9672,7 +9672,10 @@ CreateRestartPoint(int flags)
 	/* Update the process title */
 	update_checkpoint_display(flags, true, false);
 
-	CheckPointGuts(lastCheckPoint.redo, flags);
+	CheckPointGuts(RedoRecPtr, flags);
+
+	/* Update pg_control */
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 
 	/*
 	 * Remember the prior checkpoint's redo ptr for
@@ -9681,52 +9684,74 @@ CreateRestartPoint(int flags)
 	PriorRedoPtr = ControlFile->checkPointCopy.redo;
 
 	/*
-	 * Update pg_control, using current time.  Check that it still shows
-	 * DB_IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing;
-	 * this is a quick hack to make sure nothing really bad happens if somehow
-	 * we get here after the end-of-recovery checkpoint.
+	 * Update pg_control, using current time if no later checkpoints have been
+	 * performed.
 	 */
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&
-		ControlFile->checkPointCopy.redo < lastCheckPoint.redo)
+	if (PriorRedoPtr < RedoRecPtr)
 	{
 		ControlFile->checkPoint = lastCheckPointRecPtr;
 		ControlFile->checkPointCopy = lastCheckPoint;
 		ControlFile->time = (pg_time_t) time(NULL);
 
 		/*
-		 * Ensure minRecoveryPoint is past the checkpoint record.  Normally,
-		 * this will have happened already while writing out dirty buffers,
-		 * but not necessarily - e.g. because no buffers were dirtied.  We do
-		 * this because a non-exclusive base backup uses minRecoveryPoint to
-		 * determine which WAL files must be included in the backup, and the
-		 * file (or files) containing the checkpoint record must be included,
-		 * at a minimum. Note that for an ordinary restart of recovery there's
-		 * no value in having the minimum recovery point any earlier than this
+		 * Ensure minRecoveryPoint is past the checkpoint record while 

How did queensnake corrupt zic.o?

2022-02-13 Thread Thomas Munro
Hi,

We see a successful compile and then a failure to read the file while
linking.  We see that the animal got into that state recently and then
fixed itself, and now it's back in that state.  I don't know if it's
significant, but it happened to fix itself when a configure change
came along, which might be explained by ccache invalidation; that is,
the failure mode doesn't depend on the input files, but once it's
borked you need a change to kick ccache.  My memory may be playing
tricks on me but I vaguely recall seeing another animal do this, a
while back.




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-13 Thread Dilip Kumar
On Sun, Feb 13, 2022 at 12:04 PM Dilip Kumar  wrote:
>
> On Sun, Feb 13, 2022 at 10:12 AM Dilip Kumar  wrote:

> Next, I am planning to do some more tests, where we are having pgbench
> running and concurrently we do CREATEDB maybe every 1 minute and see
> what is the CREATEDB time as well as what is the impact on pgbench
> performance.  Because currently I have only measured CREATEDB time but
> we must be knowing the impact of createdb on the other system as well.

I have done tests with pgbench as well.  Basically, I did not notice
any significant difference in the TPS; I was expecting some difference
due to the checkpoint on head, so maybe I need to test with more
backends.  But in createdb time there is a huge difference.  I think
this is because the template1 db is very small, so with the patch it
completes in no time, whereas head takes a long time because of the
large number of dirty shared buffers (due to the concurrent pgbench).

config:
echo "logging_collector=on" >> data/postgresql.conf
echo "port = 5432" >> data/postgresql.conf
echo "max_wal_size=64GB" >> data/postgresql.conf
echo "checkpoint_timeout=15min" >> data/postgresql.conf
echo "shared_buffers=32GB" >> data/postgresql.conf

Test:
./pgbench -i -s 1000 postgres
./pgbench -c 32 -j 32 -T 1200 -M prepared postgres >> result.txt
-- Concurrently run the script below every 1 minute
CREATE DATABASE mydb log_copied_blocks=true/false;

Results:
- Pgbench TPS: Did not observe any difference head vs patch
- Create db time (very small template):
head:  21000 ms to 42000 ms (at different times)
patch: 80 ms

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-13 Thread Michael Paquier
On Sat, Feb 12, 2022 at 12:10:46PM -0800, Andres Freund wrote:
> Tap tests currently are always executed in the source directory above t/, as
> far as I can tell. So I don't think VPATH/non-VPATH or windows / non-windows
> would break using a relative reference from the test dir.

That would mean relying on $ENV{PWD} for this job, with something
like this:
-# Find the location of postgresql.conf.sample, based on the information
-# provided by pg_config.
-my $sample_file =
-  $node->config_data('--sharedir') . '/postgresql.conf.sample';
+my $rootdir = $ENV{PWD} . "/../../../..";
+my $sample_file = "$rootdir/src/backend/utils/misc/postgresql.conf.sample";

A run through the CI shows that this is working, and it also works
with Debian's patch for the config data.  I still tend to prefer the
readability of a TAP test over the main regression test suite for this
facility.  Even if we could use the same trick with PG_ABS_SRCDIR, I'd
rather not add this kind of dependency on a file in the source tree in
src/test/regress/.  Do others have any opinions on the matter?
--
Michael


signature.asc
Description: PGP signature


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-13 Thread Dilip Kumar
On Sun, Feb 13, 2022 at 9:56 PM Robert Haas  wrote:
>
> On Sun, Feb 13, 2022 at 1:34 AM Dilip Kumar  wrote:
> > test4:
> > 32 GB shared buffers, template DB size = 10GB, dirty shared buffer=70%
> > Head: 47656 ms
> > Patch: 79767 ms
>
> This seems like the most surprising result of the bunch. Here, the
> template DB is both small enough to fit in shared_buffers and small
> enough not to trigger a checkpoint all by itself, and yet the patch
> loses.

Well, this is not really surprising to me, because what I have noticed
is that with the new approach the createdb time is completely
dependent on the template db size.  So if the source db size is 10GB
it takes around 80 sec, and the shared buffers size does not have a
major impact.  Maybe a very small shared_buffers setting has more
impact, so I will test that as well.

>
> Did you checkpoint between one test and the next, or might this test
> have been done after a bunch of WAL had already been written since the
> last checkpoint so that the 10GB pushed it over the edge?

Not really: I am running each test with a fresh initdb, so that could
not be an issue.

> BTW, you have test4 twice in your list of results.

My bad, those are different tests.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Add client connection check during the execution of the query

2022-02-13 Thread Thomas Munro
On Fri, Jan 14, 2022 at 7:30 PM Thomas Munro  wrote:
> On Fri, Jan 14, 2022 at 4:35 PM Andres Freund  wrote:
> > The more I think about it, the less I see why we *ever* need to re-arm the
> > latch in pq_check_connection() in this approach. pq_check_connection() is 
> > only
> > used from from ProcessInterrupts(), and there's plenty things inside
> > ProcessInterrupts() that can cause latches to be reset (e.g. parallel 
> > message
> > processing causing log messages to be sent to the client, causing network 
> > IO,
> > which obviously can do a latch reset).
>
> Thanks for the detailed explanation.  I guess I was being overly
> cautious and a little myopic, "leave things exactly the way you found
> them", so I didn't have to think about any of that.  I see now that
> the scenario I was worrying about would be a bug in whatever
> latch-wait loop happens to reach this code.  Alright then, here is
> just... one... more... patch, this time consuming any latch that gets
> in the way and retrying, with no restore.

And pushed.

My excuse for taking so long to get this into the tree is that it was
tedious to retest this thing across so many OSes and determine that it
really does behave reliably for killed processes AND lost
processes/yanked cables/keepalive timeout, even with buffered data.
In the process I learned a bit more about TCP and got POLLRDHUP added
to FreeBSD (not that it matters for PostgreSQL 15, now that we can use
EV_EOF).  As for the FD_CLOSE behaviour I thought I saw on Windows
upthread: it was a mirage, caused by the RST thing.  There may be some
other way to implement this feature on that TCP implementation, but I
don't know what it is.




Re: Consistently use "startup process"/"WAL sender" instead of "Startup process"/"WAL Sender" in comments and docs.

2022-02-13 Thread Bharath Rupireddy
On Sun, Feb 13, 2022 at 5:49 PM John Naylor
 wrote:
>
> On Sun, Feb 13, 2022 at 3:13 PM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > In the code comments, it is being used as "Startup process" in the
> > middle of sentences, whereas in most places "startup process" is
> > used. Also, "WAL Sender" is being used instead of "WAL sender".
> > Let's be consistent across the docs and code comments.
>
> FWIW, docs need to hold to a higher standard than code comments.

Thanks. In general, I agree that the docs and error/log messages
(user-facing) need to be held to a higher standard, but at the same
time so do the code comments, IMHO. Because many hackers/developers
consider code comments a great place to learn the internals, being
consistent there does no harm.

Regards,
Bharath Rupireddy.




Re: buildfarm warnings

2022-02-13 Thread Thomas Munro
On Mon, Feb 14, 2022 at 6:10 AM Tom Lane  wrote:
> Another thing I've been ignoring on the grounds that I-don't-feel-
> like-fixing-this is that the Solaris and OpenIndiana machines whine
> about every single reference from loadable modules to the core code,
> eg this from haddock while building contrib/fuzzystrmatch:
>
> Undefined   first referenced
>  symbol in file
> cstring_to_text dmetaphone.o
> varstr_levenshtein  fuzzystrmatch.o
> text_to_cstring dmetaphone.o
> errstart_cold   fuzzystrmatch.o
> errmsg  fuzzystrmatch.o
> palloc  dmetaphone.o
> ExceptionalConditionfuzzystrmatch.o
> errfinish   fuzzystrmatch.o
> varstr_levenshtein_less_equal   fuzzystrmatch.o
> repallocdmetaphone.o
> errcode fuzzystrmatch.o
> errmsg_internal fuzzystrmatch.o
> pg_detoast_datum_packed dmetaphone.o
> ld: warning: symbol referencing errors
>
> Presumably, this means that the Solaris linker would really like
> to see the executable the module will load against, analogously to
> the BE_DLLLIBS settings we use on some other platforms such as macOS.
> It evidently still works without that, but we might be losing
> performance from an extra indirection or the like.  And in any case,
> this amount of noise would be very distracting if you wanted to do
> actual development on that platform.  I'm not especially interested
> in tracking down the correct incantation, but I'd gladly push a patch
> if someone submits one.

I took a quick look at this on an OpenIndiana vagrant image, using the
GNU compiler and [Open]Solaris's linker.  It seems you can't use -l
for this (it only likes files called libX.Y) but the manual[1] says
that "shared objects that reference symbols from an application can
use the -z defs option, together with defining the symbols by using an
extern mapfile directive", so someone might need to figure out how to
build and feed in such a file...

[1] https://docs.oracle.com/cd/E26502_01/html/E26507/chapter2-90421.html
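
Based on that manual section, the incantation would presumably be an -M
mapfile passed alongside -z defs at module link time, declaring the
backend-supplied symbols as EXTERN.  An untested sketch, with the
symbol list abbreviated from the warnings above:

$mapfile_version 2
SYMBOL_SCOPE {
	global:
		palloc			{ FLAGS = EXTERN };
		errmsg			{ FLAGS = EXTERN };
		errfinish		{ FLAGS = EXTERN };
		cstring_to_text		{ FLAGS = EXTERN };
};

Generating such a file automatically for every exported backend symbol
is the part someone would need to figure out.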




Re: Possible uninitialized use of the variables (src/backend/access/transam/twophase.c)

2022-02-13 Thread Michael Paquier
On Thu, Feb 10, 2022 at 11:38:44AM +0900, Michael Paquier wrote:
> The proposed change is incomplete anyway once you consider this
> argument.  First, there is no need to set up those fields in
> EndPrepare() anymore if there is no origin data, no?  It would be good
> to comment that these are filled in EndPrepare(), I guess, once you do
> the initialization in StartPrepare().

That still looked better on a fresh look in terms of consistency, so
applied this way.
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Add TAP test to automate the equivalent of check_guc

2022-02-13 Thread Michael Paquier
On Sat, Feb 12, 2022 at 05:31:55PM +0100, Christoph Berg wrote:
> To support multiple major versions in parallel, we need
> /usr/lib/postgresql/NN/lib and /usr/share/postgresql/NN, so it's not a
> single $PREFIX. But you are correct, and ./configure supports that.

Yep, I don't quite see the problem, but I don't have the years of
experience in terms of packaging that you have, either.  (NB: I use
various --libdir and --bindir combinations for some of the PG
packaging I maintain for pg_upgrade, but that's not as complex as
Debian, I guess.)

> I was confusing that with this: The problem that led to the pg_config
> patch years ago was that we have a /usr/bin/pg_config in
> (non-major-version-dependant) libpq-dev, and
> /usr/lib/postgresql/NN/bin/pg_config in the individual
> postgresql-server-dev-NN packages, and iirc the /usr/bin version
> didn't particularly like the other binaries being in
> /usr/lib/postgresql/NN/bin/.
> 
> I guess it's time to revisit that problem now and see if it can be
> solved more pretty today on our side.

If you can solve that, my take is that it could make things nicer in
the long-run for some of the TAP test facilities.  Still, I'll try to
look at a solution for this thread that does not interact badly with
what you are doing, and I am going to grab your patch when testing
things.
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Add support for server-side LZ4 base backup compression.

2022-02-13 Thread Michael Paquier
On Sun, Feb 13, 2022 at 11:13:51AM -0500, Robert Haas wrote:
> Well, that's because I didn't realize that LZ4 might be set to
> something that doesn't work at all. So Michael's thing worked, but
> because it (in my view) fixed the problem in a more surprising place
> than I would have preferred, I made a commit later that turned out to
> break the buildfarm. So you can blame either one of us that you like.
> 
> Thanks, Michael, for preparing a patch.

The patch was slightly wrong, as the tests of pg_verifybackup still
failed once I moved the lz4 command out of the way in my environment,
because LZ4 gets set to an empty value.  I have checked this case
locally, and applied the patch to add the ./configure check, so
copperhead should get back to green.

One last thing, mentioned by Andres upthread, is that we should be
able to remove the extra commands run with --version in the tests of
pg_basebackup, as in the attached.  I have not done that yet, as it
seems better to wait for copperhead first; the tests of pg_basebackup
run before pg_verifybackup, so I don't want to mask one error with
another in case the buildfarm says something.
--
Michael
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index a9b2e090b1..7c3074329d 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -768,8 +768,7 @@ SKIP:
 	my $gzip = $ENV{GZIP_PROGRAM};
 	skip "program gzip is not found in your system", 1
 	  if ( !defined $gzip
-		|| $gzip eq ''
-		|| system_log($gzip, '--version') != 0);
+		|| $gzip eq '');
 
 	my $gzip_is_valid =
 	  system_log($gzip, '--test', @zlib_files, @zlib_files2, @zlib_files3);
diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
index 0e6e685aa6..1717e22a9c 100644
--- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl
+++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl
@@ -130,8 +130,7 @@ SKIP:
 	my $gzip = $ENV{GZIP_PROGRAM};
 	skip "program gzip is not found in your system", 1
 	  if ( !defined $gzip
-		|| $gzip eq ''
-		|| system_log($gzip, '--version') != 0);
+		|| $gzip eq '');
 
 	my $gzip_is_valid = system_log($gzip, '--test', @zlib_wals);
 	is($gzip_is_valid, 0,
@@ -186,8 +185,7 @@ SKIP:
 	my $lz4 = $ENV{LZ4};
 	skip "program lz4 is not found in your system", 1
 	  if ( !defined $lz4
-		|| $lz4 eq ''
-		|| system_log($lz4, '--version') != 0);
+		|| $lz4 eq '');
 
 	my $lz4_is_valid = system_log($lz4, '-t', @lz4_wals);
 	is($lz4_is_valid, 0,


signature.asc
Description: PGP signature


Re: [PATCH] nodeindexscan with reorder memory leak

2022-02-13 Thread Alexander Korotkov
On Thu, Feb 10, 2022 at 2:35 AM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > I've rechecked that the now patched code branch is covered by
> > regression tests.  I think the memory leak issue is independent of the
> > computational errors we've observed.
> > So, I'm going to push and backpatch this if no objections.
>
> +1.  We should work on the roundoff-error issue as well, but
> as you say, it's an independent problem.

Pushed, thank you.

--
Regards,
Alexander Korotkov




Re: pgsql: Add support for server-side LZ4 base backup compression.

2022-02-13 Thread Michael Paquier
On Sat, Feb 12, 2022 at 11:12:50PM -0500, Tom Lane wrote:
> Michael Paquier  writes:
>> Well, this somebody is I, and the buildfarm did not blow up on any of
>> that so that looked rather fine.
> 
> Eh?  copperhead for one is failing for exactly this reason:
> 
> Bailout called.  Further testing stopped:  failed to execute command
> "lz4 -d -m
> /home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4":
> No such file or directory

Well, I had put in place a check in the TAP tests that Robert has
managed to break.  It looks like it was wrong of me to assume that
it would be fine as-is.  Sorry about that.

>> Adding a few cycles for this check
>> is fine by me.  What do you think of the attached?
> 
> Looks OK as far as it goes ... but do we need anything on the
> MSVC side?

It is possible to tweak the environment variable, so one can unset it.
If we make things consistent with the configure check, we could tweak
the default to not be "lz4" if we don't have --with-lz4 in the MSVC
configuration?  I don't recall seeing an installer for LZ4 in the wild
where we would have the command but not the libraries, so perhaps
that's not worth the trouble anyway, and we don't have any code paths
in the tests where we use LZ4 without --with-lz4.

> Also, I can't help wondering why we have both GZIP_PROGRAM and GZIP
> variables.  I suppose that's a separate issue though.

From gzip's man page:
"The obsolescent environment variable GZIP can hold a set of default
options for gzip."

This means that saving the command into a variable with the same name
interacts with the command launch.  This was discussed here:
https://www.postgresql.org/message-id/nr63G0R6veCrLY30QMwxYA2aCawclWJopZiKEDN8HtKwIoIqZ79fVmSE2GxTyPD1Mk9FzRdfLCrc-BSGO6vTlDT_E2-aqtWeEiNn-qsDDzk=@pm.me
--
Michael


signature.asc
Description: PGP signature


Re: bailing out in tap tests nearly always a bad idea

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 18:32:59 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > Best with a
> > central function signalling fatal error, rather than individual uses of die
> > or such.
>
> Huh, doesn't Test::More already provide a sane way to do this?

I looked, and didn't see anything. But I'm not a perl person, so I might just
have missed something.


> If not, why isn't die() good enough?  (I don't think you can
> realistically expect to prohibit die() anywhere in the TAP tests.)

The output of dying isn't great either:

t/000_fail.pl  Dubious, test returned 25 (wstat 6400, 
0x1900)
No subtests run

it'd be nicer if that showed the actual reason for failing, rather than
the unhelpful "Dubious, test returned" stuff. But it's still better than
BAIL_OUT. So I thought that putting the failure handling in a central routine
would allow us to make the exit nicer in a central place / at a later stage...

Greetings,

Andres Freund




Re: xml_is_well_formed (was Re: buildfarm warnings)

2022-02-13 Thread Andrew Dunstan


On 2/13/22 13:59, Tom Lane wrote:
> So we're faced with a dilemma: we can't rename either instance without
> breaking something.  The only way to get rid of the warnings seems to be
> to decide that the copy in xml2 has outlived its usefulness and remove
> it.  I don't think that's a hard argument to make: that version has been
> obsolete since 9.1 (a0b7b717a), meaning that only pre-extensions versions
> of xml2 could need it.  In fact, we pulled the trigger on it once before,
> in 2016 (20540710e), and then changed our minds not because anyone
> lobbied to put it back but just because we gave up on the PGDLLEXPORT
> rearrangement that prompted the change then.
>
> I think that getting rid of these build warnings is sufficient reason
> to drop this ancient compatibility function, and now propose that
> we do that.
>
>   


+1


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: bailing out in tap tests nearly always a bad idea

2022-02-13 Thread Tom Lane
Andres Freund  writes:
> Does anybody want to defend / explain the use of BAIL_OUT? If not, I think we
> should consider doing a global replace of the use of bailing.

+1

> Best with a
> central function signalling fatal error, rather than individual uses of die
> or such.

Huh, doesn't Test::More already provide a sane way to do this?
If not, why isn't die() good enough?  (I don't think you can
realistically expect to prohibit die() anywhere in the TAP tests.)

regards, tom lane




Re: buildfarm warnings

2022-02-13 Thread Andrew Dunstan


On 2/12/22 16:42, Tom Lane wrote:
>> I'm of the impression that some people have sql access to BF logs.
> Yeah.  I'm not sure exactly what the access policy is for that machine;
> I know we'll give out logins to committers, but not whether there's
> any precedent for non-committers.


Yeah, I don't think it's wider than that. The sysadmins own these
instances, so the policy is really up to them.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Avoid open and lock the table Extendend Statistics (src/backend/commands/statscmds.c)

2022-02-13 Thread Tom Lane
Tomas Vondra  writes:
> On 2/13/22 22:43, Andres Freund wrote:
>> Having looked briefly at it, I don't understand what the locking scheme is?

> I don't recall the details of the discussion (if at all), but if you try
> to do concurrent ALTER STATISTICS, that'll end up with:
>   ERROR: tuple concurrently updated
> reported by CatalogTupleUpdate. AFAIK that's what we do for other object
> types that don't have a relation that we might lock (e.g. try to do
> CREATE OR REPLACE FUNCTION).

Yeah, we generally haven't bothered with a separate locking scheme
for objects other than relations.  I don't see any big problem with
that for objects that are defined by a single catalog entry, since
"tuple concurrently updated" failures serve fine (although maybe
that error message could be nicer).  Extended stats don't quite
fit that definition, but at least for updates that only need to
touch the pg_statistic_ext row, it's the same story.

When we get around to supporting multi-relation statistics,
there might need to be some protocol around when ANALYZE can
update pg_statistic_ext_data rows.  But I don't think that
issue exists yet, since only one ANALYZE can run on a particular
relation at a time.

regards, tom lane




bailing out in tap tests nearly always a bad idea

2022-02-13 Thread Andres Freund
Hi,

A recent cfbot failure reminded me of a topic I wanted to raise: The use of
BAIL in tap tests.

When a tap test exits with BAIL it doesn't just consider that test failed, but
it also stops running further tap tests in the same group:
https://metacpan.org/pod/Test::More
"Indicates to the harness that things are going so badly all testing should
terminate. This includes the running of any additional test scripts."

There may be a few cases where stopping testing makes sense, but nearly all of
the uses of bail in our tests look like they just want to fail the current
test, rather than abort other tests in the same prove invocation.


Besides aborting other tests, it's also just hard to parse the output:
https://cirrus-ci.com/task/6243385840238592?logs=test_world#L2329
...
[16:28:12.178] [16:27:23] t/024_archive_recovery.pl  ok 3440 ms 
( 0.00 usr  0.00 sys +  1.43 cusr  0.64 csys =  2.07 CPU)
[16:28:12.178] [16:27:26] t/025_stuck_on_old_timeline.pl ... ok 3024 ms 
( 0.00 usr  0.00 sys +  1.21 cusr  0.61 csys =  1.82 CPU)
[16:28:12.178] [16:27:30] t/026_overwrite_contrecord.pl  ok 3693 ms 
( 0.01 usr  0.00 sys +  1.05 cusr  0.42 csys =  1.48 CPU)
[16:28:12.178] Bailout called.  Further testing stopped:  command 
"/tmp/cirrus-ci-build/src/test/recovery/../../../src/test/regress/pg_regress  
--dlpath="/tmp/cirrus-ci-build/src/test/regress" --bindir= 
--host=/tmp/gFHAMHHTJ_ --port=52708 --schedule=../regress/parallel_schedule 
--max-concurrent-tests=20 --inputdir=../regress 
--outputdir="/tmp/cirrus-ci-build/src/test/recovery/tmp_check"" exited with 
value 1
[16:28:12.178] FAILED--Further testing stopped: command 
"/tmp/cirrus-ci-build/src/test/recovery/../../../src/test/regress/pg_regress  
--dlpath="/tmp/cirrus-ci-build/src/test/regress" --bindir= 
--host=/tmp/gFHAMHHTJ_ --port=52708 --schedule=../regress/parallel_schedule 
--max-concurrent-tests=20 --inputdir=../regress 
--outputdir="/tmp/cirrus-ci-build/src/test/recovery/tmp_check"" exited with 
value 1
[16:28:12.178] make[2]: *** [Makefile:27: check] Error 255
[16:28:12.178] make[1]: *** [Makefile:48: check-recovery-recurse] Error 2

What failed was 027_stream_regress.pl, due to a trivial regression test output
change:

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/guc.out 
/tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/guc.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/guc.out  2022-02-13 
16:20:53.931949802 +
+++ /tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/guc.out
2022-02-13 16:28:00.481307866 +
@@ -850,7 +850,8 @@
name
 ---
  default_statistics_target
-(1 row)
+ enable_indexskipscan
+(2 rows)

 -- Runtime-computed GUCs should be part of the preset category.
 SELECT name FROM tab_settings_flags


But there's pretty much no way to know from the error output that this failure
was in 027_stream_regress.pl (just an example, this is widespread).


For some reason prove's output is much worse when the output is redirected to
a file. When testing with prove going to a terminal, bail at least prints the
test name.

interactive:
t/000_fail.pl  Bailout called.  Further testing 
stopped:  I am bailing
FAILED--Further testing stopped: I am bailing

file:
Bailout called.  Further testing stopped:  I am bailing
FAILED--Further testing stopped: I am bailing

the latter is just about undebuggable, particularly when using
PROVE_FLAGS='-j...'.


Even just a 'die' is better than bail, output wise (and better due to not
preventing other tests from running):

interactive:
t/000_fail.pl  Dubious, test returned 25 (wstat 6400, 
0x1900)
No subtests run
t/001_stream_rep.pl .. ok

file:
t/000_fail.pl 
Dubious, test returned 25 (wstat 6400, 0x1900)
No subtests run
t/001_stream_rep.pl .. ok


Does anybody want to defend / explain the use of BAIL_OUT? If not, I think we
should consider doing a global replace of the use of bailing. Best with a
central function signalling fatal error, rather than individual uses of die
or such.

Greetings,

Andres Freund




Re: [PATCH] Avoid open and lock the table Extendend Statistics (src/backend/commands/statscmds.c)

2022-02-13 Thread Tomas Vondra
On 2/13/22 22:43, Andres Freund wrote:
> Hi,
> 
> On 2022-02-13 18:21:38 -0300, Ranier Vilela wrote:
>> Why open and lock the Extended Statistics table if the user is not the
>> owner? Check and return early to avoid this.
> 
> I was about to say that this opens up time-to-check-time-to-use type
> issues. But it's the wrong thing to lock to prevent those.
> 

I doubt optimizing this is worth any effort - ALTER STATISTICS is rare,
and doing it without owner rights is even rarer.


> Having looked briefly at it, I don't understand what the locking scheme is?
> Shouldn't there be at least some locking against concurrent ALTERs and between
> ALTER and ANALYZE etc?
> 

I don't recall the details of the discussion (if at all), but if you try
to do concurrent ALTER STATISTICS, that'll end up with:

  ERROR: tuple concurrently updated

reported by CatalogTupleUpdate. AFAIK that's what we do for other object
types that don't have a relation that we might lock (e.g. try to do
CREATE OR REPLACE FUNCTION).

ANALYZE reads the statistics from the catalog, so it should see the last
committed stattarget value, I think.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Chapman Flack
On 02/13/22 15:48, Tom Lane wrote:
> much fix any worries there.  Or we could provide APIs that let an
> extension look up a "struct config_generic *" once and then fetch
> directly using that pointer.  (But I'd prefer not to, since it'd
> constrain the internals more than I think is wise.)

There is GetConfigOptionByNum already ... but I'm not sure there's
an easy way to ask "what's the num for option foo?".

GetNumConfigOptions exists, so there is a brute-force way.

Regards,
-Chap




Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-13 Thread Andres Freund
On 2022-02-14 11:23:18 +1300, Thomas Munro wrote:
> On Sun, Feb 13, 2022 at 6:29 PM Andres Freund  wrote:
> > Afaics all the "regress test inside tap test" cases would need to do
> > is pass --outputdir=${PostgreSQL::Test::Utils::tmp_check} and you'd
> > get exactly the same path as REGRESS_OUTPUTDIR currently provides.
> 
> Ahh, right.  I assume it still needs perl2host() treatment for MSYS2
> systems, because jacana's log shows TESTDIR is set to a Unixoid path
> that I assume pg_regress's runtime can't use.  That leads to the
> attached.

Looks sane to me.




Re: [PATCH] Avoid open and lock the table Extendend Statistics (src/backend/commands/statscmds.c)

2022-02-13 Thread Ranier Vilela
Em dom., 13 de fev. de 2022 às 18:43, Andres Freund 
escreveu:

> Hi,
>
> On 2022-02-13 18:21:38 -0300, Ranier Vilela wrote:
> > Why open and lock the Extended Statistics table if the user is not the
> > owner? Check and return early to avoid this.
>
> I was about to say that this opens up time-to-check-time-to-use type
> issues. But it's the wrong thing to lock to prevent those.
>
> Having looked briefly at it, I don't understand what the locking scheme is?
> Shouldn't there be at least some locking against concurrent ALTERs and
> between
> ALTER and ANALYZE etc?
>
I checked the pg_statistics_object_ownercheck function, and I think the
patch is wrong.
It seems to me that when using SearchSysCache1, it is necessary to open
and lock the table.

Better to withdraw the patch; sorry for the noise.

regards,
Ranier Vilela


Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set

2022-02-13 Thread Thomas Munro
On Sun, Feb 13, 2022 at 6:29 PM Andres Freund  wrote:
> Afaics all the "regress test inside tap test" cases would need to do
> is pass --outputdir=${PostgreSQL::Test::Utils::tmp_check} and you'd
> get exactly the same path as REGRESS_OUTPUTDIR currently provides.

Ahh, right.  I assume it still needs perl2host() treatment for MSYS2
systems, because jacana's log shows TESTDIR is set to a Unixoid path
that I assume pg_regress's runtime can't use.  That leads to the
attached.


0001-Remove-REGRESS_OUTPUTDIR-environment-variable.not-for-cfbot-patch
Description: Binary data


Re: Fix overflow in DecodeInterval

2022-02-13 Thread Joseph Koshakow
Attached is a new version switching from ints to bools, as requested.

- Joe Koshakow
From af8f030ad146602b4386f77b5664c6013743569b Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally, fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fractional units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Additionally, adding the word "ago" to the interval negates every
field. However, the negative of INT_MIN is still INT_MIN. This
commit adds a check to make sure that we don't try to negate
a field if it's INT_MIN.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 100 +++--
 src/test/regress/expected/interval.out |  68 +
 src/test/regress/sql/interval.sql  |  18 +
 3 files changed, 164 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..15a1ebbe67 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -44,10 +45,13 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static bool AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static bool AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static bool AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -499,34 +503,68 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 /*
  * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
  * We assume the input frac is less than 1 so overflow is not an issue.
+ * Returns true if successful, false if tm overflows.
  */
-static void
+static bool
 AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			sec;
 
 	if (frac == 0)
-		return;
+		return true;
 	frac *= scale;
 	sec = (int) frac;
-	tm->tm_sec += sec;
+	if (pg_add_s32_overflow(tm->tm_sec, sec, &tm->tm_sec))
+		return false;
 	frac -= sec;
 	*fsec += rint(frac * 1000000);
+	return true;
 }
 
 /* As above, but initial scale produces days */
-static void
+static bool
 AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			extra_days;
 
 	if (frac == 0)
-		return;
+		return true;
 	frac *= scale;
 	extra_days = (int) frac;
-	tm->tm_mday += extra_days;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, &tm->tm_mday))
+		return false;
 	frac -= extra_days;
-	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+	return AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+}
+
+/* As above, but initial scale produces months */
+static bool
+AdjustFractMonths(double frac, struct pg_tm *tm, int scale)
+{
+	int months = rint(frac * MONTHS_PER_YEAR * scale);
+
+	return !pg_add_s32_overflow(tm->tm_mon, months, &tm->tm_mon);
+}
+
+/*
+ * Multiply val by multiplier (to produce days) and add to *tm.
+ * Returns true if successful, false if tm overflows.
+ */
+static bool
+AdjustDays(int val, struct pg_tm *tm, int multiplier)
+{
+	int			extra_days;
+	return !pg_mul_s32_overflow(val, multiplier, &extra_days) &&
+		!pg_add_s32_overflow(tm->tm_mday, extra_days, &tm->tm_mday);
+}
+
+/* As above, but initial val produces years */
+static bool
+AdjustYears(int val, struct pg_tm *tm, int multiplier)
+{
+	int			years;
+	return !pg_mul_s32_overflow(val, multiplier, &years) &&
+		!pg_add_s32_overflow(tm->tm_year, years, &tm->tm_year);
 }
 
 /* Fetch a fractional-second value with suitable error checking */
@@ -3275,56 +3313,69 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 
 	case DTK_MINUTE:
 		tm->tm_min += val;
-		AdjustFractSeconds(fval, tm, fsec, SECS_PER_MINUTE);

Re: Adding CI to our tree

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 15:42:13 -0600, Justin Pryzby wrote:
> > Note that prove unfortunately serializes the test output to be in the order 
> > it
> > started them, even when tests run concurrently. Extremely unhelpful, but ...
> 
> Are you sure ?

Somewhat. I think it's a question of the prove version and some autodetection
of what type of environment prove is running in (stdin/stdout/stderr). I don't
remember the details, but at some point I pinpointed the source of the
serialization, and verified that parallelization makes a significant
difference in runtime even without being easily visible :(.  But this is all
vague memory, so I might be wrong.

Reminds me that somebody (ugh, me???) should fix the perl > 5.26
incompatibilities on windows, then we'd also get a newer prove...



> > One nice bit is that the output is a *lot* easier to read.
> > 
> > You could try improving the total time by having prove remember slow tests 
> > and
> > use that file to run the slowest tests first next time. --state slow,save or
> > such I believe. Of course we'd have to save that state file...
> 
> In a test, this hurt rather than helped (13m 42s).
> https://cirrus-ci.com/task/6359167186239488
> 
> I'm not surprised - it makes sense to run 10 fast tests at once, but usually
> doesn't make sense to run 10 slow tests at once (at least a couple of
> which are doing something intensive).  It was faster (12m16s) to do it
> backwards (fastest tests first).
> https://cirrus-ci.com/task/5745115443494912

Hm.

I know I saw significant reduction in test times locally with meson by
starting slow tests earlier, because they're the limiting factor for the
*overall* test runtime - but I have more resources than on cirrus.  Even
locally on a windows VM, with the current buildsystem, I found that moving 027
to earlier within recoverycheck reduced the test time.

But it's possible that with all tests being scheduled concurrently, starting
the slow tests early leads to sufficient resource overcommit to be
problematic.


> BTW, does it make sense to remove test_regress_parallel_script?  The
> pg_upgrade run would do the same things, no?  If so, it might make sense to
> run that first.  OTOH, you suggested to run the upgrade tests with checksums
> enabled, which seems like a good idea.

No, I don't think so. The main regression tests are by far the most important
thing during normal development. Just relying on main regression test runs
embedded in other tests, with different output and a different config of the
main regression test, is IMO just confusing.


Greetings,

Andres Freund




Re: generic plans and "initial" pruning

2022-02-13 Thread David Rowley
(just catching up on this thread)

On Thu, 13 Jan 2022 at 07:20, Robert Haas  wrote:
> Yeah. I don't think it's only non-core code we need to worry about
> either. What if I just do EXPLAIN ANALYZE on a prepared query that
> ends up pruning away some stuff? IIRC, the pruned subplans are not
> shown, so we might escape disaster here, but FWIW if I'd committed
> that code I would have pushed hard for showing those and saying "(not
> executed)"  so it's not too crazy to imagine a world in which
> things work that way.

FWIW, that would remove the whole point of init run-time pruning.  The
reason I made two phases of run-time pruning was so that we could get
away from having the init plan overhead of nodes we'll never need to
scan.  If we wanted to show the (never executed) scans in EXPLAIN then
we'd need to do the init plan part and allocate all that memory
needlessly.

Imagine a hash partitioned table on "id" with 1000 partitions. The user does:

PREPARE q1 (INT) AS SELECT * FROM parttab WHERE id = $1;

EXECUTE q1(123);

Assuming a generic plan, if we didn't have init pruning then we'd have
to build a plan containing the scans for all 1000 partitions. There's
significant overhead to that compared to just locking the partitions
and initialising 1 scan.

If it worked this way then we'd be even further from Amit's goal of
reducing the overhead of starting plan with run-time pruning nodes.

I understood at the time it was just the EXPLAIN output that you had
concerns with. I thought that was just around the lack of any display
of the condition we used for pruning.

David




Re: Adding CI to our tree

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 15:31:20 -0600, Justin Pryzby wrote:
> Oh - I suppose you're right.  That's an unfortunate consequence of running a
> single prove instance without chdir.

I don't think it's chdir that's relevant (that changes into the source
directory after all). It's the TESTDIR environment variable.

I was thinking that we should make Utils.pm's INIT block responsible for
figuring out both the directory a test should run in and the log location,
instead of having that in vcregress.pl and Makefile.global.in. Mostly because
doing it in the latter means we can't start tests with different TESTDIRs and
working dirs at the same time.

If instead we pass the location of the top-level build and top-level source
directory from vcregress.pl / Makefile.global, the tap test infrastructure can
figure out that stuff themselves, on a per-test basis.

For msvc builds we probably would need to pass in some information that allows
Utils.pm to set up PATH appropriately. I think that might just require knowing
that a) the msvc build system is used, and b) Release vs Debug.

Greetings,

Andres Freund




Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

2022-02-13 Thread Peter Geoghegan
My recent commit 44fa8488 made vacuumlazy.c always scan the last page
of every heap relation, even when it's all-visible, regardless of any
other factor. The general idea is to always examine the last page in
the relation when that might prevent us from unsuccessfully attempting
to truncate the relation -- since even the attempt is costly (in
particular, we have to get an AccessExclusiveLock, which is painful
with a Hot Standby replica). This created a new problem: if we
manually VACUUM a smaller, static table many times,
vac_estimate_reltuples() will consider the same scanned_page to be a
random sample each time. Of course this page is the furthest possible
thing from a random sample, because it's the same heap page each time,
and because it's generally more likely that the last page will have
fewer tuples.

Every time you run VACUUM (without changing the table),
pg_class.reltuples shrinks, by just a small amount. Run VACUUM like
that enough times, and pg_class.reltuples will eventually become quite
distorted. It's easy to spot this effect, just by running "VACUUM
VERBOSE" in a loop against a static table, and noting how "tuples:
 ??? remain" tends to shrink over time, even though the contents
of the table won't have changed.

The "always scan last page" behavior isn't really new, actually. It's
closely related to an earlier, slightly different behavior with the
same goal -- though one that conditioned scanning the last page on
certain other factors that were protective (which was probably never
considered by the design). This closely related behavior (essentially
the same behavior) was added by commit e8429082 from 2015. I judged
that it wasn't worth worrying about scanning an extra page
unnecessarily (better to keep things simple), since it's not at all
uncommon for VACUUM to scan a few all-visible pages anyway (due to the
SKIP_PAGES_THRESHOLD stuff). And I haven't changed my mind about that
-- I still think that we should just scan the last page in all cases,
to keep things simple. But I definitely don't think that this
"pg_class.reltuples gets distorted over time" business is okay -- that
has to be fixed.

There has been at least one very similar bug in the past: the commit
message from commit b4b6923e (from 2011) talks about a similar issue
with over-sampling from the beginning of the table (not the end), due
to a similar bias involving visibility map implementation details that
lead to an extremely nonrandom scanned_pages sample. To me that
suggests that the true problem here isn't really in vacuumlazy.c --
going back to the old approach (i.e. scanning the last page
conditionally) seems like a case of the tail wagging the dog. IMV the
real problem is that vac_estimate_reltuples() is totally unconcerned
about these kinds of systematic sampling problems.

Attached draft patch fixes the issue by making
vac_estimate_reltuples() return the old/original reltuples from
pg_class (and not care at all about the scanned_tuples value from its
vacuumlazy.c caller) when:

1. The total_pages for the table hasn't changed (by even one page)
since the last VACUUM (or last ANALYZE).

and,

2. The total number of scanned_pages is less than 2% of total_pages.

This fixes the observed problem directly. It also makes the code
robust against other similar problems that might arise in the future.
The risk that comes from trusting that scanned_pages is a truly random
sample (given these conditions) is generally very large, while the
risk that comes from disbelieving it (given these same conditions) is
vanishingly small.
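
To make that rule concrete, here is a minimal standalone sketch of the
decision (the function name and framing are mine and purely illustrative --
this is not the actual code in the attached patch):

#include <stdbool.h>

/*
 * Return true when the scanned_pages sample can be trusted to update
 * pg_class.reltuples, false when the old reltuples should be kept.
 */
static bool
scanned_pages_sample_is_trustworthy(long old_rel_pages, long total_pages,
                                    long scanned_pages)
{
    /* Condition 1: table size unchanged since the last VACUUM/ANALYZE? */
    if (old_rel_pages != total_pages)
        return true;            /* size changed -- extrapolate as usual */

    /* Condition 2: distrust samples covering under 2% of the pages */
    return scanned_pages >= (long) (total_pages * 0.02);
}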

-- 
Peter Geoghegan


v1-0001-Avoid-vac_estimate_reltuples-distortion.patch
Description: Binary data


Re: [PATCH] Avoid open and lock the table Extended Statistics (src/backend/commands/statscmds.c)

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 18:21:38 -0300, Ranier Vilela wrote:
> Why open and lock the Extended Statistics table if the user is not the owner?
> Check first and return early to avoid this.

I was about to say that this opens up time-of-check-to-time-of-use type
issues. But it's the wrong thing to lock to prevent those.

Having looked briefly at it, I don't understand what the locking scheme is?
Shouldn't there be at least some locking against concurrent ALTERs and between
ALTER and ANALYZE etc?

Greetings,

Andres Freund




Re: Adding CI to our tree

2022-02-13 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 02:26:25PM -0800, Andres Freund wrote:
> On 2022-02-12 16:06:40 -0600, Justin Pryzby wrote:
> > I had some success with that, but it doesn't seem to be significantly 
> > faster -
> > it looks a lot like the tests are not actually running in parallel.

Note that the total test time is close to the sum of the individual test times.
But I think that may be an artifact of how prove is showing/attributing times
to each test (which, if so, is misleading).

> Note that prove unfortunately serializes the test output to be in the order it
> started them, even when tests run concurrently. Extremely unhelpful, but ...

Are you sure ?  When I run it locally, I see:
rm -fr src/test/recovery/tmp_check ; time PERL5LIB=`pwd`/src/test/perl 
TESTDIR=`pwd`/src/test/recovery 
PATH=`pwd`/tmp_install/usr/local/pgsql/bin:$PATH 
PG_REGRESS=`pwd`/src/test/regress/pg_regress 
REGRESS_SHLIB=`pwd`/src/test/regress/regress.so prove --time -j4 --ext '*.pl' 
`find src -name t`
...
[15:34:48] src/bin/scripts/t/101_vacuumdb_all.pl ... ok 
 104 ms ( 0.00 usr  0.00 sys +  2.35 cusr  0.47 csys =  2.82 CPU)
[15:34:49] src/bin/scripts/t/090_reindexdb.pl .. ok 
8894 ms ( 0.06 usr  0.01 sys + 14.45 cusr  3.38 csys = 17.90 CPU)
[15:34:50] src/bin/pg_config/t/001_pg_config.pl  ok 
  79 ms ( 0.00 usr  0.01 sys +  0.23 cusr  0.04 csys =  0.28 CPU)
[15:34:50] src/bin/pg_waldump/t/001_basic.pl ... ok 
  35 ms ( 0.00 usr  0.00 sys +  0.26 cusr  0.02 csys =  0.28 CPU)
[15:34:51] src/bin/pg_test_fsync/t/001_basic.pl  ok 
 100 ms ( 0.01 usr  0.00 sys +  0.24 cusr  0.04 csys =  0.29 CPU)
[15:34:51] src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl  ok 
 177 ms ( 0.02 usr  0.00 sys +  0.26 cusr  0.03 csys =  0.31 CPU)
[15:34:55] src/bin/scripts/t/100_vacuumdb.pl ... ok
11267 ms ( 0.12 usr  0.04 sys + 13.47 cusr  3.20 csys = 16.83 CPU)
[15:34:57] src/bin/scripts/t/102_vacuumdb_stages.pl  ok 
5802 ms ( 0.06 usr  0.01 sys +  7.70 cusr  1.37 csys =  9.14 CPU)
...

=> scripts/ stuff, followed by other stuff, followed by more, slower, scripts/ 
stuff.

But I never saw that in cirrus.

> Isn't this kind of a good test time? I thought earlier your alltaptests target
> took a good bit longer?

The original alltaptests runs in 16m 21s.
https://cirrus-ci.com/task/6679061752709120

2 weeks ago, it was ~14min with your patch to cache initdb.
https://cirrus-ci.com/task/5439320633901056

As I recall, that didn't seem to improve runtime when combined with my parallel
patch.

> One nice bit is that the output is a *lot* easier to read.
> 
> You could try improving the total time by having prove remember slow tests and
> use that file to run the slowest tests first next time. --state slow,save or
> such I believe. Of course we'd have to save that state file...

In a test, this hurt rather than helped (13m 42s).
https://cirrus-ci.com/task/6359167186239488

I'm not surprised - it makes sense to run 10 fast tests at once, but usually
doesn't make sense to run 10 slow tests at once (at least a couple of
which are doing something intensive).  It was faster (12m16s) to do it
backwards (fastest tests first).
https://cirrus-ci.com/task/5745115443494912

BTW, does it make sense to remove test_regress_parallel_script ?  The
pg_upgrade run would do the same things, no ?  If so, it might make sense to
run that first.  OTOH, you suggested to run the upgrade tests with checksums
enabled, which seems like a good idea.

Note that in the attached patches, I changed the msvc "warnings" to use "tee".

I don't know how to fix the pipeline test in a less hacky way...

You said the docs build should be a separate task, but then said that it'd be
okay to remove the dependency.  So I did it both ways.  There's currently some
duplication between the docs patch and code coverage patch.

-- 
Justin
>From 05c24a1e679e5ae0dd0cb6504d397f77c8b1fc5c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 17 Jan 2022 00:53:04 -0600
Subject: [PATCH 1/8] cirrus: include hints how to install OS packages..

This is useful for patches during development, but once a feature is merged,
new libraries should be added to the OS image files, rather than installed
during every CI run forever into the future.
---
 .cirrus.yml | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index dd96a97efe5..eda8ac9596c 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -73,10 +73,11 @@ task:
 chown -R postgres:postgres .
 mkdir -p ${CCACHE_DIR}
 chown -R postgres:postgres ${CCACHE_DIR}
-  setup_cores_script: |
+  setup_os_script: |
 mkdir -m 770 /tmp/cores
 chown root:postgres /tmp/cores
 sysctl kern.corefile='/tmp/cores/%N.%P.core'
+#pkg install -y ...
 
   # NB: Intentionally build without --with-llvm. The freebsd image 

Re: Adding CI to our tree

2022-02-13 Thread Justin Pryzby
On Sun, Feb 13, 2022 at 01:23:16PM -0800, Andres Freund wrote:
> If you're seeing this on windows on one of your test branches, that's much
> more likely to be caused by the alltaptests stuff, than by the change in
> artifact instruction.

Oh - I suppose you're right.  That's an unfortunate consequence of running a
single prove instance without chdir.

-- 
Justin




Re: Adding CI to our tree

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 15:02:50 -0600, Justin Pryzby wrote:
> On Sat, Feb 12, 2022 at 04:24:20PM -0800, Andres Freund wrote:
> > > What I am excited about is that some of your other changes showed that we
> > > don't need separate *_artifacts for separate directories anymore. That 
> > > used to
> > > be the case, but an array of paths is now supported. Putting log, diffs, 
> > > and
> > > regress_log in one directory will be considerably more convenient...
> >
> > pushed together.
>
> This change actually complicates things.
>
> Before, there was log/src/test/recovery/tmp_check/log, with a few files for
> 001, a few for 002, a few for 003.  These are a lot of output files, but at
> least they're all related.

> Now, there's a single log/tmp_check/log, which has logs for the entire tap
> tests.  There's 3 pages of 001*, 2 pages of 002*, 3 pages of 003, etc.

Hm? Doesn't look like that to me, and I don't see why it would work that way?
This didn't do anything to flatten the directory hierarchy, just combine three
hierarchies into one?

What I see, and what I expect, is that logs end up in
e.g. log/src/test/recovery/tmp_check/log but that that directory contains
regress_log_*, as well as *.log?  Before one needed to go through the
hierarchy multiple times to see both regress_log_ (i.e. tap test log) as well
as 0*.log (i.e. server logs).

A random example: https://cirrus-ci.com/task/5152523873943552 shows the logs
for the failure in log/src/bin/pg_upgrade/tmp_check/log


If you're seeing this on windows on one of your test branches, that's much
more likely to be caused by the alltaptests stuff, than by the change in
artifact instruction.


Greetings,

Andres Freund




[PATCH] Avoid open and lock the table Extended Statistics (src/backend/commands/statscmds.c)

2022-02-13 Thread Ranier Vilela
Hi,

Commit
https://github.com/postgres/postgres/commit/6d554e3fcd6fb8be2dbcbd3521e2947ed7a552cb

has fixed the bug, but still has room for improvement.

Why open and lock the Extended Statistics table if the user is not the owner?
Check first and return early to avoid this.

trivial patch attached.

regards,
Ranier Vilela


avoid_open_and_lock_table_if_not_owner.patch
Description: Binary data


Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Tom Lane
Andres Freund  writes:
> On 2022-02-13 15:36:16 -0500, Chapman Flack wrote:
>> Also, I think there are some options that are only represented by
>> an int, float8, etc., when shown, but whose native internal form
>> is something else, like a struct. I was definitely contemplating
>> that you could 'subscribe' to one of those too, by passing the
>> address of an appropriate struct. But of course a GetConfigOption()
>> flavor could work that way too.

> I have a very hard time seeing a use-case for this. Nor how it'd even work
> with a struct - you can't just copy the struct contents, because of pointers
> to objects etc.  I don't think there really are options like this anyway.

There are a couple of legacy cases like "datestyle" where something
that the user sees as one GUC translates to multiple variables under
the hood, so you'd have to invent a struct if you wanted to pass
them through a mechanism like this.  I don't have a big problem
with leaving those out of any such solution, though.  (I see that
datestyle's underlying variables DateStyle and DateOrder are already
marked PGDLLIMPORT, and that's fine with me.)  A possibly more
interesting case is something like search_path --- but again,
there are already special-purpose APIs for accessing the interpreted
value of that.

regards, tom lane




Re: Adding CI to our tree

2022-02-13 Thread Justin Pryzby
On Sat, Feb 12, 2022 at 04:24:20PM -0800, Andres Freund wrote:
> > What I am excited about is that some of your other changes showed that we
> > don't need separate *_artifacts for separate directories anymore. That used 
> > to
> > be the case, but an array of paths is now supported. Putting log, diffs, and
> > regress_log in one directory will be considerably more convenient...
> 
> pushed together.

This change actually complicates things.

Before, there was log/src/test/recovery/tmp_check/log, with a few files for
001, a few for 002, a few for 003.  These are a lot of output files, but at
least they're all related.

Now, there's a single log/tmp_check/log, which has logs for the entire tap
tests.  There's 3 pages of 001*, 2 pages of 002*, 3 pages of 003, etc.  

-- 
Justin




Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Tom Lane
Chapman Flack  writes:
> On 02/13/22 15:16, Tom Lane wrote:
>> Why not just provide equivalents to GetConfigOption() that can
>> deliver int, float8, etc values instead of strings?

> And repeat the bsearch to find the option every time the interested
> extension wants to check the value, even when what it learns is that
> the value has not changed? And make it the job of every piece of
> interested extension code to save the last-retrieved value and compare
> if it wants to detect that it's changed? When the GUC machinery already
> has code that executes exactly when a value is being supplied for
> an option?

(shrug) You could argue the performance aspect either way.  If the
core GUC code updates a value 1000 times while the extension consults
the result once, you've probably lost money on the deal.

As for the bsearch, we could improve on that when and if it's shown
to be a bottleneck --- converting to a hash table ought to pretty
much fix any worries there.  Or we could provide APIs that let an
extension look up a "struct config_generic *" once and then fetch
directly using that pointer.  (But I'd prefer not to, since it'd
constrain the internals more than I think is wise.)
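
For concreteness, that handle-based shape might look roughly like this
(a hypothetical API, sketched only to show the lookup-once pattern; none
of these functions exist today):

/* An opaque handle; extensions never look inside it. */
typedef struct config_generic ConfigHandle;

extern ConfigHandle *FindConfigHandle(const char *name);
extern int  GetConfigHandleInt(ConfigHandle *handle);

static ConfigHandle *my_guc = NULL;

static int
my_guc_value(void)
{
    if (my_guc == NULL)
        my_guc = FindConfigHandle("work_mem");  /* pay the search once */
    return GetConfigHandleInt(my_guc);  /* cheap direct fetch thereafter */
}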

regards, tom lane




Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 15:36:16 -0500, Chapman Flack wrote:
> Clearly I'm not thinking here of the GUCs that are read-only views of
> values that are determined some other way. How many of those are there
> that are mutable, and could have their values changed without going
> through the GUC mechanisms?

Are there any GUCs where one needs this? There are a few GUCs with frequently
changing values, but it's stuff like transaction_read_only, where I don't
really see a use for constantly checking the value.


> Also, I think there are some options that are only represented by
> an int, float8, etc., when shown, but whose native internal form
> is something else, like a struct. I was definitely contemplating
> that you could 'subscribe' to one of those too, by passing the
> address of an appropriate struct. But of course a GetConfigOption()
> flavor could work that way too.

I have a very hard time seeing a use-case for this. Nor how it'd even work
with a struct - you can't just copy the struct contents, because of pointers
to objects etc.  I don't think there really are options like this anyway.

Greetings,

Andres Freund




Re: fixing bookindex.html bloat

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 12:16:18 -0800, Andres Freund wrote:
> Waiting for docbook to fix this seems a bit futile, I eventually found a
> bugreport about this, from 2016:
> https://sourceforge.net/p/docbook/bugs/1384/

Now also reported to the current repo:
https://github.com/docbook/xslt10-stylesheets/issues/239

While there have been semi-regular changes / fixes, they've not done a release in
years... So even if they fix it, it'll likely not trickle down into distro
packages etc anytime soon, if ever.

Greetings,

Andres Freund




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Tom Lane
Andres Freund  writes:
> Do we want to consider backpatching these fixes?

I wasn't thinking that we should.  It's a behavioral change and
people might not be pleased to have it appear in minor releases.

regards, tom lane




Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Chapman Flack
On 02/13/22 15:16, Tom Lane wrote:
> That seems like about 10X more complexity than is warranted,
> not only in terms of the infrastructure required, but also in
> the intellectual complexity around "just when could that value
> change?"
> 
> Why not just provide equivalents to GetConfigOption() that can
> deliver int, float8, etc values instead of strings?

And repeat the bsearch to find the option every time the interested
extension wants to check the value, even when what it learns is that
the value has not changed? And make it the job of every piece of
interested extension code to save the last-retrieved value and compare
if it wants to detect that it's changed? When the GUC machinery already
has code that executes exactly when a value is being supplied for
an option?

Clearly I'm not thinking here of the GUCs that are read-only views of
values that are determined some other way. How many of those are there
that are mutable, and could have their values changed without going
through the GUC mechanisms?

Also, I think there are some options that are only represented by
an int, float8, etc., when shown, but whose native internal form
is something else, like a struct. I was definitely contemplating
that you could 'subscribe' to one of those too, by passing the
address of an appropriate struct. But of course a GetConfigOption()
flavor could work that way too.

Regards,
-Chap




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 09:35:47 -0500, Joseph Koshakow wrote:
> I chose int return types to keep all these methods
> consistent with DecodeInterval, which returns a
> non-zero int to indicate an error.

That's different, because it actually returns different types of errors. IMO
that difference is actually reason to use a bool for the new cases, because
then it's a tad clearer that they don't return DTERR_*.

> Though I wasn't sure
> if an int or bool would be best, so I'm happy to change
> to bool if people think that's better.

+1 for bool.


> Also I'm realizing now that I've incorrectly been using the
> number of the patch to indicate the version, instead of just
> sticking a v3 to the front. So sorry about that, all the patches
> I sent in this thread are the same patch, just different versions.

No worries ;)


Do we want to consider backpatching these fixes? If so, I'd argue for skipping
10, because it doesn't have int.h stuff yet. There's also the issue with
potentially breaking indexes / constraints? Not that that goes entirely away across
major versions...

Greetings,

Andres Freund




Re: [PATCH] Dereference null return value (NULL_RETURNS) (src/backend/commands/statscmds.c)

2022-02-13 Thread Ranier Vilela
On Tue, Aug 25, 2020 at 12:42 PM, Ranier Vilela 
wrote:

> Hi Tom,
>
> Per Coverity.
>
> SearchSysCache1 can return NULL, and the function AlterStatistics
> has a mistake: it lacks a check of that return value, which enables a
> NULL pointer dereference
> at the call to heap_modify_tuple.
>
> There is also room for improvement:
> avoid calling SearchSysCache1 and table_open if the user "is not the owner
> of the existing statistics object".
>
After a long time, this bug has finally been fixed.
https://github.com/postgres/postgres/commit/6d554e3fcd6fb8be2dbcbd3521e2947ed7a552cb

regards,
Ranier Vilela


Re: Adding CI to our tree

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 15:09:20 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-02-13 12:13:17 -0500, Tom Lane wrote:
> >> Right.  Can we set things up so that it's not too painful to inject
> >> custom build options into a CI build?
> 
> > What kind of injection are you thinking about?
> 
> That's exactly what needs to be decided.
> [...]
> I'd prefer something a bit more out-of-band.  I don't know this technology
> well enough to propose a way.

I don't yet understand the precise use case for adjustments well enough to
propose something. Who would like to adjust something for what purpose?  The
"original" author, for a one-off test? A reviewer / committer, to track down a
hunch?

If it's about CI runs in a personal repository, one can set additional
environment vars from the CI management interface. We can make sure they work
(the ci stuff probably overrides CFLAGS, but COPT should work) and document
the way to do so.


> > A patch author can obviously
> > just add options in .cirrus.yml. That's something possible now, that was not
> > possible with cfbot applying its own .cirrus.yml
> 
> Fine, but are committers supposed to keep track of the fact that they
> shouldn't commit that part of a patch?

I'd say it depends on the specific modification - there's some patches
where it seems to make sense to adjust/extend CI as part of it and have it
merged. But yea, in other cases committers would have to take them out.


For more one-off stuff one would presumably not want to spam the list
with a full patchset to trigger a cfbot run, nor wait till cfbot gets round to
that patch again. When pushing to a personal repo it's of course easy to just
have a commit on-top of what's submitted to play around with compile options.

Greetings,

Andres Freund




Re: Signed vs. Unsigned (some)

2022-02-13 Thread Ranier Vilela
On Fri, Jun 11, 2021 at 11:05 PM, Ranier Vilela 
wrote:

> Hi,
>
> Can removing legitimate warnings be worth it?
>
> A -1 cast can be wrong when there is an invalid value defined
> (InvalidBucket, InvalidBlockNumber).
> I think that, depending on the compiler, a -1 cast may differ from
> InvalidBucket or InvalidBlockNumber.
>
> pg_rewind is one special case.
> All other initializations of XLogSegNo (uint64) use zero, but in pg_rewind
> -1 was used?
> I did not find an InvalidXLogSegNo!
> Not tested.
>
> Trivial patch attached.
>
After a long time, a small part has finally been accepted and fixed.
https://github.com/postgres/postgres/commit/302612a6c74fb16f26d094ff47e9c59cf412740c

regards,
Ranier Vilela


Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Tom Lane
Chapman Flack  writes:
> On 02/13/22 02:29, Julien Rouhaud wrote:
>> Maybe we could have an actually usable GUC API to retrieve values in their
>> native format rather than C string for instance, that we could make sure also
>> works for cases like max_backend?

> I proposed a sketch of such an API for discussion back in [0] (the second
> idea in that email, the "what I'd really like" one).

> In that scheme, some extension code that was interested in (say,
> for some reason) log_statement_sample_rate could say:

> static double samprate;
> static int gucs_changed = 0;

> #define SAMPRATE_CHANGED 1

> ...
>   ObserveTypedConfigValue("log_statement_sample_rate",
> &samprate, &gucs_changed, SAMPRATE_CHANGED);
> ...

> and will be subscribed to have the native-format value stored into samprate,
> and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes.


That seems like about 10X more complexity than is warranted,
not only in terms of the infrastructure required, but also in
the intellectual complexity around "just when could that value
change?"

Why not just provide equivalents to GetConfigOption() that can
deliver int, float8, etc values instead of strings?
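
For illustration, the signatures might look something like this (all
hypothetical -- no such functions exist today; this is only a sketch of
the idea):

#include <stdbool.h>

extern int    GetConfigOptionInt(const char *name, bool missing_ok);
extern double GetConfigOptionFloat8(const char *name, bool missing_ok);
extern bool   GetConfigOptionBool(const char *name, bool missing_ok);

/* An extension would then just fetch the value at the point of use: */
static void
example_use(void)
{
    double  samprate = GetConfigOptionFloat8("log_statement_sample_rate",
                                             false);

    (void) samprate;            /* ... use the value ... */
}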

(In any case we'd need to rethink the GUC "show_hook" APIs, which
currently need only deal in string output.)

regards, tom lane




fixing bookindex.html bloat

2022-02-13 Thread Andres Freund
Hi,

Sometime last year I was surprised to see (not on a public list unfortunately)
that bookindex.html is 657kB, with > 200kB just being repetitions of
xmlns="http://www.w3.org/1999/xhtml" xmlns:xlink="http://www.w3.org/1999/xlink"

Reminded of this, due to a proposal to automatically generate docs as part of
cfbot runs (which'd be fairly likely to update bookindex.html), I spent a few
painful hours last night trying to track this down.


The reasons for the two xmlns= are different. The
xmlns="http://www.w3.org/1999/xhtml" is afaict caused by confusion on our
part.

Some of our stylesheets use
xmlns="http://www.w3.org/TR/xhtml1/transitional"
others use
xmlns="http://www.w3.org/1999/xhtml"

It's noteworthy that the docbook xsl stylesheets end up with
<html xmlns="http://www.w3.org/1999/xhtml">
so it's a bit pointless to reference http://www.w3.org/TR/xhtml1/transitional
afaict.

Adding xmlns="http://www.w3.org/1999/xhtml" to stylesheet-html-common.xsl gets
rid of xmlns="http://www.w3.org/TR/xhtml1/transitional" in bookindex specific
content.

Changing stylesheet.xsl from transitional to http://www.w3.org/1999/xhtml gets
rid of xmlns="http://www.w3.org/TR/xhtml1/transitional" in navigation/footer.

Of course we should likely change all http://www.w3.org/TR/xhtml1/transitional
references, rather than just the one necessary to get rid of the xmlns= spam.


So far, so easy. It took me way longer to understand what's causing
all the xmlns:xlink= appearances.

For a long time I was misdirected because if I remove the <div class="index">
in stylesheet-html-common.xsl, the number of
xmlns:xlink drastically reduces to a handful. Which made me think that their
existence is somehow our fault. And I tried and tried to find the cause.

But it turns out that this originally is caused by a still existing buglet in
the docbook xsl stylesheets, specifically autoidx.xsl. It doesn't list xlink
in exclude-result-prefixes, but uses ids etc from xlink.

The reason that we end up with so many more xmlns:xlink is just that without
our customization there ends up being a single
<div xmlns:xlink="http://www.w3.org/1999/xlink" class="index">
and then everything below that doesn't need the xmlns:xlink anymore. But
because stylesheet-html-common.xsl emits the div, the xmlns:xlink is emitted
for each element that autoidx.xsl has "control" over.

Waiting for docbook to fix this seems a bit futile; I eventually found a
bug report about this, from 2016: https://sourceforge.net/p/docbook/bugs/1384/

But we can easily reduce the "impact" of the issue, by just adding a single
xmlns:xlink to <div class="index">, which is sufficient to convince xsltproc
to not repeat it.


Before:
-rw-r--r-- 1 andres andres 683139 Feb 13 04:31 html-broken/bookindex.html
After:
-rw-r--r-- 1 andres andres 442923 Feb 13 12:03 html/bookindex.html

While most of the savings are in bookindex, the rest of the files are reduced
by another ~100kB.


WIP patch attached. For now I just adjusted the minimal set of
xmlns="http://www.w3.org/TR/xhtml1/transitional", but I think we should update
all.

Greetings,

Andres Freund
diff --git i/doc/src/sgml/stylesheet-html-common.xsl w/doc/src/sgml/stylesheet-html-common.xsl
index d9961089c65..9f69af40a94 100644
--- i/doc/src/sgml/stylesheet-html-common.xsl
+++ w/doc/src/sgml/stylesheet-html-common.xsl
@@ -4,6 +4,7 @@
 %common.entities;
 ]>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+                xmlns="http://www.w3.org/1999/xhtml"
                 version="1.0">
 
 
+  <div xmlns:xlink="http://www.w3.org/1999/xlink">
 
 
   
diff --git i/doc/src/sgml/stylesheet.xsl w/doc/src/sgml/stylesheet.xsl
index 0eac594f0cc..24a9481fd49 100644
--- i/doc/src/sgml/stylesheet.xsl
+++ w/doc/src/sgml/stylesheet.xsl
@@ -1,7 +1,7 @@
 <?xml version='1.0'?>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 version='1.0'
-                xmlns="http://www.w3.org/TR/xhtml1/transitional"
+                xmlns="http://www.w3.org/1999/xhtml"
                 exclude-result-prefixes="#default">
 
 <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"/>


Re: Adding CI to our tree

2022-02-13 Thread Tom Lane
Andres Freund  writes:
> On 2022-02-13 12:13:17 -0500, Tom Lane wrote:
>> Right.  Can we set things up so that it's not too painful to inject
>> custom build options into a CI build?

> What kind of injection are you thinking about?

That's exactly what needs to be decided.

> A patch author can obviously
> just add options in .cirrus.yml. That's something possible now, that was not
> possible with cfbot applying its own .cirrus.yml

Fine, but are committers supposed to keep track of the fact that they
shouldn't commit that part of a patch?  I'd prefer something a bit more
out-of-band.  I don't know this technology well enough to propose a way.

> I'd like to have things like -fsanitize=alignment and
> -DWRITE_READ_PARSE_PLAN_TREES on by default for CI, primarily for cfbot's
> benefit. Most patch authors won't know about using
> -DWRITE_READ_PARSE_PLAN_TREES etc, so they won't even think about enabling
> them. We're *really* not doing well on the "timely review" side of things, so
> we at least should not waste time on high-latency back-and-forth for easily
> automatically detectable things.

I don't personally have an objection to either of those; maybe Robert
does.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Joseph Koshakow
Actually found an additional overflow issue in
DecodeInterval. I've updated the patch with the
fix. Specifically it prevents trying to negate a field
if it equals INT_MIN. Let me know if this is best
put into another patch.
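
The guard itself is tiny; roughly like the following (an illustrative
sketch of the shape, not the exact hunk from the patch):

#include <limits.h>
#include <stdbool.h>

/*
 * Safely negate one interval field for "ago".  -INT_MIN does not fit
 * in an int, so that one value must be rejected up front.
 */
static bool
negate_interval_field(int *field)
{
    if (*field == INT_MIN)
        return false;           /* negation would overflow */
    *field = -(*field);
    return true;
}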

- Joe Koshakow
From 09eafa9b496c8461f2dc52ea62c9e833ab10a17f Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Additionally adding the word "ago" to the interval negates every
field. However the negative of INT_MIN is still INT_MIN. This
commit adds a check to make sure that we don't try and negate
a field if it's INT_MIN.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 108 -
 src/test/regress/expected/interval.out |  68 
 src/test/regress/sql/interval.sql  |  18 +
 3 files changed, 172 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..49bd12075e 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -44,10 +45,13 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static int AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static int AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static int AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -499,34 +503,76 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 /*
  * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
  * We assume the input frac is less than 1 so overflow is not an issue.
+ * Returns 0 if successful, 1 if tm overflows.
  */
-static void
+static int
 AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			sec;
 
 	if (frac == 0)
-		return;
+		return 0;
 	frac *= scale;
 	sec = (int) frac;
-	tm->tm_sec += sec;
+	if (pg_add_s32_overflow(tm->tm_sec, sec, &tm->tm_sec))
+		return 1;
 	frac -= sec;
 	*fsec += rint(frac * 1000000);
+	return 0;
 }
 
 /* As above, but initial scale produces days */
-static void
+static int
 AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			extra_days;
 
 	if (frac == 0)
-		return;
+		return 0;
 	frac *= scale;
 	extra_days = (int) frac;
-	tm->tm_mday += extra_days;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, &tm->tm_mday))
+		return 1;
 	frac -= extra_days;
-	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+	return AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+}
+
+/* As above, but initial scale produces months */
+static int
+AdjustFractMonths(double frac, struct pg_tm *tm, int scale)
+{
+	int months = rint(frac * MONTHS_PER_YEAR * scale);
+
+	if (pg_add_s32_overflow(tm->tm_mon, months, &tm->tm_mon))
+		return 1;
+	return 0;
+}
+
+/*
+ * Multiply val by multiplier (to produce days) and add to *tm.
+ * Returns 0 if successful, 1 if tm overflows.
+ */
+static int
+AdjustDays(int val, struct pg_tm *tm, int multiplier)
+{
+	int			extra_days;
+	if (pg_mul_s32_overflow(val, multiplier, &extra_days))
+		return 1;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, &tm->tm_mday))
+		return 1;
+	return 0;
+}
+
+/* As above, but initial val produces years */
+static int
+AdjustYears(int val, struct pg_tm *tm, int multiplier)
+{
+	int			years;
+	if (pg_mul_s32_overflow(val, multiplier, &years))
+		return 1;
+	if (pg_add_s32_overflow(tm->tm_year, years, &tm->tm_year))
+		return 1;
+	return 0;
 }
 
 /* Fetch a fractional-second value with suitable error 

Re: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 1173, ...

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 13:27:08 +0100, Tomas Vondra wrote:
> I'm not sure how this could be related to the issue fixed by b779d7d8fd.
> That's merely about handling of empty xacts in the output plugin, it has
> little (nothing) to do with reorderbuffer - the failure was simply
> about (not) printing BEGIN/COMMIT for empty xacts.

I'd only read your commit message - which didn't go into that much detail. I
just saw that the run didn't yet include that commit and that the failed
command specified skip-empty-xacts 1, which your commit described as fixing...


> So why would it be triggering an Assert in reorderbuffer? Also, there
> was a successful run with the test_decoding changes 80901b3291 (and also
> with sequence decoding infrastructure committed a couple days ago) ...

It's clearly a bug that only happens at a lower likelihood...


> Do we have access to the machine? If yes, it'd be interesting to run the
> tests before the fix, and see how often it fails. Also, printing the LSN
> would be interesting, so that we can correlate it to WAL.

We really should provide assert infrastructure that can do comparisons while
also printing values... If C just had proper generics :(.
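
For the integer-typed cases something along these lines would already
help; a rough sketch only (assumed names, assumes operands fit in
intmax_t -- not existing Postgres infrastructure):

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

/* Assert a < b, printing both operand values on failure. */
#define AssertLtWithValues(a, b) \
	do { \
		intmax_t a_ = (intmax_t) (a); \
		intmax_t b_ = (intmax_t) (b); \
		if (!(a_ < b_)) \
		{ \
			fprintf(stderr, "Assert failed: %s < %s (%jd vs %jd) at %s:%d\n", \
					#a, #b, a_, b_, __FILE__, __LINE__); \
			abort(); \
		} \
	} while (0)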

Greetings,

Andres Freund




Re: Adding CI to our tree

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 12:13:17 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > This is exactly why I'm not a huge fan of having ci stuff in the tree.
> > It supposes that there's one right way to do a build, but in reality,
> > different people want and indeed need to use different options for all
> > kinds of reasons.

Sure. But why is that an argument against "having ci stuff in the tree"?

All it does is make sure that a certain base level of testing is easy to
achieve for everyone. I don't like working on windows or mac, but my patches
often have platform dependent bits. Now it's less likely that I need to
manually interact with windows.

I don't think we can (or well should) replace the buildfarm with the CI
stuff. The buildfarm provides extensive and varied coverage for master/release
branches. Which isn't feasible for unmerged development work, including cfbot,
from a resource usage POV alone.


> > That's the whole value of having things like
> > configure and pg_config_manual.h. When we start arguing about whether
> > or not ci builds should use -DWRITE_READ_PARSE_PLAN_TREES we're inevitably
> > into the realm where no choice is objectively better,

> Right.  Can we set things up so that it's not too painful to inject
> custom build options into a CI build?

What kind of injection are you thinking about? A patch author can obviously
just add options in .cirrus.yml. That's something possible now, that was not
possible with cfbot applying its own .cirrus.yml

It'd be nice if there were a way to do it more easily for msvc and configure
builds together; right now it'd require modifying those tasks in different
ways. But that's not really a CI question.


I'd like to have things like -fsanitize=alignment and
-DWRITE_READ_PARSE_PLAN_TREES on by default for CI, primarily for cfbot's
benefit. Most patch authors won't know about using
-DWRITE_READ_PARSE_PLAN_TREES etc, so they won't even think about enabling
them. We're *really* not doing well on the "timely review" side of things, so
we at least should not waste time on high-latency back-and-forth for easily
automatically detectable things.


> I should think that at the very least one needs to be able to vary the
> configure switches and CPPFLAGS/CFLAGS.

Do you mean as part of a patch tested with cfbot, CI running for pushes to
your own repository, or ...?

Greetings,

Andres Freund




xml_is_well_formed (was Re: buildfarm warnings)

2022-02-13 Thread Tom Lane
I wrote:
> This conversation motivated me to take a fresh pass over said filtering
> script, and one thing I notice that it's been ignoring is these complaints
> from all the AIX animals:

> ld: 0711-224 WARNING: Duplicate symbol: .xml_is_well_formed
> ld: 0711-224 WARNING: Duplicate symbol: xml_is_well_formed

> while building contrib/xml2.  Evidently this is because xml2 defines
> a symbol also defined in the core.  We cannot rename xml2's function
> without a user-visible ABI break, but it would be pretty painless
> to rename the core function (at the C level, not SQL level).
> So I propose we do that.

I tried to do that, but it blew up in my face, because it turns out that
actually contrib/xml2's extension script has a by-C-name reference to
the *core* function:

-- deprecated old name for xml_is_well_formed
CREATE FUNCTION xml_valid(text) RETURNS bool
AS 'xml_is_well_formed'
LANGUAGE INTERNAL STRICT STABLE PARALLEL SAFE;

The situation with the instance in xml2 is explained by its comment:

 * Note: this has been superseded by a core function.  We still have to
 * have it in the contrib module so that existing SQL-level references
 * to the function won't fail; but in normal usage with up-to-date SQL
 * definitions for the contrib module, this won't be called.

So we're faced with a dilemma: we can't rename either instance without
breaking something.  The only way to get rid of the warnings seems to be
to decide that the copy in xml2 has outlived its usefulness and remove
it.  I don't think that's a hard argument to make: that version has been
obsolete since 9.1 (a0b7b717a), meaning that only pre-extension versions
of xml2 could need it.  In fact, we pulled the trigger on it once before,
in 2016 (20540710e), and then changed our minds not because anyone
lobbied to put it back but just because we gave up on the PGDLLEXPORT
rearrangement that prompted the change then.

I think that getting rid of these build warnings is sufficient reason
to drop this ancient compatibility function, and now propose that
we do that.

regards, tom lane




Fix overflow in justify_interval related functions

2022-02-13 Thread Joseph Koshakow
I mentioned this issue briefly in another thread, but the
justify_interval, justify_days, and justify_hours functions
all have the potential to overflow. The attached patch fixes
this issue.

Cheers,
Joe Koshakow
From 4400f2e6886097e3b75d455aeec1ffa9cbc88510 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 13 Feb 2022 13:26:16 -0500
Subject: [PATCH] Check for overflow in justify_interval functions

justify_interval, justify_hours, and justify_days didn't check for
overflow when moving hours to days or days to months. This commit adds
checks for overflow and returns an appropriate error.
---
 src/backend/utils/adt/timestamp.c  | 20 
 src/test/regress/expected/interval.out |  8 
 src/test/regress/sql/interval.sql  |  6 ++
 3 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 36f8a84bcc..fb4b3ae0aa 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -2718,11 +2718,17 @@ interval_justify_interval(PG_FUNCTION_ARGS)
 	result->time = span->time;
 
 	TMODULO(result->time, wholeday, USECS_PER_DAY);
-	result->day += wholeday;	/* could overflow... */
+	if (pg_add_s32_overflow(result->day, wholeday, &result->day))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+				 errmsg("interval out of range")));
 
 	wholemonth = result->day / DAYS_PER_MONTH;
 	result->day -= wholemonth * DAYS_PER_MONTH;
-	result->month += wholemonth;
+	if (pg_add_s32_overflow(result->month, wholemonth, &result->month))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+				 errmsg("interval out of range")));
 
 	if (result->month > 0 &&
 		(result->day < 0 || (result->day == 0 && result->time < 0)))
@@ -2772,7 +2778,10 @@ interval_justify_hours(PG_FUNCTION_ARGS)
 	result->time = span->time;
 
 	TMODULO(result->time, wholeday, USECS_PER_DAY);
-	result->day += wholeday;	/* could overflow... */
+	if (pg_add_s32_overflow(result->day, wholeday, &result->day))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+				 errmsg("interval out of range")));
 
 	if (result->day > 0 && result->time < 0)
 	{
@@ -2808,7 +2817,10 @@ interval_justify_days(PG_FUNCTION_ARGS)
 
 	wholemonth = result->day / DAYS_PER_MONTH;
 	result->day -= wholemonth * DAYS_PER_MONTH;
-	result->month += wholemonth;
+	if (pg_add_s32_overflow(result->month, wholemonth, &result->month))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+				 errmsg("interval out of range")));
 
 	if (result->month > 0 && result->day < 0)
 	{
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index accd4a7d90..e2bb8095c3 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -396,6 +396,10 @@ SELECT justify_days(interval '6 months 36 days 5 hours 4 minutes 3 seconds') as
  @ 7 mons 6 days 5 hours 4 mins 3 secs
 (1 row)
 
+SELECT justify_hours(interval '2147483647 days 24 hrs');
+ERROR:  interval out of range
+SELECT justify_days(interval '2147483647 months 30 days');
+ERROR:  interval out of range
 -- test justify_interval()
 SELECT justify_interval(interval '1 month -1 hour') as "1 month -1 hour";
   1 month -1 hour   
@@ -403,6 +407,10 @@ SELECT justify_interval(interval '1 month -1 hour') as "1 month -1 hour";
  @ 29 days 23 hours
 (1 row)
 
+SELECT justify_interval(interval '2147483647 days 24 hrs');
+ERROR:  interval out of range
+SELECT justify_interval(interval '2147483647 months 30 days');
+ERROR:  interval out of range
 -- test fractional second input, and detection of duplicate units
 SET DATESTYLE = 'ISO';
 SET IntervalStyle TO postgres;
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 6d532398bd..3ebdca5023 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -149,10 +149,16 @@ select '1y 10mon -10d -10h -10min -10.01s ago'::interval
 SELECT justify_hours(interval '6 months 3 days 52 hours 3 minutes 2 seconds') as "6 mons 5 days 4 hours 3 mins 2 seconds";
 SELECT justify_days(interval '6 months 36 days 5 hours 4 minutes 3 seconds') as "7 mons 6 days 5 hours 4 mins 3 seconds";
 
+SELECT justify_hours(interval '2147483647 days 24 hrs');
+SELECT justify_days(interval '2147483647 months 30 days');
+
 -- test justify_interval()
 
 SELECT justify_interval(interval '1 month -1 hour') as "1 month -1 hour";
 
+SELECT justify_interval(interval '2147483647 days 24 hrs');
+SELECT justify_interval(interval '2147483647 months 30 days');
+
 -- test fractional second input, and detection of duplicate units
 SET DATESTYLE = 'ISO';
 SET IntervalStyle TO postgres;
-- 
2.25.1



Re: Mark all GUC variable as PGDLLIMPORT

2022-02-13 Thread Chapman Flack
Hi,

On 02/13/22 02:29, Julien Rouhaud wrote:
> Maybe we could have an actually usable GUC API to retrieve values in their
> native format rather than C string for instance, that we could make sure also
> works for cases like max_backend?

I proposed a sketch of such an API for discussion back in [0] (the second
idea in that email, the "what I'd really like" one).

In that scheme, some extension code that was interested in (say,
for some reason) log_statement_sample_rate could say:

static double samprate;
static int gucs_changed = 0;

#define SAMPRATE_CHANGED 1

...
  ObserveTypedConfigValue("log_statement_sample_rate",
                          &samprate, &gucs_changed, SAMPRATE_CHANGED);
...

and will be subscribed to have the native-format value stored into samprate,
and SAMPRATE_CHANGED ORed into gucs_changed, whenever the value changes.
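
Later consumption can then be as cheap as a flag test; for example
(a sketch assuming the declarations above, with the elog() call just a
stand-in for whatever reaction the extension actually needs):

static void
react_to_guc_changes(void)
{
    if (gucs_changed & SAMPRATE_CHANGED)
    {
        /* value changed since we last looked; react exactly once */
        elog(DEBUG1, "log_statement_sample_rate is now %g", samprate);
        gucs_changed &= ~SAMPRATE_CHANGED;
    }
    /* no bsearch, no string parsing: samprate is already a double */
}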

The considerations leading me to that design were:

- avoid subscribing as a 'callback' sort of listener. GUCs can get set
  (or, especially, reset) in delicate situations like error recovery
  where calling out to arbitrary extra code might best be avoided.

- instead, just dump the value in a subscribed location. A copy,
  of course, so no modification there affects the real value.

- but at the same time, OR a flag into a bit set, so subscribing code can
  very cheaply poll for when a value of interest (or any of a bunch of
  values of interest) has changed since last checked.

- do the variable lookup by name once only, and pay no further search cost
  when the subscribing code wants the value.


I never pictured that proposal as the last word on the question, and
different proposals could result from putting different weights on those
objectives, or adding other objectives, but I thought it might serve
as a discussion-starter.

Regards,
-Chap


[0] https://www.postgresql.org/message-id/6123C425.3080409%40anastigmatix.net




Re: Adding CI to our tree

2022-02-13 Thread Tom Lane
Robert Haas  writes:
> This is exactly why I'm not a huge fan of having ci stuff in the tree.
> It supposes that there's one right way to do a build, but in reality,
> different people want and indeed need to use different options for all
> kinds of reasons. That's the whole value of having things like
> configure and pg_config_manual.h. When we start arguing about whether
> or not ci builds should use -DWRITE_READ_PARSE_PLAN_TREES we're inevitably
> into the realm where no choice is objectively better,

Right.  Can we set things up so that it's not too painful to inject
custom build options into a CI build?  I should think that at the
very least one needs to be able to vary the configure switches and
CPPFLAGS/CFLAGS.

regards, tom lane




Re: buildfarm warnings

2022-02-13 Thread Tom Lane
I wrote:
> Justin Pryzby  writes:
>> Is there any check for warnings from new code, other than those buildfarm
>> members with -Werror ?

> I periodically grep the buildfarm logs for interesting warnings.
> There are a lot of uninteresting ones :-(, so I'm not sure how
> automatable that'd be.  I do use a prefiltering script, which
> right now says there are
> 13917 total warnings from 41 buildfarm members
> of which it thinks all but 350 are of no interest.

This conversation motivated me to take a fresh pass over said filtering
script, and one thing I notice that it's been ignoring is these complaints
from all the AIX animals:

ld: 0711-224 WARNING: Duplicate symbol: .xml_is_well_formed
ld: 0711-224 WARNING: Duplicate symbol: xml_is_well_formed

while building contrib/xml2.  Evidently this is because xml2 defines
a symbol also defined in the core.  We cannot rename xml2's function
without a user-visible ABI break, but it would be pretty painless
to rename the core function (at the C level, not SQL level).
So I propose we do that.  I think that when I put in that pattern,
there were more problems, but this seems to be all that's left.

Another thing I've been ignoring on the grounds that I-don't-feel-
like-fixing-this is that the Solaris and OpenIndiana machines whine
about every single reference from loadable modules to the core code,
eg this from haddock while building contrib/fuzzystrmatch:

Undefined   first referenced
 symbol in file
cstring_to_text dmetaphone.o
varstr_levenshtein  fuzzystrmatch.o
text_to_cstring dmetaphone.o
errstart_cold   fuzzystrmatch.o
errmsg  fuzzystrmatch.o
palloc  dmetaphone.o
ExceptionalConditionfuzzystrmatch.o
errfinish   fuzzystrmatch.o
varstr_levenshtein_less_equal   fuzzystrmatch.o
repallocdmetaphone.o
errcode fuzzystrmatch.o
errmsg_internal fuzzystrmatch.o
pg_detoast_datum_packed dmetaphone.o
ld: warning: symbol referencing errors

Presumably, this means that the Solaris linker would really like
to see the executable the module will load against, analogously to
the BE_DLLLIBS settings we use on some other platforms such as macOS.
It evidently still works without that, but we might be losing
performance from an extra indirection or the like.  And in any case,
this amount of noise would be very distracting if you wanted to do
actual development on that platform.  I'm not especially interested
in tracking down the correct incantation, but I'd gladly push a patch
if someone submits one.

regards, tom lane




Re: Adding CI to our tree

2022-02-13 Thread Robert Haas
On Sun, Feb 13, 2022 at 3:30 AM Andres Freund  wrote:
> > There's other things that it'd be nice to enable, but maybe these don't 
> > need to
> > be on by default.  Maybe just have a list of optional compiler flags (and 
> > hints
> > when they're useful).  Like WRITE_READ_PARSE_PLAN_TREES.
>
> I think it'd be good to enable a reasonable set by default. Particularly for
> newer contributors stuff like forgetting in/out/readfuncs, or not knowing
> about some undefined behaviour, is easy. Probably makes sense to use different
> settings on different tasks.

This is exactly why I'm not a huge fan of having ci stuff in the tree.
It supposes that there's one right way to do a build, but in reality,
different people want and indeed need to use different options for all
kinds of reasons. That's the whole value of having things like
configure and pg_config_manual.h. When we start arguing about whether
or not ci builds should use -DWRITE_READ_PARSE_PLAN_TREES we're inevitably
into the realm where no choice is objectively better, and whoever
yells the loudest will get it the way they want, and then somebody
else later will say "well that's dumb I don't like that" or even just
"well that's not the right thing for testing MY patch," at which point
the previous mailing list discussion will be cited as "precedent" for
what was essentially an arbitrary decision made by 1 or 2 people.

Mind you, I'm not trying to hold back the tide. I realize that in-tree
ci stuff is very much in vogue. But I think it would be a very healthy
thing if we acknowledged that what goes in there is far more arbitrary
than most of what we put in the tree.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-02-13 Thread Robert Haas
On Sun, Feb 13, 2022 at 1:34 AM Dilip Kumar  wrote:
> test4:
> 32 GB shared buffers, template DB size = 10GB, dirty shared buffer=70%
> Head: 47656 ms
> Patch: 79767 ms

This seems like the most surprising result of the bunch. Here, the
template DB is both small enough to fit in shared_buffers and small
enough not to trigger a checkpoint all by itself, and yet the patch
loses.

Did you checkpoint between one test and the next, or might this test
have been done after a bunch of WAL had already been written since the
last checkpoint so that the 10GB pushed it over the edge?

BTW, you have test4 twice in your list of results.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: Add support for server-side LZ4 base backup compression.

2022-02-13 Thread Robert Haas
On Sat, Feb 12, 2022 at 11:12 PM Tom Lane  wrote:
> Eh?  copperhead for one is failing for exactly this reason:
>
> Bailout called.  Further testing stopped:  failed to execute command "lz4 -d 
> -m 
> /home/pgbf/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/base.tar.lz4":
>  No such file or directory

Well, that's because I didn't realize that LZ4 might be set to
something that doesn't work at all. So Michael's thing worked, but
because it (in my view) fixed the problem in a more surprising place
than I would have preferred, I made a commit later that turned out to
break the buildfarm. So you can blame either one of us that you like.

Thanks, Michael, for preparing a patch.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Joseph Koshakow
On Sat, Feb 12, 2022 at 10:51 PM Andres Freund  wrote:
> Any reason for using int return types?
>
> particularly since the pg_*_overflow stuff uses bool?

I chose int return types to keep all these methods
consistent with DecodeInterval, which returns a
non-zero int to indicate an error. Though I wasn't sure
if an int or bool would be best, so I'm happy to change
to bool if people think that's better.

Also I'm realizing now that I've incorrectly been using the
number of the patch to indicate the version, instead of just
sticking a v3 to the front. So sorry about that, all the patches
I sent in this thread are the same patch, just different versions.

- Joe Koshakow




Re: Error "initial slot snapshot too large" in create replication slot

2022-02-13 Thread Yura Sokolov
On Mon, 07/02/2022 at 13:52 +0530, Dilip Kumar wrote:
> On Mon, Jan 31, 2022 at 11:50 AM Kyotaro Horiguchi
>  wrote:
> > At Mon, 17 Jan 2022 09:27:14 +0530, Dilip Kumar  
> > wrote in
> > 
> > me> Mmm. The size of the array cannot be larger than the numbers the
> > me> *Count() functions return.  Thus we cannot attach the oversized array
> > me> to ->subxip.  (I don't recall clearly but that would lead to assertion
> > me> failure somewhere..)
> > 
> > Then, I fixed the v3 error and posted v4.
> 
> Yeah you are right, SetTransactionSnapshot() has that assertion.
> Anyway after looking again it appears that
> GetMaxSnapshotSubxidCount is the correct size because this is
> PGPROC_MAX_CACHED_SUBXIDS +1, i.e. it considers top transactions as
> well so we don't need to add them separately.
> 
> > SnapBuildInitialSnapshot tries to store XIDs of both top and sub
> > transactions into snapshot->xip array but the array is easily
> > overflowed and the CREATE_REPLICATION_SLOT command ends with an error.
> > 
> > To fix this, this patch is doing the following things.
> > 
> > - Use subxip array instead of xip array to allow us have larger array
> >   for xids.  So the snapshot is marked as takenDuringRecovery, which
> >   is a kind of abuse but largely reduces the chance of getting
> >   "initial slot snapshot too large" error.
> 
> Right. I think the patch looks fine to me.
> 

Good day.

I've looked at the patch. Personally I'd prefer to dynamically resize the
xip array. But I think there is an issue with upgrades if the replica source
is upgraded before the destination, right?

Concerning the patch, I think more comments should be written about the new
usage of `takenDuringRecovery`. Maybe this field should be renamed
altogether?

And there are checks for `takenDuringRecovery` in `heapgetpage` and
`heapam_scan_sample_next_tuple`. Are these checks affected by the change?
Neither the preceding discussion nor the commit message answers this.

---

regards

Yura Sokolov
Postgres Professional
y.soko...@postgrespro.ru
funny.fal...@gmail.com





Re: Avoiding smgrimmedsync() during nbtree index builds

2022-02-13 Thread Dmitry Dolgov
> On Wed, Feb 09, 2022 at 01:49:30PM -0500, Melanie Plageman wrote:
> Hi,
> v5 attached and all email feedback addressed below

Thanks for the patch, it looks quite good.

I don't see it in the discussion, so I'm naturally curious -- why is directmgr
not used for the bloom index, e.g. in blbuildempty?

> On Sun, Jan 16, 2022 at 3:26 PM Justin Pryzby  wrote:
> > Separate from this issue, I wonder if it'd be useful to write a DEBUG log
> > showing when btree uses shared_buffers vs fsync.  And a regression test 
> > which
> > first SETs client_min_messages=debug to capture the debug log to demonstrate
> > when/that new code path is being hit.  I'm not sure if that would be good to
> > merge, but it may be useful for now.

I can't find the thread right away, but I vaguely remember a similar
situation where such an approach, as the main way of testing a patch,
caused some disagreement. Of course, for the development phase it would
indeed be convenient.




Re: Identify missing publications from publisher while create/alter subscription.

2022-02-13 Thread vignesh C
On Fri, Feb 11, 2022 at 7:14 PM Ashutosh Sharma  wrote:
>
> I have spent a little time looking at the latest patch. The patch looks
> to be in good shape as it has already been reviewed by many people
> here, although I did get some comments. Please take a look and let me
> know your thoughts.
>
>
> +   /* Try to connect to the publisher. */
> +   wrconn = walrcv_connect(sub->conninfo, true, sub->name, );
> +   if (!wrconn)
> +   ereport(ERROR,
> +   (errmsg("could not connect to the publisher: %s", err)));
>
> I think it would be good to also include the errcode
> (ERRCODE_CONNECTION_FAILURE) here?

Modified
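
(The modified code presumably now looks something like this; a sketch
based on the review comment, not the exact v14 hunk:)

    wrconn = walrcv_connect(sub->conninfo, true, sub->name, &err);
    if (!wrconn)
        ereport(ERROR,
                (errcode(ERRCODE_CONNECTION_FAILURE),
                 errmsg("could not connect to the publisher: %s", err)));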

> --
>
> @@ -514,6 +671,8 @@ CreateSubscription(ParseState *pstate,
> CreateSubscriptionStmt *stmt,
>
> PG_TRY();
> {
> +   check_publications(wrconn, publications, 
> opts.validate_publication);
> +
>
>
> Instead of passing the opts.validate_publication argument to the
> check_publications function, why can't we first check if this option is
> set and call check_publications accordingly? For other
> options I see that it has been done in a similar way, e.g. the checks
> for opts.connect, opts.refresh, opts.enabled etc.

Modified
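
(I.e., presumably something like the following now, mirroring how
opts.connect etc. are handled; a sketch, not the exact hunk:)

    if (opts.validate_publication)
        check_publications(wrconn, publications);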

> --
>
> Above comment also applies for:
>
> @@ -968,6 +1130,8 @@ AlterSubscription(ParseState *pstate,
> AlterSubscriptionStmt *stmt,
> replaces[Anum_pg_subscription_subpublications - 1] = true;
>
> update_tuple = true;
> +   connect_and_check_pubs(sub, stmt->publication,
> +  opts.validate_publication);
>

Modified

> --
>
> + 
> +  When true, the command verifies if all the specified publications
> +  that are being subscribed to are present in the publisher and 
> throws
> +  an error if any of the publication doesn't exist. The default is
> +  false.
>
> publication -> publications (in the 4th line : throw an error if any
> of the publication doesn't exist)
>
> This applies for both CREATE and ALTER subscription commands.

Modified

Thanks for the comments, the attached v14 patch has the changes for the same.

Regards,
Vignesh
From a753ffcc3d9ab2aa87d76c52adff10f9192e3ff3 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Sun, 13 Feb 2022 17:48:26 +0530
Subject: [PATCH v14] Identify missing publications from publisher while
 create/alter subscription.

Creating/altering a subscription currently succeeds even when we specify a
publication that does not exist on the publisher. This patch checks whether
the specified publications are present on the publisher and throws an error
if any of them is missing.
---
 doc/src/sgml/ref/alter_subscription.sgml  |  13 ++
 doc/src/sgml/ref/create_subscription.sgml |  20 ++-
 src/backend/commands/subscriptioncmds.c   | 201 +++---
 src/bin/psql/tab-complete.c   |  14 +-
 src/test/subscription/t/007_ddl.pl|  54 ++
 5 files changed, 269 insertions(+), 33 deletions(-)

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 0b027cc346..604bf66b65 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -168,6 +168,19 @@ ALTER SUBSCRIPTION name RENAME TO <
  
 

+
+   
+validate_publication (boolean)
+
+ 
+  When true, the command verifies if all the specified publications
+  that are being subscribed to are present in the publisher and throws
+  an error if any of the publications doesn't exist. The default is
+  false.
+ 
+
+   
+
   
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 990a41f1a1..cd87daeae7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -110,12 +110,14 @@ CREATE SUBSCRIPTION subscription_nametrue.  Setting this to
   false will force the values of
-  create_slot, enabled and
-  copy_data to false.
+  create_slot, enabled,
+  copy_data and
+  validate_publication to false.
   (You cannot combine setting connect
   to false with
   setting create_slot, enabled,
-  or copy_data to true.)
+  copy_data or
+  validate_publication to true.)
  
 
  
@@ -170,6 +172,18 @@ CREATE SUBSCRIPTION subscription_name
 

+
+   
+validate_publication (boolean)
+
+ 
+  When true, the command verifies if all the specified publications
+  that are being subscribed to are present in the publisher and throws
+  an error if any of the publications doesn't exist. The default is
+  false.
+ 
+
+   
   
  
 
diff --git a/src/backend/commands/subscriptioncmds.c 

Re: Identify missing publications from publisher while create/alter subscription.

2022-02-13 Thread vignesh C
On Thu, Feb 10, 2022 at 3:15 PM Ashutosh Sharma  wrote:
>
> On Wed, Feb 9, 2022 at 11:53 PM Euler Taveira  wrote:
> >
> > On Wed, Feb 9, 2022, at 12:06 PM, Ashutosh Sharma wrote:
> >
> > Just wondering if we should also be detecting the incorrect conninfo
> > set with ALTER SUBSCRIPTION command as well. See below:
> >
> > -- try creating a subscription with incorrect conninfo. the command fails.
> > postgres=# create subscription sub1 connection 'host=localhost
> > port=5490 dbname=postgres' publication pub1;
> > ERROR:  could not connect to the publisher: connection to server at
> > "localhost" (::1), port 5490 failed: Connection refused
> > Is the server running on that host and accepting TCP/IP connections?
> > connection to server at "localhost" (127.0.0.1), port 5490 failed:
> > Connection refused
> > Is the server running on that host and accepting TCP/IP connections?
> >
> > That's because by default 'connect' parameter is true.
> >
>
> So can we use this option with the ALTER SUBSCRIPTION command? I think
> we can't, which means that if the user sets a wrong conninfo using the
> ALTER SUBSCRIPTION command then we don't have the option to detect it like
> we have in the case of the CREATE SUBSCRIPTION command. Since this thread is
> trying to add the ability to identify a wrong/missing publication
> name specified with the ALTER SUBSCRIPTION command, can't we do the
> same for a wrong conninfo?

I feel this can be extended once this feature is committed. Thoughts?

Regards,
Vignesh




Re: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 1173, ...

2022-02-13 Thread Tomas Vondra
On 2/13/22 13:27, Tomas Vondra wrote:
> 
> ...
>
> BTW it's probably just a coincidence, but the cfbot does show a strange
> failure on the macOS machine [1], as if we did not wait for all changes
> to be applied by replica. I thought it's because of the issue with
> tracking LSN for sequences [2], but this is with a reworked test that
> does not do just nextval(). So I'm wondering if these two things may be
> related ... both things happened on macOS only.
>

I take this back - this failure was just a thinko in the ROLLBACK test;
I've posted an updated patch fixing it.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: logical decoding and replication of sequences

2022-02-13 Thread Tomas Vondra


On 2/12/22 20:58, Tomas Vondra wrote:
> On 2/12/22 01:34, Tomas Vondra wrote:
>> On 2/10/22 19:17, Tomas Vondra wrote:
>>> I've polished & pushed the first part adding sequence decoding
>>> infrastructure etc. Attached are the two remaining parts.
>>>
>>> I plan to wait a day or two and then push the test_decoding part. The
>>> last part (for built-in replication) will need more work and maybe
>>> rethinking the grammar etc.
>>>
>>
>> I've pushed the second part, adding sequences to test_decoding.
>>
>> Here's the remaining part, rebased, with a small tweak in the TAP test
>> to eliminate the issue with not waiting for sequence increments. I've
>> kept the tweak in a separate patch, so that we can throw it away easily
>> if we happen to resolve the issue.
>>
> 
> Hmm, cfbot was not happy about this, so here's a version fixing the
> elog() format issue reported by CirrusCI/mingw by ditching the log
> message. It was useful for debugging, but otherwise just noise.
> 

There was another elog() making mingw unhappy, so here's a fix for that.

This should also fix an issue on the macOS machine. This is a thinko in
the tests, because wait_for_catchup() may not wait for all the sequence
increments after a rollback. The default mode is "write", which uses
pg_current_wal_lsn(), and that may be a bit stale after a rollback.
Doing a simple insert after the rollback fixes this (using another LSN,
like pg_current_wal_insert_lsn(), would work too, but it'd cause long
waits in the test).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From c2ab633ba457f5563769bd313dac6078b41e439b Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Thu, 10 Feb 2022 15:18:59 +0100
Subject: [PATCH 1/2] Add support for decoding sequences to built-in
 replication

---
 doc/src/sgml/catalogs.sgml  |  71 
 doc/src/sgml/ref/alter_publication.sgml |  24 +-
 doc/src/sgml/ref/alter_subscription.sgml|   4 +-
 src/backend/catalog/pg_publication.c| 149 -
 src/backend/catalog/system_views.sql|  10 +
 src/backend/commands/publicationcmds.c  | 350 +++-
 src/backend/commands/sequence.c |  79 +
 src/backend/commands/subscriptioncmds.c | 272 +++
 src/backend/executor/execReplication.c  |   2 +-
 src/backend/nodes/copyfuncs.c   |   1 +
 src/backend/nodes/equalfuncs.c  |   1 +
 src/backend/parser/gram.y   |  32 ++
 src/backend/replication/logical/proto.c |  52 +++
 src/backend/replication/logical/tablesync.c | 114 ++-
 src/backend/replication/logical/worker.c|  56 
 src/backend/replication/pgoutput/pgoutput.c |  85 -
 src/backend/utils/cache/relcache.c  |   4 +-
 src/bin/psql/tab-complete.c |  14 +-
 src/include/catalog/pg_proc.dat |   5 +
 src/include/catalog/pg_publication.h|  14 +
 src/include/commands/sequence.h |   1 +
 src/include/nodes/parsenodes.h  |   6 +
 src/include/replication/logicalproto.h  |  19 ++
 src/include/replication/pgoutput.h  |   1 +
 src/test/regress/expected/rules.out |   8 +
 src/test/subscription/t/028_sequences.pl| 201 +++
 26 files changed, 1539 insertions(+), 36 deletions(-)
 create mode 100644 src/test/subscription/t/028_sequences.pl

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 879d2dbce03..271dc03e5a2 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9540,6 +9540,11 @@ SCRAM-SHA-256$iteration count:
   prepared transactions
  
 
+ 
+  pg_publication_sequences
+  publications and their associated sequences
+ 
+
  
   pg_publication_tables
   publications and their associated tables
@@ -11375,6 +11380,72 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
  
 
+ 
+  pg_publication_sequences
+
+  
+   pg_publication_sequences
+  
+
+  
+   The view pg_publication_sequences provides
+   information about the mapping between publications and the sequences they
+   contain.  Unlike the underlying catalog
+   pg_publication_rel,
+   this view expands
+   publications defined as FOR ALL SEQUENCES, so for such
+   publications there will be a row for each eligible sequence.
+  
+
+  
+   pg_publication_sequences Columns
+   
+
+ 
+  
+   Column Type
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   pubname name
+   (references pg_publication.pubname)
+  
+  
+   Name of publication
+  
+ 
+
+ 
+  
+   schemaname name
+   (references pg_namespace.nspname)
+  
+  
+   Name of schema containing sequence
+  
+ 
+
+ 
+  
+   sequencename name
+   (references pg_class.relname)
+  
+  
+   Name of sequence
+  
+ 
+
+   
+  
+ 
+
  
   pg_publication_tables
 
diff --git 

Re: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 1173, ...

2022-02-13 Thread Tomas Vondra



On 2/13/22 01:28, Andres Freund wrote:
> Hi, 
> 
> On February 12, 2022 3:57:28 PM PST, Thomas Munro  
> wrote:
>> Hi,
>>
>> I don't know what happened here, but I my little script that scrapes
>> for build farm assertion failures hasn't seen this one before now.
>> It's on a 15 year old macOS release and compiled with 15 year old GCC
>> version, for some added mystery.
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust=2022-02-12%2022%3A07%3A21
>>
>> 2022-02-13 00:09:52.859 CET [92423:92] pg_regress/replorigin
>> STATEMENT:  SELECT data FROM
>> pg_logical_slot_get_changes('regression_slot', NULL, NULL,
>> 'include-xids', '0', 'skip-empty-xacts', '1', 'include-sequences',
>> '0');
>> TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File:
>> "reorderbuffer.c", Line: 1173, PID: 92423)
>> 0   postgres0x00592dd0 ExceptionalCondition + 
>> 160\\0
>> 1   postgres0x00391c40
>> ReorderBufferGetRelids + 304\\0
>> 2   postgres0x00394794
>> ReorderBufferAllocate + 1044\\0
>> 3   postgres0x003834dc heap_decode + 76\\0
>> 4   postgres0x003829c8
>> LogicalDecodingProcessRecord + 168\\0
>> 5   postgres0x0038a1d4
>> pg_logical_emit_message_text + 1876\\0
>> 6   postgres0x00241810
>> ExecMakeTableFunctionResult + 576\\0
>> 7   postgres0x002576c4
>> ExecInitFunctionScan + 1444\\0
>> 8   postgres0x00242140 ExecScan + 896\\0
>> 9   postgres0x002397cc standard_ExecutorRun + 
>> 540\\0
>> 10  postgres0x00427940
>> EnsurePortalSnapshotExists + 832\\0
>> 11  postgres0x00428e84 PortalRun + 644\\0
>> 12  postgres0x004252e0 pg_plan_queries + 3232\\0
>> 13  postgres0x00426270 PostgresMain + 3424\\0
>> 14  postgres0x0036b9ec PostmasterMain + 8060\\0
>> 15  postgres0x0029f24c main + 1356\\0
>> 16  postgres0x2524 start + 68\\0
> 
> I think that may be the bug that Tomas fixed a short while ago. He
> said the skip empty xacts stuff didn't yet work for sequences. That's
> only likely to be hit on slower machines...
> 

I'm not sure how this could be related to the issue fixed by b779d7d8fd.
That's merely about the handling of empty xacts in the output plugin; it
has little (nothing) to do with reorderbuffer. The failure was simply
about (not) printing BEGIN/COMMIT for empty xacts.

So why would it be triggering an Assert in reorderbuffer? Also, there
was a successful run with the test_decoding changes in 80901b3291 (and also
with the sequence decoding infrastructure committed a couple of days ago) ...

Do we have access to the machine? If so, it'd be interesting to run the
tests without the fix and see how often they fail. Also, printing the
LSNs would be interesting, so that we can correlate them with the WAL.
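
Something as simple as this in AssertTXNLsnOrder() would do (a throwaway
debugging sketch, assuming the variable names visible in the assertion):

    elog(LOG, "xid %u: first_lsn %X/%X, prev_first_lsn %X/%X",
         cur_txn->xid,
         LSN_FORMAT_ARGS(cur_txn->first_lsn),
         LSN_FORMAT_ARGS(prev_first_lsn));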

BTW it's probably just a coincidence, but the cfbot does show a strange
failure on the macOS machine [1], as if we did not wait for all changes
to be applied by replica. I thought it's because of the issue with
tracking LSN for sequences [2], but this is with a reworked test that
does not do just nextval(). So I'm wondering if these two things may be
related ... both things happened on macOS only.


[1] https://cirrus-ci.com/task/6299472241098752

[2]
https://www.postgresql.org/message-id/712cad46-a9c8-1389-aef8-faf0203c9be9%40enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Consistently use "startup process"/"WAL sender" instead of "Startup process"/"WAL Sender" in comments and docs.

2022-02-13 Thread John Naylor
On Sun, Feb 13, 2022 at 3:13 PM Bharath Rupireddy
 wrote:
>
> Hi,
>
> In the code comments, "Startup process" is sometimes used in the
> middle of sentences, whereas in most places "startup
> process" is used. Also, "WAL Sender" is sometimes used instead of "WAL
> sender". Let's be consistent across the docs and code comments.

FWIW, docs need to be held to a higher standard than code comments.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: Non-decimal integer literals

2022-02-13 Thread John Naylor
On Wed, Jan 26, 2022 at 10:10 PM Peter Eisentraut
 wrote:
> [v8 patch]

0001-0004 seem pretty straightforward.

0005:

 {realfail1} {
- /*
- * throw back the [Ee], and figure out whether what
- * remains is an {integer} or {decimal}.
- */
- yyless(yyleng - 1);
  SET_YYLLOC();
- return process_integer_literal(yytext, yylval);
+ yyerror("trailing junk after numeric literal");
  }

realfail1 has been subsumed by integer_junk and decimal_junk, so that
pattern can be removed.

 {
+/*
+ * Note that some trailing junk is valid in C (such as 100LL), so we contain
+ * this to SQL mode.
+ */

It seems Flex doesn't like C comments after the "%%", so this stanza
was indented in 0006. If these are to be committed separately, that
fix should happen here.

0006:

Minor nit -- the s/decimal/numeric/ change doesn't seem to have any
notational advantage, but it's not worse, either.

0007:

I've attached an addendum to restore the no-backtrack property.

Will the underscore syntax need treatment in the input routines as well?
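
If so, a minimal sketch of the kind of handling I mean (a hypothetical
standalone helper, not proposing this exact shape; overflow checks
omitted):

    #include <ctype.h>
    #include <stdbool.h>

    /* Accept '_' digit separators between digits, e.g. 1_000_000. */
    static bool
    parse_int_with_underscores(const char *s, long *result)
    {
        long    val = 0;
        bool    seen_digit = false;

        for (; *s != '\0'; s++)
        {
            if (*s == '_')
            {
                /* an underscore must sit between two digits */
                if (!seen_digit || !isdigit((unsigned char) s[1]))
                    return false;
                continue;
            }
            if (!isdigit((unsigned char) *s))
                return false;
            seen_digit = true;
            val = val * 10 + (*s - '0');    /* overflow check omitted */
        }
        if (!seen_digit)
            return false;
        *result = val;
        return true;
    }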

-- 
John Naylor
EDB: http://www.enterprisedb.com
diff --git a/src/backend/parser/Makefile b/src/backend/parser/Makefile
index 827bc4c189..5ddb9a92f0 100644
--- a/src/backend/parser/Makefile
+++ b/src/backend/parser/Makefile
@@ -56,7 +56,7 @@ gram.c: BISON_CHECK_CMD = $(PERL) $(srcdir)/check_keywords.pl $< $(top_srcdir)/s
 
 
 scan.c: FLEXFLAGS = -CF -p -p
-#scan.c: FLEX_NO_BACKUP=yes
+scan.c: FLEX_NO_BACKUP=yes
 scan.c: FLEX_FIX_WARNING=yes
 
 
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index 5b574c4233..3b311ac2dd 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -400,9 +400,9 @@ hexinteger  0[xX](_?{hexdigit})+
 octinteger		0[oO](_?{octdigit})+
 bininteger		0[bB](_?{bindigit})+
 
-hexfail		0[xX]
-octfail		0[oO]
-binfail		0[bB]
+hexfail		0[xX]_?
+octfail		0[oO]_?
+binfail		0[bB]_?
 
 numeric			(({decinteger}\.{decinteger}?)|(\.{decinteger}))
 numericfail		{decdigit}+\.\.


Re: pgsql: pg_upgrade: Preserve relfilenodes and tablespace OIDs.

2022-02-13 Thread Christoph Berg
Re: Robert Haas
> pg_upgrade: Preserve relfilenodes and tablespace OIDs.

> src/bin/pg_dump/pg_dumpall.c   |   3 +

--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -1066,6 +1066,9 @@ dumpTablespaces(PGconn *conn)
/* needed for buildACLCommands() */
fspcname = pg_strdup(fmtId(spcname));

+   appendPQExpBufferStr(buf, "\n-- For binary upgrade, must preserve pg_table
+   appendPQExpBuffer(buf, "SELECT pg_catalog.binary_upgrade_set_next_pg_table

This needs to be guarded with "if (binary_upgrade)".
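
I.e., something along these lines (a sketch; binary_upgrade is the
existing pg_dumpall flag, and spcoid is my guess for whatever variable
holds the tablespace OID):

    if (binary_upgrade)
    {
        appendPQExpBufferStr(buf, "\n-- For binary upgrade, must preserve pg_tablespace oid\n");
        appendPQExpBuffer(buf, "SELECT pg_catalog.binary_upgrade_set_next_pg_tablespace_oid"
                          "('%u'::pg_catalog.oid);\n", spcoid);
    }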

Error message during a Debian pg_upgradecluster (-m dump) from 14 to 15:

2022-02-13 12:44:01.272 CET [168032] postgres@template1 LOG:  statement: SELECT pg_catalog.binary_upgrade_set_next_pg_tablespace_oid('16408'::pg_catalog.oid);
2022-02-13 12:44:01.272 CET [168032] postgres@template1 ERROR:  function can only be called when server is in binary upgrade mode
2022-02-13 12:44:01.272 CET [168032] postgres@template1 STATEMENT:  SELECT pg_catalog.binary_upgrade_set_next_pg_tablespace_oid('16408'::pg_catalog.oid);

Christoph




Re: Adding CI to our tree

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-12 23:19:38 -0600, Justin Pryzby wrote:
> On Sat, Feb 12, 2022 at 05:20:08PM -0800, Andres Freund wrote:
> > What was the reason behind moving the docs stuff from the compiler warnings
> > task to linux?
> 
> I wanted to build the docs even if the linux task fails, to allow CFBOT to
> link to them so someone can always review the docs in HTML (rather than
> reading SGML with lines prefixed with +).

I'd be ok with running the compiler warnings job without the dependency, if
that's the connection.


> BTW, docs can be built in parallel, and CI is using BUILD_JOBS: 4.
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
/usr/bin/xsltproc --path . --stringparam pg.version '15devel' stylesheet.xsl postgres.sgml
/usr/bin/xsltproc --path . --stringparam pg.version '15devel' stylesheet-man.xsl postgres.sgml

Sure, it just doesn't make a difference:

make -j48 -C doc/src/sgml/ maintainer-clean && time make -j48 -C doc/src/sgml/
real0m34.626s
user0m34.342s
sys 0m0.298s

make -j48 -C doc/src/sgml/ maintainer-clean && time make -C doc/src/sgml/

real0m34.780s
user0m34.494s
sys 0m0.285s



> > 1) iterate over release branches and see which has the smallest diff
> 
> Maybe for each branch: do echo `git revlist or merge base |wc -l` $branch; done |sort -n |head -1
> 
> > > > Is looking at the .c files in the change really a reliable predictor of
> > > > where code coverage changes? I'm doubtful. Consider stuff like removing
> > > > the last user of some infrastructure or such. Or adding the first.
> > >
> > > You're right that it isn't particularly accurate, but it's a step forward
> > > if lots of patches use it to check/improve coverage of new code.
> > 
> > Maybe it's good enough... The overhead in test runtime is noticeable (~5.30m
> > -> ~7.15m), but probably acceptable. Although I would also like to enable
> > -fsanitize=alignment and -fsanitize=undefined, which add about 2 minutes as
> > well (-fsanitize=address is a lot more expensive); they both work best on
> > linux.
> 
> There are other things that it'd be nice to enable, but maybe these don't
> need to be on by default.  Maybe just have a list of optional compiler flags
> (and hints about when they're useful).  Like WRITE_READ_PARSE_PLAN_TREES.

I think it'd be good to enable a reasonable set by default. Particularly for
newer contributors, mistakes like forgetting in/out/readfuncs support, or not
knowing about some undefined behaviour, are easy to make. It probably makes
sense to use different settings on different tasks.

Greetings,

Andres Freund




Consistently use "startup process"/"WAL sender" instead of "Startup process"/"WAL Sender" in comments and docs.

2022-02-13 Thread Bharath Rupireddy
Hi,

In the code comments, "Startup process" is sometimes used in the
middle of sentences, whereas in most places "startup
process" is used. Also, "WAL Sender" is sometimes used instead of "WAL
sender". Let's be consistent across the docs and code comments.

Attaching a patch. Thoughts?

Regards,
Bharath Rupireddy.


v1-0001-s-Startup-process-startup-process-and-s-WAL-Sende.patch
Description: Binary data