Re: Support logical replication of DDLs

2023-04-25 Thread Amit Kapila
On Wed, Apr 26, 2023 at 10:01 AM Masahiko Sawada  wrote:
>
> On Tue, Apr 25, 2023 at 12:58 PM Zhijie Hou (Fujitsu)
>  wrote:
> >
> > Apart from the above comments, I split the code related to verbose
> > mode into a separate patch. And here is the new version patch set.
> >
>
> As for DDL replication, we create event triggers to write deparsed DDL
> commands to WAL when creating a publication with the ddl option. The
> event triggers are recreated/dropped at ALTER/DROP PUBLICATION. I'm
> concerned that DDLs executed while such a publication does not exist
> are not replicated. For example, imagine the following steps:
>
> 1. CREATE PUBLICATION test_pub ... WITH (ddl = 'table');
> 2. CREATE SUBSCRIPTION test_sub ... PUBLICATION test_pub;
> 3. ALTER SUBSCRIPTION test_sub DISABLE;
> 4. DROP PUBLICATION test_pub;
> 5. CREATE PUBLICATION test_pub ... WITH (ddl = 'table');
> 6. ALTER SUBSCRIPTION test_sub ENABLE;
>
> DDLs executed between 4 and 5 won't be replicated.
>

But we won't even send any DMLs between 4 and 5. In fact, WALSender
will give an error for those DMLs because the publication doesn't exist,
as it uses a historic snapshot. So, why do we expect DDLs between the
drop and re-creation of the publication to be replicated?

-- 
With Regards,
Amit Kapila.




Re: Support logical replication of DDLs

2023-04-25 Thread Masahiko Sawada
On Tue, Apr 25, 2023 at 12:58 PM Zhijie Hou (Fujitsu)
 wrote:
>
> Apart from the above comments, I split the code related to verbose
> mode into a separate patch. And here is the new version patch set.
>

As for DDL replication, we create event triggers to write deparsed DDL
commands to WAL when creating a publication with the ddl option. The
event triggers are recreated/dropped at ALTER/DROP PUBLICATION. I'm
concerned that DDLs executed while such a publication does not exist
are not replicated. For example, imagine the following steps:

1. CREATE PUBLICATION test_pub ... WITH (ddl = 'table');
2. CREATE SUBSCRIPTION test_sub ... PUBLICATION test_pub;
3. ALTER SUBSCRIPTION test_sub DISABLE;
4. DROP PUBLICATION test_pub;
5. CREATE PUBLICATION test_pub ... WITH (ddl = 'table');
6. ALTER SUBSCRIPTION test_sub ENABLE;

DDLs executed between 4 and 5 won't be replicated. The same is true
when we unset the ddl option instead of dropping the publication. IIUC
it seems not to be a good idea to tie the event triggers to
publications. I don't have any good alternative ideas for now, though.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-04-25 Thread vignesh C
On Tue, 25 Apr 2023 at 12:51, Drouvot, Bertrand
 wrote:
>
> Hi,
>
> On 4/25/23 6:23 AM, Amit Kapila wrote:
> > On Mon, Apr 24, 2023 at 3:36 PM Drouvot, Bertrand
> >  wrote:
> >>
> >> Without the second "pg_log_standby_snapshot()" then 
> >> wait_for_subscription_sync() would be waiting
> >> some time on the poll for "SELECT count(1) = 0 FROM pg_subscription_rel 
> >> WHERE srsubstate NOT IN ('r', 's');"
> >>
> >> Adding a comment in V3 to explain the need for the second 
> >> pg_log_standby_snapshot().
> >>
> >
> > Won't this still be unpredictable because it is possible that the
> > tablesync worker may take more time to get launched or create a
> > replication slot? If that happens after your second
> > pg_log_standby_snapshot() then wait_for_subscription_sync() will be
> > hanging.
>
> Oh right, that looks like a possible scenario.
>
> > Wouldn't it be better to create a subscription with
> > (copy_data = false) to make it predictable and then we won't need
> > pg_log_standby_snapshot() to be performed twice?
> >
> > If you agree with the above suggestion then you probably need to move
> > wait_for_subscription_sync() before Insert.
> >
>
> I like that idea, thanks! Done in V4 attached.
>
> Not related to the above corner case, but while re-reading the patch I also 
> added:
>
> "
> $node_primary->wait_for_replay_catchup($node_standby);
> "
>
> between the publication creation on the primary and the subscription creation
> on the standby (to ensure the publication gets replicated before we request
> the subscription creation).

Thanks for the updated patch.
A few comments:
1) subscriber_stdout and subscriber_stderr are not required for this
test case, so we could remove them. I was able to remove those variables
and run the test successfully:
+$node_subscriber->start;
+
+my %psql_subscriber = (
+   'subscriber_stdin'  => '',
+   'subscriber_stdout' => '',
+   'subscriber_stderr' => '');
+$psql_subscriber{run} = IPC::Run::start(
+   [ 'psql', '-XA', '-f', '-', '-d',
$node_subscriber->connstr('postgres') ],
+   '<',
+   \$psql_subscriber{subscriber_stdin},
+   '>',
+   \$psql_subscriber{subscriber_stdout},
+   '2>',
+   \$psql_subscriber{subscriber_stderr},
+   $psql_timeout);

I ran it like:
my %psql_subscriber = (
	'subscriber_stdin' => '');
$psql_subscriber{run} = IPC::Run::start(
	[ 'psql', '-XA', '-f', '-', '-d', $node_subscriber->connstr('postgres') ],
	'<',
	\$psql_subscriber{subscriber_stdin},
	$psql_timeout);

2) Also, we have changed the default timeout here; why is this change required:
 my $node_cascading_standby =
PostgreSQL::Test::Cluster->new('cascading_standby');
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
 my $default_timeout = $PostgreSQL::Test::Utils::timeout_default;
+my $psql_timeout= IPC::Run::timer(2 * $default_timeout);

Regards,
Vignesh




[pg_rewind] use the passing callback instead of global function

2023-04-25 Thread Junwang Zhao
`local_traverse_files` and `libpq_traverse_files` both take a
callback parameter but call the global process_source_file instead,
which is bad for function encapsulation.
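
For illustration, a self-contained sketch of the encapsulation point (this is
not pg_rewind's actual API; the callback type and file list below are made up):
the traversal should invoke the callback it is given rather than a hard-coded
global handler.

#include <stddef.h>
#include <stdio.h>

/* Purely illustrative callback type, not pg_rewind's process_file_callback_t. */
typedef void (*process_file_callback_t) (const char *path, size_t size);

static void
print_file(const char *path, size_t size)
{
	printf("%s (%zu bytes)\n", path, size);
}

static void
traverse_files(const char *const *paths, const size_t *sizes, int n,
			   process_file_callback_t callback)
{
	for (int i = 0; i < n; i++)
		callback(paths[i], sizes[i]);	/* use the parameter, not a global function */
}

int
main(void)
{
	const char *paths[] = {"base/1/1259", "global/pg_control"};
	const size_t sizes[] = {8192, 8192};

	traverse_files(paths, sizes, 2, print_file);
	return 0;
}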

-- 
Regards
Junwang Zhao


0001-pg_rewind-use-the-passing-callback-instead-of-global.patch
Description: Binary data


Re: Order changes in PG16 since ICU introduction

2023-04-25 Thread Jeff Davis
On Fri, 2023-04-21 at 22:35 +0100, Andrew Gierth wrote:
> > > > > 
> Can lc_collate_is_c() be taught to check whether an ICU locale is
> using
> POSIX collation?

Attached are a few small patches:

  0001: don't convert C to en-US-u-va-posix
  0002: handle locale C the same regardless of the provider, as you
suggest above
  0003: make LOCALE (or --locale) apply to everything including ICU

As far as I can tell, any libc locale has a reasonable match in ICU, so
setting LOCALE to either C or a libc locale name should be fine. Some
locales are only valid in ICU, e.g. '@colStrength=primary', or a
language tag representation, so if you do something like:

  create database foo locale 'en_US@colStrenghth=primary'
template template0;

You'll get a decent error like:

  ERROR:  invalid LC_COLLATE locale name: "en_US@colStrenghth=primary"
  HINT:  If the locale name is specific to ICU, use ICU_LOCALE.

Overall, I think it works out nicely. Let me know if there are still
some confusing cases. I tried a few variations and this one seemed the
best, but I may have missed something.

Regards,
Jeff Davis

From c768e040dc92b033e4eb0e69f08b59d8d1ffe1e4 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Fri, 21 Apr 2023 14:03:57 -0700
Subject: [PATCH v2 1/3] ICU: do not convert locale 'C' to 'en-US-u-va-posix'.

The conversion was intended to be for convenience, but it's more
likely to be confusing than useful.

The user can still directly specify 'en-US-u-va-posix' if desired.

Discussion: https://postgr.es/m/f83f089ee1e9acd5dbbbf3353294d24e1f196e95.ca...@j-davis.com
---
 src/backend/utils/adt/pg_locale.c | 19 +--
 src/bin/initdb/initdb.c   | 17 +
 .../regress/expected/collate.icu.utf8.out |  8 
 src/test/regress/sql/collate.icu.utf8.sql |  4 
 4 files changed, 14 insertions(+), 34 deletions(-)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 51df570ce9..58c4c426bc 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -2782,26 +2782,10 @@ icu_language_tag(const char *loc_str, int elevel)
 {
 #ifdef USE_ICU
 	UErrorCode	 status;
-	char		 lang[ULOC_LANG_CAPACITY];
 	char		*langtag;
 	size_t		 buflen = 32;	/* arbitrary starting buffer size */
 	const bool	 strict = true;
 
-	status = U_ZERO_ERROR;
-	uloc_getLanguage(loc_str, lang, ULOC_LANG_CAPACITY, &status);
-	if (U_FAILURE(status))
-	{
-		if (elevel > 0)
-			ereport(elevel,
-	(errmsg("could not get language from locale \"%s\": %s",
-			loc_str, u_errorName(status))));
-		return NULL;
-	}
-
-	/* C/POSIX locales aren't handled by uloc_getLanguageTag() */
-	if (strcmp(lang, "c") == 0 || strcmp(lang, "posix") == 0)
-		return pstrdup("en-US-u-va-posix");
-
 	/*
 	 * A BCP47 language tag doesn't have a clearly-defined upper limit
 	 * (cf. RFC5646 section 4.4). Additionally, in older ICU versions,
@@ -2889,8 +2873,7 @@ icu_validate_locale(const char *loc_str)
 
 	/* check for special language name */
 	if (strcmp(lang, "") == 0 ||
-		strcmp(lang, "root") == 0 || strcmp(lang, "und") == 0 ||
-		strcmp(lang, "c") == 0 || strcmp(lang, "posix") == 0)
+		strcmp(lang, "root") == 0 || strcmp(lang, "und") == 0)
 		found = true;
 
 	/* search for matching language within ICU */
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 2c208ead01..4086834458 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2238,24 +2238,10 @@ icu_language_tag(const char *loc_str)
 {
 #ifdef USE_ICU
 	UErrorCode	 status;
-	char		 lang[ULOC_LANG_CAPACITY];
 	char		*langtag;
 	size_t		 buflen = 32;	/* arbitrary starting buffer size */
 	const bool	 strict = true;
 
-	status = U_ZERO_ERROR;
-	uloc_getLanguage(loc_str, lang, ULOC_LANG_CAPACITY, &status);
-	if (U_FAILURE(status))
-	{
-		pg_fatal("could not get language from locale \"%s\": %s",
- loc_str, u_errorName(status));
-		return NULL;
-	}
-
-	/* C/POSIX locales aren't handled by uloc_getLanguageTag() */
-	if (strcmp(lang, "c") == 0 || strcmp(lang, "posix") == 0)
-		return pstrdup("en-US-u-va-posix");
-
 	/*
 	 * A BCP47 language tag doesn't have a clearly-defined upper limit
 	 * (cf. RFC5646 section 4.4). Additionally, in older ICU versions,
@@ -2327,8 +2313,7 @@ icu_validate_locale(const char *loc_str)
 
 	/* check for special language name */
 	if (strcmp(lang, "") == 0 ||
-		strcmp(lang, "root") == 0 || strcmp(lang, "und") == 0 ||
-		strcmp(lang, "c") == 0 || strcmp(lang, "posix") == 0)
+		strcmp(lang, "root") == 0 || strcmp(lang, "und") == 0)
 		found = true;
 
 	/* search for matching language within ICU */
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index b5a221b030..99f12d2e73 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1020,6 +1020,7 @@ CREATE ROLE regress_test_role;
 CREATE SCHEMA test_schema;
 -- We need to do 

Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call

2023-04-25 Thread Thomas Munro
On Tue, Apr 25, 2023 at 12:16 PM Andres Freund  wrote:
> On 2023-04-24 15:32:25 -0700, Andres Freund wrote:
> > We obviously can add a retry loop to FileFallocate(), similar to what's
> > already present e.g. in FileRead(). But I wonder if we shouldn't go a bit
> > further, and do it for all the fd.c routines where it's remotely plausible
> > EINTR could be returned? It's a bit silly to add EINTR retries one-by-one to
> > the functions.
> >
> >
> > The following are documented to potentially return EINTR, without fd.c 
> > having
> > code to retry:
> >
> > - FileWriteback() / pg_flush_data()
> > - FileSync() / pg_fsync()
> > - FileFallocate()
> > - FileTruncate()
> >
> > With the first two there's the added complication that it's not entirely
> > obvious whether it'd be better to handle this in File* or pg_*. I'd argue 
> > the
> > latter is a bit more sensible?
>
> A prototype of that approach is attached. I pushed the retry handling into the
> pg_* routines where applicable.  I guess we could add pg_* routines for
> FileFallocate(), FilePrewarm() etc as well, but I didn't do that here.

One problem we ran into with the shm_open() case (which is nearly
identical under the covers, since shm_open() just opens a file in a
tmpfs on Linux) was that a simple retry loop like this could never
terminate if the process was receiving a lot of signals from the
recovery process, which is why we went with the idea of masking
signals instead.  Eventually we should probably grow the file in
smaller chunks with a CFI in between so that we both guarantee that we
make progress (by masking for smaller size increases) and service
interrupts in a timely fashion (by unmasking between loops).  I don't
think that applies here because we're not trying to fallocate
humongous size increases in one go, but I just want to note that we're
making a different choice.  I think this looks reasonable for the use
cases we actually have here.
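
For context, a minimal standalone sketch of the simple retry approach being
discussed (plain POSIX, not the actual fd.c patch); note that posix_fallocate()
reports errors through its return value rather than errno:

#include <errno.h>
#include <fcntl.h>

/*
 * Sketch only: retry posix_fallocate() while it is interrupted by a signal.
 * As noted above, a loop like this can in principle be starved if signals
 * arrive continuously; masking signals or growing the file in smaller chunks
 * with interrupt checks in between are the alternatives mentioned in this
 * thread.
 */
static int
fallocate_retry(int fd, off_t offset, off_t len)
{
	int			rc;

	do
	{
		rc = posix_fallocate(fd, offset, len);
	} while (rc == EINTR);

	return rc;					/* 0 on success, an errno value on failure */
}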




Re: pg_stat_io for the startup process

2023-04-25 Thread Andres Freund
Hi,

On 2023-04-25 16:00:24 -0400, Robert Haas wrote:
> On Tue, Apr 25, 2023 at 2:39 PM Andres Freund  wrote:
> > I'm mildly inclined to not consider it a bug, given that this looks to have
> > been true for other stats for quite a while? But it does still seem worth
> > improving upon - I'd make the consideration when to apply the relevant 
> > patches
> > depend on the complexity. I'm worried we'd need to introduce sufficiently 
> > new
> > infrastructure that 16 doesn't seem like a good idea. Let's come up with a
> > patch and judge it after?
> 
> ISTM that it's pretty desirable to do something about this. If the
> process isn't going to report statistics properly, at least remove it
> from the view.

It's populated after crash recovery, when shutting down, and at the time of
promotion, so that isn't *completely* crazy.


> If it can be made to report properly, that would be even better. But
> shipping a new view with information that will nearly always be zeroes
> instead of real data seems like a bad call, even if there are existing cases
> that have the same problem.

I refreshed my memory: The startup process has indeed behaved that way for
much longer than pg_stat_io existed - but it's harder to spot, because the
stats are more coarsely aggregated :/. And it's very oddly inconsistent:

The startup process doesn't report per-relation read/hit (it might when we
create a fake relcache entry; too lazy to check what happens exactly), because we
key those stats by oid. However, it *does* report the read/write time. But
only at process exit, of course.  The weird part is that the startup process
does *NOT* increase pg_stat_database.blks_read/blks_hit, because instead of
basing those on pgBufferUsage.shared_blks_read etc, we compute them based on
the relation level stats. pgBufferUsage is just used for EXPLAIN.  This isn't
recent, afaict.

TL;DR: Currently the startup process maintains blk_read_time, blk_write_time,
but doesn't maintain blks_read, blks_hit - which doesn't make sense.

Yikes.

Greetings,

Andres Freund




Re: pg_stat_io for the startup process

2023-04-25 Thread Michael Paquier
On Tue, Apr 25, 2023 at 04:00:24PM -0400, Robert Haas wrote:
> ISTM that it's pretty desirable to do something about this. If the
> process isn't going to report statistics properly, at least remove it
> from the view. If it can be made to report properly, that would be
> even better. But shipping a new view with information that will nearly
> always be zeroes instead of real data seems like a bad call, even if
> there are existing cases that have the same problem.

Agreed that reporting no information may be better than reporting
incorrect information, even if it means an extra qual.  As mentioned
upthread, if this requires an extra design piece, adding the correct
information had better be pushed to v17~.

Perhaps an open item should be added?
--
Michael


signature.asc
Description: PGP signature


Re: Allow pg_archivecleanup to remove backup history files

2023-04-25 Thread Michael Paquier
On Tue, Apr 25, 2023 at 05:29:48PM +0900, Kyotaro Horiguchi wrote:
> I thought that we have decided not to do that, but I coundn't find any
> discussion about it in the ML archive.  Anyway, I think it is great
> that we have that option.

No objections from here to making that optional.  It's been argued a
couple of times that leaving the archive history files is good for
debugging, but I can also see why one would want to clean up all the
files older than a WAL retention policy.  Even if these are just a few
bytes, it can be noisy for the eye to see a large accumulation of
history files mixed with the WAL segments.
--
Michael


signature.asc
Description: PGP signature


Re: Orphaned wait event

2023-04-25 Thread Michael Paquier
On Tue, Apr 25, 2023 at 09:24:58PM +0530, Bharath Rupireddy wrote:
> It looks like this patch attached upthread at [1] isn't in yet,
> meaning WAIT_EVENT_SLRU_FLUSH_SYNC stays unused. IMO, it's worth
> pushing it to the PG16 branch. It will help add a wait event for SLRU
> page flushes. Thoughts?
> 
> [1] 
> https://www.postgresql.org/message-id/CA%2BhUKG%2BewEpxm%3DhPNXyupRUB_SKGh-6tO86viaco0g-P_pm_Cw%40mail.gmail.com

There could be an argument that some external code could abuse
this value for its own needs, but I don't really buy that.  I'll go
clean that up.
--
Michael


signature.asc
Description: PGP signature


Re: pg_stat_io for the startup process

2023-04-25 Thread Robert Haas
On Tue, Apr 25, 2023 at 2:39 PM Andres Freund  wrote:
> I'm mildly inclined to not consider it a bug, given that this looks to have
> been true for other stats for quite a while? But it does still seem worth
> improving upon - I'd make the consideration when to apply the relevant patches
> depend on the complexity. I'm worried we'd need to introduce sufficiently new
> infrastructure that 16 doesn't seem like a good idea. Let's come up with a
> patch and judge it after?

ISTM that it's pretty desirable to do something about this. If the
process isn't going to report statistics properly, at least remove it
from the view. If it can be made to report properly, that would be
even better. But shipping a new view with information that will nearly
always be zeroes instead of real data seems like a bad call, even if
there are existing cases that have the same problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_stat_io for the startup process

2023-04-25 Thread Andres Freund
Hi,

On 2023-04-25 13:54:43 -0400, Melanie Plageman wrote:
> On Tue, Apr 25, 2023 at 10:51:14PM +0900, Fujii Masao wrote:
> > Regarding pg_stat_io for the startup process, I noticed that the counters
> > are only incremented after the startup process exits, not during WAL replay
> > in standby mode. This is because pgstat_flush_io() is only called when
> > the startup process exits. Shouldn't it be called during WAL replay as well
> > to report IO statistics by the startup process even in standby mode?
> 
> Yes, we definitely want stats from the startup process on the standby.
> Elsewhere on the internet where you originally raised this, I mentioned
> that I hacked a pgstat_flush_io() into the redo apply loop in
> PerformWalRecovery() but that I wasn't sure that this was affordable.
> Andres Freund replied saying that it would be too expensive and
> suggested that we set up a regular timeout which sets a flag that's
> checked by HandleStartupProcInterrupts().

It's tempting to try to reuse the STARTUP_PROGRESS_TIMEOUT timer. But it's
controlled by a GUC, so it's not really suitable.


> I'm wondering if this is something we consider a bug and thus would be
> under consideration for 16.

I'm mildly inclined to not consider it a bug, given that this looks to have
been true for other stats for quite a while? But it does still seem worth
improving upon - I'd make the consideration when to apply the relevant patches
depend on the complexity. I'm worried we'd need to introduce sufficiently new
infrastructure that 16 doesn't seem like a good idea. Let's come up with a
patch and judge it after?


> > Also, the pg_stat_io view includes a row with backend_type=startup and
> > context=vacuum, but it seems that the startup process doesn't perform
> > any I/O operations with BAS_VACUUM. If this understanding is right,
> > shouldn't we omit this row from the view? Additionally, I noticed that
> > the view also includes a row with backend_type=startup and
> > context=bulkread / bulkwrite. Do these operations actually occur
> > during startup process?
> 
> Hmm. Yes, I remember posing this question on the thread and not getting
> an answer. I read some code and did some testing and can't see a way we
> would end up with the startup process doing IO in a non-normal context.
> 
> Certainly I can't see how startup process would ever use a BAS_VACUUM
> context given that it executes heap_xlog_vacuum().
> 
> I thought at some point I had encountered an assertion failure when I
> banned the startup process from tracking io operations in bulkread and
> bulkwrite contexts. But, I'm not seeing how that could happen.

It's possible that we decided to not apply such restrictions because the
startup process can be made to execute more code via the extensible
rmgrs.

Greetings,

Andres Freund




Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call

2023-04-25 Thread Christoph Berg
Re: Andres Freund
> A prototype of that approach is attached. I pushed the retry handling into the
> pg_* routines where applicable.  I guess we could add pg_* routines for
> FileFallocate(), FilePrewarm() etc as well, but I didn't do that here.
> 
> Christoph, could you verify this fixes your issue?

Everything green with the patch applied. Thanks!

https://pgdgbuild.dus.dg-i.net/job/postgresql-16-binaries-snapshot/839/

Christoph




Re: pg_stat_io for the startup process

2023-04-25 Thread Melanie Plageman
On Tue, Apr 25, 2023 at 10:51:14PM +0900, Fujii Masao wrote:
> Hi,
> 
> Regarding pg_stat_io for the startup process, I noticed that the counters
> are only incremented after the startup process exits, not during WAL replay
> in standby mode. This is because pgstat_flush_io() is only called when
> the startup process exits. Shouldn't it be called during WAL replay as well
> to report IO statistics by the startup process even in standby mode?

Yes, we definitely want stats from the startup process on the standby.
Elsewhere on the internet where you originally raised this, I mentioned
that I hacked a pgstat_flush_io() into the redo apply loop in
PerformWalRecovery() but that I wasn't sure that this was affordable.
Andres Freund replied saying that it would be too expensive and
suggested that we set up a regular timeout which sets a flag that's
checked by HandleStartupProcInterrupts().
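
To make that suggested shape concrete, a rough sketch (the flag, handler, and
call-site names are assumptions for illustration, not committed code):

#include <signal.h>
#include <stdbool.h>

/* Sketch only: the repeating timeout handler just sets a flag. */
static volatile sig_atomic_t pgstat_io_flush_requested = false;

static void
StartupIoFlushTimeoutHandler(void)
{
	pgstat_io_flush_requested = true;
}

/*
 * Sketch only: called from the startup process's interrupt handling (e.g.
 * from HandleStartupProcInterrupts()), so the flush happens at a safe point
 * rather than once per replayed record.
 */
static void
MaybeFlushStartupIoStats(void)
{
	if (pgstat_io_flush_requested)
	{
		pgstat_io_flush_requested = false;
		pgstat_flush_io(false);	/* assumed signature: pgstat_flush_io(bool nowait) */
	}
}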

I'm wondering if this is something we consider a bug and thus would be
under consideration for 16.
 
> Also, the pg_stat_io view includes a row with backend_type=startup and
> context=vacuum, but it seems that the startup process doesn't perform
> any I/O operations with BAS_VACUUM. If this understanding is right,
> shouldn't we omit this row from the view? Additionally, I noticed that
> the view also includes a row with backend_type=startup and
> context=bulkread / bulkwrite. Do these operations actually occur
> during startup process?

Hmm. Yes, I remember posing this question on the thread and not getting
an answer. I read some code and did some testing and can't see a way we
would end up with the startup process doing IO in a non-normal context.

Certainly I can't see how startup process would ever use a BAS_VACUUM
context given that it executes heap_xlog_vacuum().

I thought at some point I had encountered an assertion failure when I
banned the startup process from tracking io operations in bulkread and
bulkwrite contexts. But, I'm not seeing how that could happen.

- Melanie




Re: base backup vs. concurrent truncation

2023-04-25 Thread Andres Freund
Hi,

On 2023-04-25 11:42:43 -0400, Robert Haas wrote:
> On Mon, Apr 24, 2023 at 8:03 PM Andres Freund  wrote:
> > What we've discussed somewhere in the past is to always truncate N+1 when
> > creating the first page in N. I.e. if we extend into 23456.1, we truncate
> > 23456.2 to 0 blocks.  As far as I can tell, that'd solve this issue?
> 
> Yeah, although leaving 23456.2 forever unless and until that happens
> doesn't sound amazing.

It isn't - but the alternatives aren't great either. It's not that easy to hit
this scenario, so I think something along these lines is more palatable than
adding a pass through the entire data directory.
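
A self-contained, POSIX-level sketch of the idea quoted above, i.e. truncate a
possibly stale segment N+1 when extending into segment N (the path construction
and error handling are simplified for illustration and are not md.c's actual
code):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Sketch only: when a relation first extends into segment "segno",
 * proactively truncate a leftover "segno + 1" file from a previous life
 * of the same relfilenode, if one exists.
 */
static void
truncate_next_segment(const char *relpath, unsigned segno)
{
	char		path[1024];
	int			fd;

	snprintf(path, sizeof(path), "%s.%u", relpath, segno + 1);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return;					/* no stale file, or nothing we can do */

	if (ftruncate(fd, 0) != 0)
		perror("ftruncate");	/* real code would report this properly */

	close(fd);
}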

I think eventually we'll have to make the WAL logging bulletproof enough
against this to avoid the risk of it. I think that is possible.

I suspect we would need to prevent checkpoints from happening in the wrong
moment, if we were to go down that route.

I guess that eventually we'll need to polish the infrastructure for
determining restartpoints so that delayChkptFlags doesn't actually prevent
checkpoints, just moves the restartpoint to a safe LSN.  Although I guess
that truncations aren't frequent enough (compared to transaction commits)
for that to be required "now".

Greetings,

Andres Freund




Re: Issue in postgres_fdw causing unnecessary wait for cancel request reply

2023-04-25 Thread Fujii Masao




On 2023/04/21 16:39, Jelte Fennema wrote:

In my opinion, PQconnectPoll and PQgetCancel should use the same
parsing function, or PQconnectPoll should set parsed values, making it
unnecessary for PQgetCancel to parse the same parameter again.


Yes, I totally agree. So I think patch 0002 looks fine.


It seems like we have reached a consensus to push the 0002 patch.
As for back-patching, although the issue it fixes is trivial,
it may be a good idea to back-patch to v12 where parse_int_param()
was added, for easier back-patching in the future. Therefore
I'm thinking of pushing the 0002 patch first and back-patching it to v12.



Additionally, PQgetCancel should set appropriate error messages
for all failure modes.


I don't think that PQgetCancel should ever set error messages on the
provided conn object though. It's not part of the documented API and
it's quite confusing since there's actually no error on the connection
itself. That this happens for the keepalive parameter was an
unintended side effect of 5987feb70b combined with the fact that the
parsing is different. All those parsing functions should never error,
because setting up the connection should already have checked them.

So I think the newly added libpq_append_conn_error calls in patch 0001
should be removed. The AF_UNIX check and the new WARNING in pg_fdw
seem fine though.


Sounds reasonable to me.

Regarding the WARNING message, another idea is to pass the return value
of PQgetCancel() directly to PQcancel() as follows. If NULL is passed,
PQcancel() will detect it and set the proper error message in errbuf.
Then the warning message "WARNING:  could not send cancel request:
PQcancel() -- no cancel object supplied" is output. This approach is
similar to what dblink_cancel_query() does. Thoughts?


cancel = PQgetCancel(conn);
if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
{
	ereport(WARNING,
			(errcode(ERRCODE_CONNECTION_FAILURE),
			 errmsg("could not send cancel request: %s",
					errbuf)));
	PQfreeCancel(cancel);
	return false;
}


Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Switching XLog source from archive to streaming when primary available

2023-04-25 Thread Bharath Rupireddy
On Fri, Feb 24, 2023 at 10:26 AM Bharath Rupireddy
 wrote:
>
> On Wed, Nov 16, 2022 at 11:39 AM Bharath Rupireddy
>  wrote:
> >
> > I'm attaching the v10 patch for further review.
>
> Needed a rebase. I'm attaching the v11 patch for further review.

Needed a rebase, so attaching the v12 patch. I word-smithed comments
and docs a bit.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 80a982fb1c0d0737c5144d8ef50bb5e3ff845d07 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 25 Apr 2023 08:36:48 +
Subject: [PATCH v12] Allow standby to switch WAL source from archive to
 streaming replication

A standby typically switches to streaming replication (get WAL
from primary), only when receive from WAL archive finishes (no
more WAL left there) or fails for any reason. Reading WAL from
archive may not always be as efficient and fast as reading from
primary. This can be due to the differences in disk types, IO
costs, network latencies etc., all impacting recovery
performance on standby. And, while standby is reading WAL from
archive, primary accumulates WAL because the standby's replication
slot stays inactive. To avoid these problems, one can use this
parameter to make standby switch to stream mode sooner.

This feature adds a new GUC that specifies amount of time after
which standby attempts to switch WAL source from WAL archive to
streaming replication (getting WAL from primary). However, standby
exhausts all the WAL present in pg_wal before switching. If
standby fails to switch to stream mode, it falls back to archive
mode.

Reported-by: SATYANARAYANA NARLAPURAM
Author: Bharath Rupireddy
Reviewed-by: Cary Huang, Nathan Bossart
Reviewed-by: Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/CAHg+QDdLmfpS0n0U3U+e+dw7X7jjEOsJJ0aLEsrtxs-tUyf5Ag@mail.gmail.com
---
 doc/src/sgml/config.sgml  |  45 ++
 doc/src/sgml/high-availability.sgml   |  15 +-
 src/backend/access/transam/xlogrecovery.c | 144 --
 src/backend/utils/misc/guc_tables.c   |  12 ++
 src/backend/utils/misc/postgresql.conf.sample |   4 +
 src/include/access/xlogrecovery.h |   1 +
 src/test/recovery/meson.build |   1 +
 src/test/recovery/t/037_wal_source_switch.pl  | 126 +++
 8 files changed, 328 insertions(+), 20 deletions(-)
 create mode 100644 src/test/recovery/t/037_wal_source_switch.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b56f073a91..8050f981e9 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4923,6 +4923,51 @@ ANY num_sync ( 
+  streaming_replication_retry_interval (integer)
+  
+   streaming_replication_retry_interval configuration parameter
+  
+  
+  
+   
+Specifies amount of time after which standby attempts to switch WAL
+source from WAL archive to streaming replication (getting WAL from
+primary). However, standby exhausts all the WAL present in pg_wal
+before switching. If standby fails to switch to stream mode, it falls
+back to archive mode. If this parameter's value is specified without
+units, it is taken as milliseconds. Default is five minutes
+(5min). With a lower value for this parameter,
+standby makes frequent WAL source switch attempts. To avoid this, it is
+recommended to set a reasonable value. A setting of 0
+disables the feature. When disabled, standby typically switches to
+stream mode only when receive from WAL archive finishes (no more WAL
+left there) or fails for any reason. This parameter can only be set in
+the postgresql.conf file or on the server command
+line.
+   
+   
+
+ Standby may not always attempt to switch source from WAL archive to
+ streaming replication at exact streaming_replication_retry_interval
+ intervals. For example, if the parameter is set to 1min
+ and fetching WAL file from archive takes about 2min,
+ then the source switch attempt happens for the next WAL file after
+ current WAL file fetched from archive is fully applied.
+
+   
+   
+Reading WAL from archive may not always be as efficient and fast as
+reading from primary. This can be due to the differences in disk types,
+IO costs, network latencies etc., all impacting recovery performance on
+standby. And, while standby is reading WAL from archive, primary
+accumulates WAL because the standby's replication slot stays inactive.
+To avoid these problems, one can use this parameter to make standby
+switch to stream mode sooner.
+   
+  
+ 
+
  
   recovery_min_apply_delay (integer)
   
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 

Re: Orphaned wait event

2023-04-25 Thread Bharath Rupireddy
On Fri, Mar 24, 2023 at 12:00 PM Bharath Rupireddy
 wrote:
>
> On Fri, Mar 24, 2023 at 3:31 AM Thomas Munro  wrote:
> >
> > On Thu, Mar 23, 2023 at 8:10 PM Bharath Rupireddy
> >  wrote:
> > > Yeah, commit [1] removed the last trace of it. I wonder if we can add
> > > a WAIT_EVENT_SLRU_FLUSH_SYNC wait event in SlruSyncFileTag(), similar
> > > to mdsyncfiletag. This way, we would have covered all sync_syncfiletag
> > > fsyncs with wait events.
> >
> > Ahh, right.  Thanks.  The mistake was indeed that SlruSyncFileTag
> > failed to report it while running pg_fsync().
>
> Thanks. The attached patch looks good to me.

It looks like this patch attached upthread at [1] isn't in yet,
meaning WAIT_EVENT_SLRU_FLUSH_SYNC stays unused. IMO, it's worth
pushing it to the PG16 branch. It will help add a wait event for SLRU
page flushes. Thoughts?

[1] 
https://www.postgresql.org/message-id/CA%2BhUKG%2BewEpxm%3DhPNXyupRUB_SKGh-6tO86viaco0g-P_pm_Cw%40mail.gmail.com
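
For context, the fix under discussion essentially wraps the flush in
SlruSyncFileTag() with wait-event reporting, roughly along these lines
(abridged sketch of the shape, not the exact hunk):

	/* in SlruSyncFileTag(), around the existing pg_fsync() call */
	pgstat_report_wait_start(WAIT_EVENT_SLRU_FLUSH_SYNC);
	result = pg_fsync(fd);
	pgstat_report_wait_end();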

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: base backup vs. concurrent truncation

2023-04-25 Thread Robert Haas
On Mon, Apr 24, 2023 at 8:03 PM Andres Freund  wrote:
> What we've discussed somewhere in the past is to always truncate N+1 when
> creating the first page in N. I.e. if we extend into 23456.1, we truncate
> 23456.2 to 0 blocks.  As far as I can tell, that'd solve this issue?

Yeah, although leaving 23456.2 forever unless and until that happens
doesn't sound amazing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: enhancing plpgsql debug api - hooks on statements errors and function errors

2023-04-25 Thread Pavel Stehule
Hi


On Tue, 25 Apr 2023 at 10:27, Pavel Stehule  wrote:

> Hi
>
> When I implemented the profiler and coverage check in plpgsql_check, I had
> to write a lot of hard-to-maintain code related to correctly finishing some
> operations (counter incrementing) usually executed by the stmt_end and
> func_end hooks. It is based on the fmgr hook and its own statement call
> stack. It would be nice if I could throw this code away and use some nice
> built-in API.
>
> Can we enhance the debug API with two hooks, stmt_end_err and func_end_err?
>
> These hooks can be called from exception handlers before re-raising.
>
> Or we can define new hooks like executor hooks - stmt_exec and func_exec.
> In custom hooks the exception can be caught.
>
> What do you think about this proposal?
>
>
I did a quick and ugly benchmark on the worst case:

CREATE OR REPLACE FUNCTION public.speedtest(i integer)
 RETURNS void
 LANGUAGE plpgsql
 IMMUTABLE
AS $function$
declare c int = 0;
begin
  while c < i
  loop
c := c + 1;
  end loop;
  raise notice '%', c;
end;
$function$

and it is possible to write some code (see the ugly patch) without any
performance impact if the hooks are not used. When the hooks are active,
there is a 7% performance loss. That is not nice, but this is the worst
case; the impact on real code should be significantly lower.

Regards

Pavel
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 4b76f7699a..dc21abd54d 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -1974,157 +1974,180 @@ exec_stmt_block(PLpgSQL_execstate *estate, PLpgSQL_stmt_block *block)
 }
 
 
-/* --
- * exec_stmts			Iterate over a list of statements
- *as long as their return code is OK
- * --
- */
 static int
-exec_stmts(PLpgSQL_execstate *estate, List *stmts)
+exec_stmt(PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt)
 {
-	PLpgSQL_stmt *save_estmt = estate->err_stmt;
-	ListCell   *s;
+	int		rc;
 
-	if (stmts == NIL)
+	switch (stmt->cmd_type)
 	{
-		/*
-		 * Ensure we do a CHECK_FOR_INTERRUPTS() even though there is no
-		 * statement.  This prevents hangup in a tight loop if, for instance,
-		 * there is a LOOP construct with an empty body.
-		 */
-		CHECK_FOR_INTERRUPTS();
-		return PLPGSQL_RC_OK;
-	}
+		case PLPGSQL_STMT_BLOCK:
+			rc = exec_stmt_block(estate, (PLpgSQL_stmt_block *) stmt);
+			break;
 
-	foreach(s, stmts)
-	{
-		PLpgSQL_stmt *stmt = (PLpgSQL_stmt *) lfirst(s);
-		int			rc;
+		case PLPGSQL_STMT_ASSIGN:
+			rc = exec_stmt_assign(estate, (PLpgSQL_stmt_assign *) stmt);
+			break;
 
-		estate->err_stmt = stmt;
+		case PLPGSQL_STMT_PERFORM:
+			rc = exec_stmt_perform(estate, (PLpgSQL_stmt_perform *) stmt);
+			break;
 
-		/* Let the plugin know that we are about to execute this statement */
-		if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_beg)
-			((*plpgsql_plugin_ptr)->stmt_beg) (estate, stmt);
+		case PLPGSQL_STMT_CALL:
+			rc = exec_stmt_call(estate, (PLpgSQL_stmt_call *) stmt);
+			break;
 
-		CHECK_FOR_INTERRUPTS();
+		case PLPGSQL_STMT_GETDIAG:
+			rc = exec_stmt_getdiag(estate, (PLpgSQL_stmt_getdiag *) stmt);
+			break;
 
-		switch (stmt->cmd_type)
-		{
-			case PLPGSQL_STMT_BLOCK:
-rc = exec_stmt_block(estate, (PLpgSQL_stmt_block *) stmt);
-break;
+		case PLPGSQL_STMT_IF:
+			rc = exec_stmt_if(estate, (PLpgSQL_stmt_if *) stmt);
+			break;
 
-			case PLPGSQL_STMT_ASSIGN:
-rc = exec_stmt_assign(estate, (PLpgSQL_stmt_assign *) stmt);
-break;
+		case PLPGSQL_STMT_CASE:
+			rc = exec_stmt_case(estate, (PLpgSQL_stmt_case *) stmt);
+			break;
 
-			case PLPGSQL_STMT_PERFORM:
-rc = exec_stmt_perform(estate, (PLpgSQL_stmt_perform *) stmt);
-break;
+		case PLPGSQL_STMT_LOOP:
+			rc = exec_stmt_loop(estate, (PLpgSQL_stmt_loop *) stmt);
+			break;
 
-			case PLPGSQL_STMT_CALL:
-rc = exec_stmt_call(estate, (PLpgSQL_stmt_call *) stmt);
-break;
+		case PLPGSQL_STMT_WHILE:
+			rc = exec_stmt_while(estate, (PLpgSQL_stmt_while *) stmt);
+			break;
 
-			case PLPGSQL_STMT_GETDIAG:
-rc = exec_stmt_getdiag(estate, (PLpgSQL_stmt_getdiag *) stmt);
-break;
+		case PLPGSQL_STMT_FORI:
+			rc = exec_stmt_fori(estate, (PLpgSQL_stmt_fori *) stmt);
+			break;
 
-			case PLPGSQL_STMT_IF:
-rc = exec_stmt_if(estate, (PLpgSQL_stmt_if *) stmt);
-break;
+		case PLPGSQL_STMT_FORS:
+			rc = exec_stmt_fors(estate, (PLpgSQL_stmt_fors *) stmt);
+			break;
 
-			case PLPGSQL_STMT_CASE:
-rc = exec_stmt_case(estate, (PLpgSQL_stmt_case *) stmt);
-break;
+		case PLPGSQL_STMT_FORC:
+			rc = exec_stmt_forc(estate, (PLpgSQL_stmt_forc *) stmt);
+			break;
 
-			case PLPGSQL_STMT_LOOP:
-rc = exec_stmt_loop(estate, (PLpgSQL_stmt_loop *) stmt);
-break;
+		case PLPGSQL_STMT_FOREACH_A:
+			rc = exec_stmt_foreach_a(estate, (PLpgSQL_stmt_foreach_a *) stmt);
+			break;
 
-			case PLPGSQL_STMT_WHILE:
-rc = exec_stmt_while(estate, (PLpgSQL_stmt_while *) stmt);
-break;
+		case PLPGSQL_STMT_EXIT:
+			rc = exec_stmt_exit(estate, (PLpgSQL_stmt_exit *) stmt);
+			break;
 
-			

Re: Order changes in PG16 since ICU introduction

2023-04-25 Thread Tom Lane
"Daniel Verite"  writes:
> FTR the full text search parser still uses the libc functions
> is[w]space/alpha/digit...  that depend on lc_ctype, whether the db
> collation provider is ICU or not.

Yeah, those aren't even connected up to the collation-selection
mechanisms; lots of work to do there.  I wonder if they could be
made to use regc_pg_locale.c instead of duplicating logic.

regards, tom lane




Re: Should vacuum process config file reload more often

2023-04-25 Thread Masahiko Sawada
On Tue, Apr 25, 2023 at 10:35 PM Daniel Gustafsson  wrote:
>
> > On 25 Apr 2023, at 15:31, Masahiko Sawada  wrote:
> >
> > On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson  wrote:
> >>
> >>> On 17 Apr 2023, at 04:04, Masahiko Sawada  wrote:
> >>
> >>> I've attached an updated patch for fixing at_dobalance condition.
> >>
> >> I revisited this and pushed it to all supported branches after another 
> >> round of
> >> testing and reading.
> >
> > Thanks!
> >
> > Can we mark the open item "Can't disable autovacuum cost delay through
> > storage parameter" as resolved?
>
> Yes, I've gone ahead and done that now.

Great, thank you!

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Order changes in PG16 since ICU introduction

2023-04-25 Thread Daniel Verite
Jeff Davis wrote:

> > (I'm not sure whether those operations can get redirected to ICU
> > today
> > or whether they still always go to libc, but we'll surely want to fix
> > it eventually if the latter is still true.)
> 
> Those operations do get redirected to ICU today. 

FTR the full text search parser still uses the libc functions
is[w]space/alpha/digit...  that depend on lc_ctype, whether the db
collation provider is ICU or not.


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Request for comment on setting binary format output per session

2023-04-25 Thread Dave Cramer
On Tue, 25 Apr 2023 at 07:26, Dave Cramer  wrote:

>
>
>
> On Mon, 24 Apr 2023 at 19:18, Merlin Moncure  wrote:
>
>>
>>
>> On Thu, Apr 20, 2023 at 2:52 PM Dave Cramer  wrote:
>>
>>>
>>> As promised here is a patch with defines for all of the protocol
>>> messages.
>>>
>> I created a protocol.h file and put it in src/includes
>>> I'm fairly sure that some of the names I used may need to be changed but
>>> the grunt work of finding and replacing everything is done.
>>>
>>
>> In many cases, converting inline character to macro eliminates the need
>> for inline comment, e.g.:
>> + case SIMPLE_QUERY: /* simple query */
>>
>> ...that's more work obviously, do you agree and if so would you like some
>> help going through that?
>>
>
> I certainly agree. I left them there mostly for reviewers. I expected some
> minor adjustments to the names of the macros.
>
> So if you have suggestions I'll make changes.
>
> I'll remove the comments if they are no longer necessary.
>

Patch attached with comments removed


>
> Dave
>
>>


0001-Created-protocol.h.patch
Description: Binary data


Re: Add PQsendSyncMessage() to libpq

2023-04-25 Thread Denis Laxalde

Anton Kirilov wrote:

I would appreciate your thoughts on my proposal.


This sounds like a useful addition to me. I've played a bit with it in 
Psycopg and it works fine.



diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index a16bbf32ef..e2b32c1379 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -82,6 +82,7 @@ static intPQsendDescribe(PGconn *conn, char desc_type,
 static int check_field_number(const PGresult *res, int field_num);
 static void pqPipelineProcessQueue(PGconn *conn);
 static int pqPipelineFlush(PGconn *conn);
+static int send_sync_message(PGconn *conn, int flush);

Could (should?) be:
static int send_sync_message(PGconn *conn, bool flush);


diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c 
b/src/test/modules/libpq_pipeline/libpq_pipeline.c

index f48da7d963..829907957a 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -244,6 +244,104 @@ test_multi_pipelines(PGconn *conn)
fprintf(stderr, "ok\n");
 }

+static void
+test_multi_pipelines_noflush(PGconn *conn)
+{

Maybe test_multi_pipelines() could be extended with an additional 
PQsendQueryParams()+PQsendSyncMessage() step instead of adding this 
extra test case?





pg_stat_io for the startup process

2023-04-25 Thread Fujii Masao

Hi,

Regarding pg_stat_io for the startup process, I noticed that the counters
are only incremented after the startup process exits, not during WAL replay
in standby mode. This is because pgstat_flush_io() is only called when
the startup process exits. Shouldn't it be called during WAL replay as well
to report IO statistics by the startup process even in standby mode?

Also, the pg_stat_io view includes a row with backend_type=startup and
context=vacuum, but it seems that the startup process doesn't perform
any I/O operations with BAS_VACUUM. If this understanding is right,
shouldn't we omit this row from the view? Additionally, I noticed that
the view also includes a row with backend_type=startup and
context=bulkread / bulkwrite. Do these operations actually occur
during startup process?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Should vacuum process config file reload more often

2023-04-25 Thread Daniel Gustafsson
> On 25 Apr 2023, at 15:31, Masahiko Sawada  wrote:
> 
> On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson  wrote:
>> 
>>> On 17 Apr 2023, at 04:04, Masahiko Sawada  wrote:
>> 
>>> I've attached an updated patch for fixing at_dobalance condition.
>> 
>> I revisited this and pushed it to all supported branches after another round 
>> of
>> testing and reading.
> 
> Thanks!
> 
> Can we mark the open item "Can't disable autovacuum cost delay through
> storage parameter" as resolved?

Yes, I've gone ahead and done that now.

--
Daniel Gustafsson





Re: Should vacuum process config file reload more often

2023-04-25 Thread Masahiko Sawada
On Tue, Apr 25, 2023 at 9:39 PM Daniel Gustafsson  wrote:
>
> > On 17 Apr 2023, at 04:04, Masahiko Sawada  wrote:
>
> > I've attached an updated patch for fixing at_dobalance condition.
>
> I revisited this and pushed it to all supported branches after another round 
> of
> testing and reading.

Thanks!

Can we mark the open item "Can't disable autovacuum cost delay through
storage parameter" as resolved?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-04-25 Thread Hayato Kuroda (Fujitsu)
Dear Hackers,

> Thank you for giving comments! PSA new version.

Note that the current version does not work well on FreeBSD, maybe
because of a timing issue[1]. I'm now analyzing the reason and will post
a fixed version.

[1]: https://cirrus-ci.com/build/4676441267240960

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: Should vacuum process config file reload more often

2023-04-25 Thread Daniel Gustafsson
> On 17 Apr 2023, at 04:04, Masahiko Sawada  wrote:

> I've attached an updated patch for fixing at_dobalance condition.

I revisited this and pushed it to all supported branches after another round of
testing and reading.

--
Daniel Gustafsson





Re: Request for comment on setting binary format output per session

2023-04-25 Thread Dave Cramer
On Mon, 24 Apr 2023 at 19:18, Merlin Moncure  wrote:

>
>
> On Thu, Apr 20, 2023 at 2:52 PM Dave Cramer  wrote:
>
>>
>> As promised here is a patch with defines for all of the protocol messages.
>>
> I created a protocol.h file and put it in src/includes
>> I'm fairly sure that some of the names I used may need to be changed but
>> the grunt work of finding and replacing everything is done.
>>
>
> In many cases, converting inline character to macro eliminates the need
> for inline comment, e.g.:
> + case SIMPLE_QUERY: /* simple query */
>
> ...that's more work obviously, do you agree and if so would you like some
> help going through that?
>

I certainly agree. I left them there mostly for reviewers. I expected some
minor adjustments to the names of the macros.

So if you have suggestions I'll make changes.

I'll remove the comments if they are no longer necessary.
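
As an illustration of the point about the comments, once the message byte has
a descriptive name the inline comment adds nothing; the macro name below is
made up for the example and not necessarily what the patch uses:

	/* hypothetical name from the proposed protocol.h */
	#define PqMsg_Query		'Q'

	/* before: the raw byte needs an explanatory comment */
	case 'Q':				/* simple query */

	/* after: the name documents itself, so the comment can be dropped */
	case PqMsg_Query: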

Dave

>


Re: ssl tests aren't concurrency safe due to get_free_port()

2023-04-25 Thread Peter Eisentraut

On 20.11.22 16:10, Andrew Dunstan wrote:


On 2022-11-19 Sa 15:16, Andres Freund wrote:

Hi,

On 2022-11-19 10:56:33 -0500, Andrew Dunstan wrote:

Perhaps we should just export a directory in configure instead of this
guessing game?

I think the obvious candidate would be to export top_builddir from
src/Makefile.global. That would remove the need to infer it from
TESTDATADIR.

I think that'd be good. I'd perhaps rename it in the process so it's
exported uppercase, but whatever...



OK, pushed with a little more tweaking. I didn't upcase top_builddir
because the existing prove_installcheck recipes already export it and I
wanted to stay consistent with those.

If it works ok I will backpatch in couple of days.


These patches have affected pgxs-using extensions that have their own 
TAP tests.


The portlock directory is created at

my $build_dir = $ENV{top_builddir}
  || $PostgreSQL::Test::Utils::tmp_check ;
$portdir ||= "$build_dir/portlock";

but for a pgxs user, top_builddir points into the installation tree, 
specifically at $prefix/lib/pgxs/.


So when running "make installcheck" for an extension, we either won't 
have write access to that directory, or if we do, then it's still not 
good to write into the installation tree during a test suite.


A possible fix is

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 5dacc4d838..c493d1a60c 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -464,7 +464,7 @@ rm -rf '$(CURDIR)'/tmp_check && \
 $(MKDIR_P) '$(CURDIR)'/tmp_check && \
 cd $(srcdir) && \
TESTDIR='$(CURDIR)' PATH="$(bindir):$(CURDIR):$$PATH" \
-   PGPORT='6$(DEF_PGPORT)' top_builddir='$(top_builddir)' \
+   PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)' \
PG_REGRESS='$(top_builddir)/src/test/regress/pg_regress' \
$(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if 
$(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)

 endef





Re: postgres_fdw: wrong results with self join + enable_nestloop off

2023-04-25 Thread Richard Guo
On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita 
wrote:

> I think that the root cause for this issue would be in the
> create_scan_plan handling of pseudoconstant quals when creating a
> foreign-join (or custom-join) plan.


Yes exactly.  In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node.  Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses.  But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch.  IIUC it treats the pseudoconstant clauses
as local conditions.  While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node.  So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.
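
To sketch that direction (the field and variable names here are assumptions
for illustration, not necessarily what the attached patch uses),
create_scan_plan() would also look through the restriction clauses carried by
the foreign-join path and pull the pseudoconstant ones out for the gating
Result node:

	/* Sketch only: gather pseudoconstant clauses stored with the foreign
	 * join path so they become one-time quals on a gating Result node,
	 * as for ordinary join paths. */
	List	   *gating_clauses = NIL;
	ListCell   *lc;

	foreach(lc, foreign_join_restrictinfo)		/* hypothetical field */
	{
		RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);

		if (rinfo->pseudoconstant)
			gating_clauses = lappend(gating_clauses, rinfo->clause);
	}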

BTW, while going through the code I noticed one place in
add_foreign_final_paths that uses NULL for a List *.  I changed it to NIL.

Thanks
Richard


v1-0001-Fix-how-we-handle-pseudoconstant-clauses-for-scans-of-foreign-joins.patch
Description: Binary data


Re: Missing update of all_hasnulls in BRIN opclasses

2023-04-25 Thread Alvaro Herrera
On 2023-Apr-24, Tomas Vondra wrote:

> On 4/24/23 17:36, Alvaro Herrera wrote:

> > (As for your FIXME comment in brin_memtuple_initialize, I think you're
> > correct: we definitely need to reset bt_placeholder.  Otherwise, we risk
> > places that call it in a loop using a BrinMemTuple with one range with
> > the flag set, in a range where that doesn't hold.  It might be
> > impossible for this to happen, given how narrow the conditions are on
> > which bt_placeholder is used; but it seems safer to reset it anyway.)
> 
> Yeah. But isn't that a separate preexisting issue, strictly speaking?

Yes.

> > I did a quick experiment of turning the patches over -- applying test
> > first, code fix after (requires some conflict fixing).  With that I
> > verified that the behavior of 'hasnulls' indeed changes with the code
> > fix.
> 
> Thanks. Could you do some testing of the union_tuples stuff too? It's a
> bit tricky part - the behavior is timing sensitive, so testing it
> requires gdb breakpoints breakpoints or something like that. I think
> it's correct, but it'd be nice to check that.

I'll have a look.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"




Re: Allow pg_archivecleanup to remove backup history files

2023-04-25 Thread Kyotaro Horiguchi
At Tue, 25 Apr 2023 16:38:16 +0900, torikoshia  
wrote in 
> Hi,
> 
> Currently pg_archivecleanup doesn't remove backup history files even
> when they're older than oldestkeptwalfile.
> 
> Of course the size of backup history files is smaller than that of WAL
> files and they wouldn't consume much disk space, but a lot of backup
> history files (e.g. from daily backups over years) whose backup data has
> already been removed are unnecessary, and I would appreciate it if
> pg_archivecleanup had an option to remove them.
> 
> Attached a PoC patch, which added new option -b to remove files
> including backup history files older than oldestkeptwalfile.
> 
>   $ ls archivedir
>   00010001 00010003
>   00010006
>   00010008
>   00010002 00010004
>   00010007
>   00010009
>   00010002.0028.backup  00010005
>   00010007.0028.backup
>   0001000A.partial
> 
>   $ pg_archivecleanup -b archivedir 00010009
> 
>   $ ls archivedir
>   00010009  0001000A.partial
> 
> Any thoughts?

I thought that we had decided not to do that, but I couldn't find any
discussion about it in the ML archive.  Anyway, I think it is great
to have that option.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




enhancing plpgsql debug api - hooks on statements errors and function errors

2023-04-25 Thread Pavel Stehule
Hi

When I implemented the profiler and coverage check in plpgsql_check, I had
to write a lot of hard-to-maintain code related to correctly finishing some
operations (counter incrementing) usually executed by the stmt_end and
func_end hooks. It is based on the fmgr hook and its own statement call
stack. It would be nice if I could throw this code away and use some nice
built-in API.

Can we enhance the debug API with two hooks, stmt_end_err and func_end_err?

These hooks can be called from exception handlers before re-raising.

Or we can define new hooks like executor hooks - stmt_exec and func_exec.
In custom hooks the exception can be caught.

What do you think about this proposal?
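
To make the first variant concrete, this is roughly how the existing
PLpgSQL_plugin callback struct could grow the proposed error hooks (abridged;
the real struct has more members, and the *_err members are only a proposal):

typedef struct PLpgSQL_plugin
{
	/* existing callbacks (abridged) */
	void		(*func_beg) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
	void		(*func_end) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
	void		(*stmt_beg) (PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);
	void		(*stmt_end) (PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);

	/* proposed additions: called from the exception handler before re-raising */
	void		(*stmt_end_err) (PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);
	void		(*func_end_err) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
} PLpgSQL_plugin;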

regards

Pavel


Re: In-placre persistance change of a relation

2023-04-25 Thread Kyotaro Horiguchi
At Fri, 17 Mar 2023 15:16:34 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> Mmm. It took longer than I said, but this is the patch set that
> includes all three parts.
> 
> 1. "Mark files" to prevent orphan storage files for in-transaction
>   created relations after a crash.
> 
> 2. In-place persistence change: For ALTER TABLE SET LOGGED/UNLOGGED
>   with wal_level minimal, and ALTER TABLE SET UNLOGGED with other
>   wal_levels, the commands don't require a file copy for the relation
>   storage. ALTER TABLE SET LOGGED with non-minimal wal_level emits
>   bulk FPIs instead of a bunch of individual INSERTs.
> 
> 3. An extension to ALTER TABLE SET (UN)LOGGED that can handle all
>   tables in a tablespace at once.
> 
> 
> As a side note, I quickly go over the behavior of the mark files
> introduced by the first patch, particularly what happens when deletion
> fails.
> 
> (1) The mark file for MAIN fork (".u") corresponds to all forks,
> while the mark file for INIT fork ("_init.u") corresponds to
> INIT fork alone.
> 
> (2) The mark file is created just before the corresponding storage
> file is made. This is always logged in the WAL.
> 
> (3) The mark file is deleted after removing the corresponding storage
> file during the commit and rollback. This action is logged in the
> WAL, too. If the deletion fails, an ERROR is output and the
> transaction aborts.
> 
> (4) If a crash leaves a mark file behind, the server will try to delete it
> after successfully removing the corresponding storage file during
> the subsequent startup that runs recovery. If deletion fails, the
> server leaves the mark file alone and emits a WARNING. (The
> same behavior applies to non-mark files.)
> 
> (5) If the deletion of the mark file fails, the leftover mark file
> prevents the creation of the corresponding storage file (causing
> an ERROR).  The leftover mark files don't result in the removal of
> the wrong files due to that behavior.
> 
> (6) The mark file for an INIT fork is created only when ALTER TABLE
> SET UNLOGGED is executed (not for CREATE UNLOGGED TABLE) to signal
> the crash-cleanup code to remove the INIT fork. (Otherwise the
> cleanup code removes the main fork instead. This is the main
> objective of introducing the mark files.)

Rebased.

I fixed some code comments and commit messages, and fixed the wrong
arrangement of some changes among the patches.  Most importantly, I fixed
a bug based on the wrong assumption that the init fork does not reside in
shared buffers. Now smgrDoPendingCleanups drops the buffers for an init
fork that is to be removed.

The new fourth patch is a temporary fix for recently added code, which
will soon no longer be needed.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From e645e4782c4a1562aa932f87f3932b4e54beac11 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 2 Mar 2023 17:25:12 +0900
Subject: [PATCH v28 1/4] Introduce storage mark files

In specific scenarios, certain operations followed by a crash-restart
may generate orphaned storage files that cannot be removed through
standard procedures or cause the server to fail during restart. This
commit introduces 'mark files' to convey information about the storage
file. In particular, an "UNCOMMITTED" mark file is implemented to
identify uncommitted files for removal during the subsequent startup.
---
 src/backend/access/rmgrdesc/smgrdesc.c|  37 +++
 src/backend/access/transam/README |  10 +
 src/backend/access/transam/xact.c |   7 +
 src/backend/access/transam/xlogrecovery.c |  18 ++
 src/backend/backup/basebackup.c   |   9 +-
 src/backend/catalog/storage.c | 268 +-
 src/backend/storage/file/fd.c |   4 +-
 src/backend/storage/file/reinit.c | 313 +++---
 src/backend/storage/smgr/md.c |  94 ++-
 src/backend/storage/smgr/smgr.c   |  32 +++
 src/backend/storage/sync/sync.c   |  26 +-
 src/bin/pg_rewind/parsexlog.c |  16 ++
 src/common/relpath.c  |  47 ++--
 src/include/catalog/storage.h |   3 +
 src/include/catalog/storage_xlog.h|  35 ++-
 src/include/common/relpath.h  |   9 +-
 src/include/storage/fd.h  |   1 +
 src/include/storage/md.h  |   8 +-
 src/include/storage/reinit.h  |   8 +-
 src/include/storage/smgr.h|  17 ++
 src/test/recovery/t/013_crash_restart.pl  |  21 ++
 src/tools/pgindent/typedefs.list  |   6 +
 22 files changed, 847 insertions(+), 142 deletions(-)

diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index bd841b96e8..f8187385c4 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -40,6 +40,37 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
 		 xlrec->blkno, xlrec->flags);
 		pfree(path);
 	}
+	else if 

Allow pg_archivecleanup to remove backup history files

2023-04-25 Thread torikoshia

Hi,

Currently pg_archivecleanup doesn't remove backup history files even 
when they're older than oldestkeptwalfile.


Of course backup history files are smaller than WAL files and they don't 
consume much disk space, but a lot of backup history files (e.g. from 
daily backups over years) whose backup data has already been removed are 
unnecessary, and I would appreciate it if pg_archivecleanup had an option 
to remove them.


Attached is a PoC patch, which adds a new option -b that makes 
pg_archivecleanup also remove backup history files older than 
oldestkeptwalfile.


  $ ls archivedir
  00010001  00010003  00010006  00010008
  00010002  00010004  00010007  00010009
  00010002.0028.backup  00010005
  00010007.0028.backup  0001000A.partial


  $ pg_archivecleanup -b archivedir 00010009

  $ ls archivedir
  00010009  0001000A.partial
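
With this patch the existing -n (dry run) switch works together with -b as
well, so one can preview what would be removed first, e.g. (illustrative
command only, not output from an actual run):

  $ pg_archivecleanup -n -b archivedir 00010009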

Any thoughts?

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATIONFrom ad87224ec5ba6ee13ccf934bf3e5adefb7e67212 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Mon, 24 Apr 2023 23:28:06 +0900
Subject: [PATCH v1] Allow pg_archivecleanup to remove backup history files

Add new option -b to remove files including backup history files older
than oldestkeptwalfile.

---
 src/bin/pg_archivecleanup/pg_archivecleanup.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c
index 7726d05149..5bc90fbadd 100644
--- a/src/bin/pg_archivecleanup/pg_archivecleanup.c
+++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c
@@ -23,6 +23,8 @@ const char *progname;
 
 /* Options and defaults */
 bool		dryrun = false;		/* are we performing a dry-run operation? */
+bool		removeBackupHistoryFile = false;	/* remove files including
+ * backup history files */
 char	   *additional_ext = NULL;	/* Extension to remove from filenames */
 
 char	   *archiveLocation;	/* where to find the archive? */
@@ -118,7 +120,8 @@ CleanupPriorWALFiles(void)
 			 * file. Note that this means files are not removed in the order
 			 * they were originally written, in case this worries you.
 			 */
-			if ((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) &&
+			if (((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) ||
+(removeBackupHistoryFile && IsBackupHistoryFileName(walfile))) &&
 strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
 			{
 char		WALFilePath[MAXPGPATH * 2]; /* the file path
@@ -252,6 +255,7 @@ usage(void)
 	printf(_("Usage:\n"));
 	printf(_("  %s [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE\n"), progname);
 	printf(_("\nOptions:\n"));
+	printf(_("  -b remove files including backup history files\n"));
 	printf(_("  -d generate debug output (verbose mode)\n"));
 	printf(_("  -n dry run, show the names of the files that would be removed\n"));
 	printf(_("  -V, --version  output version information, then exit\n"));
@@ -294,10 +298,13 @@ main(int argc, char **argv)
 		}
 	}
 
-	while ((c = getopt(argc, argv, "dnx:")) != -1)
+	while ((c = getopt(argc, argv, "bdnx:")) != -1)
 	{
 		switch (c)
 		{
+			case 'b':			/* Remove backup history files too */
+removeBackupHistoryFile = true;
+break;
 			case 'd':			/* Debug mode */
 pg_logging_increase_verbosity();
 break;

base-commit: c5e4ec293ea527394fc1d0006ab88047b3ce580f
-- 
2.25.1



Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-04-25 Thread Drouvot, Bertrand

Hi,

On 4/25/23 6:43 AM, Amit Kapila wrote:

On Mon, Apr 24, 2023 at 5:38 PM Drouvot, Bertrand
 wrote:


We are waiting here at a maximum for 10 * $default_timeout (means 3 minutes)
before we time out. Would you prefer to wait more than 3 minutes at a maximum?



No, because I don't know what would be a suitable timeout here.


Yeah, I understand that. On the other hand, there are other places that
rely on a timeout, for example:

- wait_for_catchup(), wait_for_slot_catchup() and
wait_for_subscription_sync(), by making use of poll_query_until().
- wait_for_log(), by setting a max_attempts.

Couldn't we have the same concern for those ones (i.e. whether they are
suitable for slower machines)?
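
For instance, all of those helpers boil down to a polling loop like this
simplified sketch (the query and slot name are only examples), and the
timeout they rely on already comes from PG_TEST_TIMEOUT_DEFAULT (180s
unless overridden), so slower machines can raise it via the environment:

# Retries the query every ~0.1s and gives up once the number of attempts
# derived from $PostgreSQL::Test::Utils::timeout_default is exhausted.
$node_standby->poll_query_until('postgres',
	"SELECT restart_lsn IS NOT NULL FROM pg_replication_slots WHERE slot_name = 'vacuum_full_activeslot'")
  or die "timed out waiting for the slot";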


At
this stage, I don't have a good idea on how to implement this test in
a better way. Can we split this into a separate patch as the first
test is a bit straightforward, we can push that one and then
brainstorm on if there is a better way to test this functionality.



I created a dedicated 
v4-0002-Add-retained-WAL-test-in-035_standby_logical_deco.patch
just shared up-thread.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-04-25 Thread Drouvot, Bertrand

Hi,

On 4/25/23 6:23 AM, Amit Kapila wrote:

On Mon, Apr 24, 2023 at 3:36 PM Drouvot, Bertrand
 wrote:


Without the second "pg_log_standby_snapshot()" then 
wait_for_subscription_sync() would be waiting
some time on the poll for "SELECT count(1) = 0 FROM pg_subscription_rel WHERE 
srsubstate NOT IN ('r', 's');"

Adding a comment in V3 to explain the need for the second 
pg_log_standby_snapshot().



Won't this still be unpredictable because it is possible that the
tablesync worker may take more time to get launched or create a
replication slot? If that happens after your second
pg_log_standby_snapshot() then wait_for_subscription_sync() will be
hanging. 


Oh right, that looks like a possible scenario.


Wouldn't it be better to create a subscription with
(copy_data = false) to make it predictable and then we won't need
pg_log_standby_snapshot() to be performed twice?

If you agree with the above suggestion then you probably need to move
wait_for_subscription_sync() before Insert.



I like that idea, thanks! Done in V4 attached.

Not related to the above corner case, but while re-reading the patch I also 
added:

"
$node_primary->wait_for_replay_catchup($node_standby);
"

between the publication creation on the primary and the subscription to the
standby (to ensure the publication gets replicated before we request the
subscription creation).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.comFrom ed92ea9bf3385d2fe9e4ef0d8a04b87b1c7e6b3d Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Tue, 25 Apr 2023 06:02:17 +
Subject: [PATCH v4 2/2] Add retained WAL test in
 035_standby_logical_decoding.pl

Adding one test, to verify that invalidated logical slots do not lead to
retaining WAL.
---
 .../t/035_standby_logical_decoding.pl | 78 ++-
 1 file changed, 76 insertions(+), 2 deletions(-)
 100.0% src/test/recovery/t/

diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 03346f44f2..6ae4fc1e02 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -9,6 +9,7 @@ use warnings;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
+use Time::HiRes qw(usleep);
 
 my ($stdin, $stdout,$stderr,
$cascading_stdout,  $cascading_stderr,  $subscriber_stdin,
@@ -495,9 +496,82 @@ $node_standby->restart;
 check_slots_conflicting_status(1);
 
 ##
-# Verify that invalidated logical slots do not lead to retaining WAL
+# Verify that invalidated logical slots do not lead to retaining WAL.
 ##
-# X TODO
+
+# Get the restart_lsn from an invalidated slot
+my $restart_lsn = $node_standby->safe_psql('postgres',
+   "SELECT restart_lsn from pg_replication_slots WHERE slot_name = 
'vacuum_full_activeslot' and conflicting is true;"
+);
+
+chomp($restart_lsn);
+
+# Get the WAL file name associated to this lsn on the primary
+my $walfile_name = $node_primary->safe_psql('postgres',
+   "SELECT pg_walfile_name('$restart_lsn')");
+
+chomp($walfile_name);
+
+# Check the WAL file is still on the primary
+ok(-f $node_primary->data_dir . '/pg_wal/' . $walfile_name,
+   "WAL file still on the primary");
+
+# Get the number of WAL files on the standby
+my $nb_standby_files = $node_standby->safe_psql('postgres',
+   "SELECT COUNT(*) FROM pg_ls_dir('pg_wal')");
+
+chomp($nb_standby_files);
+
+# Switch WAL files on the primary
+my @c = (1 .. $nb_standby_files);
+
+$node_primary->safe_psql('postgres', "create table retain_test(a int)");
+
+for (@c)
+{
+   $node_primary->safe_psql(
+   'postgres', "SELECT pg_switch_wal();
+  insert into retain_test values("
+ . $_ . ");");
+}
+
+# Ask for a checkpoint
+$node_primary->safe_psql('postgres', 'checkpoint;');
+
+# Check that the WAL file has not been retained on the primary
+ok(!-f $node_primary->data_dir . '/pg_wal/' . $walfile_name,
+   "WAL file not on the primary anymore");
+
+# Wait for the standby to catch up
+$node_primary->wait_for_catchup($node_standby);
+
+# Generate another WAL switch, more activity and a checkpoint
+$node_primary->safe_psql(
+   'postgres', "SELECT pg_switch_wal();
+ insert into retain_test values(1);");
+$node_primary->safe_psql('postgres', 'checkpoint;');
+
+# Wait for the standby to catch up
+$node_primary->wait_for_catchup($node_standby);
+
+# Verify that the wal file has not been retained on the standby
+my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;
+
+# We can not test if the WAL file still exists immediately.
+# We need to let some time to the standby to actually "remove" it.
+my $i = 0;
+while (1)
+{

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2023-04-25 Thread Pavel Luzanov

On 24.04.2023 23:53, Melanie Plageman wrote:

I copied the committer who most recently touched pg_stat_io (Michael
Paquier) to see if we could get someone interested in committing this
docs update.


Let me explain my motivation for suggesting this update.

pg_stat_io is a very impressive feature, so I decided to try it.
I see 4 rows for a 'standalone backend' out of the 30 total rows of the
view.
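
For example (just an illustrative query):

  SELECT backend_type, object, context, reads, writes, extends
    FROM pg_stat_io
   WHERE backend_type = 'standalone backend';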


An attempt to find a description of 'standalone backend' in the docs
did not turn up anything. The pg_stat_io page references pg_stat_activity
for backend types, but the pg_stat_activity page doesn't say anything
about 'standalone backend'.

I think this question will come up often unless it is clarified in the docs.

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com





Re: Move un-parenthesized syntax docs to "compatibility" for few SQL commands

2023-04-25 Thread Michael Paquier
On Mon, Apr 24, 2023 at 09:52:21AM -0700, Nathan Bossart wrote:
> I think this can wait for v17, but if there's a strong argument for doing
> some of this sooner, we can reevaluate.

FWIW, I agree to hold on this stuff for v17~ once it opens for
business.  There is no urgency here.
--
Michael


signature.asc
Description: PGP signature


Re: Add LZ4 compression in pg_dump

2023-04-25 Thread Michael Paquier
On Wed, Apr 12, 2023 at 07:53:53PM -0500, Justin Pryzby wrote:
> I doubt it - in the !HAVE_LIBZ case, it's currently an "if" statement
> with nothing but a comment, which isn't a problem.
> 
> I think the only issue with an empty "if" is when you have no braces,
> like:
> 
>   if (...)
> #if ...
>   something;
> #endif
> 
>   // problem here //

(My apologies for the late reply.)

Still, it could easily be messed up, and that's not a style that
really exists in the tree, either, because there are always #else
blocks set up in such cases.  Another part that makes me a bit
uncomfortable is that we would still call
parse_compress_specification() twice, something that should not happen,
but we are doing so on HEAD because the default compression_algorithm_str
is "none" and we want to enforce "gzip" for custom and directory
formats when building with zlib.

What about just moving this block a bit up, just before the
compression spec parsing, then?  If we set compression_algorithm_str,
the specification is compiled with the expected default, once instead
of twice.
--
Michael
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 058244cd17..e4f32d170f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -721,6 +721,21 @@ main(int argc, char **argv)
 	if (archiveFormat == archNull)
 		plainText = 1;
 
+	/*
+	 * Custom and directory formats are compressed by default with gzip when
+	 * available, not the others.  If gzip is not available, no compression
+	 * is done by default.
+	 */
+	if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
+		!user_compression_defined)
+	{
+#ifdef HAVE_LIBZ
+		compression_algorithm_str = "gzip";
+#else
+		compression_algorithm_str = "none";
+#endif
+	}
+
 	/*
 	 * Compression options
 	 */
@@ -749,21 +764,6 @@ main(int argc, char **argv)
 		pg_log_warning("compression option \"%s\" is not currently supported by pg_dump",
 	   "workers");
 
-	/*
-	 * Custom and directory formats are compressed by default with gzip when
-	 * available, not the others.
-	 */
-	if ((archiveFormat == archCustom || archiveFormat == archDirectory) &&
-		!user_compression_defined)
-	{
-#ifdef HAVE_LIBZ
-		parse_compress_specification(PG_COMPRESSION_GZIP, NULL,
-									 &compression_spec);
-#else
-		/* Nothing to do in the default case */
-#endif
-	}
-
 	/*
 	 * If emitting an archive format, we always want to emit a DATABASE item,
 	 * in case --create is specified at pg_restore time.


signature.asc
Description: PGP signature