Re: psql tests hangs

2023-05-09 Thread Pavel Stehule
Hi

When I remove this test, all tests pass.

diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl
index 596746de17..631a1a7335 100644
--- a/src/bin/psql/t/001_basic.pl
+++ b/src/bin/psql/t/001_basic.pl
@@ -353,11 +353,6 @@ psql_like(

 # Check \watch
 # Note: the interval value is parsed with locale-aware strtod()
-psql_like(
-   $node,
-   sprintf('SELECT 1 \watch c=3 i=%g', 0.01),
-   qr/1\n1\n1/,
-   '\watch with 3 iterations');

 # Check \watch errors
 psql_fails_like(

Can somebody repeat this testing on FC38?

Regards

Pavel


Re: enhancing plpgsql debug api - hooks on statements errors and function errors

2023-05-09 Thread Kirk Wolak
On Tue, Apr 25, 2023 at 11:33 AM Pavel Stehule wrote:

> Hi
> On Tue, Apr 25, 2023 at 10:27, Pavel Stehule wrote:
>
>> Hi
>>
>> When I implemented the profiler and coverage check in plpgsql_check I had to
>> write a lot of hard-to-maintain code related to correctly finishing some
>> operations (counter incrementing) usually executed by the stmt_end and func_end
>> hooks. It is based on the fmgr hook and its own statement call stack. It would
>> be nice if I could throw this code away and use some nice built-in API.
>>
>> Can we enhance the debug API with two hooks, stmt_end_err and func_end_err?
>>
>> These hooks can be called from exception handlers before re-raising.
>>
>> Or we can define new hooks like executor hooks - stmt_exec and func_exec.
>> In custom hooks the exception can be caught.
>>
>> What do you think about this proposal?
>>
+1.  I believe I bumped into a few of these use cases with plpgsql_check
(special handling for security definer and exception handling).
My cursory review of the patch file is that, despite the movement of the
code, it feels pretty straightforward.

The 7% overhead appears in a "tight loop", so it's probably really
overstated.  I will see if I can apply this and do a more realistic test.
[I have a procedure that takes ~2hrs to process a lot of data, I would be
curious to see this impact and report back]


> I did a quick and ugly benchmark on the worst case
>
> CREATE OR REPLACE FUNCTION public.speedtest(i integer)
>  RETURNS void
>  LANGUAGE plpgsql
>  IMMUTABLE
> AS $function$
> declare c int = 0;
> begin
>   while c < i
>   loop
> c := c + 1;
>   end loop;
>   raise notice '%', c;
> end;
> $function$
>
> and it is possible to write some code (see the ugly patch) without any
> performance impact if the hooks are not used. When hooks are active, there
> is a 7% performance loss. It is not nice - but this is the worst case.
> The impact on real code should be significantly lower.
>
> Regards
>
> Pavel
>
>
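
For reference, the debug API being discussed is the PLpgSQL_plugin callback
struct that plpgsql exposes to instrumentation such as plpgsql_check. Below is
a minimal sketch of where the proposed hooks would sit; the *_err members are
hypothetical, named per the proposal above, while the other members exist
today (the real struct has a few more):

  typedef struct PLpgSQL_plugin
  {
      /* existing callbacks, fired around function/statement execution */
      void (*func_setup) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
      void (*func_beg) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
      void (*func_end) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
      void (*stmt_beg) (PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);
      void (*stmt_end) (PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);

      /*
       * Proposed (hypothetical): called from the exception handler before
       * re-raising, so a profiler can close out per-statement and
       * per-function counters even when an error escapes.
       */
      void (*stmt_end_err) (PLpgSQL_execstate *estate, PLpgSQL_stmt *stmt);
      void (*func_end_err) (PLpgSQL_execstate *estate, PLpgSQL_function *func);
  } PLpgSQL_plugin;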


Unlinking Parallel Hash Join inner batch files sooner

2023-05-09 Thread Thomas Munro
Hi,

One complaint about PHJ is that it can, in rare cases, use a
surprising amount of temporary disk space where non-parallel HJ would
not.  When it decides that it needs to double the number of batches to
try to fit each inner batch into memory, and then again and again
depending on your level of bad luck, it leaves behind all the earlier
generations of inner batch files to be cleaned up at the end of the
query.  That's stupid.  Here's a patch to unlink them sooner, as a
small improvement.

The reason I didn't do this earlier is that sharedtuplestore.c
continues the pre-existing tradition where each parallel process
counts what it writes against its own temp_file_limit.  At the time I
thought I'd need to have one process unlink all the files, but if a
process were to unlink files that it didn't create, that accounting
system would break.  Without some new kind of shared temp_file_limit
mechanism that doesn't currently exist, per-process counters could go
negative, creating free money.  In the attached patch, I realised
something that I'd missed before: there is a safe point for each
backend to unlink just the files that it created, and there is no way
for a process that created files not to reach that point.

Here's an example query that tries 8, 16 and then 32 batches on my
machine, because reltuples is clobbered with a bogus value.
Pathological cases can try many more rounds than that, but 3 is enough
to demonstrate.  Using truss and shell tricks I spat out the list of
create and unlink operations from master and the attached draft/POC
patch.  See below.

  set work_mem = '1MB';
  CREATE TABLE t (i int);
  INSERT INTO t SELECT generate_series(1, 100);
  ANALYZE t;
  UPDATE pg_class SET reltuples = reltuples / 4 WHERE relname = 't';
  EXPLAIN ANALYZE SELECT COUNT(*) FROM t t1 JOIN t t2 USING (i);

This code is also exercised by the existing "bad" case in join_hash.sql.

This is the second of two experimental patches investigating increased
resource usage in PHJ compared to HJ based on user complaints, this
one being per-batch temp files, and the other[1] being per-batch
buffer memory.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGKCnU9NjFfzO219V-YeyWr8mZe4JRrf%3Dx_uv6qsePBcOw%40mail.gmail.com

=

master:

99861: create i3of8.p0.0
99861: create i6of8.p0.0
99861: create i4of8.p0.0
99861: create i5of8.p0.0
99861: create i7of8.p0.0
99861: create i2of8.p0.0
99861: create i1of8.p0.0
99863: create i2of8.p1.0
99862: create i7of8.p2.0
99863: create i5of8.p1.0
99862: create i1of8.p2.0
99863: create i6of8.p1.0
99862: create i5of8.p2.0
99863: create i1of8.p1.0
99862: create i3of8.p2.0
99863: create i7of8.p1.0
99862: create i4of8.p2.0
99863: create i3of8.p1.0
99862: create i2of8.p2.0
99863: create i4of8.p1.0
99862: create i6of8.p2.0
99863: create i8of16.p1.0
99861: create i8of16.p0.0
99862: create i8of16.p2.0
99863: create i9of16.p1.0
99862: create i1of16.p2.0
99863: create i1of16.p1.0
99862: create i9of16.p2.0
99861: create i9of16.p0.0
99861: create i1of16.p0.0
99862: create i10of16.p2.0
99863: create i2of16.p1.0
99862: create i2of16.p2.0
99863: create i10of16.p1.0
99861: create i2of16.p0.0
99861: create i10of16.p0.0
99862: create i11of16.p2.0
99863: create i3of16.p1.0
99862: create i3of16.p2.0
99861: create i3of16.p0.0
99863: create i11of16.p1.0
99861: create i11of16.p0.0
99863: create i4of16.p1.0
99863: create i12of16.p1.0
99862: create i12of16.p2.0
99862: create i4of16.p2.0
99861: create i12of16.p0.0
99861: create i4of16.p0.0
99863: create i13of16.p1.0
99863: create i5of16.p1.0
99862: create i5of16.p2.0
99862: create i13of16.p2.0
99861: create i5of16.p0.0
99861: create i13of16.p0.0
99862: create i6of16.p2.0
99863: create i6of16.p1.0
99861: create i14of16.p0.0
99862: create i14of16.p2.0
99863: create i14of16.p1.0
99861: create i6of16.p0.0
99863: create i15of16.p1.0
99861: create i7of16.p0.0
99863: create i7of16.p1.0
99862: create i15of16.p2.0
99861: create i15of16.p0.0
99862: create i7of16.p2.0
99863: create i16of32.p1.0
99862: create i16of32.p2.0
99861: create i16of32.p0.0
99863: create i17of32.p1.0
99863: create i1of32.p1.0
99861: create i1of32.p0.0
99862: create i17of32.p2.0
99862: create i1of32.p2.0
99861: create i17of32.p0.0
99863: create i18of32.p1.0
99863: create i2of32.p1.0
99862: create i2of32.p2.0
99862: create i18of32.p2.0
99861: create i2of32.p0.0
99861: create i18of32.p0.0
99862: create i3of32.p2.0
99862: create i19of32.p2.0
99861: create i19of32.p0.0
99861: create i3of32.p0.0
99863: create i19of32.p1.0
99863: create i3of32.p1.0
99863: create i20of32.p1.0
99863: create i4of32.p1.0
99861: create i20of32.p0.0
99861: create i4of32.p0.0
99862: create i20of32.p2.0
99862: create i4of32.p2.0
99861: create i21of32.p0.0
99863: create i21of32.p1.0
99861: create i5of32.p0.0
99863: create i5of32.p1.0
99862: create i5of32.p2.0
99862: create i21of32.p2.0
99863: create i22of32.p1.0
99863: create i6of32.p1.0
99861: create i22of32.p0.0
99862: create i22of32.p2.0
99861: create i6of32.p0.0
99862: 

Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog

2023-05-09 Thread tender wang
Xiaoran Wang wrote on Saturday, 2023-03-18 at 15:04:

> Hi hackers,
>
>  In heap_create_with_catalog, the Relation new_rel_desc is created
> by RelationBuildLocalRelation, not table_open. So it's better to
> call RelationClose to release it.
>
Why is it better to call RelationClose? Is there a problem with using
table_close()?

> What's more, the comment for it seems useless, just delete it.
>
> Thanks!
>

regards, tender wang


Re: Add LZ4 compression in pg_dump

2023-05-09 Thread Michael Paquier
On Tue, May 09, 2023 at 02:12:44PM +, gkokola...@pm.me wrote:
> Thank you both for looking. A small consolation is that now there are
> tests for this case.

+1, noticing that was pure luck ;)

Worth noting that the patch posted in [1] has these tests, not the
version posted in [2].

+create_sql   => 'INSERT INTO dump_test.test_compression_method (col1) '
+  . 'SELECT string_agg(a::text, \'\') FROM generate_series(1,4096) a;',

Yep, good and cheap idea to check for longer chunks.  That should be
enough to loop twice.

[1]: 
https://www.postgresql.org/message-id/SYTRcNgtAbzyn3y3IInh1x-UfNTKMNpnFvI3mr6SyqyVf3PkaDsMy_cpKKgsl3_HdLy2MFAH4zwjxDmFfiLO8rWtSiJWBtqT06OMjeNo4GA=@pm.me
[2]: 
https://www.postgresql.org/message-id/f735df01-0bb4-2fbc-1297-73a520cfc...@enterprisedb.com

> Moving on to the other open item for this, please find attached v2
> of the patch as requested.

Did you notice the comments of [3] about the second patch that aims to
add the null termination in the line from the LZ4 fgets() callback?

[3]: https://www.postgresql.org/message-id/zfhcyn4gm2eu6...@paquier.xyz
--
Michael




Re: fix stats_fetch_consistency value in postgresql.conf.sample

2023-05-09 Thread Justin Pryzby
On Wed, Mar 29, 2023 at 11:03:59PM -0500, Justin Pryzby wrote:
> On Wed, Jul 13, 2022 at 04:49:00PM -0700, Andres Freund wrote:
> > On 2022-07-14 08:46:02 +0900, Michael Paquier wrote:
> > > On Wed, Jul 13, 2022 at 12:30:00PM -0500, Justin Pryzby wrote:
> > > > How did you make this list ?  Was it by excluding things that failed 
> > > > for you ?
> > > > 
> > > > cfbot is currently failing due to io_concurrency on windows.
> > > > I think there are more GUC which should be included here.
> > > > 
> > > > http://cfbot.cputube.org/kyotaro-horiguchi.html
> > > 
> > > FWIW, I am not really a fan of making this test depend on a hardcoded
> > > list of GUCs.
> > 
> > I wonder if we should add flags indicating platform dependency etc to guc.c?
> > That should allow us to remove most of them?
> 
> Michael commented on this, but on another thread, so I'm copying and
> pasting it here.
> 
> On Thu, Mar 23, 2023 at 08:59:57PM -0500, Justin Pryzby wrote:
> > On Fri, Mar 24, 2023 at 10:24:43AM +0900, Michael Paquier wrote:
> > > >> * Check consistency of GUC defaults between .sample.conf and 
> > > >> pg_settings.boot_val
> > > >   - It looks like this was pretty active until last October and might
> > > > have been ready to apply at least partially? But no further work or
> > > > review has happened since.
> > > 
> > > FWIW, I don't find much appealing the addition of two GUC flags for
> > > only the sole purpose of that,
> > 
> > The flags seem independently interesting - adding them here follows
> > a suggestion Andres made in response to your complaint.
> > 20220713234900.z4rniuaerkq34...@awork3.anarazel.de
> > 
> > > particularly as we get a stronger
> > > dependency between GUCs that can be switched dynamically at
> > > initialization and at compile-time.
> > 
> > What do you mean by "stronger dependency between GUCs" ?
> 
> I'm still not clear what that means?

Michael ?

This fixes an issue with the last version that failed with
log_autovacuum_min_duration in cirrusci's pg_ci_base.conf.

And it now includes both a Perl-based and a SQL-based version of the test -
both of which rely on the flags.
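
To illustrate how the SQL-based test can consume these flags (a sketch, not
the test itself; pg_settings_get_flags() already exists, and the two flag
names are the ones added by the patch below):

  -- settings whose sample-file default should equal boot_val: skip anything
  -- flagged as varying by platform/compile-time options or by initdb
  SELECT s.name, s.boot_val
  FROM pg_settings s
  WHERE NOT pg_settings_get_flags(s.name) &&
        ARRAY['DEFAULT_COMPILE', 'DEFAULT_INITDB'];
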
From 963b56636b9f7fd4a78e502c6acba07919518910 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 19 Jul 2022 08:31:56 -0500
Subject: [PATCH 1/2] pg_settings_get_flags(): add DEFAULT_COMPILE and
 DEFAULT_INITDB ..

for settings which vary by ./configure or platform or initdb

Note that this is independent from PGC_S_DYNAMIC_DEFAULT.
---
 doc/src/sgml/func.sgml  | 15 ++
 src/backend/utils/misc/guc_funcs.c  |  6 ++-
 src/backend/utils/misc/guc_tables.c | 76 ++---
 src/include/utils/guc.h |  2 +
 4 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e13fce6f6b1..17069d2249e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24945,6 +24945,21 @@ SELECT collation for ('foo' COLLATE "de_DE");
  FlagDescription
 
 
+
+ 
+  DEFAULT_COMPILE
+  Parameters with this flag have default values which vary by
+  platform, or compile-time options.
+  
+ 
+
+ 
+  DEFAULT_INITDB
+  Parameters with this flag have default values which are
+  determined dynamically during initdb.
+  
+ 
+
  
   EXPLAIN
   Parameters with this flag are included in
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 52c361e9755..183943c1973 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -551,7 +551,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 Datum
 pg_settings_get_flags(PG_FUNCTION_ARGS)
 {
-#define MAX_GUC_FLAGS	6
+#define MAX_GUC_FLAGS	8
 	char	   *varname = TextDatumGetCString(PG_GETARG_DATUM(0));
 	struct config_generic *record;
 	int			cnt = 0;
@@ -564,6 +564,10 @@ pg_settings_get_flags(PG_FUNCTION_ARGS)
 	if (record == NULL)
 		PG_RETURN_NULL();
 
+	if (record->flags & GUC_DEFAULT_COMPILE)
+		flags[cnt++] = CStringGetTextDatum("DEFAULT_COMPILE");
+	if (record->flags & GUC_DEFAULT_INITDB)
+		flags[cnt++] = CStringGetTextDatum("DEFAULT_INITDB");
 	if (record->flags & GUC_EXPLAIN)
 		flags[cnt++] = CStringGetTextDatum("EXPLAIN");
 	if (record->flags & GUC_NO_RESET)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f42cebaf62..94b0aa24a98 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1233,7 +1233,7 @@ struct config_bool ConfigureNamesBool[] =
 		{"debug_assertions", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows whether the running server has assertion checks enabled."),
 			NULL,
-			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_DEFAULT_COMPILE
 		},
 		_enabled,
 		DEFAULT_ASSERT_ENABLED,
@@ -1425,7 +1425,8 @@ struct config_bool ConfigureNamesBool[] =
 	{
 		{"update_process_title", PGC_SUSET, PROCESS_TITLE,
 			

Re: Large files for relations

2023-05-09 Thread Stephen Frost
Greetings,

* Corey Huinker (corey.huin...@gmail.com) wrote:
> On Wed, May 3, 2023 at 1:37 AM Thomas Munro  wrote:
> > On Wed, May 3, 2023 at 5:21 PM Thomas Munro 
> > wrote:
> > > rsync --link-dest

... rsync isn't really a safe tool to use for PG backups by itself
unless you're using it with archiving and with start/stop backup and
with checksums enabled.

> > I wonder if rsync will grow a mode that can use copy_file_range() to
> > share blocks with a reference file (= previous backup).  Something
> > like --copy-range-dest.  That'd work for large-file relations
> > (assuming a file system that has block sharing, like XFS and ZFS).
> > You wouldn't get the "mtime is enough, I don't even need to read the
> > bytes" optimisation, which I assume makes all database hackers feel a
> > bit queasy anyway, but you'd get the space savings via the usual
> > rolling checksum or a cheaper version that only looks for strong
> > checksum matches at the same offset, or whatever other tricks rsync
> > might have up its sleeve.

There are also really good reasons to have multiple full backups and not
just a single full backup and then lots and lots of incrementals, which
basically boils down to "are you really sure that one copy of that one
really important file won't ever disappear from your backup
repository..?"

That said, pgbackrest does now have block-level incremental backups
(where we define our own block size ...) and there are reasons we decided
against going down the LSN-based approach (not the least of which is
that the LSN isn't always updated...), but long story short, moving to
larger than 1G files should be something that pgbackrest will be able
to handle without as much impact as there would have been previously in
terms of incremental backups.  There is a loss in the ability to use
mtime to scan just the parts of the relation that changed and that's
unfortunate but I wouldn't see it as really a game changer (and yes,
there's certainly an argument for not trusting mtime, though I don't
think we've yet had a report where there was an mtime issue that our
mtime-validity checking didn't catch and force pgbackrest into
checksum-based revalidation automatically which resulted in an invalid
backup... of course, not enough people test their backups...).

> I understand the need to reduce open file handles, despite the
> possibilities enabled by using large numbers of small file sizes.

I'm also generally in favor of reducing the number of open file handles
that we have to deal with.  Addressing the concerns raised nearby about
weird corner cases (a non-1G-length ABCDEF.1 file existing while ABCDEF.2
and later files also exist) is certainly another good argument in
favor of getting rid of segments.

> I am curious whether a move like this to create a generational change in
> file format shouldn't be more ambitious, perhaps altering the block
> format to insert a block format version number, whether that be at every
> block, or every megabyte, or some other interval, and whether we store it
> in-file or in a separate file to accompany the first non-segmented. Having
> such versioning information would allow blocks of different formats to
> co-exist in the same table, which could be critical to future changes such
> as 64 bit XIDs, etc.

To the extent you're interested in this, there are patches posted which
are alrady trying to move us in a direction that would allow for
different page formats that add in space for other features such as
64bit XIDs, better checksums, and TDE tags to be supported.

https://commitfest.postgresql.org/43/3986/

Currently those patches are expecting it to be declared at initdb time,
but the way they're currently written that's more of a soft requirement
as you can tell on a per-page basis what features are enabled for that
page.  Might make sense to support it in that form first anyway though,
before going down the more ambitious route of allowing different pages
to have different sets of features enabled for them concurrently.

When it comes to 'a separate file', we do have forks already and those
serve a very valuable but distinct use-case where you can get
information from the much smaller fork (be it the FSM or the VM or some
future thing) while something like 64bit XIDs or a stronger checksum is
something you'd really need on every page.  I have serious doubts about
a proposal where we'd store information needed on every page read in
some far-away block that's still in the same file, such as using
something every 1MB, as that would turn every block access into two.

Thanks,

Stephen




Re: walsender performance regression due to logical decoding on standby changes

2023-05-09 Thread Andres Freund
Hi,

On 2023-05-09 13:38:24 -0700, Jeff Davis wrote:
> On Tue, 2023-05-09 at 12:02 -0700, Andres Freund wrote:
> > I don't think the approach of not having any sort of "registry" of
> > whether
> > anybody is waiting for the replay position to be updated is
> > feasible. Iterating over all walsenders slots is just too expensive -
> 
> Would it work to use a shared counter for the waiters (or, two
> counters, one for physical and one for logical), and just early exit if
> the count is zero?

That doesn't really fix the problem - once you have a single walsender
connected, performance is bad again.

Greetings,

Andres Freund




Re: Feature: Add reloption support for table access method

2023-05-09 Thread Andres Freund
Hi,

On 2023-05-05 16:44:39 +0800, 吴昊 wrote:
> When I wrote an extension to implement new storage via the table access
> method API, I found some issues where the existing code has strong
> assumptions about heap tables. Here are 3 issues that I currently have:
> 
> 
> 1. The index access method has a callback to handle reloptions, but the
> table access method hasn't. We can't add storage-specific parameters via the
> `WITH` clause when creating a table.

Makes sense to add that.


> 2. Existing code strongly assumes that the data file of a table is
> structured as a series of physical files named by a hard-coded rule:
> <relfilenode>[.<segment>]. That may only fit heap-like tables. A new storage
> may have its own structure for how the data files are organized. The problem
> happens when dropping a table.

I agree that it's not great, but I don't think it's particularly easy to fix
(because things like transactional DDL require fairly tight integration). Most
of the time it should be possible to work around the limitations, though.


> 3. The existing code also assumes that the data file consists of a series
> of fixed-size blocks. That may not be desired by other storage. Are there
> any suggestions for this situation?

That's a requirement of using the buffer manager, but if you don't want to
rely on that, you can use a different pattern. There's some limitations
(format of TIDs, most prominently), but you should be able to deal with that.

I don't think it would make sense to support other block sizes in the buffer
manager.


> The rest of this mail talks about the first issue. It looks reasonable to
> add a similar callback to struct TableAmRoutine and parse reloptions via the
> callback. The patch is attached.

Why did you add relkind to the callbacks? The callbacks are specific to a
certain relkind, so I don't think that makes sense.

I don't think we really need GetTableAmRoutineByAmId() that raises nice
errors etc - as the AM has already been converted to an oid, we shouldn't need
to recheck?



> +bytea *
> +table_reloptions_am(Oid accessMethodId, Datum reloptions, char relkind, bool 
> validate)
>  {
> ...
> 
> + /* built-in table access method put here to fetch TAM fast */
> + case HEAP_TABLE_AM_OID:
> + tam = GetHeapamTableAmRoutine();
> + break;
>   default:
> - /* other relkinds are not supported */
> - return NULL;
> + tam = GetTableAmRoutineByAmId(accessMethodId);
> + break;

Why do we need this fastpath? This shouldn't be something called at a
meaningful frequency?


> }
> + return table_reloptions(tam->amoptions, reloptions, relkind, validate);
> }

I'd just pass the tam, instead of an individual function.

> @@ -866,6 +866,11 @@ typedef struct TableAmRoutine
>   
>struct SampleScanState *scanstate,
>   
>TupleTableSlot *slot);
>  
> + /*
> +  * This callback is used to parse reloptions for relation/matview/toast.
> +  */
> + bytea   *(*amoptions)(Datum reloptions, char relkind, bool 
> validate);
> +
>  } TableAmRoutine;

Did you mean table instead of relation in the comment?



> Another thing about reloptions is that the current API passes a parameter
> `validate` to tell the parse function whether to check and raise an error.
> It doesn't have enough context about where these reloptions are used:
> 1. CREATE TABLE ... WITH(...)
> 2. ALTER TABLE ... SET ...
> 3. ALTER TABLE ... RESET ...
> The reason why the context matters is that some reloptions may not be
> changed after the table is created, while others may.

What kind of reloption are you thinking of here?

Greetings,

Andres Freund




Re: Large files for relations

2023-05-09 Thread Corey Huinker
On Wed, May 3, 2023 at 1:37 AM Thomas Munro  wrote:

> On Wed, May 3, 2023 at 5:21 PM Thomas Munro 
> wrote:
> > rsync --link-dest
>
> I wonder if rsync will grow a mode that can use copy_file_range() to
> share blocks with a reference file (= previous backup).  Something
> like --copy-range-dest.  That'd work for large-file relations
> (assuming a file system that has block sharing, like XFS and ZFS).
> You wouldn't get the "mtime is enough, I don't even need to read the
> bytes" optimisation, which I assume makes all database hackers feel a
> bit queasy anyway, but you'd get the space savings via the usual
> rolling checksum or a cheaper version that only looks for strong
> checksum matches at the same offset, or whatever other tricks rsync
> might have up its sleeve.
>

I understand the need to reduce open file handles, despite the
possibilities enabled by using large numbers of small file sizes.
Snowflake, for instance, sees everything in 1MB chunks, which makes
massively parallel sequential scans (Snowflake's _only_ query plan)
possible, though I don't know if they accomplish that via separate files,
or via segments within a large file.

I am curious whether a move like this to create a generational change in
file format shouldn't be more ambitious, perhaps altering the block
format to insert a block format version number, whether that be at every
block, or every megabyte, or some other interval, and whether we store it
in-file or in a separate file to accompany the first non-segmented. Having
such versioning information would allow blocks of different formats to
co-exist in the same table, which could be critical to future changes such
as 64 bit XIDs, etc.


Re: walsender performance regression due to logical decoding on standby changes

2023-05-09 Thread Jeff Davis
On Tue, 2023-05-09 at 12:02 -0700, Andres Freund wrote:
> I don't think the approach of not having any sort of "registry" of
> whether
> anybody is waiting for the replay position to be updated is
> feasible. Iterating over all walsenders slots is just too expensive -

Would it work to use a shared counter for the waiters (or, two
counters, one for physical and one for logical), and just early exit if
the count is zero?

Regards,
Jeff Davis





walsender performance regression due to logical decoding on standby changes

2023-05-09 Thread Andres Freund
Hi,

Unfortunately I have found the following commit to have caused a performance
regression:

commit e101dfac3a53c20bfbf1ca85d30a368c2954facf
Author: Andres Freund 
Date:   2023-04-08 00:24:24 -0700

For cascading replication, wake physical and logical walsenders separately

Physical walsenders can't send data until it's been flushed; logical
walsenders can't decode and send data until it's been applied. On the
standby, the WAL is flushed first, which will only wake up physical
walsenders; and then applied, which will only wake up logical
walsenders.

Previously, all walsenders were awakened when the WAL was flushed. That
was fine for logical walsenders on the primary; but on the standby the
flushed WAL would have been not applied yet, so logical walsenders were
awakened too early.

Per idea from Jeff Davis and Amit Kapila.

Author: "Drouvot, Bertrand" 
Reviewed-By: Jeff Davis 
Reviewed-By: Robert Haas 
Reviewed-by: Amit Kapila 
Reviewed-by: Masahiko Sawada 
Discussion: 
https://postgr.es/m/caa4ek1+zo5lueisabx10c81lu-fwmko4m9wyg1cdkbw7hqh...@mail.gmail.com

The problem is that, on a standby, after the change - as needed to for the
approach to work - the call to WalSndWakeup() in ApplyWalRecord() happens for
every record, instead of only happening when the timeline is changed (or WAL
is flushed or ...).

WalSndWakeup() iterates over all walsender slots, regardless of whether in
use. For each of the walsender slots it acquires a spinlock.

When replaying a lot of small-ish WAL records I found the startup process to
spend the majority of the time in WalSndWakeup(). I've not measured it very
precisely yet, but the overhead is significant (~35% slowdown), even with the
default max_wal_senders. If that's increased substantially, it obviously gets
worse.

The only saving grace is that this is not an issue on the primary.


I unfortunately spent less time on this commit of the
logical-decoding-on-standby series than on the others. There were so many
other senior contributors discussing it, that I "relaxed" a bit too much.


I don't think the approach of not having any sort of "registry" of whether
anybody is waiting for the replay position to be updated is
feasible. Iterating over all walsenders slots is just too expensive -
WalSndWakeup() shows up even if I remove the spinlock (which we likely could,
especially when just checking if the walsender is connected).

My current guess is that mis-using a condition variable is the best bet. I
think it should work to use ConditionVariablePrepareToSleep() before a
WalSndWait(), and then ConditionVariableCancelSleep(). I.e. to never use
ConditionVariableSleep(). The latch set from ConditionVariableBroadcast()
would still cause the necessary wakeup.
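
A rough sketch of that pattern on the walsender side follows;
ConditionVariablePrepareToSleep(), ConditionVariableCancelSleep() and
WalSndWait() are the existing functions, while the replay_cv member of the
shared WalSndCtl struct is the hypothetical new piece:

  static void
  WalSndWaitForReplay(uint32 wakeEvents, long timeout, uint32 wait_event)
  {
      /*
       * Register as a CV waiter first, so a broadcast arriving between this
       * call and the wait below still sets our latch.
       */
      ConditionVariablePrepareToSleep(&WalSndCtl->replay_cv);

      /*
       * Never ConditionVariableSleep(): the latch set by
       * ConditionVariableBroadcast() in the startup process is enough to
       * wake this latch-based wait.
       */
      WalSndWait(wakeEvents, timeout, wait_event);

      ConditionVariableCancelSleep();
  }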

Greetings,

Andres Freund




Re: psql tests hangs

2023-05-09 Thread Pavel Stehule
Hi


On Tue, May 9, 2023 at 13:53, Pavel Stehule wrote:

>
>
> On Tue, May 9, 2023 at 11:07, Pavel Stehule wrote:
>
>>
>>
>> On Tue, May 9, 2023 at 10:48, Daniel Gustafsson wrote:
>>
>>> > On 9 May 2023, at 08:52, Pavel Stehule 
>>> wrote:
>>> >
>>> > Hi
>>> >
>>> > I tried running make check-world. Now I have problems with the psql tests
>>> >
>>> > I had to cancel tests
>>> >
>>> > log:
>>> >
>>> > [08:46:49.828](0.038s) ok 63 - no ON_ERROR_STOP, --single-transaction
>>> and multiple -c switches
>>> > [08:46:49.860](0.033s) ok 64 - client-side error commits transaction,
>>> no ON_ERROR_STOP and multiple -c switches
>>> > [08:46:49.928](0.067s) ok 65 - \copy from with DEFAULT: exit code 0
>>> > [08:46:49.929](0.001s) ok 66 - \copy from with DEFAULT: no stderr
>>> > [08:46:49.930](0.001s) ok 67 - \copy from with DEFAULT: matches
>>> > death by signal at
>>> /home/pavel/src/postgresql.master/src/bin/psql/../../../src/test/perl/PostgreSQL/Test/Cluster.pm
>>> line 3042.
>>> > # Postmaster PID for node "main" is 157863
>>> > ### Stopping node "main" using mode immediate
>>> > # Running: pg_ctl -D
>>> /home/pavel/src/postgresql.master/src/bin/psql/tmp_check/t_001_basic_main_data/pgdata
>>> -m immediate stop
>>> > waiting for server to shut down done
>>> > server stopped
>>> > # No postmaster PID for node "main"
>>> > [08:47:30.361](40.431s) # Tests were run but no plan was declared and
>>> done_testing() was not seen.
>>> > [08:47:30.362](0.001s) # Looks like your test exited with 4 just after
>>> 67.
>>> > Warning: unable to close filehandle $orig_stderr properly: Broken pipe
>>> during global destruction.
>>>
>>> I'm unable to reproduce, and this clearly works in the buildfarm and
>>> CI.  Did
>>> you run out of disk on the volume during the test or something similar?
>>> Anything interesting in the serverlogs from the tmp_check install?
>>>
>>
>> I have enough free space on disc
>>
>> I don't see anything interesting in the log (this is another run)
>>
>> 2023-05-09 08:50:04.839 CEST [158930] 001_basic.pl LOG:  statement: COPY
>>  copy_default FROM STDIN with (format 'csv', default 'placeholder');
>> 2023-05-09 08:50:04.841 CEST [158930] 001_basic.pl LOG:  statement:
>> SELECT * FROM copy_default
>> 2023-05-09 08:50:04.879 CEST [158932] 001_basic.pl LOG:  statement:
>> SELECT 1.
>> 2023-05-09 08:50:04.888 CEST [158932] 001_basic.pl LOG:  statement:
>> SELECT 1.
>> 2023-05-09 08:50:04.898 CEST [158932] 001_basic.pl LOG:  statement:
>> SELECT 1.
>> 2023-05-09 08:50:28.375 CEST [158862] LOG:  received immediate shutdown
>> request
>> 2023-05-09 08:50:28.385 CEST [158862] LOG:  database system is shut down
>>
>> backtrace from perl
>>
>> Program received signal SIGINT, Interrupt.
>> 0x7f387ecc1ade in select () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x7f387ecc1ade in select () from /lib64/libc.so.6
>> #1  0x7f387e97363b in Perl_pp_sselect () from /lib64/libperl.so.5.36
>> #2  0x7f387e917958 in Perl_runops_standard () from
>> /lib64/libperl.so.5.36
>> #3  0x7f387e88259d in perl_run () from /lib64/libperl.so.5.36
>> #4  0x5588bceb234a in main ()
>>
>> Regards
>>
>
> I repeated another build with the same result.
>
> Tested REL_15_STABLE branch without any problems.
>

There is some dependence on locales

for commit  96c498d2f8ce5f0082c64793f94e2d0cfa7d7605

with my cs_CZ.utf8 locale

echo "# +++ tap check in src/bin/psql +++" && rm -rf
'/home/pavel/src/postgresql.master/src/bin/psql'/tmp_check &&
/usr/bin/mkdir -p
'/home/pavel/src/postgresql.master/src/bin/psql'/tmp_check && cd . &&
TESTLOGDIR='/home/pavel/src/postgresql.master/src/bin/psql/tmp_check/log'
TESTDATADIR='/home/pavel/src/postgresql.master/src/bin/psql/tmp_check'
PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/bin:/home/pavel/src/postgresql.master/src/bin/psql:$PATH"
LD_LIBRARY_PATH="/home/pavel/src/postgresql.master/tmp_install/usr/local/pgsql/lib"
 PGPORT='65432'
top_builddir='/home/pavel/src/postgresql.master/src/bin/psql/../../..'
PG_REGRESS='/home/pavel/src/postgresql.master/src/bin/psql/../../../src/test/regress/pg_regress'
/usr/bin/prove -I ../../../src/test/perl/ -I .  t/*.pl
# +++ tap check in src/bin/psql +++
t/001_basic.pl ... 15/?
#   Failed test '\watch with 3 iterations: exit code 0'
#   at t/001_basic.pl line 354.
#  got: '3'
# expected: '0'

#   Failed test '\watch with 3 iterations: no stderr'
#   at t/001_basic.pl line 354.
#  got: 'psql::1: error: \watch: incorrect interval value
"0.01"'
# expected: ''

#   Failed test '\watch with 3 iterations: matches'
#   at t/001_basic.pl line 354.
#   ''
# doesn't match '(?^:1\n1\n1)'
# Looks like you failed 3 tests of 80.
t/001_basic.pl ... Dubious, test returned 3 (wstat 768, 0x300)
Failed 3/80 subtests
t/010_tab_completion.pl .. ok
t/020_cancel.pl .. ok

Test Summary Report
---
t/001_basic.pl (Wstat: 768 (exited 3) Tests: 80 Failed: 3)
  Failed 
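
A plausible reading of the failure, given the test's own comment that the
interval is parsed with locale-aware strtod(): under cs_CZ the decimal
separator is a comma, so the "0.01" produced by Perl's sprintf stops parsing
at the dot. In that locale psql would presumably accept the comma spelling
instead:

  SELECT 1 \watch c=3 i=0,01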

Re: benchmark results comparing versions 15.2 and 16

2023-05-09 Thread MARK CALLAGHAN
On Fri, May 5, 2023 at 10:01 PM MARK CALLAGHAN  wrote:

> I have two more runs of the benchmark in progress so we will have 3
> results for each of the test cases to confirm that the small regressions
> are repeatable.
>

They get similar results. Then I tried Linux perf but the hierarchical call
stacks, to be used for Flamegraph, have too many "[unknown]" entries.
I was using: ./configure --prefix=$pfx --enable-debug CFLAGS="-O3
-march=native -mtune=native -flto" LDFLAGS="-flto" > o.cf.$x 2> e.cf.$x
Adding -fno-omit-frame-pointer fixes the problem, so I am repeating the
benchmark with that to confirm there are still regressions and then I will
get flamegraphs.

-- 
Mark Callaghan
mdcal...@gmail.com


Re: benchmark results comparing versions 15.2 and 16

2023-05-09 Thread Andres Freund
Hi,

On 2023-05-08 12:11:17 -0700, Andres Freund wrote:
> Hi,
> 
> On 2023-05-08 16:00:01 +0300, Alexander Lakhin wrote:
> > This difference is confirmed by multiple test runs. `git bisect` for this
> > regression pointed at f193883fc.
> 
> I can reproduce a significant regression due to f193883fc of a workload just
> running
> SELECT CURRENT_TIMESTAMP;
> 
> A single session running it on my workstation via pgbench -Mprepared gets
> before:
> tps = 89359.128359 (without initial connection time)
> after:
> tps = 83843.585152 (without initial connection time)
> 
> Obviously this is an extreme workload, but that nevertheless seems too large
> to just accept...

Added an open item for this.

Greetings,

Andres Freund




Re: psql \watch 2nd argument: iteration count

2023-05-09 Thread Tom Lane
Peter Eisentraut  writes:
> On 13.03.23 02:17, Michael Paquier wrote:
>> I am not sure that this will be the last option we'll ever add to
>> \watch, so I'd rather have us choose a design more flexible than
>> what's proposed here, in a way similar to \g or \gx.

> On the other hand, we also have option syntax in \connect that is like 
> -foo.  Would that be a better match here?  We should maybe decide before 
> we diverge and propagate two different option syntaxes in backslash 
> commands.

Reasonable point to raise, but I think \connect's -reuse-previous
is in the minority.  \connect itself can use option=value syntax
in the conninfo string (in fact, I guess -reuse-previous was spelled
that way in hopes of not being confusable with a conninfo option).
We also have option=value in the \g and \gx commands.  I don't see
any other psql metacommands that use options spelled like -foo.

In short, I'm satisfied with the current answer.  There's still
time to debate it though.

regards, tom lane




Re: psql \watch 2nd argument: iteration count

2023-05-09 Thread Peter Eisentraut

On 13.03.23 02:17, Michael Paquier wrote:

On Sun, Mar 12, 2023 at 01:05:39PM -0700, Andrey Borodin wrote:

In the review above Kyotaro-san suggested that message should contain
information on what it expects... So, maybe then
pg_log_error("\\watch interval must be non-negative number, but
argument is '%s'", opt); ?
Or perhaps with articles? pg_log_error("\\watch interval must be a
non-negative number, but the argument is '%s'", opt);


-   HELP0("  \\watch [SEC]   execute query every SEC seconds\n");
+   HELP0("  \\watch [SEC [N]]   execute query every SEC seconds N 
times\n");

Is that really the interface we'd want to work with in the long-term?
For one, this does not give the option to specify only an interval
while relying on the default number of seconds.  This may be fine, but
it does not strike me as the best choice.  How about doing something
more extensible, for example:
\watch [ (option=value [, option=value] .. ) ] [SEC]

I am not sure that this will be the last option we'll ever add to
\watch, so I'd rather have us choose a design more flexible than
what's proposed here, in a way similar to \g or \gx.


On the other hand, we also have option syntax in \connect that is like 
-foo.  Would that be a better match here?  We should maybe decide before 
we diverge and propagate two different option syntaxes in backslash 
commands.






Re: Order changes in PG16 since ICU introduction

2023-05-09 Thread Jeff Davis
On Tue, 2023-05-09 at 10:25 +0200, Alvaro Herrera wrote:
> I agree with removing these in v16, since they are going to become
> more
> meaningless and confusing.

Agreed, but it would be nice to have an alternative that does the right
thing.

It's awkward for a user to read pg_database.datlocprovider, then
depending on that, either look in datcollate or daticulocale. (It's
awkward in the code, too.)
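
Concretely, today that per-provider lookup is something like this (a sketch
against the v16 catalogs):

  SELECT datname,
         CASE datlocprovider
             WHEN 'i' THEN daticulocale  -- ICU provider
             ELSE datcollate             -- libc provider
         END AS effective_locale
  FROM pg_database;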

Maybe some built-in function that returns a tuple of the default
provider, the locale, and the version? Or should we also output the
ctype somehow (which affects the results of upper()/lower())?

Regards,
Jeff Davis





Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2023-05-09 Thread torikoshia

On 2023-05-07 05:05, Alena Rybakina wrote:
Thanks for your review and comments!


I noticed that you used the _ignore_datatype_errors_specified_ variable in
_copy.c_, but the option has a short name, _ignore_datatype_errors_. Also you
used the short variable name in the _CopyFormatOptions_ structure.


You may already understand it, but these variable names are given in 
imitation of FREEZE and BINARY cases:


  --- a/src/include/commands/copy.h
  +++ b/src/include/commands/copy.h
  @@ -42,6 +42,7 @@ typedef struct CopyFormatOptions
   * -1 if not specified */
  boolbinary; /* binary format? */
  boolfreeze; /* freeze rows on loading? */
  +   boolignore_datatype_errors;  /* ignore rows with datatype 
errors */


  --- a/src/backend/commands/copy.c
  +++ b/src/backend/commands/copy.c
  @@ -419,6 +419,7 @@ ProcessCopyOptions(ParseState *pstate,
  boolformat_specified = false;
  boolfreeze_specified = false;
  boolheader_specified = false;
  +   boolignore_datatype_errors_specified = false;



The name _ignore_datatype_errors_specified_ seems very long to me;
maybe use a short version of it (_ignore_datatype_errors_) in
_copy.c_ too?


I think it would be sane to align the names with the FREEZE and BINARY 
options.


I agree that the name is too long; we once used the name 'ignore_errors'.
However, the current implementation does not ignore all errors but just data
type errors, so I renamed it.

There may be a better name, but I haven't come up with one.


Besides, I noticed that you used the _ignored_errors_ variable in the
_CopyFromStateData_ structure, and its name is strikingly similar to
_ignore_datatype_errors_, but they have different meanings.
Maybe it would be better to rename it to _ignored_errors_counter_?


From a quick look at the PostgreSQL source code, there are few variable
names with "_counter"; it seems to be used mostly in function names.

Something like "ignored_errors_count" might be better.


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION




PostgreSQL 16 Beta 1 release date

2023-05-09 Thread Jonathan S. Katz

Hi,

The release date for PostgreSQL 16 Beta 1 is scheduled for May 25, 2023.

Please ensure any work you want released in Beta 1 is committed
by May 21, 2023 AoE [1].


Thank you for your efforts with resolving open items[2] as we work to
stabilize PostgreSQL 16 for GA!

Thanks,

Jonathan

[1] https://en.wikipedia.org/wiki/Anywhere_on_Earth
[2] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items





Re: Add LZ4 compression in pg_dump

2023-05-09 Thread gkokolatos





--- Original Message ---
On Tuesday, May 9th, 2023 at 2:54 PM, Tomas Vondra 
 wrote:


> 
> 
> On 5/9/23 00:10, Michael Paquier wrote:
> 
> > On Mon, May 08, 2023 at 08:00:39PM +0200, Tomas Vondra wrote:
> > 
> > > The LZ4Stream_write() forgot to move the pointer to the next chunk, so
> > > it was happily decompressing the initial chunk over and over. A bit
> > > embarrassing oversight :-(
> > > 
> > > The custom format calls WriteDataToArchiveLZ4(), which was correct.
> > > 
> > > The attached patch fixes this for me.
> > 
> > Ouch. So this was corrupting the dumps and the compression when
> > trying to write more than two chunks at once, not the decompression
> > steps. That addresses the issue here as well, thanks!
> 
> 
> Yeah. Thanks for the report, should have been found during review.

Thank you both for looking. A small consolation is that now there are
tests for this case.

Moving on to the other open item for this, please find attached v2
of the patch as requested.

Cheers,
//Georgios

> 
> 
> regards
> 
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

From cb0a229be59dffe09cc0ceceececdbd06a559d3f Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos 
Date: Mon, 8 May 2023 11:58:57 +
Subject: [PATCH v2] Null terminate the output buffer of LZ4Stream_gets

LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.

Reported-by: Alexander Lakhin
---
 src/bin/pg_dump/compress_lz4.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..f97b7550d1 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
 	if (!LZ4Stream_init(state, size, false /* decompressing */ ))
 		return -1;
 
+	/* No work needs to be done for a zero-sized output buffer */
+	if (size <= 0)
+		return 0;
+
 	/* Verify that there is enough space in the outbuf */
 	if (size > state->buflen)
 	{
@@ -636,7 +640,7 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	LZ4State   *state = (LZ4State *) CFH->private_data;
 	int			ret;
 
-	ret = LZ4Stream_read_internal(state, ptr, size, true);
+	ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
 	if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
 		pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
 
@@ -644,6 +648,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	if (ret == 0)
 		return NULL;
 
+	/*
+	 * Our caller expects the return string to be NULL terminated
+	 * and we know that ret is greater than zero.
+	 */
+	ptr[ret - 1] = '\0';
+
 	return ptr;
 }
 
-- 
2.34.1



Re: Proposal for Prototype Implementation to Enhance C/C++ Interoperability in PostgreSQL

2023-05-09 Thread 盏一
I apologize for my previous hasty conclusion. I have conducted further testing 
on different platforms and would like to share my findings.

> FreeBSD

Based on my tests, it appears that FreeBSD follows the Itanium C++ ABI 
specification. The previous test failed because the C++ compiler was not used 
when linking libpgsjlj.so.

> macOS, M1

My tests show that macOS M1 roughly follows the Itanium C++ ABI specification, 
with only slight differences, such as the parameters accepted by the 
_Unwind_Stop_Fn function.

> macOS, x86

I don't have the resources to do the testing, but from a code perspective, it 
appears that macOS x86 follows the Itanium C++ ABI specification.

> Windows

It seems that Windows does not follow the Itanium C++ ABI specification at all. 
If we compile the program using the `/EHsc` option, longjmp will also trigger 
forced unwinding. However, unlike the Itanium C++ ABI, the forced unwinding 
triggered here cannot be captured by a C++ catch statement.

Re: Allow pg_archivecleanup to remove backup history files

2023-05-09 Thread torikoshia

Horiguchi-san, Michael-san

Thanks for your comments and information!

Attached a patch with documentation and regression tests.


On 2023-04-26 06:39, Michael Paquier wrote:

On Tue, Apr 25, 2023 at 05:29:48PM +0900, Kyotaro Horiguchi wrote:

I thought that we had decided not to do that, but I couldn't find any
discussion about it in the ML archive.  Anyway, I think it is great
that we have that option.


No objections from here to make that optional.  It's been argued for a
couple of times that leaving the archive history files is good for
debugging, but I can also get why one would want to clean up all the
files older than a WAL retention policy.  Even if these are just a few
bytes, it can be noisy for the eye to see a large accumulation of
history files mixed with the WAL segments.
--
Michael


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From 7d44134e5de930ce04819285a5b7359e370708d4 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Tue, 9 May 2023 21:52:01 +0900
Subject: [PATCH v2] Allow pg_archivecleanup to remove backup history files

Add new option -b to remove files including backup history files older
than oldestkeptwalfile.

---
 doc/src/sgml/ref/pgarchivecleanup.sgml| 11 +
 src/bin/pg_archivecleanup/pg_archivecleanup.c | 11 -
 .../t/010_pg_archivecleanup.pl| 45 +--
 3 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/pgarchivecleanup.sgml b/doc/src/sgml/ref/pgarchivecleanup.sgml
index 635e7c7685..6cb565156f 100644
--- a/doc/src/sgml/ref/pgarchivecleanup.sgml
+++ b/doc/src/sgml/ref/pgarchivecleanup.sgml
@@ -93,6 +93,17 @@ pg_archivecleanup:  removing file "archive/00010037000E"
 
 
 
+ 
+  -b
+  
+   
+ Remove files including backup history files, whose suffix is .backup.
+ Note that when oldestkeptwalfile is a backup history file,
+ specified file is kept and only preceding WAL files and backup history files are removed.
+   
+  
+ 
+
  
   -d
   
diff --git a/src/bin/pg_archivecleanup/pg_archivecleanup.c b/src/bin/pg_archivecleanup/pg_archivecleanup.c
index 7726d05149..ae6ae76f08 100644
--- a/src/bin/pg_archivecleanup/pg_archivecleanup.c
+++ b/src/bin/pg_archivecleanup/pg_archivecleanup.c
@@ -23,6 +23,8 @@ const char *progname;
 
 /* Options and defaults */
 bool		dryrun = false;		/* are we performing a dry-run operation? */
+bool		removeBackupHistoryFile = false;	/* remove files including
+ * backup history files */
 char	   *additional_ext = NULL;	/* Extension to remove from filenames */
 
 char	   *archiveLocation;	/* where to find the archive? */
@@ -118,7 +120,8 @@ CleanupPriorWALFiles(void)
 			 * file. Note that this means files are not removed in the order
 			 * they were originally written, in case this worries you.
 			 */
-			if ((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) &&
+			if (((IsXLogFileName(walfile) || IsPartialXLogFileName(walfile)) ||
+(removeBackupHistoryFile && IsBackupHistoryFileName(walfile))) &&
 strcmp(walfile + 8, exclusiveCleanupFileName + 8) < 0)
 			{
 char		WALFilePath[MAXPGPATH * 2]; /* the file path
@@ -252,6 +255,7 @@ usage(void)
 	printf(_("Usage:\n"));
 	printf(_("  %s [OPTION]... ARCHIVELOCATION OLDESTKEPTWALFILE\n"), progname);
 	printf(_("\nOptions:\n"));
+	printf(_("  -b remove files including backup history files\n"));
 	printf(_("  -d generate debug output (verbose mode)\n"));
 	printf(_("  -n dry run, show the names of the files that would be removed\n"));
 	printf(_("  -V, --version  output version information, then exit\n"));
@@ -294,10 +298,13 @@ main(int argc, char **argv)
 		}
 	}
 
-	while ((c = getopt(argc, argv, "dnx:")) != -1)
+	while ((c = getopt(argc, argv, "bdnx:")) != -1)
 	{
 		switch (c)
 		{
+			case 'b': 			/* Remove backup history files too */
+removeBackupHistoryFile = true;
+break;
 			case 'd':			/* Debug mode */
 pg_logging_increase_verbosity();
 break;
diff --git a/src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl b/src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl
index 76321d1284..afd2ff6209 100644
--- a/src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl
+++ b/src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl
@@ -16,9 +16,14 @@ my @walfiles = (
 	'00010037000C.gz', '00010037000D',
 	'00010037000E','00010037000F.partial',);
 
+my @walfiles_with_backuphistoryfile = (
+	'000200380007.0028.backup', '000200380008',
+	'000200380009',
+	'00020038000A', '00020038000B.007C9330.backup');
+
 sub create_files
 {
-	foreach my $fn (@walfiles, 'unrelated_file')
+	foreach my $fn (@_, 'unrelated_file')
 	{
 		open my $file, '>', "$tempdir/$fn";
 		print $file 'CONTENT';
@@ -27,7 +32,7 @@ sub create_files
 	return;
 }
 
-create_files();

Re: Add LZ4 compression in pg_dump

2023-05-09 Thread Tomas Vondra
On 5/9/23 00:10, Michael Paquier wrote:
> On Mon, May 08, 2023 at 08:00:39PM +0200, Tomas Vondra wrote:
>> The LZ4Stream_write() forgot to move the pointer to the next chunk, so
>> it was happily decompressing the initial chunk over and over. A bit
>> embarrassing oversight :-(
>>
>> The custom format calls WriteDataToArchiveLZ4(), which was correct.
>>
>> The attached patch fixes this for me.
> 
> Ouch.  So this was corrupting the dumps and the compression when
> trying to write more than two chunks at once, not the decompression
> steps.  That addresses the issue here as well, thanks!

Yeah. Thanks for the report, should have been found during review.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Allow Postgres to pick an unused port to listen

2023-05-09 Thread Denis Laxalde

The documentation fails to build for me:

$ ninja docs
[1/2] Generating doc/src/sgml/postgres-full.xml with a custom command
FAILED: doc/src/sgml/postgres-full.xml
/usr/bin/python3 ../postgresql/doc/src/sgml/xmltools_dep_wrapper 
--targetname doc/src/sgml/postgres-full.xml --depfile 
doc/src/sgml/postgres-full.xml.d --tool /usr/bin/xmllint -- --nonet 
--noent --valid --path doc/src/sgml -o doc/src/sgml/postgres-full.xml 
../postgresql/doc/src/sgml/postgres.sgml
../postgresql/doc/src/sgml/postgres.sgml:685: element para: validity 
error : Element entry is not declared in para list of possible children

ninja: build stopped: subcommand failed.


Removing the <entry> tag resolves the issue:
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cd07bad3b5..f71859f710 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -684,7 +684,7 @@ include_dir 'conf.d'


 The port can be set to 0 to make Postgres pick an unused port 
number.
-The assigned port number can be then retrieved from 
postmaster.pid.
+The assigned port number can be then retrieved from 
postmaster.pid.


   
  





Re: drop table in transaction

2023-05-09 Thread Euler Taveira
On Tue, May 9, 2023, at 7:42 AM, Fabrice Chapuis wrote:
> Where in the code is the mechanism used for isolation when DROP TABLE is
> executed in a transaction?

RemoveRelations() in src/backend/commands/tablecmds.c

If you are looking for a previous layer, check ExecDropStmt().


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: psql tests hangs

2023-05-09 Thread Pavel Stehule
On Tue, May 9, 2023 at 11:07, Pavel Stehule wrote:

>
>
> On Tue, May 9, 2023 at 10:48, Daniel Gustafsson wrote:
>
>> > On 9 May 2023, at 08:52, Pavel Stehule  wrote:
>> >
>> > Hi
>> >
>> > I tried running make check-world. Now I have problems with the psql tests
>> >
>> > I had to cancel tests
>> >
>> > log:
>> >
>> > [08:46:49.828](0.038s) ok 63 - no ON_ERROR_STOP, --single-transaction
>> and multiple -c switches
>> > [08:46:49.860](0.033s) ok 64 - client-side error commits transaction,
>> no ON_ERROR_STOP and multiple -c switches
>> > [08:46:49.928](0.067s) ok 65 - \copy from with DEFAULT: exit code 0
>> > [08:46:49.929](0.001s) ok 66 - \copy from with DEFAULT: no stderr
>> > [08:46:49.930](0.001s) ok 67 - \copy from with DEFAULT: matches
>> > death by signal at
>> /home/pavel/src/postgresql.master/src/bin/psql/../../../src/test/perl/PostgreSQL/Test/Cluster.pm
>> line 3042.
>> > # Postmaster PID for node "main" is 157863
>> > ### Stopping node "main" using mode immediate
>> > # Running: pg_ctl -D
>> /home/pavel/src/postgresql.master/src/bin/psql/tmp_check/t_001_basic_main_data/pgdata
>> -m immediate stop
>> > waiting for server to shut down done
>> > server stopped
>> > # No postmaster PID for node "main"
>> > [08:47:30.361](40.431s) # Tests were run but no plan was declared and
>> done_testing() was not seen.
>> > [08:47:30.362](0.001s) # Looks like your test exited with 4 just after
>> 67.
>> > Warning: unable to close filehandle $orig_stderr properly: Broken pipe
>> during global destruction.
>>
>> I'm unable to reproduce, and this clearly works in the buildfarm and CI.
>> Did
>> you run out of disk on the volume during the test or something similar?
>> Anything interesting in the serverlogs from the tmp_check install?
>>
>
> I have enough free space on disc
>
> I don't see anything interesting in the log (this is another run)
>
> 2023-05-09 08:50:04.839 CEST [158930] 001_basic.pl LOG:  statement: COPY
>  copy_default FROM STDIN with (format 'csv', default 'placeholder');
> 2023-05-09 08:50:04.841 CEST [158930] 001_basic.pl LOG:  statement:
> SELECT * FROM copy_default
> 2023-05-09 08:50:04.879 CEST [158932] 001_basic.pl LOG:  statement:
> SELECT 1.
> 2023-05-09 08:50:04.888 CEST [158932] 001_basic.pl LOG:  statement:
> SELECT 1.
> 2023-05-09 08:50:04.898 CEST [158932] 001_basic.pl LOG:  statement:
> SELECT 1.
> 2023-05-09 08:50:28.375 CEST [158862] LOG:  received immediate shutdown
> request
> 2023-05-09 08:50:28.385 CEST [158862] LOG:  database system is shut down
>
> backtrace from perl
>
> Program received signal SIGINT, Interrupt.
> 0x7f387ecc1ade in select () from /lib64/libc.so.6
> (gdb) bt
> #0  0x7f387ecc1ade in select () from /lib64/libc.so.6
> #1  0x7f387e97363b in Perl_pp_sselect () from /lib64/libperl.so.5.36
> #2  0x7f387e917958 in Perl_runops_standard () from
> /lib64/libperl.so.5.36
> #3  0x7f387e88259d in perl_run () from /lib64/libperl.so.5.36
> #4  0x5588bceb234a in main ()
>
> Regards
>

I repeated another build with the same result.

Tested REL_15_STABLE branch without any problems.

Regards

Pavel




>
> Pavel
>>
>> --
>> Daniel Gustafsson
>>
>>


Re: Tables getting stuck at 's' state during logical replication

2023-05-09 Thread Amit Kapila
On Fri, May 5, 2023 at 7:27 PM Padmavathi G  wrote:
>
> Some background on the setup on which I am trying to carry out the upgrade:
>
> We have a pod in a kubernetes cluster which contains the postgres 11 image. 
> We are following the logical replication process for upgrade
>
> Steps followed for logical replication:
>
> 1. Created a new pod in the same kubernetes cluster with the latest postgres 
> 15 image
> 2. Created a publication  (say publication 1) in the old pod including all 
> tables in a database
> 3. Created a subscription (say subscription 1) in the new pod for the above 
> mentioned publication
> 4. When monitoring the subscription via pg_subscription_rel in the 
> subscriber, I noticed that out of 45 tables 20 were in the 'r' state and 25 
> were in 's' state and they remained in the same state for almost 2 days, 
> there was no improvement in the state. But the logs showed that the tables 
> which had 's' state also had "synchronization workers for  
> finished".
>

I think the problem happened in this step where some of the tables
remained in 's' state. It is not clear why this could happen, because the
apply worker should eventually update these relations' state to 'r'. If
this is reproducible, we can try two things to further investigate the
issue: (a) Disable and Enable the subscription once and see if that
helps; (b) try increasing the LOG level to DEBUG2 and see if we get
any useful LOG message.
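
A minimal sketch of both suggestions (the subscription name "sub1" is
hypothetical; substitute your own, and run this on the subscriber):

-- (a) disable and re-enable the subscription once
ALTER SUBSCRIPTION sub1 DISABLE;
ALTER SUBSCRIPTION sub1 ENABLE;

-- (b) raise the log level, then watch for tablesync worker messages
ALTER SYSTEM SET log_min_messages = 'debug2';
SELECT pg_reload_conf();

-- afterwards, check whether the per-table states move from 's' to 'r'
SELECT srrelid::regclass AS tab, srsubstate
FROM pg_subscription_rel
ORDER BY 1;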

-- 
With Regards,
Amit Kapila.




Re: Cleaning up array_in()

2023-05-09 Thread Alexander Lakhin

09.05.2023 06:06, Tom Lane wrote:

Alexander Lakhin  writes:

The only thing that confused me is the error message (it's not new, either):
select '{{1}}'::int[];
or even:
select '{{'::int[];
ERROR:  number of array dimensions (7) exceeds the maximum allowed (6)

Yeah, I didn't touch that, but it's pretty bogus because the first
number will always be "7" even if you wrote more than 7 left braces,
since the code errors out immediately upon finding that it's seen
too many braces.

The equivalent message in the PLs just says "number of array dimensions
exceeds the maximum allowed (6)".  I'm inclined to do likewise in
array_in, but didn't touch it here.


I think that, strictly speaking, we have no array dimensions in the string
'{{'; there are only characters (braces) during string parsing, while in
the PLs we definitely deal with real arrays, which do have dimensions.
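
For illustration, two hypothetical inputs (per the explanation above, the
reported count stays pinned at 7 no matter how many opening braces appear):

select '{{{{{{{'::int[];
-- ERROR:  number of array dimensions (7) exceeds the maximum allowed (6)
select '{{{{{{{{{{'::int[];
-- ERROR:  number of array dimensions (7) exceeds the maximum allowed (6)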


Besides that, I would like to note the following error text changes
(all of these are feasible, I think):

I'll look into whether we can improve those, unless you had a patch
in mind already?


Those messages looked more or less correct to me; I just wanted to note how
they are changing (and didn't highlight the messages that are not), but if
you see room for improvement here, please look into it (I have no good
wording to suggest yet).

Best regards,
Alexander




drop table in transaction

2023-05-09 Thread Fabrice Chapuis
Where in the code is the mechanism implemented that provides isolation when
DROP TABLE is executed inside a transaction?
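
For context, a minimal sketch of the behavior the question is about
(assuming a table t exists; this illustrates the question, not where the
code lives):

BEGIN;
DROP TABLE t;  -- takes AccessExclusiveLock; physical file removal is deferred
ROLLBACK;      -- the table and its storage are still intact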

Thanks for your help

Fabrice


Re: Perform streaming logical transactions by background workers and parallel apply

2023-05-09 Thread Amit Kapila
On Tue, May 9, 2023 at 7:50 AM Masahiko Sawada  wrote:
>
> On Mon, May 8, 2023 at 8:09 PM Amit Kapila  wrote:
> >
> >
> > I think it is only possible for the leader apply worker to try to
> > receive the error message from an error queue after your 0002 patch.
> > Because another place already detached from the queue before stopping
> > the parallel apply workers. So, I combined both the patches and
> > changed a few comments and a commit message. Let me know what you
> > think of the attached.
>
> I have one comment on the detaching error queue part:
>
> +   /*
> +* Detach from the error_mq_handle for the parallel apply worker 
> before
> +* stopping it. This prevents the leader apply worker from trying to
> +* receive the message from the error queue that might already
> be detached
> +* by the parallel apply worker.
> +*/
> +   shm_mq_detach(winfo->error_mq_handle);
> +   winfo->error_mq_handle = NULL;
>
> In pa_detach_all_error_mq(), we try to detach error queues of all
> workers in the pool. I think we should check if the queue is already
> detached (i.e. is NULL) there. Otherwise, we will end up with a SEGV if an
> error happens after detaching the error queue and before removing the
> worker from the pool.
>

Agreed, I have made this change, added the same check at one other
place for the sake of consistency, and pushed the patch.

-- 
With Regards,
Amit Kapila.




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thank you for reviewing! PSA new version.

> 
> General.
> 
> 1. pg_dump option is documented to the user.
> 
> I'm not sure about exposing the new pg_dump
> --logical-replication-slots-only option to the user.
> 
> I thought this pg_dump option was intended only to be called
> *internally* by the pg_upgrade.
> But, this patch is also documenting the new option for the user (in
> case they want to call it independently?)
> 
> Maybe exposing it  is OK, but if you do that then I thought perhaps
> there should also be some additional pg_dump tests just for this
> option (i.e. tested independently of the pg_upgrade)

Right, I had written the documentation for the moment, but it should not be
documented if it is not exposed. Removed from the doc.

> Commit message
> 
> 2.
> For pg_upgrade, when '--include-logical-replication-slots' is
> specified, it executes
> pg_dump with the new "--logical-replication-slots-only" option and
> restores from the
> dump. Apart from restoring schema, pg_resetwal must not be called
> after restoring
> replication slots. This is because the command discards WAL files and
> starts from a
> new segment, even if they are required by replication slots. This
> leads to an ERROR:
> "requested WAL segment XXX has already been removed". To avoid this,
> replication slots
> are restored at a different time than other objects, after running 
> pg_resetwal.
> 
> ~~
> 
> The "Apart from" sentence maybe could do with some rewording. I
> noticed there is a code comment (below fragment) that says the same as
> this, but more clearly. Maybe it is better to use that code-comment
> wording in the commit message.
> 
> + * XXX We cannot dump replication slots at the same time as the schema
> + * dump because we need to separate the timing of restoring
> + * replication slots and other objects. Replication slots, in
> + * particular, should not be restored before executing the pg_resetwal
> + * command because it will remove WALs that are required by the slots.

Changed.
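
For background on the ordering constraint above: each logical slot pins WAL
from its restart_lsn onward, which is exactly what pg_resetwal discards. A
quick way to see the pinned positions (illustration only, via the standard
catalog view):

SELECT slot_name, plugin, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';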

> src/bin/pg_dump/pg_dump.c
> 
> 3. main
> 
> + if (dopt.logical_slots_only && !dopt.binary_upgrade)
> + pg_fatal("options --logical-replication-slots-only requires option
> --binary-upgrade");
> +
> + if (dopt.logical_slots_only && dopt.dataOnly)
> + pg_fatal("options --logical-replication-slots-only and
> -a/--data-only cannot be used together");
> + if (dopt.logical_slots_only && dopt.schemaOnly)
> + pg_fatal("options --logical-replication-slots-only and
> -s/--schema-only cannot be used together");
> +
> 
> Consider if it might be simpler to combine all those
> dopt.logical_slots_only checks.
> 
> SUGGESTION
> 
> if (dopt.logical_slots_only)
> {
> if (!dopt.binary_upgrade)
> pg_fatal("options --logical-replication-slots-only requires
> option --binary-upgrade");
> 
> if (dopt.dataOnly)
> pg_fatal("options --logical-replication-slots-only and
> -a/--data-only cannot be used together");
> if (dopt.schemaOnly)
> pg_fatal("options --logical-replication-slots-only and
> -s/--schema-only cannot be used together");
> }

Right, fixed.

> 4. getLogicalReplicationSlots
> 
> + /* Check whether we should dump or not */
> + if (fout->remoteVersion < 16 || !dopt->logical_slots_only)
> + return;
> 
> I'm not sure if this check is necessary. Given the way this function
> is called, is it possible for this check to fail? Maybe that quick
> exit would be better coded as an Assert?

I think the version check is still needed because it has not been done yet
at this point. (Actually, I'm not sure the restriction is needed, but I will
keep it for now.) As for dopt->logical_slots_only, I agreed to remove that.

> 5. dumpLogicalReplicationSlot
> 
> +dumpLogicalReplicationSlot(Archive *fout,
> +const LogicalReplicationSlotInfo *slotinfo)
> +{
> + DumpOptions *dopt = fout->dopt;
> +
> + if (!dopt->logical_slots_only)
> + return;
> 
> (Similar to the previous comment). Is it even possible to arrive here
> when dopt->logical_slots_only is false. Maybe that quick exit would be
> better coded as an Assert?

I think it is not possible, so I changed it to an Assert().

> 6.
> + PQExpBuffer query = createPQExpBuffer();
> + char*slotname = pg_strdup(slotinfo->dobj.name);
> 
> I wondered if it was really necessary to strdup/free this slotname.
> And if it is, why don't you do this for the slotinfo->plugin
> field?

This was leftover debris from my testing. Removed.

> src/bin/pg_upgrade/check.c
> 
> 7. check_and_dump_old_cluster
> 
>   /* Extract a list of databases and tables from the old cluster */
>   get_db_and_rel_infos(&old_cluster);
> + get_logical_slot_infos(&old_cluster);
> 
> Is it correct to associate this new call with that existing comment
> about "databases and tables"?

Added a comment.

> 8. check_new_cluster
> 
> @@ -188,6 +190,7 @@ void
>  check_new_cluster(void)
>  {
>   get_db_and_rel_infos(&new_cluster);
> + get_logical_slot_infos(&new_cluster);
> 
>   check_new_cluster_is_empty();
> 
> @@ -210,6 +213,9 @@ check_new_cluster(void)
>   

Re: base backup vs. concurrent truncation

2023-05-09 Thread Aleksander Alekseev
Hi Robert,

> I admit I haven't done the legwork to nail down a test
> case where everything comes together just right to show user-visible
> breakage, but your success in finding one where it doesn't is no proof
> of anything.

Respectfully, what made you think this was my intention?

Quite the opposite, personally I am inclined to think that the problem
does exist. In order to fix it however we need a test that reliably
reproduces it first. Otherwise there is no way to figure out whether
the fix was correct or not.

What the experiment showed is that the test scenario you initially
described is probably the wrong one, for reasons yet to be understood,
and we need to come up with a better one.

-- 
Best regards,
Aleksander Alekseev




Re: psql tests hangs

2023-05-09 Thread Pavel Stehule
On Tue, May 9, 2023 at 10:48, Daniel Gustafsson  wrote:

> > On 9 May 2023, at 08:52, Pavel Stehule  wrote:
> >
> > Hi
> >
> > I tried to run make check-world. Now I have problems with the psql tests.
> >
> > I had to cancel the tests.
> >
> > log:
> >
> > [08:46:49.828](0.038s) ok 63 - no ON_ERROR_STOP, --single-transaction
> and multiple -c switches
> > [08:46:49.860](0.033s) ok 64 - client-side error commits transaction, no
> ON_ERROR_STOP and multiple -c switches
> > [08:46:49.928](0.067s) ok 65 - \copy from with DEFAULT: exit code 0
> > [08:46:49.929](0.001s) ok 66 - \copy from with DEFAULT: no stderr
> > [08:46:49.930](0.001s) ok 67 - \copy from with DEFAULT: matches
> > death by signal at
> /home/pavel/src/postgresql.master/src/bin/psql/../../../src/test/perl/PostgreSQL/Test/Cluster.pm
> line 3042.
> > # Postmaster PID for node "main" is 157863
> > ### Stopping node "main" using mode immediate
> > # Running: pg_ctl -D
> /home/pavel/src/postgresql.master/src/bin/psql/tmp_check/t_001_basic_main_data/pgdata
> -m immediate stop
> > waiting for server to shut down done
> > server stopped
> > # No postmaster PID for node "main"
> > [08:47:30.361](40.431s) # Tests were run but no plan was declared and
> done_testing() was not seen.
> > [08:47:30.362](0.001s) # Looks like your test exited with 4 just after
> 67.
> > Warning: unable to close filehandle $orig_stderr properly: Broken pipe
> during global destruction.
>
> I'm unable to reproduce, and this clearly works in the buildfarm and CI.
> Did
> you run out of disk on the volume during the test or something similar?
> Anything interesting in the server logs from the tmp_check install?
>

I have enough free space on disk

I don't see anything interesting in the log (this is from another run)

2023-05-09 08:50:04.839 CEST [158930] 001_basic.pl LOG:  statement: COPY
 copy_default FROM STDIN with (format 'csv', default 'placeholder');
2023-05-09 08:50:04.841 CEST [158930] 001_basic.pl LOG:  statement: SELECT
* FROM copy_default
2023-05-09 08:50:04.879 CEST [158932] 001_basic.pl LOG:  statement: SELECT
1.
2023-05-09 08:50:04.888 CEST [158932] 001_basic.pl LOG:  statement: SELECT
1.
2023-05-09 08:50:04.898 CEST [158932] 001_basic.pl LOG:  statement: SELECT
1.
2023-05-09 08:50:28.375 CEST [158862] LOG:  received immediate shutdown
request
2023-05-09 08:50:28.385 CEST [158862] LOG:  database system is shut down

Backtrace from Perl:

Program received signal SIGINT, Interrupt.
0x7f387ecc1ade in select () from /lib64/libc.so.6
(gdb) bt
#0  0x7f387ecc1ade in select () from /lib64/libc.so.6
#1  0x7f387e97363b in Perl_pp_sselect () from /lib64/libperl.so.5.36
#2  0x7f387e917958 in Perl_runops_standard () from
/lib64/libperl.so.5.36
#3  0x7f387e88259d in perl_run () from /lib64/libperl.so.5.36
#4  0x5588bceb234a in main ()

Regards

Pavel









>
> --
> Daniel Gustafsson
>
>


Re: psql tests hangs

2023-05-09 Thread Daniel Gustafsson
> On 9 May 2023, at 08:52, Pavel Stehule  wrote:
> 
> Hi
> 
> I tried to run make check-world. Now I have problems with the psql tests.
> 
> I had to cancel the tests.
> 
> log:
> 
> [08:46:49.828](0.038s) ok 63 - no ON_ERROR_STOP, --single-transaction and 
> multiple -c switches
> [08:46:49.860](0.033s) ok 64 - client-side error commits transaction, no 
> ON_ERROR_STOP and multiple -c switches
> [08:46:49.928](0.067s) ok 65 - \copy from with DEFAULT: exit code 0
> [08:46:49.929](0.001s) ok 66 - \copy from with DEFAULT: no stderr
> [08:46:49.930](0.001s) ok 67 - \copy from with DEFAULT: matches
> death by signal at 
> /home/pavel/src/postgresql.master/src/bin/psql/../../../src/test/perl/PostgreSQL/Test/Cluster.pm
>  line 3042.
> # Postmaster PID for node "main" is 157863
> ### Stopping node "main" using mode immediate
> # Running: pg_ctl -D 
> /home/pavel/src/postgresql.master/src/bin/psql/tmp_check/t_001_basic_main_data/pgdata
>  -m immediate stop
> waiting for server to shut down done
> server stopped
> # No postmaster PID for node "main"
> [08:47:30.361](40.431s) # Tests were run but no plan was declared and 
> done_testing() was not seen.
> [08:47:30.362](0.001s) # Looks like your test exited with 4 just after 67.
> Warning: unable to close filehandle $orig_stderr properly: Broken pipe during 
> global destruction.

I'm unable to reproduce, and this clearly works in the buildfarm and CI.  Did
you run out of disk on the volume during the test or something similar?
Anything interesting in the server logs from the tmp_check install?

--
Daniel Gustafsson





Re: Order changes in PG16 since ICU introduction

2023-05-09 Thread Alvaro Herrera
On 2023-Apr-24, Peter Eisentraut wrote:

> The GUC settings lc_collate and lc_ctype are from a time when those locale
> settings were cluster-global.  When we made those locale settings
> per-database (PG 8.4), we kept them as read-only.  As of PG 15, you can use
> ICU as the per-database locale provider, so what is being attempted in the
> above example is already meaningless before PG 16, since you need to look
> into pg_database to find out what is really happening.
> 
> I think we should just remove the GUC parameters lc_collate and lc_ctype.

I agree with removing these in v16, since they are going to become more
meaningless and confusing.
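
For reference, the authoritative per-database values live in pg_database; a
minimal illustration (column names as of v15/v16):

SELECT datname, datlocprovider, datcollate, datctype, daticulocale
FROM pg_database;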

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: [DOC] Update ALTER SUBSCRIPTION documentation

2023-05-09 Thread Masahiko Sawada
On Tue, May 9, 2023 at 3:40 PM Amit Kapila  wrote:
>
> On Mon, May 8, 2023 at 1:51 PM Masahiko Sawada  wrote:
> >
> > Apart from the documentation change, given that setting slot_name =
> > NONE always requires the subscription to be disabled beforehand,
> > does it make sense to change ALTER SUBSCRIPTION SET so that we disable
> > the subscription when setting slot_name = NONE? Setting slot_name to a
> > valid slot name doesn't enable the subscription, though.
> >
>
> I think this is worth considering. Offhand, I don't see any problem
> with this idea but users may not like the automatic disabling of
> subscriptions along with setting slot_name=NONE. It would be better to
> discuss this in a separate thread to seek the opinion of others.
>

Agreed.
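
For readers following along, a minimal sketch of the sequence under
discussion (the subscription name sub1 is hypothetical; the proposal would
fold the DISABLE step into the SET):

ALTER SUBSCRIPTION sub1 DISABLE;
ALTER SUBSCRIPTION sub1 SET (slot_name = NONE);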

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Improve list manipulation in several places

2023-05-09 Thread Alvaro Herrera
On 2023-May-08, Ranier Vilela wrote:

On Mon, May 8, 2023 at 14:26, Alvaro Herrera  wrote:
> 
> > The problem I see is that each of these new functions has a single
> > caller, and the only one that looks like it could have a performance
> > advantage is list_copy_move_nth_to_head() (which is the weirdest of the
> > lot).  I'm inclined not to have any of these single-use functions unless
> > a performance case can be made for them.
> >
> I think you missed list_nth_xid. It makes perfect sense to exist.

I saw that one; it's just syntactic sugar, just like list_nth_int and
list_nth_oid, except it has only one possible callsite instead of a
dozen like those others.  I see no harm in that function, but no
practical advantage to it either.  Xid lists are a very fringe feature,
there being exactly one place in the whole server that uses them.  

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-09 Thread Drouvot, Bertrand

Hi,

On 5/9/23 8:02 AM, Amit Kapila wrote:

On Mon, May 8, 2023 at 1:45 PM Drouvot, Bertrand
 wrote:




Why not initialize the cascading standby node just before the standby
promotion test: "Test standby promotion and logical decoding behavior
after the standby gets promoted."? That way we will avoid any unknown
side-effects of cascading standby and it will anyway look more logical
to initialize it where the test needs it.



Yeah, that's even better. I moved the physical slot creation on the standby
and the cascading standby initialization to where they are "strictly" needed
in the attached V2.

I also ensured that hot_standby_feedback is set to on on the cascading
standby, to be on the safe side.
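
As a side note, whether a slot on the standby is pinning WAL can be checked
with a query along these lines (illustration only):

SELECT slot_name, slot_type, restart_lsn
FROM pg_replication_slots;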

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 0292d14a62b72b38197d434d85ee2c6a7f2cec00 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Tue, 9 May 2023 06:43:59 +
Subject: [PATCH v2] fix retaining WAL test

---
 .../t/035_standby_logical_decoding.pl | 36 +--
 1 file changed, 18 insertions(+), 18 deletions(-)
 100.0% src/test/recovery/t/

diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index ad1def2474..2b4a688330 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -276,20 +276,6 @@ $node_standby->append_conf('postgresql.conf',
max_replication_slots = 5]);
 $node_standby->start;
 $node_primary->wait_for_replay_catchup($node_standby);
-$node_standby->safe_psql('testdb', qq[SELECT * FROM pg_create_physical_replication_slot('$standby_physical_slotname');]);
-
-###
-# Initialize cascading standby node
-###
-$node_standby->backup($backup_name);
-$node_cascading_standby->init_from_backup(
-   $node_standby, $backup_name,
-   has_streaming => 1,
-   has_restoring => 1);
-$node_cascading_standby->append_conf('postgresql.conf',
-   qq[primary_slot_name = '$standby_physical_slotname']);
-$node_cascading_standby->start;
-$node_standby->wait_for_replay_catchup($node_cascading_standby, $node_primary);
 
 ###
 # Initialize subscriber node
@@ -503,9 +489,6 @@ check_slots_conflicting_status(1);
 # Verify that invalidated logical slots do not lead to retaining WAL.
 ##
 
-# Wait for the cascading standby to catchup before removing the WAL file(s)
-$node_standby->wait_for_replay_catchup($node_cascading_standby, $node_primary);
-
 # Get the restart_lsn from an invalidated slot
 my $restart_lsn = $node_standby->safe_psql('postgres',
"SELECT restart_lsn from pg_replication_slots WHERE slot_name = 
'vacuum_full_activeslot' and conflicting is true;"
@@ -777,9 +760,26 @@ $node_standby->reload;
 $node_primary->psql('postgres', q[CREATE DATABASE testdb]);
 $node_primary->safe_psql('testdb', qq[CREATE TABLE decoding_test(x integer, y text);]);
 
-# Wait for the standby to catchup before creating the slots
+# Wait for the standby to catchup before initializing the cascading standby
 $node_primary->wait_for_replay_catchup($node_standby);
 
+# Create a physical replication slot on the standby.
+# Keep this step after the "Verify that invalidated logical slots do not lead
+# to retaining WAL" test (as the physical slot on the standby could prevent the
+# WAL file removal).
+$node_standby->safe_psql('testdb', qq[SELECT * FROM pg_create_physical_replication_slot('$standby_physical_slotname');]);
+
+# Initialize cascading standby node
+$node_standby->backup($backup_name);
+$node_cascading_standby->init_from_backup(
+   $node_standby, $backup_name,
+   has_streaming => 1,
+   has_restoring => 1);
+$node_cascading_standby->append_conf('postgresql.conf',
+   qq[primary_slot_name = '$standby_physical_slotname'
+  hot_standby_feedback = on]);
+$node_cascading_standby->start;
+
 # create the logical slots
 create_logical_slots($node_standby, 'promotion_');
 
-- 
2.34.1



psql tests hangs

2023-05-09 Thread Pavel Stehule
Hi

I tried to run make check-world. Now I have problems with the psql tests.

I had to cancel the tests.

log:

[08:46:49.828](0.038s) ok 63 - no ON_ERROR_STOP, --single-transaction and
multiple -c switches
[08:46:49.860](0.033s) ok 64 - client-side error commits transaction, no
ON_ERROR_STOP and multiple -c switches
[08:46:49.928](0.067s) ok 65 - \copy from with DEFAULT: exit code 0
[08:46:49.929](0.001s) ok 66 - \copy from with DEFAULT: no stderr
[08:46:49.930](0.001s) ok 67 - \copy from with DEFAULT: matches
death by signal at
/home/pavel/src/postgresql.master/src/bin/psql/../../../src/test/perl/PostgreSQL/Test/Cluster.pm
line 3042.
# Postmaster PID for node "main" is 157863
### Stopping node "main" using mode immediate
# Running: pg_ctl -D
/home/pavel/src/postgresql.master/src/bin/psql/tmp_check/t_001_basic_main_data/pgdata
-m immediate stop
waiting for server to shut down done
server stopped
# No postmaster PID for node "main"
[08:47:30.361](40.431s) # Tests were run but no plan was declared and
done_testing() was not seen.
[08:47:30.362](0.001s) # Looks like your test exited with 4 just after 67.
Warning: unable to close filehandle $orig_stderr properly: Broken pipe
during global destruction.

I use Fedora 38

Regards

Pavel


Re: [DOC] Update ALTER SUBSCRIPTION documentation

2023-05-09 Thread Amit Kapila
On Mon, May 8, 2023 at 1:51 PM Masahiko Sawada  wrote:
>
> Apart from the documentation change, given that setting slot_name =
> NONE always requires the subscription to be disabled beforehand,
> does it make sense to change ALTER SUBSCRIPTION SET so that we disable
> the subscription when setting slot_name = NONE? Setting slot_name to a
> valid slot name doesn't enable the subscription, though.
>

I think this is worth considering. Offhand, I don't see any problem
with this idea but users may not like the automatic disabling of
subscriptions along with setting slot_name=NONE. It would be better to
discuss this in a separate thread to seek the opinion of others.

-- 
With Regards,
Amit Kapila.




Re: WAL Insertion Lock Improvements

2023-05-09 Thread Michael Paquier
On Tue, May 09, 2023 at 02:10:20PM +0900, Michael Paquier wrote:
> Should we split this patch into two parts, as they aim at tackling two
> different cases then?  One for LWLockConflictsWithVar() and
> LWLockReleaseClearVar() which are the straight-forward pieces
> (using one pg_atomic_write_u64() in LWLockUpdateVar instead), then
> a second for LWLockUpdateVar()?

I have been studying that a bit more, and I'd like to take this
suggestion back.  Apologies for the noise.
--
Michael




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-09 Thread Amit Kapila
On Mon, May 8, 2023 at 1:45 PM Drouvot, Bertrand
 wrote:
>
> On 5/8/23 4:42 AM, Amit Kapila wrote:
> > On Sat, May 6, 2023 at 9:33 PM Drouvot, Bertrand
> >  wrote:
> >>
> >> On 5/6/23 3:28 PM, Amit Kapila wrote:
> >>> On Sat, May 6, 2023 at 1:52 PM Drouvot, Bertrand
> >>>  wrote:
> >>
> >>> Next steps:
> >>> =
> >>> 1. The first thing is we should verify this theory by adding some LOG
> >>> in KeepLogSeg() to see if the _logSegNo is reduced due to the value
> >>> returned by XLogGetReplicationSlotMinimumLSN().
> >>
> >> Yeah, will do that early next week.
>
> It's done with the following changes:
>
> https://github.com/bdrouvot/postgres/commit/79e1bd9ab429a22f876b9364eb8a0da2dacaaef7#diff-c1cb3ab2a19606390c1a7ed00ffe5a45531702ca5faf999d401c548f8951c65bL7454-R7514
>
> With that in place, there is one failing test here: 
> https://cirrus-ci.com/task/5173216310722560
>
> Where we can see:
>
> 2023-05-08 07:42:56.301 UTC [18038][checkpointer] LOCATION:  
> UpdateMinRecoveryPoint, xlog.c:2500
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  0: 
> CreateRestartPoint: After XLByteToSeg RedoRecPtr is 0/498, _logSegNo is 4
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  
> CreateRestartPoint, xlog.c:7271
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  0: KeepLogSeg: 
> segno changed to 4 due to XLByteToSeg
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  KeepLogSeg, 
> xlog.c:7473
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  0: KeepLogSeg: 
> segno changed to 3 due to XLogGetReplicationSlotMinimumLSN()
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  KeepLogSeg, 
> xlog.c:7483
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  0: 
> CreateRestartPoint: After KeepLogSeg RedoRecPtr is 0/498, endptr is 
> 0/4000148, _logSegNo is 3
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  
> CreateRestartPoint, xlog.c:7284
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  0: BDT1 about to 
> call RemoveOldXlogFiles in CreateRestartPoint
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION:  
> CreateRestartPoint, xlog.c:7313
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG:  0: attempting to 
> remove WAL segments older than log file 0002
>
> So the suspicion about XLogGetReplicationSlotMinimumLSN() was correct 
> (_logSegNo moved from
> 4 to 3 due to XLogGetReplicationSlotMinimumLSN()).
>
> >> What about postponing the physical slot creation on the standby and the
> >> cascading standby node initialization after this test?
> >>
> >
> > Yeah, that is also possible. But, I have a few questions regarding
> > that: (a) There doesn't seem to be a physical slot on cascading
> > standby, if I am missing something, can you please point me to the
> > relevant part of the test?
>
> That's right. There is a physical slot only on the primary and on the standby.
>
> What I meant up-thread is to also postpone the cascading standby node 
> initialization
> after this test (once the physical slot on the standby is created).
>
> Please find attached a proposal doing so.
>
> > (b) Which test is currently dependent on
> > the physical slot on standby?
>
> Not a test but the cascading standby initialization with the 
> "primary_slot_name" parameter.
>
> Also, I now think it's better to have the physical slot on the standby +
> hot_standby_feedback set to on on the cascading standby (coming from the
> standby backup).
> cascading standby (coming from the standby backup).
>
> The idea is to avoid any risk of logical slot invalidation on the
> cascading standby in the standby promotion test.
>

Why not initialize the cascading standby node just before the standby
promotion test: "Test standby promotion and logical decoding behavior
after the standby gets promoted."? That way we will avoid any unknown
side-effects of cascading standby and it will anyway look more logical
to initialize it where the test needs it.

-- 
With Regards,
Amit Kapila.