Re: Eagerly evict bulkwrite strategy ring
Hi, On Fri, 12 Sept 2025 at 02:17, Melanie Plageman wrote: > > While looking at this, I decided it didn't make sense to have sweep > variables in the strategy object, so I've actually changed the way > StrategySweepNextBuffer() works. There was also an issue with the > sweep -- it could run into and past the starting buffer. So, I had to > change it. Take a look at the new method and let me know what you > think. It looks good to me. StrategySweepNextBuffer()'s comment is no longer correct, though. -- Regards, Nazir Bilal Yavuz Microsoft
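For readers who don't have the patch at hand, the hazard being described is the usual one for any circular scan: if the sweep does not remember where it started, it can lap the ring and visit buffers twice. A minimal sketch of the kind of guard involved, using invented names rather than the actual StrategySweepNextBuffer() code:

/*
 * Hypothetical bounded ring sweep, for illustration only; the real
 * StrategySweepNextBuffer() in the patch is structured differently.
 */
typedef struct SweepState
{
	int		nbuffers;		/* size of the strategy ring */
	int		start_idx;		/* slot where the sweep began */
	int		cur_idx;		/* slot returned most recently */
} SweepState;

/* Return the next ring slot, or -1 once the sweep wraps back to its start. */
static int
sweep_next_slot(SweepState *sweep)
{
	int		next = (sweep->cur_idx + 1) % sweep->nbuffers;

	if (next == sweep->start_idx)
		return -1;				/* lapped the ring: stop before the starting buffer */

	sweep->cur_idx = next;
	return next;
}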
RE: Conflict detection for update_deleted in logical replication
On Tuesday, September 2, 2025 6:00 PM shveta malik wrote: > > On Mon, Sep 1, 2025 at 5:45 PM shveta malik > wrote: > > > > > > > > Here is V70 patch set. > > > > > > > The patch v70-0001 looks good to me. Verified, all the old issues are > > resolved. > > > > Will resume review of v70-0002 now. > > > > Please find a few comments on v70-0002: > > 1) > - * Note: Retention won't be resumed automatically. The user must manually > - * disable retain_dead_tuples and re-enable it after confirming that the > - * replication slot maintained by the launcher has been dropped. > + * The retention will resume automatically if the worker has confirmed > + that the > + * retention duration is now within the max_retention_duration. > > Do we need this comment atop stop as it does not directly concern stop? Isn't > the details regarding RDT_RESUME_CONFLICT_INFO_RETENTION > in the file-header section sufficient? Agreed. I removed this comment. > > 2) > + /* Stop retention if not yet */ > + if (MySubscription->retentionactive) > + { > + rdt_data->phase = RDT_STOP_CONFLICT_INFO_RETENTION; > > - /* process the next phase */ > - process_rdt_phase_transition(rdt_data, false); > + /* process the next phase */ > + process_rdt_phase_transition(rdt_data, false); } > + > + reset_retention_data_fields(rdt_data); > > should_stop_conflict_info_retention() does reset_retention_data_fields > everytime when wait-time exceeds the limit, and when it actually stops i.e. > calls stop_conflict_info_retention through phase change; the stop function > also does reset_retention_data_fields and calls process_rdt_phase_transition. > Can we optimize this code part by consolidating the > reset_retention_data_fields() and > process_rdt_phase_transition() calls into > should_stop_conflict_info_retention() itself, eliminating redundancy? Agreed. I improved the code here. > > 3) > Shall we update 035_conflicts.pl to have a resume test by setting > max_retention_duration to 0 after stop-retention test? Added. > > 4) > + subscription. The retention will be automatically resumed > once at least > + one apply worker confirms that the retention duration is within the > + specified limit, or a new subscription is created with > + retain_dead_tuples = true, or the user manually > + re-enables retain_dead_tuples. > > Shall we rephrase it slightly to: > > Retention will automatically resume when at least one apply worker confirms > that the retention duration is within the specified limit, or when a new > subscription is created with retain_dead_tuples = true. > Alternatively, retention can be manually resumed by re-enabling > retain_dead_tuples. Changed as suggested. Here is V71 patch set which addressed above comments and [1]. [1] https://www.postgresql.org/message-id/CAJpy0uC8w442wGEJ0gyR23ojAyvd-s_g-m8fUbixy0V9yOmrcg%40mail.gmail.com Best Regards, Hou zj v71-0002-Add-a-dead_tuple_retention_active-column-in-pg_s.patch Description: v71-0002-Add-a-dead_tuple_retention_active-column-in-pg_s.patch v71-0001-Allow-conflict-relevant-data-retention-to-resume.patch Description: v71-0001-Allow-conflict-relevant-data-retention-to-resume.patch
Re: postmaster uses more CPU in 18 beta1 with io_method=io_uring
Hi Andres / Robert, On Mon, Sep 8, 2025 at 5:55 PM Andres Freund wrote: > > Hi, > > On 2025-09-06 09:12:19 -0400, Robert Treat wrote: [..] > > > [..], but 6.1 is the LTS kernel, so plenty of people > > > are going to hit those regressions with io_uring io_method, won't > > > they? > > I doubt it, but who knows. RHEL 8.x won't have it (RH KB [1] says "RHEL 8.x: The addition to RHEL8 was being tracked in private Bug 1881561 - Add io_uring support. Unfortunately, it has been decided that io_uring support will not be enabled in RHEL8." RHEL 9.x seems to be all based on 5.14.x (so much below 6.5.x) and states that uring is in Tech Preview there and is disabled, but it can be enabled via sysctl. Hard to tell what they will backpatch into 5.14.x there. So if anywhere, I would speculate it would be RHEL9 (?), therefore 5.14.x (+their custom back patches). > > > I can try to prepare a patch, please just let me know. > > Yes, please do. Attached. > If they just upgrade in-place, they won't use io_uring. And they won't simply > use io_uring with this large max_connections without also tuning the file > descriptor limits... Business as usual, just another obstacle... -J. v1-0001-aio-warn-user-if-combined-io_uring-memory-mapping.patch Description: Binary data
Add support for entry counting in pgstats
Hi all, One obstacle to the move of pg_stat_statements to the shmem pgstats that we have in core is that there is no easy way to track the total number of entries that are stored in the shared hash table of pgstats for variable-sized stats kinds. PGSS currently uses hash_get_num_entries() as a cheap way to get the number of entries stored in its hash table, but moving PGSS to the in-core stats facility means that we need a different approach to count these entries. I have looked at what can be done, and finished with a rather simple approach, as we have only one code path when a new entry is inserted, and one code path when an entry is removed when the refcount of an entry reaches 0 (includes resets, drops for kind matches, etc.). The counters are stored in a uint64 atomic, one for each stats kind in PgStat_ShmemControl, set only if the option is enabled. I am wondering how much protection we want for the reads, and chose the lightest "lossy" approach, leaving it to each stats kind to decide what to do when deallocating entries. Hence, a stats kind may want to protect the entry deallocations with an extra lock, but that's not required either depending on how aggressive removals should happen. Please find attached a patch that adds an option for variable-sized stats kinds, giving them the possibility to track the number of entries in the shared hash table. The option is named track_counts. The patch has some coverage added in the test module injection_points, in its TAP test. Thoughts are welcome. -- Michael From 0c924a72f586c385f6ab1a22174b9c3b1cf2dd08 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Fri, 12 Sep 2025 16:09:51 +0900 Subject: [PATCH 1/2] Add support for entry counting in pgstats Stats kinds can set an option call track_counts, that will make pgstats track the number of entries that exist in the shared hashtable. As there is only one code path where a new entry is added, and one code path where entries are removed, the count tracking is rather straight-forward. Reads of these counters are optimistic, and may change across two calls. A use case of this facility is PGSS, where we need to be able to cap the number of entries that would be stored in the shared hashtable, based on its "max" GUC. --- src/include/utils/pgstat_internal.h | 26 + src/backend/utils/activity/pgstat.c | 6 +++ src/backend/utils/activity/pgstat_shmem.c | 47 +-- 3 files changed, 67 insertions(+), 12 deletions(-) diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h index 6cf8f633..8762a06081f8 100644 --- a/src/include/utils/pgstat_internal.h +++ b/src/include/utils/pgstat_internal.h @@ -216,6 +216,12 @@ typedef struct PgStat_KindInfo /* Should stats be written to the on-disk stats file? */ bool write_to_file:1; + /* + * Should the number of entries be tracked? For variable-numbered stats, + * to update its PgStat_ShmemControl.entry_counts. + */ + bool track_counts:1; + /* * The size of an entry in the shared stats hash table (pointed to by * PgStatShared_HashEntry->body). For fixed-numbered statistics, this is @@ -485,6 +491,15 @@ typedef struct PgStat_ShmemControl */ pg_atomic_uint64 gc_request_count; + /* + * Counters for the number of entries associated to a single stats kind + * that uses variable-numbered objects stored in the shared hash table. + * These counters can be enabled on a per-kind basis, when track_counts is + * set. This counter is incremented each time a new entry is created (not + * when reused), and each time an entry is dropped.
+ */ + pg_atomic_uint64 entry_counts[PGSTAT_KIND_MAX]; + /* * Stats data for fixed-numbered objects. */ @@ -910,6 +925,17 @@ pgstat_get_entry_data(PgStat_Kind kind, PgStatShared_Common *entry) return ((char *) (entry)) + off; } +/* + * Returns the number of entries counted for a stats kind. + */ +static inline uint64 +pgstat_get_entry_count(PgStat_Kind kind) +{ + Assert(pgstat_get_kind_info(kind)->track_counts); + + return pg_atomic_read_u64(&pgStatLocal.shmem->entry_counts[kind - 1]); +} + /* * Returns a pointer to the shared memory area of custom stats for * fixed-numbered statistics. diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c index f8e91484e36b..f6784ad6cd15 100644 --- a/src/backend/utils/activity/pgstat.c +++ b/src/backend/utils/activity/pgstat.c @@ -1490,6 +1490,12 @@ pgstat_register_kind(PgStat_Kind kind, const PgStat_KindInfo *kind_info) ereport(ERROR, (errmsg("custom cumulative statistics property is invalid"), errhint("Custom cumulative statistics require a shared memory size for fixed-numbered objects."))); + if (kind_info->track_counts) + { + ereport(ERROR, + (errmsg("custom cumulative statistics property is invalid"), + errhint("Custom cumulative statistics cannot use counter tracking for fixed-numbered objects."))); + } } /* diff --git a/src/backend/ut
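To make the PGSS-style use case from the proposal a bit more concrete: with track_counts set, a variable-numbered stats kind could consult the counter before creating a new entry. The sketch below is purely illustrative — the kind id, the max setting, and the eviction helper are invented; only track_counts and pgstat_get_entry_count() come from the patch above, and their final shape may still change:

/* Hypothetical consumer of the new counter; not part of the patch. */
#define MY_STATS_KIND	((PgStat_Kind) 130)		/* invented custom kind id */

static void
my_stats_upsert(uint64 objid, uint64 max_entries)
{
	/*
	 * Optimistic read: the value may move underneath us, which is fine for
	 * a soft cap such as pg_stat_statements.max.
	 */
	if (pgstat_get_entry_count(MY_STATS_KIND) >= max_entries)
		my_stats_evict_one();	/* invented deallocation path */

	/* ... then create or update the entry for objid as usual ... */
}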
Re: someone else to do the list of acknowledgments
516 contributors this cycle vs 462 last cycle. When name accents/capitalization differed, I went with the string used in the previous year. Some of the names that come from bug reports and doc fixes are just first names, and the discussion threads shed no light on the full name of the person. I have a git log that has been "redacted", which is to say that every name attribution has been screened out, as well as names of known past contributors, and various other name-looking things that aren't people (ex. Microsoft Windows) screened out. Even with all that automation it is still over 50k lines, so if you want to double check that, make sure you have a comfy chair. On Tue, Sep 9, 2025 at 12:07 PM jian he wrote: > On Sun, Aug 17, 2025 at 12:40 PM Corey Huinker > wrote: > > > > On Sat, Aug 16, 2025 at 11:58 PM jian he > wrote: > >> > >> hi. > >> maybe we should start working on this? > >> > >> https://www.postgresql.org/developer/roadmap > >> says 18 will be released in September 2025. > > > > > > I am. > > > > hi. > Maybe you can share the draft patch so others can review it. > > https://www.postgresql.org says: > "The planned date for the general availability of PostgreSQL 18 is > September 25, 2025." > Abhishek Chanda Adam Guo Adam Rauch Aidar Imamov Ajin Cherian Alastair Turner Alec Cozens Aleksander Alekseev Alena Rybakina Alex Friedman Alex Richman Alexander Alehin Alexander Borisov Alexander Korotkov Alexander Kozhemyakin Alexander Kukushkin Alexander Kuzmenkov Alexander Kuznetsov Alexander Lakhin Alexander Pyhalov Alexandra Wang Alexey Dvoichenkov Alexey Makhmutov Alexey Shishkin Ali Akbar Alvaro Herrera Amit Kapila Amit Langote Amul Sul Andreas Karlsson Andreas Scherbaum Andreas Ulbrich Andrei Lepikhov Andres Freund Andrew Andrew Bille Andrew Dunstan Andrew Jackson Andrew Kane Andrew Watkins Andrey Borodin Andrey Chudnovsky Andrey Lepikhov Andrey Rachitskiy Andrey Rudometov Andy Alsup Andy Fan Anthonin Bonnefoy Anthony Hsu Anthony Leung Anton A. Melnikov Anton Melnikov Anton Voloshin Antonin Houska Antti Lampinen Arseniy Mukhin Artur Zakirov Arun Thirupathi Ashutosh Bapat Asphator Atsushi Torikoshi Avi Weinberg Aya Iwata Aysén Region Ayush Tiwari Ayush Vatsa Bastien Roucariès Ben Peachey Higdon Benoit Lobréau Bernd Helmle Bernd Reiß Bernhard Wiedemann Bertrand Drouvot Bertrand Mamasam Bharath Rupireddy Bogdan Grigorenko Boyu Yang Braulio Fdo Gonzalez Bruce Momjian Bykov Ivan Cameron Vogt Cary Huang Cees van Zeeland ChangAo Chen Chao Li Chapman Flack Charles Samborski Chengxi Sun Chiranmoy Bhattacharya Chris Gooch Christoph Berg Christophe Courtois Christopher Inokuchi Clemens Ruck Corey Huinker Craig Milhiser Crisp Lee Cédric Villemain Dagfinn Ilmari Mannsåker Daniel Elishakov Daniel Gustafs Daniel Gustafsson Daniel Vérité Daniel Westermann Daniele Varrazzo Daniil Davydov Daria Shanina Dave Cramer Dave Page David Benjamin David Christensen David E. Wheeler David Fiedler David G. Johnston David Geier David Rowley David Steele David Wheeler David Zhang Davinder Singh Dean Rasheed Devanga Susmitha Devrim Gündüz Dian Fay Dilip Kumar Dimitrios Apostolou Dipesh Dhameliya Dmitrii Bondar Dmitry Dmitry Dolgov Dmitry Koval Dmitry Kovalenko Dominique Devienne Donghang Lin Dorjpalam Batbaatar Drew Callahan Duncan Sands Dzmitry Jachnik Egor Chindyaskin Egor Rogov Emanuele Musella Emre Hasegeli Eric Cyr Erica Zhang Erik Nordström Erik Rijkers Erik Wienhold Erki Eessaar Ethan Mertz Etienne LAFARGE Etsuro Fujita Euler Taveira Evan Si Evgeniy Gorbanev Fabio R. 
Sluzala Fabrízio de Royes Mello Feike Steenbergen Feliphe Pozzer Felix Fire Emerald Florents Tselai Francesco Degrassi Frank Streitzig Fredrik Widlert Frédéric Yhuel Fujii Masao Gabriele Bartolini Gavin Panella Geoff Winkless George MacKerron Gilles Darold Grant Gryczan Greg Burd Greg Sabino Mullane Greg Stark Grigory Kryachko Guillaume Lelarge Gunnar Morling Gurjeet Singh Hacking Discord Haifang Wang Hajime Matsunaga Hamid Akhtar Hannu Krosing Hari Krishna Sunder Haruka Takatsuka Hayato Kuroda Heikki Linnakangas Hironobu Suzuki Holger Jakobs Hou Zhijie Hubert Lubaczewski Hugo DUBOIS Hugo Zhang Hunaid Sohail Hywel Carver Ian Barwick Ibrar Ahmed Igor Gnatyuk Igor Korot Ilia Evdokimov Ilya Gladyshev Ilyasov Ian Imran Zaheer Isaac Morland Israel Barth Rubio Ivan Kush Jacob Brazeal Jacob Champion Jaime Casanova Jakob Egger Jakub Wartak James Coleman James Hunter Jan Behrens Japin Li Jason Smith Jayesh Dehankar Jeevan Chalke Jeff Davis Jehan-Guillaume de Rorthais Jelte Fennema-Nio Jian He Jianghua Yang Jiao Shuntian Jim Jones Jim Nasby Jingtang Zhang Jingzhou Fu Joe Conway Joel Jacobson John Hutchins John Naylor Jonathan Katz Josef Šimánek Joseph Koshakow José Villanova Julien Rouhaud Junwang Zhao Justin Pryzby Kaido Vaikla Kaimeh Karina Litskevich Karthik S Kartyshov Ivan Kashif Zeeshan Katsuragi Yuta Keisuke Kuroda Kevin Hale Boyes Kevin K Biju Kirill Reshke Kirill Zdornyy Koen De Groote Koichi Suzuki Koki Nakamura Konstantin Knizhnik Kuntal Ghosh Kuntal G
Re: issue with synchronized_standby_slots
On Tue, Sep 9, 2025 at 2:39 PM Zhijie Hou (Fujitsu) wrote: > I agree. For synchronized_standby_slots, I think it is acceptable to report > only > parsing errors, because slots could be dropped even after validating the slot > existence during GUC loading. Additionally, we would report WARNINGs for > non-existent slots during the wait function anyway (e.g., in > StandbySlotsHaveCaughtup()). +1, to also address the issue I reported upthread. As I understand it, only the walsender and the slot sync worker actually use synchronized_standby_slots. But in the current code, all processes check for slot existence via the GUC check hook, which wastes cycles. The proposed change would eliminate that overhead. Regards, -- Fujii Masao
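For anyone skimming the thread, the shape of the change being discussed is roughly this: the check hook keeps only the cheap syntactic validation, and the existence checks move to the processes that actually consume the list (walsender and slot sync worker). A simplified, hypothetical hook of that form — not the actual patch — might look like:

/* Hypothetical parse-only check hook; illustration only, not the proposed patch. */
static bool
check_slot_name_list(char **newval, void **extra, GucSource source)
{
	char	   *rawstring;
	List	   *namelist = NIL;
	bool		ok;

	/* The empty string (the default) trivially passes. */
	if (*newval == NULL || (*newval)[0] == '\0')
		return true;

	/* Only verify that the value is a well-formed, comma-separated list. */
	rawstring = pstrdup(*newval);
	ok = SplitIdentifierString(rawstring, ',', &namelist);
	if (!ok)
		GUC_check_errdetail("List syntax is invalid.");

	/*
	 * Deliberately no slot-existence check here: slots can be created or
	 * dropped after the GUC is loaded anyway, so the consumers report
	 * missing slots at the point of use instead.
	 */
	list_free(namelist);
	pfree(rawstring);
	return ok;
}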
Re: [PATCH] jit: fix build with LLVM-21
On 2025-09-12 08:36, Peter Eisentraut wrote: On 08.09.25 15:20, Holger Hoffstätte wrote: I tried building against LLVM-21 and noticed that a function for symbol lookup was renamed (without semantic changes), breaking the LLVM JIT. The following patch fixes this by adding a version guard. It applies equally to both master and 17.6. Passes the test suite and verified on 17.6 with the jit example from the documentation. I can confirm that this change seems correct. See [0] for reference. Excellent! Thanks for taking a look. This was my first post to pgsql-hackers and I wasn't sure if I had done something wrong. [0]: https://github.com/llvm/llvm-project/commit/d3d856ad84698fa4ec66177d00558b2f5b438d3b As a small style request, I would flip the conditional around so that the new code appears first. I see that we don't do this very consistently in the existing code, but maybe we can start a new trend. ;-) I knew this would come up since I pondered the same thing :D As you said, the existing code is not consistent, but I can switch this to let the new code appear first, if you prefer. A new patch is attached. In my testing with LLVM 21, I'm getting an additional error: ../src/backend/jit/llvm/llvmjit_wrap.cpp:56:18: error: no matching constructor for initialization of 'llvm::orc::RTDyldObjectLinkingLayer' 56 | return wrap(new llvm::orc::RTDyldObjectLinkingLayer( | ^ 57 | *unwrap(ES), [] { return std::make_unique(nullptr, true); })); | ~ /opt/homebrew/Cellar/llvm/21.1.0/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h:58:3: note: candidate constructor not viable: no known conversion from '(lambda at ../src/backend/jit/llvm/llvmjit_wrap.cpp:57:16)' to 'GetMemoryManagerFunction' (aka 'unique_function (const MemoryBuffer &)>') for 2nd argument 58 | RTDyldObjectLinkingLayer(ExecutionSession &ES, | ^ 59 | GetMemoryManagerFunction GetMemoryManager); | ~ /opt/homebrew/Cellar/llvm/21.1.0/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h:37:16: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided 37 | class LLVM_ABI RTDyldObjectLinkingLayer | ^~~~ I gather you're not seeing that? I do not - I dropped the patch into my Gentoo package build for 17.6 and it has been working fine when building with gcc-15.2 and the jit part with clang-21.1.0. I also just rebuilt everything with clang-21.1.1 (no gcc) and also do not see the problem. This is on amd64 Linux. The following commit seems to be involved: https://github.com/postgres/postgres/commit/9044fc1d45a0212fd123bd8f364eac058f60fed7 which is also in my 17.6 tree. I verified that the new SectionMemoryManager.cpp is compiled without error. Since you're using homebrew I guess this is triggered by the __aarch64__ guard in src/include/jit/llvmjit_backport.h. Unfortunately this is as far as I can help with this. Please let me know if there is anything else I can do. cheers Holger >From 699a5aa8f9438c902ac5ad063525193ddf0ebb59 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Holger=20Hoffst=C3=A4tte?= Date: Fri, 12 Sep 2025 09:40:31 +0200 Subject: [PATCH] jit: fix build with LLVM-21 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit LLVM-21 renamed llvm::GlobalValue::getGUID() to getGUIDAssumingExternalLinkage(), so add a version guard. 
Signed-off-by: Holger Hoffstätte --- src/backend/jit/llvm/llvmjit_inline.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp b/src/backend/jit/llvm/llvmjit_inline.cpp index 2764c3bbe2f..51b32cd9f94 100644 --- a/src/backend/jit/llvm/llvmjit_inline.cpp +++ b/src/backend/jit/llvm/llvmjit_inline.cpp @@ -238,7 +238,11 @@ llvm_build_inline_plan(LLVMContextRef lc, llvm::Module *mod) llvm_split_symbol_name(symbolName.data(), &cmodname, &cfuncname); +#if LLVM_VERSION_MAJOR >= 21 + funcGUID = llvm::GlobalValue::getGUIDAssumingExternalLinkage(cfuncname); +#else funcGUID = llvm::GlobalValue::getGUID(cfuncname); +#endif /* already processed */ if (inlineState.processed) -- 2.51.0
Re: Conflict detection for update_deleted in logical replication
On Fri, Sep 12, 2025 at 8:55 AM Zhijie Hou (Fujitsu) wrote: > > I agree. Here is a V73 patch that will restart the worker if the retention > resumes. I also addressed other comments posted by Amit[1]. > Few comments: * In adjust_xid_advance_interval(), for the case when retention is not active, we still cap the interval by wal_receiver_status_interval. Is that required or do we let it go till 3 minutes? We can add a new else if loop to keep the code clear, if you agree with this. * + /* + * Resume retention immediately if required. (See + * should_resume_retention_immediately() for details). + */ + if (should_resume_retention_immediately(rdt_data, status_received)) + rdt_data->phase = RDT_RESUME_CONFLICT_INFO_RETENTION; Is this optimization for the case when we are in stop_phase or after stop_phase and someone has changed maxretention to 0? If so, I don't think it is worth optimizing for such a rare case at the cost of making code difficult to understand. Apart from this, I have changed a few comments in the attached. -- With Regards, Amit Kapila. diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 1274192de7e..7d1ed506622 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -4363,10 +4363,7 @@ maybe_advance_nonremovable_xid(RetainDeadTuplesData *rdt_data, if (!can_advance_nonremovable_xid(rdt_data)) return; - /* -* Resume retention immediately if required. (See -* should_resume_retention_immediately() for details). -*/ + /* Resume dead_tuples retention immediately if required. */ if (should_resume_retention_immediately(rdt_data, status_received)) rdt_data->phase = RDT_RESUME_CONFLICT_INFO_RETENTION; @@ -4549,8 +4546,8 @@ wait_for_publisher_status(RetainDeadTuplesData *rdt_data, { /* * Stop retention if not yet. Otherwise, reset to the initial phase to -* retry all phases. This is required to recalculate the current wait -* time and resume retention if the time falls within +* retry resuming retention. This reset is required to recalculate the +* current wait time and resume retention if the time falls within * max_retention_duration. */ if (MySubscription->retentionactive) @@ -4683,8 +4680,8 @@ wait_for_local_flush(RetainDeadTuplesData *rdt_data) { /* * Stop retention if not yet. Otherwise, reset to the initial phase to -* retry all phases. This is required to recalculate the current wait -* time and resume retention if the time falls within +* retry resuming retention. This reset is required to recalculate the +* current wait time and resume retention if the time falls within * max_retention_duration. */ if (MySubscription->retentionactive) @@ -4763,9 +4760,7 @@ wait_for_local_flush(RetainDeadTuplesData *rdt_data) * Check whether conflict information retention should be stopped due to * exceeding the maximum wait time (max_retention_duration). * - * If retention should be stopped, transition to the - * RDT_STOP_CONFLICT_INFO_RETENTION phase and return true. Otherwise, return - * false. + * If retention should be stopped, return true. Otherwise, return false. */ static bool should_stop_conflict_info_retention(RetainDeadTuplesData *rdt_data)
Re: Conflict detection for update_deleted in logical replication
On Fri, Sep 12, 2025 at 8:55 AM Zhijie Hou (Fujitsu) wrote: > > > I agree. Here is a V73 patch that will restart the worker if the retention > resumes. I also addressed other comments posted by Amit[1]. > Thanks for the patch. Few comments: 1) There is a small window where the worker can exit while resuming retention and the launcher can end up accessing stale worker info. Let's say the launcher is at a stage where it has fetched the worker: w = logicalrep_worker_find(sub->oid, InvalidOid, false); And after this point, before the launcher could do compute_min_nonremovable_xid(), the worker has stopped retention and resumed as well. Now the worker has exited, but in compute_min_nonremovable_xid() the launcher will still use the worker info fetched previously. 2) if (should_stop_conflict_info_retention(rdt_data)) + { +/* + * Stop retention if not yet. Otherwise, reset to the initial phase to + * retry all phases. This is required to recalculate the current wait + * time and resume retention if the time falls within + * max_retention_duration. + */ +if (MySubscription->retentionactive) + rdt_data->phase = RDT_STOP_CONFLICT_INFO_RETENTION; +else + reset_retention_data_fields(rdt_data); + return; + } Shall we have an Assert( !MyLogicalRepWorker->oldest_nonremovable_xid) in the 'else' part above? thanks Shveta
Re: Add support for entry counting in pgstats
> On Sep 12, 2025, at 15:23, Michael Paquier wrote: > > > -- > Michael > <0001-Add-support-for-entry-counting-in-pgstats.patch><0002-injection_points-Add-entry-counting.patch> The code overall looks good to me, very clear and neat. Just a few small comments: 1 - 0001 ``` +* set. This counter is incremented each time a new entry is created (not +* when reused), and each time an entry is dropped. +*/ + pg_atomic_uint64 entry_counts[PGSTAT_KIND_MAX]; ``` "and each time an entry is dropped" is a little misleading, it should be "and decremented when an entry is removed". 2 - 0001 ``` + /* Increment entry count, if required. */ + if (kind_info->track_counts) + pg_atomic_fetch_add_u64(&pgStatLocal.shmem->entry_counts[kind - 1], 1); ``` The code is quite straightforward, so the comment seems unnecessary. 3 - 0001 ``` + if (kind_info->track_counts) + { + ereport(ERROR, + (errmsg("custom cumulative statistics property is invalid"), +errhint("Custom cumulative statistics cannot use counter tracking for fixed-numbered objects."))); + } ``` {} are not needed. Looks like in the existing code, when "if" has a single statement, {} are not used. There is a similar "if" right above this change. 4 - 0002 ``` diff --git a/src/test/modules/injection_points/injection_stats.c b/src/test/modules/injection_points/injection_stats.c index e3947b23ba57..15ad9883dedb 100644 --- a/src/test/modules/injection_points/injection_stats.c +++ b/src/test/modules/injection_points/injection_stats.c @@ -49,6 +49,7 @@ static const PgStat_KindInfo injection_stats = { .shared_data_len = sizeof(((PgStatShared_InjectionPoint *) 0)->stats), .pending_size = sizeof(PgStat_StatInjEntry), .flush_pending_cb = injection_stats_flush_cb, + .track_counts = true, }; ``` Would it be better to move ".track_counts" up to be with the other boolean flags? When they are defined together to share the same byte, it feels better to initialize them together as well. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
Re: How can end users know the cause of LR slot sync delays?
Hi Amit, On Mon, Sep 8, 2025 at 9:52 AM Ashutosh Sharma wrote: > > Hi Amit, > > On Sat, Sep 6, 2025 at 10:46 AM Amit Kapila wrote: > > > > On Fri, Sep 5, 2025 at 12:50 PM Ashutosh Sharma > > wrote: > > > > > > Good to hear that you’re also interested in working on this task. > > > > > > On Thu, Sep 4, 2025 at 8:26 PM Shlok Kyal > > > wrote: > > >> > > >> Hi Ashutosh, > > >> > > >> I am also interested in this thread. And was working on a patch for it. > > >> > > >> On Wed, 3 Sept 2025 at 17:52, Ashutosh Sharma > > >> wrote: > > >> > > > >> > Hi Amit, > > >> > > > >> > On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila > > >> > wrote: > > >> >> > > >> >> On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma > > >> >> wrote: > > >> >> > > > >> >> > We have seen cases where slot synchronization gets delayed, for > > >> >> > example when the slot is behind the failover standby or vice versa, > > >> >> > and the slot sync worker has to wait for one to catch up with the > > >> >> > other. During this waiting period, users querying > > >> >> > pg_replication_slots can only see whether the slot has been > > >> >> > synchronized or not. If it has already synchronized, that’s fine, > > >> >> > but if synchronization is taking longer, users would naturally want > > >> >> > to understand the reason for the delay. > > >> >> > > > >> >> > Is there a way for end users to know the cause of slot > > >> >> > synchronization delays, so they can take appropriate actions to > > >> >> > speed it up? > > >> >> > > > >> >> > I understand that server logs are emitted in such cases, but logs > > >> >> > are not something end users would want to check regularly. > > >> >> > Moreover, since logging is configuration-based, relevant messages > > >> >> > may sometimes be skipped or suppressed. > > >> >> > > > >> >> > > >> >> Currently, the way to see the reason for sync skip is LOGs but I think > > >> >> it is better to add a new column like sync_skip_reason in > > >> >> pg_replication_slots. This can show the reasons like > > >> >> standby_LSN_ahead_remote_LSN. I think ideally users can compare > > >> >> standby's slot LSN/XMIN with remote_slot being synced. Do you have any > > >> >> better ideas? > > >> >> > > >> > > > >> > I have similar thoughts, but for clarity, I’d like to outline some of > > >> > the key steps I plan to take: > > >> > > > >> > Step 1: Define an enum for all possible reasons a slot persistence was > > >> > skipped. > > >> > > > >> > /* > > >> > * Reasons why a replication slot sync was skipped. > > >> > */ > > >> > typedef enum ReplicationSlotSyncSkipReason > > >> > { > > >> > RS_SYNC_SKIP_NONE = 0, /* No skip */ > > >> > > > >> > RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind > > >> > local reserved LSN */ > > >> > > > >> > RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of > > >> > remote, risk of data loss */ > > >> > > > >> > RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2)/* Standby could not build > > >> > a consistent snapshot */ > > >> > } ReplicationSlotSyncSkipReason; > > >> > > > >> > -- > > >> > > > >> I think we should also add the case when "remote_slot->confirmed_lsn > > > >> latestFlushPtr" (WAL corresponding to the confirmed lsn on remote slot > > >> is still not flushed on the Standby). In this case as well we are > > >> skipping the slot sync. > > > > > > > > > Yes, we can include this case as well. 
> > > > > >> > > >> > > >> > Step 2: Introduce new column to pg_replication_slots to store the skip > > >> > reason > > >> > > > >> > /* Inside pg_replication_slots table */ > > >> > ReplicationSlotSyncSkipReason slot_sync_skip_reason; > > >> > > > >> > -- > > >> > > > >> As per the discussion [1], I think it is more of stat related data and > > >> we should add it in the pg_stat_replication_slots view. Also we can > > >> add columns for 'slot sync skip count' and 'last slot sync skip'. > > >> Thoughts? > > > > > > > > > It’s not a bad choice, but what makes it a bit confusing for me is that > > > some of the slot sync information is stored in pg_replication_slots, > > > while some is in pg_stat_replication_slots. > > > > > > > How about keeping sync_skip_reason in pg_replication_slots and > > sync_skip_count in pg_stat_replication_slots? > > > > I think we can do that, since sync_skip_reason appears to be a > descriptive metadata rather than statistical data like > slot_sync_skip_count and last_slot_sync_skip. However, it's also true > that all three pieces of data are transient by nature - they will just > be present in the runtime. > After spending some more time on this, I found that maintaining sync_skip_reason in pg_replication_slots would make the code changes a bit messy and harder to maintain. I think storing all 3 pieces of information - sync_skip_reason, sync_skip_count, and last_sync_skip in pg_stat_replication_slots would be a cleaner solution. This way, all the sync-related info stays together and the code remain
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila wrote: > > > One thing related to this which needs a discussion is after this > change, it is possible that part of the transaction contains > additional logical_wal_info. I couldn't think of a problem due to this > but users using pg_waldump or other WAL reading utilities could > question this. One possibility is that we always start including > logical_wal_info for the next new transaction but not sure if that is > required. It would be good if other people involved in the discussion > or otherwise could share their opinion on this point. > AFAIR, logical info is a separate section in a WAL record, and there is no marker which says "WAL will contain logical info henceforth". So the utilities should be checking for the existence of such info before reading it. So I guess it should be ok. Some extra-sensitive utilities may expect that once a WAL record has logical info, all the succeeding WAL records will have it. They may find it troublesome that WAL records with and without logical info are interleaved. Generally, I would prefer that presence/absence of logical info changes at transaction boundaries, but we will still have interleaving WAL records. So I doubt how much that matters. Sorry for jumping late in the discussion. I have a few comments, mostly superficial ones. I am yet to take a deeper look at the synchronization logic. @@ -328,8 +362,7 @@ postgres=# select * from pg_logical_slot_get_changes('regression_slot', NULL, NU that could be needed by the logical decoding on the standby (as it does not know about the catalog_xmin on the standby). Existing logical slots on standby also get invalidated if - wal_level on the primary is reduced to less than - logical. + logical decoding becomes disabled on the primary. s/becomes disabled/is disabled/ or /gets disabled/. Given that logical decoding can be disabled in multiple ways, it's better to add a reference here to a section which explains what disabling logical decoding means. wal_level_insufficient means that the - primary doesn't have a sufficient to - perform logical decoding. It is set only for logical slots. + logical decoding is disabled on primary due to insufficient + or no logical slots. It is set only + for logical slots. It may not be apparent to the users that insufficient wal_level means 'minimal' here. It will be better if we just mention logical decoding is disabled on primary and refer to a section which explains what disabling logical decoding means. * * Skip this if we're taking a full-page image of the new page, as we * don't include the new tuple in the WAL record in that case. Also - * disable if wal_level='logical', as logical decoding needs to be able to - * read the new tuple in whole from the WAL record alone. + * disable if logical decoding is enabled, as logical decoding needs to be + * able to read the new tuple in whole from the WAL record alone. */ Not the fault of this patch, but I find the comment to be slightly out of sync with the code. The code actually checks whether logical decoding is enabled and whether the relation requires WAL to be logged for logical decoding. The difference is subtle, but it did cause me a bit of confusion when I read the code. Please consider rephrasing the comment while you are modifying it.
if (oldbuf == newbuf && !need_tuple_data && !XLogCheckBufferNeedsBackup(newbuf)) @@ -9057,8 +9057,8 @@ log_heap_update(Relation reln, Buffer oldbuf, /* * Perform XLogInsert of an XLOG_HEAP2_NEW_CID record * - * This is only used in wal_level >= WAL_LEVEL_LOGICAL, and only for catalog - * tuples. + * This is only used when effective WAL level is logical, and only for Given that the earlier comment used GUC name, it's better to use the GUC name "effective_wal_level" here. case XLOG_PARAMETER_CHANGE: + + /* + * Even if wal_level on the primary got decreased to 'replica' it + * doesn't necessarily mean to disable the logical decoding as + * long as we have at least one logical slot. So we don't check + * the logical decoding availability here but do in + * XLOG_LOGICAL_DECODING_STATUS_CHANGE case. + */ The earlier code checked for wal_level < WAL_LEVEL_LOGICAL, which includes the case when wal_level is 'minimal'. I don't see the new code handling 'minimal' case here. Am I missing something? Do we need a comment here which specifically mentions "minimal" case grammar "Even if wal_level on the primary was lowered to 'replica', as long as there is at least one logical slot, the logical decoding remains enabled. ... " also ... do so in ... . + * The module maintains separate controls of two aspects: writing information + * required by logical decoding to WAL records and utilizing logical decoding + * itself, controlled by LogicalDecodingCtl->xlog_logical_info and + * ->logical_decoding_enabled fields respectively. The activation process of + * logical decoding involves several steps, beginning with maintaining logical + * decoding in a disabled stat
let ALTER COLUMN SET DATA TYPE cope with POLICY dependency
Hi. In [1], RememberAllDependentForRebuilding has this comment: /* * A policy can depend on a column because the column is * specified in the policy's USING or WITH CHECK qual * expressions. It might be possible to rewrite and recheck * the policy expression, but punt for now. It's certainly * easy enough to remove and recreate the policy; still, FIXME * someday. */ After 11 years, I am trying to allow column type changes to cope with security policy dependencies. CREATE TABLE s (a int, b int); CREATE POLICY p2 ON s USING (s.b = 1); --master branch will result error ALTER TABLE s ALTER COLUMN b SET DATA TYPE INT8; ERROR: cannot alter type of a column used in a policy definition DETAIL: policy p2 on table s depends on column "b" With the attached patch, ALTER TABLE SET DATA TYPE can cope with columns that have an associated security policy. The above ALTER TABLE SET DATA TYPE will just work fine. The code roughly follows how statistics are recreated after a column data type change. Currently the table rewrite does not recheck the policy expression, for example: RESET SESSION AUTHORIZATION; CREATE USER regress_rls_alice NOLOGIN; GRANT ALL ON SCHEMA public to public; DROP TABLE IF EXISTS R1; SET row_security = on; begin; set role regress_rls_alice; CREATE TABLE r1 (a int, b int GENERATED ALWAYS AS (a * 10) STORED); INSERT INTO r1 VALUES (1), (2), (4); CREATE POLICY p0 ON r1 USING (true); CREATE POLICY p1 ON r1 AS RESTRICTIVE USING (b > 10); ALTER TABLE r1 ENABLE ROW LEVEL SECURITY; ALTER TABLE r1 FORCE ROW LEVEL SECURITY; commit; set role regress_rls_alice; INSERT INTO r1 VALUES (0); -- Should fail p1 ALTER TABLE r1 ALTER COLUMN b SET EXPRESSION AS (-1); --OK So I guess it should be fine that the ALTER TABLE SET DATA TYPE table rewrite does not recheck the policy? [1] https://git.postgresql.org/cgit/postgresql.git/commit/?id=143b39c1855f8a22f474f20354ee5ee5d2f4d266 From be887b714a3bc788b6859954fad63137d64d1f61 Mon Sep 17 00:00:00 2001 From: jian he Date: Fri, 12 Sep 2025 16:06:53 +0800 Subject: [PATCH v1 1/1] let ALTER COLUMN SET DATA TYPE cope with POLICY dependency CREATE TABLE s (a int, b int); CREATE POLICY p2 ON s USING (s.b = 1); --no-error, while master branch will result error ALTER TABLE s ALTER COLUMN b SET DATA TYPE INT8; discussion: https://postgr.es/m/ --- src/backend/commands/policy.c | 59 ++ src/backend/commands/tablecmds.c | 122 ++-- src/backend/parser/gram.y | 1 + src/backend/utils/adt/ruleutils.c | 185 ++ src/include/commands/policy.h | 1 + src/include/nodes/parsenodes.h| 2 + src/include/utils/ruleutils.h | 1 + .../test_ddl_deparse/test_ddl_deparse.c | 3 + src/test/regress/expected/rowsecurity.out | 79 src/test/regress/sql/rowsecurity.sql | 41 10 files changed, 479 insertions(+), 15 deletions(-) diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c index 83056960fe4..c287c08f155 100644 --- a/src/backend/commands/policy.c +++ b/src/backend/commands/policy.c @@ -24,8 +24,10 @@ #include "catalog/namespace.h" #include "catalog/objectaccess.h" #include "catalog/pg_authid.h" +#include "catalog/pg_depend.h" #include "catalog/pg_policy.h" #include "catalog/pg_type.h" +#include "commands/comment.h" #include "commands/policy.h" #include "miscadmin.h" #include "nodes/pg_list.h" @@ -755,6 +757,11 @@ CreatePolicy(CreatePolicyStmt *stmt) relation_close(target_table, NoLock); table_close(pg_policy_rel, RowExclusiveLock); + /* Add any requested comment */ + if (stmt->polcomment != NULL) + CreateComments(policy_id, PolicyRelationId, 0, + stmt->polcomment); + return myself; } @@ -1277,3 +1284,55 @@ 
relation_has_policies(Relation rel) return ret; } + +/* + * PoliciesGetRelations - + * Collect all relations that this policy depends on. + * Since the policy's check qual or qual may reference other relations, we + * include those as well. + */ +List * +PoliciesGetRelations(Oid policyId) +{ + List *result = NIL; + Relation depRel; + ScanKeyData key[2]; + SysScanDesc depScan; + HeapTuple depTup; + + /* + * We scan pg_depend to find those things that policy being depended on. + */ + depRel = table_open(DependRelationId, AccessShareLock); + + ScanKeyInit(&key[0], +Anum_pg_depend_classid, +BTEqualStrategyNumber, F_OIDEQ, +ObjectIdGetDatum(PolicyRelationId)); + ScanKeyInit(&key[1], +Anum_pg_depend_objid, +BTEqualStrategyNumber, F_OIDEQ, +ObjectIdGetDatum(policyId)); + + depScan = systable_beginscan(depRel, DependDependerIndexId, true, + NULL, 2, key); + while (HeapTupleIsValid(depTup = systable_getnext(depScan))) + { + Form_pg_depend pg_depend = (Form_pg_depend) GETSTRUCT(depTup); + + /* Prepend oid of the relation with this policy.
Re: issue with synchronized_standby_slots
On Thu, 11 Sept 2025 at 12:00, Rahila Syed wrote: > > Hi, > > >> >> >> BTW, we should also try to conclude on my yesterday's point as to why >> it is okay to have the same behavior for default_tablespace and >> default_table_access_method and not for this parameter? I am asking >> because if we change the current behavior, tomorrow, we can get >> complaints that one expects the old behaviour as that was similar to >> other GUCs like default_tablespace and default_table_access_method. >> > > Fair point. I haven't examined the validation of GUCs in parallel workers > closely, > but one argument for preventing parallel workers from failing due to an > incorrect > value of synchronized_standby_slots is that a select query works in this > situation > without parallel workers. > > Whereas, for incorrect values of default_tablespace and > default_table_access_method, > most commands would fail regardless of whether parallel workers are enabled. > > PFA a test for the original bug report on this thread. This applies on the > v3 version of the patch > that was shared. > Thanks for sharing the patch. I checked the test and it looks good to me. But I am not sure if we should have a new file for the test. I have added the test in the '040_standby_failover_slots_sync.pl' file along with some other tests. Also I have addressed the comments by Ashutosh in [1][2]. I have attached the updated v4 patch [1]: https://www.postgresql.org/message-id/CAE9k0P%3Dx3J3nmSmYKmTkiFXTDKLxJkXFO4%2BVHJyNu01Od6CZfg%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAE9k0P%3DOFMFCRy9aDGWZ3bt91tbB1WnzsAbzXN72iWBaGVuMrw%40mail.gmail.com Thanks, Shlok Kyal v4-0001-Remove-the-validation-from-the-GUC-check-hook-and.patch Description: Binary data
Re: let ALTER TABLE DROP COLUMN drop whole-row referenced object
On Thu, Sep 11, 2025 at 9:27 AM Chao Li wrote: > > But v2 needs a rebase, I cannot apply it to master. > Hi. Please check the attached v3-0001 and v3-0002. v3-0001: ALTER TABLE DROP COLUMN cascades to drop any whole-row referenced constraints and indexes. v3-0002: ALTER COLUMN SET DATA TYPE errors out when a whole-row referenced constraint exists. With the v3-0002 patch: CREATE TABLE ts (a int, c int, b int constraint cc check((ts = ROW(1,1,1)))); ALTER TABLE ts ALTER COLUMN b SET DATA TYPE INT8; ERROR: cannot alter table column "ts"."b" to type bigint because constraint "cc" uses table "ts" row type HINT: You might need to drop constraint "cc" first to change column "ts"."b" data type Of course, even if we do not error out, a regular insert will fail too. insert into ts values(1,1,1); ERROR: cannot compare dissimilar column types bigint and integer at record column 3 src7=# \errverbose ERROR: 42804: cannot compare dissimilar column types bigint and integer at record column 3 LOCATION: record_eq, rowtypes.c:1193 Then you need to debug to find out that the root cause is that constraint cc is no longer satisfied, and you still need to deal with the broken constraint cc afterward. With the v3-0002 patch, ALTER TABLE SET DATA TYPE provides an explicit error message that helps quickly identify the problem. So I guess it should be helpful. Index expressions/predicates and check constraint expressions cannot contain subqueries; that's why using pull_varattnos to test whole-row containment works fine. But pull_varattnos cannot cope with subqueries (see the comments above pull_varattnos). A row security policy can contain a subquery, for example: CREATE POLICY p1 ON document AS PERMISSIVE USING (dlevel <= (SELECT seclv FROM uaccount WHERE pguser = current_user)); So I am still working on whole-row referenced policies interacting with ALTER TABLE SET DATA TYPE/ALTER TABLE DROP COLUMN. From 090a087da9b7fb072acd7e9683faf9ba2b5c76af Mon Sep 17 00:00:00 2001 From: jian he Date: Fri, 12 Sep 2025 17:10:19 +0800 Subject: [PATCH v3 2/2] disallow ALTER COLUMN SET DATA TYPE when wholerow referenced constraint exists CREATE TABLE ts (a int, c int, b int constraint cc check((ts = ROW(1,1,1)))); INSERT INTO ts values (1,1,1); ALTER TABLE ts ALTER COLUMN b SET DATA TYPE INT8; ERROR: cannot alter table column "ts"."b" to type bigint because constraint "cc" uses table "ts" row type HINT: You might need to drop constraint "cc" first to change column "ts"."b" data type No need to worry about the index; index rebuild will fail automatically. discussion: https://postgr.es/m/cacjufxga6kvqy7dbhglvw9s9kkmpgyzt5me6c7kefjdpr2w...@mail.gmail.com --- src/backend/commands/tablecmds.c | 40 +++ src/test/regress/expected/constraints.out | 11 +++ src/test/regress/sql/constraints.sql | 10 ++ 3 files changed, 61 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 409bc65e53f..62c84726c4e 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -14664,6 +14664,46 @@ ATPrepAlterColumnType(List **wqueue, find_composite_type_dependencies(rel->rd_rel->reltype, rel, NULL); } + /* + * If the table has a whole-row referenced CHECK constraint, then changing + * any column data type is not allowed.
+ */ + if (targettype != attTup->atttypid || targettypmod != attTup->atttypmod) + { + TupleConstr *constr = RelationGetDescr(rel)->constr; + + if (constr && constr->num_check > 0) + { + ConstrCheck *check = constr->check; + Node *check_expr; + + for (int i = 0; i < constr->num_check; i++) + { +Bitmapset *expr_attrs = NULL; + +char *constr_name = check[i].ccname; +check_expr = stringToNode(check[i].ccbin); + +/* Find all attributes referenced */ +pull_varattnos(check_expr, 1, &expr_attrs); + +if (bms_is_member(0 - FirstLowInvalidHeapAttributeNumber, expr_attrs)) + ereport(ERROR, + errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("cannot alter table column \"%s\".\"%s\" to type %s because constraint \"%s\" uses table \"%s\" row type", + RelationGetRelationName(rel), + colName, + format_type_with_typemod(targettype, targettypmod), + constr_name, + RelationGetRelationName(rel)), + errhint("You might need to drop constraint \"%s\" first to change column \"%s\".\"%s\" data type", + constr_name, + RelationGetRelationName(rel), + colName)); + } + } + } + ReleaseSysCache(tuple); /* diff --git a/src/test/regress/expected/constraints.out b/src/test/regress/expected/constraints.out index ce2fb02971f..6de0cc4ec33 100644 --- a/src/test/regress/expected/constraints.out +++ b/src/test/regress/expected/constraints.out @@ -271,6 +271,17 @@ ALTER TABLE DROP_COL_CHECK_TBL DROP COLUMN city; DROP TABLE DROP_COL_CHECK_TBL; -- +-- Change column data type should error out because the whole-row reference
RE: Conflict detection for update_deleted in logical replication
On Friday, September 12, 2025 4:48 PM shveta malik wrote: > On Fri, Sep 12, 2025 at 8:55 AM Zhijie Hou (Fujitsu) > wrote: > > > > > > I agree. Here is a V73 patch that will restart the worker if the > > retention resumes. I also addressed other comments posted by Amit[1]. > > > > Thanks for the patch. Few comments: Thanks for the comments! > > 1) > There is a small window where the worker can exit while resuming retention and > the launcher can end up accessing stale worker info. > > Let's say the launcher is at a stage where it has fetched the worker: > w = logicalrep_worker_find(sub->oid, InvalidOid, false); > > And after this point, before the launcher could do > compute_min_nonremovable_xid(), the worker has stopped retention and > resumed as well. Now the worker has exited, but in > compute_min_nonremovable_xid() the launcher will still use the worker info > fetched previously. Thanks for catching this. I have fixed it by computing the xid under LogicalRepWorkerLock. > > 2) > > if (should_stop_conflict_info_retention(rdt_data)) > + { > +/* > + * Stop retention if not yet. Otherwise, reset to the initial phase > +to > + * retry all phases. This is required to recalculate the current > +wait > + * time and resume retention if the time falls within > + * max_retention_duration. > + */ > +if (MySubscription->retentionactive) > + rdt_data->phase = RDT_STOP_CONFLICT_INFO_RETENTION; > +else > + reset_retention_data_fields(rdt_data); > + > return; > + } > > > > Shall we have an Assert( !MyLogicalRepWorker->oldest_nonremovable_xid) > in the 'else' part above? Added. Here is the V74 patch which addresses all comments. Best Regards, Hou zj v74-0001-Allow-conflict-relevant-data-retention-to-resume.patch Description: v74-0001-Allow-conflict-relevant-data-retention-to-resume.patch
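For readers following along, the shape of the fix (simplified here; see the attached v74 patch for the real code) is to keep LogicalRepWorkerLock held across both the lookup and the read of the worker's xid, so the worker cannot exit and have its slot reused in between:

/* Simplified sketch of the approach; not the literal v74 code. */
LogicalRepWorker *w;
FullTransactionId candidate_xid;	/* assuming the xid type used by the patch */

LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);

w = logicalrep_worker_find(sub->oid, InvalidOid, false);
if (w != NULL)
{
	/* Safe to read: the worker cannot exit while we hold the lock. */
	candidate_xid = w->oldest_nonremovable_xid;
}

LWLockRelease(LogicalRepWorkerLock);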
Re: How can end users know the cause of LR slot sync delays?
On Fri, Sep 12, 2025 at 1:07 PM Ashutosh Sharma wrote: > > On Mon, Sep 8, 2025 at 9:52 AM Ashutosh Sharma wrote: > > > > I think we can do that, since sync_skip_reason appears to be a > > descriptive metadata rather than statistical data like > > slot_sync_skip_count and last_slot_sync_skip. However, it's also true > > that all three pieces of data are transient by nature - they will just > > be present in the runtime. > > > > After spending some more time on this, I found that maintaining > sync_skip_reason in pg_replication_slots would make the code changes a > bit messy and harder to maintain. > What exactly is your worry? It seems more logical to have sync_skip_reason in pg_replication_slots and other two in pg_stat_replication_slots as the latter is purely a stats view and the sync_skip_count/last_sync_skip suits there better. I think storing all 3 pieces of > information - sync_skip_reason, sync_skip_count, and last_sync_skip in > pg_stat_replication_slots would be a cleaner solution. This way, all > the sync-related info stays together and the code remains > straightforward. > Having all the sync information together has merit but don't you think in this case the sync_skip_reason doesn't seem to be matching with the existing columns in pg_stat_replication_slots? -- With Regards, Amit Kapila.
Re: pg_waldump: support decoding of WAL inside tarfile
On Mon, Sep 8, 2025 at 7:07 PM Jakub Wartak wrote: > > On Tue, Aug 26, 2025 at 1:53 PM Amul Sul wrote: > > > [..patch] > > Hi Amul! > Thanks for your review. I'm replying to a few of your comments now, but for the rest, I need to think about them. I'm kind of in agreement with some of them for the fix, but I won't be able to spend time on that next week due to official travel. I'll try to get back as soon as possible after that. > a. Why should it be necessary to provide startLSN (-s) ? Couldn't > it autodetect the first WAL (tar file) inside and just use that with > some info message? > $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar > pg_waldump: error: no start WAL location given > There are two reasons. First, existing pg_waldump --path=some_directory would result in the same error. Second, it would force us to re-read the archive twice just to locate the first WAL segment, which is inefficient. > c. It doesnt work when using SEGSTART, but it's there: > $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar > 00010059 > pg_waldump: error: could not open file "00010059": > Not a directory > $ tar tf /tmp/base/pg_wal.tar | head -1 > 00010059 > I don't believe this is the correct use case. The WAL files are inside a tar archive, and the requirement is to use a starting LSN and a timeline (if not the default). > d. I've later noticed that follow-up patches seem to use the > -s switch and there it seems to work OK. The above SEGSTART issue was > not detected, probably because tests need to be extended cover of > segment name rather than just --start LSN (see test_pg_waldump): > $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar --stats > -s 0/59000358 > pg_waldump: first record is after 0/59000358, at 0/590003E8, > skipping over 144 bytes > WAL statistics between 0/590003E8 and 0/6100: > [..] > Hope previous reasoning makes sense to you. > e. Code around`if (walpath == NULL && directory != NULL)` needs > some comments. > I think this is an existing one. > f. Code around `if (fname != NULL && is_tar_file(fname, > &compression))` , so if fname is WAL segment here > (0001005A) and we do check again if that has been > tar-ed (is_tar_file())? Why? > Again, how? > g. Just a question: the commit message says `Note that this patch > requires that the WAL files within the archive be in sequential order; > an error will be reported otherwise`. I'm wondering if such > occurrences are known to be happening in the wild? Or is it just an > assumption that if someone would modify the tar somehow? (either way > we could just add a reason why we need to handle such a case if we > know -- is manual alternation the only source of such state?). For the > record, I've tested crafting custom archives with out of sequence WAL > archives and the code seems to work (it was done using: tar --append > -f pg_wal.tar --format=ustar ..) > This is an almost nonexistent occurrence. While pg_basebackup archives WAL files in sequential order, we don't have an explicit code to enforce that order within it. Furthermore, since we can't control how external tools might handle the files, this extra precaution is necessary. > Another open question I have is this: shouldn't backup_manifest come > with CRC checksum for the archived WALs? Or does that guarantee that > backup_manifest WAL-Ranges are present in pg_wal.tar is good enough > because individual WAL files are CRC-protected itself? > I don't know, I have to check pg_verifybackup. Regards, Amul
Re: Error with DEFAULT VALUE in temp table
Hi, Tom! Thank you for working on this. I see you've fixed the patch and committed it as a0b99fc1220. I tested it a bit and see some side effects which may be unintentional. 1. SCHEMA lost object_name. Before: postgres=# create schema foo; CREATE SCHEMA postgres=# drop schema foo; DROP SCHEMA postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+--- n | 1 classid | 2615 objid | 16404 objsubid| 0 original| t normal | f is_temporary| f object_type | schema schema_name | object_name | foo object_identity | foo address_names | {foo} address_args| {} After: postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+--- n | 1 classid | 2615 objid | 16394 objsubid| 0 original| t normal | f is_temporary| f object_type | schema schema_name | object_name | object_identity | foo address_names | {foo} address_args| {} 2. DEFAULT VALUE now has schema_name and object_name. Before: postgres=# create temp table bar (a int default 0); CREATE TABLE postgres=# drop table bar; DROP TABLE postgres=# select * from dropped_objects where object_type = 'default value' \gx -[ RECORD 1 ]---+-- n | 4 classid | 2604 objid | 16422 objsubid| 0 original| f normal | f is_temporary| f object_type | default value schema_name | object_name | object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} After: postgres=# select * from dropped_objects where object_type = 'default value' \gx -[ RECORD 1 ]---+-- n | 4 classid | 2604 objid | 16430 objsubid| 0 original| f normal | f is_temporary| t object_type | default value schema_name | pg_temp object_name | bar object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} This may be intentional, but doesn't quite match the description for object_name in the docs: Name of the object, if the combination of schema and name can be used as a unique identifier for the object; otherwise NULL. Also it doesn't match with the record for the column itself: postgres=# create temp table bar (a int default 0); CREATE TABLE postgres=# alter table bar drop column a; ALTER TABLE postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+-- n | 1 classid | 1259 objid | 16435 objsubid| 1 original| t normal | f is_temporary| t object_type | table column schema_name | pg_temp object_name | object_identity | pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} -[ RECORD 2 ]---+-- n | 2 classid | 2604 objid | 16438 objsubid| 0 original| f normal | f is_temporary| t object_type | default value schema_name | pg_temp object_name | bar object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} object_name is null for the table column, but not null for its default value. As for schema_name, I'm not sure whether it should be null or not. Currently schema_name is null for triggers and policy objects, but that may be accidental. Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: New recovery_target_timeline=primary option
Even the documentation states/warns: "Set restore_command to a simple command to copy files from the WAL archive. If you plan to have multiple standby servers for high availability purposes, make sure that recovery_target_timeline is set to latest (the default), to make the standby server follow the timeline change that occurs at failover to another standby." By default, recovery_target_timeline is set to latest. What I'm recommending is an option to set it to just follow or stay within the primary's timeline without having to receive the fatal message stated before that ends up stopping the recovery of the standby. Suppose we have timelines 1-3 archived in our backup repo. Currently our streaming replication setup is running in timeline 3. But now, we need to restore the primary to timeline 2. We can specify recovery_target_timeline=2 to initially restore the primary. But when I go to reinit or rebuild the standby, why not just add a new option, recovery_target_timeline=primary, that forces the standby to just stay on the primary's timeline without having to figure out the correct timeline for the standby. Without this new parameter or without specifying the timeline when restoring the standby, the restore will take the standby to timeline 3 and get the fatal error message. This happens a lot on setups using tools like patroni. Just trying to make administrators' and HA tools' lives a little easier when setting up a standby. On Thu, Sep 11, 2025 at 9:19 PM, Euler Taveira wrote: On Thu, Sep 11, 2025, at 10:07 PM, Efrain J. Berdecia wrote: > The error I would like to address with this feature is the following: > > FATAL: highest timeline xxx of the primary is behind timeline yyy > It seems your procedure to set up a standby is incorrect. See [1]. You are not using the base backup from the primary server. You didn't describe the whole procedure so it is hard to point out where the problem is. [1] https://www.postgresql.org/docs/current/warm-standby.html#STANDBY-SERVER-SETUP -- Euler Taveira EDB https://www.enterprisedb.com/
Preferred use of macro GetPGProcByNumber
Hello hackers, I've noticed some places where elements of ProcGlobal->allProcs are addressed directly via arr index. But in proc.h macros GetPGProcByNumber exist, that was added to get proc objects by index, so I suggest minor refactoring 'ProcGlobal->allProcs[index]' -> 'GetPGProcByNumber(index)'. Please, see attached patch, branched from rev 6ede13d1b5f. Best regards Melnikov Maksim From f03b596e8fd3fa80f86843babb1704fae4032480 Mon Sep 17 00:00:00 2001 From: Maksim Melnikov Date: Fri, 12 Sep 2025 15:17:39 +0300 Subject: [PATCH v1] Preferred use of macro GetPGProcByNumber. Minor refactoring, using of GetPGProcByNumber(index) macros instead of ProcGlobal->allProcs[index]. --- src/backend/access/transam/clog.c | 4 +- src/backend/postmaster/pgarch.c| 2 +- src/backend/postmaster/walsummarizer.c | 2 +- src/backend/storage/buffer/freelist.c | 2 +- src/backend/storage/ipc/procarray.c| 64 -- src/backend/storage/lmgr/lock.c| 6 +-- src/backend/storage/lmgr/proc.c| 2 +- 7 files changed, 39 insertions(+), 43 deletions(-) diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c index e80fbe109cf..bbdcce39990 100644 --- a/src/backend/access/transam/clog.c +++ b/src/backend/access/transam/clog.c @@ -574,7 +574,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, /* Walk the list and update the status of all XIDs. */ while (nextidx != INVALID_PROC_NUMBER) { - PGPROC *nextproc = &ProcGlobal->allProcs[nextidx]; + PGPROC *nextproc = GetPGProcByNumber(nextidx); int64 thispageno = nextproc->clogGroupMemberPage; /* @@ -633,7 +633,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, */ while (wakeidx != INVALID_PROC_NUMBER) { - PGPROC *wakeproc = &ProcGlobal->allProcs[wakeidx]; + PGPROC *wakeproc = GetPGProcByNumber(wakeidx); wakeidx = pg_atomic_read_u32(&wakeproc->clogGroupNext); pg_atomic_write_u32(&wakeproc->clogGroupNext, INVALID_PROC_NUMBER); diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c index 78e39e5f866..a35895babeb 100644 --- a/src/backend/postmaster/pgarch.c +++ b/src/backend/postmaster/pgarch.c @@ -289,7 +289,7 @@ PgArchWakeup(void) * be relaunched shortly and will start archiving. */ if (arch_pgprocno != INVALID_PROC_NUMBER) - SetLatch(&ProcGlobal->allProcs[arch_pgprocno].procLatch); + SetLatch(&GetPGProcByNumber(arch_pgprocno)->procLatch); } diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c index e1f142f20c7..69461ea9a79 100644 --- a/src/backend/postmaster/walsummarizer.c +++ b/src/backend/postmaster/walsummarizer.c @@ -649,7 +649,7 @@ WakeupWalSummarizer(void) LWLockRelease(WALSummarizerLock); if (pgprocno != INVALID_PROC_NUMBER) - SetLatch(&ProcGlobal->allProcs[pgprocno].procLatch); + SetLatch(&GetPGProcByNumber(pgprocno)->procLatch); } /* diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c index 01909be0272..2fbd254e4bc 100644 --- a/src/backend/storage/buffer/freelist.c +++ b/src/backend/storage/buffer/freelist.c @@ -239,7 +239,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state, bool *from_r * actually fine because procLatch isn't ever freed, so we just can * potentially set the wrong process' (or no process') latch. 
*/ - SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch); + SetLatch(&GetPGProcByNumber(bgwprocno)->procLatch); } /* diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 200f72c6e25..d3b9680b562 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -268,8 +268,6 @@ typedef enum KAXCompressReason static ProcArrayStruct *procArray; -static PGPROC *allProcs; - /* * Cache to reduce overhead of repeated calls to TransactionIdIsInProgress() */ @@ -444,8 +442,6 @@ ProcArrayShmemInit(void) TransamVariables->xactCompletionCount = 1; } - allProcs = ProcGlobal->allProcs; - /* Create or attach to the KnownAssignedXids arrays too, if needed */ if (EnableHotStandby) { @@ -502,7 +498,7 @@ ProcArrayAdd(PGPROC *proc) int this_procno = arrayP->pgprocnos[index]; Assert(this_procno >= 0 && this_procno < (arrayP->maxProcs + NUM_AUXILIARY_PROCS)); - Assert(allProcs[this_procno].pgxactoff == index); + Assert(GetPGProcByNumber(this_procno)->pgxactoff == index); /* If we have found our right position in the array, break */ if (this_procno > pgprocno) @@ -538,9 +534,9 @@ ProcArrayAdd(PGPROC *proc) int procno = arrayP->pgprocnos[index]; Assert(procno >= 0 && procno < (arrayP->maxProcs + NUM_AUXILIARY_PROCS)); - Assert(allProcs[procno].pgxactoff == index - 1); + Assert(GetPGProcByNumber(procno)->pgxactoff == index - 1); - allProcs[procno].pgxactoff = index; + GetPGProcByNumber(procno)->pgxactoff = index; } /* @@ -
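For illustration, a minimal sketch of the substitution the patch applies. Only the GetPGProcByNumber() macro itself comes from proc.h, where it is essentially &ProcGlobal->allProcs[(n)]; the wrapper function and its name below are hypothetical and exist only to show the before/after spellings side by side.

#include "postgres.h"
#include "storage/latch.h"
#include "storage/proc.h"

/* hypothetical helper, for illustration only */
static void
wake_proc_by_number(int procno)
{
	/* before: open-coded array indexing */
	SetLatch(&ProcGlobal->allProcs[procno].procLatch);

	/* after: the same access through the accessor macro */
	SetLatch(&GetPGProcByNumber(procno)->procLatch);
}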
Re: Error with DEFAULT VALUE in temp table
On 12.09.2025 14:01, Sergey Shinderuk wrote: object_name is null for the table column, but not null for its default value. As for schema_name, I'm not sure whether it should be null or not. Currently schema_name is null for triggers and policy objects, but that may be accidental. Perhaps "default value" should be like "table constraint", which has schema_name and a null object_name. postgres=# create temp table bar (a int not null default 0); CREATE TABLE postgres=# alter table bar drop column a; ALTER TABLE postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+-- n | 1 classid | 1259 objid | 16445 objsubid| 1 original| t normal | f is_temporary| t object_type | table column schema_name | pg_temp object_name | object_identity | pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} -[ RECORD 2 ]---+-- n | 2 classid | 2604 objid | 16448 objsubid| 0 original| f normal | f is_temporary| t object_type | default value schema_name | pg_temp object_name | bar object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} -[ RECORD 3 ]---+-- n | 3 classid | 2606 objid | 16449 objsubid| 0 original| f normal | f is_temporary| t object_type | table constraint schema_name | pg_temp object_name | object_identity | bar_a_not_null on pg_temp.bar address_names | {pg_temp,bar,bar_a_not_null} address_args| {} Best regards, -- Sergey Shinderuk https://postgrespro.com/
Re: [PATCH] jit: fix build with LLVM-21
On 2025-09-12 09:47, Holger Hoffstätte wrote: On 2025-09-12 08:36, Peter Eisentraut wrote: In my testing with LLVM 21, I'm getting an additional error: ../src/backend/jit/llvm/llvmjit_wrap.cpp:56:18: error: no matching constructor for initialization of 'llvm::orc::RTDyldObjectLinkingLayer' 56 | return wrap(new llvm::orc::RTDyldObjectLinkingLayer( | ^ 57 | *unwrap(ES), [] { return std::make_unique(nullptr, true); })); | ~ /opt/homebrew/Cellar/llvm/21.1.0/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h:58:3: note: candidate constructor not viable: no known conversion from '(lambda at ../src/backend/jit/llvm/llvmjit_wrap.cpp:57:16)' to 'GetMemoryManagerFunction' (aka 'unique_function (const MemoryBuffer &)>') for 2nd argument 58 | RTDyldObjectLinkingLayer(ExecutionSession &ES, | ^ 59 | GetMemoryManagerFunction GetMemoryManager); | ~ /opt/homebrew/Cellar/llvm/21.1.0/include/llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h:37:16: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 2 were provided 37 | class LLVM_ABI RTDyldObjectLinkingLayer | ^~~~ I gather you're not seeing that? I temporarily removed the __arch64__ guard and can reproduce the above error, using either gcc-15 or clang-21. It seems this llvm monkey patch backport thing is affected by the following commit in llvm-21: https://github.com/llvm/llvm-project/commit/cd585864c0bbbd74ed2a2b1ccc191eed4d1c8f90 Trying to figure out how to adapt the code.. chers Holger
Re: Add support for specifying tables in pg_createsubscriber.
On Fri, Sep 12, 2025, at 2:05 AM, Amit Kapila wrote: > Yeah, I am also not sure that it is worth adding a new option to save > backward compatibility for this option. It seems intuitive to use > existing publications when they exist, though we should clearly > document it. > That's why we have a "Migration to Version XY" in the release notes. ;) -- Euler Taveira EDB https://www.enterprisedb.com/
Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt
On 05.09.25 14:30, Peter Eisentraut wrote: On 29.08.25 14:48, Álvaro Herrera wrote: On 2025-Aug-29, jian he wrote: On Fri, Aug 29, 2025 at 5:46 AM Tom Lane wrote: WFM, although I think you could shorten it to "tables, materialized views, and foreign tables". We generally expect that partitioned tables are included when saying "tables", no? I'm not dead set on that either way, though. https://www.postgresql.org/docs/current/sql-copy.html use "COPY TO can be used only with plain tables, not views, and does not copy rows from child tables or child partitions" I'm inclined to think that we should only mention partitioned tables specifically when they for some reason deviate from what we do for regular tables, i.e., what Tom is saying. I don't think we've had an explicit, consistent rule for that thus far, so there may be places where we fail to follow it. Anyway, I have pushed the error message change. I think this message is still wrong. The check doesn't even look at the relation kind, which is what the message is implying. (If the message were about relkinds, then it should use errdetail_relkind_not_supported().) It checks that the from list entry is a table name instead of some other thing like VALUES or JOIN. So it should be something like CREATE STATISTICS only supports plain table names in the FROM clause I propose the attached patch to fix this. I think this restores the original meaning better. From 7c55d765dc1ec2f8a28e8ae1fe3551d32ed5a964 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 12 Sep 2025 15:30:16 +0200 Subject: [PATCH v4] CREATE STATISTICS: improve misleading error message The previous change (commit f225473cbae) was still not on target, because it talked about relation kinds, which are not what is being checked here. Provide a more accurate message. Discussion: https://postgr.es/m/CACJufxEZ48toGH0Em_6vdsT57Y3L8pLF=DZCQ_gCii6=c3m...@mail.gmail.com --- src/backend/tcop/utility.c | 5 ++--- src/test/regress/expected/stats_ext.out | 21 +++-- 2 files changed, 9 insertions(+), 17 deletions(-) diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c index 5f442bc3bd4..68605656b31 100644 --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@ -1873,9 +1873,8 @@ ProcessUtilitySlow(ParseState *pstate, if (!IsA(rel, RangeVar)) ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), -errmsg("cannot create statistics on the specified relation"), - errdetail("CREATE STATISTICS only supports tables, foreign tables and materialized views."))); + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("CREATE STATISTICS only supports relation names in the FROM clause"))); /* * CREATE STATISTICS will influence future execution plans diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out index a1f83b58b23..fdc0aa130bd 100644 --- a/src/test/regress/expected/stats_ext.out +++ b/src/test/regress/expected/stats_ext.out @@ -56,29 +56,22 @@ CREATE STATISTICS tst (unrecognized) ON x, y FROM ext_stats_test; ERROR: unrecognized statistics kind "unrecognized" -- unsupported targets CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo; -ERROR: cannot create statistics on the specified relation -DETAIL: CREATE STATISTICS only supports tables, foreign tables and materialized views. 
+ERROR: CREATE STATISTICS only supports relation names in the FROM clause CREATE STATISTICS tst ON a FROM foo NATURAL JOIN bar; -ERROR: cannot create statistics on the specified relation -DETAIL: CREATE STATISTICS only supports tables, foreign tables and materialized views. +ERROR: CREATE STATISTICS only supports relation names in the FROM clause CREATE STATISTICS tst ON a FROM (SELECT * FROM ext_stats_test) AS foo; -ERROR: cannot create statistics on the specified relation -DETAIL: CREATE STATISTICS only supports tables, foreign tables and materialized views. +ERROR: CREATE STATISTICS only supports relation names in the FROM clause CREATE STATISTICS tst ON a FROM ext_stats_test s TABLESAMPLE system (x); -ERROR: cannot create statistics on the specified relation -DETAIL: CREATE STATISTICS only supports tables, foreign tables and materialized views. +ERROR: CREATE STATISTICS only supports relation names in the FROM clause CREATE STATISTICS tst ON a FROM XMLTABLE('foo' PASSING 'bar' COLUMNS a text); -ERROR: cannot create statistics on the specified relation -DETAIL: CREATE STATISTICS only supports tabl
Re: Cannot find a working 64-bit integer type on Illumos
On 04.09.25 11:01, Peter Eisentraut wrote: On 03.09.25 17:04, Peter Eisentraut wrote: Consider a third-party extension that does something like dblink or postgres_fdw. It will compile against a server and also a libpq. The server and the libpq might not be of the same major version. (On Debian, only the latest libpq will be available.) If you have for example server version 17 and libpq version 18, then you will get the pg_int64 typedef both from postgres_ext.h (from the PG17 server includes) and from libpq-fe.h (from PG18 libpq). That is not allowed in C99, and even if it were, the underlying types of PG_INT64_TYPE (in PG17) and int64_t (in PG18) might be different (long int vs. long long int) and this would fail. I think this could be fixed by moving the definition of pg_int64 back to postgres_ext.h. Then extension builds would only get one definition, because of the header guards. Depending on include order, they could get a different underlying type, but that's a smaller problem, since the type is supposed to be deprecated anyway. Here is a patch that has been reported to fix the problem. I propose to go ahead with this patch in a few days if there are no other solutions coming.
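To illustrate why moving the typedef back behind postgres_ext.h's include guard avoids the duplicate-typedef problem, here is a toy, self-contained reproduction. The two guarded blocks stand in for the PG17 and PG18 copies of the header; the underlying types chosen for each "copy" are made up for the example.

#include <stdio.h>

/* stand-in for the PG17 server copy of postgres_ext.h */
#ifndef POSTGRES_EXT_H
#define POSTGRES_EXT_H
typedef long int pg_int64;			/* say PG_INT64_TYPE resolved to long int */
#endif

/* stand-in for the PG18 libpq copy of postgres_ext.h: same guard, so the
 * potentially conflicting typedef below is never compiled */
#ifndef POSTGRES_EXT_H
#define POSTGRES_EXT_H
typedef long long int pg_int64;
#endif

int
main(void)
{
	pg_int64	x = 42;

	printf("%lld\n", (long long) x);
	return 0;
}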
Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Hi, On 11/09/2025 16:28, Sami Imseih wrote: I prefer a separate thread for this, as it's an optimization of the existing behavior. OK, I removed my changes to CreatePortal. I left the test description fixes and added a proposed commit message. If no objections I will set it to ready for committer in the app. Kind regards, Mircea Cadariu From 268e2a5bac002ed32db4b2b8ab5dc4bb4f64c290 Mon Sep 17 00:00:00 2001 From: "Sami Imseih (AWS)" Date: Fri, 29 Aug 2025 17:19:34 + Subject: [PATCH v10] =?UTF-8?q?Fix=20incorrect=20query=20attribution=20in?= =?UTF-8?q?=20temporary=E2=80=90file=20logging?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously, in the extended query protocol with unnamed portals, the log line for temporary files would show the wrong STATEMENT: line (e.g. from a subsequent simple query) rather than the query that actually caused the temp file. This happened because the unnamed portal from the previous Bind was dropped after setting the new debug_query_string, so the logging triggered by PortalDrop used the new query text instead of the one that had generated the temporary file. This patch ensures that when an unnamed portal is dropped, it is associated with the correct query text by dropping the previous unnamed portal before assigning the debug_query_string. Author: Sami Imseih Author: Frédéric Yhuel Reviewed-by: Mircea Cadariu Discussion: https://www.postgresql.org/message-id/flat/3d07ee43-8855-42db-97e0-bad5db82d...@dalibo.com --- src/backend/tcop/postgres.c | 35 ++ src/test/modules/test_misc/meson.build| 1 + .../modules/test_misc/t/009_log_temp_files.pl | 115 ++ 3 files changed, 151 insertions(+) create mode 100644 src/test/modules/test_misc/t/009_log_temp_files.pl diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index d356830f75..d74e9e7bd1 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -148,6 +148,7 @@ static bool ignore_till_sync = false; * in order to reduce overhead for short-lived queries. */ static CachedPlanSource *unnamed_stmt_psrc = NULL; +static bool unnamed_portal = false; /* assorted command-line switches */ static const char *userDoption = NULL; /* -D switch */ @@ -182,6 +183,7 @@ static bool IsTransactionExitStmt(Node *parsetree); static bool IsTransactionExitStmtList(List *pstmts); static bool IsTransactionStmtList(List *pstmts); static void drop_unnamed_stmt(void); +static void drop_unnamed_portal(void); static void log_disconnections(int code, Datum arg); static void enable_statement_timeout(void); static void disable_statement_timeout(void); @@ -1024,6 +1026,12 @@ exec_simple_query(const char *query_string) booluse_implicit_block; charmsec_str[32]; + /* +* Drop the unnamed portal before setting debug_query_string, to avoid +* attributing messages from the drop (e.g., temp usage) to the new query. +*/ + drop_unnamed_portal(); + /* * Report query to various monitoring facilities. */ @@ -1676,6 +1684,12 @@ exec_bind_message(StringInfo input_message) errmsg("unnamed prepared statement does not exist"))); } + /* +* Same as exec_simple_query, drop the unnamed portal before setting +* debug_query_string. +*/ + drop_unnamed_portal(); + /* * Report query to various monitoring facilities. */ @@ -1757,10 +1771,14 @@ exec_bind_message(StringInfo input_message) * if the unnamed portal is specified. 
*/ if (portal_name[0] == '\0') + { portal = CreatePortal(portal_name, true, true); + unnamed_portal = true; + } else portal = CreatePortal(portal_name, false, false); + /* * Prepare to copy stuff into the portal's memory context. We do all this * copying first, because it could possibly fail (out-of-memory) and we @@ -5236,3 +5254,20 @@ disable_statement_timeout(void) if (get_timeout_active(STATEMENT_TIMEOUT)) disable_timeout(STATEMENT_TIMEOUT, false); } + +/* Drop the unnamed portal if one exists */ +static void +drop_unnamed_portal(void) +{ + Portal portal; + + if (!unnamed_portal) + return; + + /* Get the portal and drop it */ + portal = GetPortalByName(""); + if (PortalIsValid(portal)) + PortalDrop(portal, false); + + unnamed_portal = false; +} diff --git a/src/test/modules/test_misc/meson.build b/src/test/modules/test_misc/meson.build index 6b1e730bf4..f258bf1ccd 100644 --- a/src/test/modules/test_misc/meson.build +++ b/src/test/modules/test_misc/meson.build @@ -17,6 +17,7 @@ tests += { 't/006_signal_autovacuum.pl', 't/007_catcache_inval.pl',
Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt
Peter Eisentraut writes: > I propose the attached patch to fix this. I think this restores the > original meaning better. I'm okay with this wording change, but I would stay with ERRCODE_FEATURE_NOT_SUPPORTED rather than calling this a "syntax error". It's not a syntax error IMV, but a potential feature that we have deliberately left syntax space for, even though we don't yet have ideas about a workable implementation. regards, tom lane
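Putting the two suggestions together, Peter's rewording with the original errcode retained, the report might end up looking roughly like this (a sketch of where the thread seems to be converging, not the committed change):

ereport(ERROR,
		(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
		 errmsg("CREATE STATISTICS only supports relation names in the FROM clause")));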
Re: split func.sgml to separated individual sgml files
Hi, On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan wrote: > > Ah, you’re right, but then again, I’d expect ALL_SGML to be used > consistently, but it isn't and I didn't check. > v3 does that. > Note that GENERATED_SGML where'te included in these two targets but I think > there's no harm in checking them too. > > Do we actually care about those? I don't want to add needless cycles > anywhere. I note that the meson.build doesn't appear to have a check target > at all, or anything that looks for hard tabs or nbsps.Those checks were added > to the Makefile back in October in commit 5b7da5c261d, but that got missed > even though Daniel had mentioned it in the discussion thread.[1] I have been working on running these checks under the Meson build system. To do this, I converted the checks into a Perl script (sgml_syntax_check) and ran it against both the Makefile and Meson. Test's name is 'sgml_syntax_check' in the Meson. One difference I noticed: I could not find a way in Meson to create a test that does not run by default. As a result, this syntax test runs every time you run the 'meson test'. This behaviour differs from Autoconf, but I think it is acceptable. Additionally, some of the CI OSes were missing docbook-xml; but it has now been installed. I did not create a new thread for that, I can create one if you think that it would be better. CI run with the attached patch applied: https://cirrus-ci.com/build/6610354173640704 -- Regards, Nazir Bilal Yavuz Microsoft From 27ab61775945d837e37ed6a0ce0c301697d183a1 Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz Date: Mon, 8 Sep 2025 17:16:05 +0300 Subject: [PATCH v1] Add sgml_syntax_check test to the Meson build The 'sgml' check from the Makefile has been converted into a Perl script (sgml_syntax_check) and integrated into meson.build. Unlike Autoconf, Meson does not provide a way to mark tests as non-default, so this script runs on every 'meson test'. While this differs from the previous behavior, it is considered acceptable. --- doc/src/sgml/Makefile | 16 +--- doc/src/sgml/meson.build| 23 ++ doc/src/sgml/t/sgml_syntax_check.pl | 118 .cirrus.tasks.yml | 3 + 4 files changed, 146 insertions(+), 14 deletions(-) create mode 100755 doc/src/sgml/t/sgml_syntax_check.pl diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile index 11aac913812..3256340a5b2 100644 --- a/doc/src/sgml/Makefile +++ b/doc/src/sgml/Makefile @@ -200,8 +200,8 @@ MAKEINFO = makeinfo ## # Quick syntax check without style processing -check: postgres.sgml $(ALL_SGML) check-tabs check-nbsp - $(XMLLINT) $(XMLINCLUDE) --noout --valid $< +check: postgres.sgml $(ALL_SGML) + $(PERL) $(srcdir)/t/sgml_syntax_check.pl --xmllint "$(XMLLINT)" --srcdir $(srcdir) ## @@ -261,18 +261,6 @@ clean-man: endif # sqlmansectnum != 7 -# tabs are harmless, but it is best to avoid them in SGML files -check-tabs: - @( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \ - (echo "Tabs appear in SGML/XML files" 1>&2; exit 1) - -# Non-breaking spaces are harmless, but it is best to avoid them in SGML files. -# Use perl command because non-GNU grep or sed could not have hex escape sequence. 
-check-nbsp: - @ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \ - $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.xsl) ) || \ - (echo "Non-breaking spaces appear in SGML/XML files" 1>&2; exit 1) - ## ## Clean ## diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build index 6ae192eac68..89d8b01c944 100644 --- a/doc/src/sgml/meson.build +++ b/doc/src/sgml/meson.build @@ -306,3 +306,26 @@ endif if alldocs.length() != 0 alias_target('alldocs', alldocs) endif + +sgml_syntax_check = files( + 't/sgml_syntax_check.pl' +) + +test( + 'sgml_syntax_check', + perl, + protocol: 'exitcode', + suite: 'doc', + args: [ +sgml_syntax_check, +'--xmllint', + '@0@ --nonet'.format(xmllint_bin.full_path()), +'--srcdir', + meson.current_source_dir(), +'--builddir', + meson.current_build_dir(), + ], + depends: doc_generated +) + +testprep_targets += doc_generated diff --git a/doc/src/sgml/t/sgml_syntax_check.pl b/doc/src/sgml/t/sgml_syntax_check.pl new file mode 100755 index 000..7ff1d9a7c26 --- /dev/null +++ b/doc/src/sgml/t/sgml_syntax_check.pl @@ -0,0 +1,118 @@ +# /usr/bin/perl + +# doc/src/sgml/sgml_syntax_check.pl + +use strict; +use warnings FATAL => 'all'; +use Getopt::Long; + +use File::Find; + +my $xmllint; +my $srcdir; +my $builddir; + +GetOptions( + 'xmllint:s' => \$xmllint, + 'srcdir:s' => \$srcdir, + 'builddir:s' => \$builddir) or die "$0: wrong arguments"; + +die "$0: --srcdir must be specified\n" unless defined $srcdir; + +my $postgres_sgml = "postgres.sgml"; +my $xmlinclude = "--path . --path $srcdir"; +
Re: [PATCH] jit: fix build with LLVM-21
On 2025-09-12 14:47, Holger Hoffstätte wrote: I temporarily removed the __arch64__ guard and can reproduce the above error, using either gcc-15 or clang-21. It seems this llvm monkey patch backport thing is affected by the following commit in llvm-21: https://github.com/llvm/llvm-project/commit/cd585864c0bbbd74ed2a2b1ccc191eed4d1c8f90 Trying to figure out how to adapt the code.. Try the attached patch on your homebrew setup. This compiles and passes "make check", though I do not think it runs any jit tests. So intead I dropped this into my 17.6, disabled the __arch64__ guard to make sure I get the backport class, rebuilt and ran the jit on/off example from the docs. This showed the expected performance difference with jit=on vs. off. cheers Holger >From da8c15ef4f8dbfe931fe2700d849f49078bfca97 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Holger=20Hoffst=C3=A4tte?= Date: Fri, 12 Sep 2025 16:01:15 +0200 Subject: [PATCH] jit: update aarch64-only use of SectionMemoryManager for LLVM-21 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit LLVM-21 changed the invocation of RTDyldObjectLinkingLayer, breaking compilation with the backported SectionMemoryManager for aarch64. Introduce a version guard for the newly passed parameter. Signed-off-by: Holger Hoffstätte --- src/backend/jit/llvm/llvmjit_wrap.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/backend/jit/llvm/llvmjit_wrap.cpp b/src/backend/jit/llvm/llvmjit_wrap.cpp index da850d67ab6..d98e3d0d900 100644 --- a/src/backend/jit/llvm/llvmjit_wrap.cpp +++ b/src/backend/jit/llvm/llvmjit_wrap.cpp @@ -53,7 +53,14 @@ DEFINE_SIMPLE_CONVERSION_FUNCTIONS(llvm::orc::ObjectLayer, LLVMOrcObjectLayerRef LLVMOrcObjectLayerRef LLVMOrcCreateRTDyldObjectLinkingLayerWithSafeSectionMemoryManager(LLVMOrcExecutionSessionRef ES) { +#if LLVM_VERSION_MAJOR >= 21 + return wrap(new llvm::orc::RTDyldObjectLinkingLayer( + *unwrap(ES), [](const llvm::MemoryBuffer &) { + return std::make_unique(nullptr, true); + })); +#else return wrap(new llvm::orc::RTDyldObjectLinkingLayer( *unwrap(ES), [] { return std::make_unique(nullptr, true); })); +#endif } #endif -- 2.51.0
Re: Issue with logical replication slot during switchover
Thanks Shveta for all your comments. I'll prepare a version 2 of the patch including your proposals. Regards Fabrice On Thu, 11 Sep 2025, 07:42 shveta malik wrote: > On Sun, Sep 7, 2025 at 1:32 PM Fabrice Chapuis > wrote: > > > > Hi, > > Here the version 1 of the patch with the modifications. > > I do a git diff --check until there are no more errors. > > In a next version of the patch, I think I have make change to call > ReplicationSlotCreate() function with the value of the flag > allow_overwrite be consistent > > In which flow? I see all ReplicationSlotCreate() references already > accepting the value. > > > and modify the interface ReplicationSlotCreate to display the attribute > allow_overwrite. > > > > Do you mean to modify pg_replication_slots? > > Thank You for the patch. Please find a few comments: > > 1) > In both create_physical_replication_slot() and > create_physical_replication_slot(): > + false, false,false); > > ,false --> , false (space after comma is recommended) > > 2) > + elog(DEBUG1, "logical replication slot %s created with option > allow_overwrite to %s", > + NameStr(slot->data.name), slot->data.allow_overwrite ? "true" : > "false"); > > IMO, we don't need this as we do not log other properties of the slot as > well. > > 3) > We can have pg_replication_slots (pg_get_replication_slots) modified > to display the new property. Also we can modify docs to have the new > property defined. > > 4) > + { > + /* Check if we need to overwrite an existing logical slot */ > + if (allow_overwrite) > + { > + /* Get rid of a replication slot that is no longer wanted */ > + ReplicationSlotDrop(remote_slot->name,true); > + > + /* Get rid of a replication slot that is no longer wanted */ > + ereport(WARNING, > + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("slot \"%s\" already exists" > + " on the standby but it will be dropped because " > + "flag allow_overwrite is set to true", > + remote_slot->name)); > + > + /* Going back to the main loop after droping the failover slot */ > + return false; > > Currently we are returning after dropping the slot. But I think we > shall drop the slot and proceed with new slot creation of the same > name otherwise we may be left with old slot dropped and no new slot > created specially when API is run. > > Scenario: > --On primary, create 2 slots: slot1 and slot2. > --On standby, create one slot slot1 with allow_overwrite=true. > --Run 'SELECT pg_sync_replication_slots();' on standby. > > At the end of API, expectation is that both slots will be present with > 'synced'=true, but only slot2 is present. if we run this API next > time, slot1 is created. It should have been dropped and recreated (or > say overwritten) in the first run itself. > > 5) > IMO, LOG is sufficient here, as the action aligns with the > user-provided 'allow_overwrite' setting. > > thanks > Shveta >
Re: ABI Compliance Checker GSoC Project
On 12.09.25 16:52, David E. Wheeler wrote: On Sep 12, 2025, at 10:37, Peter Eisentraut wrote: This was a change that was intentionally backpatched in a different way in order to preserve ABI compatibility. Compare commits 344662848ac on REL_18_STABLE and 0b934d3994f on REL_17_STABLE. So everything is in order. :) Excellent! But an example like this presumably helps make Tom’s case that each branch could have a file that suggests which commit to use as the base for comparison, so that in this example it could be set to 344662848ac and the failures would go away. As it is, they will persist until a new tag is added or one overrides the base in the build farm client config. I don't think we need any ABI checking until there is a dot-0 release, so I don't agree that a facility like that is needed. Just compare against the previous release tag.
Re: ABI Compliance Checker GSoC Project
On Sep 12, 2025, at 11:37, Peter Eisentraut wrote: > I don't think we need any ABI checking until there is a dot-0 release, so I > don't agree that a facility like that is needed. Hrm. I wonder how best to filter out a branch without a dot-0 tag. > Just compare against the previous release tag. Yep, that’s what it’s doing. Best, David signature.asc Description: Message signed with OpenPGP
Re: plan shape work
On Fri, Sep 12, 2025 at 11:08 AM Tom Lane wrote: > I was arguing for labeling plan nodes according to which OJ they > start (always a unique relid). Richard was arguing for labeling > according to which OJ(s) they complete (zero, one, or more relids). I agree with everything in your reply up to and including this part. > But I now think it's probably worth doing both. We need the > completion bitmapsets if we want to cross-check Var nullingrels, > because those correspond to the nullingrels that should get added > at each join's output. I think that we also want the start labels > though. For one thing, if the start nodes are not identified, > it's impossible to understand how much of the tree is the "no > man's land" where a C variable may or may not have gone to null > on its way to becoming a C* variable. But in general I think > that we'll want to be able to identify an outer-join plan node > even if it does not complete its OJ. So, it looks to me like the way this works today is that join_is_legal() figures out the relevant SpecialJoinInfo and then add_outer_joins_to_relids() decides what to add to the joinrel's relid set. So I think, though I am not quite sure, that if somewhere around that point in the code we copied sjinfo->ojrelid into the RelOptInfo, and then propagated that through to the final plan, that might be what you're looking for here. However, that assumes that the choice of SpecialJoinInfo is fixed for all possible ways of constructing a given joinrel, which I think might not be true. In Richard's test case, one simply can't go wrong, because the lower join only draws from two baserels. But in a case like A LJ B ON A.x = B.x LJ C ON B.y = C.y LJ D ON B.z = D.z, the B-C-D joinrel could presumably be constructed by joining either B-D to C or B-C to D, and I'm guessing that will result in a different choice of SpecialJoinInfo in each case. That would mean that the ojrelid has to be per-Path rather than per-RelOptInfo. While in theory that's fine, it sounds expensive. I was hoping that we could piggyback mostly on existing calculations here, exposing data we already have instead of calculating new things. If we want to expose the starting-the-outer-join-RTI value that you want here, it seems like we're going to have to redo some of the join_is_legal() work and some of the add_outer_joins_to_relids() work for every path. I'm skeptical about expending those cycles. It's not clear to me that I need to care about outer join RTIs at all for what I'm trying to do -- focusing where RTIs originating from baserels end up in the final plan tree is, as far as I can see, completely adequate. There might be other people who want to do other things that would benefit more from seeing that stuff, though. I'm not against exposing information that is easily calculated and might be useful to somebody, even if it's just for planner debugging. But it seems to me that what you're asking here might be going quite a bit further than that. So my counter-proposal is that this patch set should either (1) expose nothing at all about join RTIs because I don't have a need for them or (2) expose the join RTIs completed at a certain level because that's easily calculated from the data we already have; and if you want to later also expose the single join RTI started at a certain level for a varnullingrels cross-check or any other purpose, then you can propose a patch for that. Alternatively, if you want to edit my 0003 patch to work the way you think it should, cool. 
Or if you can describe what you think it should do, I'm somewhat willing to try implementing that, but that's definitely not such a great choice from my perspective. I'm afraid that I'm getting pulled a bit further down the garden path than I really want to go trying to satisfy your desire to perform a cross-check that I don't really understand and/or expose information for which I don't see a clear need in my own work. What I need is for all the baserels that appear in the final Plan tree to be properly labelled. -- Robert Haas EDB: http://www.enterprisedb.com
Re: [WIP]Vertical Clustered Index (columnar store extension) - take2
Hello Peter! Thanks for updates. Here is a small fix for clang "variable set but not used" warnings. -- Regards, Timur Magomedov From 8c17b18dc0c15ae87945169a85c3ee944fc815b7 Mon Sep 17 00:00:00 2001 From: Timur Magomedov Date: Fri, 12 Sep 2025 17:09:50 +0300 Subject: [PATCH] Fixed "set but not used" clang warnings --- contrib/vci/executor/vci_vector_executor.c | 4 contrib/vci/storage/vci_fetch.c| 3 --- 2 files changed, 7 deletions(-) diff --git a/contrib/vci/executor/vci_vector_executor.c b/contrib/vci/executor/vci_vector_executor.c index daf475b964e..d89988b4e75 100644 --- a/contrib/vci/executor/vci_vector_executor.c +++ b/contrib/vci/executor/vci_vector_executor.c @@ -1792,7 +1792,6 @@ static vci_vp_item_id vci_add_control_nodes(VciVPExecOp_func head_func, VciVPExecOp_func next_func, List *args, Expr *expr, PlanState *parent, ExprContext *econtext, VciVPContext *vpcontext, uint16 *skip_list) { - int i; vci_vp_item_id ret; ListCell *l; Datum *itemValue = palloc(sizeof(Datum) * VCI_MAX_FETCHING_ROWS); @@ -1807,7 +1806,6 @@ vci_add_control_nodes(VciVPExecOp_func head_func, VciVPExecOp_func next_func, Li head_node->itemIsNull = itemIsNull; head_node->data.init.orig_skip_list = skip_list; - i = 0; foreach(l, args) { Expr *arg = (Expr *) lfirst(l); @@ -1820,8 +1818,6 @@ vci_add_control_nodes(VciVPExecOp_func head_func, VciVPExecOp_func next_func, Li next_node = &vpcontext->itemNode[ret]; next_node->itemValue = itemValue; next_node->itemIsNull = itemIsNull; - - i++; } return ret; diff --git a/contrib/vci/storage/vci_fetch.c b/contrib/vci/storage/vci_fetch.c index 80e497c4628..588ae59fa43 100644 --- a/contrib/vci/storage/vci_fetch.c +++ b/contrib/vci/storage/vci_fetch.c @@ -1479,7 +1479,6 @@ FillFixedWidthCopyBody1(char *dstData, int numRows) { int aId; - int numWrite = 0; BlockNumber bNumCur = startBN; int64 offset = startOf; Buffer buffer = InvalidBuffer; @@ -1516,7 +1515,6 @@ FillFixedWidthCopyBody1(char *dstData, #endif /* #ifdef WORDS_BIGENDIAN */ dstData += stepDstData; offset += dataWidth; - ++numWrite; } aId = maxBId; @@ -1547,7 +1545,6 @@ FillFixedWidthCopyBody1(char *dstData, #endif /* #ifdef WORDS_BIGENDIAN */ dstData += stepDstData; offset = nextOffset; - ++numWrite; ++aId; } else -- 2.43.0
Re: plan shape work
Robert Haas writes: > So, it looks to me like the way this works today is that > join_is_legal() figures out the relevant SpecialJoinInfo and then > add_outer_joins_to_relids() decides what to add to the joinrel's relid > set. So I think, though I am not quite sure, that if somewhere around > that point in the code we copied sjinfo->ojrelid into the RelOptInfo, > and then propagated that through to the final plan, that might be what > you're looking for here. Not the RelOptInfo, but the Path. I have not looked at details, but it might be necessary to identify these labels at Path construction time rather than reconstructing them during createplan.c. I'd rather not, because bloating Path nodes with hopefully-redundant information isn't attractive. But if it turns out that createplan.c doesn't have enough information then we might have to. I'm also thinking that this notion of starting/completing OJs might be useful in its own right to clarify and even simplify some of the Path manipulations we do. But that will require reviewing the code. > So my counter-proposal is that this patch set should either (1) expose > nothing at all about join RTIs because I don't have a need for them or > (2) expose the join RTIs completed at a certain level because that's > easily calculated from the data we already have; and if you want to > later also expose the single join RTI started at a certain level for a > varnullingrels cross-check or any other purpose, then you can propose > a patch for that. If what you want to do is only interested in baserel RTIs, then I think we should leave outer join RTIs out of the discussion for the present. I was not looking to make you do work you aren't interested in. I reserve the right to do said work later ... regards, tom lane
Re: Incorrect logic in XLogNeedsFlush()
On Fri, 2025-09-12 at 14:04 +0900, Michael Paquier wrote: > On Fri, Sep 12, 2025 at 10:21:27AM +0530, Dilip Kumar wrote: > > Yeah, asserting it at the end makes sense, as we can ensure that > > XLogFlush() and XLogNeedsFlush() agree on the same thing without > > adding additional overhead. Here is the revised one. > > Melanie and Jeff, what do you think? IIUC: during the end-of-recovery checkpoint, a hypothetical call to XLogNeedsFlush() would have incorrectly used the LocalMinRecoveryPoint; and with the change, it would correctly use the flush pointer. That wasn't a live bug because XLogNeedsFlush() was never actually called during the end-of-recovery checkpoint. CreateCheckPoint() called XLogFlush() directly, which correctly checked the flush pointer. But, hypothetically, if that code had logic based around XLogNeedsFlush() before calling XLogFlush(), that would have been a bug. So this fix has no behavior change, but it would make the code more resilient against changes to CreateCheckPoint(), more consistent, and less confusing. Is that right? Assuming I'm right so far, then I agree with changing the test in XLogNeedsFlush() to !XLogInsertAllowed(), as the patch does. The comment in the patch is describing what's happening, but lost the "why". Perhaps something like: "During recovery, we don't flush WAL but update minRecoveryPoint instead. So "needs flush" is taken to mean whether minRecoveryPoint would need to be updated. The end-of-recovery checkpoint still must flush WAL like normal, though, so check !XLogInsertAllowed(). This check must be consistent with XLogFlush()." The commit message is vague about whether there's a live bug or not; I suggest clarifying that the inconsistency between the two functions could have been a bug but wasn't. Also, I generally use past tense for the stuff that's being fixed and present tense for what the patch does. One loose end: there was some discussion about the times when LocalMinRecoveryPoint is valid/correct. I'm not sure I entirely understood -- is that no longer a concern? Regards, Jeff Davis
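To make the proposed shape concrete, here is a much-simplified sketch of XLogNeedsFlush() with the changed test and the suggested comment wording. The real function also refreshes LocalMinRecoveryPoint and the flush pointer from shared memory and handles the invalid-minRecoveryPoint case; those details are omitted, so treat this as illustrative only.

bool
XLogNeedsFlush(XLogRecPtr record)
{
	/*
	 * During recovery, we don't flush WAL but update minRecoveryPoint
	 * instead.  So "needs flush" is taken to mean whether minRecoveryPoint
	 * would need to be updated.  The end-of-recovery checkpoint still must
	 * flush WAL like normal, though, so check !XLogInsertAllowed() rather
	 * than RecoveryInProgress().  This check must be consistent with
	 * XLogFlush().
	 */
	if (!XLogInsertAllowed())
		return record > LocalMinRecoveryPoint;	/* simplified */

	/* normal running: compare against the known flush pointer */
	return record > LogwrtResult.Flush;	/* simplified */
}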
Re: ABI Compliance Checker GSoC Project
On 08.09.25 17:17, David E. Wheeler wrote: The RC1 change surprised me a little; here’s the log: Leaf changes summary: 1 artifact changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 1 Changed, 0 Added function Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable 1 function with some sub-type change: [C] 'function void CheckValidResultRel(ResultRelInfo*, CmdType, List*)' has some sub-type changes: parameter 4 of type 'List*' was added parameter 3 of type 'List*' changed: entity changed from 'List*' to 'typedef OnConflictAction' type size changed from 8 to 4 (in bytes) type alignment changed from 0 to 4 Presumably this is expected, but it looks like it might be an issue if it weren’t a pre-release change, yes? This was a change that was intentionally backpatched in a different way in order to preserve ABI compatibility. Compare commits 344662848ac on REL_18_STABLE and 0b934d3994f on REL_17_STABLE. So everything is in order. :)
Re: PostgreSQL 18 GA press release draft
On Tue, 2025-09-09 at 23:13 -0400, Jonathan S. Katz wrote: > Hi, > > Attached is a draft of the PostgreSQL 18 GA press release. A few > comments before the ask for reviewing: "This release adds the [`PG_UNICODE_FAST`], which accelerates lookups using the `upper` and `lower` string comparison functions, and helps speed up the new [`casefold`] function for case-insensitive comparisons." Use the link: https://www.postgresql.org/docs/18/collation.html#COLLATION-MANAGING-STANDARD which is a bit more descriptive. (Though perhaps that document could be improved with examples.) What's actually new with PG_UNICODE_FAST are the "full" Unicode semantics for case transformations, e.g. it uppercases 'ß' to 'SS'. The rest is the same as PG_C_UTF8, which uses "simple" Unicode semantics. The name PG_UNICODE_FAST is meant to convey that it provides full Unicode semantics for case mapping and pattern matching, while also being fast because it uses memcmp for comparisons. While the name PG_C_UTF8 is meant to convey that it's closer to what users of the libc "C.UTF-8" locale might expect. Regards, Jeff Davis
Re: ABI Compliance Checker GSoC Project
Peter Eisentraut writes: > On 12.09.25 16:52, David E. Wheeler wrote: >> Excellent! But an example like this presumably helps make Tom’s case that >> each branch could have a file that suggests which commit to use as the base >> for comparison, so that in this example it could be set to 344662848ac and >> the failures would go away. As it is, they will persist until a new tag is >> added or one overrides the base in the build farm client config. > I don't think we need any ABI checking until there is a dot-0 release, > so I don't agree that a facility like that is needed. Just compare > against the previous release tag. My concern is that when we get a report, we might decide to apply some fix to remove the ABI delta, or we might decide it's intentional and/or harmless and leave the code as-is. In the latter case, the ABI-checking BF animal(s) will be red until the next release is made, which helps nobody, particularly because it will probably interfere with noticing any ABI breaks in later commits. So I think we do need a way to move a branch's baseline that is not rigidly tied to making a release. The only way we could possibly use this testing methodology without that is if the ABI checker agrees 100.00% with our opinion of what is or is not a breaking change. Which we already see isn't so. A concrete example is the hack we frequently use of sandwiching a new field into what had been padding space. I would be astonished if an ABI checker doesn't flag that, but the checker's output doesn't make it not OK. I'm not wedded to any particular way of implementing that, though I'd prefer to avoid basing it on git tags given that they can't readily be deleted or moved. regards, tom lane
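As a concrete illustration of that padding hack (with a made-up struct, not any real PostgreSQL node): the back-patched field slots into bytes that were previously alignment padding, so the size and the offsets of pre-existing fields stay the same and extensions compiled against the old layout keep working, yet a field-by-field ABI checker will still report a change.

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* released layout: seven padding bytes follow "flag" on a typical LP64 target */
typedef struct Widget_v1
{
	int64_t		id;
	bool		flag;
	int64_t		count;
} Widget_v1;

/* back-patched layout: new field tucked into the former padding */
typedef struct Widget_v2
{
	int64_t		id;
	bool		flag;
	uint8_t		extra;			/* new field occupies one old padding byte */
	int64_t		count;
} Widget_v2;

int
main(void)
{
	/* size and existing-field offsets are unchanged */
	static_assert(sizeof(Widget_v1) == sizeof(Widget_v2), "size unchanged");
	static_assert(offsetof(Widget_v1, count) == offsetof(Widget_v2, count),
				  "offset of pre-existing field unchanged");
	return 0;
}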
Avoid resource leak (src/test/modules/test_binaryheap/test_binaryheap.c)
Hi. Per Coverity. Coverity reports this resource leak in test_binaryheap module. I think that is right. Trivial patch attached. best regards, Ranier Vilela avoid-resource-leak-test_binaryheap.patch Description: Binary data
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext
Álvaro Herrera wrote: > On 2025-Sep-12, Antonin Houska wrote: > > > Euler Taveira wrote: > > > > > Interesting. I'm wondering that if this patch is applied we could remove > > > the > > > following code [...] from pg_logical_slot_get_changes_guts and > > > LogicalSlotAdvanceAndCheckSnapState functions too. IIUC the referred > > > code is a band-aid that will be improved someday. > > > > Even though we're fixing the likely reason of this problem, we cannot be > > 100% > > sure that no other problem like this still exists. So I'd not remove this > > assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front > > of > > that, and adjust the comment? > > Yeah, I'm going to risk removing it, because if we don't do it now, > we're never going to do it. We can mitigate the risk of missing > remaining bugs by having that assertion you suggest, so that if anyone > actually hits a problem here, we'll know soon enough. > > I have pushed it with that change. I'll also add an open item for pg19 > so that we remember to come back to remove the assertions, if we feel we > no longer need them. I was worried about removing those workarounds because it was not trivial to diagnose the issue. But it should be ok for the master branch. Thanks. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput
On Wed, Sep 10, 2025 at 12:00 PM Arseniy Mukhin wrote: > > > 6) We have fixed size queue entries, so I think we don't need this > "padding" logic at the end of the page anymore, because we know how > many entries we can have on each page. +1 Probably we also no longer need to have a pair of page number and offset number as an entry queue's position since AsyncQueueEntry is now a fixed size. > BTW, what do you think about creating a separate thread for the patch? > The current thread's subject seems a bit irrelevant. +1 Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
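A tiny sketch of what both points could look like once entries are fixed size. The struct layout and names here are invented for illustration; the real AsyncQueueEntry definition lives in commands/async.c and may differ.

#include <stdint.h>

/* invented fixed-size entry, for illustration only */
typedef struct QueueEntrySketch
{
	int32_t		dboid;
	int32_t		xid;
	int32_t		srcPid;
	char		data[116];		/* channel name plus payload, fixed size */
} QueueEntrySketch;

#define QUEUE_PAGESIZE		8192
#define ENTRIES_PER_PAGE	(QUEUE_PAGESIZE / sizeof(QueueEntrySketch))

/*
 * With every entry the same size there is no end-of-page padding to skip,
 * and a queue position can be a single 64-bit counter rather than a
 * (page, offset) pair: the page and slot fall out arithmetically.
 */
static inline uint64_t queue_pos_page(uint64_t pos) { return pos / ENTRIES_PER_PAGE; }
static inline uint64_t queue_pos_slot(uint64_t pos) { return pos % ENTRIES_PER_PAGE; }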
Re: split func.sgml to separated individual sgml files
On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote: Hi, On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan wrote: Ah, you’re right, but then again, I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check. v3 does that. Note that GENERATED_SGML where'te included in these two targets but I think there's no harm in checking them too. Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't appear to have a check target at all, or anything that looks for hard tabs or nbsps.Those checks were added to the Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the discussion thread.[1] I have been working on running these checks under the Meson build system. Thanks for this! To do this, I converted the checks into a Perl script (sgml_syntax_check) and ran it against both the Makefile and Meson. Test's name is 'sgml_syntax_check' in the Meson. One difference I noticed: I could not find a way in Meson to create a test that does not run by default. As a result, this syntax test runs every time you run the 'meson test'. This behaviour differs from Autoconf, but I think it is acceptable. Yes, I think so too. Additionally, some of the CI OSes were missing docbook-xml; but it has now been installed. I did not create a new thread for that, I can create one if you think that it would be better. CI run with the attached patch applied: https://cirrus-ci.com/build/6610354173640704 I am away this coming week, will check it out in detail when I return. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: PostgreSQL 18 GA press release draft
On Fri, Sep 12, 2025 at 8:46 AM Jonathan S. Katz wrote: > > On 9/9/25 11:13 PM, Jonathan S. Katz wrote: > > Hi, > > > > Attached is a draft of the PostgreSQL 18 GA press release. A few > > comments before the ask for reviewing: > > > > * I still need to write up the theme and the quote, which I'll provide > > tomorrow after hearing the first round of feedback > > * The release is a bit long - this is due in part to what I think is a > > lot of really good features in PostgreSQL 18. "Something for everyone" > > > > Please review for the press release for the following: > > > > * Correctness and accuracy of what's described - we want to be > > technically correct, but we also need to explain why something matters > > to our users. > > * Notable omissions (apologies if I missed something, but also note not > > everything going to make the cut). > > * Features to exclude. > > * How easy phrases are to translate / confusing phrases > > * Typos. > > > > Like previous years, I won't really look at stylistic remarks, unless it > > falls into the "making it simpler to translate / confusing phrase" bucket. > > > > After 2025-09-14 0:00 UTC I won't be able to accept more feedback, as > > we'll need to begin the translation process. Anyone who provides > > accepted feedback will also get a mention in the press release "thank > > you" page[1], which stays up for about a year ;) > > Thanks for all of your feedback thus far! Attached is the next version > of the draft. This incorporates the feedback, and now includes the > thematic statement and a draft quote. > > As per above, I'll need remaining feedback no later than 2025-09-14 0:00 > UTC - after this, it's frozen for the translation process to begin. > I found a typo; s/gen_rand_uuid()/gen_random_uuid()/ Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: PostgreSQL 18 GA press release draft
On Fri, Sep 12, 2025 at 10:11:59AM -0700, Jeff Davis wrote: > The name PG_UNICODE_FAST is meant to convey that it provides full > unicode semantics for case mapping and pattern matching, while also > being fast because it uses memcmp for comparisons. While the name > PG_C_UTF8 is meant to convey that it's closer to what users of the libc > "C.UTF-8" locale might expect. How does one do form-insensitive comparisons? Nico --
Re: pg_waldump: support decoding of WAL inside tarfile
Here are some review comments on v3-0004: In general, I think this looks pretty nice, but I think it needs more cleanup and polishing. There doesn't seem to be any reason for astreamer_waldump_content_new() to take an astreamer *next argument. If you look at astreamer.h, you'll see that some astreamer_BLAH_new() functions take such an argument, and others don't. The ones that do forward their input to another astreamer; the ones that don't, like astreamer_plain_writer_new(), send it somewhere else. AFAICT, this astreamer is never going to send its output to another astreamer, so there's no reason for this argument. I'm also a little confused by the choice of the name astreamer_waldump_content_new(). I would have thought this would be something like astreamer_waldump_new() or astreamer_xlogreader_new(). The word "content" doesn't seem to me to be adding much here, and it invites confusion with the "content" callback. I think you can merge setup_astreamer() into init_tar_archive_reader(). The only other caller is verify_tar_archive(), but that does exactly the same additional steps as init_tar_archive_reader(), as far as I can see. The return statement for astreamer_wal_read is really odd: + return (count - nbytes) ? (count - nbytes) : -1; Since 0 is false in C, this is equivalent to: count != nbytes ? count - nbytes : -1, but it's a strange way to write it. What makes it even stranger is that it seems as though the intention here is to count the number of bytes read, but you do that by taking the number of bytes requested (count) and subtracting the number of bytes we didn't manage to read (nbytes); and then you just up and return -1 instead of 0 whenever the answer would have been zero. This is all lacking in comments and seems a bit more confusing than it needs to be. So my suggestions are: 1. Consider redefining nbytes to be the number of bytes that you have read instead of the number of bytes you haven't read. So the loop in this function would be while (nbytes < count) instead of while (nbytes > 0). 2. If you need to map 0 to -1, consider having the caller do this instead of putting that inside this function. 3. Add a comment saying what the return value is supposed to be". If you do both 1 and 2, then the return statement can just say "return nbytes;" and the comment can say "Returns the number of bytes successfully read." I would suggest changing the name of the variable from "readBuff" to "readBuf". There are no existing uses of readBuff in the code base. I think this comment also needs improvement: + /* +* Ignore existing data if the required target page has not yet been +* read. +*/ + if (recptr >= endPtr) + { + len = 0; + + /* Reset the buffer */ + resetStringInfo(astreamer_buf); + } This comment is problematic for a few reasons. First, we're not ignoring the existing data: we're throwing it out. Second, the comment doesn't say why we're doing what we're doing, only that we're doing it. Here's my guess at the actual explanation -- please correct me if I'm wrong: "pg_waldump never reads the same WAL bytes more than once, so if we're now being asked for data beyond the end of what we've already read, that means none of the data we currently have in the buffer will ever be consulted again. So, we can discard the existing buffer contents and start over." By the way, if this explanation is correct, it might be nice to add an assertion someplace that verifies it, like asserting that we're always reading from an LSN greater than or equal to (or exactly equal to?) 
the LSN immediately following the last data we read. In general, I wonder whether there's a way to make the separation of concerns between astreamer_wal_read() and TarWALDumpReadPage() cleaner. Right now, the latter is basically a stub, but I'm not sure that is the best thing here. I already mentioned one example of how to do this: make the responsibility for 0 => -1 translation the job of TarWALDumpReadPage() rather than astreamer_wal_read(). But I think there might be a little more we can do. In particular, I wonder whether we could say that astreamer_wal_read() is only responsible for filling the buffer, and the caller, TarWALDumpReadPage() in this case, needs to empty it. That seems like it might produce a cleaner separation of duties. Another thing that isn't so nice right now is that verify_tar_archive() has to open and close the archive only for init_tar_archive_reader() to be called to reopen it again just moments later. It would be nicer to open the file just once and then keep it open. Here again, I wonder if the separation of duties could be a bit cleaner. Is there a real need to pass XLogDumpPrivate to astreamer_wal_read or astreamer_archive_read? The only things that they need are archive_fd, archive_name, archive_streamer, archive_streamer_
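To make suggestions 1 through 3 above concrete, here is a rough, self-contained sketch of the proposed loop shape: count bytes read upward, return that count, and leave any 0-to-minus-1 translation to the caller. The callback type and all names below are stand-ins, not the patch's actual API.

#include <stddef.h>

typedef int (*wal_source_cb) (char *dst, size_t len);	/* hypothetical source */

/*
 * Read up to "count" bytes into readBuf by repeatedly asking the source
 * callback for more data.  Returns the number of bytes successfully read;
 * the page-read callback that calls this can map 0 to -1 if it needs to.
 */
static int
wal_read_sketch(wal_source_cb fetch_more, char *readBuf, size_t count)
{
	size_t		nbytes = 0;		/* bytes read so far, not bytes remaining */

	while (nbytes < count)
	{
		int			got = fetch_more(readBuf + nbytes, count - nbytes);

		if (got <= 0)
			break;				/* source exhausted or failed */
		nbytes += got;
	}

	return (int) nbytes;
}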
Re: Avoid resource leak (src/test/modules/test_binaryheap/test_binaryheap.c)
On Fri, Sep 12, 2025 at 1:54 PM Ranier Vilela wrote:
> Per Coverity.
>
> Coverity reports this resource leak in test_binaryheap module.
> I think that is right.
>
> Trivial patch attached.

If this were correct, we'd need to also free the memory in all the error paths. But of course, in both error and non-error paths, we rely on memory context cleanup to free memory for us, except in cases where there's some specific reason to believe that's not good enough. I doubt that there is any such reason in this case. See src/backend/utils/mmgr/README

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
On Fri, Sep 12, 2025 at 9:03 AM Naga Appani wrote:
> +
> + Returns statistics about current multixact usage:
> + num_mxids is the total number of multixact IDs assigned since startup,
> + num_members is the total number of multixact member entries created since startup,

GetMultiXactInfo() returns the following:

*members = nextOffset - *oldestOffset;
*multixacts = nextMultiXactId - *oldestMultiXactId;

They seem to be the numbers that exist in the system at the time of the call and not since the startup. Am I missing something?

+ up to approximately 2^32 entries(occupying roughly 20GB in the

A space is needed between "entries" and "(".

+ proallargtypes => '{int4,int8,int8,xid,int8}',

I think the first parameter should also be int8, since uint32 won't fit into int4.

> > > + See for further details on multixact wraparound.
> >
> > I don't think we need this reference here. Reference back from that section is enough.
>
> I kept the cross-reference for now since other multixact function docs (such as pg_get_multixact_members()) already use this style, and it helps readers who land directly on the function reference page. Please let me know if you would prefer that I remove it.

None of the write-up there talks about multixact wraparound, so the reference seems arbitrary to me. I would remove it.

> > + * Returns NULL if the oldest referenced offset is unknown, which happens during
> > + * system startup or when no MultiXact references exist in any relation.
> >
> > If no MultiXact references exist, and GetMultiXactInfo() returns false, MultiXactMemberFreezeThreshold() will assume the worst, which I take as meaning that it will trigger aggressive autovacuum. No MultiXact references existing is a common case which shouldn't be assumed as the worst case. The comment I quoted means "the oldest value of the offset referenced by any multi-xact referenced by a relation may not always be known". You seem to have interpreted "may not be known" as "does not exist". That's not right. I would write this as "Returns NULL if the oldest referenced offset is unknown, which happens during system startup".
> >
> > Similarly I would rephrase the following docs as
> > +
> > + The function returns NULL when multixact statistics are unavailable.
> > + For example, during startup before multixact initialization completes or when
> > + the oldest member offset cannot be determined.
> >
> > "The function returns NULL for multixact statistics when the oldest multixact offset corresponding to a multixact referenced by a relation is not known after starting the system."
>
> Updated.

Thanks for updating the documentation. But the comment in the prologue of pg_get_multixact_stats is not completely correct, as mentioned in my previous reply. I would just say "Returns NULL if the oldest referenced offset is unknown". Anybody who wants to know when that can happen may search the relevant code by looking at GetMultiXactInfo().

I still find the comment at the start of the isolation test a bit verbose. But I think it's best to leave that to a committer's judgement. However, the word "locker" is unusual. You want to say the session that locks a row (or something similar). We may leave the exact words to a committer's judgement.

I still think that the specific usage scenarios described in the documentation are not required. And the computation in pg_get_multixact_stats() should use macros instead of bare numbers. But we can leave that for a committer to decide.
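As a quick illustration of the int4 point (not taken from the patch): a uint32 counter can exceed the range of a signed 32-bit integer, so an int4 output column would overflow, while int8 covers the full range:

    -- int4 (signed 32-bit) tops out at 2147483647, so a uint32-sized value
    -- such as 3 billion cannot pass through an int4 column:
    SELECT 3000000000::int4;   -- ERROR:  integer out of range
    SELECT 3000000000::int8;   -- fine; int8 covers the whole uint32 range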
Once you address above comments, we may mark the CF entry as RFC. -- Best Wishes, Ashutosh Bapat
Re: PostgreSQL 18 GA press release draft
Hello,

Thanks for putting this together! It's quite the monster. I read through and found the following points worth mentioning:

I think almost all of the non-stock section titles Are Lacking Title Case.

> The new [`io_method`](https://www.postgresql.org/docs/18/runtime-config-resource.html#GUC-IO-METHOD) setting lets you toggle between the AIO methods, including `worker` and `io_uring` (when built with PostgreSQL, available on certain Linux systems), or you can choose to maintain the current PostgreSQL behavior with the `sync` setting. There are now more parameters to consider tuning with AIO, which you can [learn more about in the documentation](https://www.postgresql.org/docs/18/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-IO).

I don't understand the "when built with PostgreSQL". Did you mean to reference something else here?

> PostgreSQL 18 further accelerates query performance with features that automatically make your workloads faster. This release introduces "skip scan" lookups on [multicolumn B-tree indexes](https://www.postgresql.org/docs/18/indexes-multicolumn.html), which improves execution time for queries that omit an `=` condition on one or more prefix index columns. It can also optimize queries that use `OR` conditions in a `WHERE` to use an index, leading to significantly faster execution. There are also numerous improvements for how PostgreSQL plans and executes table joins, from boosting the performance of hash joins to allowing merge joins to use incremental sorts.

introduces "skip scan" lookups ..., which improve (remove ending 's')

> PostgreSQL 18 now supports parallel builds for [GIN indexes](https://www.postgresql.org/docs/18/gin.html), joining B-tree and [BRIN indexes](https://www.postgresql.org/docs/current/brin.html) in supporting this capability. Additionally, [materialized views](https://www.postgresql.org/docs/18/rules-materializedviews.html) can now use unique indexes that aren't B-trees as partition keys, expanding how you can construct materialized views.

Materialized views can use non-btree indexes as partition keys? Sounds cool, but the linked-to matview page doesn't talk about partitioning at all. I think there's something wrong with the way this has been written.

[trawls the release notes] Hmm, the release notes have this text: "Allow non-btree unique indexes to be used as partition keys and in materialized views". I think the confusion stems from having missed the "and" there.

Actually I wonder if these two items (commits f278e1fe3 and 9d6db8bec) are actually worthy of being in the press release. They are about using unique indexes that aren't btrees, but as I understand, in stock Postgres there isn't any other way to build unique indexes, so this is about allowing out-of-core index AMs to be used in these contexts.

> PostgreSQL 18 makes text processing easier and faster with several new enhancements. This release adds the [`PG_UNICODE_FAST`](https://www.postgresql.org/docs/18/locale.html#LOCALE-PROVIDERS), which accelerates lookups using the `upper` and `lower` string comparison functions, and helps speed up the new [`casefold`](https://www.postgresql.org/docs/18/functions-string.html#FUNCTIONS-STRING-OTHER) function for case-insensitive comparisons.
> Additionally, PostgreSQL 18 now supports making `LIKE` comparisons over text that uses a [nondeterministic collation](https://www.postgresql.org/docs/18/collation.html#COLLATION-NONDETERMINISTIC), simplifying how you can perform more complex pattern matching. This release also changes [full text search](https://www.postgresql.org/docs/18/textsearch.html) to use the default collation provider of a cluster instead of always using libc, which may require you to reindex all [full text search](https://www.postgresql.org/docs/18/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX) and [`pg_trgm`](https://www.postgresql.org/docs/18/pgtrgm.html#PGTRGM-INDEX) indexes after running [`pg_upgrade`](https://www.postgresql.org/docs/18/pgupgrade.html).

I think this should say "This release adds the PG_UNICODE_FAST locale provider", or something like that, because ending in just "PG_UNICODE_FAST" seems to be unintelligible.

> ### Authentication and security features

In this section I would also mention that pgcrypto gained SHA-2 cipher for passwords (commit 749a9e20c979).

> PostgreSQL 18 now supports reporting logical replication write conflicts in logs and in the [`pg_stat_subscription_stats`](https://www.postgresql.org/docs/18/monitoring-stats.html#MONITORING-PG-STAT-SUBSCRIPTION-STATS) view. Additionally, [`CREATE SUBSCRIPTION`](https://www.postgresql.org/docs/18/sql-createsubscription.html) now defaults to use parallel streaming for applying transactions, which can help improve performance. The [`pg_createsubscriber`](https://www.postgresql.org/docs/18/app-pgcreatesub
Re: Increase OpenBSD CI task's RAM disk size
Hi,

On 2025-09-02 10:24:34 +0200, Daniel Gustafsson wrote:
> > On 15 Aug 2025, at 12:20, Nazir Bilal Yavuz wrote:
> > On Fri, 15 Aug 2025 at 12:19, Daniel Gustafsson wrote:
> >>
> >>> On 15 Aug 2025, at 10:46, Nazir Bilal Yavuz wrote:
> >>> Sometimes OpenBSD CI tasks fail with 'No space left on device' [1]. It's due to the OpenBSD CI task's RAM disk size being ~3.8 GB. This patch increases it to ~4.6 GB. I chose this size manually, we can increase it more.
> >>
> >> I don't know Cirrus enough to know if this is a daft question, but does this impact the cost (in actual money or credits) for running the CI tests?
> >
> > I don't expect this to increase costs, and it might even reduce them slightly. Disks generally perform slower when free space is low, so the change could make OpenBSD CI tasks faster.
>
> Thanks for clarifying, this definitely seems like something we should do to keep tests from failing. Pushed it now.

I've just enabled openbsd, netbsd and mingw to run by default on postgres/postgres. If that works without a problem, I'll also enable it on postgresql-cfbot.

Greetings,

Andres Freund
Re: ABI Compliance Checker GSoC Project
Álvaro Herrera writes:
> On 2025-Sep-12, Tom Lane wrote:
>> My concern is that when we get a report, we might decide to apply some fix to remove the ABI delta, or we might decide it's intentional and/or harmless and leave the code as-is.

> The solution I propose for this (which I have mentioned before) is to allow for our source tree to carry exclusion files. The ABI checker would refrain from turning red for any breaks that match what's in those files.

That seems substantially more complicated than just moving which commit is considered the baseline. It might be less reliable too: if the exclusions are written sloppily, they might hide problems we'd rather find out about. (I do not regard valgrind suppression files as a model of good engineering...)

			regards, tom lane
Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
>> Attached are the v18 patches for adding RESPECT/IGNORE NULLS options to some window functions. Recent changes to doc/src/sgml/func.sgml required v17 to be rebased. Other than that, nothing has been changed.
>>
>> Oliver, do you have any comments on the patches?
>
> Looks good, tried it on the nth_value test script from a bit ago - I added a 1 million rows test and it takes an average of 12 seconds on my i7.

I would like to push the patch by the end of this month or early in October if there's no objection. Comments/suggestions are welcome.

--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
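For readers following the thread, here is roughly what the feature enables; this is a sketch based on the SQL-standard null treatment clause that the patches implement, with made-up table and column names:

    -- Carry the most recent non-NULL reading forward, skipping NULL rows
    -- instead of returning them:
    SELECT ts,
           last_value(reading) IGNORE NULLS
               OVER (ORDER BY ts
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS filled
    FROM sensor_data;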
Re: race condition in pg_class
Noah Misch writes:
> On Thu, Sep 11, 2025 at 10:12:59PM -0400, Tom Lane wrote:
>> I'm content with making it master-only.

> I've pushed the test update to master.

Thanks. I'll go deal with the original issue shortly.

			regards, tom lane
Re: race condition in pg_class
On Thu, Sep 11, 2025 at 10:12:59PM -0400, Tom Lane wrote:
> Noah Misch writes:
> > On Thu, Sep 11, 2025 at 03:34:47PM -0400, Tom Lane wrote:
> >> However ... it might still be okay to cram it into v18. Do you have an opinion about that?
>
> > Feels like something I wouldn't do, but I don't have a concrete reason.
>
> I'm content with making it master-only.

I've pushed the test update to master.
Re: pg_waldump: support decoding of WAL inside tarfile
On Fri, Sep 12, 2025 at 2:28 PM Robert Haas wrote:
> Is there a real need to pass XLogDumpPrivate to astreamer_wal_read or astreamer_archive_read? The only things that they need are archive_fd, archive_name, archive_streamer, archive_streamer_buf, and archive_streamer_read_ptr. In other words, they really don't care about any of the *existing* things that are in XLogDumpPrivate. This makes me wonder whether we should actually try to make this new astreamer completely independent of xlogreader. In other words, instead of calling it astreamer_waldump() or astreamer_xlogreader() as I proposed above, maybe it could be a completely generic astreamer, say astreamer_stringinfo_new(StringInfo *buf) that just appends to the buffer. That would require also moving the stuff out of astreamer_wal_read() that knows about XLogRecPtr, but why does that function need to know about XLogRecPtr? Couldn't the caller figure out that part and just tell this function how many bytes are needed?

Hmm, on further thought, I think this was a silly idea. Part of the intended function of this astreamer is to make sure we're only reading WAL files from the archive, and eventually reordering them if required, so obviously something completely generic isn't going to work. Maybe there's a way to make this look a little cleaner and tidier but this isn't it...

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: PostgreSQL 18 GA press release draft
On Fri, Sep 12, 2025 at 12:13:23PM -0700, Jeff Davis wrote:
> On Fri, 2025-09-12 at 13:21 -0500, Nico Williams wrote:
> > How does one do form-insensitive comparisons?
>
> If you mean case insensitive matching, you can do:

I meant form-insensitive, as in comparing equivalent strings where one might be using decomposed sequences and another pre-composed sequences (and either might be normalized or not at all).

Nico
--
Re: RFC: extensible planner state
On Tue, Aug 26, 2025 at 4:58 AM Andrei Lepikhov wrote:
>
> 1. Why is it necessary to maintain the GetExplainExtensionId and GetPlannerExtensionId routines? It seems that using a single extension_id (related to the order of the library inside the file_scanner) is more transparent and more straightforward if necessary.

But this wouldn't work for in-core use cases like GEQO, right? Also, how would it work if there are multiple "extensions" in the same .so file?

> 2. Why does the extensibility approach in 0001 differ from that in 0004? I can imagine it is all about limiting extensions, but anyway, a module has access to PlannerInfo, PlannerGlobal, etc. So, this machinery looks a little redundant, doesn't it?

What do you mean that the extensibility approach differs? Like that the type of extension_state is different?

- Melanie
Re: PostgreSQL 18 GA press release draft
On Fri, 2025-09-12 at 15:48 -0500, Nico Williams wrote:
> I meant form-insensitive, as in comparing equivalent strings where one might be using decomposed sequences and another pre-composed sequences (and either might be normalized or not at all).

Use NORMALIZE(a) = NORMALIZE(b).

This is getting a little off-topic from what's actually being released, so please reopen a more relevant thread or start a new one, and CC me.

Regards,

	Jeff Davis
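For the archives, a concrete example of the form-insensitive comparison being described, using the normalize() function PostgreSQL already provides (it requires a UTF-8 database encoding):

    -- U&'\0065\0301' is "e" plus a combining acute accent (decomposed);
    -- U&'\00E9' is the precomposed "é".  They differ as raw strings but
    -- compare equal once both sides are normalized to the same form:
    SELECT U&'\0065\0301' = U&'\00E9'                                 AS raw_equal,
           normalize(U&'\0065\0301', NFC) = normalize(U&'\00E9', NFC) AS nfc_equal;
    -- raw_equal = f, nfc_equal = t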
Re: ALTER DATABASE RESET with unexistent guc doesn't report an error
I wrote:
> Looking at the patch, the delta in database.out raises another question:

> ALTER DATABASE regression_tbd RENAME TO regression_utf8;
> ALTER DATABASE regression_utf8 SET TABLESPACE regress_tblspace;
> ALTER DATABASE regression_utf8 RESET TABLESPACE;
> +ERROR:  unrecognized configuration parameter "tablespace"
> ALTER DATABASE regression_utf8 CONNECTION_LIMIT 123;

> The author of this bit of test script evidently thought that ALTER ... RESET TABLESPACE is the inverse of ALTER ... SET TABLESPACE, and what we are seeing is that it is not. That may be a bug in itself, but it's not what Vitaly is on about, IIUC.

That was in fact a test bug, now corrected at 4adb0380b. I've substituted a more on-point test case, wordsmithed the comment a little bit, and pushed it. Thanks for the report and patch!

			regards, tom lane
Re: [PG19-3 PATCH] Don't ignore passfile
On Sat, Sep 6, 2025 at 9:52 AM Paul Ohlhauser wrote:
> My point is not that the user is not happy, that they have to change permissions.
> It is that the user would rather get a clear error message than to get two separate messages (warning that doesn't mention "ignore" and authentication error) they need to connect to understand what is going on.

Would it address your concern if we reworded that error message to be more clear that the file is going to be ignored? I think that proposal would have a better chance of success than this one.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Avoid resource leak (src/test/modules/test_binaryheap/test_binaryheap.c)
Robert Haas writes:
> On Fri, Sep 12, 2025 at 1:54 PM Ranier Vilela wrote:
>> Coverity reports this resource leak in test_binaryheap module.
>> I think that is right.

> If this were correct, we'd need to also free the memory in all the error paths. But of course, in both error and non-error paths, we rely on memory context cleanup to free memory for us, except in cases where there's some specific reason to believe that's not good enough. I doubt that there is any such reason in this case.

I agree this isn't interesting from a resource-leak perspective. However, is it interesting from a test-coverage perspective? AFAICS, test_binaryheap doesn't presently exercise binaryheap_free, which seems a little sad for what's supposed to be a unit-test module.

Of course, binaryheap_free is quite trivial and we do already have coverage of it elsewhere. So I'm not super excited about the omission.

			regards, tom lane
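If someone did want to close that small coverage gap, the addition would be tiny. A sketch of what an extra check in test_binaryheap.c might look like; the comparator and function names here are made up, only the binaryheap_* calls follow lib/binaryheap.h:

    #include "postgres.h"
    #include "lib/binaryheap.h"

    /* assumed: a trivial integer comparator like the module already uses */
    static int
    int_cmp(Datum a, Datum b, void *arg)
    {
        int32       ia = DatumGetInt32(a);
        int32       ib = DatumGetInt32(b);

        return (ia > ib) ? 1 : (ia < ib) ? -1 : 0;
    }

    static void
    test_free(void)
    {
        binaryheap *heap = binaryheap_allocate(4, int_cmp, NULL);

        binaryheap_add_unordered(heap, Int32GetDatum(2));
        binaryheap_add_unordered(heap, Int32GetDatum(1));
        binaryheap_build(heap);

        /*
         * Free explicitly so binaryheap_free() gets exercised; memory context
         * cleanup would otherwise reclaim this at the end of the test anyway.
         */
        binaryheap_free(heap);
    }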
Re: RFC: extensible planner state
Hmm. I don't have a copy of Andrei's email in my gmail. I see it in the archives but I have not got it. I don't understand how that happened. I now wonder if there are other emails from Andrei I haven't received.

On Fri, Sep 12, 2025 at 4:34 PM Melanie Plageman wrote:
> On Tue, Aug 26, 2025 at 4:58 AM Andrei Lepikhov wrote:
> > 1. Why is it necessary to maintain the GetExplainExtensionId and GetPlannerExtensionId routines? It seems that using a single extension_id (related to the order of the library inside the file_scanner) is more transparent and more straightforward if necessary.
>
> But this wouldn't work for in-core use cases like GEQO, right? Also, how would it work if there are multiple "extensions" in the same .so file?

We probably don't want all extensions on any topic to be allocating extension IDs from the same space, because the ID is used as a list index and we don't want to have to null-pad lists excessively. Combining the explain and planner cases wouldn't be too much of a stretch, perhaps, but it's also not really costing us anything to have separate IDs for those cases.

> > 2. Why does the extensibility approach in 0001 differ from that in 0004? I can imagine it is all about limiting extensions, but anyway, a module has access to PlannerInfo, PlannerGlobal, etc. So, this machinery looks a little redundant, doesn't it?
>
> What do you mean that the extensibility approach differs? Like that the type of extension_state is different?

I suspect the question here is about why not use the index-by-planner-extension-ID approach for 0004. That could maybe work, but here everything has to be a Node, so I feel like it would be more contorted than the existing cases.

--
Robert Haas
EDB: http://www.enterprisedb.com
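To illustrate the ID-as-list-index point: a sketch of the idea, not code from the patch (the function names and the List-typed state field are assumptions):

    #include "postgres.h"
    #include "nodes/pg_list.h"

    /*
     * Each purpose (EXPLAIN extensions, planner extensions, ...) hands out its
     * own small, dense ID space, so a per-object state list only needs to be
     * null-padded up to the number of extensions of that kind that actually
     * registered, not to every extension loaded into the server.
     */
    static void
    set_extension_state(List **state_list, int extension_id, void *state)
    {
        /* null-pad the list up to this extension's slot */
        while (list_length(*state_list) <= extension_id)
            *state_list = lappend(*state_list, NULL);

        list_nth_cell(*state_list, extension_id)->ptr_value = state;
    }

    static void *
    get_extension_state(List *state_list, int extension_id)
    {
        if (extension_id >= list_length(state_list))
            return NULL;        /* this extension never stored anything */
        return list_nth(state_list, extension_id);
    }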
Re: Proposal: Conflict log history table for Logical Replication
Hi,

On Wed, Sep 10, 2025 at 8:13 PM Amit Kapila wrote:
>
> > How about streaming the conflicts in fixed format to a separate log file other than the regular postgres server log file?
>
> I would prefer this info to be stored in tables as it would be easy to query them. If we use separate LOGs then we should provide some views to query the LOG.

Providing views to query the conflict LOG is easier than having tables (probably we must provide both - logging conflicts to tables and to separate LOG files). However, wanting the conflict logs after failovers is something that makes me think the table approach is better. I'm open to more thoughts here.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Proposal: Conflict log history table for Logical Replication
Hi,

On Fri, Sep 12, 2025 at 3:13 AM Dilip Kumar wrote:
>
> I was looking into another thread where we provide an error table for COPY [1]; it requires the user to pre-create the error table. Inside the COPY command we will validate the table; validation in that context is a one-time process checking for: (1) table existence, (2) ability to acquire a sufficient lock, (3) INSERT privileges, and (4) matching column names and data types. This approach avoids concerns about the user's DROP or ALTER permissions.
>
> Our requirement for the logical replication conflict log table differs, as we must validate the target table upon every conflict insertion, not just at subscription creation. A more robust alternative is to perform validation and acquire a lock on the conflict table whenever the subscription worker starts. This prevents modifications (like ALTER or DROP) while the worker is active. When the worker gets restarted, we can re-validate the table and automatically disable the conflict logging feature if validation fails. It can then be enabled again by ALTER SUBSCRIPTION setting the option.

Having to worry about ALTER/DROP and adding code to protect against them seems like overkill.

> And if we want, in the first version we can expect the user to create the table as per the expected schema and supply it. This avoids the need to handle keeping it from being published, as that becomes the user's responsibility. Then in top-up patches we can also allow creating the table internally if it doesn't exist, and we can find a solution to keep it from being published when ALL TABLES are published.

This looks much simpler to start with.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: [BUG] temporary file usage report with extended protocol and unnamed portals
> Only question is if we should avoid the extra portal hashtable lookup as well, or leave that for a separate patch.

I prefer a separate thread for this, as it's an optimization of the existing behavior.

--
Sami Imseih
Amazon Web Services (AWS)
Re: OAuth client code doesn't work with Google OAuth
> I hear what you're saying, but I don't see how it relates to the paragraph you replied to.

I realized something incorrect in my last email: I wrote that Google doesn't ask permission for the email/profile scope. That's not true; it didn't show the scope screen because I had already given permission for the application and it remembered it.

So that made me wonder: how does Okta handle it? And it turns out it's the same.

1. I defined a custom scope in Okta, with required explicit consent from the user
2. I logged in for the first time with psql
3. As expected, Okta showed the consent screen and made me explicitly accept that I'm granting "testScope"
4. I logged in for the second time with psql
5. I entered the code to the Okta device login page, and the next screen immediately was "Device activated"

Not even a single page of warning, nothing about scopes/permissions/etc; it didn't even verify/tell me which application I'm logging into. The first page just said "activate your device" (it couldn't know where I want to log in, as that's the generic enter-your-code page), and the next page was:

"Device activated
Follow the instructions on your device for next steps"

That was all the text on it. It still didn't even tell me which device (application) I logged into, just that I succeeded.

There's a "prompt=consent" parameter to the API, and a similar setting in the admin interface, but it doesn't seem to do anything with device flow. So in this sense, this is much worse than Google/Azure...

More searching tells me that typically OIDC providers work this way, and it seems the RFC again doesn't specify when and how often the providers have to show the consent screen. RFC 6819 warns against this, but that's all I could find, and that's a separate RFC.

This makes me question again: how does the scope help pg security in practice? (I know that it's the OIDC provider's fault, but if all providers are like this, how does that help in practice?)

Also, the internet (/android/etc) is full of "login with google" buttons without special scopes, and generally, if somebody logs in to site/app X with Google, he/she can expect to modify things within that app. I'm not saying that this is ideal, but this is how it works in practice, and how people are used to it.

Anyway, I also understand why you don't agree with this, so if you don't want to include Google-specific handling, I understand, I won't argue more for it.

> 2) I don't yet understand how this helps you as a validator author, because you shouldn't know what flow was in use. Forget about device flow for a minute. How does this new information help you if the organization you're installed in is using only PKCE-flow libpq from Postgres 25 or whatever?

Yes, as a validator author, I don't know if somebody is using libpq, a fork of libpq, or a completely different implementation of the protocol (that does or does not support the not entirely compliant Google OAuth). But the administrator configuring the Postgres instance / the validator should be aware of how the authentication flow is configured, and I wouldn't want to restrict the options by saying that something is not supported, especially not blocking one of the most popular services.

> How does this new information help you if the organization you're installed in is using only PKCE-flow libpq from Postgres 25 or whatever?

> Is that still true for PKCE? In general, we're not running confidential clients here; it'd be a lot easier if we were.
I've been thinking about this for the last few days, but shouldn't a proper PKCE implementation require a protocol change, moving part of the logic to the server side? That would solve the scenarios we are talking about; there would be no question of who created the access token and how. Naive PKCE support only on the client side, while still only sending an access token to the server, wouldn't help the security of the server much.

On Thu, Sep 11, 2025 at 4:55 PM Jacob Champion wrote:
>
> On Thu, Sep 11, 2025 at 12:42 AM Zsolt Parragi wrote:
> > It just says that:
> >
> > > Once you enter the code displayed on your app or device, it will have access to your account.
> > > Do not enter codes from sources you don't trust.
> >
> > And that's all. And not loudly, this isn't written with a big font size with a warning sign or anything like that, it's just quietly mentioned.
>
> YMMV; on my screen the last sentence is in bolded font and quite front-and-center, but they haven't used tags or red boxes or anything. In any case I agree it's not enough.
>
> But I don't have control over them. If a hypothetical provider (and I am not saying Entra does this, just to be very clear) were to skip the consent screen entirely, and immediately grant all requested scopes, neither the client nor the validator would be able to figure that out. That would be a provider-side problem, and we're trusting the provider not to do that, or else to com
Re: Set log_lock_waits=on by default
On Fri, 2025-09-12 at 08:06 +0200, Peter Eisentraut wrote: > committed Thank you! Laurenz Albe